.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.16.
.TH DEFINECLONES.PY "1" "October 2020" "DefineClones.py 1.0.1" "User Commands"
.SH NAME
DefineClones.py \- Repertoire clonal assignment toolkit (Python 3)
.SH DESCRIPTION
.nf
usage: DefineClones.py [\-\-version] [\-h] \fB\-d\fR DB_FILES [DB_FILES ...]
                       [\-o OUT_FILES [OUT_FILES ...]] [\-\-outdir OUT_DIR]
                       [\-\-outname OUT_NAME] [\-\-log LOG_FILE] [\-\-failed]
                       [\-\-format {airr,changeo}] [\-\-nproc NPROC]
                       [\-\-sf SEQ_FIELD] [\-\-vf V_FIELD] [\-\-jf J_FIELD]
                       [\-\-gf GROUP_FIELDS [GROUP_FIELDS ...]]
                       [\-\-mode {allele,gene}] [\-\-act {first,set}]
                       [\-\-model {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}]
                       [\-\-dist DISTANCE] [\-\-norm {len,mut,none}]
                       [\-\-sym {avg,min}] [\-\-link {single,average,complete}]
                       [\-\-maxmiss MAX_MISSING]
.fi
.PP
Assigns Ig sequences to clones.
.SS "help:"
.TP
\fB\-\-version\fR
show program's version number and exit
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.SS "standard arguments:"
.TP
\fB\-d\fR DB_FILES [DB_FILES ...]
A list of tab\-delimited database files. (default:
None)
.TP
\fB\-o\fR OUT_FILES [OUT_FILES ...]
Explicit output file name. Note, this argument cannot
be used with the \fB\-\-failed\fR, \fB\-\-outdir\fR, or \fB\-\-outname\fR
arguments. If unspecified, then the output filename
will be based on the input filename(s). (default:
None)
.TP
\fB\-\-outdir\fR OUT_DIR
Changes the output directory to the location
specified. The input file directory is used
if this is not specified. (default: None)
.TP
\fB\-\-outname\fR OUT_NAME
Changes the prefix of the successfully processed
output file to the string specified. May not be
specified with multiple input files. (default: None)
.TP
\fB\-\-log\fR LOG_FILE
Specify to write verbose logging to a file. May not be
specified with multiple input files. (default: None)
.TP
\fB\-\-failed\fR
If specified, create files containing records that fail
processing. (default: False)
.TP
\fB\-\-format\fR {airr,changeo}
Specify input and output format. (default: airr)
.TP
\fB\-\-nproc\fR NPROC
The number of simultaneous computational processes to
execute (CPU cores to utilize). (default: 8)
.SS "cloning arguments:"
.TP
\fB\-\-sf\fR SEQ_FIELD
Field to be used to calculate distance between
records. Defaults to junction (airr) or JUNCTION
(changeo). (default: None)
.TP
\fB\-\-vf\fR V_FIELD
Field containing the germline V segment call. Defaults
to v_call (airr) or V_CALL (changeo). (default: None)
.TP
\fB\-\-jf\fR J_FIELD
Field containing the germline J segment call. Defaults
to j_call (airr) or J_CALL (changeo). (default: None)
.TP
\fB\-\-gf\fR GROUP_FIELDS [GROUP_FIELDS ...]
Additional fields to use for grouping clones aside
from V, J and junction length. (default: None)
.TP
\fB\-\-mode\fR {allele,gene}
Specifies whether to use the V(D)J allele or gene for
initial grouping. (default: gene)
.TP
\fB\-\-act\fR {first,set}
Specifies how to handle multiple V(D)J assignments for
initial grouping. The "first" action will use only the
first gene listed. The "set" action will use all gene
assignments and construct a larger gene grouping
composed of any sequences sharing an assignment or
linked to another sequence by a common assignment
(similar to single\-linkage). (default: set)
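.IP
The overlap rule behind the "set" action can be pictured with a small
union\-find sketch. This is an illustration under stated assumptions, not
the Change\-O implementation: the function name group_by_shared_calls and
the sample gene calls are hypothetical, and grouping by J call and junction
length is left out.
.nf
```python
def group_by_shared_calls(calls):
    """Group record indices whose gene-call sets overlap, transitively."""
    parent = list(range(len(calls)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # gene name -> index of the first record carrying it
    for i, genes in enumerate(calls):
        for gene in genes:
            if gene in first_seen:
                # Sharing one gene is enough to merge the two groups.
                parent[find(i)] = find(first_seen[gene])
            else:
                first_seen[gene] = i

    groups = {}
    for i in range(len(calls)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical V calls: records 0 and 1 share a gene, as do records 0 and 3.
calls = [{"IGHV1-2", "IGHV1-3"}, {"IGHV1-3"}, {"IGHV3-23"}, {"IGHV1-2"}]
print(group_by_shared_calls(calls))
```
.fi
.IP
Records 1 and 3 share no gene with each other, yet land in the same group
because each shares one with record 0.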
.TP
\fB\-\-model\fR {ham,aa,hh_s1f,hh_s5f,mk_rs1nf,mk_rs5nf,hs1f_compat,m1n_compat}
Specifies which substitution model to use for
calculating distance between sequences. The "ham"
model is nucleotide Hamming distance and "aa" is amino
acid Hamming distance. The "hh_s1f" and "hh_s5f"
models are human specific single nucleotide and 5\-mer
content models, respectively, from Yaari et al., 2013.
The "mk_rs1nf" and "mk_rs5nf" models are mouse
specific single nucleotide and 5\-mer content models,
respectively, from Cui et al., 2016. The "m1n_compat"
and "hs1f_compat" models are deprecated models
provided for backwards compatibility with the "m1n" and
"hs1f" models in Change\-O v0.3.3 and SHazaM v0.1.4.
Both 5\-mer models should be considered experimental.
(default: ham)
.TP
\fB\-\-dist\fR DISTANCE
The distance threshold for clonal grouping. (default:
0.0)
.TP
\fB\-\-norm\fR {len,mut,none}
Specifies how to normalize distances. One of none (do
not normalize), len (normalize by length), or mut
(normalize by number of mutations between sequences).
(default: len)
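.IP
As a rough illustration of the "ham" model combined with the len and none
normalizations, the sketch below computes a nucleotide Hamming distance
between two equal\-length sequences. The function names and example
sequences are hypothetical, the mut normalization is not covered, and the
tool itself may differ in detail.
.nf
```python
def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def normalized_distance(a, b, norm="len"):
    """Hamming distance, optionally divided by the sequence length."""
    dist = hamming(a, b)
    if norm == "len":
        return dist / len(a)
    if norm == "none":
        return float(dist)
    raise ValueError("this sketch only handles norm='len' or norm='none'")


# Two hypothetical 9-nucleotide junctions differing at one position.
print(normalized_distance("TGTGCGAGA", "TGTGCAAGA"))
print(normalized_distance("TGTGCGAGA", "TGTGCAAGA", norm="none"))
```
.fi
.IP
With length normalization, a \fB\-\-dist\fR threshold reads as a fraction of
differing positions rather than an absolute count.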
.TP
\fB\-\-sym\fR {avg,min}
Specifies how to combine asymmetric distances. One of
avg (average of A\->B and B\->A) or min (minimum of A\->B
and B\->A). (default: avg)
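.IP
Asymmetry arises when a substitution model charges different costs for a
change and its reverse. The sketch below uses a made\-up cost table
(TOY_COST, not taken from any published model) to show how avg and min
would combine the two directed distances; all names are hypothetical.
.nf
```python
# Hypothetical costs: A to G is cheaper than G to A. Not a real model.
TOY_COST = {("A", "G"): 0.5, ("G", "A"): 1.0}


def directed_distance(a, b, cost=TOY_COST, default=1.0):
    """Sum substitution costs for mismatches when going from a to b."""
    return sum(cost.get((x, y), default) for x, y in zip(a, b) if x != y)


def symmetric_distance(a, b, sym="avg"):
    """Combine the two directed distances into a single value."""
    d_ab = directed_distance(a, b)
    d_ba = directed_distance(b, a)
    if sym == "avg":
        return (d_ab + d_ba) / 2
    if sym == "min":
        return min(d_ab, d_ba)
    raise ValueError("sym must be 'avg' or 'min'")


print(symmetric_distance("ACA", "GCA", sym="avg"))
print(symmetric_distance("ACA", "GCA", sym="min"))
```
.fi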
.TP
\fB\-\-link\fR {single,average,complete}
Type of linkage to use for hierarchical clustering.
(default: single)
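.IP
The three linkage types differ in how the distance between two clusters is
derived from the pairwise distances of their members. The following naive
agglomerative sketch shows the difference; it is not the clustering code
used by the tool, the names are hypothetical, and treating the threshold as
inclusive is an assumption.
.nf
```python
from itertools import combinations

LINKAGE = {
    "single": min,      # closest pair of members
    "complete": max,    # farthest pair of members
    "average": lambda values: sum(values) / len(values),
}


def cluster(items, dist, threshold, link="single"):
    """Merge the closest clusters until none are within the threshold."""
    combine = LINKAGE[link]
    clusters = [[item] for item in items]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            pair_dists = [dist(a, b) for a in clusters[i] for b in clusters[j]]
            score = combine(pair_dists)
            if best is None or score < best[0]:
                best = (score, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        # j is always greater than i, so popping j leaves index i valid.
        clusters[i].extend(clusters.pop(j))
    return clusters


def gap(a, b):
    """Toy distance: absolute difference between two numbers."""
    return abs(a - b)


points = [0, 1, 2, 10]
print(cluster(points, gap, threshold=1, link="single"))
print(cluster(points, gap, threshold=1, link="complete"))
```
.fi
.IP
Single linkage chains 0, 1 and 2 together through neighbouring pairs, while
complete linkage keeps 2 apart because it is too far from 0.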
.TP
\fB\-\-maxmiss\fR MAX_MISSING
The maximum number of non\-ACGT characters (gaps or Ns)
to permit in the junction sequence before excluding
the record from clonal assignment. Note that under
single linkage, non\-informative positions can create
artifactual links between unrelated sequences. Use
with caution. (default: 0)
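.IP
The filter described above amounts to counting characters outside A, C, G
and T. A minimal sketch, assuming lowercase bases count as informative and
using hypothetical function names:
.nf
```python
def count_missing(junction):
    """Count characters that are not A, C, G or T (case-insensitive)."""
    return sum(1 for base in junction.upper() if base not in "ACGT")


def passes_maxmiss(junction, max_missing=0):
    """Return True if the junction has at most max_missing such characters."""
    return count_missing(junction) <= max_missing


# Hypothetical junction with one N and two gap characters.
print(count_missing("TGTGCNAGA--"))
print(passes_maxmiss("TGTGCNAGA--"))
print(passes_maxmiss("TGTGCNAGA--", max_missing=3))
```
.fi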
.SS "output files:"
.TP
clone\-pass
database with assigned clonal group numbers.
.TP
clone\-fail
database with records failing clonal grouping.
.SS "required fields:"
.IP
sequence_id, v_call, j_call, junction
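.IP
Before running the tool it can help to confirm that an input file carries
these columns. The sketch below checks the header of a tab\-delimited file
with the Python csv module; the helper name missing_fields and the sample
header are hypothetical, and the field names are the airr defaults listed
above.
.nf
```python
import csv
import io

REQUIRED = ("sequence_id", "v_call", "j_call", "junction")


def missing_fields(handle, required=REQUIRED):
    """Return the required column names absent from a tab-delimited header."""
    reader = csv.DictReader(handle, dialect="excel-tab")
    present = set(reader.fieldnames or [])
    return [name for name in required if name not in present]


# Hypothetical header lacking j_call; chr(9) is the tab character.
header = chr(9).join(["sequence_id", "v_call", "junction"])
print(missing_fields(io.StringIO(header)))
```
.fi
.IP
For a file on disk, pass an open file handle in place of the StringIO
object.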
.SS "output fields:"
.IP
clone_id
.SH AUTHOR
This manpage was written by Nilesh Patra for the Debian distribution and
can be used for any other usage of the program.
