.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.15.
.TH BUILDCONSENSUS.PY "1" "May 2020" "BuildConsensus.py 0.6.0" "User Commands"
.SH NAME
BuildConsensus.py \- Builds a consensus sequence for each set of input sequences
.SH DESCRIPTION
.nf
usage: BuildConsensus.py [\-\-version] [\-h] \-s SEQ_FILES [SEQ_FILES ...]
                         [\-o OUT_FILES [OUT_FILES ...]] [\-\-outdir OUT_DIR]
                         [\-\-outname OUT_NAME] [\-\-log LOG_FILE] [\-\-failed]
                         [\-\-fasta] [\-\-delim DELIMITER DELIMITER DELIMITER]
                         [\-\-nproc NPROC] [\-n MIN_COUNT] [\-\-bf BARCODE_FIELD]
                         [\-q MIN_QUAL] [\-\-freq MIN_FREQ] [\-\-maxgap MAX_GAP]
                         [\-\-pf PRIMER_FIELD] [\-\-prcons PRIMER_FREQ]
                         [\-\-cf COPY_FIELDS [COPY_FIELDS ...]]
                         [\-\-act {min,max,sum,set,majority} [{min,max,sum,set,majority} ...]]
                         [\-\-dep]
                         [\-\-maxdiv MAX_DIVERSITY | \-\-maxerror MAX_ERROR]
.fi
.PP
Builds a consensus sequence for each set of input sequences.
.SS "help:"
.TP
\fB\-\-version\fR
show program's version number and exit
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.SS "standard arguments:"
.TP
\fB\-s\fR SEQ_FILES [SEQ_FILES ...]
A list of FASTA/FASTQ files containing sequences to
process. (default: None)
.TP
\fB\-o\fR OUT_FILES [OUT_FILES ...]
Explicit output file name(s). Note, this argument
cannot be used with the \fB\-\-failed\fR, \fB\-\-outdir\fR, or
\fB\-\-outname\fR arguments. If unspecified, then the output
filename will be based on the input filename(s).
(default: None)
.TP
\fB\-\-outdir\fR OUT_DIR
Changes the output directory to the
location specified. The input file directory is used
if this is not specified. (default: None)
.TP
\fB\-\-outname\fR OUT_NAME
Changes the prefix of the successfully processed
output file to the string specified. May not be
specified with multiple input files. (default: None)
.TP
\fB\-\-log\fR LOG_FILE
Specify to write verbose logging to a file. May not be
specified with multiple input files. (default: None)
.TP
\fB\-\-failed\fR
If specified, create files containing records that fail
processing. (default: False)
.TP
\fB\-\-fasta\fR
Specify to force output as FASTA rather than FASTQ.
(default: None)
.TP
\fB\-\-delim\fR DELIMITER DELIMITER DELIMITER
A list of the three delimiters that separate
annotation blocks, field names and values, and values
within a field, respectively. (default: ('|', '=',
\&','))
.TP
\fB\-\-nproc\fR NPROC
The number of simultaneous computational processes to
execute (CPU cores to utilize). (default: 4)
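.PP
To make the three \-\-delim delimiters concrete, the following Python sketch
splits a sequence description into annotation fields using the default values.
The function name parse_header and the sample header are hypothetical, and
treating the first block as the sequence identifier is an assumption; this is
not the code used by BuildConsensus.py.
.PP
.nf
```python
def parse_header(header, delim=("|", "=", ",")):
    """Split a description line into an ID and a dict of annotation fields."""
    block_sep, field_sep, value_sep = delim
    blocks = header.split(block_sep)
    # Assumption: the first block is the sequence identifier.
    fields = {"ID": blocks[0]}
    for block in blocks[1:]:
        # The second delimiter separates a field name from its values.
        name, _, value = block.partition(field_sep)
        # The third delimiter separates multiple values within one field.
        fields[name] = value.split(value_sep)
    return fields


# Hypothetical header carrying a barcode and two primer values.
example = parse_header("SEQ1|BARCODE=ACGT|PRIMER=IGHM,IGHG")
```
.fi
.PP
Passing a different tuple as delim mirrors supplying three custom values to
\-\-delim.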
.SS "consensus generation arguments:"
.TP
\fB\-n\fR MIN_COUNT
The minimum number of sequences needed to define a
valid consensus. (default: 1)
.TP
\fB\-\-bf\fR BARCODE_FIELD
Name of the description barcode field to group
sequences by. (default: BARCODE)
.TP
\fB\-q\fR MIN_QUAL
Consensus quality score cut\-off under which an
ambiguous character is assigned; does not apply when
quality scores are unavailable. (default: 0)
.TP
\fB\-\-freq\fR MIN_FREQ
Fraction of character occurrences under which an
ambiguous character is assigned. (default: 0.6)
.TP
\fB\-\-maxgap\fR MAX_GAP
If specified, this defines a cut\-off for the frequency
of allowed gap values for each position. Positions
exceeding the threshold are deleted from the
consensus. If not defined, positions are always
retained. (default: None)
.TP
\fB\-\-pf\fR PRIMER_FIELD
Specifies the field name of the primer annotations.
(default: None)
.TP
\fB\-\-prcons\fR PRIMER_FREQ
Specify to define a minimum primer frequency required
to assign a consensus primer, and filter out sequences
with minority primers from the consensus building
step. (default: None)
.TP
\fB\-\-cf\fR COPY_FIELDS [COPY_FIELDS ...]
Specifies a set of additional annotation fields to
copy into the consensus sequence annotations.
(default: None)
.TP
\fB\-\-act\fR {min,max,sum,set,majority} [{min,max,sum,set,majority} ...]
List of actions, one for each copy field, defining
how each annotation will be combined into a
single value. The actions "min", "max", and "sum" perform
the corresponding mathematical operation on numeric
annotations. The action "set" combines annotations
into a comma delimited list of unique values and adds
an annotation named <FIELD>_COUNT specifying the count
of each item in the set. The action "majority" assigns
the most frequent annotation to the consensus
annotation and adds an annotation named <FIELD>_FREQ
specifying the frequency of the majority value.
(default: None)
.TP
\fB\-\-dep\fR
Specify to calculate consensus quality with a
non\-independence assumption. (default: False)
.TP
\fB\-\-maxdiv\fR MAX_DIVERSITY
Specify to calculate the nucleotide diversity of each
read group (average pairwise error rate) and remove
groups exceeding the given diversity threshold.
Diversity is calculated for all positions within the
read group, ignoring any character filtering imposed
by the \fB\-q\fR, \fB\-\-freq\fR and \fB\-\-maxgap\fR arguments. Mutually
exclusive with \fB\-\-maxerror\fR. (default: None)
.TP
\fB\-\-maxerror\fR MAX_ERROR
Specify to calculate the error rate of each read group
(rate of mismatches from consensus) and remove groups
exceeding the given error threshold. The error rate is
calculated against the final consensus sequence, which
may include masked positions due to the \fB\-q\fR and \fB\-\-freq\fR
arguments and may have deleted positions due to the
\fB\-\-maxgap\fR argument. Mutually exclusive with \fB\-\-maxdiv\fR.
(default: None)
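.PP
The two group filters differ in what each read is compared against. As a rough
Python sketch, \-\-maxerror can be thought of as the mean mismatch rate of the
reads against the consensus, and \-\-maxdiv as the mean mismatch rate over all
pairs of reads. The function names are hypothetical, reads are assumed to be
aligned and of equal length, and masked or deleted positions are not treated
specially here, so the values may differ from those the program reports.
.PP
.nf
```python
from itertools import combinations


def mismatch_rate(a, b):
    """Fraction of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def error_rate(reads, consensus):
    """Mean mismatch rate of the reads against the consensus."""
    return sum(mismatch_rate(r, consensus) for r in reads) / len(reads)


def diversity(reads):
    """Mean pairwise mismatch rate within one read group."""
    pairs = list(combinations(reads, 2))
    if not pairs:
        # Assumption: a group with a single read has no diversity.
        return 0.0
    return sum(mismatch_rate(a, b) for a, b in pairs) / len(pairs)
```
.fi
.PP
Under this reading, a group would be removed when the chosen statistic exceeds
the threshold given on the command line.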
.SS "output files:"
.TP
consensus\-pass
consensus reads.
.TP
consensus\-fail
raw reads failing consensus filtering criteria.
.SS "output annotation fields:"
.TP
PRIMER
a comma\-delimited list of unique primer annotations found within the
barcode read group.
.TP
PRCOUNT
a comma\-delimited list of the corresponding counts of unique primer
annotations.
.TP
PRCONS
the majority primer within the barcode read group.
.TP
PRFREQ
the frequency of the majority primer.
.TP
CONSCOUNT
the count of reads within the barcode read group which contributed to
the consensus sequence. This is the total size of the read group,
minus sequences excluded due to user\-defined filtering criteria.
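.PP
The primer fields above can be illustrated with a small Python sketch that
derives them from the primer annotations of one read group. The function name
primer_annotations is hypothetical, the order of the listed values and the
handling of ties are assumptions, and PRFREQ is taken to be the fraction of
reads carrying the majority primer.
.PP
.nf
```python
from collections import Counter


def primer_annotations(primers):
    """Summarise the primer annotations of one barcode read group."""
    # Counter keeps first-seen order, so PRIMER and PRCOUNT stay parallel.
    counts = Counter(primers)
    # Assumption: ties are resolved in favour of the first primer seen.
    majority, majority_count = counts.most_common(1)[0]
    return {
        "PRIMER": ",".join(counts),
        "PRCOUNT": ",".join(str(n) for n in counts.values()),
        "PRCONS": majority,
        "PRFREQ": majority_count / len(primers),
    }


# Hypothetical read group with three IGHM reads and one IGHG read.
group = primer_annotations(["IGHM", "IGHG", "IGHM", "IGHM"])
```
.fi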
.SH AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and
can be used for any other usage of the program.
