GCPP(1) User Commands GCPP(1)
NAME
gcpp - Compute genomic consensus from alignments and call variants relative to
the reference
SYNOPSIS
gcpp [options] INPUT
DESCRIPTION
Compute genomic consensus from alignments and call variants relative to the
reference.
Basic required options:
- --referenceFilename,--reference,-r
- The filename of the reference FASTA file.
- --outputFilenames,-o
- The output filename(s), as a comma-separated list. Valid output formats
are .fa/.fasta, .fq/.fastq, .gff, .vcf
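
As an illustration of how the required options combine, the following minimal Python sketch assembles a gcpp argument list. The helper name build_gcpp_command and all file names are hypothetical; only the flags -r, -o and -j (the latter documented under Parallelism) and the list of output suffixes come from this page.

```python
import shlex
from pathlib import Path

# Output suffixes listed for --outputFilenames on this page.
VALID_SUFFIXES = {".fa", ".fasta", ".fq", ".fastq", ".gff", ".vcf"}

def build_gcpp_command(input_bam, reference, outputs, threads=1):
    """Assemble a gcpp argument list (hypothetical helper, not part of gcpp)."""
    for name in outputs:
        if Path(name).suffix not in VALID_SUFFIXES:
            raise ValueError(f"unsupported output format: {name}")
    return [
        "gcpp",
        "-r", reference,           # --referenceFilename
        "-o", ",".join(outputs),   # --outputFilenames, comma-separated
        "-j", str(threads),        # --numThreads
        input_bam,                 # positional INPUT (BAM alignment file)
    ]

# Placeholder file names, for illustration only.
cmd = build_gcpp_command(
    "aligned.bam", "ref.fasta", ["consensus.fasta", "variants.vcf"], threads=4
)
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run; shlex.join is used here only to show the equivalent shell command line.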
Parallelism:
- --numThreads,-j
- The number of threads to be used. [1]
Output filtering:
- --minConfidence,-q
- The minimum confidence for a variant call to be output to
variants.{gff,vcf} [40]
- --minCoverage,-x
- The minimum site coverage that must be achieved for variant calls and
consensus to be calculated for a site. [5]
- --noEvidenceConsensusCall
- The consensus base that will be output for sites with no effective
coverage. ["lowercasereference"]
Read selection/filtering:
- --coverage,-X
- A designation of the maximum coverage level to be used for analysis. Exact
interpretation is algorithm-specific. [100]
- --minAccuracy
- The minimum acceptable window-global alignment accuracy for reads that
will be used for the analysis (arrow-only). [0.82]
- --minMapQV,-m
- The minimum MapQV for reads that will be used for analysis. [10]
- --minReadScore
- The minimum ReadScore for reads that will be used for analysis
(arrow-only). [0.65]
- --minSnr
- The minimum acceptable signal-to-noise over all channels for reads that
will be used for analysis (arrow-only). [3.75]
- --minZScore
- The minimum acceptable z-score for reads that will be used for analysis
(arrow-only). [-3.4]
- --barcode,--barcodes
- Comma-separated list of barcode pairs to analyze, either by name, such as
'lbc1--lbc1', or by index, such as '0--0'. NOTE: Filtering barcodes by
name requires a barcode file.
- --barcodeFile
- FASTA file of the barcode sequences used. NOTE: Only used to find barcode
names.
- --referenceWindow,--referenceWindows,-w
- The window (or multiple comma-delimited windows) of the reference to be
processed, in the format refGroup:refStart-refEnd (default: entire
reference).
- --referenceWindowsFile,-W
- A file containing reference window designations, one per line
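
The argument formats for --referenceWindow and --barcode can be sketched as below. The helper names and the sample contig names are hypothetical. This page does not say whether window coordinates are 0-based or whether refEnd is inclusive, so the sketch only formats the values it is given.

```python
def format_window(ref_group, ref_start, ref_end):
    """Format one window as refGroup:refStart-refEnd."""
    if ref_start < 0 or ref_end < ref_start:
        raise ValueError("expected 0 <= refStart <= refEnd")
    return f"{ref_group}:{ref_start}-{ref_end}"

def windows_argument(windows):
    """Comma-delimited value for -w/--referenceWindow."""
    return ",".join(format_window(*w) for w in windows)

def windows_file_text(windows):
    """One window per line, as expected by -W/--referenceWindowsFile."""
    return "\n".join(format_window(*w) for w in windows) + "\n"

def barcode_argument(pairs):
    """Comma-separated barcode pairs, by name ('lbc1--lbc1') or index ('0--0')."""
    return ",".join(f"{first}--{second}" for first, second in pairs)

# Hypothetical contig names and coordinates.
print(windows_argument([("chr1", 0, 5000), ("chr2", 100, 200)]))
print(barcode_argument([("lbc1", "lbc1"), (0, 0)]))
```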
Algorithm and parameter settings:
- --algorithm
- The consensus algorithm used. ["arrow"]
- --maskRadius
- Radius of window to use when excluding local regions for exceeding
--maskErrorRate, where 0 disables any filtering (arrow-only). [0]
- --maskErrorRate
- Maximum local error rate before the local region defined by maskRadius is
excluded from polishing (arrow-only). [0]
- --parametersFile,-P
- Parameter set filename (such as ArrowParameters.json or
QuiverParameters.ini), or directory D such that either
D/*/GenomicConsensus/QuiverParameters.ini or
D/GenomicConsensus/QuiverParameters.ini is found. In the former case,
the lexically largest path is chosen.
- --parametersSpec,-p
- Name of parameter set (chemistry.model) to select from the parameters
file, or just the name of the chemistry, in which case the best available
model is chosen. Default is 'auto', which selects the best parameter set
from the alignment data. ["auto"]
- --maxIterations
- Maximum number of iterations to polish the template. [40]
- --maxPoaCoverage
- Maximum number of sequences to use for consensus calling. [11]
- --mutationSeparation
- Find the best mutations within a separation window for iterative
polishing. [10]
- --mutationNeighborhood
- Find nearby mutations within neighborhood for iterative polishing.
[10]
- --readStumpinessThreshold
- Filter out reads whose aligned length along a subread is lower than a
given fraction of their corresponding reference length. [0.1]
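
The directory lookup described for --parametersFile can be approximated with the sketch below. It is not gcpp's implementation: the function name is hypothetical, and checking the nested layout before the direct one is an assumption, since this page does not state which takes precedence.

```python
import tempfile
from pathlib import Path

def find_quiver_parameters(directory):
    """Locate QuiverParameters.ini under a directory D (illustrative only)."""
    d = Path(directory)
    # Nested layout: D/*/GenomicConsensus/QuiverParameters.ini
    nested = sorted(str(p) for p in d.glob("*/GenomicConsensus/QuiverParameters.ini"))
    if nested:
        return Path(nested[-1])  # lexically largest path
    # Direct layout: D/GenomicConsensus/QuiverParameters.ini
    direct = d / "GenomicConsensus" / "QuiverParameters.ini"
    return direct if direct.is_file() else None

# Demonstration with two hypothetical bundle directories.
with tempfile.TemporaryDirectory() as tmp:
    for bundle in ("bundle-1.0", "bundle-2.0"):
        target = Path(tmp) / bundle / "GenomicConsensus"
        target.mkdir(parents=True)
        (target / "QuiverParameters.ini").write_text("")
    chosen = find_quiver_parameters(tmp)
    chosen_bundle = chosen.parent.parent.name

print(chosen_bundle)
```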
Verbosity and debugging:
- --logFile
- Log to a file, instead of STDERR.
- --dumpEvidence,-d
- Dump evidence data
- --evidenceDirectory
- Directory to dump evidence into.
- --annotateGFF
- Augment GFF variant records with additional information
- --reportEffectiveCoverage
- Additionally record the *post-filtering* coverage at variant sites
Advanced configuration options:
- --referenceChunkSize,-C
- Size of reference chunks. [500]
- --referenceChunkOverlap
- Size of reference chunk overlaps. [5]
- --simpleChunking
- Disable adaptive reference chunking.
- --diploid
- Enable detection of heterozygous variants (experimental)
- --fast
- Cut some corners to run faster. Unsupported!
- --skipUnrecognizedContigs
- Do not abort when told to process a reference window (via
-w/--referenceWindow[s]) that has no aligned coverage. Outputs
emptyish files if there are no remaining non-degenerate windows. Only
intended for use by smrtpipe scatter/gather.
- --sortStrategy
- Read sorting strategy. ["longest_and_strand_balanced"]
- --minPoaCoverage
- Minimum number of reads required within a window to call consensus and
variants using arrow or poa. [3]
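
To give a feel for --referenceChunkSize and --referenceChunkOverlap, the sketch below splits a contig into fixed-size chunks padded by an overlap on each side. This is plain fixed-size chunking written for illustration, not gcpp's algorithm: by default gcpp chunks adaptively (see --simpleChunking), and how it applies the overlap is not documented here. The function name and the half-open coordinate convention are assumptions.

```python
def chunk_reference(length, chunk_size=500, overlap=5):
    """Split [0, length) into fixed-size chunks, each padded by `overlap`.

    Returns half-open (start, end) pairs clipped to the contig bounds.
    Illustrative only; gcpp's own chunking may differ.
    """
    if chunk_size <= 0 or overlap < 0:
        raise ValueError("chunk_size must be positive and overlap non-negative")
    chunks = []
    for start in range(0, length, chunk_size):
        end = min(start + chunk_size, length)
        chunks.append((max(0, start - overlap), min(length, end + overlap)))
    return chunks

# A hypothetical 1200-base contig with the default values from this page.
for chunk in chunk_reference(1200):
    print(chunk)
```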
OPTIONS
- -h,--help
- Output this help.
- --log-level,--logLevel
- Set log level. ["INFO"]
- --version
- Output version info.
- --emit-tool-contract
- Emit tool contract.
- --resolved-tool-contract
- Use args from resolved tool contract.
Arguments:
- INPUT
- The input BAM alignment file
AUTHOR
This manpage was written by Andreas Tille for the Debian distribution and can be
used for any other usage of the program.