GCPP(1) User Commands GCPP(1)

NAME

gcpp - Compute genomic consensus from alignments and call variants relative to the reference

SYNOPSIS

gcpp [options] INPUT

DESCRIPTION

Compute genomic consensus from alignments and call variants relative to the reference.

Basic required options:

--referenceFilename,--reference,-r
The filename of the reference FASTA file.
--outputFilenames,-o
The output filename(s), as a comma-separated list. Valid output formats are .fa/.fasta, .fq/.fastq, .gff, and .vcf.

Parallelism:

--numThreads,-j
The number of threads to be used. [1]
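
As a minimal sketch of how the required options combine with the INPUT argument, the Python snippet below assembles a gcpp command line. The file names (aligned.bam, reference.fasta, consensus.fasta, variants.gff) are hypothetical placeholders, and the helper build_gcpp_command is an illustration, not part of gcpp; only the flags -r, -o and -j come from this page.

```python
import shlex


def build_gcpp_command(input_bam, reference, outputs, threads=1):
    """Return an argv list for a basic gcpp run.

    Illustrative helper only: gcpp itself provides no such function.
    """
    return [
        "gcpp",
        "-r", reference,          # --referenceFilename
        "-o", ",".join(outputs),  # --outputFilenames takes a comma-separated list
        "-j", str(threads),       # --numThreads
        input_bam,                # positional INPUT (BAM alignment file)
    ]


# Hypothetical file names, chosen only for the example.
cmd = build_gcpp_command(
    "aligned.bam",
    "reference.fasta",
    ["consensus.fasta", "variants.gff"],
    threads=4,
)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a system where gcpp is installed; building a list rather than a single string avoids shell quoting problems with unusual file names.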

Output filtering:

--minConfidence,-q
The minimum confidence for a variant call to be output to variants.{gff,vcf} [40]
--minCoverage,-x
The minimum site coverage that must be achieved for variant calls and consensus to be calculated for a site. [5]
--noEvidenceConsensusCall
The consensus base that will be output for sites with no effective coverage. ["lowercasereference"]
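
The two numeric thresholds above can be read as a per-site gate. The sketch below is one interpretation of that wording, not gcpp's actual implementation: it assumes both thresholds are inclusive, and the VariantCall record with its field names is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """Hypothetical record for one candidate variant (not a gcpp type)."""
    position: int
    coverage: int    # site coverage
    confidence: int  # confidence of the call


def passes_filters(call, min_coverage=5, min_confidence=40):
    """Assumed reading of -x and -q: keep a call only if both minimums are met."""
    return call.coverage >= min_coverage and call.confidence >= min_confidence


# Made-up values, for illustration only.
calls = [
    VariantCall(position=101, coverage=30, confidence=93),  # meets both thresholds
    VariantCall(position=250, coverage=4, confidence=60),   # coverage below 5
    VariantCall(position=377, coverage=18, confidence=25),  # confidence below 40
]
kept = [c.position for c in calls if passes_filters(c)]
print(kept)
```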

Read selection/filtering:

--coverage,-X
A designation of the maximum coverage level to be used for analysis. Exact interpretation is algorithm-specific. [100]
--minAccuracy
The minimum acceptable window-global alignment accuracy for reads that will be used for the analysis (arrow-only). [0.82]
--minMapQV,-m
The minimum MapQV for reads that will be used for analysis. [10]
--minReadScore
The minimum ReadScore for reads that will be used for analysis (arrow-only). [0.65]
--minSnr
The minimum acceptable signal-to-noise over all channels for reads that will be used for analysis (arrow-only). [3.75]
--minZScore
The minimum acceptable z-score for reads that will be used for analysis (arrow-only). [-3.4]
--barcode,--barcodes
Comma-separated list of barcode pairs to analyze, either by name, such as 'lbc1--lbc1', or by index, such as '0--0'. NOTE: Filtering barcodes by name requires a barcode file.
--barcodeFile
FASTA file of the barcode sequences used. NOTE: Only used to find barcode names.
--referenceWindow,--referenceWindows,-w
The window (or multiple comma-delimited windows) of the reference to be processed, in the format refGroup:refStart-refEnd (default: entire reference).
--referenceWindowsFile,-W
A file containing reference window designations, one per line
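
The value formats accepted by -w, -W and --barcodes can be sketched as small string builders. This is a rough illustration: the helper names and the contig names chr1 and chr2 are hypothetical, and this page does not say whether window coordinates are 0- or 1-based or whether the end is inclusive, so the code only fixes the textual layout.

```python
def format_window(ref_group, ref_start, ref_end):
    """Format one window as refGroup:refStart-refEnd."""
    if ref_start < 0 or ref_end < ref_start:
        raise ValueError(f"invalid window bounds: {ref_start}-{ref_end}")
    return f"{ref_group}:{ref_start}-{ref_end}"


def windows_option(windows):
    """Comma-delimited value for -w/--referenceWindows."""
    return ",".join(format_window(*w) for w in windows)


def windows_file_text(windows):
    """Contents for a -W/--referenceWindowsFile file: one window per line."""
    return "".join(format_window(*w) + "\n" for w in windows)


def barcodes_option(pairs):
    """Comma-separated barcode pairs, each written as first--second."""
    return ",".join(f"{first}--{second}" for first, second in pairs)


# Hypothetical contigs and ranges.
windows = [("chr1", 0, 5000), ("chr2", 100, 200)]
print(windows_option(windows))
print(windows_file_text(windows), end="")
print(barcodes_option([(0, 0), ("lbc1", "lbc1")]))
```

As noted above, pairs given by name (such as lbc1--lbc1) additionally require --barcodeFile.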

Algorithm and parameter settings:

--algorithm
The consensus algorithm used. ["arrow"]
--maskRadius
Radius of window to use when excluding local regions for exceeding maskErrorRate, where 0 disables any filtering (arrow-only). [0]
--maskErrorRate
Maximum local error rate before the local region defined by maskRadius is excluded from polishing (arrow-only). [0]
--parametersFile,-P
Parameter set filename (such as ArrowParameters.json or QuiverParameters.ini), or directory D such that either D/*/GenomicConsensus/QuiverParameters.ini, or D/GenomicConsensus/QuiverParameters.ini, is found. In the former case, the lexically largest path is chosen.
--parametersSpec,-p
Name of parameter set (chemistry.model) to select from the parameters file, or just the name of the chemistry, in which case the best available model is chosen. Default is 'auto', which selects the best parameter set from the alignment data. ["auto"]
--maxIterations
Maximum number of iterations to polish the template. [40]
--maxPoaCoverage
Maximum number of sequences to use for consensus calling. [11]
--mutationSeparation
Find the best mutations within a separation window for iterative polishing. [10]
--mutationNeighborhood
Find nearby mutations within neighborhood for iterative polishing. [10]
--readStumpinessThreshold
Filter out reads whose aligned length along a subread is lower than a percentage of its corresponding reference length. [0.1]

Verbosity and debugging:

--logFile
Log to a file, instead of STDERR.
--dumpEvidence,-d
Dump evidence data
--evidenceDirectory
Directory to dump evidence into.
--annotateGFF
Augment GFF variant records with additional information
--reportEffectiveCoverage
Additionally record the *post-filtering* coverage at variant sites

Advanced configuration options:

--referenceChunkSize,-C
Size of reference chunks. [500]
--referenceChunkOverlap
Size of reference chunk overlaps. [5]
--simpleChunking
Disable adaptive reference chunking.
--diploid
Enable detection of heterozygous variants (experimental)
--fast
Cut some corners to run faster. Unsupported!
--skipUnrecognizedContigs
Do not abort when told to process a reference window (via -w/--referenceWindow[s]) that has no aligned coverage. Outputs emptyish files if there are no remaining non-degenerate windows. Only intended for use by smrtpipe scatter/gather.
--sortStrategy
Read sorting strategy. ["longest_and_strand_balanced"]
--minPoaCoverage
Minimum number of reads required within a window to call consensus and variants using arrow or poa. [3]

OPTIONS

-h,--help
Output this help.
--log-level,--logLevel
Set log level. ["INFO"]
--version
Output version info.
--emit-tool-contract
Emit tool contract.
--resolved-tool-contract
Use args from resolved tool contract.

Arguments:

INPUT
The input BAM alignment file

AUTHOR

This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage of the program.
October 2018 gcpp 3.1.0