SHASTA(1) User Commands SHASTA(1)

NAME

shasta - nanopore whole genome assembly tool

DESCRIPTION

Options allowed only on the command line:

Write a help message.
Identify the Shasta version.
Configuration file name.
Names of input files containing reads. Specify at least one.
Name of the output directory. If command is assemble, this directory must not exist.
Command to run. Must be one of: assemble, saveBinaryData, cleanupBinaryData, explore, createBashCompletionScript.
Specify whether allocated memory is anonymous or backed by a filesystem. Allowed values: anonymous, filesystem.
Specify the type of pages used to back memory. Allowed values: disk, 4K, 2M (for best performance). All combinations (memoryMode, memoryBacking) are allowed except for (anonymous, disk). Some combinations require root privilege, which is obtained using sudo and may result in a password prompt, depending on your sudo setup.
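
The (memoryMode, memoryBacking) rule above can be checked before launching a run. The following Python sketch is illustrative only and is not part of Shasta; the function and constant names are hypothetical, and the allowed values are copied from the two descriptions above.

```python
# Hypothetical helper, not part of Shasta: validates a
# (memoryMode, memoryBacking) pair against the rule stated above.
ALLOWED_MEMORY_MODES = ("anonymous", "filesystem")
ALLOWED_MEMORY_BACKINGS = ("disk", "4K", "2M")


def is_allowed_combination(memory_mode, memory_backing):
    """Return True if the pair is one of the allowed combinations."""
    if memory_mode not in ALLOWED_MEMORY_MODES:
        return False
    if memory_backing not in ALLOWED_MEMORY_BACKINGS:
        return False
    # The only excluded pair named in the description is (anonymous, disk).
    return (memory_mode, memory_backing) != ("anonymous", "disk")


# Enumerate every allowed pair.
allowed_pairs = [
    (mode, backing)
    for mode in ALLOWED_MEMORY_MODES
    for backing in ALLOWED_MEMORY_BACKINGS
    if is_allowed_combination(mode, backing)
]
```

This sketch does not say which of the allowed pairs need root privilege; the description above only states that some do.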
Number of threads, or 0 to use one thread per virtual processor.
Specify allowed access for --command explore. Allowed values: user, local, unrestricted. DO NOT CHANGE FROM DEFAULT VALUE WITHOUT UNDERSTANDING THE SECURITY IMPLICATIONS.
Port to be used by the HTTP server (--command explore).

Options allowed on the command line and in the config file:

Read length cutoff. Shorter reads are discarded.
If set, skip the Linux cache when loading reads. This is done by specifying the O_DIRECT flag when opening input files containing reads.
Skip flagging palindromic reads. Oxford Nanopore reads should be flagged for better results.
Used for palindromic read detection.
Used for palindromic read detection.
Used for palindromic read detection.
Used for palindromic read detection.
Used for palindromic read detection.
Used for palindromic read detection.
Method to generate marker k-mers: 0 = random, 1 = random, excluding globally overenriched, 2 = random, excluding overenriched even in a single read, 3 = read from file.
Length of marker k-mers (in run-length space).
Fraction of k-mers used as markers.
Enrichment threshold for Kmers.generationMethod 1 and 2.
The absolute path of a file containing the k-mers to be used as markers, one per line. A relative path is not accepted. Only used if Kmers.generationMethod is 3.
Controls the version of the LowHash algorithm to use. Can be 0 (default) or 1 (experimental).
The number of consecutive markers that define a MinHash/LowHash feature.
Defines how low a hash has to be to be used with the LowHash algorithm.
The number of MinHash/LowHash iterations, or 0 to let --MinHash.alignmentCandidatesPerRead control the number of iterations.
If --MinHash.minHashIterationCount is 0, MinHash iteration is stopped when the average number of alignment candidates that each read is involved in reaches this value. If --MinHash.minHashIterationCount is not 0, this is not used.
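
The interaction between --MinHash.minHashIterationCount and --MinHash.alignmentCandidatesPerRead described above can be summarized as a stopping rule. The Python sketch below is an illustration under that reading, not Shasta's implementation; the function and its parameter names are hypothetical.

```python
def should_continue(iterations_done, min_hash_iteration_count,
                    average_candidates_per_read,
                    alignment_candidates_per_read):
    """Decide whether another MinHash/LowHash iteration should run.

    Hypothetical helper mirroring the two option descriptions:
    - a nonzero iteration count fixes the number of iterations and the
      candidates-per-read target is not used;
    - an iteration count of 0 means iterate until the average number of
      alignment candidates per read reaches the target.
    """
    if min_hash_iteration_count != 0:
        return iterations_done < min_hash_iteration_count
    return average_candidates_per_read < alignment_candidates_per_read
```

For example, with an iteration count of 0 and a target of 20, iteration continues at an average of 19.9 candidates per read and stops once the average reaches 20.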
The minimum bucket size to be used by the LowHash algorithm.
The maximum bucket size to be used by the LowHash algorithm.
The minimum number of times a pair of reads must be found by the MinHash/LowHash algorithm in order to be considered a candidate alignment.
Skip the MinHash algorithm and mark all pairs of reads as alignment candidates with both orientations. This should only be used for experimentation on very small runs because it is very time consuming.
The alignment method to be used to create the read graph and the marker graph. 0 = old Shasta method, 1 = SeqAn (slow), 3 = banded SeqAn.
The maximum number of markers that an alignment is allowed to skip.
The maximum amount of marker drift that an alignment is allowed to tolerate between successive markers.
The maximum number of unaligned markers tolerated at the beginning and end of an alignment.
Marker frequency threshold. Markers more frequent than this value in either of two oriented reads being aligned are discarded and not used to compute the alignment.
The minimum number of aligned markers for an alignment to be used.
The minimum fraction of aligned markers for an alignment to be used.
Match score for marker alignments (only used for alignment methods 1 and 3).
Mismatch score for marker alignments (only used for alignment methods 1 and 3).
Gap score for marker alignments (only used for alignment methods 1 and 3).
Downsampling factor (only used for alignment method 3).
Amount to extend the downsampled band (only used for alignment method 3).
If not zero, alignments between reads from the same nanopore channel and close in time are suppressed. The "read" metadata fields from the FASTA or FASTQ header are checked. If their difference, in absolute value, is less than the value of this option, the alignment is suppressed. This can help avoid assembly artifacts. This check is only done if the two reads have identical metadata fields "runid", "sampleid", and "ch". If any of these metadata fields are missing, this check is suppressed and this option has no effect.
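
The same-channel check described above can be sketched in Python as follows. This is an illustration of the stated rule, not Shasta's code. The function names are hypothetical, and two points are assumptions: that header metadata appears as whitespace-separated key=value tokens, and that a missing "read" field also disables the check.

```python
def parse_meta(header):
    """Collect key=value tokens from a FASTA/FASTQ header line.

    Assumption: metadata is stored as whitespace-separated key=value
    tokens after the read name.
    """
    meta = {}
    for token in header.split()[1:]:
        key, sep, value = token.partition("=")
        if sep:
            meta[key] = value
    return meta


def is_suppressed(meta0, meta1, threshold):
    """Return True if the alignment between two reads would be suppressed."""
    if threshold == 0:
        return False  # Option disabled.
    for field in ("runid", "sampleid", "ch", "read"):
        if field not in meta0 or field not in meta1:
            return False  # Missing metadata: the check is skipped.
    for field in ("runid", "sampleid", "ch"):
        if meta0[field] != meta1[field]:
            return False  # Different run, sample, or channel.
    # Same channel: suppress if the "read" fields are close.
    return abs(int(meta0["read"]) - int(meta1["read"])) < threshold
```

With a threshold of 10, two reads from the same run, sample, and channel whose "read" fields are 100 and 105 are suppressed; with a threshold of 5 they are not, because the difference must be strictly less than the threshold.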
Suppress containment alignments, that is, alignments in which one read is entirely contained in another read, except possibly for up to maxTrim markers at the beginning and end.
The method used to create the read graph (0 = undirected, default, 1 = directed, experimental).
The maximum number of alignments to be kept for each read.
The minimum size (number of oriented reads) of a connected component of the read graph to be kept. This is currently ignored.
Used for chimeric read detection.
Maximum distance (edges) for flagCrossStrandReadGraphEdges. Set this to zero to entirely suppress flagCrossStrandReadGraphEdges.
Maximum number of alignments to be kept for each contained read (only used when creationMethod is 1).
Maximum number of alignments to be kept in each direction (forward, backward) for each uncontained read (only used when creationMethod is 1).
Remove conflicts from the read graph. Experimental - do not use.
Minimum number of markers for a marker graph vertex.
Maximum number of markers for a marker graph vertex.
Used during approximate transitive reduction. Marker graph edges with coverage lower than this value are always marked as removed regardless of reachability.
Used during approximate transitive reduction. Marker graph edges with coverage higher than this value are never marked as removed regardless of reachability.
Used during approximate transitive reduction.
Used during approximate transitive reduction.
Number of prune iterations.
Maximum lengths (in markers) used at each iteration of simplifyMarkerGraph.
Experimental. Cross edge coverage threshold. If this is not zero, assembly graph cross-edges with average edge coverage less than this value are removed, together with the corresponding marker graph edges. A cross edge is defined as an edge v0->v1 with out-degree(v0)>1, in-degree(v1)>1.
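
The cross edge definition above can be illustrated on a small directed graph. The Python sketch below is not Shasta's implementation; the function names are hypothetical, and it assumes a graph without parallel edges, given as a list of (v0, v1) pairs with a per-edge average coverage.

```python
from collections import Counter


def find_cross_edges(edges):
    """Return edges v0->v1 with out-degree(v0) > 1 and in-degree(v1) > 1."""
    out_degree = Counter(v0 for v0, _ in edges)
    in_degree = Counter(v1 for _, v1 in edges)
    return [
        (v0, v1) for v0, v1 in edges
        if out_degree[v0] > 1 and in_degree[v1] > 1
    ]


def cross_edges_to_remove(edges, average_coverage, threshold):
    """Return cross edges whose average coverage is below the threshold.

    A threshold of 0 disables removal, as stated in the description.
    """
    if threshold == 0:
        return []
    return [
        edge for edge in find_cross_edges(edges)
        if average_coverage[edge] < threshold
    ]


# Toy graph: A and B each branch, and D is reached from both.
edges = [("A", "C"), ("A", "D"), ("B", "D"), ("B", "E")]
coverage = {("A", "C"): 30.0, ("A", "D"): 2.0,
            ("B", "D"): 25.0, ("B", "E"): 28.0}
```

In this toy graph both A->D and B->D satisfy the degree condition, but with a threshold of 3 only A->D, with average coverage 2, would be selected for removal.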
Experimental. Length threshold, in markers, for the marker graph refinement step, or 0 to turn off the refinement step.
Perform approximate reverse transitive reduction of the marker graph.
Maximum average edge coverage for a cross edge of the assembly graph to be removed.
Controls assembly of long marker graph edges.
Selects the consensus caller for repeat counts. See the documentation for available choices.
Used to request storing coverage data in binary format.
Used to specify the minimum length of an assembled segment for which coverage data in csv format should be stored. If 0, no coverage data in csv format is stored.
Used to request writing the reads that contributed to assembling each segment.
Experimental. Specify the method used to detangle the assembly graph. 0 = no detangling, 1 = basic detangling.
August 2020 shasta