NAME

diamond - accelerated BLAST-compatible local sequence aligner

SYNOPSIS

diamond COMMAND [OPTIONS]

DESCRIPTION

DIAMOND is a sequence aligner for protein and translated DNA searches that functions as a drop-in replacement for the NCBI BLAST software tools. It is suitable for protein-protein search as well as DNA-protein search on short reads and on longer sequences, including contigs and assemblies, and provides a speedup over BLAST of up to 20,000x.

COMMANDS

makedb: Build DIAMOND database from a FASTA file
blastp: Align amino acid query sequences against a protein reference database
blastx: Align DNA query sequences against a protein reference database
view: View DIAMOND alignment archive (DAA) formatted file
help: Produce help message
version: Display version information
getseq: Retrieve sequences from a DIAMOND database file

OPTIONS

General options:

--threads (-p): number of CPU threads
--db (-d): database file
--out (-o): output file
--outfmt (-f): output format

    0 = BLAST pairwise
    5 = BLAST XML
    6 = BLAST tabular
    100 = DIAMOND alignment archive (DAA)
    101 = SAM
    Value 6 may be followed by a space-separated list of these keywords:
        qseqid means Query Seq-id
        qlen means Query sequence length
        sseqid means Subject Seq-id
        sallseqid means All subject Seq-id(s), separated by a ';'
        slen means Subject sequence length
        qstart means Start of alignment in query
        qend means End of alignment in query
        sstart means Start of alignment in subject
        send means End of alignment in subject
        qseq means Aligned part of query sequence
        sseq means Aligned part of subject sequence
        evalue means Expect value
        bitscore means Bit score
        score means Raw score
        length means Alignment length
        pident means Percentage of identical matches
        nident means Number of identical matches
        mismatch means Number of mismatches
        positive means Number of positive-scoring matches
        gapopen means Number of gap openings
        gaps means Total number of gaps
        ppos means Percentage of positive-scoring matches
        qframe means Query frame
        btop means Blast traceback operations (BTOP)
        stitle means Subject Title
        salltitles means All Subject Title(s), separated by a '<>'
        qcovhsp means Query Coverage Per HSP
        qtitle means Query title
    Default: qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore

--verbose (-v): verbose console output
--log: enable debug log
--quiet: disable console output
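
The default column list given under --outfmt can be read back into named fields with a few lines of Python. The following is a minimal sketch, not part of DIAMOND itself: it assumes the BLAST tabular output is tab-separated with one alignment per line and no header, and the function name parse_tabular as well as the sample line are made up for illustration.

```python
import io

# Default column order for --outfmt 6, copied from the option description.
DEFAULT_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

# Columns converted to numbers; everything else stays a string.
INT_FIELDS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_FIELDS = {"pident", "evalue", "bitscore"}


def convert(name, value):
    """Convert one field to int or float where the column is numeric."""
    if name in INT_FIELDS:
        return int(value)
    if name in FLOAT_FIELDS:
        return float(value)
    return value


def parse_tabular(handle, columns=DEFAULT_COLUMNS):
    """Yield one dict per non-empty line of tab-separated alignment output."""
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        values = line.split("\t")
        yield {name: convert(name, value) for name, value in zip(columns, values)}


if __name__ == "__main__":
    # Hypothetical output line, invented for this example.
    sample = "read1\tprotA\t95.5\t100\t4\t0\t1\t300\t5\t104\t1e-50\t180.5\n"
    for hit in parse_tabular(io.StringIO(sample)):
        print(hit["qseqid"], hit["sseqid"], hit["pident"], hit["evalue"])
```

If a custom keyword list was passed after the value 6, the same list would have to be passed as the columns argument, and the numeric field sets extended accordingly.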

Makedb options:

--in: input reference file in FASTA format

Aligner options:

--query (-q): input query file
--un: file for unaligned queries
--unal: report unaligned queries (0=no, 1=yes)
--max-target-seqs (-k): maximum number of target sequences to report alignments for
--top: report alignments within this percentage range of top alignment score (overrides --max-target-seqs)
--compress: compression for output files (0=none, 1=gzip)
--evalue (-e): maximum e-value to report alignments
--min-score: minimum bit score to report alignments (overrides e-value setting)
--id: minimum identity% to report an alignment
--query-cover: minimum query cover% to report an alignment
--subject-cover: minimum subject cover% to report an alignment
--sensitive: enable sensitive mode (default: fast)
--more-sensitive: enable more sensitive mode (default: fast)
--block-size (-b): sequence block size in billions of letters (default=2.0)
--index-chunks (-c): number of chunks for index processing
--tmpdir (-t): directory for temporary files
--gapopen: gap open penalty (default=11 for protein)
--gapextend: gap extension penalty (default=1 for protein)
--matrix: score matrix for protein alignment (default=BLOSUM62)
--custom-matrix: file containing custom scoring matrix
--lambda: lambda parameter for custom matrix
--K: K parameter for custom matrix
--comp-based-stats: enable composition-based statistics (0 or 1, default=1)
--seg: enable SEG masking of queries (yes/no)
--query-gencode: genetic code to use to translate query (see user manual)
--salltitles: print full subject titles in output files
--no-self-hits: suppress reporting of identical self hits
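
As an illustration of how the commands and options above fit together, the sketch below assembles argument lists for a makedb run followed by a blastp run. It only builds and prints the command lines and does not execute them. The helper names and all file names are hypothetical; the flags are the ones documented on this page.

```python
import shlex


def makedb_command(fasta_path, db_path):
    """Argument list for building a DIAMOND database from a FASTA file."""
    return ["diamond", "makedb", "--in", fasta_path, "--db", db_path]


def blastp_command(db_path, query_path, out_path,
                   evalue=None, max_target_seqs=None, sensitive=False):
    """Argument list for a protein search with BLAST tabular output."""
    cmd = [
        "diamond", "blastp",
        "--db", db_path,
        "--query", query_path,
        "--out", out_path,
        "--outfmt", "6",
    ]
    if evalue is not None:
        cmd += ["--evalue", str(evalue)]
    if max_target_seqs is not None:
        cmd += ["--max-target-seqs", str(max_target_seqs)]
    if sensitive:
        cmd.append("--sensitive")
    return cmd


def to_shell(cmd):
    """Render an argument list as a single shell-quoted string."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # Hypothetical file names, chosen only for this example.
    print(to_shell(makedb_command("reference.faa", "reference")))
    print(to_shell(blastp_command("reference", "queries.faa", "hits.tsv",
                                  evalue=0.001, max_target_seqs=10)))
```

Keeping the command as a list rather than a string means it could be handed to subprocess.run unchanged, without shell quoting concerns.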

Advanced options:

--min-orf (-l): ignore translated sequences without an open reading frame of at least this length
--freq-sd: number of standard deviations for ignoring frequent seeds
--id2: minimum number of identities for stage 1 hit
--window (-w): window size for local hit search
--xdrop (-x): xdrop for ungapped alignment
--ungapped-score: minimum alignment score to continue local extension
--hit-band: band for hit verification
--hit-score: minimum score to keep a tentative alignment
--gapped-xdrop (-X): xdrop for gapped alignment in bits
--band: band for dynamic programming computation
--shapes (-s): number of seed shapes (0 = all available)
--shape-mask: seed shapes
--index-mode: index mode (0=4x12, 1=16x9)
--fetch-size: trace point fetch size
--rank-factor: include subjects within this range of max-target-seqs
--rank-ratio: include subjects within this ratio of last hit
--max-hsps: maximum number of HSPs per subject sequence to save for each query
--dbsize: effective database size (in letters)
--no-auto-append: disable auto appending of DAA and DMND file extensions
--target-fetch-size: number of target sequences to fetch for seed extension

View options:

--daa (-a): DIAMOND alignment archive (DAA) file
--forwardonly: only show alignments of forward strand

Getseq options:

--seq: sequence numbers to display

AUTHOR

This manpage was written by Andreas Tille for the Debian distribution and may be used wherever else the program is distributed.

January 2017

diamond 0.8.31

Source file:	diamond-aligner.1.en.gz (from diamond-aligner 0.9.24+dfsg-1)
Source last updated:	2018-12-29T20:10:22Z