RAZERS(1)
NAME
razers - Fast Read Mapping with Sensitivity Control
SYNOPSIS
razers [OPTIONS] <GENOME FILE> <READS FILE>
razers [OPTIONS] <GENOME FILE> <MP-READS FILE1> <MP-READS FILE2>
DESCRIPTION
RazerS is a versatile, fully sensitive read mapper based on a k-mer counting filter. It supports single-end and paired-end mapping and optimally parametrizes the filter based on a user-defined minimal sensitivity. See http://www.seqan.de/projects/razers for more information.
Input to RazerS is a reference genome file and either one file with single-end reads or two files containing the left and right mates of paired-end reads. Use - to read single-end reads from stdin.
(c) Copyright 2009 by David Weese.
REQUIRED ARGUMENTS
- ARGUMENT 0 INPUT_FILE
- A reference genome file. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
- READS List of INPUT_FILE's
- Either one (single-end) or two (paired-end) read files. Valid filetypes are: .sam[.*], .raw[.*], .gbk[.*], .frn[.*], .fq[.*], .fna[.*], .ffn[.*], .fastq[.*], .fasta[.*], .faa[.*], .fa[.*], .embl[.*], and .bam, where * is any of the following extensions: gz, bz2, and bgzf for transparent (de)compression.
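The two invocation forms can also be assembled from a script. The following is a minimal Python sketch, not part of RazerS: the helper name build_razers_command and the file names are hypothetical, and the helper only arranges arguments in the order given in the SYNOPSIS (options, genome file, then one or two read files) without validating file types or option values.

```python
from typing import List, Sequence


def build_razers_command(genome: str, reads: Sequence[str],
                         options: Sequence[str] = ()) -> List[str]:
    """Assemble a razers argument list following the SYNOPSIS.

    Hypothetical helper: it only checks the number of read files.
    """
    if len(reads) not in (1, 2):
        raise ValueError("expected one (single-end) or two (paired-end) read files")
    return ["razers", *options, genome, *reads]


# Single-end: one read file, with an explicit percent identity.
single = build_razers_command("genome.fa", ["reads.fa"], ["-i", "92"])

# Paired-end: two files holding the left and right mates.
paired = build_razers_command("genome.fa", ["reads_1.fa", "reads_2.fa"])
```

On a system where razers is installed, such a list could be passed to subprocess.run(cmd, check=True).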
OPTIONS
- -h, --help
- Display the help message.
- --version
- Display version information.
Main Options:
- -f, --forward
- Map reads only to forward strands.
- -r, --reverse
- Map reads only to reverse strands.
- -i, --percent-identity DOUBLE
- Percent identity threshold. In range [50..100]. Default: 92.
- -rr, --recognition-rate DOUBLE
- Percent recognition rate. In range [80..100]. Default: 99.
- -pd, --param-dir STRING
- Read user-computed parameter files in the directory <DIR>.
- -id, --indels
- Allow indels. Default: mismatches only.
- -ll, --library-length INTEGER
- Paired-end library length. In range [1..inf]. Default: 220.
- -le, --library-error INTEGER
- Paired-end library length tolerance. In range [0..inf]. Default: 50.
- -m, --max-hits INTEGER
- Output only <NUM> of the best hits. In range [1..inf]. Default: 100.
- --unique
- Output only unique best matches (-m 1 -dr 0 -pa).
- -tr, --trim-reads INTEGER
- Trim reads to the given length. In range [14..inf]. Default: off.
- -o, --output OUTPUT_FILE
- Change output filename (use - to dump to stdout in razers format). Default: <READS FILE>.razers. Valid filetypes are: .razers, .gff, .fasta, .fa, and .eland.
- -v, --verbose
- Verbose mode.
- -vv, --vverbose
- Very verbose mode.
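To get a feel for what the -i, -ll, and -le values mean for a concrete data set, the sketch below converts them into an error budget per read and an accepted range of pair distances. It is an illustration under two assumptions that this page does not state: that the error budget is the read length times (100 - percent identity)/100, rounded down, and that the library tolerance is applied symmetrically around the library length. The function names are hypothetical.

```python
import math
from typing import Tuple


def max_errors(read_length: int, percent_identity: float = 92.0) -> int:
    """Errors tolerated for one read, assuming floor(length * error rate)."""
    if not 50.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie in [50..100]")
    # Multiply before dividing so whole-number inputs stay exact.
    return math.floor(read_length * (100.0 - percent_identity) / 100.0)


def library_window(library_length: int = 220,
                   library_error: int = 50) -> Tuple[int, int]:
    """Accepted pair distance range, assuming length +/- tolerance."""
    return (library_length - library_error, library_length + library_error)
```

Under these assumptions, a 36-base read at the default 92 percent identity tolerates 2 errors, and the default library settings accept pair distances from 170 to 270.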
Output Format Options:
- -a, --alignment
- Dump the alignment for each match (only razer or fasta format).
- -pa, --purge-ambiguous
- Purge reads with more than <max-hits> best matches.
- -dr, --distance-range INTEGER
- Only consider matches with at most <NUM> more errors than the best match. Default: output all.
- -gn, --genome-naming INTEGER
- Select how genomes are named (see Naming section below). In range [0..1]. Default: 0.
- -rn, --read-naming INTEGER
- Select how reads are named (see Naming section below). In range [0..2]. Default: 0.
- -so, --sort-order INTEGER
- Select how matches are sorted (see Sorting section below). In range [0..1]. Default: 0.
- -pf, --position-format INTEGER
- Select begin/end position numbering (see Coordinate section below). In range [0..1]. Default: 0.
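The interplay of -m, -dr, and -pa, and the --unique shorthand (-m 1 -dr 0 -pa), can be sketched as a small filter over the matches of a single read. This is one plausible reading of the option descriptions above, not RazerS's actual implementation; the Match tuple layout and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# Illustrative match record: (genome position, number of errors).
Match = Tuple[int, int]


def select_hits(matches: List[Match], max_hits: int = 100,
                distance_range: Optional[int] = None,
                purge_ambiguous: bool = False) -> List[Match]:
    """Filter the matches of one read in the spirit of -m, -dr, and -pa."""
    if not matches:
        return []
    best = min(errors for _, errors in matches)
    best_count = sum(1 for _, errors in matches if errors == best)
    # -pa: drop the read if it has more than max_hits equally good best matches.
    if purge_ambiguous and best_count > max_hits:
        return []
    # -dr: keep only matches within distance_range errors of the best one.
    if distance_range is not None:
        matches = [m for m in matches if m[1] <= best + distance_range]
    # -m: report at most max_hits matches, fewest errors first.
    return sorted(matches, key=lambda m: m[1])[:max_hits]


def select_unique(matches: List[Match]) -> List[Match]:
    """Equivalent of --unique, i.e. -m 1 -dr 0 -pa."""
    return select_hits(matches, max_hits=1, distance_range=0,
                       purge_ambiguous=True)
```

In this reading, a read with a single best match keeps that match under --unique, while a read with two equally good best matches is dropped entirely.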
Filtration Options:
- -s, --shape STRING
- Manually set k-mer shape. Default: 11111111111.
- -t, --threshold INTEGER
- Manually set minimum k-mer count threshold. In range [1..inf].
- -oc, --overabundance-cut INTEGER
- Set k-mer overabundance cut ratio. In range [0..1].
- -rl, --repeat-length INTEGER
- Skip simple-repeats of length <NUM>. In range [1..inf]. Default: 1000.
- -tl, --taboo-length INTEGER
- Set taboo length. In range [1..inf]. Default: 1.
- -lm, --low-memory
- Decrease memory usage at the expense of runtime.
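For orientation when setting -s and -t by hand, the sketch below computes the span and weight of a shape string and the classical q-gram lemma threshold for an ungapped shape: a read of length n and a match with at most e errors share at least n + 1 - (e + 1) * q k-mers of length q. It assumes that shape strings consist of 1 (position used) and 0 (position ignored), which this page does not spell out, and it does not reproduce how RazerS itself selects parameters for a given recognition rate. The function names are hypothetical.

```python
from typing import Tuple


def shape_span_and_weight(shape: str) -> Tuple[int, int]:
    """Return (span, weight) of a shape string made of '1' and '0'."""
    if not shape or set(shape) - {"0", "1"}:
        raise ValueError("shape must be a non-empty string of 0s and 1s")
    # Span is the total length; weight is the number of used positions.
    return len(shape), shape.count("1")


def qgram_lemma_threshold(read_length: int, q: int, errors: int) -> int:
    """Minimum number of shared ungapped q-grams (q-gram lemma).

    A result of 0 means the lemma gives no usable filter for these values.
    """
    return max(read_length + 1 - (errors + 1) * q, 0)
```

For the default shape 11111111111 (q = 11), a 36-base read with 2 errors yields a threshold of 4, while 3 errors yield 0, so a shorter shape would be needed for a lossless filter in that case.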
Verification Options:
- -mN, --match-N
- N matches all other characters. Default: N matches nothing.
- -ed, --error-distr STRING
- Write error distribution to FILE.
- -mcl, --min-clipped-len INTEGER
- Set minimal read length for read clipping. In range [0..inf]. Default: 0.
- -qih, --quality-in-header
- Quality string in fasta header.
FORMATS, NAMING, SORTING, AND COORDINATE SCHEMES
RazerS supports various output formats. The output format is detected automatically from the file name suffix.
- .razers
- Razer format
- .fa, .fasta
- Enhanced Fasta format
- .eland
- Eland format
- .gff
- GFF format
By default, reads and contigs are referred to by the Fasta ids given in the input files. This behaviour can be changed with the -gn and -rn options:
- 0
- Use Fasta id.
- 1
- Enumerate beginning with 1.
- 2
- Use the read sequence (only for short reads!).
The way matches are sorted in the output file can be changed with the -so option for the following formats: razer, fasta, sam, and amos. Primary and secondary sort keys are:
- 0
- 1. read number, 2. genome position
- 1
- 1. genome position, 2. read number
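The two sort orders amount to swapping the primary and secondary sort keys. A minimal Python sketch, assuming each match is reduced to a hypothetical (read number, genome position) pair; real output records carry more fields:

```python
from typing import List, Tuple

# Illustrative match record: (read number, genome position).
Match = Tuple[int, int]


def sort_matches(matches: List[Match], sort_order: int = 0) -> List[Match]:
    """Sort matches in the spirit of the -so option."""
    if sort_order == 0:
        # Primary key: read number; secondary key: genome position.
        return sorted(matches, key=lambda m: (m[0], m[1]))
    if sort_order == 1:
        # Primary key: genome position; secondary key: read number.
        return sorted(matches, key=lambda m: (m[1], m[0]))
    raise ValueError("sort order must be 0 or 1")
```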
The coordinate space used for begin and end positions can be changed with the -pf option for the razer and fasta formats:
- 0
- Gap space. Gaps between characters are counted from 0.
- 1
- Position space. Characters are counted from 1.
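Converting between the two coordinate schemes is a one-line shift. The sketch below assumes the usual interpretation, which this page implies but does not state explicitly: a gap-space interval is 0-based and half-open (begin gap, end gap), while a position-space interval is 1-based and includes both end characters. The function names are hypothetical.

```python
from typing import Tuple


def gap_to_position(begin: int, end: int) -> Tuple[int, int]:
    """Convert a gap-space interval to position space.

    The gap before the first character is gap 0, so the first covered
    character is begin + 1; the end gap equals the last covered position.
    """
    return begin + 1, end


def position_to_gap(begin: int, end: int) -> Tuple[int, int]:
    """Convert a position-space interval back to gap space."""
    return begin - 1, end
```

Under this assumption, a match covering the first five characters of a contig is (0, 5) in gap space and (1, 5) in position space; in gap space the match length is simply end minus begin.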
EXAMPLES
- razers example/genome.fa example/reads.fa -id -a -mN -v
- Map single-end reads with a 4% error rate, allow indels, and output the alignments. Ns are considered to match everything.
- razers example/genome.fa example/reads.fa example/reads2.fa -id -mN
- Map paired-end reads with up to 4% errors, allow indels, and output concordantly mapped pairs within the default library size. Ns are considered to match everything.
razers 1.5.8 [tarball]