DINDEL(1) User Commands DINDEL(1)

NAME

dindel - finds insertions and deletions in short nucleotide sequences

DESCRIPTION

[Required] :

fasta reference sequence (should be indexed with .fai file)
file-prefix for output results

[Required] Program option:

getCIGARindels: Extract indels from CIGARs of mapped reads, and infer library insert size distributions
indels: Infer indels
realignCandidates: Realign/reposition candidates in candidate file

[Required] BAM input. Choose one of the following:

read alignment file (should be indexed)
file containing filepaths for BAMs to be jointly analysed (not possible for --analysis==indels)
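
The required inputs listed above (reference, output prefix, analysis type, and one BAM input) can be assembled into a command line as in the following minimal sketch. Only --analysis is named on this page; the flag names --ref, --bamFile and --outputFile, the file names, and the helper function itself are assumptions for illustration and should be checked against the output of `dindel --help` before use.

```python
import shlex


def build_getcigarindels_command(ref_fasta, bam_file, output_prefix, dindel_bin="dindel"):
    """Assemble an argv list for the getCIGARindels analysis.

    Hypothetical helper: every flag name except --analysis is an assumption
    that is not confirmed by this page.
    """
    return [
        dindel_bin,
        "--analysis", "getCIGARindels",
        "--ref", ref_fasta,             # assumed flag: indexed FASTA reference
        "--bamFile", bam_file,          # assumed flag: indexed read alignment file
        "--outputFile", output_prefix,  # assumed flag: file-prefix for results
    ]


if __name__ == "__main__":
    argv = build_getcigarindels_command("ref.fa", "sample.bam", "sample.dindel_output")
    # Print a shell-quoted form; pass argv to subprocess.run() to execute it.
    print(shlex.join(argv))
```

Building the command as a list rather than a single string avoids quoting problems when file paths contain spaces.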

[Required for analysis == getCIGARindels]: Region to be considered for extraction of candidate indels:

region to be analysed in format start-end, e.g. 1000-2000
target sequence (e.g. 'X')
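
The start-end region format can be validated before dindel is invoked. The following is a minimal sketch, not part of dindel; the function name and the rule that start must not exceed end are assumptions made for illustration.

```python
def parse_region(region):
    """Parse a 'start-end' region string such as '1000-2000' into two ints."""
    start_text, sep, end_text = region.strip().partition("-")
    # Both sides of the hyphen must be non-empty runs of digits.
    if not sep or not start_text.isdigit() or not end_text.isdigit():
        raise ValueError(f"expected 'start-end', got {region!r}")
    start, end = int(start_text), int(end_text)
    if start > end:  # assumed sanity check, not documented on this page
        raise ValueError(f"region start {start} exceeds end {end}")
    return start, end


if __name__ == "__main__":
    print(parse_region("1000-2000"))
```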

[Required for analysis == indels]:

file with candidate variants to be tested.
coordinates in varFile are one-based

Output options:

output BAM file with realigned reads
file
quiet output

parameters for analysis==indels option:

analyze data assuming a diploid sequence
estimate haplotype frequencies using Bayesian EM algorithm. May be applied to single individual and pools.

General algorithm parameters:

use faster but less accurate ungapped read-haplotype alignment model
prefilter haplotypes based on coverage
#bases of reference sequence of indel region
max number of mismatches in indel region
prior probability of a SNP site
prior probability of a detected indel not being a sequencing error
number of bases to left and right of indel
maximum number of haplotypes in likelihood computation
maximum number of reads in likelihood computation
lower limit for read mapping quality
upper limit for read mapping quality in observationmodel_old (phred units)
cap mapping quality in alignment using fast ungapped method (WARNING: setting it too high (>50) might result in significant overcalling!)
skip computation if number of haplotypes exceeds this number
minimum overlap between read and haplotype
maximum length of reads
minimum number of WS observations of indel
skip if product of number of reads and haplotypes exceeds this value
change the inserted sequence to 'N', so that no penalty is incurred if a read mismatches the inserted sequence
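
The mapping-quality limits above are given in Phred units. As a reminder of the scale, a Phred-scaled quality Q corresponds to an error probability of 10^(-Q/10), so a cap of 50 already treats a read as mismapped with probability of only about 1 in 100,000. The sketch below illustrates the conversion and a simple cap; the function names are hypothetical and the code does not reproduce how dindel applies these limits internally.

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality to an error probability, 10^(-Q/10)."""
    return 10.0 ** (-q / 10.0)


def cap_mapping_quality(q, cap):
    """Return the mapping quality limited to at most `cap` (Phred units)."""
    return min(q, cap)


if __name__ == "__main__":
    for q in (20, 40, 50, 60):
        capped = cap_mapping_quality(q, 50)
        print(f"MAPQ {q} -> capped {capped}, P(mismapped) = {phred_to_error_prob(capped):.1e}")
```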

parameters for --pooled option:

Dirichlet a0 parameter of the haplotype frequency prior
singlevariant or priorpersite)

General algorithm filtering options:

--checkAllCIGARs arg (=1) include all indels at the position of the call site

match string for exclusion of reads based on auxiliary information

Observation model parameters:

probability of a read indel
probability of a mutation in the read
maximum length of a _sequencing error_ indel in read [not for --faster option]

Library options:

file with library insert histograms (as generated by --analysis getCIGARindels)

Misc results analysis options:

compare likelihood differences in reads against haplotypes

--compareReadHapThreshold arg (=0.5) difference threshold for viewing

show empirical distribution over nucleotides
show candidate haplotypes for fast method
show for each haplotype which reads map to it
show reads
inference method
output likelihoods for every read and haplotype

SEE ALSO

The full documentation for dindel is available at https://sites.google.com/site/keesalbers/soft/dindel

March 2016