
ABPOA(1) User Commands ABPOA(1)

NAME

abpoa, abpoa.avx2, abpoa.avx, abpoa.sse4.1, abpoa.ssse3, abpoa.sse3, abpoa.generic - adaptive banded Partial Order Alignment

SYNOPSIS

abpoa [options] <in.fa/fq> > cons.fa/msa.fa/abpoa.gfa

DESCRIPTION

abPOA is an extended version of Partial Order Alignment (POA) that performs adaptive banded dynamic programming (DP) with a SIMD implementation. abPOA can perform multiple sequence alignment (MSA) on a set of input sequences and generate a consensus sequence by applying the heaviest bundling algorithm to the final alignment graph.

abPOA can generate high-quality consensus sequences from error-prone long reads and offers a significant speed improvement over existing tools.

abPOA supports three alignment modes (global, local, extension) and flexible scoring schemes that allow linear, affine and convex gap penalties. It currently supports SSE2/SSE4.1/AVX2 vectorization.

OPTIONS

Alignment:
INT alignment mode [0] 0: global, 1: local, 2: extension
INT match score [2]
INT mismatch penalty [4]
FILE scoring matrix file, '-M' and '-X' are not used when '-t' is used [Null] e.g., 'HOXD70.mtx, BLOSUM62.mtx'

-O --gap-open INT(,INT) gap opening penalty (O1,O2) [4,24]

INT(,INT) gap extension penalty (E1,E2) [2,1]
abPOA provides three gap penalty modes. The cost of a gap of length g is:
- convex (default): min{O1+g*E1, O2+g*E2}
- affine (set O2 to 0): O1+g*E1
- linear (set O1 to 0): g*E1
ambiguous strand mode [False] for each input sequence, try the reverse complement if the current alignment score is too low, and pick the strand with a higher score
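
The three gap penalty modes listed above can be illustrated with a small calculation. The following is a minimal sketch, not abPOA's own code: the function name gap_cost and its parameter names are hypothetical, and the mode selection (O1 = 0 for linear, O2 = 0 for affine) simply follows the wording of this page.

```python
def gap_cost(g, o1=4, e1=2, o2=24, e2=1):
    """Cost of a gap of length g under the scheme described in this page.

    Defaults mirror the documented defaults: O1,O2 = 4,24 and E1,E2 = 2,1.
    Hypothetical helper for illustration only.
    """
    if g <= 0:
        return 0  # assumption: an empty gap costs nothing
    if o1 == 0:
        return g * e1  # linear mode
    if o2 == 0:
        return o1 + g * e1  # affine mode
    # convex mode: the cheaper of the two affine pieces
    return min(o1 + g * e1, o2 + g * e2)


if __name__ == "__main__":
    for g in (1, 10, 20, 30):
        print(g, gap_cost(g))
```

With the default values the two pieces cost the same at g = 20, so under this reading gaps shorter than 20 are charged by (O1,E1) and longer gaps by (O2,E2).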
Adaptive banded DP:
INT first adaptive banding parameter [10] set b to a negative value to disable adaptive banded DP
FLOAT second adaptive banding parameter [0.01] the number of extra bases added on both sides of the band is b+f*L, where L is the length of the aligned sequence
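
As a worked example of the band formula b+f*L, the sketch below evaluates it for a few sequence lengths. The function name extra_band_width is hypothetical, and whether abPOA rounds or truncates the product is not stated here, so the value is left as a float.

```python
def extra_band_width(seq_len, b=10, f=0.01):
    """Extra bases added on each side of the band: b + f * L.

    Returns None when b is negative, which this page describes as
    disabling adaptive banded DP. Hypothetical helper for illustration.
    """
    if b < 0:
        return None
    return b + f * seq_len


if __name__ == "__main__":
    for length in (500, 1000, 10000):
        print(length, extra_band_width(length))
```

With the defaults, a 1,000-base sequence gets about 20 extra bases on each side and a 10,000-base sequence about 110.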
Minimizer-based seeding and partition (only effective in global alignment mode):
enable minimizer-based seeding and anchoring [False]
INT minimizer k-mer size [19]
INT minimizer window size [10]
min. size of window to perform POA [500]
build guide tree and perform progressive partial order alignment [False]
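
To make the k-mer size and window size parameters concrete, the sketch below picks window minimizers from a sequence. It is an illustration of the general minimizer idea only: it orders k-mers lexicographically, which is an assumption chosen for readability, and it does not reproduce how abPOA hashes, orders or anchors seeds. The function name minimizers is hypothetical.

```python
def minimizers(seq, k=19, w=10):
    """Return sorted (position, k-mer) pairs selected as window minimizers.

    Each window covers w consecutive k-mers; the smallest k-mer in the
    window (lexicographic order, leftmost on ties) is selected.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        # Smallest k-mer wins; the index breaks ties toward the left.
        best = min(window, key=lambda i: (kmers[i], i))
        picked.add(best)
    return [(i, kmers[i]) for i in sorted(picked)]


if __name__ == "__main__":
    # Tiny parameters so the result is easy to follow by hand.
    print(minimizers("ACGTTGCA", k=3, w=2))
```

Because neighbouring windows often share the same smallest k-mer, the selected set is usually much smaller than the full list of k-mers.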
Input/Output:
take base quality score from FASTQ input file as graph edge weight for consensus calling [False] effective only when the input sequences are in FASTQ format and the consensus is called with heaviest bundling
input sequences are amino acid (default is nucleotide) [False]
input file is a list of sequence file names [False] each line is one sequence file containing a set of sequences which will be aligned by abPOA to generate a consensus sequence
FILE incrementally align sequences to an existing graph/MSA [Null] the graph can be in GFA or MSA format as generated by abPOA
FILE output to FILE [stdout]
INT output result mode [0]
- 0: consensus in FASTA format
- 1: MSA in PIR format
- 2: both 0 & 1
- 3: graph in GFA format
- 4: graph with consensus path in GFA format
- 5: consensus in FASTQ format
consensus algorithm [0]
- 0: heaviest bundling path in partial order graph
- 1: most frequent bases at each position
max. number of consensus sequences to generate [1]
FLOAT min. frequency of each consensus sequence (only effective when -d/--num-cons > 1) [0.25]
FILE dump final alignment graph to FILE (.pdf/.png) [Null]
print this help usage information
show version number
INT verbose level (0-2). 0: none, 1: information, 2: debug [0]
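
The second consensus algorithm listed above (most frequent bases at each position) can be sketched on a small MSA as follows. This is a simplified illustration, not abPOA's implementation: the function name majority_consensus is hypothetical, and dropping columns where the gap character is most frequent is an assumption made for the example.

```python
from collections import Counter


def majority_consensus(msa_rows, gap="-"):
    """Column-wise majority vote over equal-length aligned rows."""
    if not msa_rows:
        return ""
    length = len(msa_rows[0])
    if any(len(row) != length for row in msa_rows):
        raise ValueError("all MSA rows must have the same length")
    consensus = []
    for column in zip(*msa_rows):
        symbol, _count = Counter(column).most_common(1)[0]
        # Assumption: a column dominated by gaps contributes no base.
        if symbol != gap:
            consensus.append(symbol)
    return "".join(consensus)


if __name__ == "__main__":
    rows = ["ACG-T", "ACGAT", "AC--T"]
    print(majority_consensus(rows))
```

Ties between symbols are resolved here by the order in which they appear in the column, which is an arbitrary choice for the sketch.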

SEE ALSO

For more information please refer to the paper published in Bioinformatics:

https://dx.doi.org/10.1093/bioinformatics/btaa963

September 2024 abpoa 1.5.3