HHFILTER(1) User Commands HHFILTER(1)


hhfilter - filter an alignment by maximum sequence identity of match states and minimum coverage


hhfilter -i infile -o outfile [options]


HHfilter 3.0.0 (15-03-2015)
Filter an alignment by maximum pairwise sequence identity, minimum coverage, minimum sequence identity, or score per column to the first (seed) sequence.
(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser
Remmert M, Biegert A, Hauser A, and Soding J. HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat. Methods 9:173-175 (2011).

-i <file>
read input file in A3M/A2M or FASTA format
-o <file>
write to output file in A3M format
-a <file>
append to output file in A3M format


-v <int>
verbose mode: 0: no screen output 1: only warnings 2: verbose
-id [0,100]
maximum pairwise sequence identity (%) (def=90)
-diff [0,inf[
filter MSA by selecting most diverse set of sequences, keeping at least this many seqs in each MSA block of length 50 (def=0)
-cov [0,100]
minimum coverage with query (%) (def=0)
-qid [0,100]
minimum sequence identity with query (%) (def=0)
-qsc [0,100]
minimum score per column with query (def=-20.0)
-neff [1,inf]
target diversity of alignment (default=off)

Input alignment format:

-M a2m
use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)
-M first
use FASTA: columns with residue in 1st sequence are match states
-M [0,100]
use FASTA: columns with fewer than X% gaps are match states
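
The percentage rule for -M can be illustrated with a small shell sketch. It is not hhfilter's own code: the helper name match_columns is hypothetical, and it assumes an unweighted gap count over an aligned FASTA file in which '-' marks a gap, which may differ from how hhfilter computes the gap fraction internally.

```shell
# match_columns PCT < aligned.fasta
# Print the 1-based indices of columns whose gap fraction is below PCT percent.
# Illustrative only: unweighted counts, "-" is the only gap character.
match_columns() {
  awk -v pct="$1" '
    function tally(s,   c) {
      nseq++
      if (length(s) > ncol) ncol = length(s)
      for (c = 1; c <= length(s); c++)
        if (substr(s, c, 1) == "-") gaps[c]++
    }
    /^>/ { if (seq != "") tally(seq); seq = ""; next }  # header line: flush previous record
    { seq = seq $0 }                                     # a sequence may span several lines
    END {
      if (seq != "") tally(seq)
      if (nseq == 0) exit
      out = ""
      for (c = 1; c <= ncol; c++)
        if (100 * gaps[c] / nseq < pct)
          out = out (out == "" ? "" : " ") c
      print out
    }
  '
}
```

For instance, `match_columns 50 < aln.fasta` lists the columns that a setting such as `-M 50` would be expected to treat as match states under these assumptions; aln.fasta is a placeholder file name.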

Example: hhfilter -id 50 -i d1mvfd_.a2m -o d1mvfd_.fil.a2m
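
To compare several identity cutoffs on the same alignment, the example above can be wrapped in a loop. The following is a minimal sketch using only the -i, -o and -id options documented here; the function names, the DRY_RUN variable and the output naming scheme are illustrative assumptions, not part of hhfilter.

```shell
# Build an output name from the input name and an identity cutoff.
# The ".idNN.a3m" scheme is an assumption made for this example.
filtered_name() {
  printf '%s.id%s.a3m\n' "${1%.*}" "$2"
}

# filter_series INFILE CUTOFF...
# Run hhfilter once per cutoff; with DRY_RUN=1 only print the commands.
filter_series() {
  in=$1; shift
  for id in "$@"; do
    out=$(filtered_name "$in" "$id")
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "hhfilter -i $in -o $out -id $id"
    else
      hhfilter -i "$in" -o "$out" -id "$id"
    fi
  done
}
```

Calling `DRY_RUN=1 filter_series d1mvfd_.a2m 90 70 50` prints the three commands without running them, which makes it easy to check the file names before filtering.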

February 2019 hhfilter 3.0~beta3+dfsg