
GRABIX(1) General Commands Manual GRABIX(1)

NAME

grabix - random access on large compressed sequence data

SYNOPSIS

grabix index bedfile.gz
grabix grab bedfile.gz linenumber

DESCRIPTION

In biomedical research it is increasingly common to study the genetic basis of disease, which now frequently involves sequencing human DNA. The raw output of the sequencing machine is redundant, however, and the true sequence is the one that best explains that redundancy. The data are exchanged only as compressed files, since they are too large and redundant to handle otherwise. One should avoid decompressing them whenever possible.

grabix leverages the fantastic BGZF library of the samtools package to provide random access into text files that have been compressed with bgzip (from the tabix package). grabix creates its own index (.gbi) of the bgzipped file. Once the file is indexed, one can extract arbitrary lines from it with the grab command, or choose random lines with the random command.
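
The index-then-grab idea described above can be illustrated with a minimal Python sketch. This is not grabix's implementation: it works on an uncompressed text file and records plain byte offsets, whereas grabix indexes a bgzipped file through BGZF. The function names build_index, grab and grab_random are hypothetical, and 1-based line numbering is an assumption made for the sketch.

```python
import random


def build_index(path):
    """Scan the file once and record the byte offset where each line starts."""
    offsets = []
    position = 0
    with open(path, "rb") as handle:
        for line in handle:
            offsets.append(position)
            position += len(line)
    return offsets


def grab(path, offsets, line_number):
    """Return one line (1-based, an assumption here) by seeking to its offset."""
    if not 1 <= line_number <= len(offsets):
        raise IndexError("line number out of range")
    with open(path, "rb") as handle:
        handle.seek(offsets[line_number - 1])
        return handle.readline().decode().rstrip("\n")


def grab_random(path, offsets, rng=random):
    """Return one line chosen uniformly at random, using the same index."""
    return grab(path, offsets, rng.randrange(1, len(offsets) + 1))
```

The file is scanned once to build the index. After that, each lookup costs one seek and one line read, regardless of how far into the file the requested line lies.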

SEE ALSO

https://github.com/arq5x/grabix
July 18, 2013