sam(5) | Bioinformatics formats | sam(5) |
NAME
sam - Sequence Alignment/Map file format
DESCRIPTION
Sequence Alignment/Map (SAM) format is TAB-delimited. Apart from the header lines, which begin with the `@' symbol, each alignment line consists of the following fields:
1 | QNAME | Query template/pair NAME |
2 | FLAG | bitwise FLAG |
3 | RNAME | Reference sequence NAME |
4 | POS | 1-based leftmost POSition/coordinate of clipped sequence |
5 | MAPQ | MAPping Quality (Phred-scaled) |
6 | CIGAR | extended CIGAR string |
7 | MRNM | Mate Reference sequence NaMe (`=' if same as RNAME) |
8 | MPOS | 1-based Mate POSition |
9 | TLEN | inferred Template LENgth (insert size) |
10 | SEQ | query SEQuence on the same strand as the reference |
11 | QUAL | query QUALity (ASCII-33 gives the Phred base quality) |
12+ | OPT | variable OPTional fields in the format TAG:VTYPE:VALUE |
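The field layout above can be illustrated with a minimal Python sketch. It assumes a well-formed, TAB-delimited alignment line and keeps optional field values as unconverted strings. The names SamRecord, parse_alignment and base_qualities are hypothetical helpers introduced here for illustration; they are not part of samtools or htslib, and the example line is made up.

```python
from typing import Dict, List, NamedTuple, Tuple


class SamRecord(NamedTuple):
    """The 11 mandatory fields plus the optional fields (hypothetical helper type)."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    mrnm: str
    mpos: int
    tlen: int
    seq: str
    qual: str
    opt: Dict[str, Tuple[str, str]]


def parse_alignment(line: str) -> SamRecord:
    """Split one alignment line (not a header line starting with '@') into fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 TAB-delimited fields")
    opt = {}
    for item in fields[11:]:
        # Optional fields have the form TAG:VTYPE:VALUE; VALUE may itself contain ':'.
        tag, vtype, value = item.split(":", 2)
        opt[tag] = (vtype, value)
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], opt,
    )


def base_qualities(qual: str) -> List[int]:
    """Convert a QUAL string to Phred base qualities (ASCII code minus 33)."""
    return [ord(c) - 33 for c in qual]


# Made-up example line, for illustration only.
example = "r001\t99\tchr1\t7\t30\t8M\t=\t37\t39\tTTAGATAA\tIIIIIIII\tNM:i:0"
record = parse_alignment(example)
print(record.rname, record.pos, record.opt["NM"])
print(base_qualities(record.qual))
```

The sketch leaves the VTYPE-specific conversion of optional values to the caller, since the type codes are described in the full specification rather than here.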
Each bit in the FLAG field is defined as:
0x0001 | p | the read is paired in sequencing |
0x0002 | P | the read is mapped in a proper pair |
0x0004 | u | the query sequence itself is unmapped |
0x0008 | U | the mate is unmapped |
0x0010 | r | strand of the query (1 for reverse) |
0x0020 | R | strand of the mate |
0x0040 | 1 | the read is the first read in a pair |
0x0080 | 2 | the read is the second read in a pair |
0x0100 | s | the alignment is not primary |
0x0200 | f | the read fails platform/vendor quality checks |
0x0400 | d | the read is either a PCR or an optical duplicate |
0x0800 | S | the alignment is supplementary |
where the second column gives the string representation of the FLAG field.
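As a rough illustration of the table above, the following Python sketch converts a numeric FLAG into its string representation by testing each bit. The name flag_to_string is a hypothetical helper, and emitting the characters in ascending bit order is an assumption of this sketch rather than something stated here.

```python
# (bit value, character) pairs copied from the FLAG table above.
FLAG_BITS = [
    (0x0001, "p"), (0x0002, "P"), (0x0004, "u"), (0x0008, "U"),
    (0x0010, "r"), (0x0020, "R"), (0x0040, "1"), (0x0080, "2"),
    (0x0100, "s"), (0x0200, "f"), (0x0400, "d"), (0x0800, "S"),
]


def flag_to_string(flag: int) -> str:
    """Return one character per set bit, in ascending bit order (assumed ordering)."""
    return "".join(char for bit, char in FLAG_BITS if flag & bit)


# 99 = 0x0001 + 0x0002 + 0x0020 + 0x0040
print(flag_to_string(99))
# 16 = 0x0010, the query is on the reverse strand
print(flag_to_string(16))
```

A single bit can be tested directly with a bitwise AND; for example, `flag & 0x0004` is non-zero when the query sequence itself is unmapped.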
SEE ALSO
- https://github.com/samtools/hts-specs
  The full SAM/BAM file format specification
August 2013 | htslib |