NAME

readseq - Reads and writes nucleic/protein sequences in various formats

SYNOPSIS

readseq [-options] in.seq > out.seq

DESCRIPTION

This manual page briefly documents the readseq command. It was written for the Debian GNU/Linux distribution because the original program does not have a manual page; its documentation is provided in text form instead (see below).

readseq reads and writes biosequences (nucleic/protein) in various formats. Data files may contain multiple sequences. readseq is particularly useful because it automatically detects many sequence formats and converts between them.

FORMATS

Formats which readseq currently understands:
* IG/Stanford, used by Intelligenetics and others
* GenBank/GB, GenBank flatfile format
* NBRF format
* EMBL, EMBL flatfile format
* GCG, single sequence format of GCG software
* DNAStrider, used by a common Mac program
* Fitch format, limited use
* Pearson/Fasta, a common format used by Fasta programs and others
* Zuker format, limited use. Input only.
* Olsen, format printed by Olsen VMS sequence editor. Input only.
* Phylip3.2, sequential format for Phylip programs
* Phylip, interleaved format for Phylip programs (v3.3, v3.4)
* Plain/Raw, sequence data only (no name, document, numbering)
+ MSF multi sequence format used by GCG software
+ PAUP's multiple sequence (NEXUS) format
+ PIR/CODATA format used by PIR
+ ASN.1 format used by NCBI
+ Pretty print with various options for nice looking output. Output only.
+ LinAll format, limited use (LinAll and ConStruct programs)
+ Vienna format used by ViennaRNA programs
See the included "Formats" file for detail on file formats.

OPTIONS

-help

Show summary of options.

-a[ll]

Select All sequences

-c[aselower]

Change to lower case

-C[ASEUPPER]

Change to UPPER CASE

-degap[=-]

Remove gap symbols

-i[tem=2,3,4]

Select Item number(s) from several

-l[ist]

List sequences only

-o[utput=]out.seq

Redirect Output

-p[ipe]

Pipe (command line, <stdin, >stdout)

-r[everse]

Change to Reverse-complement

-v[erbose]

Verbose progress
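The sequence transformations described above (case change, gap removal, reverse-complement) can be illustrated with a small Python sketch. This is not readseq's own code: the helper names are hypothetical, '-' as the gap symbol is an assumption, and only plain A/C/G/T DNA is handled. How readseq itself treats ambiguity codes or RNA is not covered by this page.

```python
# Illustrative helpers only; they mimic what the options describe
# and are not taken from readseq.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse (assumes plain A/C/G/T DNA)."""
    return seq.translate(_COMPLEMENT)[::-1]


def degap(seq: str, gap: str = "-") -> str:
    """Drop gap symbols; '-' as the gap character is an assumption."""
    return seq.replace(gap, "")


seq = "ATG-CC-A"
print(degap(seq))                      # ATGCCA
print(degap(seq).lower())              # atgcca
print(reverse_complement(degap(seq)))  # TGGCAT
```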

-f[ormat=]#

Format number for output, or

-f[ormat=]Name

Format name for output:

 1. IG/Stanford         11. Phylip3.2
 2. GenBank/GB          12. Phylip
 3. NBRF                13. Plain/Raw
 4. EMBL                14. PIR/CODATA
 5. GCG                 15. MSF
 6. DNAStrider          16. ASN.1
 7. Fitch               17. PAUP/NEXUS
 8. Pearson/Fasta       18. Pretty (out-only)
 9. Zuker (in-only)     19. LinAll
10. Olsen (in-only)     20. Vienna
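When readseq is driven from a script, the format table above can be used to validate an output format before the program is invoked. The following Python sketch builds an argument list from the -f[ormat=] and -o[utput=] options. The function names build_command and convert are hypothetical, as are the file names; the format numbers are copied from the table.

```python
import subprocess

# Format numbers and names as listed in the table above.
FORMATS = {
    1: "IG/Stanford", 2: "GenBank/GB", 3: "NBRF", 4: "EMBL", 5: "GCG",
    6: "DNAStrider", 7: "Fitch", 8: "Pearson/Fasta", 9: "Zuker",
    10: "Olsen", 11: "Phylip3.2", 12: "Phylip", 13: "Plain/Raw",
    14: "PIR/CODATA", 15: "MSF", 16: "ASN.1", 17: "PAUP/NEXUS",
    18: "Pretty", 19: "LinAll", 20: "Vienna",
}
INPUT_ONLY = {9, 10}  # marked "in-only" in the table


def build_command(in_path: str, out_path: str, fmt: int) -> list[str]:
    """Return an argv list for readseq (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format number: {fmt}")
    if fmt in INPUT_ONLY:
        raise ValueError(f"{FORMATS[fmt]} is an input-only format")
    return ["readseq", f"-format={fmt}", f"-output={out_path}", in_path]


def convert(in_path: str, out_path: str, fmt: int) -> None:
    """Run readseq; requires the program to be installed and on PATH."""
    subprocess.run(build_command(in_path, out_path, fmt), check=True)


# Print the command instead of running it, so the sketch works
# even where readseq is not installed.
print(" ".join(build_command("in.seq", "out.fasta", 8)))
# readseq -format=8 -output=out.fasta in.seq
```

Rejecting input-only formats up front keeps the error in the calling script rather than relying on whatever readseq reports for an unsupported output format.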

Pretty format options:

-wid[th]=#