| TRANSDECODER.PREDICT(1) | Transcriptome Protein Prediction | TRANSDECODER.PREDICT(1) |
NAME
Transdecoder - Transcriptome Protein Prediction
USAGE
Required:
-t <string> transcripts.fasta
Common options:
--retain_long_orfs <int> retain all ORFs found that are equal to or longer than this many nucleotides, even if no other evidence
marks them as coding (default: 900 bp => 300 aa)
--retain_pfam_hits <string> domain table output file from running hmmscan to search Pfam (see transdecoder.github.io for info)
Any ORF with a pfam domain hit will be retained in the final output.
--retain_blastp_hits <string> blastp output in '-outfmt 6' format.
Any ORF with a blast match will be retained in the final output.
--single_best_orf Retain only the single best ORF per transcript.
(Best is defined as having Pfam and/or BLAST support, where available, and being the longest ORF.)
--cpu <int> Use multiple cores for cd-hit-est. (default=1)
-G <string> genetic code (default: universal; see PerlDoc; options: Euplotes, Tetrahymena, Candida, Acetabularia, ...)
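As an illustration of how the options above combine, the following Python sketch assembles an argument list without running anything. The executable name `TransDecoder.Predict` is inferred from the page title, and the file names `pfam.domtblout` and `blastp.outfmt6` are hypothetical placeholders; only flags documented above are used. The helper `nucleotides_to_codons` simply restates the 900 bp => 300 aa arithmetic of --retain_long_orfs.

```python
import shlex


def build_predict_command(transcripts, retain_long_orfs=None, pfam_hits=None,
                          blastp_hits=None, single_best_orf=False, cpu=None,
                          genetic_code=None):
    """Assemble an argument list from the options documented above.

    Assumption: the executable is called 'TransDecoder.Predict'.
    """
    cmd = ["TransDecoder.Predict", "-t", transcripts]
    if retain_long_orfs is not None:
        cmd += ["--retain_long_orfs", str(retain_long_orfs)]
    if pfam_hits:
        cmd += ["--retain_pfam_hits", pfam_hits]
    if blastp_hits:
        cmd += ["--retain_blastp_hits", blastp_hits]
    if single_best_orf:
        cmd.append("--single_best_orf")
    if cpu is not None:
        cmd += ["--cpu", str(cpu)]
    if genetic_code:
        cmd += ["-G", genetic_code]
    return cmd


def nucleotides_to_codons(length_nt):
    """Convert an ORF length in nucleotides to a length in codons (3 nt each)."""
    return length_nt // 3


# Hypothetical input file names, for illustration only.
cmd = build_predict_command(
    "transcripts.fasta",
    pfam_hits="pfam.domtblout",
    blastp_hits="blastp.outfmt6",
    single_best_orf=True,
    cpu=4,
)
print(shlex.join(cmd))            # the command as a single shell-quoted string
print(nucleotides_to_codons(900))  # default --retain_long_orfs threshold in aa
```

The list form can be handed to `subprocess.run(cmd, check=True)` if the tool is installed; building it as a list rather than a string avoids shell-quoting problems with unusual file names.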
Advanced options:
--train <string> FASTA file with ORFs to train the Markov Model for protein identification; otherwise
the longest non-redundant ORFs are used
-T <int> If --train is not given, the number of longest ORFs used to train the Markov Model (hexamer stats) (default: 500)
Note: 10x this many ORFs are first selected and passed to cd-hit to remove redundancies;
the -T longest ORFs are then selected from the non-redundant set.
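The two-stage selection described for -T can be sketched roughly as follows. This is not the tool's implementation: the function name `select_training_orfs` is hypothetical, and the redundancy-removal step is replaced by a plain exact-duplicate filter as a stand-in for cd-hit, which clusters by sequence similarity instead.

```python
def select_training_orfs(orfs, top_n=500):
    """Pick training ORFs: longest 10 * top_n, deduplicate, keep top_n longest.

    orfs maps an ORF identifier to its nucleotide sequence.
    Assumption: exact-duplicate removal stands in for cd-hit clustering.
    """
    # Stage 1: take the 10 * top_n longest ORFs as candidates.
    by_length = sorted(orfs.items(), key=lambda kv: len(kv[1]), reverse=True)
    candidates = by_length[: 10 * top_n]

    # Stage 2: drop redundant sequences (stand-in for cd-hit).
    seen = set()
    non_redundant = []
    for orf_id, seq in candidates:
        if seq in seen:
            continue
        seen.add(seq)
        non_redundant.append((orf_id, seq))

    # Stage 3: candidates were already sorted by length, so the first
    # top_n entries are the longest non-redundant ORFs.
    return non_redundant[:top_n]


# Toy input: "b" duplicates "a" and is removed before the final cut.
toy_orfs = {
    "a": "ATGAAATTTGGGTAA",
    "b": "ATGAAATTTGGGTAA",
    "c": "ATGCCCTAA",
    "d": "ATGTAA",
}
selected = select_training_orfs(toy_orfs, top_n=2)
selected_ids = [orf_id for orf_id, _ in selected]
print(selected_ids)
```

Oversampling by a factor of ten before deduplication is what lets the final set still reach the requested size after redundant sequences have been dropped.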
Genetic Codes
See <http://golgi.harvard.edu/biolinks/gencode.html>. These are currently supported:
universal (default)
Euplotes
Tetrahymena
Candida
Acetabularia
Mitochondrial-Canonical
Mitochondrial-Vertebrates
Mitochondrial-Arthropods
Mitochondrial-Echinoderms
Mitochondrial-Molluscs
Mitochondrial-Ascidians
Mitochondrial-Nematodes
Mitochondrial-Platyhelminths
Mitochondrial-Yeasts
Mitochondrial-Euascomycetes
Mitochondrial-Protozoans
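A wrapper script might check a -G value against the supported names before launching a long run. The sketch below is one way to do that; the function name `check_genetic_code` is hypothetical, and the exact, case-sensitive comparison is an assumption, since this page does not say how the tool itself matches names.

```python
# Names copied from the list of supported genetic codes above.
SUPPORTED_GENETIC_CODES = (
    "universal",
    "Euplotes",
    "Tetrahymena",
    "Candida",
    "Acetabularia",
    "Mitochondrial-Canonical",
    "Mitochondrial-Vertebrates",
    "Mitochondrial-Arthropods",
    "Mitochondrial-Echinoderms",
    "Mitochondrial-Molluscs",
    "Mitochondrial-Ascidians",
    "Mitochondrial-Nematodes",
    "Mitochondrial-Platyhelminths",
    "Mitochondrial-Yeasts",
    "Mitochondrial-Euascomycetes",
    "Mitochondrial-Protozoans",
)


def check_genetic_code(name="universal"):
    """Return name if it is a listed genetic code, else raise ValueError.

    Assumption: names are compared exactly (case-sensitive).
    """
    if name not in SUPPORTED_GENETIC_CODES:
        raise ValueError(
            f"unsupported genetic code {name!r}; "
            f"expected one of: {', '.join(SUPPORTED_GENETIC_CODES)}"
        )
    return name


print(check_genetic_code("Tetrahymena"))
```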
| 2016-11-10 | 3.0.1+dfsg-1 |