NAME
bp_seqfeature_load.pl - Load GFF into a SeqFeature database
DESCRIPTION
Pass any number of GFF or fasta format files (or GFF with embedded fasta) to
load the features and sequences into a SeqFeature database. The database (and
adaptor) to use is specified on the command line. Use the --create flag to
create a new SeqFeature database.
SYNOPSIS
bp_seqfeature_load.pl [options] gff_or_fasta_file1 [gff_or_fasta_file2 [...]]
Try 'bp_seqfeature_load.pl --help' or '--man' for more information.
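As a minimal sketch of the synopsis, the following writes a tiny GFF3 file and shows a load command that uses only options documented here. The file name, the DB_USER variable, and the run helper are hypothetical and exist only for illustration. The helper prints the command by default, so nothing touches a database unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical input file; substitute your own.
GFF=example.gff3
DSN=dbi:mysql:test   # the documented default data source

# Write a one-feature GFF3 file (columns are tab-separated).
printf '%s\n' '##gff-version 3' '##sequence-region ctg1 1 1000' > "$GFF"
printf 'ctg1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=gene1\n' >> "$GFF"

# Illustrative helper: print the command by default, run it when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# --create reinitializes the database and erases any previous contents.
run bp_seqfeature_load.pl --create --dsn "$DSN" --user "${DB_USER:-nobody}" "$GFF"
```

The dry-run default is a deliberate choice here because --create is destructive.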
OPTIONS
- -d, --dsn
- DBI data source (default dbi:mysql:test)
- -n, --namespace
- The table prefix to use (default undef). Allows several
independent sequence feature databases to be stored in a single
database.
- -s, --seqfeature
- The type of SeqFeature to create... RTSC (default
Bio::DB::SeqFeature)
- -a, --adaptor
- The storage adaptor (class) to use (default
DBI::mysql)
- -v, --verbose
- Turn on verbose progress reporting (default true). Use
--noverbose to switch this off.
- -f, --fast
- Activate fast loading (default 0). Only available for some
adaptors.
- -T, --temporary-directory
- Specify temporary directory for fast loading (default
File::Spec->tmpdir())
- -i, --ignore-seqregion
- If true, then ignore ##sequence-region directives in the
GFF3 file (default, create a feature for each region)
- -c, --create
- Create the database and reinitialize it (default false).
Note that this will erase previous database contents, if any.
- -u, --user
- User to connect to database as
- -p, --password
- Password to use to connect to database
- -z, --zip
- Compress database tables to save space (default false)
- -S, --subfeatures
- Turn on indexing of subfeatures (default true). Use
--nosubfeatures to switch this off.
- --summary
- Generate summary statistics for coverage graphs (default
false). This can be run on a previously loaded database or during the load.
It will default to true if --create is used.
- --noalias-target
- Don't create an Alias attribute from the target_id in a
Target attribute. By default, a feature that contains a Target
attribute also gets an Alias attribute whose value is the
target_id in the Target attribute.
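As a hedged illustration of how several of these options combine, the sketch below loads into an existing database with fast loading and an explicit temporary directory. The file name, the SCRATCH variable, and the run helper are hypothetical. The command is only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical file name and scratch directory; substitute your own.
GFF=features.gff3
SCRATCH=${TMPDIR:-/tmp}

# Illustrative helper: print the command by default, run it when DRY_RUN=0.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# --fast is only available for some adaptors, so this may fail on others.
# No --create here: the database is assumed to exist already.
run bp_seqfeature_load.pl --fast --temporary-directory "$SCRATCH" \
    --adaptor DBI::mysql --dsn dbi:mysql:test "$GFF"
```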
Please see
http://www.sequenceontology.org/gff3.shtml for information about the
GFF3 format. BioPerl extends the format slightly by adding a
##index-subfeatures directive. Set this to a true value if you wish the
database to be able to retrieve a feature's individual parts (such as the
exons of a transcript) independently of the top level feature:
##index-subfeatures 1
It is also possible to control the indexing of subfeatures on a case-by-case
basis by adding "index=1" or "index=0" to the feature's
attribute list. This should only be used for subfeatures.
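The directive and the per-feature attribute can be combined in one file. The sketch below writes such a file, with a hypothetical file name and made-up coordinates: indexing is enabled globally, and the second exon opts out with index=0.

```shell
#!/bin/sh
GFF=subfeatures.gff3   # hypothetical file name

{
    printf '%s\n' '##gff-version 3' '##index-subfeatures 1'
    # Top-level feature; the index attribute is not used here.
    printf 'ctg1\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1\n'
    # The first exon is indexed explicitly; the second opts out.
    printf 'ctg1\texample\texon\t100\t300\t.\t+\t.\tParent=mrna1;index=1\n'
    printf 'ctg1\texample\texon\t600\t900\t.\t+\t.\tParent=mrna1;index=0\n'
} > "$GFF"

# Show the resulting file.
cat "$GFF"
```

The file can then be passed to bp_seqfeature_load.pl like any other GFF3 input.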
Subfeature indexing is true by default. Set it to false (0) to save
considerable database space and improve performance. You may use
--nosubfeatures to force this.