GT-SELECT(1) GenomeTools Manual GT-SELECT(1)

NAME

gt-select - Select certain features (specified by the used options) from given GFF3 file(s).

SYNOPSIS

gt select [option ...] [GFF3_file ...]

DESCRIPTION

-retainids [yes|no]

when available, use the original IDs provided in the source file (memory consumption is proportional to the input file size(s)) (default: no)

-seqid [string]

select features with the given sequence ID (all comments are selected) (default: undefined)

-source [string]

select feature with the given source (the source is column 2 in regular GFF3 lines) (default: undefined)

-contain [start end]

select all features which are contained in the given range (default: undefined)

-overlap [start end]

select all features which do overlap with the given range (default: undefined)
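The difference between -contain and -overlap can be sketched in a few lines of Python. This is only an illustration, not GenomeTools code: it assumes closed, 1-based ranges as used in GFF3 columns 4 and 5, and the function names is_contained and overlaps are hypothetical.

```python
def is_contained(feat_start, feat_end, start, end):
    """True if the feature lies completely inside [start, end]."""
    return start <= feat_start and feat_end <= end


def overlaps(feat_start, feat_end, start, end):
    """True if the feature shares at least one position with [start, end]."""
    return feat_start <= end and start <= feat_end


# A feature spanning 900..1100 checked against the range 1000..2000:
print(is_contained(900, 1100, 1000, 2000))  # False: it starts before 1000
print(overlaps(900, 1100, 1000, 2000))      # True: positions 1000..1100 are shared
```

Under these assumptions every contained feature also overlaps the range, so -overlap selects a superset of what -contain selects for the same start and end.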

-strand [string]

select all top-level features (i.e., features without parents) whose strand equals the given one (must be one of +-.?) (default: undefined)

-targetstrand [string]

select all top-level features (i.e., features without parents) which have exactly one target attribute whose strand equals the given one (must be one of +-.?) (default: undefined)

-targetbest [yes|no]

if multiple top-level features (i.e., features without parents) with exactly one target attribute have the same target_id, keep only the feature with the best score. If -targetstrand is used at the same time, this option is applied after -targetstrand. Memory consumption is proportional to the input file size(s). (default: no)
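The effect of -targetbest can be pictured with the following Python sketch. It is not the actual implementation: it assumes that "best" means the numerically highest score and that ties keep the feature seen first, and the tuple layout and the function name keep_best_per_target are hypothetical.

```python
def keep_best_per_target(features):
    """Keep one feature per target_id, preferring the higher score.

    `features` is an iterable of (name, target_id, score) tuples.
    Assumption: "best" means numerically highest; ties keep the first seen.
    """
    best = {}
    for name, target_id, score in features:
        current = best.get(target_id)
        if current is None or score > current[2]:
            best[target_id] = (name, target_id, score)
    return list(best.values())


hits = [("m1", "EST1", 0.80), ("m2", "EST1", 0.95), ("m3", "EST2", 0.50)]
print(keep_best_per_target(hits))
# [('m2', 'EST1', 0.95), ('m3', 'EST2', 0.5)]
```

A decision for a given target_id cannot be made until all candidates have been seen, which is consistent with the memory note above.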

-hascds [yes|no]

select all top-level features which do have a CDS child (default: no)

-maxgenelength [value]

select genes up to the given maximum length (default: undefined)

-maxgenenum [value]

select the first genes up to the given maximum number (default: undefined)

-mingenescore [value]

select genes with the given minimum score (default: undefined)

-maxgenescore [value]

select genes with the given maximum score (default: undefined)

-minaveragessp [value]

set the minimum average splice site probability (default: undefined)

-rule_files

specify Lua filter rule files to be used for selection (terminate list with --)

-rule_logic [...]

select how multiple Lua files should be combined; choose from AND|OR (default: AND)

-dropped_file [filename]

save non-selected features to file (default: undefined)

-v [yes|no]

be verbose (default: no)

-o [filename]

redirect output to specified file (default: undefined)

-gzip [yes|no]

write gzip compressed output file (default: no)

-bzip2 [yes|no]

write bzip2 compressed output file (default: no)

-force [yes|no]

force writing to output file (default: no)

-help

display help and exit

-version

display version information and exit

File format for option -rule_files:

The files supplied to option -rule_files define a function for filtering by user-defined criteria (see example below):

function filter(gn)
  target = "exon"
  for curnode in gn:children() do
    if (curnode:get_type() == target) then
      return false
    end
  end
  return true
end

The above function iterates over all children of gn and checks whether there is a node of type exon. If there is such a node, the function returns false, indicating that the parent node gn will not be sorted out (that is, gn is kept).

NOTE: The function must be named filter. It must return false for a node that survived the filtering process (the node is kept) and true for a node that is to be sorted out.
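Because false means "keep", the return convention is easy to get backwards. The following Python model restates it; it is an illustrative sketch only, not how gt select is implemented, and the Node class and the apply_filter helper are hypothetical stand-ins for the Lua feature node and the selection step.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a feature node with typed children."""
    type: str
    children: list = field(default_factory=list)


def filter_node(gn):
    """Mirror of the Lua example: False means 'keep gn'."""
    for child in gn.children:
        if child.type == "exon":
            return False  # an exon child was found, so gn survives
    return True  # no exon child, so gn is sorted out


def apply_filter(nodes):
    """Split nodes into (selected, dropped) using the inverted convention."""
    selected = [n for n in nodes if not filter_node(n)]
    dropped = [n for n in nodes if filter_node(n)]
    return selected, dropped


with_exon = Node("mRNA", [Node("exon")])
without_exon = Node("mRNA", [Node("CDS")])
selected, dropped = apply_filter([with_exon, without_exon])
print(len(selected), len(dropped))  # 1 1
```

In this model the dropped list corresponds to what -dropped_file would receive: the nodes for which the filter returned true.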

REPORTING BUGS

Report bugs to https://github.com/genometools/genometools/issues.

04/27/2024 GenomeTools 1.6.5