
CLEANASN(1) NCBI Tools User's Manual CLEANASN(1)

NAME

cleanasn - clean up irregularities in NCBI ASN.1 objects

SYNOPSIS

cleanasn [-] [-A filename] [-B str] [-C str] [-D str] [-F str] [-K str] [-L filename] [-M filename] [-N str] [-O str] [-P str] [-Q str] [-R] [-S str] [-T] [-U str] [-V str] [-X str] [-Z str] [-a str] [-b] [-c] [-d str] [-f str] [-i filename] [-j filename] [-k filename] [-m str] [-n path] [-o filename] [-p path] [-q path] [-r path] [-v path] [-x ext]

DESCRIPTION

cleanasn is a utility program to clean up irregularities in NCBI ASN.1 objects.

OPTIONS

A summary of options is included below.

-
Print usage message
-A filename
Accession list file
-B str
Branch, per the flags in str:
Has coding regions
No coding regions
Passes validation
Validator errors or rejects
Only pop/phy/mut/eco/WGS sets
Exclude pop/phy/mut/eco/WGS sets
Only nuc-prot sets
Exclude nuc-prot sets
Only segmented sequences
Exclude segmented sequences
Only segmented proteins
Exclude segmented proteins
-C str
Sequence operations, per the flags in str:
Compress
Decompress
Recalculate segmented sequence length
Virtual gaps inside segmented sequence
Convert segmented set to delta sequence
Non-NucProt segmented set to delta sequence
Improved non-NucProt segmented set to delta sequence
Raw to delta by assembly gap
Merge assembly gap features
-D str
Clean up descriptors, per the flags in str:
Remove Title
Remove Comment
Remove Nuc-Prot Set title
Remove Pop/Phy/Mut/Eco Set title
Remove mRNA title
Remove Protein title
Title to name
AutoDef title or name
Prefix title with organism name
-F str
Clean up features, per the flags in str:
Remove User-objects
Remove db_xrefs
Remove /evidence and /inference
Fuse multi-interval genes
Fuse adjacent-interval imported features
Remove redundant gene xrefs
Fuse duplicate features
Package features on referenced Bioseq
Package coding-region or parts features
Delete or update EC numbers
Set Best coding-region reading frame
Retranslate coding regions
Adjust for missing stop codon
-K str
Perform a general cleanup, per the flags in str:
BasicSeqEntryCleanup
C++ BasicCleanup (via an external utility)
AdvancedSeqEntryCleanup
SeriousSeqEntryCleanup
ExtendedSeqEntryCleanup
GpipeSeqEntryCleanup
Normalize descriptor order
Remove NcbiCleanup User Objects
Synchronize genetic Codes
CDS partial from translation
Impose CDS partials
Resynchronize CDS partials
Resynchronize mRNA partials
Resynchronize Peptide partials
Adjust consensus splice
Promote to "worst" Seq-ID
Reassign local IDs
Remove locus
-L filename
Log file
-M filename
Macro file
-N str
Clean up links, per the flags in str:
Link CDS mRNA by Overlap
Link CDS mRNA by Product
Link CDS mRNA by Label and Location
Reassign feature IDs
Merge colliding feature IDs
Fix missing reciprocal feature IDs
Clear feature IDs
-O str
Missing prot-ref name
-P str
Publication options:
Remove All publications
Remove Serial number
Remove Figure, numbering, and name
Remove Remark
Update PMID-only publication
Lookup ISO Journal title abbreviation
Merge identical publication features
#
Replace unpublished with PMID
-Q str
Report:
Record count
ASN.1 BSEC report
ASN.1 SSEC report
NORM vs. SSEC report
PopPhyMutEco AutoDef report
Overlap report
Latitude-longitude country diff
Log SSEC differences
GenBank SSEC diff
asn2gb/asn2flat diff
Seg-to-delta GenBank diff
Validator SSEC diff
Modernize Gene/RNA/PCR
Unpublished Pub lookup
Published Pub lookup
Unindexed Journal report
tRNA anticodon report
Component offset report
Custom scan
-R
Remote fetching from ID (NCBI sequence databases)
-S str
Selective difference filter (capital letters skip)
SSEC
BSEC
Author
Publication
Location
RNA
Qualifier sort order
GenBank block
Package CdRegion or parts features
Move publication
Leave duplicate Bioseq publication
Automatic definition line
Pop/Phy/Mut/Eco Set definition line
-T
Taxonomy Lookup
-U str
Modernize, per the flags in str:
Genes
RNA
PCR Primers
-V str
Remove features by validator severity:
Reject
Error
Warning
Info
-X str
Miscellaneous options, per str:
Automatic definition line
Automatic definition line with Source qualifiers
Pop/Phy/Mut/Eco Set definition line
Instantiate NC title
Instantiate NM titles
Special XM titles
Instantiate Protein titles
GPipe instantiate titles
Create mRNAs for coding sequences
Fix reciprocal protein_id/transcript_id
Revert preRNA or ncRNA transcript_id
Parse anticodon from Sequence
Batch cleanup of multireader output
Wrap SegSet with NucProt set
GFF/WGS genome cleanup
-Z str
Remove indicated User-object
-a str
ASN.1 type
Any (default)
Seq-entry
Bioseq
Bioseq-set
Seq-submit
Batch Bioseq-set
Batch Seq-submit
-b
Input ASN.1 is Binary
-c
Input ASN.1 is Compressed
-d str
Source database
Any (default)
GenBank
EMBL
DDBJ
EMBL or DDBJ
INSD
RefSeq
NCBI
Exclude EMBL/DDBJ
Exclude gbcon, gbest, gbgss, gbhtg, gbpat, gbsts
-f str
Substring filter
-i filename
Single input file (defaults to stdin)
-j filename
First filename
-k filename
Last filename
-m str
Flatfile mode:
Release
Entrez
Sequin
Dump
-n path
asn2flat executable (default is /netopt/ncbi_tools/bin/asn2flat)
-o filename
Single output file (defaults to stdout)
-p path
Process all matching files in path
-q path
ffdiff executable (default is /netopt/genbank/subtool/bin/ffdiff)
-r path
Path for results
-v path
asnval executable (default is /netopt/ncbi_tools/bin/asnval)
-x ext
File selection suffix for use with -p (defaults to .ent)
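
EXAMPLES

The input and output options above combine in two common ways: cleaning a single file (-i, -o) or processing every matching file in a directory (-p, -x, -r). The following Python sketch only assembles the argument lists and prints them; it does not run cleanasn. The helper name build_cleanasn_argv and all file and directory names are hypothetical. The sketch assumes that -r names the directory where batch results are written. The options that select a cleanup are left out because their flag letters are not listed on this page.

```python
import shlex
from typing import List, Optional


def build_cleanasn_argv(
    input_file: Optional[str] = None,
    output_file: Optional[str] = None,
    input_dir: Optional[str] = None,
    suffix: Optional[str] = None,
    results_dir: Optional[str] = None,
    binary: bool = False,
    compressed: bool = False,
) -> List[str]:
    """Assemble a cleanasn argument list from the documented I/O options.

    Hypothetical helper for illustration; it does not execute anything.
    """
    # -x is documented as a file selection suffix for use with -p.
    if suffix and not input_dir:
        raise ValueError("-x is only meaningful together with -p")

    argv = ["cleanasn"]
    if binary:
        argv.append("-b")            # input ASN.1 is binary
    if compressed:
        argv.append("-c")            # input ASN.1 is compressed
    if input_file:
        argv += ["-i", input_file]   # single input file (stdin if omitted)
    if input_dir:
        argv += ["-p", input_dir]    # process all matching files in path
    if suffix:
        argv += ["-x", suffix]       # suffix used with -p (defaults to .ent)
    if output_file:
        argv += ["-o", output_file]  # single output file (stdout if omitted)
    if results_dir:
        argv += ["-r", results_dir]  # path for results (assumed batch output)
    return argv


# Single-file mode with hypothetical file names.
single = build_cleanasn_argv(input_file="record.asn",
                             output_file="record.clean.asn")

# Directory mode with hypothetical directory names and binary input.
batch = build_cleanasn_argv(input_dir="records", suffix=".ent",
                            results_dir="cleaned", binary=True)

print(shlex.join(single))
print(shlex.join(batch))
```

Each printed line is a shell-quoted command that can be reviewed, extended with the desired cleanup options, and then run by hand.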

AUTHOR

The National Center for Biotechnology Information.

SEE ALSO

asndisc(1), asnval(1), sequin(1).

2017-01-09 NCBI