SWISS::TextFunc(3pm) | User Contributed Perl Documentation | SWISS::TextFunc(3pm) |
NAME
SWISS::TextFunc
DESCRIPTION
This module is designed to be a repository of functions that are repeatedly used during parsing and formatting of SWISS-PROT/TREMBL lines. If more than two line types need to do approximately the same thing, then it is probably in here.
All functions expect to be called as package->function(param list)
- listFromText
- Takes a piece of text, a separator regex and a separator that may appear at the end. Returns an array of the items that were separated in the text by that separator. Takes care of null items (loses them for you).
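The splitting behaviour described for listFromText can be illustrated with a minimal pure-Perl sketch. This is not the module's code: the name list_from_text_sketch, its parameter names, and the choice to treat the trailing separator as a regex are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the behaviour described above (not SWISS::TextFunc code).
sub list_from_text_sketch {
    my ($text, $sep_regex, $end_regex) = @_;
    # Assumption: the separator that may appear at the end is given as a regex.
    $text =~ s/(?:$end_regex)\s*\z// if defined $end_regex;
    # Split on the separator and discard empty (null) items.
    return grep { length } split /$sep_regex/, $text;
}

my @items = list_from_text_sketch('EMBL; PIR;; PROSITE.', ';\s*', '\.');
print join('|', @items), "\n";   # prints EMBL|PIR|PROSITE
```

The empty item between the two consecutive semicolons is dropped, which is the "null items" handling the description refers to.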
- textFromList
- Takes an array of items, a separator, a terminating string, and a line
width. Returns an array of strings, each ending with the separator or the
terminator, and each no wider than the specified width.
Seems to do the wrong thing for references - not sure why. Don't use it for that.
- wrapText
- Takes a string and a length. Returns an array of strings, each no longer than the given length, splitting the string on white space.
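As an illustration of wrapping on white space, here is a minimal greedy sketch in plain Perl. It is not the module's implementation: wrap_text_sketch is a hypothetical name, and details such as whether the real function keeps trailing blanks on each line are not covered by the description above.

```perl
use strict;
use warnings;

# Hypothetical greedy wrapper (not SWISS::TextFunc code).
sub wrap_text_sketch {
    my ($text, $width) = @_;
    my @lines;
    my $current = '';
    for my $word (split ' ', $text) {
        if ($current eq '') {
            # A single word longer than $width is kept whole on its own line.
            $current = $word;
        }
        elsif (length($current) + 1 + length($word) <= $width) {
            $current .= " $word";
        }
        else {
            push @lines, $current;
            $current = $word;
        }
    }
    push @lines, $current if $current ne '';
    return @lines;
}

print "$_\n"
    for wrap_text_sketch('14-3-3 PROTEIN BETA/ALPHA (PROTEIN KINASE C INHIBITOR PROTEIN-1)', 30);
```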
- wrapOn ($firstLinePrefix, $linePrefix, $colums, $text[, @separators])
- Wraps $text into lines with at most $colums columns. Prepends the prefixes to the lines. @separators is a list of expressions on which to wrap. The expression itself is part of the upper line.
If no @separators are provided, the $text is wrapped at whitespace except in EC/TC numbers or at dashes that separate words.
First tries to wrap on the first item of @separators, then on the next, and so on. If no wrap on any element of @separators or on whitespace is possible, wraps into lines of exactly $colums characters.
A special case is that the first item of @separators may be a reference to an array. This is used internally for wrapping FT VARIANT-like lines.
Example:
wrapOn('DE ', 'DE ', 40, '14-3-3 PROTEIN BETA/ALPHA (PROTEIN KINASE C INHIBITOR PROTEIN-1)', '\s+')
returns
['14-3-3 PROTEIN BETA/ALPHA (PROTEIN ', 'KINASE C INHIBITOR PROTEIN-1)']
wrapOn('DE ', 'DE ', 40, '14-3-3 PROTEIN BETA/ALPHA (PROTEIN KINASE C INHIBITOR PROTEIN-1)', ' (?=\()', '\s+')
returns
['14-3-3 PROTEIN BETA/ALPHA ', '(PROTEIN KINASE C INHIBITOR PROTEIN-1)']
- cleanLine
- Removes the leading line identifier, the three blanks that follow it, and any trailing spaces from a SWISS-PROT line.
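The clean-up described for cleanLine could be approximated with two substitutions. This is a sketch only: clean_line_sketch is a hypothetical name, and it assumes the line identifier is exactly two non-blank characters (for example DE or CC).

```perl
use strict;
use warnings;

# Hypothetical stand-in (not SWISS::TextFunc code).
sub clean_line_sketch {
    my ($line) = @_;
    # Assumption: a two-character line identifier followed by three blanks.
    $line =~ s/^\S{2} {3}//;
    # Strips all trailing whitespace, including a final newline if present.
    $line =~ s/\s+\z//;
    return $line;
}

print clean_line_sketch("DE   14-3-3 PROTEIN BETA/ALPHA.  \n"), "\n";
# prints 14-3-3 PROTEIN BETA/ALPHA.
```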
- joinWith ($text, $with, $noAddAfter, @list)
- Concatenates $text and @list into one string. Adds $with between the original elements, unless the current string ends with $noAddAfter. This is used to avoid inserting blanks after hyphens during concatenation, so that unpleasant strings like 'CALMODULIN- DEPENDENT' are avoided. Unfortunately, strings like 'CARBON-DIOXIDE' are not reassembled correctly.
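The joining rule for joinWith can be sketched in a few lines of plain Perl. This is not the module's code: join_with_sketch is a hypothetical name, and treating $noAddAfter as a literal string (rather than a regex) and skipping the joiner when the text is still empty are both assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in (not SWISS::TextFunc code).
sub join_with_sketch {
    my ($text, $with, $no_add_after, @list) = @_;
    for my $item (@list) {
        # Assumption: $no_add_after is matched literally at the end of the string.
        my $ends_with = length($no_add_after) && $text =~ /\Q$no_add_after\E\z/;
        # Skip the joiner after $no_add_after, and when nothing has been built yet.
        $text .= $with unless $text eq '' || $ends_with;
        $text .= $item;
    }
    return $text;
}

print join_with_sketch('CALCIUM/CALMODULIN-', ' ', '-', 'DEPENDENT', 'PROTEIN KINASE'), "\n";
# prints CALCIUM/CALMODULIN-DEPENDENT PROTEIN KINASE
```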
- insertLineGroup ($textRef, $text, $pattern)
- Inserts text block $text into the text referred to by $textRef. $text will replace the text block in $textRef matched by $pattern.
- uniqueList (@list)
- Returns a list in which all duplicates from @list have been removed.
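Duplicate removal of this kind is commonly written in Perl with a %seen hash. The sketch below shows that idiom; unique_list_sketch is a hypothetical name, and preserving the order of first occurrences is an assumption, since the description above does not say how the result is ordered.

```perl
use strict;
use warnings;

# Hypothetical stand-in (not SWISS::TextFunc code).
sub unique_list_sketch {
    my %seen;
    # Keep an element only the first time it is encountered.
    return grep { !$seen{$_}++ } @_;
}

print join(',', unique_list_sketch(qw(P12345 Q67890 P12345 O11111 Q67890))), "\n";
# prints P12345,Q67890,O11111
```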
- currentSpDate
- Returns the current date in SWISS-PROT format.
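For orientation, a date formatter of this kind might look like the sketch below. It assumes that "SWISS-PROT format" means the DD-MMM-YYYY form used on DT lines (for example 21-JUL-1986); sp_date_sketch is a hypothetical name, and the exact string returned by currentSpDate is not confirmed by the description above.

```perl
use strict;
use warnings;

# Hypothetical stand-in (not SWISS::TextFunc code).
sub sp_date_sketch {
    my ($epoch) = @_;
    $epoch = time unless defined $epoch;
    # Fixed English abbreviations, so the result does not depend on the locale.
    my @months = qw(JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC);
    my ($mday, $mon, $year) = (localtime $epoch)[3, 4, 5];
    return sprintf '%02d-%s-%04d', $mday, $months[$mon], $year + 1900;
}

print sp_date_sketch(), "\n";
```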
- toMixedCase($text, @regexps)
- Convert a text to mixed case, according to one or more regular expressions. In scalar context, returns the new text; in array context, also returns the regexp with which the change was performed, or undef on failure. See corresponding item in SWISS::GN for more details.
2021-08-15 | perl v5.32.1 |