
BIBCLEAN(1) General Commands Manual BIBCLEAN(1)

NAME

bibclean - prettyprint and syntax check BibTeX and Scribe bibliography data base files

SYNOPSIS

bibclean [ -author ] [ -error-log filename ] [ -help ] [ -? ] [ -init-file filename ] [ -long-field fieldname ] [ -max-width nnn ] [ -[no-]align-equals ] [ -[no-]check-values ] [ -[no-]delete-empty-values ] [ -[no-]file-position ] [ -[no-]fix-font-changes ] [ -[no-]fix-initials ] [ -[no-]fix-names ] [ -[no-]German-style ] [ -[no-]keep-linebreaks ] [ -[no-]keep-parbreaks ] [ -[no-]keep-preamble-spaces ] [ -[no-]keep-spaces ] [ -[no-]keep-string-spaces ] [ -[no-]parbreaks ] [ -[no-]prettyprint ] [ -[no-]print-patterns ] [ -[no-]read-init-files ] [ -[no-]remove-OPT-prefixes ] [ -[no-]scribe ] [ -[no-]trace-file-opening ] [ -[no-]warnings ] [ -version ] ( <infile | bibfile1 bibfile2 bibfile3 ... ) >outfile

All options can be abbreviated to a unique leading prefix.

An explicit file name of ``-'' represents standard input; it is assumed if no input files are specified.

DESCRIPTION

bibclean prettyprints input BibTeX files to stdout, and checks the brace balance and bibliography entry syntax as well. It can be used to detect problems in BibTeX files that sometimes confuse even BibTeX itself, and importantly, can be used to normalize the appearance of collections of BibTeX files.

Here is a summary of the formatting actions:

BibTeX items are formatted into a consistent structure with one field = "value" pair per line, and the initial @ and trailing right brace in column 1.
Tabs are expanded into blank strings; their use is discouraged because they inhibit portability, and can suffer corruption in electronic mail.
Long string values are split at a blank and continued onto the next line with leading indentation.
A single blank line separates adjacent bibliography entries.
Text outside BibTeX entries is passed through verbatim.
Outer parentheses around entries are converted to braces.
Personal names in author and editor field values are normalized to the form ``P. D. Q. Bach'', from ``P.D.Q. Bach'' and ``Bach, P.D.Q.''.
Hyphen sequences in page numbers are converted to en-dashes.
Month values are converted to standard BibTeX string abbreviations.
In titles, sequences of upper-case characters at brace level zero are braced to protect them from being converted to lower-case letters by some bibliography styles.
CODEN, ISBN (International Standard Book Number) and ISSN (International Standard Serial Number) entry values are examined to verify the checksums of each listed number, and correct ISBN hyphenation is automatically supplied.
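The name normalization in the list above can be illustrated with a minimal Python sketch. It is not bibclean's implementation: the function names are hypothetical, and it handles only a single name, ignoring brace levels, ``and''-separated name lists, and suffixes such as ``Jr.''.

```python
import re


def fix_initials(name: str) -> str:
    """Insert a space after a period that is directly followed by a letter."""
    return re.sub(r"\.(?=[A-Za-z])", ". ", name)


def normalize_name(name: str) -> str:
    """Turn ``Last, First'' into ``First Last'', then space out the initials."""
    name = name.strip()
    if "," in name:
        # Assumption: exactly one comma separating the last name from the rest.
        last, first = (part.strip() for part in name.split(",", 1))
        name = f"{first} {last}"
    return fix_initials(name)


if __name__ == "__main__":
    for raw in ("P.D.Q. Bach", "Bach, P.D.Q."):
        print(raw, "->", normalize_name(raw))
```

Both sample inputs from the list come out as ``P. D. Q. Bach''.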

The standardized format of the output of bibclean facilitates the later application of simple filters, such as bibcheck(1), bibdup(1), bibextract(1), bibindex(1), bibjoin(1), biblabel(1), biblook(1), biborder(1), bibsort(1), citefind(1), and citetags(1), to process the text, and also is the one expected by the GNU Emacs BibTeX support functions.

OPTIONS

Command-line switches may be abbreviated to a unique leading prefix, and letter case is not significant. All options are parsed before any input bibliography files are read, no matter what their order on the command line. Options that correspond to a yes/no setting of a flag have a form with a prefix "no-" to set the flag to no. For such options, the last setting determines the flag value used. This is significant when options are also specified in initialization files (see the INITIALIZATION FILES manual section).

The leading hyphen that distinguishes an option from a filename may be doubled, for compatibility with GNU and POSIX conventions. Thus, -author and --author are equivalent.

To avoid confusion with options, if a filename begins with a hyphen, it must be disguised by a leading absolute or relative directory path, e.g., /tmp/-foo.bib or ./-foo.bib.

-author
Display an author credit on the standard error unit, stderr, and then exit with a success return code. Sometimes an executable program is separated from its documentation and source code; this option provides a way to recover from that.
-error-log filename
Redirect stderr to the indicated file, which will then contain all of the error and warning messages. This option is provided for those systems that have difficulty redirecting stderr.
-help or -?
Display a help message on stderr, giving a usage description, similar to this section of the manual pages, and then exit with a success return code.
-init-file filename
Provide an explicit value pattern initialization file. It will be processed after any system-wide and job-wide initialization files, and may override them. It in turn may be overridden by a subsequent file-specific initialization file. For further details, see the INITIALIZATION FILES manual section.
-long-field fieldname
Suppress warnings that fields named fieldname have lengths exceeding the standard BibTeX limits. NB! This is a Debian-specific extension!
-max-width nnn
bibclean normally limits output line widths to 72 characters, and in the interests of consistency, that value should not be changed. Occasionally, special-purpose applications may require different maximum line widths, so this option provides that capability. The number following the option name can be specified in decimal, octal (starting with 0), or hexadecimal (starting with 0x). A zero or negative value is interpreted to mean unlimited, so -max-width 0 can be used to ensure that each field/value pair appears on a single line.
When -no-prettyprint requests bibclean to act as a lexical analyzer, the default line width is unlimited, unless overridden by this option.
When bibclean is prettyprinting, line wrapping will be done only at a space. Consequently, a long non-blank character sequence may result in the output exceeding the requested line width.
When bibclean is lexing, line wrapping is done by inserting a backslash-newline pair when the specified maximum is reached, so no line length will ever exceed the maximum.
-[no-]align-equals
With the positive form, align the equals sign in key/value assignments at the same column, separated by a single space from the value string. Otherwise, the equals sign follows the key, separated by a single space. Default: no.
-[no-]check-values
With the positive form, apply heuristic pattern matching to field values in order to detect possible errors (e.g., ``year = "192"'' instead of ``year = "1992"''), and issue warnings when unexpected patterns are found.
This checking is usually beneficial, but if it produces too many bogus warnings for a particular bibliography file, you can disable it with the negative form of this option. Default: yes.
-[no-]delete-empty-values
With the positive form, remove all field/value pairs for which the value is an empty string. This is helpful in cleaning up bibliographies generated from text editor templates. Compare this option with -[no-]remove-OPT-prefixes described below. Default: no.
-[no-]file-position
With the positive form, give detailed file position information in warning and error messages. Default: no.
-[no-]fix-font-changes
With the positive form, supply an additional brace level around font changes in titles to protect against downcasing by some BibTeX styles. Font changes that already have more than one level of braces are not modified.
For example, if a title contains the Latin phrase {\em Dictyostelium Discoideum} or {\em {D}ictyostelium {D}iscoideum}, then downcasing will incorrectly convert the phrase to lower-case letters. Most BibTeX users are surprised that bracing the initial letters does not prevent the downcase action. The correct coding is {{\em Dictyostelium Discoideum}}. However, there are also legitimate cases where an extra level of bracing wrongly protects from downcasing. Consequently, bibclean will normally not supply an extra level of braces, but if you have a bibliography where the extra braces are routinely missing, you can use this option to supply them.
If you think that you need this option, it is strongly recommended that you apply bibclean to your bibliography file with and without -fix-font-changes, then compare the two output files to ensure that extra braces are not being supplied in titles where they should not be present. You will have to decide which of the two output files is the better choice, then repair the incorrect title bracing by hand.
Since font changes in titles are uncommon, except for cases of the type which this option is designed to correct, it should do more good than harm. Default: no.
-[no-]fix-initials
With the positive form, insert a space after a period following author initials. Default: yes.
-[no-]fix-names
With the positive form, reorder author and editor name lists to remove commas at brace level zero, placing first names or initials before last names. Default: yes.
-[no-]German-style
With the positive form, interpret quote characters ["] inside braced value strings at brace level 1 according to the conventions of the TeX style file german.sty, which overloads quote to simplify input and representation of German umlaut accents, sharp-s (es-zet), ligature separators, invisible hyphens, raised/lowered quotes, French guillemets, and discretionary hyphens. Recognized character combinations will be braced to prevent BibTeX from interpreting the quote as a string delimiter.
Quoted strings receive no special handling from this option, and since German nouns in titles must anyway be protected from the downcasing operation of most BibTeX bibliography styles, German value strings that use the overloaded quote character can always be entered in the form "{...}", without the need to specify this option at all.
Default: no.
-[no-]keep-linebreaks
Normally, line breaks inside value strings are collapsed into a single space, so that long value strings can later be broken to provide lines of reasonable length.
With the positive form, linebreaks are preserved in value strings. If -max-width is set to zero, this preserves the original line breaks. Spacing outside value strings remains under bibclean's control, and is not affected by this option.
Default: no.
-[no-]keep-parbreaks
With the positive form, preserve paragraph breaks (either formfeeds, or lines containing only spaces) in value strings. Normally, paragraph breaks are collapsed into a single space. Spacing outside value strings remains under bibclean's control, and is not affected by this option. Default: no.
-[no-]keep-preamble-spaces
With the positive form, preserve all whitespace in @Preamble{...} entries. Default: no.
-[no-]keep-spaces
With the positive form, preserve all spaces in value strings. Normally, multiple spaces are collapsed into a single space. This option can be used together with -keep-linebreaks, -keep-parbreaks, and -max-width 0 to preserve the form of value strings while still providing syntax and value checking. Spacing outside value strings remains under bibclean's control, and is not affected by this option. Default: no.
-[no-]keep-string-spaces
With the positive form, preserve all whitespace in @String{...} entries. Default: no.
-[no-]parbreaks
With the negative form, a paragraph break (either a formfeed, or a line containing only spaces) is not permitted in value strings, or between field/value pairs. This may be useful to quickly trap runaway strings arising from mismatched delimiters. Default: yes.
-[no-]prettyprint
Normally, bibclean functions as a prettyprinter. However, with the negative form of this option, it acts as a lexical analyzer instead, producing a stream of lexical tokens. See the LEXICAL ANALYSIS manual section for further details. Default: yes.
-[no-]print-patterns
With the positive form, print the value patterns read from initialization files as they are added to internal tables. Use this option to check newly-added patterns, or to see what patterns are being used.
These patterns are the ones that will be used in checking value strings for valid syntax, and all of them are specified in initialization files, rather than hard-coded into the program. For further details, see the INITIALIZATION FILES manual section. Default: no.
-[no-]read-init-files
With the negative form, suppress loading of system-, user-, and file-specific initialization files. Initializations will come only from those files explicitly given by -init-file filename options. Default: yes.
-[no-]remove-OPT-prefixes
With the positive form, remove the ``OPT'' prefix from each field name where the corresponding value is not an empty string. The prefix ``OPT'' must be entirely in upper-case to be recognized.
This option is for bibliographies generated with the help of the GNU Emacs BibTeX editing support, which generates templates with optional fields identified by the ``OPT'' prefix. Although the function M-x bibtex-remove-OPT, normally bound to the keystrokes C-c C-o, does the job, users often forget, with the result that BibTeX does not recognize the field name, and ignores the value string. Compare this option with -[no-]delete-empty-values described above. Default: no.
-[no-]scribe
With the positive form, accept input syntax conforming to the Scribe document system. The output will be converted to conform to BibTeX syntax. See the SCRIBE BIBLIOGRAPHY FORMAT manual section for further details. Default: no.
-[no-]trace-file-opening
With the positive form, record in the error log file the names of all files which bibclean attempts to open. Use this option to identify where initialization files are located. Default: no.
-[no-]warnings
With the positive form, allow all warning messages. The negative form is not recommended since it may mask problems that should be repaired. Default: yes.
-version
Display the program version number on stderr, and then exit with a success return code. This will also include an indication of who compiled the program, the host name on which it was compiled, the time of compilation, and the type of string-value matching code selected, when that information is available to the compiler.
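The number syntax accepted by -max-width (decimal, octal with a leading 0, hexadecimal with a leading 0x, and zero or negative meaning unlimited) can be sketched in Python as follows. This is only an illustration of the rules stated above; the function name is hypothetical and bibclean's own parsing code may differ in detail.

```python
from typing import Optional


def parse_max_width(text: str) -> Optional[int]:
    """Return the width as a positive integer, or None for 'unlimited'."""
    s = text.strip()
    negative = s.startswith("-")
    digits = s[1:] if s[:1] in "+-" else s
    lowered = digits.lower()
    if lowered.startswith("0x"):
        value = int(lowered[2:], 16)      # hexadecimal, e.g. 0x48
    elif lowered.startswith("0") and len(lowered) > 1:
        value = int(lowered, 8)           # octal, e.g. 0110
    else:
        value = int(lowered, 10)          # plain decimal, e.g. 72
    if negative:
        value = -value
    # Zero or negative is interpreted as unlimited line width.
    return value if value > 0 else None


if __name__ == "__main__":
    for sample in ("72", "0110", "0x48", "0", "-1"):
        print(sample, "->", parse_max_width(sample))
```

The first three samples all denote the default width of 72; the last two denote unlimited width.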

ERROR RECOVERY AND WARNINGS

When bibclean detects an error, it issues an error message to both stderr and stdout. That way, the user is clearly notified, and the output bibliography also contains the message at the point of error.

Error messages begin with a distinctive pair of queries, ??, beginning in column 1, followed by the input file name and line number. If the -file-position option was specified, they also contain the input and output positions of the current file, entry, and value. Each position includes the file byte number, the line number, and the column number. In the event of a runaway string argument, the entry and value positions should precisely pinpoint the erroneous bibliography entry, and the file positions will indicate where it was detected, which may be rather later in the files.

Warning messages identify possible problems, and are therefore sent only to stderr, and not to stdout, so they never appear in the output file. They are identified by a distinctive pair of percents, %%, beginning in column 1, and as with error messages, may be followed by file position messages if the -file-position option was specified.

For convenience, the first line of each error and warning message sent to stderr is formatted according to the expectations of the GNU Emacs next-error command. You can invoke bibclean with the Emacs M-x compile<RET>bibclean filename.bib >filename.new command, then use the next-error command, normally bound to C-x ` (that's a grave, or back, accent), to move to the location of the error in the input file.

If error messages are ignored, and left in the output bibliography file, they will precipitate an error when the bibliography is next processed with BibTeX.
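Because error messages start with ?? in column 1 and are copied into the output, a cleaned file can be checked for leftover messages before it is handed to BibTeX. The Python sketch below relies only on that prefix rule; the function name is hypothetical and the sample message text is made up for illustration.

```python
from typing import List, Tuple


def find_error_lines(text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs for lines starting with '??'."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if line.startswith("??")
    ]


if __name__ == "__main__":
    # Hypothetical output fragment; real message wording will differ.
    sample = "@Article{a,\n?? (error message text)\n}\n"
    for number, line in find_error_lines(sample):
        print(f"line {number}: {line}")
```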

After issuing an error message, bibclean then resynchronizes its input by copying it verbatim to stdout until a new bibliography entry is recognized on a line in which the first non-blank character is an at-sign (@). This ensures that nothing is lost from the input file(s), allowing corrections to be made in either the input or the output files. However, if bibclean detects an internal error in its data structures, it will terminate abruptly without further input or output processing; this kind of error should never happen, and if it does, it should be reported immediately to the author of the program. Errors in initialization files, and running out of dynamic memory, will also immediately terminate bibclean.

INITIALIZATION FILES

bibclean can be compiled with one of three different types of pattern matching; the choice is made by the installer at compile time:

The original version uses explicit hand-coded tests of value-string syntax.
The second version uses regular-expression pattern-matching host library routines together with regular-expression patterns that come entirely from initialization files.
The third version uses special patterns that come entirely from initialization files.

This Debianized version of bibclean uses the third version. However, command-line options can also be specified in initialization files, no matter which pattern matching choice was selected.

When bibclean starts, it searches for initialization files, using the first one of $(HOME)/.bibcleanrc, /usr/share/bibcleanrc, and /etc/bibcleanrc that exists. Afterwards, it reads the first .bibcleanrc found in the BIBINPUTS search path. The name .bibcleanrc can be changed at run time through a setting of the environment variable BIBCLEANINI. If the name starts with a dot, the dot is stripped when looking in /usr/share and /etc.

Then, when command-line arguments are processed, any additional files specified by -init-file filename options are also processed. Finally, immediately before each named bibliography file is processed, an attempt is made to process an initialization file with the same name, but with the extension changed to .ini. The default extension can be changed by a setting of the environment variable BIBCLEANEXT. This scheme permits system-wide, user-wide, session-wide, and file-specific initialization files to be supported.

When input is taken from stdin, there is no file-specific initialization.

For precise control, the -no-read-init-files option suppresses all initialization files except those explicitly named by -init-file filename options, either on the command line, or in requested initialization files.

Recursive execution of initialization files with nested -init-file options is permitted; if the recursion is circular, bibclean will finally get a non-fatal initialization file open failure after opening too many files. This terminates further initialization file processing. As the recursion unwinds, the files are all closed, then execution proceeds normally.

An initialization file may contain empty lines, comments from percent to end of line (just like TeX), option switches, and field/pattern or field/pattern/message assignments. Leading and trailing spaces are ignored. This is best illustrated by a short example:

% This is a small bibclean initialization file
-init-file /u/math/bib/.bibcleanrc %% departmental patterns
chapter = "\"D\""                 %% 23
pages   = "\"D--D\""              %% 23--27
volume  = "\"D \\an\\d D\""       %% 11 and 12
year    = \
        "\"dddd, dddd, dddd\"" \
        "Multiple years specified." %% 1989, 1990, 1991
-no-fix-names                     %% do not modify author/editor lists

Long logical lines can be split into multiple physical lines by breaking at a backslash-newline pair; the backslash-newline pair is discarded. This processing happens while characters are being read, before any further interpretation of the input stream.

Each logical line must contain a complete option (and its value, if any), or a complete field/pattern pair, or a field/pattern/message triple.

Comments are stripped during the parsing of the field, pattern, and message values. The comment start symbol is not recognized inside quoted strings, so it can be freely used in such strings.

Comments on logical lines that were input as multiple physical lines via the backslash-newline convention must appear on the last physical line; otherwise, the remaining physical lines will become part of the comment.
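The two rules just described, joining physical lines at backslash-newline pairs and stripping comments that start outside quoted strings, can be sketched in Python. This is an illustration of the stated rules rather than bibclean's own reader; the function names are hypothetical, and the sketch assumes that a backslash escapes the next character only inside a quoted string.

```python
from typing import List


def logical_lines(text: str) -> List[str]:
    """Discard backslash-newline pairs, then split into logical lines."""
    return text.replace("\\\n", "").split("\n")


def strip_comment(line: str) -> str:
    """Remove a %-comment unless the percent sign is inside a quoted string."""
    in_string = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_string and ch == "\\":
            i += 2            # skip the escaped character, e.g. \" or \%
            continue
        if ch == '"':
            in_string = not in_string
        elif ch == "%" and not in_string:
            return line[:i].strip()
        i += 1
    return line.strip()


if __name__ == "__main__":
    sample = 'year = \\\n  "\\"dddd\\"" \\\n  "Multiple years." %% 1989\n'
    for line in logical_lines(sample):
        print(repr(strip_comment(line)))
```

Joining happens first, which is why a comment placed on an earlier physical line would swallow the rest of the logical line.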

Pattern strings must be enclosed in quotation marks; within such strings, a backslash starts an escape mechanism that is commonly used in UNIX software. The recognized escape sequences are:

\a
alarm bell (octal 007)
\b
backspace (octal 010)
\f
formfeed (octal 014)
\n
newline (octal 012)
\r
carriage return (octal 015)
\t
horizontal tab (octal 011)
\v
vertical tab (octal 013)
\ooo
character number octal ooo (e.g., \012 is linefeed). Up to 3 octal digits may be used.
\0xhh
character number hexadecimal hh (e.g., \0x0a is linefeed). xhh may be in either letter case. Any number of hexadecimal digits may be used.

Backslash followed by any other character produces just that character. Thus, \% gets a literal percent into a string (preventing its interpretation as a comment), \" produces a quotation mark, and \\ produces a single backslash.

An ASCII NUL (\0) in a string will terminate it; this is a feature of the C programming language in which bibclean is implemented.

Field/pattern pairs can be separated by arbitrary space, and optionally, either an equals sign or colon functioning as an assignment operator. Thus, the following are equivalent:

pages="\"D--D\""
pages:"\"D--D\""
pages "\"D--D\""

pages = "\"D--D\""
pages : "\"D--D\""
pages   "\"D--D\""

Each field name can have an arbitrary number of patterns associated with it; however, they must be specified in separate field/pattern assignments.

An empty pattern string causes previously-loaded patterns for that field name to be forgotten. This feature permits an initialization file to completely discard patterns from earlier initialization files.

Patterns for value strings are represented in a tiny special-purpose language that is both convenient and suitable for bibliography value-string syntax checking. While not as powerful as the language of regular-expression patterns, its parsing can be portably implemented in less than 3% of the code in a widely-used regular-expression parser (the GNU regexp package).

The patterns are represented by the following special characters:

<space>
one or more spaces
a
exactly one letter
A
one or more letters
d
exactly one digit
D
one or more digits
r
exactly one Roman numeral
R
one or more Roman numerals (i.e. a Roman number)
w
exactly one word (one or more letters and digits)
W
one or more space-separated words, beginning and ending with a word
.
one `special' character, one of the characters <space>!#()*+,-./:;?[]~, a subset of punctuation characters that are typically used in string values
:
one or more `special' characters
X
one or more `special'-separated words, beginning and ending with a word
\x
exactly one x (x is any character), possibly with an escape sequence interpretation given earlier
x
exactly the character x (x is anything but one of these pattern characters: aAdDrRwW.:<space>\)

The X pattern character is very powerful, but generally inadvisable, since it will match almost anything likely to be found in a BibTeX value string. The reason for providing pattern matching on the value strings is to uncover possible errors, not mask them.
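To make the pattern language concrete, here is a minimal Python sketch that translates a pattern into a regular expression. It is not bibclean's matcher, and several points are assumptions: the names are hypothetical, letters and digits are taken to be ASCII, the Roman numeral letters are taken to be IVXLCDM in either case, \x is treated simply as a literal x, and regular-expression backtracking may accept or reject some strings differently from the real program. The match is anchored only at the start of the value, so a pattern may end before the value does.

```python
import re

LETTER = "[A-Za-z]"
DIGIT = "[0-9]"
ROMAN = "[IVXLCDMivxlcdm]"          # assumption: which letters count as Roman
WORD = "[A-Za-z0-9]+"
SPECIAL = "[" + re.escape(" !#()*+,-./:;?[]~") + "]"

TOKENS = {
    " ": " +",
    "a": LETTER, "A": LETTER + "+",
    "d": DIGIT, "D": DIGIT + "+",
    "r": ROMAN, "R": ROMAN + "+",
    "w": WORD,
    "W": f"{WORD}(?: +{WORD})*",
    ".": SPECIAL, ":": SPECIAL + "+",
    "X": f"{WORD}(?:{SPECIAL}+{WORD})*",
}


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a bibclean-style pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))   # \x: literal x
            i += 2
            continue
        parts.append(TOKENS.get(ch, re.escape(ch)))   # other x: literal x
        i += 1
    return re.compile("".join(parts))


def value_matches(pattern: str, value: str) -> bool:
    """True if the value starts with text matching the whole pattern."""
    return pattern_to_regex(pattern).match(value) is not None


if __name__ == "__main__":
    print(value_matches('"D--D"', '"23--27"'))
    print(value_matches('"D"', '"192a"'))
```

The patterns passed here are the strings as they look after the initialization file's own quoting and escapes have been processed, so "\"D--D\"" in a file corresponds to "D--D" (with its quotation marks) in this sketch.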

There is no provision for specifying ranges or repetitions of characters, but this can usually be done with separate patterns. It is a good idea to accompany the pattern with a comment showing the kind of thing it is expected to match. Here is a portion of an initialization file giving a few of the patterns used to match number value strings:

number  =       "\"D\""         %% 23
number  =       "\"A AD\""      %% PN LPS5001
number  =       "\"A D(D)\""    %% RJ 34(49)
number  =       "\"A D\""       %% XNSS 288811
number  =       "\"A D\\.D\""   %% Version 3.20
number  =       "\"A-A-D-D\""   %% UMIAC-TR-89-11
number  =       "\"A-A-D\""     %% CS-TR-2189
number  =       "\"A-A-D\\.D\"" %% CS-TR-21.7

For a bibliography that contains only article entries, this list should probably be reduced to just the first pattern, so that anything other than a digit string fails the pattern-match test. This is easily done by keeping bibliography-specific patterns in a corresponding file with extension .ini, since that file is read automatically.

You should be sure to use empty pattern strings in this pattern file to discard patterns from earlier initialization files.

The value strings passed to the pattern matcher contain surrounding quotes, so the patterns should also. However, you could use a pattern specification like "\"D" to match an initial digit string followed by anything else; the omission of the final quotation mark \" in the pattern allows the match to succeed without checking that the next character in the value string is a quotation mark.

Because the value strings are intended to be processed by TeX, the pattern matching ignores braces, and TeX control sequences, together with any space following those control sequences. Spaces around braces are preserved. This convention allows the pattern fragment A-AD-D to match the value string TN-K\slash 27-70, because the value is implicitly collapsed to TN-K27-70 during the matching operation.
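The implicit collapsing of TeX markup can be pictured with a short Python sketch. It is an approximation under stated assumptions, not bibclean's code: the function name is hypothetical, a control sequence is taken to be a backslash followed by either a run of ASCII letters or one other character, and only ordinary spaces after it are dropped.

```python
import re

# A backslash, then letters (control word) or any single character
# (control symbol), then any spaces that follow.
CONTROL_SEQUENCE = re.compile(r"\\(?:[A-Za-z]+|.) *")


def collapse_tex(value: str) -> str:
    """Drop TeX control sequences (with trailing spaces) and braces."""
    without_commands = CONTROL_SEQUENCE.sub("", value)
    return without_commands.replace("{", "").replace("}", "")


if __name__ == "__main__":
    print(collapse_tex(r"TN-K\slash 27-70"))
```

The sample value from the text collapses to TN-K27-70, which is what the pattern fragment A-AD-D is then matched against.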

bibclean's normal action when a string value fails to match any of the corresponding patterns is to issue a warning message something like this: "Unexpected value in ``year = "192"''". In most cases, that is sufficient to alert the user to a problem. In some cases, however, it may be desirable to associate a different message with a particular pattern. This can be done by supplying a message string following the pattern string. Format items %% (single percent), %e (entry name), %f (field name), %k (citation key), and %v (string value) are available to get current values expanded in the messages. Here is an example:

chapter = "\"D:D\"" "Colon found in ``%f = %v''" %% 23:2

To be consistent with other messages output by bibclean, the message string should not end with punctuation.

If you wish to make the message an error, rather than just a warning, begin it with a query (?), like this:

chapter = "\"D:D\"" "?Colon found in ``%f = %v''" %% 23:2

The query will not be included in the output message.
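The format items and the leading query can be illustrated with a small Python sketch. It follows only the rules stated above; the function name, its argument order, and the sample entry name and citation key are assumptions made for the illustration.

```python
from typing import Tuple


def expand_message(template: str, entry: str, field: str,
                   key: str, value: str) -> Tuple[str, str]:
    """Return (severity, text) for a pattern message template."""
    is_error = template.startswith("?")
    if is_error:
        template = template[1:]          # the query is not part of the output
    substitutions = {"%": "%", "e": entry, "f": field, "k": key, "v": value}
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template) and template[i + 1] in substitutions:
            out.append(substitutions[template[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return ("error" if is_error else "warning"), "".join(out)


if __name__ == "__main__":
    print(expand_message("?Colon found in ``%f = %v''",
                         "Article", "chapter", "Knuth:TB86", '"23:2"'))
```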

Escape sequences are supported in message strings, just as they are in pattern strings. You can use this to advantage for fancy things, such as terminal display mode control. If you rewrite the previous example as

chapter = "\"D:D\"" \
        "?\033[7mColon found in ``%f = %v''\033[0m" %% 23:2

the error message will appear in inverse video on display screens that support ANSI terminal control sequences. Such practice is not normally recommended, since it may have undesirable effects on some output devices. Nevertheless, you may find it useful for restricted applications.

For some types of bibliography fields, bibclean contains special-purpose code to supplement or replace the pattern matching:

CODEN, ISBN and ISSN field values are handled this way because their validation requires evaluation of checksums that cannot be expressed by simple patterns; no patterns are even used in these three cases.
chapter, number, pages, and volume values are checked only by pattern matching.
month values are first checked against the standard BibTeX month abbreviations, and only if no match is found are patterns then used.
year values are first checked against patterns, then if no match is found, the year numbers are found and converted to integer values for testing against reasonable bounds.
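As an example of a check that a simple pattern cannot express, here is a Python sketch of the standard modulus-11 checksums for 10-digit ISBNs and for ISSNs. It shows the general technique only, with hypothetical function names; it is not taken from bibclean, does not cover CODEN values, and does not attempt ISBN hyphenation.

```python
def _mod11_ok(text: str, length: int) -> bool:
    """Check a weighted modulus-11 checksum over `length` characters."""
    chars = [c for c in text.upper() if c.isdigit() or c == "X"]
    # X (value 10) is only allowed as the final check character.
    if len(chars) != length or "X" in chars[:-1]:
        return False
    values = [10 if c == "X" else int(c) for c in chars]
    weights = range(length, 0, -1)        # length, length-1, ..., 1
    return sum(w * v for w, v in zip(weights, values)) % 11 == 0


def isbn10_is_valid(isbn: str) -> bool:
    """Validate a 10-digit ISBN, ignoring hyphens and spaces."""
    return _mod11_ok(isbn, 10)


def issn_is_valid(issn: str) -> bool:
    """Validate an 8-character ISSN, ignoring the hyphen."""
    return _mod11_ok(issn, 8)


if __name__ == "__main__":
    print(isbn10_is_valid("0-201-13448-9"))
    print(issn_is_valid("0378-5955"))
```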

Values for other fields are checked only against patterns. You can provide patterns for any field you like, even ones bibclean does not already know about. New ones are simply added to an internal table that is searched for each string to be validated.

The special field, key, represents the bibliographic citation key. It can be given patterns, like any other field. Here is an initialization file pattern assignment that will match an author name, a colon, an alphabetic string, and a two-digit year:

key = "A:Add"                     %% Knuth:TB86

Notice that no quotation marks are included in the pattern, because the citation keys are not quoted. You can use such patterns to help enforce uniform naming conventions for citation keys, which is increasingly important as your bibliography data base grows.

LEXICAL ANALYSIS

When -no-prettyprint is specified, bibclean acts as a lexical analyzer instead of a prettyprinter, producing output in lines of the form

<token-number><tab><token-name><tab>"<token-value>"

Each output line contains a single complete token, identified by a small integer number for use by a computer program, a token type name for human readers, and a string value in quotes.

Special characters in the token value string are represented with ANSI/ISO Standard C escape sequences, so all characters other than NUL are representable, and multi-line values can be represented in a single line.
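A program consuming this token stream might split each line as in the following Python sketch. It assumes the tab-separated layout shown above and decodes only the simple and octal C escapes (hexadecimal escapes and wrapped lines are not handled); the function names are hypothetical.

```python
import re
from typing import Tuple

_SIMPLE_ESCAPES = {
    "a": "\a", "b": "\b", "f": "\f", "n": "\n",
    "r": "\r", "t": "\t", "v": "\v", "\\": "\\", '"': '"',
}


def unescape(text: str) -> str:
    """Decode simple and octal C escape sequences in a token value."""
    def replace(match: "re.Match[str]") -> str:
        body = match.group(1)
        if body[0] in "01234567":
            return chr(int(body, 8))
        return _SIMPLE_ESCAPES.get(body, body)
    return re.sub(r"\\([0-7]{1,3}|.)", replace, text, flags=re.S)


def parse_token_line(line: str) -> Tuple[int, str, str]:
    """Split '<number><tab><name><tab>"<value>"' into its three parts."""
    number, name, quoted = line.rstrip("\n").split("\t", 2)
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError(f"token value is not quoted: {line!r}")
    return int(number), name, unescape(quoted[1:-1])


if __name__ == "__main__":
    print(parse_token_line('10\tKEY\t"Knuth:TB86"'))
```

Lines that begin with %% or ?? are messages rather than tokens, so a caller would filter them out before calling parse_token_line.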

Here are the token numbers and token type names that can appear in the output when -no-prettyprint is specified:


0 UNKNOWN
1 ABBREV
2 AT
3 COMMA
4 COMMENT
5 ENTRY
6 EQUALS
7 FIELD
8 INCLUDE
9 INLINE
10 KEY
11 LBRACE
12 LITERAL
13 NEWLINE
14 PREAMBLE
15 RBRACE
16 SHARP
17 SPACE
18 STRING
19 VALUE

Programs that parse such output should also be prepared for lines beginning with the warning prefix, %%, or the error prefix, ??, and for ANSI/ISO Standard C line number directives of the form

# line 273 "texbook1.bib"
which record the line number and file name of the current input file.

If a -max-width nnn command-line option was specified, long output lines will be wrapped at a backslash-newline pair, and consequently, software that processes the lexical token stream should be prepared to collapse such wrapped lines back into single lines.

As an example of the use of -no-prettyprint, the UNIX command pipeline

bibclean -no-prettyprint mylib.bib | \
        awk '$2 == "KEY" {print $3}' | \
        sed -e 's/"//g' | \
        sort
will extract a sorted list of all citation keys in the file mylib.bib.

A certain amount of processing will have been done on the tokens. In particular, delimiters equivalent to braces will have been replaced by braces, and braced strings will have become quoted strings.

The LITERAL token type is used for arbitrary text that bibclean does not examine further, such as the contents of a @Preamble{...} or a @Comment{...}.

The UNKNOWN token type should never appear in the output stream. It is used internally to initialize token type variables.

SCRIBE BIBLIOGRAPHY FORMAT

bibclean's support for the Scribe bibliography format is based on the syntax description in the Scribe Introductory User's Manual, 3rd Edition, May 1980. Scribe was originally developed by Brian Reid at Carnegie-Mellon University, and is now marketed by Unilogic, Ltd.

The BibTeX bibliography format was strongly influenced by Scribe, and indeed, with care, it is possible to share bibliography files between the two systems. Nevertheless, there are some differences, so here is a summary of features of the Scribe bibliography file format:

(1)
Letter case is not significant in field names and entry names, but case is preserved in value strings.
(2)
In field/value pairs, the field and value may be separated by one of three characters: =, /, or space. Space may optionally surround these separators.
(3)
Value delimiters are any of these seven pairs: { } [ ] ( ) < > ' ' " " ` `
(4)
Value delimiters may not be nested, even though with the first four delimiter pairs, nested balanced delimiters would be unambiguous.
(5)
Delimiters can be omitted around values that contain only letters, digits, sharp (#), ampersand (&), period (.), and percent (%).
(6)
Outside of delimited values, a literal at-sign (@) is represented by doubled at-signs (@@).
(7)
Bibliography entries begin with @name, as for BibTeX, but any of the seven Scribe value delimiter pairs may be used to surround the values in field/value pairs. As in (4), nested delimiters are forbidden.
(8)
Arbitrary space may separate entry names from the following delimiters.
(9)
@Comment is a special command whose delimited value is discarded. As in (4), nested delimiters are forbidden.
(10)
The special form
@Begin{comment}
...
@End{comment}
permits encapsulating arbitrary text containing any characters or delimiters, other than ``@End{comment}''. Any of the seven delimiter pairs may be used around the word ``comment'' following the ``@Begin'' or ``@End''; the delimiters in the two cases need not be the same, and consequently, ``@Begin{comment}''/``@End{comment}'' pairs may not be nested.
(11)
The key field is required in each bibliography entry.
(12)
A backslashed quote in a string will be assumed to be a TeX accent, and braced appropriately. While such accents do not conform to Scribe syntax, Scribe-format bibliographies have been found that appear to be intended for TeX processing.

Because of this loose syntax, bibclean's normal error detection heuristics are less effective, and consequently, Scribe mode input is not the default; it must be explicitly requested.

ENVIRONMENT VARIABLES

BIBCLEANEXT
File extension of bibliography-specific initialization files. Default: .ini.
BIBCLEANINI
Name of bibclean initialization files. Default: .bibcleanrc.
BIBINPUTS
Search path for bibclean and BibTeX input files. This is a colon-separated list of directories that are searched in order from first to last. It is not an error for a specified directory to not exist.

FILES

*.bib
BibTeX and Scribe bibliography data base files.
*.ini
File-specific initialization files.
/usr/share/bibcleanrc, /etc/bibcleanrc
System-wide initialization files.
.bibcleanrc
User-specific initialization files.

SEE ALSO

bibcheck(1), bibdup(1), bibextract(1), bibindex(1), bibjoin(1), biblabel(1), biblex(1), biblook(1), biborder(1), bibparse(1), bibsort(1), bibtex(1), bibunlex(1), citefind(1), citesub(1), citetags(1), latex(1), scribe(1), tex(1).

AUTHOR

Nelson H. F. Beebe
Center for Scientific Computing
University of Utah
Department of Mathematics, 322 INSCC
155 S 1400 E RM 233
Salt Lake City, UT 84112-0090
USA
Tel: +1 801 581 5254
FAX: +1 801 585 1640, +1 801 581 4148
Email: beebe@math.utah.edu, beebe@acm.org, beebe@ieee.org (Internet)
URL: http://www.math.utah.edu/~beebe

This Debianization of bibclean was done by Henning Makholm <henning@makholm.net>, and differs from the upstream source in where it looks for the system-wide initialization file (vanilla bibclean expects to find it in $PATH), and has also been patched to ignore the built-in BibTeX field-length limit for abstract fields.

09 May 1998 Version 2.11.4