NAME
stx2any - converter from structured text to multiple formats
SYNOPSIS
stx2any [ -T format ] [ stx and m4 options ] [ file file ... ]
DESCRIPTION
stx2any converts files in structured text (Stx) format into other formats.
Formats currently implemented are HTML, man, raw text, PostScript, LaTeX,
XHTML and DocBook XML.
The source format, structured text, is a kind of plain text format with standard
markup for representing headings, lists, emphasis etc. The markup is both
quicker to write and easier to remember than conventional tag-based markup
languages, and remains easily legible in source form. Stx markup is
explained in more detail in the Stx quickie guide, which is available in the
examples directory.
Most of the conversion happens in m4, and you can define your own macros and
other stuff for giving structure to your documents. stx2any provides a
LaTeX-like extensible environment system and a diversion system for
rearranging input. (Tårta på tårta, as they say in Swedish: literally
“cake on cake”, i.e. more of the same on top.)
Because stx2any by default doesn't perform any kind of quoting on the input,
markup that isn't available can be written directly in the destination language
(at the cost of convertibility to other output formats). This way, if you are
only interested in one output format (e.g. LaTeX), you can use Stx as an
abbreviation format for the most common constructs.
Some formatting is available only by calling m4 macros, not as abbreviations. You
need macros relatively rarely: for example, floats (material that can
“float” around in the document) are created by macros.
OPTIONS
stx2any accepts all command line options of m4, passing them directly on. Of
these, the -D argument is important enough to mention here separately.
- -DNAME=VALUE
- Define macro NAME to have the expansion VALUE. This allows you to pass
information into the document from the command line.
- -T format
- Sets the output format. Default format is html. format should be
one of:
- html
- produces basic HTML (hypertext markup language) output.
- man
- produces man macro output. This output is usable as a man page directly
(although see WRITING MAN PAGES below), or can be fed to troff / groff for
formatting to e.g. PostScript.
- latex
- produces LaTeX document preparation language output. You can run latex on
the result to produce e.g. high-quality PDFs.
- text
- produces raw text output by postprocessing HTML output with w3m. The
resulting output is very basic, roughly the source with most Stx markup
stripped away; if you want more formatted output, consider piping man
output to nroff -man.
- ps
- produces simple PostScript output by postprocessing man output with groff.
If you want to do real publishing, consider the LaTeX format instead.
- xhtml
- produces XHTML output by postprocessing HTML output with W3C tidy. By the
way, check http://hixie.ch/advocacy/xhtml for discussion about HTML and
XHTML.
- docbook-xml
- produces rudimentary DocBook XML output. See BUGS below for more
discussion about this.
- --link-abbrevs
- Enable link abbreviation syntax. Note that because link
abbreviation processing occurs in two phases, it does not work fully when
the input comes from standard input (for example, if you use stx2any as a
middle part of a pipeline).
- --quote
- Request quoting of characters (other than underscores and dollar signs)
that are somehow magical in the requested output format. This will make it
quite difficult to put markup in the output format directly in your
document, but will greatly increase the likelihood that your document
will be correct (i.e. free of syntax errors) in the output
format.
- --quote-me-harder
- Request quoting of underscores and dollar signs. This might make some
LaTeX documents work but might break some documents where underscores are
used in macro names or dollar signs in macro definitions.
- --numbering { on | off }
- Request numbering of section headings. The default varies by output
format: section numbering is by default off for HTML, DocBook XML and man,
on for LaTeX.
- --table-of-contents { on | off }
- Request producing a table of contents from the headings. The default is to
produce a TOC when numbering is on. Not implemented for DocBook XML.
- --make-title { on | off }
- Request a “title page”. The default is “on”.
This setting does not have any effect in some formats. In HTML, it
produces a big heading at the beginning of the document. In LaTeX, it
produces the canonical maketitle.
- --no-template
- Do not produce a document template at all, only the formatted input text.
You probably need this if your document will be included as a part of a
bigger document. If that bigger document is written totally in Stx,
however, it will be cleaner to give all the source files directly as
arguments to stx2any rather than combine the results afterwards.
- --symmetric-crossrefs
- In document formats that support linking (HTML, DocBook), produce reverse
links from labels to referrers as well as links from referrers to
labels.
- --latex-params params
- Set the document class parameters for LaTeX documents. The default is
affected by system paper size; for example, on a European system it is
typically a4paper,notitlepage. (See “ENVIRONMENT”
below.)
- --html-params params
- Set the body tag parameters for HTML documents. The default is no
parameters.
- --picture-suffix suffix
- Inline images will refer to files with suffix suffix. The default
is png for HTML and DocBook, eps for LaTeX and man.
- --no-emdash-separate
- In the output, don't separate em dashes from adjacent text with spaces.
This is in accordance with traditional English typography (if I understand
correctly), but is not standard in many other languages, including
Finnish, my mother tongue.
- --more-secure
- Disable some insecure features of m4 and check some command line arguments
that are passed to shell for problematic characters. This might be
desirable if you've received the document from somewhere else and want to
make sure it won't do anything malicious when converted. Currently this
denies execution of shell escapes.
- Note that, clearly, no implementation of m4 has been designed with security
in mind. As a consequence, this option cannot prevent every potentially
harmful thing. The things I know it does not prevent are including the
contents of arbitrary files in the output and writing busy loops (so that
the conversion uses all the processor time it can get, until
terminated).
- --sed-preprocessor scriptname
- Run the sed script scriptname on all input. This allows you to add
custom abbreviation markups. It is almost the same as preprocessing input
with sed, then piping it into stx2any, but interacts better with
--link-abbrevs (see its explanation for details).
- --version, -V
- Just show version information and exit.
- --help, -?
- Just show a short help message and exit.
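The way these options combine on a command line can be sketched in Python. This is only an illustration: the helper name build_stx2any_argv and its parameters are hypothetical and not part of stx2any, and the function merely assembles an argument list from a few of the options documented above (-T, -D, --numbering, --more-secure) without running anything.

```python
from typing import Dict, List, Optional, Sequence

# The output formats listed under -T above.
KNOWN_FORMATS = ("html", "man", "latex", "text", "ps", "xhtml", "docbook-xml")


def build_stx2any_argv(
    files: Sequence[str],
    fmt: str = "html",
    defines: Optional[Dict[str, str]] = None,
    numbering: Optional[bool] = None,
    more_secure: bool = False,
) -> List[str]:
    """Assemble an stx2any command line (hypothetical helper, runs nothing)."""
    if fmt not in KNOWN_FORMATS:
        raise ValueError(f"unknown output format: {fmt!r}")
    argv = ["stx2any", "-T", fmt]
    # Each -DNAME=VALUE defines an m4 macro visible to the document.
    for name, value in (defines or {}).items():
        argv.append(f"-D{name}={value}")
    # Leave --numbering out entirely to keep the per-format default.
    if numbering is not None:
        argv += ["--numbering", "on" if numbering else "off"]
    if more_secure:
        argv.append("--more-secure")
    # Input files come last, as in the synopsis.
    argv.extend(files)
    return argv
```

The resulting list could be handed to subprocess.run(argv, check=True); with check=True, a non-zero exit status from stx2any would surface as a CalledProcessError.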
WRITING MAN PAGES
Basically, man pages are simply files in the man macro format. However, there
are some programs (first and foremost mandb) that require parts of man pages
to be in a specific format, and man pages should generally adhere to the
standard sectioning and form (see man (1) and lexgrog (1) for details).
When writing a man page, the title (w_title) of the page should be the
program/file/format/utility name, and you should define the section
(w_section). To make the page suitable for mandb parsing, you should start the
page with one or more calls to w_man_desc. This will create a proper
“NAME” section for you. (Although you could write one by
yourself.)
DIAGNOSTICS
stx2any may give any error message that m4 may give, e.g. on malformed input
(a macro call with a missing closing parenthesis etc). In addition, it has the
following error messages of its own:
- unknown output format: “X”
- You requested an unsupported output format X with the -T option.
- unknown macro “X” called
- stx2any encountered a macro beginning with w_, but knows no definition for
it. This is a warning, not an error — the offending macro and its
arguments are stripped from the output.
- environment “X” closed by “Y” in layer N
- Environments in stx2any must be properly nested. stx2any encountered
w_end(Y) when it was expecting w_end(X). Often this is a
sign of a forgotten w_end(X).
- If N (the layer) is something other than 0, then the problem is
probably in your environment definitions, not at the point that stx2any
was processing when it encountered the error.
- unknown environment “X”
- There was an attempt to begin an environment whose name is unknown to
stx2any, i.e. no such environment has been defined.
- diversion “X” closed by “Y”
- unknown diversion “X”
- Same as above, but for diversions (w_begdiv and w_enddiv).
- attempt to use “X” in secure environment
- You requested secure processing with --more-secure and the document
contained an “insecure” macro. This is a warning message,
not an error — the causing macro is left in the text verbatim.
- unknown cross link to “X”
- There was a cross link to document X, but stx2any does not know
about such a document. You probably didn't gather X's data with
gather_stx_titles, or you misspelled the document reference. This is a
warning, not an error: the reference is left in the output
verbatim, without any kind of link.
The return value of stx2any is zero on success, one if there was some problem.
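The nesting rule behind the “environment X closed by Y” diagnostic above can be illustrated with a small stack-based check. This is a sketch of the general technique, not stx2any's own m4 implementation: the function name check_nesting and the ("begin", name) / ("end", name) event format are assumptions made for the example, and the message for a close with nothing open is invented here.

```python
from typing import Iterable, List, Tuple


def check_nesting(events: Iterable[Tuple[str, str]]) -> List[str]:
    """Return one message per improperly closed environment.

    `events` is a hypothetical stand-in for the environment open and
    w_end calls found in a document, in document order.
    """
    stack: List[str] = []      # currently open environments, innermost last
    problems: List[str] = []
    for kind, name in events:
        if kind == "begin":
            stack.append(name)
        elif not stack:
            # Not an stx2any message; made up for this sketch.
            problems.append(f'environment "{name}" closed but none is open')
        else:
            expected = stack.pop()
            if expected != name:
                # Mirrors the documented wording: X was expected, Y arrived.
                problems.append(f'environment "{expected}" closed by "{name}"')
    return problems
```

A forgotten close of an inner environment shows up as a mismatch on the outer one's close, which matches the hint above that the message often signals a forgotten w_end(X).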
ENVIRONMENT
- PAPERCONF
- PAPERSIZE
- used for determining the default paper size for LaTeX documents.
FILES
- /etc/papersize
- used for determining the default paper size for LaTeX documents.
- /usr/share/stx2any/common
- directory for the definitions shared by all formats
- /usr/share/stx2any/{html,man,latex,docbook-xml}
- directory for output format specific definitions
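How PAPERSIZE, PAPERCONF and /etc/papersize might combine into a default paper size can be sketched as follows. The lookup order shown (PAPERSIZE first, then the file named by PAPERCONF, then /etc/papersize) is the usual libpaper convention and is an assumption here: this page does not state the order stx2any uses. The function name default_paper_size is hypothetical.

```python
import os
from typing import Mapping, Optional


def default_paper_size(
    env: Mapping[str, str] = os.environ,
    fallback_file: str = "/etc/papersize",
) -> Optional[str]:
    """Guess the system paper size name, or None if nothing is configured."""
    # 1. Assumed: PAPERSIZE names the paper size directly.
    size = env.get("PAPERSIZE", "").strip()
    if size:
        return size
    # 2. Assumed: PAPERCONF names a file to read instead of /etc/papersize.
    path = env.get("PAPERCONF") or fallback_file
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                # Ignore comments and blank lines; take the first real entry.
                text = line.split("#", 1)[0].strip()
                if text:
                    return text
    except OSError:
        pass
    return None
```

On a system configured for A4 this would return a4, which is consistent with the a4paper default described under --latex-params.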
SEE ALSO
m4 (1), latex (1), groff (1), lexgrog (1), w3m (1), strip_stx (1),
gather_stx_titles (1), html2stx (1), extract_usage_from_stx (1)
Stx quickie guide (/usr/share/doc/stx2any/Stx-doc.txt)
Stx markup reference (/usr/share/doc/stx2any/Stx-ref.txt)
BUGS
The structured text format is not yet fully standardised. There are some corner
cases where it is unclear what the result of the formatting should be. In
these cases, the output of stx2any is authoritative, so it cannot have
bugs :)
Some old GNU libc's seem to be abysmally slow on some instances of the emphasis
regexps. It would be possible to make the regexps faster and less correct, but
as newer GNU libc's and BSD libc seem to work OK in these cases, I guess it's
not worth it.
The --more-secure switch is not really very secure for reasons explained above.
The support for DocBook XML sucks. It is only included because someone will show
up anyway and ask, “hey, does it support DocBook XML?”
Partly this sucking is due to my laziness, but partly it is because of the
nature of DocBook. For instance, stx2any will transform literal formatting
into DocBook Literal elements, but the point of using DocBook is to
convey more information than that — whether it is some ComputerOutput,
UserInput, EnVar, or Application, or... and the result is still very abstract,
not actually meant for humans to read but rather for computers to process into
something readable. Now the truth is that I doubt you will ever come up with a
DSSSL stylesheet whose output outperforms LaTeX (for publishing on paper) or
direct conversion to HTML (for publishing on the web).
The only sensible reasons I can think of for using Stx as a DocBook frontend
are:
1. the ability to use both DocBook constructs and Stx abbreviations
2. you have to write DocBook for some interesting reason (your boss told
you so) but don't want to learn it
3. you happen to already have infrastructure for processing DocBook
documents, and you want to take advantage of it
AUTHOR
This page was written by Panu A. Kalliokoski.