UPMENDEX(1) General Commands Manual UPMENDEX(1)
NAME
upmendex - Multilingual index processor
SYNOPSIS
upmendex [-ilqrcgf] [-s sty] [-d dic] [-o ind] [-t log] [-p no] [--] [idx0 idx1 idx2 ...]
upmendex --help
DESCRIPTION
The program upmendex is a general-purpose multilingual
hierarchical index generator that works with upLaTeX, XeLaTeX and LuaLaTeX.
It accepts one or more input files (.idx; usually produced by a text
formatter such as one of the LaTeX family), sorts the entries, and produces
an output file which can be formatted. It supports Latin (including
non-English), Greek, Cyrillic, Korean Hangul and Han (Hanzi ideographs)
scripts, as well as Japanese Kana. It is almost compatible with
makeindex and mendex, and offers an additional feature for
handling the readings of kanji words.
The formats of the input and output files are specified in a style file. The
readings of kanji words can be specified in a dictionary file.
The index can have up to three levels (0, 1, and 2) of subitem nesting.
OPTIONS
- -i
- Take input from stdin, even when index files are specified.
- -l
- Set ´sort by character order´. By default, ´sort by word order´ is used. Details are described below.
- -q
- Quiet mode; send no message to stderr, except error messages and warnings.
- -r
- Disable implicit page range formation. By default, three or more successive pages are automatically abbreviated as a range (e.g. 1–5).
- -c
- Compress each sequence of intermediate blanks (spaces and/or tabs) into a single space, and ignore leading and trailing blanks. By default, blanks in the index key are retained.
- -g
- Use only the A-row (A, Ka, Sa, ...; 10 characters) of the gojuon table (Japanese syllabary) as Japanese index heads. By default, all 48 characters in the gojuon table are used.
- -f
- Force output of characters even if their scripts are not supported by upmendex.
- -s sty
- Employ sty as the style file.
- -d dic
- Employ dic as the dictionary file. The dictionary file is composed of lists of <index_word reading>.
- -o ind
- Employ ind as the output index file. By default, the file name is created by appending the extension ind to the base name of the first input file.
- -t log
- Employ log as the transcript file. By default, the file name is created by appending the extension ilg to the base name of the first input file.
- -p no
- Set the starting page number of the output index to no. The argument no may be a number or one of the following: any (the page following the end of the contents), odd (the next odd page after the end of the contents), even (the next even page after the end of the contents).
- --help
- Show summary of options.
- --
- Arguments after -- are not taken as options. This is useful when the input file name starts with '-'.
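The implicit page range formation that -r disables can be pictured with a minimal Python sketch. It is not the upmendex implementation; it only follows the rule stated above (three or more successive pages become a range) and borrows the default delim_n and delim_r strings from the style parameters listed in this manual. The function name format_pages and its parameters are hypothetical.

```python
def format_pages(pages, form_ranges=True, delim_n=", ", delim_r="--"):
    """Join page numbers, abbreviating runs of three or more as a range.

    form_ranges=False mimics the effect described for the -r option.
    """
    pages = sorted(set(pages))
    parts = []
    i = 0
    while i < len(pages):
        # Extend j to the end of the run of successive pages starting at i.
        j = i
        while j + 1 < len(pages) and pages[j + 1] == pages[j] + 1:
            j += 1
        run_length = j - i + 1
        if form_ranges and run_length >= 3:
            parts.append(f"{pages[i]}{delim_r}{pages[j]}")
        else:
            # Runs shorter than three pages are listed one by one.
            parts.extend(str(p) for p in pages[i:j + 1])
        i = j + 1
    return delim_n.join(parts)


print(format_pages([1, 2, 3, 4, 5, 8, 9, 12]))
print(format_pages([1, 2, 3], form_ranges=False))
```

In this sketch a run of only two pages (8, 9) is left unabbreviated, in line with the "three or more" wording above.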
STYLE FILE
The style file informs upmendex about the format of the idx input files and the intended format of the final output file. The format is upward compatible with the one for makeindex and mendex. The style file contains a list of <specifier attribute> pairs. There are two types of specifiers: input and output. Pairs do not have to appear in any particular order. A line beginning with ´%´ is a comment.
Input file style parameter
- keyword <string>
- "\\indexentry"
- arg_open <char>
- ´{´
- arg_close <char>
- ´}´
- level <char>
- ´!´
- actual <char>
- ´@´
- encap <char>
- ´|´
- page_precedence <string>
- "rnaRA"
- quote <char>
- ´"´
- escape <char>
- ´\\´
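As an illustration of how the level, actual and encap characters listed above divide an index key, here is a minimal Python sketch. It assumes the makeindex conventions (this page only lists the characters and their defaults): level separates nesting levels, actual separates the sort key from the text to be printed, and encap introduces a command applied to the page number. The function name split_index_key is hypothetical, and the quote and escape characters are deliberately not handled.

```python
def split_index_key(key, level="!", actual="@", encap="|"):
    """Split the key argument of an index entry into its parts.

    Returns (levels, encap_command), where levels is a list of
    (sort_key, display_text) tuples, one per nesting level.
    Assumption: quote and escape processing is ignored in this sketch.
    """
    # Everything after the first encap character names the page encapsulator.
    body, _, encap_command = key.partition(encap)
    levels = []
    for part in body.split(level):
        sort_key, found, display = part.partition(actual)
        # Without an actual character, the sort key is also the printed text.
        levels.append((sort_key, display if found else sort_key))
    return levels, (encap_command or None)


print(split_index_key("ひょう@表!layout|textbf"))
```

With the default characters, the key above yields two levels, the first sorted as ひょう but printed as 表, and textbf as the page encapsulator.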
Output file style parameter
- preamble <string>
- "\\begin{theindex}\n"
- postamble <string>
- "\n\n\\end{theindex}\n"
- setpage_prefix <string>
- "\n \\setcounter{page}{"
- setpage_suffix <string>
- "}\n"
- group_skip <string>
- "\n\n \\indexspace\n"
- hangul_head <string>
- "ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎ"
- tumunja <string>
- "ㄱㄴㄷㄹㅁㅂㅅㅇㅈㅊㅋㅌㅍㅎ"
- devanagari_head <string>
- "ऄअआइईउऊऋऌऍऎएऐऑऒओऔकखगघङचछजझञटठडढणतथदधनपफबभमयरलळवशषसह"
- thai_head <string>
- "กขฃคฅฆงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรฤลฦวศษสหฬอฮ"
- item_0 <string>
- "\n \\item "
- item_1 <string>
- "\n \\subitem "
- item_2 <string>
- "\n \\subsubitem "
- item_01 <string>
- "\n \\subitem "
- item_x1 <string>
- "\n \\subitem "
- item_12 <string>
- "\n \\subsubitem "
- item_x2 <string>
- "\n \\subsubitem "
- delim_0 <string>
- ", "
- delim_1 <string>
- ", "
- delim_2 <string>
- ", "
- delim_n <string>
- ", "
- delim_r <string>
- "--"
- symhead_positive <string>
- "Symbols"
- symhead_negative <string>
- "symbols"
- numhead_positive <string>
- "Numbers"
- numhead_negative <string>
- "numbers"
- character_order <string>
- "SNLGCJKHDTah"
ABOUT JAPANESE PROCESSING
Compared to makeindex, upmendex has an additional feature
that simplifies the handling of Japanese indexes: users are spared the
effort of manually specifying a reading for every kanji word.
Japanese kanji words are usually sorted by the syllables of their readings
(´Yomi´), which can be written in kana (Hiragana or
Katakana). upmendex accepts index words written in kana directly
in an input file, and can also convert index words written in kanji or
symbols to kana by referring to Japanese dictionaries.
Examples of internal simplification of syllables are shown below.
かぶしきがいしゃ  かふしきかいしや
マッキントッシュ  まつきんとつしゆ
ワープロ          わあふろ
The dictionary file consists of a list of
<´index_word´ ´reading´> pairs. The index word
can be written in any script (kanji, kana, etc.), and the reading in any
phonographic script such as Hiragana or Katakana. The index word and its
reading are separated by one or more tabs or spaces.
An example of a Japanese dictionary is shown below.
漢字  かんじ
読み  よみ
環境  かんきょう
Here, each index word is allowed to have only one Yomi. Although
some kanji words (e.g. ´表´) may have more than one Yomi
(e.g. ´ひょう´ and ´おもて´), only one of them can be
registered in the dictionary. When different Yomi are needed, they
should be specified explicitly in kana (e.g. \index{ひょう@表} or
\index{おもて@表}) in the input file.
In addition, a dictionary file is read automatically when its file name is
set in the environment variable INDEXDEFAULTDICTIONARY. The dictionary
set by the environment variable can be used together with file(s) specified
by the -d option.
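The dictionary format described in this section (an index word, one or more tabs or spaces, then its reading) is simple enough to sketch in a few lines of Python. This is only an illustration of the file layout, not the parser upmendex uses. The function name parse_dictionary is hypothetical, and letting a later entry replace an earlier one for the same word is an assumption made here to model the "only one Yomi per word" rule.

```python
import re


def parse_dictionary(text):
    """Map each index word to its reading.

    Each non-empty line holds an index word and a reading separated by
    one or more tabs or spaces. Lines without both fields are skipped.
    """
    readings = {}
    for line in text.splitlines():
        fields = re.split(r"[ \t]+", line.strip(), maxsplit=1)
        if len(fields) != 2:
            continue  # blank or malformed line
        word, reading = fields
        # Assumption: a later entry for the same word replaces the earlier one.
        readings[word] = reading
    return readings


sample = "漢字\tかんじ\n読み  よみ\n"
print(parse_dictionary(sample))
```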
ABOUT SORTING PROCEDURE
By default, upmendex sorts index entries as they are
(´sort by word order´). With the -l option, spaces between
words in an index entry are removed before sorting (´sort by character
order´).
Even with sort by character order, the entry is output in its original
form, with the spaces intact.
An example follows.
sort by word order    sort by character order
X Window              Xlib
Xlib                  XView
XView                 X Window
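The difference between the two orders for the entries X Window, Xlib and XView can be reproduced with a short Python sketch. It is only an approximation: case-folded code point comparison stands in for the real collation, and the function name sort_entries is hypothetical. What it does show is that the spaces are dropped only while building the sort key, so the entries themselves come out unchanged.

```python
def sort_entries(entries, character_order=False):
    """Sort index entries by word order, or by character order (-l).

    Assumption: casefold() replaces the real collation rules, which
    this sketch does not attempt to reproduce.
    """
    def sort_key(entry):
        # Character order ignores spaces between words when comparing.
        text = entry.replace(" ", "") if character_order else entry
        return text.casefold()

    return sorted(entries, key=sort_key)


entries = ["Xlib", "XView", "X Window"]
print(sort_entries(entries))
print(sort_entries(entries, character_order=True))
```

In word order the space after "X" sorts before any letter, so "X Window" comes first; once spaces are ignored, it is compared as "XWindow" and moves to the end.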
In addition, two sorting methods are available for indexes which
contain both Japanese kana and other scripts (e.g. Latin script). With
priority set to 0 (default) or 1 in a style file, a space is inserted
or not inserted, respectively, between Japanese kana and other scripts
before sorting.
An example follows.
priority 0     priority 1
index sort     indファイル
indファイル    index sort
ENVIRONMENT VARIABLES
upmendex refers to the following environment variables.
- INDEXSTYLE
- Directory where index style files exist.
- INDEXDEFAULTSTYLE
- Index style file to be referred to as default.
- INDEXDICTIONARY
- Directory where dictionary files exist.
- INDEXDEFAULTDICTIONARY
- Dictionary file which is automatically read.
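When upmendex is driven from a build script, these variables can be set for a single run instead of globally. The Python sketch below only assembles the argument list and environment; it uses the options and variables documented in this manual, but the function name build_upmendex_call and all file names are hypothetical, and nothing is executed by the block itself.

```python
import os


def build_upmendex_call(idx_files, style=None, dictionary=None, output=None,
                        style_dir=None, default_dictionary=None):
    """Return (argv, env) for running upmendex on the given .idx files."""
    argv = ["upmendex"]
    if style:
        argv += ["-s", style]
    if dictionary:
        argv += ["-d", dictionary]
    if output:
        argv += ["-o", output]
    # "--" keeps input file names that start with '-' from being read as options.
    argv.append("--")
    argv += list(idx_files)

    env = dict(os.environ)
    if style_dir:
        env["INDEXSTYLE"] = style_dir
    if default_dictionary:
        env["INDEXDEFAULTDICTIONARY"] = default_dictionary
    return argv, env


argv, env = build_upmendex_call(["book.idx"], style="book.ist",
                                style_dir="/path/to/styles")
print(argv)
```

The resulting pair can then be handed to subprocess.run(argv, env=env, check=True) on a system where upmendex is installed.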
DETAIL
The detailed specification is compatible with makeindex.
KNOWN ISSUES
When more than one page number format is used, the .idx files should be specified in the order of their page numbers. Otherwise, wrong page numbers may be output.
SEE ALSO
tex(1), latex(1), makeindex(1),
mendex(1).
International Components for Unicode (ICU): <http://icu.unicode.org/>,
<https://unicode-org.github.io/icu/>
AUTHOR
This manual page was written by Takuji Tanaka, based on the mendex manual page written by the Japanese TeX Development Community.