table of contents
smujajgau(1L) | smujajgau(1L) |
NAME¶
smujajgau - Dictionary builder for use with jbofihe and cmafihe
SYNOPSIS¶
smujajgau [-v]
smujajgau dictionary-name [ file1 file2 ... filen ]
DESCRIPTION¶
smujajgau is a program that reads a set of English definitions for Lojban words, and formats them into a presorted dictionary for use by the jbofihe and cmafihe programs. The dictionary is arranged for rapid access.
OPTIONS¶
- -v
- Show the program version and exit.
- dictionary-name
- This is the name of the formatted dictionary to be generated / modified. If the file exists, the definitions in the other file will be added, replacing existing entries where they clash.
- file1 .. filen
- The source files to be added. Lines beginning with # are treated as comments and discarded. Other lines should have one of the forms
lojban:english
lojban:english:comment
SEE ALSO¶
FILES¶
- /usr/local/lib/jbofihe/smujmaji.dat
- This file is the default location where jbofihe and cmafihe expect to find the dictionary. It should therefore be the default first argument to smujajgau (unless the software was installed to an alternative location.)
SPECIAL TRANSLATION FORMATS¶
Dictionary entries for brivla (gismu & lujvo) are expected to provide entries for each place of the word. The English translation should indicate the type of the word and the translation. The types are shown in the following table. In addition, the gloss for a translation X is shown depending on the context where it will arise in the translation. These defaults may be overridden.
Letter | Type | Noun | Verb | Qualifier | Tag |
A | Act | X-er(s) | X-ing | X-ing | X-er(s) |
D | Discrete | X(s) | being X | X | X |
S | Substance | X | being X | X | X |
P | Property | X thing(s) | being X | X | X thing(s) |
R | Rev. prop | thing(s) X | being X | X | things(s) X |
I | Idiomatic | thing(s) X-ing | X-ing | X-ing | thing(s) X-ing |
E | Event | X(s) | being X | X | X |
To specify the dictionary entries, the lojban should take the form 'brivlaN', where brivla is the word and N is the place number. One of the following may be suffixed to provide an override of the defaults in the table : n v a or t (for noun, verb, adjective, tag respectively.)
As an example, 'nanmu' might have entries
nanmu1:D;man
nanmu1n:D;man/men
whereas 'nandu' might have the definition
nandu1:P;difficulty
nandu2:I;have* difficulty
nandu3:S;conditions for difficulty
nandu3t:under conditions
a '*' is used in the places where the affixed -s, -er and -ing should be applied (instead of putting them at the end of the English translation, which is the default.)
The 'places.dat' file included in the distribution shows many examples.
Where a translation with an 'n' suffix exists, this is used in place of some other default forms in the table. For example, this allows special plural forms to be used in other places (e.g. tags.)
The dictionary also supports some 'pattern' translations. This allows defaults to be automatically generated for forms ending in '-gau', defined in terms of the prefix.
The 'Lojban' form for such patterns should be defined in the dictionary as '*Mprefix+N' or '*M+suffixN' for prefix forms (e.g. nu+) and suffix forms (e.g. +zmadu) respectively. M is the 'precedence' (5 highest, 0 lowest), defining the order in which prefix v suffix matches will be attempted. N is the place number as usual. The letters n, v, a or t may be suffixed to define a particular form if required, as for normal definitions.
The 'English' form is either a standard definition or a place redirection. In standard definitions, the symbol % is used to mean the translation of the rest of the lujvo form. Place redirections take the form @N, and mean that the lojban pattern form should be translated as place N of the rest of the lujvo.
An example makes this clearer : zmadu.
*2+zmadu1:R;more %1q
*2+zmadu2:R;less %1q
*2+zmadu2t:than
*2+zmadu3:@2
*2+zmadu4:@3
*2+zmadu5:@4
*2+zmadu6:@5
The components are defined in terms of the full gismu forms, rather than rafsi (hence zmadu rather than mau). This is necessary because the form of a rafsi can change when components are added to or subtracted from a lujvo form.
When the 'English' form is given as '-', it indicates that the next components inwards should be concatenated to form a new 'Lojban' form for lookup. This facility is only used for one thing so far - to handle the rafsi 'zil' followed by the rafsi for a cmavo of selma'o PA to puncture a place from the following form. An example (to delete the 1st place of a word, e.g. zilpavykla) :
*4zi'o+1:-
*4zi'o+2:-
*4zi'o+3:-
*4zi'o+4:-
*4zi'o+5:-
*4zi'o+pa+1:@2
*4zi'o+pa+2:@3
*4zi'o+pa+3:@4
*4zi'o+pa+4:@5
*4zi'o+pa+5:@6
The pattern forms are all defined in a file 'patterns' in the distribution.
For cmavo, there are special forms used for certain selma'o, particular tenses. These allow dependence on the context in which the cmavo is used. Logical connectives are also defined in a special way. The file 'extradict' in the distribution provides examples.
(More documentation is required!)
BUGS¶
ju'oru'e so'imei (Surely there are many).
REFERENCES¶
AUTHOR¶
Richard Curnow <rpc@myself.com>
April 2000 |