SEQUITUR-G2P(1) | User Commands | SEQUITUR-G2P(1) |
NAME
sequitur-g2p - grapheme-to-phoneme conversion tool
SYNOPSIS
sequitur-g2p [OPTION]... FILE...
DESCRIPTION
Grapheme-to-Phoneme Conversion
Samples can be either in plain format (one word per line followed by phonetic transcription) or Bliss XML Lexicon format.
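As a rough illustration of the plain format, the following Python sketch reads one entry per line. It assumes that the word and the phoneme symbols are separated by whitespace, which this page does not state explicitly. The helper parse_plain_lexicon and the sample transcriptions are hypothetical and not part of sequitur-g2p.

```python
def parse_plain_lexicon(lines):
    """Parse plain-format samples into (word, phonemes) pairs.

    Assumption: fields are whitespace-separated; the first field is the
    word and all remaining fields are phoneme symbols.
    """
    entries = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word, phonemes = fields[0], fields[1:]
        entries.append((word, phonemes))
    return entries


# Made-up sample data, for illustration only.
sample = ["hello h @ l oU", "", "world w @r l d"]
entries = parse_plain_lexicon(sample)
```

Each entry pairs a word with its list of phoneme symbols, which is the shape of data the --train, --devel and --test options expect to find in a plain-format FILE.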
OPTIONS
- --version
- show program's version number and exit
- -h, --help
- show this help message and exit
- -p FILE, --profile=FILE
- Profile execution time and store result in FILE
- -R, --resource-usage
- Report resource usage and execution time
- -Y, --psyco
- Use Psyco to speed up execution
- --tempdir=PATH
- store temporary files in PATH
- -t FILE, --train=FILE
- read training sample from FILE
- -d FILE / N%, --devel=FILE / N%
- read held-out training sample from FILE or use N% of the training data
- -x FILE, --test=FILE
- read test sample from FILE
- --checkpoint
- save the state of training at regular time intervals. The name of the checkpoint file is derived from --write-model.
- --resume-from-checkpoint=FILE
- load checkpoint FILE and continue training
- -T, --transpose
- Transpose model, i.e. do phoneme-to-grapheme conversion
- -m FILE, --model=FILE
- read model from FILE
- -n FILE, --write-model=FILE
- write model to FILE
- --continuous-test
- report error rates on development and test set in each iteration
- -S, --self-test
- apply model to development set and report error rates
- -s l1,l2,r1,r2, --size-constraints=l1,l2,r1,r2
- multigrams must have l1 ... l2 left-symbols and r1 ... r2 right-symbols
- -E, --no-emergence
- do not allow new joint-multigrams to be added to the model
- --viterbi
- estimate model using maximum approximation rather than true EM
- -r, --ramp-up
- ramp up the model, i.e. increase the model order by one before training
- -W, --wipe-out
- wipe out probabilities, retain only model structure
- -C, --initialize-with-counts
- initialize probability estimation by counting how many times each graphone occurs in the training set, disregarding possible overlaps
- -i MINITERATIONS, --min-iterations=MINITERATIONS
- minimum number of EM iterations during training
- -I MAXITERATIONS, --max-iterations=MAXITERATIONS
- maximum number of EM iterations during training
- --eager-discount-adjustment
- re-adjust discounts in each iteration
- --fixed-discount=D
- set discount to D and keep it fixed
- -e ENC, --encoding=ENC
- use character set encoding ENC
- -P, --phoneme-to-phoneme
- train/apply a phoneme-to-phoneme converter
- --test-segmental
- evaluate only at segmental level, i.e. do not count syllable boundaries and stress marks
- -B FILE, --result=FILE
- store test result in table FILE (for use with bootlog or R)
- -a FILE, --apply=FILE
- apply grapheme-to-phoneme conversion to words read from FILE
- -V Q, --variants-mass=Q
- generate pronunciation variants until \sum_i p(var_i) >= Q (only effective with --apply)
- --variants-number=N
- generate up to N pronunciation variants (only effective with --apply)
- -f FILE, --fake=FILE
- use a translation memory (read from sample FILE) instead of a genuine model (use in combination with -x to evaluate two files against each other)
- --stack-limit=N
- limit size of search stack to N elements
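The stopping rules described for --variants-mass and --variants-number can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: select_variants is a hypothetical helper, the posteriors are made up, and the behavior when both limits are given (whichever is reached first ends the selection) is an assumption.

```python
def select_variants(nbest, mass=None, number=None):
    """Pick pronunciation variants from an n-best list.

    nbest:  iterable of (pronunciation, posterior) pairs, best first.
    mass:   stop once the accumulated posterior reaches this value (Q).
    number: stop once this many variants have been selected (N).

    Assumption: if both limits are set, the first one reached wins.
    If neither is set, this sketch simply returns every variant.
    """
    selected = []
    total = 0.0
    for pron, posterior in nbest:
        if mass is not None and total >= mass:
            break  # enough probability mass collected
        if number is not None and len(selected) >= number:
            break  # enough variants collected
        selected.append((pron, posterior))
        total += posterior
    return selected


# Made-up n-best list, for illustration only.
nbest = [("h @ l oU", 0.6), ("h E l oU", 0.3), ("h V l oU", 0.1)]
by_mass = select_variants(nbest, mass=0.8)      # keeps the first two variants
by_number = select_variants(nbest, number=1)    # keeps only the best variant
both = select_variants(nbest, mass=0.95, number=2)
```

With a mass limit, the variant that pushes the accumulated posterior past Q is still included, which matches the "until \sum_i p(var_i) >= Q" wording above.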
May 2016 | sequitur-g2p 0+r1668 |