OCR4GAMERA(1)
NAME
ocr4gamera - OCR system using the Gamera framework
USAGE
ocr4gamera -x <traindata> [options] <imagefile>
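
The usage line above can be driven from a script. The following is a minimal sketch, not part of ocr4gamera itself: the helper names `minimal_command` and `recognize` are hypothetical, and it assumes that `ocr4gamera` is on the PATH and that its output can be decoded with the locale's default encoding. It relies on the documented behavior that recognized text goes to stdout when no output file is given.

```python
import subprocess
from typing import List


def minimal_command(traindata: str, imagefile: str) -> List[str]:
    """Build the argv list for the usage line with no optional flags.

    Hypothetical helper: mirrors `ocr4gamera -x <traindata> <imagefile>`.
    """
    return ["ocr4gamera", "-x", traindata, imagefile]


def recognize(traindata: str, imagefile: str) -> str:
    """Run ocr4gamera and return whatever it writes to stdout.

    Assumes ocr4gamera is installed; raises CalledProcessError if the
    program exits with a non-zero status.
    """
    result = subprocess.run(
        minimal_command(traindata, imagefile),
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode using the locale's encoding (assumption)
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.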
OPTIONS
- -v <int>, --verbosity=<int>
- Set verbosity level to <int>. Possible values are 0 (default): silent operation; 1: information on progress; >2: segmentation info is written to PNG files with prefix debug_.
- -h, --help
- Display help and exit.
- --version
- Print version and exit.
- -d, --deskew
- Do a skew correction (recommended).
- -mf <ws>, --median_filter=<ws>
- Smooth the input image with a median filter with window size <ws>. Default is <ws> = 0, which means no smoothing.
- -ds <s>, --despeckle=<s>
- Remove all speckles of size <= <s>. Default is <s> = 0, which means no despeckling.
- -f, --filter
- Filter out very large components (images) and very small components (noise).
- -a, --automatic-group
- Automatically group glyphs using the classifier.
- -x <file>, --xmlfile=<file>
- Read training data from <file>.
- -o <xml>, --output=<xml>
- Write recognized text to file <xml> (otherwise it is written to stdout).
- -od <dir>, --output_directory=<dir>
- For each input image <img>, write the recognized text to <dir>/<img>.txt. Note that this option cannot be used in combination with -o (--output).
- -c <csv>, --extra_chars_csvfile=<csv>
- Read additional class name conversions from file <csv>. <csv> must contain one conversion per line.
- -R <rules>, --heuristic_rules=<rules>
- Apply heuristic rules <rules> to disambiguate some characters. <rules> can be roman (default) or none (for no rules).
- -D, --dictionary-correction
- Correct words using a dictionary (requires aspell or ispell).
- -L <lang>, --dictionary-language=<lang>
- Use <lang> as language for aspell (when option -D is set).
- -e <int>, --edit-distance=<int>
- Correct words only when the edit distance is not more than <int>.
- -ho, --hocr_out
- Write the output as an hOCR file (only works with the -o option).
- -hi <hocrfile>, --hocr_in=<hocrfile>
- Use an hOCR input file for text line segmentation.
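
Several of the options above constrain each other: -od cannot be combined with -o, -ho only works with -o, and -L applies when -D is set. The following sketch shows one way a wrapper script might check these combinations before calling the program. The function `build_command` and its parameter names are hypothetical, and raising an error on a conflict is a choice made for this sketch, not documented behavior of ocr4gamera. Only flags listed above are used.

```python
from typing import List, Optional


def build_command(
    traindata: str,
    imagefile: str,
    deskew: bool = False,
    output: Optional[str] = None,
    output_directory: Optional[str] = None,
    hocr_out: bool = False,
    dictionary_language: Optional[str] = None,
) -> List[str]:
    """Return an argv list for ocr4gamera, rejecting documented conflicts."""
    if output is not None and output_directory is not None:
        raise ValueError("-od cannot be combined with -o")
    if hocr_out and output is None:
        raise ValueError("-ho only works together with -o")

    argv = ["ocr4gamera", "-x", traindata]
    if deskew:
        argv.append("-d")
    if dictionary_language is not None:
        # -L is used when -D is set, so this sketch emits both together.
        argv += ["-D", "-L", dictionary_language]
    if output is not None:
        argv += ["-o", output]
    if output_directory is not None:
        argv += ["-od", output_directory]
    if hocr_out:
        argv.append("-ho")
    argv.append(imagefile)  # the image file comes last, as in the usage line
    return argv


if __name__ == "__main__":
    # File names here are placeholders for illustration only.
    print(" ".join(build_command("train.xml", "page.png", deskew=True,
                                 output="page.html", hocr_out=True)))
    # prints: ocr4gamera -x train.xml -d -o page.html -ho page.png
```

The returned list can be passed directly to `subprocess.run`.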