TEXT2IMAGE(1)
NAME
text2image - generate OCR training pages.
SYNOPSIS
text2image --text FILE --outputbase PATH --fonts_dir PATH [OPTION]
DESCRIPTION
text2image(1) generates OCR training pages. Given a text file it outputs an image with a given font and degradation.
OPTIONS
--text FILE
--outputbase FILE
--fontconfig_tmpdir PATH
--fonts_dir PATH
--font FONTNAME
--writing_mode MODE
--tlog_level INT
--max_pages INT
--degrade_image BOOL
--rotate_image BOOL
--strip_unrenderable_words BOOL
--ligatures BOOL
--exposure INT
--resolution INT
--xsize INT
--ysize INT
--margin INT
--ptsize INT
--leading INT
--box_padding INT
--char_spacing DOUBLE
--underline_start_prob DOUBLE
--underline_continuation_prob DOUBLE
--render_ngrams BOOL
--output_word_boxes BOOL
--unicharset_file FILE
--bidirectional_rotation BOOL
--only_extract_font_properties BOOL
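As a minimal sketch of the synopsis, the following renders a training text with a single font. The input path, output base, font name, and numeric values are illustrative assumptions, not defaults documented on this page. The command is only executed when text2image is installed and the text file exists; otherwise it is printed.

```shell
# Hypothetical inputs; adjust to your own training text and fonts.
text_file=./eng.training_text
output_base=./eng.example
font_name="DejaVu Sans"

# Build the argument list from flags listed under OPTIONS.
set -- text2image \
  --text "$text_file" \
  --outputbase "$output_base" \
  --fonts_dir /usr/share/fonts \
  --font "$font_name" \
  --resolution 300 \
  --ptsize 12 \
  --max_pages 1

# Run only when the tool and the input are available; otherwise show the command.
if command -v text2image >/dev/null 2>&1 && [ -f "$text_file" ]; then
  "$@"
else
  printf '%s\n' "$*"
fi
```

With these flags the tool is expected to write an image and a matching box file next to the output base (for example eng.example.tif and eng.example.box).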
USE THESE FLAGS TO OUTPUT ZERO-PADDED, SQUARE INDIVIDUAL CHARACTER IMAGES
--output_individual_glyph_images BOOL
--glyph_resized_size INT
--glyph_num_border_pixels_to_pad INT
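A hedged sketch of how the glyph flags might be combined with a normal rendering run. Passing a BOOL flag without a value follows the form used in the example usage on this page; the sizes (32 and 2), the paths, and the font name are assumptions chosen for illustration.

```shell
# Hypothetical inputs; the glyph size and padding values are illustrative only.
text_file=./eng.training_text
output_base=./eng.glyphs

# Regular rendering flags plus the three glyph-image flags.
set -- text2image \
  --text "$text_file" \
  --outputbase "$output_base" \
  --fonts_dir /usr/share/fonts \
  --font "DejaVu Sans" \
  --output_individual_glyph_images \
  --glyph_resized_size 32 \
  --glyph_num_border_pixels_to_pad 2

# Run only when the tool and the input are available; otherwise show the command.
if command -v text2image >/dev/null 2>&1 && [ -f "$text_file" ]; then
  "$@"
else
  printf '%s\n' "$*"
fi
```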
USE THESE FLAGS TO FIND FONTS THAT CAN RENDER A GIVEN TEXT
--find_fonts BOOL
--render_per_font BOOL
--min_coverage DOUBLE
Example Usage:
```
# Requires bash: |& pipes both stdout and stderr into grep.
text2image --find_fonts \
  --fonts_dir /usr/share/fonts \
  --text ../langdata/hin/hin.training_text \
  --min_coverage .9 \
  --render_per_font \
  --outputbase ../langdata/hin/hin \
  |& grep raw | sed -e 's/ :.*/" \\/g' | sed -e 's/^/ "/' >../langdata/hin/fontslist.txt
```
SINGLE OPTIONS
--list_available_fonts BOOL
HISTORY
text2image(1) was first made available for tesseract 3.03.
RESOURCES
Main web site: https://github.com/tesseract-ocr
Information on training tesseract LSTM: https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00
SEE ALSO
tesseract(1)
COPYING
Copyright (C) 2012 Google, Inc. Licensed under the Apache License, Version 2.0
AUTHOR
The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995) and Google (2006-present).
05/18/2018