MERGE_UNICHARSETS(1)

NAME

merge_unicharsets - Simple tool to merge two or more unicharsets.

SYNOPSIS

merge_unicharsets unicharset-in-1 ... unicharset-in-n unicharset-out

DESCRIPTION

merge_unicharsets(1) is a simple tool to merge two or more unicharsets. It can be used to create a combined unicharset for a script-level engine, such as the new Latin or Devanagari engines.

IN/OUT ARGUMENTS

unicharset-in-1

(Input) The name of the first unicharset file to be merged.

unicharset-in-n

(Input) The name of the nth unicharset file to be merged.

unicharset-out

(Output) The name of the merged unicharset file.
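EXAMPLE

The argument order described above can be illustrated with a minimal shell sketch. It only follows the synopsis: input files first, output file last. The file names (eng.unicharset, fra.unicharset, deu.unicharset, Latin.unicharset) are hypothetical placeholders, and the guard that skips the call when the tool or an input file is missing is an addition for this illustration, not part of the tool itself.

```shell
# Sketch only: the .unicharset file names below are hypothetical placeholders.
out="Latin.unicharset"

# Input files come first, in order; the output file is always the last argument.
set -- eng.unicharset fra.unicharset deu.unicharset

# Run the merge only when the tool is installed and every input file exists.
runnable=yes
command -v merge_unicharsets >/dev/null 2>&1 || runnable=no
for f in "$@"; do
    [ -f "$f" ] || runnable=no
done

if [ "$runnable" = yes ]; then
    merge_unicharsets "$@" "$out"
else
    echo "skipping: merge_unicharsets or an input file is missing" >&2
fi
```

Keeping the inputs in the positional parameters makes it easy to add or remove unicharsets without touching the command line itself.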

HISTORY

merge_unicharsets(1) was first made available for tesseract 4.00.00alpha.

RESOURCES

Main web site: https://github.com/tesseract-ocr

Information on training tesseract LSTM: https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html

COPYING

AUTHOR

The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995) and Google (2006-2018).

03/29/2026

Source file:	merge_unicharsets.1.en.gz (from tesseract-ocr 5.5.0-1.1)
Source last updated:	2026-03-29T15:42:00Z