hunspell(3) Library Functions Manual hunspell(3)

NAME

hunspell - spell checking, stemming, morphological generation and analysis

SYNOPSIS

#include <hunspell.hxx> /* or */
#include <hunspell.h>

Hunspell(const char *affpath, const char *dpath);

Hunspell(const char *affpath, const char *dpath, const char * key);

~Hunspell();

int add_dic(const char *dpath);

int add_dic(const char *dpath, const char *key);

int spell(const char *word);

int spell(const char *word, int *info, char **root);

int suggest(char***slst, const char *word);

int analyze(char***slst, const char *word);

int stem(char***slst, const char *word);

int stem(char***slst, char **morph, int n);

int generate(char***slst, const char *word, const char *word2);

int generate(char***slst, const char *word, char **desc, int n);

void free_list(char ***slst, int n);

int add(const char *word);

int add_with_affix(const char *word, const char *example);

int remove(const char *word);

char * get_dic_encoding();

const char * get_wordchars();

const std::vector<w_char>& get_wordchars_utf16();

struct cs_info * get_csconv();

const char * get_version();

DESCRIPTION

The Hunspell library routines give the user word-level linguistic functions: spell checking and correction, stemming, morphological generation and analysis in item-and-arrangement style.

The optional C header contains the C interface of the C++ library. Hunspell_create and Hunspell_destroy act as constructor and destructor, and the wrapper functions take an extra HunHandle parameter (the allocated object); see the C header file hunspell.h.

The basic spelling functions, spell() and suggest(), can also be used for stemming, morphological generation and analysis via XML input (see XML API).

Constructor and destructor

Hunspell's constructor takes the paths to the affix and dictionary files. (On Windows, use UTF-8 encoded paths starting with the long path prefix \\?\ to handle system-independent character encoding and very long path names.) See the hunspell(5) manual page for the dictionary format. The optional key parameter is for dictionaries encrypted by the hzip tool of the Hunspell distribution.
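The Windows note above can be sketched as a small string helper. The function name with_long_path_prefix is hypothetical and not part of Hunspell; the sketch assumes the caller already has an absolute, backslash-separated UTF-8 path, which is what the long path prefix normally expects.

```cpp
#include <string>

// Hypothetical helper (not part of Hunspell): prepend the Windows long path
// prefix to a UTF-8 path unless it is already present.
// Assumption: utf8_path is absolute and uses backslashes as separators.
std::string with_long_path_prefix(const std::string& utf8_path) {
    // Four characters: backslash, backslash, question mark, backslash.
    const std::string prefix = "\\\\?\\";
    if (utf8_path.compare(0, prefix.size(), prefix) == 0) {
        return utf8_path;  // already prefixed, leave unchanged
    }
    return prefix + utf8_path;
}
```

The resulting strings would then be passed (via c_str()) as the affpath and dpath arguments of the constructor.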

Extra dictionaries

The add_dic() function loads an extra dictionary file. The extra dictionaries use the affix file of the allocated Hunspell object. The maximum number of extra dictionaries is limited in the source code (20).

Spelling and correction

The spell() function returns non-zero if the input word is recognised by the spell checker, and zero if not. Optional reference variables return a bit array (info) and the root word of the input word. Info bits checked with the SPELL_COMPOUND, SPELL_FORBIDDEN or SPELL_WARN macros mark compound words, explicitly forbidden words, and probably bad words. Since version 1.3, the non-zero return value is 2 for dictionary words with the "WARN" flag (probably bad words).
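The return value convention described above can be sketched as a small classifier. The enum and function names are hypothetical and not part of Hunspell; only the meaning of 0, 2 and other non-zero values is taken from the text.

```cpp
// Hypothetical names, not part of the Hunspell API.
enum class SpellResult { Unknown, Accepted, AcceptedWithWarning };

// rc is the integer returned by spell(word).
inline SpellResult classify_spell_result(int rc) {
    if (rc == 0) return SpellResult::Unknown;              // word not recognised
    if (rc == 2) return SpellResult::AcceptedWithWarning;  // "WARN" flag, version 1.3 and later
    return SpellResult::Accepted;                          // any other non-zero value
}
```

A caller could, for example, underline AcceptedWithWarning words differently from Unknown ones.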

The suggest() function has two input parameters: a reference to the output suggestion list, and an input word. The function returns the number of suggestions. The reference variable will hold the address of the newly allocated suggestion list, or NULL if the return value of suggest() is zero. The maximum number of suggestions is limited in the source code.

The spell() and suggest() functions also recognize XML input; see the XML API section.

Morphological functions

The plain stem() and analyze() functions are similar to suggest(), but instead of suggestions they return stems and results of the morphological analysis. The plain generate() takes a second word as well. This extra word and its affixation serve as the model for morphologically generating the requested forms of the first word.

These functions only return useful results for dictionaries whose entries carry morphological description fields (po:, st:, is:, ts:, al:, ds:, dp: etc., as documented in hunspell(5)). Most widely distributed dictionaries (including the en_US, fr, nl and hu_HU dictionaries shipped by Linux distributions and LibreOffice) do not include this metadata, and analyze(), stem() and generate() will return an empty list for them. This is a property of the dictionary, not a bug in the library. See tests/morph.aff and tests/morph.dic in the Hunspell source tree for a minimal dictionary that exercises these functions.

The extended stem() and generate() use the results of a morphological analysis:

char **result, **result2; // both variables must be char **
int n1 = analyze(&result, "words");
int n2 = stem(&result2, result, n1);

The morphological annotation of the Hunspell library uses fixed field identifiers (two letters and a colon); see the hunspell(5) manual page.

char ** result;
char affix[] = "is:plural"; // the description depends on the dictionary, too
char * desc = affix;        // generate() expects an array of writable strings
int n = generate(&result, "word", &desc, 1);
for (int i = 0; i < n; i++) printf("%s\n", result[i]);
free_list(&result, n);
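Since each field identifier is two letters followed by a colon, an analysis line can be split into tag and value pairs with plain string handling. This is a minimal sketch: the function name is hypothetical, and it assumes that the fields in a result string are separated by whitespace and that a line looks roughly like "st:dog po:noun is:pl" (an invented sample, not output from a particular dictionary).

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Hunspell): split one analysis line into
// (identifier, value) pairs. Assumption: fields are whitespace-separated.
std::vector<std::pair<std::string, std::string>>
parse_morph_fields(const std::string& line) {
    std::vector<std::pair<std::string, std::string>> fields;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {
        // A field is two characters, a colon, then the value.
        if (token.size() >= 3 && token[2] == ':') {
            fields.emplace_back(token.substr(0, 2), token.substr(3));
        }
        // Tokens that do not match this shape are skipped.
    }
    return fields;
}
```

Each string returned by analyze() could be passed through such a helper before being displayed or filtered.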

Memory deallocation

The free_list() function frees the memory allocated by the suggest(), analyze(), generate() and stem() functions.
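One way to keep allocation and deallocation together is to copy the list into a std::vector and call free_list() right away. The sketch below is written as a template so that it compiles without the library: Checker is any type with the suggest() and free_list() members shown in the SYNOPSIS. The names collect_suggestions and StubChecker are hypothetical, and StubChecker is only a stand-in that echoes its input so the block can run on its own.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Copy the list produced by suggest() into a vector, then release the
// original list with free_list().
template <class Checker>
std::vector<std::string> collect_suggestions(Checker& checker, const char* word) {
    char** list = nullptr;
    int n = checker.suggest(&list, word);
    std::vector<std::string> out;
    for (int i = 0; i < n; ++i) out.emplace_back(list[i]);
    if (list != nullptr) checker.free_list(&list, n);
    return out;
}

// Hypothetical stand-in, NOT the Hunspell class: returns the input word as
// its single "suggestion" and counts lists that have not been freed yet.
struct StubChecker {
    int live_lists = 0;
    int suggest(char*** slst, const char* word) {
        *slst = static_cast<char**>(std::malloc(sizeof(char*)));
        (*slst)[0] = static_cast<char*>(std::malloc(std::strlen(word) + 1));
        std::strcpy((*slst)[0], word);
        ++live_lists;
        return 1;
    }
    void free_list(char*** slst, int n) {
        for (int i = 0; i < n; ++i) std::free((*slst)[i]);
        std::free(*slst);
        *slst = nullptr;
        --live_lists;
    }
};
```

With the library installed, a Hunspell object would take the place of StubChecker. The same pattern should carry over to analyze() and the plain stem(), which share the calling convention of suggest().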

Other functions

The add(), add_with_affix() and remove() functions are helpers for implementing a personal dictionary: they add words to and remove words from the base dictionary at run time. The add_with_affix() function uses a second root word as the model for the affixation and compounding of the new word.

The get_dic_encoding() function returns "ISO8859-1" or the character encoding defined in the affix file with the "SET" keyword.

The get_csconv() function returns the 8-bit character case table of the encoding of the dictionary.

The get_wordchars() and get_wordchars_utf16() functions return the extra word characters defined for tokenization in the affix file by the "WORDCHARS" keyword.

The get_version() function returns the version string of the library.

XML API

The spell() function returns non-zero for the input "<?xml?>", which indicates XML API support.

The suggest() function stems, analyzes and generates the forms of the input word when called with one of the following "SPELLML" syntaxes:

<?xml?>
<query type="analyze">
<word>dogs</word>
</query>

<?xml?>
<query type="stem">
<word>dogs</word>
</query>

<?xml?>
<query type="generate">
<word>dog</word>
<word>cats</word>
</query>

<?xml?>
<query type="generate">
<word>dog</word>
<code><a>is:pl</a><a>is:poss</a></code>
</query>

<?xml?>
<query type="add">
<word>word</word>
</query>

<?xml?>
<query type="add">
<word>word</word>
<word>model_word_for_affixation_and_compounding</word>
</query>

The outputs of the type="stem" query and the stem() library function are the same. The output of the type="analyze" query is a string containing a <code><a>result1</a><a>result2</a>...</code> element. This element can be used in the second syntax of the type="generate" query.
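The query strings above can be assembled, and the <code> element of an analyze result picked apart, with ordinary string operations. This is a minimal sketch with hypothetical function names; it reproduces the line layout of the examples above, performs no XML escaping, and assumes the result contains plain <a>...</a> items without attributes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: build a SPELLML query of the given type ("analyze",
// "stem", "generate" or "add") from one or more <word> elements.
// Assumption: the words contain no characters that need XML escaping.
std::string spellml_query(const std::string& type,
                          const std::vector<std::string>& words) {
    std::string q = "<?xml?>\n<query type=\"" + type + "\">\n";
    for (const std::string& w : words) q += "<word>" + w + "</word>\n";
    q += "</query>";
    return q;
}

// Hypothetical helper: extract the text of every <a>...</a> item, for
// example from the <code> element returned by a type="analyze" query.
std::vector<std::string> spellml_code_items(const std::string& xml) {
    std::vector<std::string> items;
    const std::string open = "<a>";
    const std::string close = "</a>";
    std::size_t pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        pos += open.size();
        std::size_t end = xml.find(close, pos);
        if (end == std::string::npos) break;  // unterminated item, stop
        items.push_back(xml.substr(pos, end - pos));
        pos = end + close.size();
    }
    return items;
}
```

The string returned by spellml_query() would be passed (via c_str()) as the word argument of suggest().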

EXAMPLE

See analyze.cxx in the Hunspell distribution.

AUTHORS

Hunspell is based on Ispell's spell checking algorithms and OpenOffice.org's MySpell source code.

Author of International Ispell is Geoff Kuenning.

Author of MySpell is Kevin Hendricks.

Author of Hunspell is László Németh.

Author of the original C API is Caolan McNamara.

Author of the Aspell table-driven phonetic transcription algorithm and code is Björn Jacke.

See also the THANKS and Changelog files of the Hunspell distribution.

2026-04-30