NAME

unicode::wordbreak_callback_base, unicode::wordbreak - Unicode word-breaking rules

SYNOPSIS

#include <courier-unicode.h>
class wordbreak : public unicode::wordbreak_callback_base {
public:

    using unicode::wordbreak_callback_base::operator<<;

    using unicode::wordbreak_callback_base::operator();

    int callback(bool flag)
    {
        // ...
    }
};
char32_t c;
std::u32string buf;
wordbreak compute_wordbreak;
compute_wordbreak << c;
compute_wordbreak(buf);
compute_wordbreak(buf.begin(), buf.end());
compute_wordbreak.finish();
// ...
unicode::wordbreakscan scan;
scan << c;
size_t nchars=scan.finish();

DESCRIPTION

unicode::wordbreak_callback_base is a C++ binding for the Unicode word-breaking rule implementation described in unicode_word_break(3).

Subclass unicode::wordbreak_callback_base and override its virtual callback() method. callback() receives the output of the word-breaking algorithm: a bool that indicates whether a word break exists before the corresponding Unicode character in the input sequence.

callback() should return 0. A non-zero return value reports an error and stops the word-breaking algorithm. See unicode_word_break(3) for more information.

The << operator provides the input to the word-breaking algorithm one Unicode character at a time. The () operator takes either a container, or a beginning and an ending iterator for a sequence of Unicode characters. finish() indicates the end of the Unicode character sequence.
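As an illustration of the subclassing pattern, the following sketch records one flag per input character and then splits the text at the reported breaks. The names break_recorder and split_words are hypothetical and not part of the library. The sketch assumes that callback() is invoked once per input character and that some of those calls may only arrive from finish(), which is why the flags are collected first and applied afterwards. So that the sketch compiles on a system without the library, it falls back to a stand-in base class when <courier-unicode.h> is not found. That stand-in is an assumption made for illustration only: it breaks wherever the text switches between spaces and non-spaces, and it does not implement the Unicode rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

#if __has_include(<courier-unicode.h>)
#include <courier-unicode.h>
#else
// Hypothetical stand-in, used only when the library header is missing.
// NOT the real algorithm: it reports a break before the first character
// and wherever "is a space" changes from one character to the next.
namespace unicode {
class wordbreak_callback_base {
    bool first = true, prev_space = false;
public:
    virtual ~wordbreak_callback_base() = default;
    wordbreak_callback_base &operator<<(char32_t c) {
        bool space = (c == U' ');
        callback(first || space != prev_space);
        first = false;
        prev_space = space;
        return *this;
    }
    template<typename Iter>
    wordbreak_callback_base &operator()(Iter b, Iter e) {
        for (; b != e; ++b) *this << *b;
        return *this;
    }
    template<typename Container>
    wordbreak_callback_base &operator()(const Container &c) {
        return (*this)(c.begin(), c.end());
    }
    void finish() {}
private:
    virtual int callback(bool) { return 0; }
};
}
#endif

// Hypothetical subclass: stores the flag passed to each callback() call.
class break_recorder : public unicode::wordbreak_callback_base {
public:
    using unicode::wordbreak_callback_base::operator<<;
    using unicode::wordbreak_callback_base::operator();

    std::vector<bool> flags;

    int callback(bool flag) override {
        flags.push_back(flag);
        return 0;   // 0 continues; a non-zero value reports an error
    }
};

// Hypothetical helper: splits text at the recorded word breaks.
std::vector<std::u32string> split_words(const std::u32string &text) {
    break_recorder rec;
    rec(text);      // container form of the () operator
    rec.finish();   // end of input; remaining callbacks may arrive here
    // Assumption: exactly one callback per input character.
    assert(rec.flags.size() == text.size());

    std::vector<std::u32string> out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (rec.flags[i] || out.empty())
            out.emplace_back();          // a break precedes text[i]
        out.back().push_back(text[i]);
    }
    return out;
}
```

Calling split_words() on a std::u32string returns the segments between consecutive breaks, with runs of whitespace as segments of their own. The same recorder could instead be fed with rec << c for each character, or with rec(text.begin(), text.end()).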

unicode::wordbreakscan is a C++ binding for the unicode_wbscan_init(), unicode_wbscan_next() and unicode_wbscan_end() functions described in unicode_word_break(3). Its << operator feeds in the Unicode characters one at a time, and finish() returns the number of characters before the first Unicode word break. The << operator returns a bool that indicates whether the first word break has already been found, in which case further calls are not necessary.
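A minimal sketch of the scanning interface follows. It feeds characters until << reports that the first break has been found, then asks finish() for the count. The helper name first_word_length is hypothetical. As an assumption made only so that the sketch compiles without the library, a crude stand-in for unicode::wordbreakscan is used when <courier-unicode.h> is not found. The stand-in treats the first switch between spaces and non-spaces as the first break and does not implement the Unicode rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#if __has_include(<courier-unicode.h>)
#include <courier-unicode.h>
#else
// Hypothetical stand-in, used only when the library header is missing.
// NOT the real algorithm: the first change in "is a space" is the break.
namespace unicode {
class wordbreakscan {
    std::size_t count = 0;
    bool found = false, prev_space = false;
public:
    bool operator<<(char32_t c) {
        bool space = (c == U' ');
        if (!found) {
            if (count > 0 && space != prev_space)
                found = true;    // break precedes this character
            else
                ++count;         // still inside the first word
        }
        prev_space = space;
        return found;
    }
    std::size_t finish() { return count; }
};
}
#endif

// Hypothetical helper: number of characters before the first word break.
std::size_t first_word_length(const std::u32string &text) {
    unicode::wordbreakscan scan;
    for (char32_t c : text)
        if (scan << c)
            break;               // first break found; stop feeding input
    std::size_t n = scan.finish();
    assert(n <= text.size());    // sanity check on the reported count
    return n;
}
```

Stopping as soon as << returns true avoids scanning the rest of a long buffer when only the leading word is of interest.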

AUTHOR

Sam Varshavchik

Author

08/26/2025

Courier Unicode Library

Source file:	unicode::wordbreak_callback_base.3.en.gz (from libcourier-unicode-dev 2.4.0-4+b1)
Source last updated:	2026-04-23T20:08:45Z
Converted to HTML:	2026-06-25T20:40:35Z
