LLAMA-TOKENIZE(1)                 User Commands                LLAMA-TOKENIZE(1)
NAME
llama-tokenize - tokenize a prompt with a given model and print the resulting tokens
DESCRIPTION
usage: obj-x86_64-linux-gnu/bin/llama-tokenize [options]
The tokenize program tokenizes a prompt using a given model, and prints the resulting tokens to standard output.
It needs a model file, a prompt, and optionally other flags to control the behavior of the tokenizer.
The possible options are:
- -h, --help
- print this help and exit
- -m MODEL_PATH, --model MODEL_PATH
- path to the model file.
- --ids
- if given, only print numerical token IDs, and not token strings. The output format looks like [1, 2, 3], i.e. parseable by Python (see the example after this list).
- -f PROMPT_FNAME, --file PROMPT_FNAME
- read prompt from a file.
- -p PROMPT, --prompt PROMPT
- read prompt from the argument.
- --stdin
- read prompt from standard input.
- --no-bos
- never add a BOS (beginning-of-sequence) token to the prompt, even if the model normally uses one.
- --no-escape
- do not escape input (such as \n, \t, etc.).
- --no-parse-special
- do not parse control tokens.
- --log-disable
- disable logs. Makes stderr quiet when loading the model.
- --show-count
- print the total number of tokens.
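EXAMPLES
A minimal sketch of a typical invocation. The model path, prompt, and printed token IDs below are illustrative placeholders, not actual output; real IDs depend on the model's vocabulary:

    llama-tokenize -m ./model.gguf -p "Hello world" --ids
    [1, 15043, 3186]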
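To read the prompt from standard input instead of an argument and print the total token count, the --stdin and --show-count flags can be combined (again, the model path is a placeholder):

    echo "Hello world" | llama-tokenize -m ./model.gguf --stdin --show-count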
debian                             August 2025                 LLAMA-TOKENIZE(1)