LLAMA-BENCH(1) User Commands LLAMA-BENCH(1)

NAME

llama-bench - benchmark the performance of llama.cpp models

DESCRIPTION

usage: llama-bench [options]

options:

  -h, --help
  --numa <distribute|isolate|numactl>       numa mode (default: disabled)
  -r, --repetitions <n>                     number of times to repeat each test (default: 5)
  --prio <0|1|2|3>                          process/thread priority (default: 0)
  --delay <0...N> (seconds)                 delay between each test (default: 0)
  -o, --output <csv|json|jsonl|md|sql>      output format printed to stdout (default: md)
  -oe, --output-err <csv|json|jsonl|md|sql> output format printed to stderr (default: none)
  -v, --verbose                             verbose output
  --progress                                print test progress indicators
  --no-warmup                               skip warmup runs before benchmarking

test parameters:

  -m, --model <filename>                    (default: models/7B/ggml-model-q4_0.gguf)
  -p, --n-prompt <n>                        (default: 512)
  -n, --n-gen <n>                           (default: 128)
  -pg <pp,tg>                               (default: )
  -d, --n-depth <n>                         (default: 0)
  -b, --batch-size <n>                      (default: 2048)
  -ub, --ubatch-size <n>                    (default: 512)
  -ctk, --cache-type-k <t>                  (default: f16)
  -ctv, --cache-type-v <t>                  (default: f16)
  -dt, --defrag-thold <f>                   (default: -1)
  -t, --threads <n>                         (default: 6)
  -C, --cpu-mask <hex,hex>                  (default: 0x0)
  --cpu-strict <0|1>                        (default: 0)
  --poll <0...100>                          (default: 50)
  -ngl, --n-gpu-layers <n>                  (default: 99)
  -sm, --split-mode <none|layer|row>        (default: layer)
  -mg, --main-gpu <i>                       (default: 0)
  -nkvo, --no-kv-offload <0|1>              (default: 0)
  -fa, --flash-attn <0|1>                   (default: 0)
  -mmp, --mmap <0|1>                        (default: 1)
  -embd, --embeddings <0|1>                 (default: 0)
  -ts, --tensor-split <ts0/ts1/..>          (default: 0)
  -ot, --override-tensor <pattern>          (default: disabled)
  -nopo, --no-op-offload <0|1>              (default: 0)

Multiple values can be given for each parameter by separating them with ',' or by specifying the parameter multiple times. Ranges can be given as 'first-last' or 'first-last+step' or 'first-last*mult'.
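The list and range syntax above can be sketched as follows. This is a hypothetical re-implementation for illustration only, not llama-bench's actual parser (which is written in C++); the function name expand and the restriction to non-negative integers are assumptions of this sketch.

```python
def expand(spec: str) -> list[int]:
    """Expand a comma-separated list of values and ranges.

    Ranges follow the documented forms: 'first-last' (step 1),
    'first-last+step' (additive), 'first-last*mult' (multiplicative).
    Only non-negative integers are handled in this sketch.
    """
    values: list[int] = []
    for part in spec.split(","):
        if "-" not in part:
            values.append(int(part))  # plain value, e.g. '512'
            continue
        rng, op, step = part, "+", 1  # bare 'first-last' means step +1
        for candidate in ("+", "*"):
            if candidate in part:
                rng, step_text = part.split(candidate)
                op, step = candidate, int(step_text)
        first, last = (int(x) for x in rng.split("-"))
        v = first
        while v <= last:
            values.append(v)
            v = v + step if op == "+" else v * step
    return values
```

Under these assumptions, a parameter such as '2-8+2' would expand to the values 2, 4, 6 and 8, and '16-256*4' to 16, 64 and 256.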

August 2025 debian