| OMINDEX(1) | User Commands | OMINDEX(1) |
NAME
omindex - Index static website data via the filesystem
SYNOPSIS
omindex [OPTIONS] --db DATABASE [BASEDIR] DIRECTORY
DESCRIPTION
omindex indexes static website data via the filesystem.
DIRECTORY is the directory to start indexing from.
BASEDIR is the directory corresponding to URL (default: DIRECTORY).
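As an illustration of the simplest form, where only DIRECTORY is given and BASEDIR therefore defaults to it, the following sketch assembles an invocation, prints it, and runs it only if omindex is installed and the directory exists. The paths ./omega-db and ./public_html are hypothetical placeholders, not defaults of omindex.

```shell
#!/bin/sh
# Hypothetical locations; substitute your own database and site paths.
DB=./omega-db
SITE=./public_html

# Assemble the command so it can be inspected before anything is run.
# --url / is the documented default and is spelled out here for clarity.
set -- omindex --db "$DB" --url / --verbose "$SITE"
echo "$@"

# Run it only when omindex is installed and the directory exists.
if command -v omindex >/dev/null 2>&1 && [ -d "$SITE" ]; then
    "$@"
fi
```

Running the same command again updates the existing database; --overwrite would create it anew instead.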
OPTIONS
- -d, --duplicates=ARG
- set duplicate handling: ARG can be 'ignore' or 'replace' (default: replace)
- -p, --no-delete
- skip the deletion of documents corresponding to deleted files
- -e, --empty-docs=ARG
- how to handle documents we extract no text from: ARG can be 'index', 'warn' (issue a diagnostic and index), or 'skip' (default: warn)
- -D, --db=DATABASE
- path to database to use
- -U, --url=URL
- base URL that BASEDIR corresponds to (default: /)
- -M, --mime-type=EXT:TYPE
- assume any file with extension EXT has MIME Content-Type TYPE, instead of using libmagic (empty TYPE removes any existing mapping for EXT; other special TYPE values: 'ignore' and 'skip')
- -G, --mime-type-match=GLOB:TYPE
- assume any file with leaf name matching shell wildcard pattern GLOB has MIME Content-Type TYPE (special TYPE values: 'ignore' and 'skip')
- -F, --filter=M[,[T][,C]]:CMD
- process files with MIME Content-Type M using command CMD, which produces output (on stdout or in a temporary file) with format T (Content-Type or file extension; currently txt (default), html or svg) in character encoding C (default: UTF-8). E.g. -Fapplication/octet-stream:'|strings -n8' or -Ftext/x-foo,,utf-16:'foo2utf16 %f %t'
- -W, --worker=TYPE:WORKER
- process files with MIME Content-Type TYPE using worker sub-process WORKER. WORKER is the name of the program to run to start the worker. If it has no path then it's looked for in pkglibbindir (which can be overridden by setting environment variable XAPIAN_OMEGA_PKGLIBBINDIR). This invocation will look in: /usr/local/lib/xapian-omega/bin
- --read-filters=FILE
- bulk-load --filter arguments from FILE, which should contain one such argument per line (e.g. text/x-bar:bar2txt --utf8). Lines starting with # are treated as comments and ignored.
- --read-workers=FILE
- bulk-load --worker arguments from FILE, which should contain one such argument per line (e.g. text/x-bar:omindex_libbar). Lines starting with # are treated as comments and ignored.
- -l, --depth-limit=LIMIT
- set recursion limit (0 = unlimited)
- -f, --follow
- follow symbolic links
- -i, --ignore-exclusions
- ignore meta robots tags and similar exclusions
- -S, --spelling
- index data for spelling correction
- -m, --max-size=N[SUFFIX]
- maximum size of file to index (in bytes or with a suffix of 'K'/'k', 'M'/'m', 'G'/'g') (default: unlimited)
- --sample=SOURCE
- what to use for the stored sample of text for HTML documents - SOURCE can be 'body' or 'description' (default: 'body')
- -E, --sample-size=SIZE
- maximum size for the document text sample; supports the same formats as --max-size (default: 512)
- -T, --title-size=SIZE
- maximum size for the document title; supports the same formats as --max-size (default: 128)
- -R, --retry-failed
- retry files which omindex failed to extract text from on a previous run
- --opendir-sleep=SECS
- sleep for SECS seconds before opening each directory - sleeping for 2 seconds seems to reliably work around problems with indexing files on Microsoft DFS shares.
- -C, --track-ctime
- track each file's ctime so we can detect changes to ownership or permissions.
- --date-terms
- index D, M and Y prefixed terms to support date range filtering using terms (we now recommend using a value slot for this instead).
- --no-date-terms
- ignored for compatibility with Omega 1.4.x.
- -v, --verbose
- show more information about what is happening
- --overwrite
- create the database anew (the default is to update if the database already exists)
- -s, --stemmer=LANG
- set the stemming language (default: english). Possible values: arabic armenian basque catalan danish dutch dutch_porter earlyenglish english esperanto estonian finnish french german greek hindi hungarian indonesian irish italian lithuanian lovins nepali norwegian polish porter portuguese romanian russian serbian spanish swedish tamil turkish yiddish (pass 'none' to disable stemming)
- -h, --help
- display this help and exit
- -V, --version
- output version information and exit
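The options above can be combined. The following sketch writes a file for --read-filters using the placeholder filter commands from the option descriptions (bar2txt and foo2utf16 are not real programs), then assembles and prints an invocation rather than running it. The file name filters.txt and the paths ./omega-db and ./public_html are assumptions made for this example.

```shell
#!/bin/sh
# filters.txt is a hypothetical file name; bar2txt and foo2utf16 are the
# placeholder commands used in the --filter and --read-filters descriptions.
cat > filters.txt <<'EOF'
# one --filter argument per line; lines starting with # are ignored
text/x-bar:bar2txt --utf8
text/x-foo,,utf-16:foo2utf16 %f %t
EOF

# Count the lines that are not comments, i.e. the filter arguments.
active=$(grep -vc '^#' filters.txt)
echo "$active filter(s) defined"

# Hypothetical database and site paths. Files larger than 10M are not
# indexed, and files with extension .log get the special TYPE 'ignore'.
set -- omindex --db ./omega-db --read-filters=filters.txt \
    --max-size=10M --mime-type=log:ignore ./public_html
echo "$@"
```

Because each line of the file is taken as a single --filter argument, no shell quoting is needed around the command part, unlike on the command line.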
Please report bugs at: https://xapian.org/bugs
| March 2026 | xapian-omega 2.0.0 |