NAME

omindex - Index static website data via the filesystem

SYNOPSIS

omindex [OPTIONS] --db DATABASE [BASEDIR] DIRECTORY

DESCRIPTION

omindex indexes static website data via the filesystem.

DIRECTORY is the directory to start indexing from.

BASEDIR is the directory corresponding to the base URL given with --url (default: DIRECTORY).
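
The relationship between BASEDIR and the base URL can be sketched with a small shell helper. This is an illustration only: url_for is a hypothetical function, the paths are made up, and the sketch ignores any URL escaping that omindex itself may apply to file names.

```shell
#!/bin/sh
# Illustration only: url_for is a hypothetical helper, not part of omindex.
# It mimics how a file under BASEDIR lines up with a URL under the base URL.
# The matching single-directory invocation would look like:
#   omindex --db DATABASE --url /press /www/example/press
url_for() {
    basedir=${1%/}   # BASEDIR, with one trailing slash removed
    baseurl=${2%/}   # base URL (the --url value), same treatment
    file=$3          # absolute path of a file below BASEDIR
    case $file in
        "$basedir"/*)
            rel=${file#"$basedir"/}
            printf '%s/%s\n' "$baseurl" "$rel"
            ;;
        *)
            echo "url_for: $file is not under $basedir" >&2
            return 1
            ;;
    esac
}

url_for /www/example/press /press /www/example/press/2017/april.html
```

With the default base URL of /, the same file would map to /2017/april.html.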

OPTIONS

-d, --duplicates=ARG: set duplicate handling: ARG can be 'ignore' or 'replace'
-p, --no-delete: skip the deletion of documents corresponding to deleted files (--preserve-nonduplicates is a deprecated alias for --no-delete)
-e, --empty-docs=ARG: how to handle documents we extract no text from: ARG can be index, warn (issue a diagnostic and index), or skip. (default: warn)
-D, --db=DATABASE: path to database to use
-U, --url=URL: base URL BASEDIR corresponds to (default: /)
-M, --mime-type=EXT:TYPE: assume any file with extension EXT has MIME Content-Type TYPE, instead of using libmagic (empty TYPE removes any existing mapping for EXT)
-F, --filter=M[,[T][,C]]:CMD: process files with MIME Content-Type M using command CMD, which produces output (on stdout or in a temporary file) with format T (Content-Type or file extension; currently txt (default) or html) in character encoding C (default: UTF-8). E.g. -Fapplication/octet-stream:'strings -n8' or -Ftext/x-foo,,utf-16:'foo2utf16 %f %t'
-l, --depth-limit=LIMIT: set recursion limit (0 = unlimited)
-f, --follow: follow symbolic links
-i, --ignore-exclusions: ignore meta robots tags and similar exclusions
-S, --spelling: index data for spelling correction
-m, --max-size=SIZE: maximum size of file to index (in bytes or with a suffix of 'K'/'k', 'M'/'m', 'G'/'g') (default: unlimited)
--sample=SOURCE: what to use for the stored sample of text for HTML documents - SOURCE can be 'body' or 'description' (default: 'body')
-E, --sample-size=SIZE: maximum size for the document text sample (supports the same formats as --max-size). (default: 512)
-T, --title-size=SIZE: maximum size for the document title (supports the same formats as --max-size). (default: 128)
-R, --retry-failed: retry files which omindex failed to extract text from on a previous run
--opendir-sleep=SECS: sleep for SECS seconds before opening each directory - sleeping for 2 seconds seems to reliably work around problems with indexing files on Microsoft DFS shares.
-C, --track-ctime: track each file's ctime so we can detect changes to ownership or permissions.
-v, --verbose: show more information about what is happening
--overwrite: create the database anew (the default is to update if the database already exists)
-s, --stemmer=LANG: set the stemming language (default: english). Possible values: arabic armenian basque catalan danish dutch earlyenglish english finnish french german german2 hungarian italian kraaij_pohlmann lovins norwegian porter portuguese romanian russian spanish swedish turkish (pass 'none' to disable stemming)
-h, --help: display this help and exit
-V, --version: output version information and exit
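
As a rough sketch of how several of the options above might be combined, the wrapper below assembles one omindex command line. The function name index_site, the RUN variable, the paths, and the 'log' extension mapping are assumptions made for illustration; only the option names and value formats come from the list above. By default it prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper: index_site and RUN are illustrative names only.
index_site() {
    db=$1     # path to the database (--db)
    site=$2   # DIRECTORY to start indexing from

    # Build the argument list with "set --" so each option stays one word,
    # even when its value contains spaces (see the --filter command).
    set -- \
        --db "$db" \
        --url=/ \
        --stemmer=english \
        --max-size=10M \
        --empty-docs=skip \
        --mime-type=log:text/plain \
        "--filter=application/octet-stream:strings -n8" \
        "$site"

    if [ "${RUN:-0}" = 1 ]; then
        omindex "$@"            # really run the indexer
    else
        printf 'omindex'        # dry run: show what would be executed
        printf ' %s' "$@"
        printf '\n'
    fi
}

index_site /var/lib/omega/data/default /www/example
```

Setting RUN=1 in the environment runs omindex for real. Note that the dry-run output is not re-quoted, so the filter command appears as two words on screen although it is passed as one argument.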

Please report bugs at: https://xapian.org/bugs

April 2017

xapian-omega 1.4.4
