NAME¶
hashdeep - Compute, compare, or audit multiple message digests
SYNOPSIS¶
hashdeep -V | -h
hashdeep [-c <alg1>[,<alg2>]] [-k <file>] [-i
<size>] [-f <file>] [-o <fbcplsde>] [-amxwMXrespblvv]
[-F<bum>] [-j <num>] [
FILES]
DESCRIPTION¶
Computes multiple hashes, or message digests, for any number of files while
optionally recursively digging through the directory structure. By default the
program computes MD5 and SHA-256 hashes, equivalent to -c md5,sha256. Can also
take a list of known hashes and display the filenames of input files whose
hashes either do or do not match any of the known hashes. Can also use a list
of known hashes to audit a set of FILES. Errors are reported to standard
error. If no FILES are specified, reads from standard input.
- -c <alg1>[,<alg2>...]
- Computation mode. Compute hashes of FILES using the
algorithms specified. Legal values are md5, sha1, sha256, tiger, and
whirlpool.
- -k
- Load a file of known hashes. This flag is required when
using any of the matching or audit modes (i.e. -m, -x, -M, -X, or -a) This
flag may be used more than once to add multiple sets of known hashes.
Loading sets with different hash algorithms can sometimes generate spurrious
hash collisions. For example, let's say we have two hash sets, A and B,
which have some overlapping files. For example, the file /usr/bin/bad is
in both sets. In A we've recorded the MD5 and SHA-256. In B we've recorded
the MD5, SHA-1, and SHA-256. Because these two records are different, they
will both be loaded. When the program computes all three hashes and
compares them to the set of knowns, we will get an exact match from the
record in B and a collision from the record in A.
- -a
- Audit mode. Each input file is compared against the set of
knowns. An audit is said to pass if each input file is matched against
exactly one file in set of knowns. Any collisions, new files, or missing
files will make the audit fail. Using this flag alone produces a message,
either "Audit passed" or "Audit Failed". Use the
verbose modes, -v, for more details. Using -v prints the number of files
in each category. Using -v a second time prints any discrepancies. Using
-v a third time prints the results for every file examined and every known
file.
Due to limitations in the program, any filenames with Unicode characters
will appear to have moved during an audit. See the section "UNICODE
SUPPORT" below.
- -m
- Positive matching, requires at least one use of the -k
flag. The input files are examined one at a time, and only those files
that match the list of known hashes are output. The only acceptable format
for known hashes is the output of previous hashdeep runs.
If standard input is used with the -m flag, displays
"stdin" if the input matches one of the hashes in the list of
known hashes. If the hash does not match, the program displays no output.
This flag may not be used in conjunction with the -x, -X, or -a
flags. See the section "UNICODE SUPPORT" below.
- -x
- Negative matching. Same as the -m flag above, but does
negative matching. That is, only those files NOT in the list of known
hashes are displayed.
This flag may not be used in conjunction with the -m, -M, or -a
flags. See the section "UNICODE SUPPORT" below.
- -f <file>
- Takes a list of files to be hashed from the specified file.
Each line is assumed to be a filename. This flag can only be used once per
invocation. If it's used a second time, the second instance will clobber
the first.
Note that you can still use other flags, such as the -m or -x modes, and
submit additional FILES on the command line.
- -w
- When used with positive matching modes (-m,-M) displays the
filename of the known hash that matched the input file. See the section
"UNICODE SUPPORT" below.
- -M and -X
- Same as -m and -x above, but displays the hash for each
file that does (or does not) match the list of known hashes.
- -r
- Enables recursive mode. All subdirectories are traversed.
Please note that recursive mode cannot be used to examine all files of a
given file extension. For example, calling hashdeep -r *.txt will examine
all files in directories that end in .txt.
- -e
- Displays a progress indicator and estimate of time
remaining for each file being processed. Time estimates for files larger
than 4GB are not available on Windows. This mode may not be used with th
-p mode.
- -i <size>
- Size threshold mode. Only hash files smaller than the given
the threshold. Sizes may be specified using IEC multipliers b,k,m,g,t,p,
and e.
- -o <bcpflsd>
- Enables expert mode. Allows the user specify which (and
only which) types of files are processed. Directory processing is still
controlled with the -r flag. The expert mode options allowed are:
f - Regular files
b - Block Devices
c - Character Devices
p - Named Pipes
l - Symbolic Links
s - Sockets
d - Solaris Doors
e - Windows PE executables
- -s
- Enables silent mode. All error messages are supressed.
- -p
- Piecewise mode. Breaks files into chunks before hashing.
Chunks may be specified using IEC multipliers b,k,m,g,t,p, and e. (Never
let it be said that the author didn’t plan ahead.)
- -b
- Enables bare mode. Strips any leading directory information
from displayed filenames. This flag may not be used in conjunction with
the -l flag.
- -l
- Enables relative file paths. Instead of printing the
absolute path for each file, displays the relative file path as indicated
on the command line. This flag may not be used in conjunction with the -b
flag.
- -v
- Enables verbose mode. Use again to make the program more
verbose. This mostly changes the behvaior of the audit mode, -a.
- -jnn
- Controls multi-threading. By default the program will
create one producer thread to scan the file system and one hashing thread
per CPU core. Multi-threading causes output filenames to be in
non-deterministic order, as files that take longer to hash will be delayed
while they are hashed. If a deterministic order is required, specify
-j0 to disable multi-threading
- -d
- Output in Digital Forensics XML (DFXML) format.
- -u
- Quote Unicode output. For example, the snowman is shown as
U+C426.
- -F<bum>
- Specifies the input mode that is used to read files. The
default is -Fb (buffered I/O) which reads files with fopen().
Specifying -Fu will use unbuffered I/O and read the file with
open(). Specifying -Fm will use memory-mapped I/O which will be
faster on some platforms, but which (currently) will not work with files
that produce I/O errors.
- -h
- Show a help screen and exit.
- -V
- Show the version number and exit.
UNICODE SUPPORT¶
As of version 3.0 the program supports Unicode characters in filenames on
Microsoft Windows systems for filenames specified on the command line with
globbing (e.g. *), for files specified with the
-f of files to hash,
and for files read from directories using the
-r option.
By default all program input and output should be in UTF-8. The program
automatically converts this to UTF-16 for opening files).
On Unix/Linux/MacOS, you should use a terminal emulator that supports UTF-8 and
UTF-8 characters in filenames will be properly displayed.
On Windows, please note that the onsole is not capiable of displaying Unicode
characters. You must either redirect output to a file and open the file with
Wordpad (which can display Unicode), or you must specify the
-u option
to quote Unicode using standard
U+XXXX notation.
Currently the file name of a file containing known hashes may not be specified
as a unicode filename, but you can specify the name using tab completition or
an asterisk (e.g. md5deep -m *.txt where there is only one file with a .txt
extension).
RETURN VALUE¶
Returns zero on success, one on error.
AUTHOR¶
hashdeep was written by Jesse Kornblum, research@jessekornblum.com, and Simson
Garfinkel.
KNOWN ISSUES¶
Using the -r flag cannot be used to recursively process all files of a given
extension in a directory. This is a feature, not a bug. If you need to do
this, use the
find(1) command.
The program will fail if you attempt to compare 2^64 or more input files against
a set of known files.
REPORTING BUGS¶
We take all bug reports
very seriously. Any bug that jeopardizes the
forensic integrity of this program could have serious consequenses on people's
lives. When submitting a bug report, please include a description of the
problem, how you found it, and your contact information.
Send bug reports to the author at the address above.
COPYRIGHT¶
This program is a work of the US Government. In accordance with 17 USC 105,
copyright protection is not available for any work of the US Government. This
program is PUBLIC DOMAIN. Portions of this program contain code that is
licensed under the terms of the General Public License (GPL). Those portions
retain their original copyright and license. See the file COPYING for more
details.
There is NO warranty for this program; not even for MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.
SEE ALSO¶
More information and installation instructions can be found in the README file.
Current versions of both documents can be found on the project homepage:
http://md5deep.sourceforge.net/
The MD5 specification, RFC 1321, is available at
http://www.ietf.org/rfc/rfc1321.txt
The SHA-1 specification, RFC 3174, is available at
http://www.faqs.org/rfcs/rfc3174.html
The SHA-256 specification, FIPS 180-2, is available at
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2.pdf
The Tiger specification is available at
http://www.cs.technion.ac.il/~biham/Reports/Tiger/
The Whirlpool specification is available at
http://planeta.terra.com.br/informatica/paulobarreto/WhirlpoolPage.html