NAME¶
pt-fingerprint - Convert queries into fingerprints.
SYNOPSIS¶
Usage: pt-fingerprint [OPTIONS] [FILES]
pt-fingerprint converts queries into fingerprints. With the --query option,
converts the option's value into a fingerprint. With no options, treats
command-line arguments as FILEs and reads and converts semicolon-separated
queries from the FILEs. When FILE is -, it read standard input.
Convert a single query:
pt-fingerprint --query "select a, b, c from users where id = 500"
Convert a file full of queries:
pt-fingerprint /path/to/file.txt
RISKS¶
The following section is included to inform users about the potential risks,
whether known or unknown, of using this tool. The two main categories of risks
are those created by the nature of the tool (e.g. read-only tools vs.
read-write tools) and those created by bugs.
The pt-fingerprint tool simply reads data and transforms it, so risks are
minimal.
See also "BUGS" for more information on filing bugs and getting help.
DESCRIPTION¶
A query fingerprint is the abstracted form of a query, which makes it possible
to group similar queries together. Abstracting a query removes literal values,
normalizes whitespace, and so on. For example, consider these two queries:
SELECT name, password FROM user WHERE id='12823';
select name, password from user
where id=5;
Both of those queries will fingerprint to
select name, password from user where id=?
Once the query's fingerprint is known, we can then talk about a query as though
it represents all similar queries.
Query fingerprinting accommodates a great many special cases, which have proven
necessary in the real world. For example, an IN list with 5 literals is really
equivalent to one with 4 literals, so lists of literals are collapsed to a
single one. If you want to understand more about how and why all of these
cases are handled, please review the test cases in the Subversion repository.
If you find something that is not fingerprinted properly, please submit a bug
report with a reproducible test case. Here is a list of transformations during
fingerprinting, which might not be exhaustive:
- •
- Group all SELECT queries from mysqldump together, even if
they are against different tables. Ditto for all of pt-table-checksum's
checksum queries.
- •
- Shorten multi-value INSERT statements to a single
VALUES() list.
- •
- Strip comments.
- •
- Abstract the databases in USE statements, so all USE
statements are grouped together.
- •
- Replace all literals, such as quoted strings. For
efficiency, the code that replaces literal numbers is somewhat
non-selective, and might replace some things as numbers when they really
are not. Hexadecimal literals are also replaced. NULL is treated as a
literal. Numbers embedded in identifiers are also replaced, so tables
named similarly will be fingerprinted to the same values (e.g. users_2009
and users_2010 will fingerprint identically).
- •
- Collapse all whitespace into a single space.
- •
- Lowercase the entire query.
- •
- Replace all literals inside of IN() and
VALUES() lists with a single placeholder, regardless of
cardinality.
- •
- Collapse multiple identical UNION queries into a single
one.
OPTIONS¶
This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.
- --config
- type: Array
Read this comma-separated list of config files; if specified, this must be
the first option on the command line.
- --help
- Show help and exit.
- --match-embedded-numbers
- Match numbers embedded in words and replace as single
values. This option causes the tool to be more careful about matching
numbers so that words with numbers, like "catch22" are matched
and replaced as a single "?" placeholder. Otherwise the default
number matching pattern will replace "catch22" as
"catch?".
This is helpful if database or table names contain numbers.
- --match-md5-checksums
- Match MD5 checksums and replace as single values. This
option causes the tool to be more careful about matching numbers so that
MD5 checksums like "fbc5e685a5d3d45aa1d0347fdb7c4d35" are
matched and replaced as a single "?" placeholder. Otherwise, the
default number matching pattern will replace
"fbc5e685a5d3d45aa1d0347fdb7c4d35" as "fbc?".
- --query
- type: string
The query to convert into a fingerprint.
- --version
- Show version and exit.
ENVIRONMENT¶
The environment variable "PTDEBUG" enables verbose debugging output to
STDERR. To enable debugging and capture all output to a file, run the tool
like:
PTDEBUG=1 pt-fingerprint ... > FILE 2>&1
Be careful: debugging output is voluminous and can generate several megabytes of
output.
SYSTEM REQUIREMENTS¶
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.
BUGS¶
For a list of known bugs, see
http://www.percona.com/bugs/pt-fingerprint
<
http://www.percona.com/bugs/pt-fingerprint>.
Please report bugs at
https://bugs.launchpad.net/percona-toolkit
<
https://bugs.launchpad.net/percona-toolkit>. Include the following
information in your bug report:
- •
- Complete command-line used to run the tool
- •
- Tool "--version"
- •
- MySQL version of all servers involved
- •
- Output from the tool including STDERR
- •
- Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with
"PTDEBUG"; see "ENVIRONMENT".
DOWNLOADING¶
Visit
http://www.percona.com/software/percona-toolkit/
<
http://www.percona.com/software/percona-toolkit/> to download the
latest release of Percona Toolkit. Or, get the latest release from the command
line:
wget percona.com/get/percona-toolkit.tar.gz
wget percona.com/get/percona-toolkit.rpm
wget percona.com/get/percona-toolkit.deb
You can also get individual tools from the latest release:
wget percona.com/get/TOOL
Replace "TOOL" with the name of any tool.
AUTHORS¶
Baron Schwartz and Daniel Nichter
This tool is part of Percona Toolkit, a collection of advanced command-line
tools developed by Percona for MySQL support and consulting. Percona Toolkit
was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and
Daniel Nichter, both of whom are employed by Percona. Visit
<
http://www.percona.com/software/> for more software developed by
Percona.
COPYRIGHT, LICENSE, AND WARRANTY¶
This program is copyright 2011-2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue `man perlgpl' or `man perlartistic' to read these
licenses.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.
VERSION¶
pt-fingerprint 2.1.2