NAME¶
pt-table-usage - Analyze how queries use tables.
SYNOPSIS¶
Usage: pt-table-usage [OPTIONS] [FILES]
pt-table-usage reads queries from a log and analyzes how they use tables. If no
FILE is specified, it reads STDIN. It prints a report for each query.
RISKS¶
pt-table-use is very low risk. By default, it simply reads queries from a log.
It executes "EXPLAIN EXTENDED" if you specify the
"--explain-extended" option.
At the time of this release, we know of no bugs that could harm users.
The authoritative source for updated information is always the online issue
tracking system. Issues that affect this tool will be marked as such. You can
see a list of such issues at the following URL:
http://www.percona.com/bugs/pt-table-usage
<
http://www.percona.com/bugs/pt-table-usage>.
See also "BUGS" for more information on filing bugs and getting help.
DESCRIPTION¶
pt-table-usage reads queries from a log and analyzes how they use tables. The
log should be in MySQL's slow query log format.
Table usage is more than simply an indication of which tables the query reads or
writes. It also indicates data flow: data in and data out. The tool determines
the data flow by the contexts in which tables appear. A single query can use a
table in several different contexts simultaneously. The tool's output lists
every context for every table. This CONTEXT-TABLE list indicates how data
flows between tables. The "OUTPUT" section lists the possible
contexts and describes how to read a table usage report.
The tool analyzes data flow down to the level of individual columns, so it is
helpful if columns are identified unambiguously in the query. If a query uses
only one table, then all columns must be from that table, and there's no
difficulty. But if a query uses multiple tables and the column names are not
table-qualified, then it is necessary to use "EXPLAIN EXTENDED",
followed by "SHOW WARNINGS", to determine to which tables the
columns belong.
If the tool does not know the query's default database, which can occur when the
database is not printed in the log, then "EXPLAIN EXTENDED" can
fail. In this case, you can specify a default database with
"--database". You can also use the
"--create-table-definitions" option to help resolve ambiguities.
OUTPUT¶
The tool prints a usage report for each table in every query, similar to the
following:
Query_id: 0x1CD27577D202A339.1
UPDATE t1
SELECT DUAL
JOIN t1
JOIN t2
WHERE t1
Query_id: 0x1CD27577D202A339.2
UPDATE t2
SELECT DUAL
JOIN t1
JOIN t2
WHERE t1
The first line contains the query ID, which by default is the same as those
shown in pt-query-digest reports. It is an MD5 checksum of the query's
"fingerprint," which is what remains after removing literals,
collapsing white space, and a variety of other transformations. The query ID
has two parts separated by a period: the query ID and the table number. If you
wish to use a different value to identify the query, you can specify the
"--id-attribute" option.
The previous example shows two paragraphs for a single query, not two queries.
Note that the query ID is identical for the two, but the table number differs.
The table number increments by 1 for each table that the query updates. Only
multi-table UPDATE queries can update multiple tables with a single query, so
the table number is 1 for all other types of queries. (The tool does not
support multi-table DELETE queries.) The example output above is from this
query:
UPDATE t1 AS a JOIN t2 AS b USING (id)
SET a.foo="bar", b.foo="bat"
WHERE a.id=1;
The "SET" clause indicates that the query updates two tables:
"a" aliased as "t1", and "b" aliased as
"t2".
After the first line, the tool prints a variable number of CONTEXT-TABLE lines.
Possible contexts are as follows:
- •
- SELECT
SELECT means that the query retrieves data from the table for one of two
reasons. The first is to be returned to the user as part of a result set.
Only SELECT queries return result sets, so the report always shows a
SELECT context for SELECT queries.
The second case is when data flows to another table as part of an INSERT or
UPDATE. For example, the UPDATE query in the example above has the usage:
SELECT DUAL
This refers to:
SET a.foo="bar", b.foo="bat"
The tool uses DUAL for any values that do not originate in a table, in this
case the literal values "bar" and "bat". If that
"SET" clause were "SET a.foo=b.foo" instead, then the
complete usage would be:
Query_id: 0x1CD27577D202A339.1
UPDATE t1
SELECT t2
JOIN t1
JOIN t2
WHERE t1
The presence of a SELECT context after another context, such as UPDATE or
INSERT, indicates where the UPDATE or INSERT retrieves its data. The
example immediately above reflects an UPDATE query that updates rows in
table "t1" with data from table "t2".
- •
- Any other verb
Any other verb, such as INSERT, UPDATE, DELETE, etc. may be a context. These
verbs indicate that the query modifies data in some way. If a SELECT
context follows one of these verbs, then the query reads data from the
SELECT table and writes it to this table. This happens, for example, with
INSERT..SELECT or UPDATE queries that use values from tables instead of
constant values.
These query types are not supported: SET, LOAD, and multi-table DELETE.
- •
- JOIN
The JOIN context lists tables that are joined, either with an explicit JOIN
in the FROM clause, or implicitly in the WHERE clause, such as "t1.id
= t2.id".
- •
- WHERE
The WHERE context lists tables that are used in the WHERE clause to filter
results. This does not include tables that are implicitly joined in the
WHERE clause; those are listed as JOIN contexts. For example:
WHERE t1.id > 100 AND t1.id < 200 AND t2.foo IS NOT NULL
Results in:
WHERE t1
WHERE t2
The tool lists only distinct tables; that is why table "t1" is
listed only once.
- •
- TLIST
The TLIST context lists tables that the query accesses, but which do not
appear in any other context. These tables are usually an implicit
cartesian join. For example, the query "SELECT * FROM t1, t2"
results in:
Query_id: 0xBDDEB6EDA41897A8.1
SELECT t1
SELECT t2
TLIST t1
TLIST t2
First of all, there are two SELECT contexts, because "SELECT *"
selects rows from all tables; "t1" and "t2" in this
case. Secondly, the tables are implicitly joined, but without any kind of
join condition, which results in a cartesian join as indicated by the
TLIST context for each.
EXIT STATUS¶
pt-table-usage exits 1 on any kind of error, or 0 if no errors.
OPTIONS¶
This tool accepts additional command-line arguments. Refer to the
"SYNOPSIS" and usage information for details.
- --ask-pass
- Prompt for a password when connecting to MySQL.
- --charset
- short form: -A; type: string
Default character set. If the value is utf8, sets Perl's binmode on STDOUT
to utf8, passes the mysql_enable_utf8 option to DBD::mysql, and runs SET
NAMES UTF8 after connecting to MySQL. Any other value sets binmode on
STDOUT without the utf8 layer, and runs SET NAMES after connecting to
MySQL.
- --config
- type: Array
Read this comma-separated list of config files; if specified, this must be
the first option on the command line.
- --constant-data-value
- type: string; default: DUAL
Table to print as the source for constant data (literals). This is any data
not retrieved from tables (or subqueries, because subqueries are not
supported). This includes literal values such as strings ("foo")
and numbers (42), or functions such as "NOW()". For example, in
the query "INSERT INTO t (c) VALUES ('a')", the string 'a' is
constant data, so the table usage report is:
INSERT t
SELECT DUAL
The first line indicates that the query inserts data into table
"t", and the second line indicates that the inserted data comes
from some constant value.
- --[no]continue-on-error
- default: yes
Continue to work even if there is an error.
- --create-table-definitions
- type: array
Read "CREATE TABLE" definitions from this list of comma-separated
files. If you cannot use "--explain-extended" to fully qualify
table and column names, you can save the output of "mysqldump
--no-data" to one or more files and specify those files with this
option. The tool will parse all "CREATE TABLE" definitions from
the files and use this information to qualify table and column names. If a
column name appears in multiple tables, or a table name appears in
multiple databases, the ambiguities cannot be resolved.
- --daemonize
- Fork to the background and detach from the shell. POSIX
operating systems only.
- --database
- short form: -D; type: string
Default database.
- --defaults-file
- short form: -F; type: string
Only read mysql options from the given file. You must give an absolute
pathname.
- --explain-extended
- type: DSN
A server to execute EXPLAIN EXTENDED queries. This may be necessary to
resolve ambiguous (unqualified) column and table names.
- --filter
- type: string
Discard events for which this Perl code doesn't return true.
This option is a string of Perl code or a file containing Perl code that is
compiled into a subroutine with one argument: $event. If the given value
is a readable file, then pt-table-usage reads the entire file and uses its
contents as the code.
Filters are implemented in the same fashion as in the pt-query-digest tool,
so please refer to its documentation for more information.
- --help
- Show help and exit.
- --host
- short form: -h; type: string
Connect to host.
- --id-attribute
- type: string
Identify each event using this attribute. The default is to use a query ID,
which is an MD5 checksum of the query's fingerprint.
- --log
- type: string
Print all output to this file when daemonized.
- --password
- short form: -p; type: string
Password to use when connecting.
- --pid
- type: string
Create the given PID file when running. The file contains the process ID of
the daemonized instance. The PID file is removed when the daemonized
instance exits. The program checks for the existence of the PID file when
starting; if it exists and the process with the matching PID exists, the
program exits.
- --port
- short form: -P; type: int
Port number to use for connection.
- --progress
- type: array; default: time,30
Print progress reports to STDERR. The value is a comma-separated list with
two parts. The first part can be percentage, time, or iterations; the
second part specifies how often an update should be printed, in
percentage, seconds, or number of iterations.
- --query
- type: string
Analyze the specified query instead of reading a log file.
- --read-timeout
- type: time; default: 0
Wait this long for an event from the input; 0 to wait forever.
This option sets the maximum time to wait for an event from the input. If an
event is not received after the specified time, the tool stops reading the
input and prints its reports.
This option requires the Perl POSIX module.
- --run-time
- type: time
How long to run before exiting. The default is to run forever (you can
interrupt with CTRL-C).
- --set-vars
- type: string; default: wait_timeout=10000
Set these MySQL variables. Immediately after connecting to MySQL, this
string will be appended to SET and executed.
- --socket
- short form: -S; type: string
Socket file to use for connection.
- --user
- short form: -u; type: string
User for login if not current user.
- --version
- Show version and exit.
DSN OPTIONS¶
These DSN options are used to create a DSN. Each option is given like
"option=value". The options are case-sensitive, so P and p are not
the same option. There cannot be whitespace before or after the "="
and if the value contains whitespace it must be quoted. DSN options are
comma-separated. See the percona-toolkit manpage for full details.
- •
- A
dsn: charset; copy: yes
Default character set.
- •
- D
copy: no
Default database.
- •
- F
dsn: mysql_read_default_file; copy: no
Only read default options from the given file
- •
- h
dsn: host; copy: yes
Connect to host.
- •
- p
dsn: password; copy: yes
Password to use when connecting.
- •
- P
dsn: port; copy: yes
Port number to use for connection.
- •
- S
dsn: mysql_socket; copy: no
Socket file to use for connection.
- •
- u
dsn: user; copy: yes
User for login if not current user.
ENVIRONMENT¶
The environment variable "PTDEBUG" enables verbose debugging output to
STDERR. To enable debugging and capture all output to a file, run the tool
like:
PTDEBUG=1 pt-table-usage ... > FILE 2>&1
Be careful: debugging output is voluminous and can generate several megabytes of
output.
SYSTEM REQUIREMENTS¶
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.
BUGS¶
For a list of known bugs, see
http://www.percona.com/bugs/pt-table-usage
<
http://www.percona.com/bugs/pt-table-usage>.
Please report bugs at
https://bugs.launchpad.net/percona-toolkit
<
https://bugs.launchpad.net/percona-toolkit>. Include the following
information in your bug report:
- •
- Complete command-line used to run the tool
- •
- Tool "--version"
- •
- MySQL version of all servers involved
- •
- Output from the tool including STDERR
- •
- Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with
"PTDEBUG"; see "ENVIRONMENT".
DOWNLOADING¶
Visit
http://www.percona.com/software/percona-toolkit/
<
http://www.percona.com/software/percona-toolkit/> to download the
latest release of Percona Toolkit. Or, get the latest release from the command
line:
wget percona.com/get/percona-toolkit.tar.gz
wget percona.com/get/percona-toolkit.rpm
wget percona.com/get/percona-toolkit.deb
You can also get individual tools from the latest release:
wget percona.com/get/TOOL
Replace "TOOL" with the name of any tool.
AUTHORS¶
Daniel Nichter
This tool is part of Percona Toolkit, a collection of advanced command-line
tools developed by Percona for MySQL support and consulting. Percona Toolkit
was forked from two projects in June, 2011: Maatkit and Aspersa. Those
projects were created by Baron Schwartz and developed primarily by him and
Daniel Nichter, both of whom are employed by Percona. Visit
<
http://www.percona.com/software/> for more software developed by
Percona.
COPYRIGHT, LICENSE, AND WARRANTY¶
This program is copyright 2012 Percona Inc. Feedback and improvements are
welcome.
THIS PROGRAM IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue `man perlgpl' or `man perlartistic' to read these
licenses.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.
VERSION¶
pt-table-usage 2.1.2