PERF_4.12-STAT(1) | perf Manual | PERF_4.12-STAT(1) |
NAME¶
perf-stat - Run a command and gather performance counter statistics
SYNOPSIS¶
perf stat [-e <EVENT> | --event=EVENT] [-a] <command>
perf stat [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]
perf stat [-e <EVENT> | --event=EVENT] [-a] record [-o file] -- <command> [<options>]
perf stat report [-i file]
DESCRIPTION¶
This command runs a command and gathers performance counter statistics from it.
OPTIONS¶
<command>...
record
report
-e, --event=
-i, --no-inherit
-p, --pid=<pid>
-t, --tid=<tid>
-a, --all-cpus
-c, --scale
-d, --detailed
-d: detailed events, L1 and LLC data cache
-d -d: more detailed events, dTLB and iTLB events
-d -d -d: very detailed events, adding prefetch events
-r, --repeat=<n>
-B, --big-num
-C, --cpu=
-A, --no-aggr
-n, --null
-v, --verbose
-x SEP, --field-separator SEP
-G name, --cgroup name
-o file, --output file
--append
--log-fd
--pre, --post
perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' -- make -s -j64 O=defconfig-build/ bzImage
-I msecs, --interval-print msecs
--metric-only
--per-socket
--per-core
--per-thread
-D msecs, --delay msecs
-T, --transaction
STAT RECORD¶
Stores stat data into perf data file.
-o file, --output file
STAT REPORT¶
Reads and reports stat data from perf data file.
-i file, --input file
--per-socket
--per-core
-A, --no-aggr
--topdown
Frontend bound means that the CPU cannot fetch and decode instructions fast enough. Backend bound means that computation or memory access is the bottleneck. Bad Speculation means that the CPU wasted cycles due to branch mispredictions and similar issues. Retiring means that the CPU computed without an apparent bottleneck. The reported bottleneck is only the real bottleneck if the workload is actually bound by the CPU and not by something else.
For best results it is usually a good idea to use it with interval mode like -I 1000, as the bottleneck of workloads can change often.
The top down metrics are collected per core instead of per CPU thread. Per core mode is automatically enabled and -a (global monitoring) is needed, requiring root rights or kernel.perf_event_paranoid=-1.
Topdown uses the full Performance Monitoring Unit, and for best results the NMI watchdog should be disabled (as root): echo 0 > /proc/sys/kernel/nmi_watchdog. Otherwise the bottlenecks may be inconsistent on workloads with changing phases.
This enables --metric-only, unless overridden with --no-metric-only.
To interpret the results, it is usually necessary to know which CPUs the workload runs on. If needed, the CPUs can be forced using taskset.
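The prerequisites described above can be checked before a --topdown run. The following is a minimal sketch and not part of perf: the helper name topdown_check is hypothetical, and it assumes the usual procfs locations /proc/sys/kernel/nmi_watchdog and /proc/sys/kernel/perf_event_paranoid.

```shell
#!/bin/sh
# topdown_check UID NMI_WATCHDOG PARANOID   (hypothetical helper, not part of perf)
# Prints one line per unmet prerequisite named in the text above,
# or "ok" if none is found. Returns non-zero if anything was reported.
topdown_check() {
    tc_uid=$1
    tc_nmi=$2
    tc_paranoid=$3
    tc_problems=0
    # -a (global monitoring) needs root rights or perf_event_paranoid=-1.
    if [ "$tc_uid" != "0" ] && [ "$tc_paranoid" != "-1" ]; then
        echo "need root or perf_event_paranoid=-1"
        tc_problems=1
    fi
    # Topdown uses the full PMU, so the NMI watchdog should be off.
    if [ "$tc_nmi" != "0" ]; then
        echo "nmi_watchdog is not 0"
        tc_problems=1
    fi
    [ "$tc_problems" -eq 0 ] && echo "ok"
    return "$tc_problems"
}

# Read the live values; fall back to "unknown" if a file is missing.
nmi=$(cat /proc/sys/kernel/nmi_watchdog 2>/dev/null || echo unknown)
paranoid=$(cat /proc/sys/kernel/perf_event_paranoid 2>/dev/null || echo unknown)
topdown_check "$(id -u)" "$nmi" "$paranoid" || true
```

If the check prints ok, a run such as perf stat --topdown -a -I 1000 -- <command> should meet the conditions listed above.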
--no-merge
EXAMPLES¶
$ perf stat -- make -j
Performance counter stats for 'make -j':
    8117.370256  task clock ticks     #      11.281 CPU utilization factor
            678  context switches     #       0.000 M/sec
            133  CPU migrations       #       0.000 M/sec
         235724  pagefaults           #       0.029 M/sec
    24821162526  CPU cycles           #    3057.784 M/sec
    18687303457  instructions         #    2302.138 M/sec
      172158895  cache references     #      21.209 M/sec
       27075259  cache misses         #       3.335 M/sec
Wall-clock time elapsed: 719.554352 msecs
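The derived figures in this output can be reproduced by hand. The sketch below is illustrative only: ratio is a hypothetical helper (not part of perf) that divides two numbers with awk, and all inputs are copied from the example above. It assumes the CPU utilization factor is the task clock divided by the wall-clock time, which agrees with the 11.281 shown.

```shell
#!/bin/sh
# ratio NUM DEN DIGITS: print NUM/DEN with DIGITS decimal places.
# Hypothetical helper for illustration; it is not part of perf.
ratio() {
    awk -v n="$1" -v d="$2" -v p="$3" 'BEGIN { printf("%." p "f\n", n / d) }'
}

# CPU utilization factor: task clock (msecs) / wall-clock time (msecs).
echo "CPU utilization factor:  $(ratio 8117.370256 719.554352 3)"   # 11.281

# Instructions per cycle: instructions / CPU cycles.
echo "instructions per cycle:  $(ratio 18687303457 24821162526 3)"  # 0.753

# Fraction of cache references that missed.
echo "cache miss ratio:        $(ratio 27075259 172158895 3)"       # 0.157
```

A utilization factor above 1 means that, on average, more than one CPU was busy during the run, as expected for make -j.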
CSV FORMAT¶
With -x, perf stat can produce not-quite-CSV output: commas in the output are not put into "". To make the output easy to parse, it is recommended to use a different separator character, such as -x \;
The fields are in this order:
Additional metrics may be printed with all earlier fields being empty.
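Output produced with -x \; can be processed with standard text tools. The following is a minimal sketch under stated assumptions: csv_counter is a hypothetical helper (not part of perf), and it assumes that in a plain run the first field is the counter value and the third field is the event name. The sample lines are illustrative only; the counts are borrowed from the EXAMPLES section and trailing fields are omitted.

```shell
#!/bin/sh
# csv_counter EVENT: read "perf stat -x \;" lines on stdin and print the
# counter value recorded for EVENT. Hypothetical helper, not part of perf.
# Assumption: field 1 is the counter value and field 3 is the event name.
csv_counter() {
    # Skip lines whose value is not numeric, such as "<not supported>".
    awk -F';' -v ev="$1" '$3 == ev && $1 ~ /^[0-9.]+$/ { print $1 }'
}

# Illustrative input only; real output carries further fields per line.
sample='24821162526;;cycles
18687303457;;instructions
<not supported>;;cache-misses'

printf '%s\n' "$sample" | csv_counter instructions
```

For a real run, the statistics can be written to a file with -o file and then fed to the helper, for example csv_counter instructions < file.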
SEE ALSO¶
perf_4.12-top(1), perf_4.12-list(1)
2017-09-28 | perf |