NAME¶
sacct - displays accounting data for all jobs and job steps in the SLURM job
accounting log or SLURM database
SYNOPSIS¶
sacct [OPTIONS...]
DESCRIPTION¶
Accounting information for jobs invoked with SLURM is either logged in the job
accounting log file or saved to the SLURM database.
The sacct command displays job accounting data stored in the job
accounting log file or SLURM database in a variety of forms for your analysis.
By default, the sacct command displays information on jobs, job steps, status,
and exit codes. You can tailor the output with the --format= option, which
specifies the fields to be shown.
For the root user, the sacct command displays job accounting data for all
users, although there are options to filter the output to report only the jobs
from a specified user or group.
For the non-root user, the sacct command limits the display of job
accounting data to jobs that were launched with their own user identifier
(UID) by default. Data for other users can be displayed with the --allusers,
--user, or --uid options.
- Note:
- If designated, the slurmdbd.conf option PrivateData may further restrict
the accounting data visible to users who are not SlurmUser, root, or a
user with AdminLevel=Admin. See the slurmdbd.conf man page for additional
details on restricting access to accounting data.
- Note:
- If the AccountingStorageType is set to
"accounting_storage/filetxt", space characters embedded within
account names, job names, and step names will be replaced by underscores.
If account names with embedded spaces are needed, it is recommended that a
database type of accounting storage be configured.
- Note:
- The contents of SLURM's database are maintained in lower case. This may
result in some sacct output differing from that of other SLURM
commands.
- Note:
- Much of the data reported by sacct has been generated by the
wait3() and getrusage() system calls. Some systems gather
and report incomplete information for these calls; sacct reports
values of 0 for this missing data. See your system's getrusage(2)
man page for information about which data are actually available on your
system.
- Elapsed time fields are presented as
[days-]hours:minutes:seconds[.microseconds]. Only 'CPU' fields will ever
have microseconds.
- The default input file is the file named in the
AccountingStorageLoc parameter in slurm.conf.
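The elapsed-time layout described in the notes above, [days-]hours:minutes:seconds[.microseconds], can be converted to a number of seconds for further analysis. The following is a minimal Python sketch and not part of sacct itself; the function name elapsed_to_seconds is hypothetical, and the sketch assumes that the hours, minutes, and seconds components are always present.

```python
import re

# Matches [days-]hours:minutes:seconds[.microseconds] as described above.
_ELAPSED_RE = re.compile(
    r"^(?:(?P<days>\d+)-)?"
    r"(?P<hours>\d+):(?P<minutes>\d+):(?P<seconds>\d+)"
    r"(?:\.(?P<frac>\d+))?$"
)


def elapsed_to_seconds(text):
    """Convert an elapsed-time string to seconds (hypothetical helper)."""
    match = _ELAPSED_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized elapsed time: {text!r}")
    days = int(match.group("days") or 0)
    hours = int(match.group("hours"))
    minutes = int(match.group("minutes"))
    seconds = int(match.group("seconds"))
    frac = match.group("frac")
    # The fractional part, when present, is only expected on 'CPU' fields.
    fraction = float("0." + frac) if frac else 0.0
    return days * 86400 + hours * 3600 + minutes * 60 + seconds + fraction
```

For instance, elapsed_to_seconds("00:01:30") gives 90 and elapsed_to_seconds("1-02:03:04") gives 93784.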
OPTIONS¶
- -a, --allusers
- Displays all users' jobs when run by user root or if PrivateData is
not configured to jobs. Otherwise displays the current user's
jobs.
- -A account_list, --accounts=account_list
- Displays jobs belonging to the accounts given in the comma-separated
account_list.
- -b, --brief
- Displays a brief listing, which includes the following data: jobid, state,
and exitcode.
- -c, --completion
- Use job completion instead of job accounting. The JobCompType
parameter in the slurm.conf file must be defined to a non-none
option.
- -D, --duplicates
- If SLURM job ids are reset, some job numbers will probably appear more
than once in the accounting log file but refer to different jobs. Such
jobs can be distinguished by the "submit" time stamp in the data
records.
- When data for specific jobs are requested with the --jobs option,
sacct returns the most recent job with that number. This behavior
can be overridden by specifying --duplicates, in which case all records
that match the selection criteria will be returned.
- -e, --helpformat
- Print a list of fields that can be specified with the --format
option.
Fields available:
AllocCPUS Account AssocID AveCPU
AveCPUFreq AveDiskRead AveDiskWrite AvePages
AveRSS AveVMSize BlockID Cluster
Comment ConsumedEnergy CPUTime CPUTimeRAW
DerivedExitCode Elapsed Eligible End
ExitCode GID Group JobID
JobName Layout MaxDiskRead MaxDiskReadNode
MaxDiskReadTask MaxDiskWrite MaxDiskWriteNode MaxDiskWriteTask
MaxPages MaxPagesNode MaxPagesTask MaxRSS
MaxRSSNode MaxRSSTask MaxVMSize MaxVMSizeNode
MaxVMSizeTask MinCPU MinCPUNode MinCPUTask
NCPUS NNodes NodeList NTasks
Priority Partition QOSRAW ReqCPUFreq
ReqCPUs ReqMem Reserved ResvCPU
ResvCPURAW Start State Submit
Suspended SystemCPU Timelimit TotalCPU
UID User UserCPU WCKey
WCKeyID
- The section titled "Job Accounting Fields" describes these
fields.
- -E end_time, --endtime=end_time
- Select jobs in any state before the specified time. If states are given
with the -s option, return jobs that were in those states before this time.
Valid time formats are...
HH:MM[:SS] [AM|PM]
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
MM/DD[/YY]-HH:MM[:SS]
YYYY-MM-DD[THH:MM[:SS]]
- -f file, --file=file
- Causes the sacct command to read job accounting data from the named
file instead of the current SLURM job accounting log file. Only
applicable when running the filetxt plugin.
- -g gid_list, --gid=gid_list, --group=group_list
- Displays the statistics only for the jobs started with the GID or the
GROUP specified by the gid_list or the group_list operand,
which is a comma-separated list. Space characters are not allowed. Default
is no restrictions.
- -h, --help
- Displays a general help message.
- -j job(.step), --jobs=job(.step)
- Displays information about the specified job(.step) or list of
job(.step)s.
- The job(.step) parameter is a comma-separated list of jobs. Space
characters are not permitted in this list. NOTE: A step id of 'batch' will
display the information about the batch step. The batch step information
is only available after the batch job is complete, unlike regular steps,
which are available when they start.
- The default is to display information on all jobs.
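The job(.step) list syntax described above can be checked before it is passed to --jobs. The following is a small Python sketch under that description only; the helper names split_job_step and parse_job_list are hypothetical and are not part of SLURM.

```python
def split_job_step(spec):
    """Split 'job' or 'job.step' into (job_id, step).

    step is None when only a job number is given; a step of 'batch'
    refers to the batch step, as noted above.
    """
    job, sep, step = spec.partition(".")
    if not job.isdigit():
        raise ValueError(f"invalid job number in {spec!r}")
    if sep and not step:
        raise ValueError(f"missing step after '.' in {spec!r}")
    return int(job), (step if sep else None)


def parse_job_list(arg):
    """Parse a comma-separated job(.step) list; spaces are not permitted."""
    if " " in arg:
        raise ValueError("space characters are not permitted in the job list")
    return [split_job_step(item) for item in arg.split(",")]
```

With this sketch, parse_job_list("4,4.0,7.batch") yields [(4, None), (4, "0"), (7, "batch")].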
- -k, --timelimit-min
- Only send data about jobs with this timelimit. If used with --timelimit-max,
this will be the minimum timelimit of the range. Default is no
restriction.
- -K, --timelimit-max
- Ignored by itself, but if --timelimit-min is set this will be the maximum
timelimit of the range. Default is no restriction.
- -l, --long
- Equivalent to specifying:
- --format=jobid,jobname,partition,maxvmsize,maxvmsizenode,maxvmsizetask,
avevmsize,maxrss,maxrssnode,maxrsstask,averss,maxpages,maxpagesnode,
maxpagestask,avepages,mincpu,mincpunode,mincputask,avecpu,ntasks,
alloccpus,elapsed,state,exitcode,maxdiskread,maxdiskreadnode,maxdiskreadtask,
avediskread,maxdiskwrite,maxdiskwritenode,maxdiskwritetask,avediskwrite
- -L, --allclusters
- Display jobs that ran on all clusters. By default, only jobs that ran on the
cluster from which sacct is called are displayed.
- -M cluster_list, --clusters=cluster_list
- Displays the statistics only for the jobs started on the clusters
specified by the cluster_list operand, which is a comma-separated
list of clusters. Space characters are not allowed in the
cluster_list. Use -1 for all clusters. The default is the
cluster on which the sacct command is executed.
- -n, --noheader
- No heading will be added to the output. The default action is to display a
header.
- -N node_list, --nodelist=node_list
- Display jobs that ran on any of these node(s). node_list can be a
ranged string.
- --name=jobname_list
- Display jobs that have any of these name(s).
- -o, --format
- Comma-separated list of fields (use "--helpformat" for a list
of available fields).
NOTE: When using the format option for listing various fields, you can put a
%NUMBER after a field name to specify how many characters should be printed.
For example, format=name%30 will print 30 characters of field name, right
justified, and name%-30 will print 30 characters left justified.
When set, the SACCT_FORMAT environment variable will override the default
format. For example:
SACCT_FORMAT="jobid,user,account,cluster"
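The field%NUMBER convention described above lends itself to being generated programmatically. The following Python sketch builds a value suitable for --format or SACCT_FORMAT; the helper name build_format is hypothetical, and the sketch only assembles the string without validating field names against --helpformat.

```python
def build_format(fields):
    """Build a format string from (field_name, width) pairs.

    width is None for the default width, a positive integer for
    right-justified output, or a negative integer for left-justified
    output, following the %NUMBER convention described above.
    """
    parts = []
    for name, width in fields:
        parts.append(name if width is None else f"{name}%{width}")
    return ",".join(parts)


# Example: default-width jobid, 30-character right-justified jobname,
# 12-character left-justified state.
spec = build_format([("jobid", None), ("jobname", 30), ("state", -12)])
```

Here spec is "jobid,jobname%30,state%-12", which could then be passed as --format=jobid,jobname%30,state%-12.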
- -p, --parsable
- Output will be '|' delimited with a '|' at the end of each line.
- -P, --parsable2
- Output will be '|' delimited without a '|' at the end of each line.
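Output in the --parsable2 layout is convenient to post-process. The following Python sketch reads such text into dictionaries keyed by the header line. The SAMPLE text is hypothetical data written for illustration, not captured sacct output, and the sketch assumes that a header is present (no --noheader) and that no field value contains a '|' character.

```python
import csv
import io


def parse_parsable2(text):
    """Parse '|'-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE
    )
    return list(reader)


# Hypothetical sample in the --parsable2 layout (no trailing '|').
SAMPLE = "JobID|State|ExitCode\n2|RUNNING|0:0\n4.0|COMPLETED|0:0\n"

rows = parse_parsable2(SAMPLE)
```

Each element of rows maps a column name to its string value, so rows[1]["State"] is "COMPLETED" for the sample above.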
- -q, --qos
- Only send data about jobs using these qos. Default is all.
- -r, --partition
- Comma separated list of partitions to select jobs and job steps from. The
default is all partitions.
- -s state_list , --state=state_list
- Selects jobs based on their state during the time period given. Unless
otherwise specified, the start and end time will be the current time when
the --state option is specified and only currently running jobs can
be displayed. A start and/or end time must be specified to view
information about jobs not currently running. The following state
designators are valid and multiple state names may be specified using
comma separators. Either the short or long form of the state name may be
used (e.g. CA or CANCELLED), and the name is case
insensitive (e.g. ca and CA both work).
- BF BOOT_FAIL
- Job terminated due to launch failure, typically due to a hardware failure
(e.g. unable to boot the node or block and the job can not be
requeued).
- CA CANCELLED
- Job was explicitly cancelled by the user or system administrator. The job
may or may not have been initiated.
- CD COMPLETED
- Job has terminated all processes on all nodes.
- CF CONFIGURING
- Job has been allocated resources, but is waiting for them to become ready
for use (e.g. booting).
- CG COMPLETING
- Job is in the process of completing. Some processes on some nodes may
still be active.
- F FAILED
- Job terminated with non-zero exit code or other failure condition.
- NF NODE_FAIL
- Job terminated due to failure of one or more allocated nodes.
- PD PENDING
- Job is awaiting resource allocation. Note that for a job to be selected in
this state, its "EligibleTime" must either fall within the requested time
interval or be different from "Unknown". The "EligibleTime" is
displayed by the "scontrol show job" command. For example, jobs
submitted with the "--hold" option will have
"EligibleTime=Unknown" as they are pending indefinitely.
- PR PREEMPTED
- Job terminated due to preemption.
- R RUNNING
- Job currently has an allocation.
- RS RESIZING
- Job is about to change size.
- S SUSPENDED
- Job has an allocation, but execution has been suspended.
- TO TIMEOUT
- Job terminated upon reaching its time limit.
- The state_list operand is a comma-separated list of these state
designators. Space characters are not allowed in the state_list.
NOTE: When specifying states and no start time is given, the default
start time is 'now'.
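The short and long state designators listed above, together with the case-insensitivity rule, can be captured in a small lookup table. The following Python sketch normalizes a state_list to long names; the names STATE_CODES, normalize_state, and normalize_state_list are hypothetical, and the table contains only the states documented in this section.

```python
# Short designator -> long state name, as listed above.
STATE_CODES = {
    "BF": "BOOT_FAIL",
    "CA": "CANCELLED",
    "CD": "COMPLETED",
    "CF": "CONFIGURING",
    "CG": "COMPLETING",
    "F": "FAILED",
    "NF": "NODE_FAIL",
    "PD": "PENDING",
    "PR": "PREEMPTED",
    "R": "RUNNING",
    "RS": "RESIZING",
    "S": "SUSPENDED",
    "TO": "TIMEOUT",
}


def normalize_state(name):
    """Return the long state name for a short or long, any-case designator."""
    key = name.strip().upper()
    if key in STATE_CODES:
        return STATE_CODES[key]
    if key in STATE_CODES.values():
        return key
    raise ValueError(f"unknown state designator: {name!r}")


def normalize_state_list(arg):
    """Normalize a comma-separated state_list (no spaces allowed)."""
    if " " in arg:
        raise ValueError("space characters are not allowed in the state_list")
    return [normalize_state(item) for item in arg.split(",")]
```

For example, normalize_state_list("ca,RUNNING,pd") returns ["CANCELLED", "RUNNING", "PENDING"].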
- -S, --starttime
- Select jobs in any state after the specified time. Default is 00:00:00 of
the current day, unless '-s' is set, in which case the default is 'now'. If
states are given with the '-s' option, then only jobs in this state at this
time will be returned.
Valid time formats are...
HH:MM[:SS] [AM|PM]
MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
MM/DD[/YY]-HH:MM[:SS]
YYYY-MM-DD[THH:MM[:SS]]
- -T, --truncate
- Truncate time: if a job started before --starttime, its start time is
reported as --starttime. The same applies to the end time and --endtime.
- -u uid_list, --uid=uid_list, --user=user_list
- Use this comma separated list of uids or user names to select jobs to
display. By default, the running user's uid is used.
- --usage
- Display a command usage summary.
- -v, --verbose
- Primarily for debugging purposes, report the state of various variables
during processing.
- -V, --version
- Print version.
- -W wckey_list, --wckeys=wckey_list
- Displays the statistics only for the jobs started on the wckeys specified
by the wckey_list operand, which is a comma-separated list of wckey
names. Space characters are not allowed in the wckey_list. Default
is all wckeys.
- -x associd_list, --associations=assoc_list
- Displays the statistics only for the jobs running under the association
ids specified by the assoc_list operand, which is a comma-separated
list of association ids. Space characters are not allowed in the
assoc_list. Default is all associations.
- -X, --allocations
- Only show cumulative statistics for each job, not the intermediate steps.
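Several of the options above are commonly combined in scripts. The following Python sketch assembles an argument vector from the long option names documented in this section, without executing anything; the function name build_sacct_argv and its parameters are hypothetical, and the resulting list could be handed to a process launcher on a system where sacct is installed.

```python
def build_sacct_argv(start=None, end=None, fields=None,
                     allocations=False, parsable2=False):
    """Assemble a sacct argument list from documented long options."""
    argv = ["sacct"]
    if start is not None:
        argv.append(f"--starttime={start}")
    if end is not None:
        argv.append(f"--endtime={end}")
    if allocations:
        argv.append("--allocations")
    if parsable2:
        argv.append("--parsable2")
    if fields:
        argv.append("--format=" + ",".join(fields))
    return argv


# Times use the documented YYYY-MM-DD[THH:MM[:SS]] form.
argv = build_sacct_argv(
    start="2014-07-03T11:40",
    end="2014-07-03T12:00",
    fields=["jobid", "start", "end", "state"],
    allocations=True,
)
```

Building the list rather than a single shell string avoids quoting problems when field widths such as jobname%30 are included.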
Job Accounting Fields¶
The following describes each job accounting field:
- ALL
- Print all fields listed below.
- AllocCPUs
- Count of allocated CPUs. Equivalent to NCPUS.
- account
- Account the job ran under.
- associd
- Reference to the association of user, account and cluster.
- AveCPU
- Average (system + user) CPU time of all tasks in job.
- AveCPUFreq
- Average weighted CPU frequency of all tasks in job, in kHz.
- AveDiskRead
- Average number of bytes read by all tasks in job.
- AveDiskWrite
- Average number of bytes written by all tasks in job.
- AvePages
- Average number of page faults of all tasks in job.
- AveRSS
- Average resident set size of all tasks in job.
- AveVMSize
- Average Virtual Memory size of all tasks in job.
- blockid
- Block ID, applicable to BlueGene computers only.
- cluster
- Cluster name.
- Comment
- The job's comment string when the AccountingStoreJobComment parameter in
the slurm.conf file is set (or defaults) to YES. The Comment string can be
modified by invoking sacctmgr modify job or the specialized
sjobexitmod command.
- ConsumedEnergy
- Total energy consumed by all tasks in job, in joules. Note: This value
reflects the job's real energy consumption only in the case of an exclusive
job allocation.
- CPUTime
- Formatted (Elapsed time * CPU) count used by a job or step.
- CPUTimeRaw
- Unformatted (Elapsed time * CPU) count for a job or step, in contrast to
CPUTime above. Units are cpu-seconds.
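The relationship between Elapsed, the CPU count, CPUTimeRAW, and CPUTime can be illustrated with a short calculation. The following Python sketch is an illustration of the (Elapsed time * CPU) definition above, not sacct's own code; the function names are hypothetical, and the formatted form assumes the [days-]hours:minutes:seconds layout used for elapsed time fields.

```python
def cputime_raw(elapsed_seconds, cpus):
    """CPU-seconds consumed: elapsed wall-clock seconds times CPU count."""
    return elapsed_seconds * cpus


def format_cputime(cpu_seconds):
    """Format CPU-seconds as [days-]hours:minutes:seconds."""
    days, rest = divmod(int(cpu_seconds), 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    prefix = f"{days}-" if days else ""
    return f"{prefix}{hours:02d}:{minutes:02d}:{seconds:02d}"


# A step that ran for 00:01:30 (90 seconds) on 2 CPUs.
raw = cputime_raw(90, 2)
formatted = format_cputime(raw)
```

In this example raw is 180 cpu-seconds and formatted is "00:03:00".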
- DerivedExitCode
- The highest exit code returned by the job's job steps (srun invocations).
Following the colon is the signal that caused the process to terminate if
it was terminated by a signal. The DerivedExitCode can be modified by
invoking sacctmgr modify job or the specialized sjobexitmod
command.
- elapsed
- The job's elapsed time.
- The format of this field's output is [DD-]hh:mm:ss,
as defined by the following:
- DD
- days
- hh
- hours
- mm
- minutes
- ss
- seconds
- eligible
- When the job became eligible to run.
- end
- Termination time of the job. The output format is YYYY-MM-DDTHH:MM:SS unless
changed through the SLURM_TIME_FORMAT environment variable.
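Time stamps in the default YYYY-MM-DDTHH:MM:SS form can be converted to date objects for comparison or arithmetic. The following Python sketch assumes the default format is in effect (SLURM_TIME_FORMAT unset or set to standard) and treats the literal value "Unknown", which appears in the examples for jobs that are still running, as a missing value. The name parse_timestamp is hypothetical.

```python
from datetime import datetime

# strptime pattern matching the default YYYY-MM-DDTHH:MM:SS output.
STANDARD_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_timestamp(text, fmt=STANDARD_FORMAT):
    """Parse a time stamp; return None for the placeholder 'Unknown'."""
    text = text.strip()
    if text == "Unknown":
        return None
    return datetime.strptime(text, fmt)


started = parse_timestamp("2014-07-03T11:33:16")
ended = parse_timestamp("Unknown")
```

If a different SLURM_TIME_FORMAT is in use, the matching pattern would have to be passed as fmt.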
- exitcode
- The exit code returned by the job script or salloc, typically as set by
the exit() function. Following the colon is the signal that caused the
process to terminate if it was terminated by a signal.
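The code:signal layout described for exitcode and DerivedExitCode can be split into its two numbers as follows. This is a minimal Python sketch; the name parse_exit_code is hypothetical, and a value without a colon (as shown in the EXAMPLES section) is assumed to mean that no signal was recorded.

```python
def parse_exit_code(text):
    """Split 'code:signal' into (exit_code, signal).

    A value with no colon is treated as an exit code with signal 0,
    which is an assumption made for this sketch.
    """
    code, sep, signal = text.strip().partition(":")
    return int(code), (int(signal) if sep else 0)
```

For example, parse_exit_code("1:9") returns (1, 9), and parse_exit_code("2") returns (2, 0).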
- gid
- The group identifier of the user who ran the job.
- group
- The group name of the user who ran the job.
- JobID
- The number of the job or job step. It is in the form: job.jobstep.
- jobname
- The name of the job or job step. The slurm_accounting.log file is a
space delimited file. Because of this if a space is used in the jobname an
underscore is substituted for the space before the record is written to
the accounting file. So when the jobname is displayed by sacct the
jobname that had a space in it will now have an underscore in place of the
space.
- layout
- What the layout of a step was when it was running. This can be used to
give you an idea of which node ran which rank in your job.
- MaxDiskRead
- Maximum number of bytes read by all tasks in job.
- MaxDiskReadNode
- The node on which the maxdiskread occurred.
- MaxDiskReadTask
- The task ID where the maxdiskread occurred.
- MaxDiskWrite
- Maximum number of bytes written by all tasks in job.
- MaxDiskWriteNode
- The node on which the maxdiskwrite occurred.
- MaxDiskWriteTask
- The task ID where the maxdiskwrite occurred.
- MaxPages
- Maximum number of page faults of all tasks in job.
- MaxPagesNode
- The node on which the maxpages occurred.
- MaxPagesTask
- The task ID where the maxpages occurred.
- MaxRSS
- Maximum resident set size of all tasks in job.
- MaxRSSNode
- The node on which the maxrss occurred.
- MaxRSSTask
- The task ID where the maxrss occurred.
- MaxVMSize
- Maximum Virtual Memory size of all tasks in job.
- MaxVMSizeNode
- The node on which the maxvmsize occurred.
- MaxVMSizeTask
- The task ID where the maxvmsize occurred.
- MinCPU
- Minimum (system + user) CPU time of all tasks in job.
- MinCPUNode
- The node on which the mincpu occurred.
- MinCPUTask
- The task ID where the mincpu occurred.
- ncpus
- Total number of CPUs allocated to the job. Equivalent to AllocCPUS.
- nodelist
- List of nodes in job/step.
- nnodes
- Number of nodes in a job or step.
- NTasks
- Total number of tasks in a job or step.
- priority
- Slurm priority.
- partition
- Identifies the partition on which the job ran.
- qos
- Name of Quality of Service.
- qosraw
- Id of Quality of Service.
- ReqCPUFreq
- Requested CPU frequency for the step, in kHz. Note: This value applies
only to a job step. No value is reported for the job.
- reqcpus
- Required CPUs.
- ReqMem
- Minimum required memory for the job, in MB. A 'c' at the end of the number
represents Memory Per CPU; an 'n' represents Memory Per Node. Note: This
value is only from the job allocation, not the step.
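The 'c' and 'n' suffixes on ReqMem determine how the per-unit figure scales to a whole job. The following Python sketch assumes values exactly of the form described above, a number followed by 'c' or 'n'; other spellings that a particular version might print are not handled. The function names are hypothetical.

```python
import re

# A number in MB followed by 'c' (per CPU) or 'n' (per node).
_REQMEM_RE = re.compile(r"^(?P<amount>\d+(?:\.\d+)?)(?P<scope>[cn])$")


def parse_reqmem(text):
    """Return (megabytes, scope) where scope is 'per_cpu' or 'per_node'."""
    match = _REQMEM_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized ReqMem value: {text!r}")
    scope = "per_cpu" if match.group("scope") == "c" else "per_node"
    return float(match.group("amount")), scope


def total_reqmem_mb(text, cpus, nodes):
    """Scale the per-CPU or per-node request to the whole allocation."""
    amount, scope = parse_reqmem(text)
    return amount * (cpus if scope == "per_cpu" else nodes)
```

For example, total_reqmem_mb("2000c", cpus=4, nodes=1) gives 8000.0 MB, while total_reqmem_mb("16000n", cpus=4, nodes=2) gives 32000.0 MB.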
- reserved
- How much wall clock time was used as reserved time for this job. This is
derived from how long a job was waiting from eligible time to when it
actually started.
- resvcpu
- Formatted time for how long (cpu secs) a job was reserved for.
- resvcpuraw
- Reserved CPU time in seconds, unformatted.
- start
- Initiation time of the job in the same format as end.
- state
- Displays the job status, or state.
Output can be RUNNING, RESIZING, SUSPENDED, COMPLETED, CANCELLED, FAILED,
TIMEOUT, PREEMPTED, BOOT_FAIL or NODE_FAIL. If more information is
available on the job state than will fit into the current field width (for
example, the uid that CANCELLED a job) the state will be followed by a
"+". You can increase the size of the displayed state using the
"%NUMBER" format modifier described earlier.
- submit
- The time and date stamp (in Universal Time Coordinated, UTC) the job was
submitted. The format of the output is identical to that of the end field.
NOTE: If a job is requeued, the submit time is reset. To obtain the original
submit time it is necessary to use the -D or --duplicates option to display
all duplicate entries for a job.
- suspended
- How long the job was suspended for.
- SystemCPU
- The amount of system CPU time used by the job or job step. The format of
the output is identical to that of the elapsed field.
NOTE: SystemCPU provides a measure of the task's parent process and does not
include CPU time of child processes.
- timelimit
- What the timelimit was/is for the job.
- TotalCPU
- The sum of the SystemCPU and UserCPU time used by the job or job step. The
total CPU time of the job may exceed the job's elapsed time for jobs that
include multiple job steps. The format of the output is identical to that
of the elapsed field.
NOTE: TotalCPU provides a measure of the task's parent process and does not
include CPU time of child processes.
- uid
- The user identifier of the user who ran the job.
- user
- The user name of the user who ran the job.
- UserCPU
- The amount of user CPU time used by the job or job step. The format of the
output is identical to that of the elapsed field.
NOTE: UserCPU provides a measure of the task's parent process and does not
include CPU time of child processes.
- wckey
- Workload Characterization Key. Arbitrary string for grouping orthogonal
accounts together.
- wckeyid
- Reference to the wckey.
ENVIRONMENT VARIABLES¶
Some sacct options may be set via environment variables. These
environment variables, along with their corresponding options, are listed
below. (Note: Command-line options will always override these settings.)
- SLURM_TIME_FORMAT
- Specify the format used to report time stamps. A value of standard,
the default value, generates output in the form
"year-month-dateThour:minute:second". A value of relative
returns only "hour:minute:second" if the current day. For other
dates in the current year it prints the "hour:minute" preceded
by "Tomorr" (tomorrow), "Ystday" (yesterday), the name
of the day for the coming week (e.g. "Mon", "Tue",
etc.), otherwise the date (e.g. "25 Apr"). For other years it
returns a date month and year without a time (e.g. "6 Jun
2012"). All of the time stamps use a 24 hour format.
A valid strftime() format can also be specified. For example, a value of
"%a %T" will report the day of the week and a time stamp (e.g.
"Mon 12:34:56").
EXAMPLES¶
This example illustrates the default invocation of the sacct command:
# sacct
Jobid      Jobname    Partition  Account    AllocCPUS  State      ExitCode
---------- ---------- ---------- ---------- ---------- ---------- --------
2          script01   srun       acct1      1          RUNNING    0
3          script02   srun       acct1      1          RUNNING    0
4          endscript  srun       acct1      1          RUNNING    0
4.0                   srun       acct1      1          COMPLETED  0
This example shows the same job accounting information with the brief option.
# sacct --brief
Jobid State ExitCode
---------- ---------- --------
2 RUNNING 0
3 RUNNING 0
4 RUNNING 0
4.0 COMPLETED 0
# sacct --allocations
Jobid Jobname Partition Account AllocCPUS State ExitCode
---------- ---------- ---------- ---------- ------- ---------- --------
3 sja_init andy acct1 1 COMPLETED 0
4 sjaload andy acct1 2 COMPLETED 0
5 sja_scr1 andy acct1 1 COMPLETED 0
6 sja_scr2 andy acct1 18 COMPLETED 2
7 sja_scr3 andy acct1 18 COMPLETED 0
8 sja_scr5 andy acct1 2 COMPLETED 0
9 sja_scr7 andy acct1 90 COMPLETED 1
10 endscript andy acct1 186 COMPLETED 0
This example demonstrates the ability to customize the output of the
sacct command. The fields are displayed in the order designated on the
command line.
# sacct --format=jobid,elapsed,ncpus,ntasks,state
Jobid Elapsed Ncpus Ntasks State
---------- ---------- ---------- -------- ----------
3 00:01:30 2 1 COMPLETED
3.0 00:01:30 2 1 COMPLETED
4 00:00:00 2 2 COMPLETED
4.0 00:00:01 2 2 COMPLETED
5 00:01:23 2 1 COMPLETED
5.0 00:01:31 2 1 COMPLETED
This example demonstrates the use of the -T (--truncate) option when used with
-S (--starttime) and -E (--endtime). When the -T option is used, the start
time of the job will be the specified -S value if the job was started before
the specified time, otherwise the time will be the job's start time. The end
time will be the specified -E option if the job ends after the specified time,
otherwise it will be the job's end time.
NOTE: If no -s (--state) option is given, sacct will display jobs that ran
during the specified time; otherwise it returns jobs that were in the state
requested during that period of time.
Without -T (normal operation) sacct output would be like this.
# sacct -S2014-07-03-11:40 -E2014-07-03-12:00 -X -ojobid,start,end,state
JobID Start End State
--------- --------------------- -------------------- ------------
2 2014-07-03T11:33:16 2014-07-03T11:59:01 COMPLETED
3 2014-07-03T11:35:21 Unknown RUNNING
4 2014-07-03T11:35:21 2014-07-03T11:45:21 COMPLETED
5 2014-07-03T11:41:01 Unknown RUNNING
By adding the -T option the job's start and end times are truncated to reflect
only the time requested. If a job started after the start time requested or
finished before the end time requested those times are not altered. The -T
option is useful when determining exact run times during any given period.
# sacct -T -S2014-07-03-11:40 -E2014-07-03-12:00 -X -ojobid,start,end,state
JobID Start End State
--------- --------------------- -------------------- ------------
2 2014-07-03T11:40:00 2014-07-03T11:59:01 COMPLETED
3 2014-07-03T11:40:00 2014-07-03T12:00:00 RUNNING
4 2014-07-03T11:40:00 2014-07-03T11:45:21 COMPLETED
5 2014-07-03T11:41:01 2014-07-03T12:00:00 RUNNING
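The truncation shown in the two listings above amounts to clamping each job's interval to the requested window. The following Python sketch reproduces that arithmetic for the jobs in the example; it illustrates the documented behavior and is not sacct's implementation. The name truncate_interval is hypothetical, and an end time of None stands for the "Unknown" end of a job that is still running.

```python
from datetime import datetime


def truncate_interval(job_start, job_end, window_start, window_end):
    """Clamp a job's start and end times to the requested window."""
    start = max(job_start, window_start)
    # A still-running job (no end time) is reported as ending at the window end.
    end = window_end if job_end is None else min(job_end, window_end)
    return start, end


window_start = datetime(2014, 7, 3, 11, 40, 0)
window_end = datetime(2014, 7, 3, 12, 0, 0)

# Job 2 from the example: started before the window, ended inside it.
job2 = truncate_interval(
    datetime(2014, 7, 3, 11, 33, 16),
    datetime(2014, 7, 3, 11, 59, 1),
    window_start,
    window_end,
)

# Job 5 from the example: started inside the window and is still running.
job5 = truncate_interval(
    datetime(2014, 7, 3, 11, 41, 1), None, window_start, window_end
)
```

The results match the truncated listing: job 2 runs from 11:40:00 to 11:59:01, and job 5 from 11:41:01 to 12:00:00.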
COPYING¶
Copyright (C) 2005-2007 Copyright Hewlett-Packard Development Company L.P.
Copyright (C) 2008-2010 Lawrence Livermore National Security. Produced at
Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2014 SchedMD LLC.
This file is part of SLURM, a resource management program. For details, see
<http://slurm.schedmd.com/>.
SLURM is free software; you can redistribute it and/or modify it under the terms
of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
SLURM is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.
FILES¶
- /etc/slurm.conf
- Entries to this file enable job accounting and designate the job
accounting log file that collects system job accounting.
- /var/log/slurm_accounting.log
- The default job accounting log file. By default, this file is set to read
and write permission for root only.
SEE ALSO¶
sstat(1), ps(1), srun(1), squeue(1), getrusage(2), time(2)