PEGASUS-ANALYZER(1) | Pegasus Manual | PEGASUS-ANALYZER(1) |
NAME
pegasus-analyzer - debugs a workflow.
SYNOPSIS
pegasus-analyzer [--help|-h] [--quiet|-q] [--strict|-s]
[--monitord|-m|-t] [--verbose|-v]
[--output-dir|-o output_dir]
[--dag dag_filename] [--dir|-d|-i input_dir]
[--print|-p print_options] [--type workflow_type]
[--debug-job job] [--debug-dir debug_dir]
[--local-executable local user executable]
[--conf|-c property_file] [--files]
[--top-dir dir_name] [--recurse|-r]
[workflow_directory]
DESCRIPTION
pegasus-analyzer is a command-line utility for parsing the jobstate.log file and reporting successful and failed jobs. When executed without any options, it queries the SQLite or MySQL database and retrieves failed-job information for the given workflow. When invoked with the --files option, it retrieves information from several log files, isolates the jobs that did not complete successfully, and prints their stdout and stderr so that users can get detailed information about their workflow runs.
OPTIONS
-h, --help
-q, --quiet
-s, --strict
-m, -t, --monitord
-v, --verbose
-o output_dir, --output-dir output_dir
--dag dag_filename
-d input_dir, -i input_dir, --dir input_dir
-p print_options, --print print_options
--debug-job job
--debug-dir debug_dir
--local-executable local user executable
--type workflow_type
-c property_file, --conf property_file
--files
--top-dir dir_name
-r, --recurse
ENVIRONMENT VARIABLES
pegasus-analyzer does not require any environment variables to be set. It locates its required Python modules based on its own location, and therefore should not be moved outside of Pegasus' bin directory.
EXAMPLE
The simplest way to use pegasus-analyzer is to change to the workflow's run directory and invoke the analyzer:
$ pegasus-analyzer .
which will cause pegasus-analyzer to print information about the workflow in the current directory.
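The same invocation can be scripted. The following is a minimal Python sketch, not part of Pegasus: the helper names build_command and run_analyzer are hypothetical, and the sketch only assumes that pegasus-analyzer is on the PATH and accepts options before the workflow directory, as the SYNOPSIS shows.

```python
import shutil
import subprocess


def build_command(run_dir=".", quiet=False):
    """Return the argv list for analyzing the workflow in run_dir.

    Hypothetical helper; only the --quiet flag and the trailing
    workflow_directory argument from the SYNOPSIS are used.
    """
    cmd = ["pegasus-analyzer"]
    if quiet:
        cmd.append("--quiet")
    cmd.append(run_dir)
    return cmd


def run_analyzer(run_dir=".", quiet=False):
    """Run pegasus-analyzer and return its stdout as text.

    Returns None when the executable cannot be found on the PATH.
    """
    if shutil.which("pegasus-analyzer") is None:
        return None
    result = subprocess.run(
        build_command(run_dir, quiet),
        capture_output=True,
        text=True,
        check=False,  # the exit status is left for the caller to interpret
    )
    return result.stdout
```

Calling run_analyzer(".") from the run directory corresponds to the command shown above.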
The pegasus-analyzer output contains a summary, followed by detailed information about each job that either failed or is in an unknown state. Here is the summary section of the output:
**************************Summary***************************
Total jobs : 75 (100.00%)
# jobs succeeded : 41 (54.67%)
# jobs failed : 0 (0.00%)
# jobs unsubmitted : 33 (44.00%)
# jobs unknown : 1 (1.33%)
jobs_succeeded are jobs that have completed successfully. jobs_failed are jobs that have finished, but did not complete successfully. jobs_unsubmitted are jobs that are listed in the dag_file, but for which no information was found in the jobstate.log file. Finally, jobs_unknown are jobs that have started, but have not reached completion.
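For scripts that need these counts, the summary can be parsed mechanically. The sketch below is illustrative only: parse_summary is a hypothetical helper, and the regular expression assumes the exact "label : count (percent%)" layout of the sample summary above, which may differ between Pegasus versions.

```python
import re

# Matches lines such as "# jobs failed : 0 (0.00%)" or "Total jobs : 75 (100.00%)".
SUMMARY_LINE = re.compile(
    r"^(?:#\s*)?(?P<label>[A-Za-z ]+?)\s*:\s*(?P<count>\d+)\s*\((?P<pct>[\d.]+)%\)\s*$"
)


def parse_summary(text):
    """Return a dict mapping lower-cased summary labels to job counts."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            counts[match.group("label").strip().lower()] = int(match.group("count"))
    return counts


sample = """\
**************************Summary***************************
Total jobs : 75 (100.00%)
# jobs succeeded : 41 (54.67%)
# jobs failed : 0 (0.00%)
# jobs unsubmitted : 33 (44.00%)
# jobs unknown : 1 (1.33%)
"""

counts = parse_summary(sample)
# The four categories partition the total: 41 + 0 + 33 + 1 = 75.
categories = ("jobs succeeded", "jobs failed", "jobs unsubmitted", "jobs unknown")
category_total = sum(counts[name] for name in categories)
```

In the sample, the four categories add up to the total of 75 jobs, which gives a simple consistency check on the parsed values.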
After the summary section, pegasus-analyzer displays information about each job in the jobs_failed and jobs_unknown categories.
******************Failed jobs' details**********************
=======================findrange_j3=========================
last state: POST_SCRIPT_FAILURE
site: local
submit file: /home/user/diamond-submit/findrange_j3.sub
output file: /home/user/diamond-submit/findrange_j3.out.000
error file: /home/user/diamond-submit/findrange_j3.err.000
--------------------Task #1 - Summary-----------------------
site : local
hostname : server-machine.domain.com
executable : (null)
arguments : -a findrange -T 60 -i f.b2 -o f.c2
error : 2
working dir :
In the example above, the findrange_j3 job has failed. The analyzer shows that the job finished with a POST_SCRIPT_FAILURE and lists the submit, output, and error files for this job. Whenever pegasus-analyzer detects that the output file contains a kickstart record, it displays a breakdown of each task in the job (in this case there is only one task). Because pegasus-analyzer was not invoked with the --quiet flag, it also displays the contents of the output and error files (or the stdout and stderr sections of the kickstart record), which in this case are both empty.
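The job-level fields of such a block can be collected into a dictionary. This is a rough Python sketch under stated assumptions: parse_job_fields is a hypothetical helper, each field is assumed to sit on its own "key: value" line, and parsing stops at the first dashed task separator so that task-level fields such as the second "site" entry do not overwrite job-level ones.

```python
def parse_job_fields(block):
    """Return the job-level 'key: value' fields of one failed-job block."""
    fields = {}
    for line in block.splitlines():
        line = line.strip()
        if line.startswith("-"):
            # Dashed line such as "-----Task #1 - Summary-----": job header is over.
            break
        if line.startswith("=") or ":" not in line:
            # Skip the "=====job_name=====" banner and any blank lines.
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


sample_block = """\
=======================findrange_j3=========================
last state: POST_SCRIPT_FAILURE
site: local
submit file: /home/user/diamond-submit/findrange_j3.sub
output file: /home/user/diamond-submit/findrange_j3.out.000
error file: /home/user/diamond-submit/findrange_j3.err.000
--------------------Task #1 - Summary-----------------------
site : local
hostname : server-machine.domain.com
"""

job = parse_job_fields(sample_block)
```

With the sample block, job["last state"] holds the final state reported for findrange_j3, and the file entries point at the submit, output, and error files named in the listing.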
For SUBDAG and subdax jobs, pegasus-analyzer indicates that the job contains sub-workflows and shows the command the user needs to debug that sub-workflow. For example:
=================subdax_black_ID000009=====================
last state: JOB_FAILURE
site: local
submit file: /home/user/run1/subdax_black_ID000009.sub
output file: /home/user/run1/subdax_black_ID000009.out
error file: /home/user/run1/subdax_black_ID000009.err
This job contains sub workflows!
Please run the command below for more information:
pegasus-analyzer -d /home/user/run1/blackdiamond_ID000009.000
-----------------subdax_black_ID000009.out-----------------
Executing condor dagman ...
-----------------subdax_black_ID000009.err-----------------
This output tells the user that the subdax_black_ID000009 sub-workflow failed, and that it can be debugged by using the indicated pegasus-analyzer command.
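When many sub-workflows fail, the suggested commands can be pulled out of the analyzer output automatically. The following Python sketch is an illustration rather than a Pegasus feature: suggested_commands is a hypothetical helper, and it assumes the hint always has the form "pegasus-analyzer -d <directory>" at the start of a line, as in the example above.

```python
import re
import shlex

# A line that starts with the follow-up command printed for sub-workflow jobs.
HINT = re.compile(r"^[ \t]*(pegasus-analyzer[ \t]+-d[ \t]+\S+)", re.MULTILINE)


def suggested_commands(output):
    """Return each suggested follow-up command as an argv list."""
    return [shlex.split(command) for command in HINT.findall(output)]


sample_output = """\
This job contains sub workflows!
Please run the command below for more information:
pegasus-analyzer -d /home/user/run1/blackdiamond_ID000009.000
-----------------subdax_black_ID000009.out-----------------
"""

commands = suggested_commands(sample_output)
```

Each returned list can be passed directly to a process launcher, which avoids re-parsing the command through a shell.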
SEE ALSO
pegasus-status(1), pegasus-monitord(1), pegasus-statistics(1).
AUTHORS
Fabio Silva <fabio at isi dot edu>
Karan Vahi <vahi at isi dot edu>
Pegasus Team http://pegasus.isi.edu
11/09/2018 | Pegasus 4.4.0 |