PEGASUS-PLAN(1) | Pegasus Manual | PEGASUS-PLAN(1)
NAME
pegasus-plan - runs Pegasus to generate the executable workflow
SYNOPSIS
pegasus-plan [-v] [-q] [-V] [-h]
[-Dprop=value...] [-b prefix]
[--conf propsfile]
[-c cachefile[,cachefile...]] [--cleanup cleanup strategy]
[-C style[,style...]]
[--dir dir]
[--force] [--force-replan]
[--inherited-rc-files] [-j prefix]
[-n] [-I input-dir] [-O output-dir] [-o site]
[-s site1[,site2...]]
[--staging-site s1=ss1[,s2=ss2[..]]]
[--randomdir[=dirname]]
[--relative-dir dir]
[--relative-submit-dir dir]
-d daxfile
DESCRIPTION
The pegasus-plan command takes in as input the DAX and generates an executable workflow, usually in the form of Condor submit files, which can be submitted to an execution site for execution.
As part of generating an executable workflow, the planner needs to discover:
data
The planner looks up a Replica Catalog to discover the locations of the input files required by the jobs in the workflow, and adds data transfer nodes at the appropriate places.
The Pegasus Workflow Planner also tries to reduce the workflow, unless specified otherwise. This is done by deleting the jobs whose output files have been found in some location in the Replica Catalog. At present no cost metrics are used. However, preference is given to a location corresponding to the execution site.
The planner can also add nodes to transfer all the materialized files to an output site. The location on the output site is determined by looking up the site catalog file, the path to which is picked up from the pegasus.catalog.site.file property value.
executables
The planner looks up a Transformation Catalog to discover the locations of the executables referred to in the workflow.
resources
The layout of the sites on which the jobs can execute is discovered by looking up the Site Catalog.
The data and executable locations can now be specified in DAXes conforming to DAX schema version 3.2 or higher.
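A minimal invocation might look like the following sketch; the DAX file name, submit directory, and compute site name are placeholders and assume that the replica, transformation, and site catalogs have already been configured:
pegasus-plan --conf pegasus.properties \
    -d workflow.dax \
    --dir submit \
    --sites compute-site \
    -o local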
OPTIONS
Each option is displayed with its long option synonym(s).
-Dproperty=value
Sets a Pegasus property from the command line, overriding the value (if any) picked up from the properties files. The option may be given multiple times.
-d file, --dax file
The path to the DAX file that is to be planned. This option is required.
-b prefix, --basename prefix
-c file[,file,...], --cache file[,file,...]
Each entry in the cache file describes an LFN, the corresponding PFN, and the associated attributes. The pool attribute should be specified for each entry.
LFN_1 PFN_1 pool=[site handle 1]
LFN_2 PFN_2 pool=[site handle 2]
...
LFN_N PFN_N pool=[site handle N]
To treat the cache files as supplemental replica catalogs, set the property pegasus.catalog.replica.cache.asrc to true. This results in the mappings in the cache files being merged with the mappings in the replica catalog. Thus, for a particular LFN, both the entries in the cache file and in the replica catalog are available for replica selection.
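For example, to pass a cache file from a previous run and have its mappings treated as a supplemental replica catalog, the -c option can be combined with the property above; the file paths are placeholders:
pegasus-plan -Dpegasus.catalog.replica.cache.asrc=true \
    -d workflow.dax -c /path/to/prior-run.cache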
-C style[,style,...], --cluster style[,style,...]
The clustered jobs can be run at the remote site, either sequentially or by using MPI. This can be specified by setting the property pegasus.job.aggregator. The property can be overridden by associating the PEGASUS profile key collapser either with the transformation in the transformation catalog or with the execution site in the site catalog. The value specified (to the property or the profile) is the logical name of the transformation that is to be used for clustering jobs. Note that clustering will only happen if the corresponding transformations are catalogued in the transformation catalog.
PEGASUS ships with a clustering executable pegasus-cluster that can be found in the $PEGASUS_HOME/bin directory. It runs the jobs in the clustered job sequentially on the same node at the remote site.
In addition, an MPI based clustering tool called pegasus-mpi-cluster is also distributed and can be found in the bin directory. pegasus-mpi-cluster can also be used in the sharedfs setup and needs to be compiled against the MPI installation on the remote site. The wrapper is run on every MPI node, with the first one acting as the master and the rest as workers.
By default, pegasus-cluster is used for clustering jobs unless overridden in the properties or by the pegasus profile key collapser.
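For instance, MPI based clustering could be requested by setting the property below in the properties file; the value mpiexec is an assumption about the logical transformation name under which pegasus-mpi-cluster is catalogued at your sites:
pegasus.job.aggregator = mpiexec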
The following types of clustering styles are currently supported:
horizontal
Horizontal Clustering can operate in one of two modes: (a) job count based, and (b) runtime based.
The granularity of clustering can be specified by associating either the PEGASUS profile key clusters.size or the PEGASUS profile key clusters.num with the transformation.
The clusters.size key indicates how many jobs need to be clustered into the larger clustered job. The clusters.num key indicates how many clustered jobs are to be created for a particular level at a particular execution site. If both keys are specified for a particular transformation, then the clusters.num key value is used to determine the clustering granularity.
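As a sketch, the clusters.size key can be associated with a job in the DAX as a PEGASUS profile, and horizontal clustering requested with -C horizontal; the job attributes, site name, and the value 5 are illustrative:
<job id="ID0000001" namespace="example" name="findrange" version="1.0">
    <profile namespace="pegasus" key="clusters.size">5</profile>
</job>
pegasus-plan -d workflow.dax -C horizontal --sites compute-site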
To cluster jobs according to runtimes, the user needs to set one property and two profile keys. The property pegasus.clusterer.preference must be set to the value runtime. In addition, the user needs to specify two PEGASUS profiles: (a) clusters.maxruntime, which specifies the maximum duration for which the clustered job should run, and (b) job.runtime, which specifies the duration for which the associated job runs. Ideally, clusters.maxruntime should be set in the transformation catalog and job.runtime should be set for each job individually.
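A sketch of the runtime based setup; the values are illustrative and assumed to be in seconds. The property goes in the properties file, while clusters.maxruntime would typically be associated with the transformation in the transformation catalog and job.runtime with each job in the DAX:
pegasus.clusterer.preference = runtime
<profile namespace="pegasus" key="clusters.maxruntime">600</profile>
<profile namespace="pegasus" key="job.runtime">120</profile>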
label
To label the workflow, you need to associate PEGASUS profiles with the jobs in the DAX. The profile key to use for labeling the workflow can be set by the property pegasus.clusterer.label.key. It defaults to label, meaning that if jobs carry the PEGASUS profile key label, the jobs with the same value for that key will go into the same clustered job.
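As a sketch, label based clustering can be requested by tagging jobs in the DAX with the label profile key and invoking the planner with -C label; the label value stage-1 and the site name are arbitrary:
<profile namespace="pegasus" key="label">stage-1</profile>
pegasus-plan -d workflow.dax -C label --sites compute-site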
--cleanup cleanup strategy
The following types of cleanup strategies are currently supported:
--conf propfile
--dir dir
By default, the base directory is the directory from which one runs the pegasus-plan command.
-f, --force
--force-replan
-g, --group
-h, --help
--inherited-rc-files file[,file,...]
-I, --input-dir
pegasus.catalog.replica.directory.site specifies the pool attribute to associate with the mappings. Defaults to local
pegasus.catalog.replica.directory.url.prefix specifies the URL prefix to use while constructing the PFN. Defaults to file://
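For example, to construct replica catalog mappings from a local input directory, with the two properties above overridden on the command line (the defaults would also suffice); the paths are placeholders:
pegasus-plan -Dpegasus.catalog.replica.directory.site=local \
    -Dpegasus.catalog.replica.directory.url.prefix=file:// \
    -d workflow.dax --input-dir /data/inputs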
-j prefix, --job-prefix prefix
-n, --nocleanup
-o site, --output-site site
By default, the materialized data remains in the working directory on the execution site where it was created. Only those output files for which the transfer attribute is set to true in the DAX are transferred to the output site.
-O output directory, --output-dir output directory
If -o is specified, the storage directory of the site specified as the output site is updated to be the directory passed. If no output site is specified, this option internally sets the output site to local, with the storage directory updated to the directory passed.
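For example, to collect the outputs into a directory on the submit host; the path is a placeholder and, since no -o is given, the output site defaults to local as described above:
pegasus-plan -d workflow.dax -O /data/outputs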
-q, --quiet
-r[dirname], --randomdir[=dirname]
By default, Pegasus duplicates the relative directory structure of the submit host on the remote site. The user can specify this option without an argument to create a random, timestamp based name for the execution directory that is created by the create dir jobs. The user can specify the optional argument to this option to specify the basename of the directory that is to be created.
The create dir jobs refer to the dirmanager executable that is shipped as part of the PEGASUS worker package. The transformation catalog is searched for the transformation named pegasus::dirmanager for all the remote sites where the workflow has been scheduled. Pegasus can create a default path for the dirmanager executable if the PEGASUS_HOME environment variable is associated with the sites in the site catalog as an environment profile.
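For example, to let the planner pick a timestamp based name, or to force a specific basename (myrun is arbitrary):
pegasus-plan -d workflow.dax --randomdir
pegasus-plan -d workflow.dax --randomdir=myrun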
--relative-dir dir
--relative-submit-dir dir
-s site[,site,...], --sites site[,site,...]
In case this option is not specified, all the sites in the site catalog are picked up as candidates for running the workflow.
--staging-site s1=ss1[,s2=ss2[..]]
In case of running on a shared filesystem, the staging site is automatically associated by the planner to be the execution site. If only a value is specified, then that is taken to be the staging site for all the execution sites. For example, --staging-site local means that the planner will use the local site as the staging site for all jobs in the workflow.
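For example, to run jobs on a compute site named condorpool while staging data through the local site; the site name condorpool is a placeholder:
pegasus-plan -d workflow.dax --sites condorpool --staging-site condorpool=local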
-S, --submit
-v, --verbose
For example, to see the INFO, CONFIG and DEBUG messages additionally, set -vvv.
-V, --version
RETURN VALUE
If the Pegasus Workflow Planner is able to generate an executable workflow successfully, the exitcode will be 0. All runtime errors result in an exitcode of 1; this is usually the case when you have misconfigured your catalogs, etc. If an error occurs while loading a specific module implementation at run time, the exitcode will be 2; this is usually due to factory methods failing while loading a module. Any other error occurring during the running of the command also results in an exitcode of 1. In most cases, the error message logged should give a clear indication as to where things went wrong.
CONTROLLING PEGASUS-PLAN MEMORY CONSUMPTION
pegasus-plan will try to determine memory limits automatically using factors such as total system memory and potential memory limits (ulimits). The automatic limits can be overridden by setting the JAVA_HEAPMIN and JAVA_HEAPMAX environment variables before invoking pegasus-plan. The values are in megabytes. As a rule of thumb, JAVA_HEAPMIN can be set to half of the value of JAVA_HEAPMAX.
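For example, to give the planner a 4 GB maximum heap and a 2 GB initial heap (values in megabytes, as noted above):
export JAVA_HEAPMAX=4096
export JAVA_HEAPMIN=2048
pegasus-plan -d workflow.dax --dir submit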
PEGASUS PROPERTIES
This is not an exhaustive list of properties used. For the complete description and list of properties refer to $PEGASUS_HOME/doc/advanced-properties.pdf
pegasus.selector.site
pegasus.catalog.replica
If not specified, then the value defaults to RLS.
pegasus.catalog.replica.url
pegasus.dir.exec
pegasus.catalog.transformation
pegasus.catalog.transformation.file
If not specified, then the default location of $PEGASUS_HOME/var/tc.data is used.
pegasus.catalog.site
pegasus.catalog.site.file
pegasus.data.configuration
sharedfs If this is set, Pegasus will be set up to execute jobs on the shared filesystem on the execution site. This assumes that the head node of a cluster and the worker nodes share a filesystem. The staging site in this case is the same as the execution site.
nonsharedfs If this is set, Pegasus will be set up to execute jobs on an execution site without relying on a shared filesystem between the head node and the worker nodes.
condorio If this is set, Pegasus will be set up to run jobs in a pure Condor pool, with the nodes not sharing a filesystem. Data is staged to the compute nodes from the submit host using Condor File IO.
pegasus.code.generator
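As a sketch, a pegasus.properties file passed via --conf might set a few of the properties above; the file paths and the chosen data configuration are illustrative only:
pegasus.catalog.site.file = /path/to/sites.xml
pegasus.catalog.transformation.file = /path/to/tc.data
pegasus.data.configuration = sharedfs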
FILES
$PEGASUS_HOME/etc/dax-3.3.xsd
$PEGASUS_HOME/etc/sc-4.0.xsd
$PEGASUS_HOME/etc/tc.data.text
$PEGASUS_HOME/etc/sites.xml4 | $PEGASUS_HOME/etc/sites.xml3
$PEGASUS_HOME/lib/pegasus.jar
SEE ALSO
pegasus-run(1), pegasus-status(1), pegasus-remove(1), pegasus-rc-client(1), pegasus-analyzer(1)
AUTHORS
Karan Vahi <vahi at isi dot edu>
Pegasus Team http://pegasus.isi.edu
11/09/2018 | Pegasus 4.4.0