NAME¶
arcsub - ARC Submission
DESCRIPTION¶
The arcsub command is used for submitting jobs to Grid-enabled computing
resources.
SYNOPSIS¶
arcsub [options] [filename ...]
OPTIONS¶
- -c, --cluster=name
- select one or more computing elements: name can be an alias for a
single CE, a group of CEs or a URL
- -g, --index=name
- select one or more registries: name can be an alias for a single
registry, a group of registries or a URL
- -R, --rejectdiscovery=URL
- skip the service with the given URL during service discovery
- -S, --submissioninterface=InterfaceName
- only use this interface for submitting (e.g. org.nordugrid.gridftpjob,
org.ogf.glue.emies.activitycreation, org.ogf.bes)
- -I, --infointerface=InterfaceName
- the computing element specified by URL at the command line should be
queried using this information interface (possible options:
org.nordugrid.ldapng, org.nordugrid.ldapglue2, org.nordugrid.wsrfglue2,
org.ogf.glue.emies.resourceinfo)
- -e, --jobdescrstring=String
- job description string describing the job to be submitted
- -f, --jobdescrfile=filename
- job description file describing the job to be submitted
- -j, --joblist=filename
- the file storing information about active jobs (default
~/.arc/jobs.xml)
- -o, --jobids-to-file=filename
- the IDs of the submitted jobs will be appended to this file
- -D, --dryrun
- submit jobs as dry run (no submission to batch system)
- --direct
- submit directly - no resource discovery or matchmaking
- -x, --dumpdescription
- do not submit - dump job description in the language accepted by the
target
- -P, --listplugins
- list the available plugins
- -t, --timeout=seconds
- timeout in seconds (default 20)
- -z, --conffile=filename
- configuration file (default ~/.arc/client.conf)
- -d, --debug=debuglevel
- FATAL, ERROR, WARNING, INFO, VERBOSE or DEBUG
- -b, --broker=broker
- selected broker: Random (default), FastestQueue or custom. Use -P to find
possible options.
- -v, --version
- print version information
- -?, --help
- print help
ARGUMENTS¶
- filename ...
- job description files describing the jobs to be submitted
EXTENDED DESCRIPTION¶
arcsub is the key command for submitting jobs to Grid-enabled computing
resources with the ARC client. By default arcsub can submit jobs
to A-REX, CREAM and EMI ES enabled computing elements (CEs). As always,
successful submission requires that you are authenticated at the targeted
computing services. Since arcsub is built on a modular library, modules can be
installed which enable submission to other targets, e.g. the classic ARC CE
Grid-Manager.
Job submission can be accomplished by specifying a job description file to
submit as an argument. arcsub will then by default perform resource
discovery on the Grid; the discovered resources are matched to the
job description and ranked according to the chosen broker (--broker
option). If no Grid environment has been configured, please contact your
system administrator, or set one up yourself in the client configuration file
(see the FILES section). Another option is to explicitly specify one or more
registry services to arcsub using the --index option, which
accepts a URL, alias or group. Alternatively one or more specific CEs can
be targeted by using the --cluster option. If such a scenario is the
most common, it is worthwhile to specify those CEs in the client configuration
as default services, which makes it unnecessary to specify them as arguments.
In the same manner, aliases and groups defined in the configuration file can
be used as arguments to the --cluster or --index options. In all of the above
scenarios arcsub obtains resource information from the services, which is then
used for matchmaking against the job description. That step can be avoided by
specifying the --direct option, in which case the job description is submitted
directly to the first specified endpoint.
The format of a classic GridFTP-based cluster URL is:
[ldap://]<hostname>[:2135/nordugrid-cluster-name=<hostname>,Mds-Vo-name=local,o=grid]
Only the
hostname part has to be specified, the rest of the URL is
automatically generated.
The format of an A-REX URL is:
[https://]<hostname>[:<port>][/<path>]
Here the port is 443 by default, but the path cannot be guessed, so if it is not
specified, then the service is assumed to live on the root path.
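The two URL forms above can be sketched as small shell helpers that expand a bare hostname in the way the text describes. The function names classic_cluster_url and arex_url are hypothetical and exist only for illustration; arcsub performs this expansion itself.

```shell
#!/bin/sh
# Expand a bare hostname into the full classic (LDAP-based) cluster URL.
classic_cluster_url() {
    host=$1
    printf 'ldap://%s:2135/nordugrid-cluster-name=%s,Mds-Vo-name=local,o=grid\n' \
        "$host" "$host"
}

# Build an A-REX URL. The port defaults to 443; an omitted path means
# the service is assumed to live on the root path.
arex_url() {
    host=$1
    port=${2:-443}
    path=${3:-}
    # Strip one leading slash so that "/arex" and "arex" give the same result.
    printf 'https://%s:%s/%s\n' "$host" "$port" "${path#/}"
}
```

Calling arex_url with only a hostname yields the root-path form, which is why a service that lives under a path such as /arex has to be given explicitly.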
Job descriptions can also be specified using the --jobdescrfile option,
which expects the file name of the description as argument, or the
--jobdescrstring option, which expects the job description itself as
a string. Both options can be specified multiple times, and one does not
exclude the other. The default supported job description languages are xRSL,
JSDL and JDL.
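As a minimal sketch of combining the two options, the helpers below write the xRSL description used elsewhere in this page to a file and pass it together with a second description given as a string. The function names and the /bin/date description in the usage comment are hypothetical.

```shell
#!/bin/sh
# Write a one-line xRSL job description to the given file.
write_hello_xrsl() {
    cat > "$1" <<'EOF'
&(executable="/bin/echo")(arguments="Hello World!")
EOF
}

# Submit one description from a file and one given inline; the text above
# states that the two options do not exclude each other.
submit_file_and_string() {
    arcsub --jobdescrfile="$1" --jobdescrstring="$2"
}

# Usage (needs a configured ARC client and a valid proxy):
#   write_hello_xrsl hello.xrsl
#   submit_file_and_string hello.xrsl '&(executable="/bin/date")'
```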
If the job description is successfully submitted, a job-ID is returned and
printed. This job-ID uniquely identifies the job while it is being executed.
It is also possible that no CE matches the constraints
defined in the description, in which case no submission will be done. Upon
successful submission, the job-ID along with more technical job information is
stored in the job-list file (described below). The stored information enables
the job management commands of the ARC client to manage jobs easily, and thus
the job-ID need not be saved manually. By default the job-list file is
stored in the .arc directory in the home directory of the user; however,
another location can be specified using the --joblist option, which takes the
location of this file as argument. If the --joblist option was used
during submission, it should also be specified in subsequent commands
when managing the job. If a Computing Element has multiple job submission
interfaces (e.g. gridftp, EMI-ES, BES), then the brokering algorithm will
choose one of them. With the --submissioninterface option the requested
interface can be specified; in that case only those Computing Elements
which have that specific interface will be considered, and only that interface
will be used to submit the jobs.
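The job-list handling described above might be wrapped as follows. This is a sketch only: the project-jobs.xml location and the function names are hypothetical, and the options are those listed in this page and in arcstat(1).

```shell
#!/bin/sh
# Hypothetical non-default job-list location; any writable path would do.
JOBLIST="${JOBLIST:-${HOME:-.}/.arc/project-jobs.xml}"

# Submit while recording the job in the custom job list, and append the
# returned job-IDs to a plain-text file next to it.
submit_tracked() {
    arcsub --joblist="$JOBLIST" --jobids-to-file="$JOBLIST.ids" "$@"
}

# Query the same job list later. Without --joblist the management command
# would read the default ~/.arc/jobs.xml instead.
status_tracked() {
    arcstat --joblist="$JOBLIST" --all
}

# Restrict submission to one of the interfaces named under OPTIONS.
submit_emies_only() {
    arcsub --submissioninterface=org.ogf.glue.emies.activitycreation "$@"
}
```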
As mentioned above, registry or index services can be specified with the
--index option. Specifying one or multiple index servers instructs the
arcsub command to query the servers for registered CEs. The returned
CEs will then be matched against the job description, those matching will
be ranked by the chosen broker (see below), and submission will be tried in
order until it succeeds or the end of the list is reached. It might happen
that a troublesome or undesirable CE in the returned list is selected for
submission. In that case it is possible to reject that cluster using the
--rejectdiscovery option and providing the URL (or just the hostname)
of the CE, which excludes that CE as a target for submission.
When multiple CEs are targeted for submission, the resource broker will be used
to filter out CEs which do not match the job description requirements and then
rank the remaining CEs. The broker used by default will rank the CEs randomly,
however a different broker can be chosen by using the
--broker option,
which takes the name of the broker as argument. The broker type can also be
specified in client.conf. The brokers available can be seen using
arcsub
-P. By default the following brokers are available:
- Random (default)
- Chooses a random CE matching the job requirements.
- FastestQueue
- Ranks matching CEs according to the length of the job queue at the CEs,
ranking those with the shortest queue highest.
- Benchmark
- Ranks matching CEs according to a specified benchmark, which should be
specified by appending the broker name with ':' and then the name of the
benchmark. If no option is given to the Benchmark broker then CEs will be
ranked according to the 'specint2000' benchmark.
- Data
- Ranks matching CEs according to the amount of input data cached by each
CE, by querying the CE. Only CEs with the A-REX BES interface support this
operation.
- Null
- Chooses a random CE without filtering the CEs at all.
- PythonBroker
- User-defined custom brokers can be created in Python. See the example
broker SampleBroker.py or ACIXBroker.py (like Data broker but uses the ARC
Cache Index) that come installed with ARC for more details of how to write
your own broker. A PythonBroker is specified by --broker
PythonBroker:Filename.Class:args, where Filename is the file
containing the class Class which implements the broker interface. The
directory containing this file must be in the PYTHONPATH. args is optional
and allows specifying arguments to the broker.
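Broker selection as described in the list above could be scripted roughly like this. The function names are hypothetical; the broker syntax (name, then ':' and an argument) follows the descriptions of the Benchmark and PythonBroker entries.

```shell
#!/bin/sh
# List the plugins, including brokers, available to this client.
list_brokers() {
    arcsub --listplugins
}

# Submit with the Benchmark broker. The benchmark name is appended after ':';
# with an empty name the plain broker is used, which per the text above
# ranks by 'specint2000'.
submit_ranked_by_benchmark() {
    benchmark=$1
    shift
    if [ -n "$benchmark" ]; then
        arcsub --broker="Benchmark:$benchmark" "$@"
    else
        arcsub --broker=Benchmark "$@"
    fi
}

# Usage (needs a configured ARC client):
#   submit_ranked_by_benchmark specint2000 -g registry.example.com helloworld.jsdl
```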
Before submission, arcsub modifies the job description where needed (adding
or modifying attributes, even converting the description
language to fit the needs of the CE) to ensure that it is valid. The modified
job description can be printed by specifying the --dumpdescription
option. The job description language of the printed
description cannot be chosen; it is the one that will be sent to and
accepted by the chosen target. Further information from arcsub can be
obtained by increasing the verbosity, which is done with the --debug
option; the default verbosity level is WARNING. Setting the level to
DEBUG will show all messages, while setting it to FATAL will only show fatal
log messages.
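A small wrapper for inspecting what would be sent to a target might look like the sketch below. The function name preview_description is hypothetical, and VERBOSE is just one of the levels listed under OPTIONS.

```shell
#!/bin/sh
# Print the job description as it would be sent to the given CE, with more
# log output than the default WARNING level. Nothing is submitted.
preview_description() {
    ce=$1
    shift
    arcsub --cluster="$ce" --dumpdescription --debug=VERBOSE "$@"
}

# Usage (needs a configured ARC client):
#   preview_description example.com/arex helloworld.jsdl
```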
To validate your job description without actually submitting a job, use
the --dryrun option: it will catch possible syntax or other errors,
but will instruct the site not to submit the job for execution. Only the
grid-manager (ARC0) and A-REX (ARC1) CEs support this feature.
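One way to use the dry run as a gate before a real submission is sketched below. It assumes that arcsub exits with a non-zero status when the dry run reports an error; this page does not document exit codes, so treat that as an assumption to verify. The function name is hypothetical.

```shell
#!/bin/sh
# Validate against a CE with --dryrun first, and submit for real only if
# the dry run succeeded.
# ASSUMPTION: arcsub returns a non-zero exit status on a failed dry run.
validate_then_submit() {
    ce=$1
    shift
    if arcsub --cluster="$ce" --dryrun "$@"; then
        arcsub --cluster="$ce" "$@"
    else
        echo "dry run failed; not submitting" >&2
        return 1
    fi
}

# Usage (needs a configured ARC client):
#   validate_then_submit ce.example.com helloworld.jsdl
```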
EXAMPLES¶
Submission of a job description file "helloworld.jsdl" to the Grid:
arcsub helloworld.jsdl
An information index server (registry) can also be queried for CEs to submit to:
arcsub -g registry.example.com helloworld.jsdl
Submission of a job description file "helloworld.jsdl" to
ce.example.com:
arcsub -c ce.example.com helloworld.jsdl
Direct submission to a CE is done as:
arcsub --direct -c ce.example.com helloworld.jsdl
The job description can also be specified directly on the command line, as shown
in this example using the xRSL job description language:
arcsub -c example.com/arex -e \
'&(executable="/bin/echo")(arguments="Hello World!")'
When submitting against CEs retrieved from information index servers it might be
useful to do resource brokering:
arcsub -g registry.example.com -b FastestQueue helloworld.jsdl
If the job has a large input data set, it can be useful to send it to a CE where
those files are already cached. The ACIX broker can be used for this:
arcsub -g registry.example.com -b \
PythonBroker:ACIXBroker.ACIXBroker:https://cacheindex.ndgf.org:6443/data/index \
helloworld.jsdl
Excluding a specific CE when submitting against an information index server:
arcsub -g registry.example.com -R badcomputingelement.com/arex \
helloworld.jsdl
Dumping the job description is done as follows:
arcsub -c example.com/arex -x helloworld.jsdl
FILES¶
- ~/.arc/client.conf
- Some options can be given default values by specifying them in the ARC
client configuration file. Registry and computing element services can be
specified in separate sections of the config. The default services can be
specified by adding the 'default=yes' attribute to the section of the service;
when no --cluster or --index options are given, these
will be used for submission. Each service has an alias, and can be a member
of any number of groups. Specifying the alias or the name of the
group with the --cluster or --index options will select the
given services. By using the --conffile option a different
configuration file can be used than the default. Note that some
installations also have a system client configuration file; however,
attributes in the user's file take precedence over it, and command line
options take precedence over configuration file attributes.
- ~/.arc/jobs.xml
- This is a local list of the user's active jobs. When a job is successfully
submitted it is added to this list and when it is removed from the remote
cluster it is removed from this list. This list is used as the list of all
active jobs when the user specifies the --all option to the various
NorduGrid ARC user interface commands. By using the --joblist
option a different file can be used than the default.
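The following sketch writes a small alternative configuration file and passes it with --conffile. Only the 'default=yes' attribute is taken from the description above; the section names [registry/myindex] and [computing/myce], the url key, and the aliases are assumptions made for illustration and should be checked against the client configuration documentation of the installed ARC version.

```shell
#!/bin/sh
# Write a minimal client configuration to the given path.
# ASSUMPTION: the [registry/<alias>] and [computing/<alias>] section names
# and the url= key are illustrative; only default=yes comes from this page.
write_client_conf() {
    cat > "$1" <<'EOF'
[registry/myindex]
url=registry.example.com
default=yes

[computing/myce]
url=ce.example.com
EOF
}

# Submit using the alternative file instead of ~/.arc/client.conf.
submit_with_conf() {
    conf=$1
    shift
    arcsub --conffile="$conf" "$@"
}

# Usage (needs a configured ARC client):
#   write_client_conf ./test-client.conf
#   submit_with_conf ./test-client.conf helloworld.jsdl
```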
ENVIRONMENT VARIABLES¶
- X509_USER_PROXY
- The location of the user's Grid proxy file. Shouldn't be set unless the
proxy is in a non-standard location.
- ARC_LOCATION
- The location where ARC is installed can be specified by this variable. If
not specified the install location will be determined from the path to the
command being executed, and if this fails a WARNING will be given stating
the location which will be used.
- ARC_PLUGIN_PATH
- The location of ARC plugins can be specified by this variable. Multiple
locations can be specified by separating them by : (; in Windows). The
default location is $ARC_LOCATION/lib/arc (\ in Windows).
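Setting these variables before calling arcsub might look like the sketch below. The /opt/arc prefix, the arc-plugins directory and the helper name join_plugin_path are hypothetical; the ':' separator applies to non-Windows systems, as stated above.

```shell
#!/bin/sh
# Join the given directories with ':' as ARC_PLUGIN_PATH expects.
join_plugin_path() {
    ( IFS=:; printf '%s\n' "$*" )
}

# Hypothetical locations; adjust them to the actual installation.
ARC_LOCATION="${ARC_LOCATION:-/opt/arc}"
ARC_PLUGIN_PATH=$(join_plugin_path "$ARC_LOCATION/lib/arc" "${HOME:-.}/arc-plugins")
export ARC_LOCATION ARC_PLUGIN_PATH

# Set the proxy location only when it is not in the standard place.
# export X509_USER_PROXY=/path/to/proxy   # placeholder path
```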
COPYRIGHT¶
APACHE LICENSE Version 2.0
AUTHOR¶
ARC software is developed by the NorduGrid Collaboration
(http://www.nordugrid.org); please consult the AUTHORS file distributed with
ARC. Please report bugs and feature requests to
http://bugzilla.nordugrid.org
SEE ALSO¶
arccat(1),
arcclean(1),
arccp(1),
arcget(1),
arcinfo(1),
arckill(1),
arcls(1),
arcmigrate(1),
arcmkdir(1),
arcproxy(1),
arcrenew(1),
arcresub(1),
arcresume(1),
arcrm(1),
arcstat(1),
arcsync(1),
arctest(1)