
SGE_RESOURCE_QUOTA(5) Grid Engine File Formats SGE_RESOURCE_QUOTA(5)

NAME

sge_resource_quota - Grid Engine resource quota file format

DESCRIPTION

Resource quota sets (RQS) are a flexible way to set a maximum resource consumption for job requests. They are used by the scheduler when selecting the next jobs eligible to run. A quota is applied to a job request according to a set of user, project, cluster queue, host and PE filter criteria. RQS are applied to resource requests before the amounts of resources defined (in order) at the global, host, and queue levels are considered. If an RQS denies the request, the other levels are not considered.

By using resource quota sets, administrators can define a fine-grained quota configuration, restricting some job requests to lesser resource usage and granting others higher usage.

Note: RQS are not applied to jobs requesting an Advance Reservation (AR); such jobs are neither subject to the resulting limits nor debited in the usage consumption.

A list of currently configured RQS can be displayed via the qconf(1) -srqsl option. The contents of each listed rqs definition can be shown via the -srqs switch. The output follows the format described below. New RQS can be created, and existing ones modified, via the -arqs, -mrqs and -drqs options to qconf(1).
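For example, the following qconf(1) invocations list, display and maintain resource quota sets (the set name myrqs is illustrative only):

       qconf -srqsl            list the names of all configured RQS
       qconf -srqs myrqs       show the definition of the RQS named myrqs
       qconf -arqs myrqs       add a new RQS in an editor
       qconf -mrqs myrqs       modify the existing RQS named myrqs
       qconf -drqs myrqs       delete the RQS named myrqs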

A resource quota set defines a maximum resource quota for a particular job request. All of the configured and enabled rule sets apply all of the time. This means that if multiple resource quota sets are defined, the most restrictive set is used.

Every resource quota set consists of one or more resource quota rules. These rules are evaluated in order, and the first rule that matches a specific request will be used. A resource quota set always results in at most one effective resource quota rule for a specific request.
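For instance, in the following sketch (names and values are purely illustrative), a job submitted by user1 matches the first rule and is limited to 10 slots; the second rule is never evaluated for that job, while jobs from all other users fall through to it:

       {
          name     per_user_slots
          enabled  true
          limit    users user1 to slots=10
          limit    users {*} to slots=20
       }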

Note that Grid Engine allows backslashes (\) to be used to escape newline characters. The backslash and the newline are replaced with a space character before any interpretation.
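For example, a long rule may be wrapped over several lines (the hostgroup names here are hypothetical):

       limit users {*} hosts @lin_hosts,\
             @sol_hosts to slots=100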

FORMAT

A resource quota set definition contains the following parameters, one per line, within braces that enclose the whole set. See below for the formal syntax.

name

The resource quota set name.

enabled

If set to true the resource quota set is active and will be considered for scheduling decisions. The default value is false.

description

This description field is optional and can be set to an arbitrary string. The default value is NONE.

limit

Every resource quota set needs at least one resource quota rule definition, introduced by the limit field. It is possible to define multiple resource quota rules, one per line, processed in order from top to bottom.

A resource quota rule consists of an optional name, the filters for a specific job request, and the resource quota limit.

The tags for expressing a resource quota rule are:

name

The name of the rule (optional). The rule name must be unique within a resource quota set.

users

Contains a comma-separated list of user names or ACLs (see access_list(5)). This parameter filters jobs by user or ACL in the list. Any user not in the list will not be considered for the resource quota rule. The default value is '*', which means all users. An ACL is differentiated from a user name by prefixing the ACL name with '@'. To exclude a user or ACL from the rule, the name can be prefixed with '!'. The defined user or ACL names need not exist in the Grid Engine configuration.

projects

Contains a comma-separated list of projects (see project(5)). This parameter filters jobs requesting a project in the list. Any project not in the list will not be considered for the resource quota rule. If no project filter is specified, all projects, and jobs with no requested project, match the rule. The value '*' means all jobs with requested projects. To exclude a project from the rule, the name can be prefixed with '!'. The value '!*' means only jobs with no project requested.

pes

Contains a comma-separated list of PEs (see sge_pe(5)). This parameter filters jobs requesting a PE in the list. Any PE not in the list will not be considered for the resource quota rule. If no PE filter is specified, all PEs, and jobs with no requested PE, match the rule. The value '*' means all jobs requesting a PE. To exclude a PE from the rule, the name can be prefixed with '!'. The value '!*' means only jobs with no PE requested.

queues

Contains a comma-separated list of cluster queues (see queue_conf(5)). This parameter filters jobs that may be scheduled in a queue in the list. Any queue not in the list will not be considered for the resource quota rule. The default value is '*', which means all queues. To exclude a queue from the rule, the name can be prefixed with '!'.

hosts

Contains a comma-separated list of hosts or hostgroups (see host(5) and hostgroup(5)). This parameter filters jobs that may be scheduled to a host in the list or a host contained in a hostgroup in the list. Any host not in the list will not be considered for the resource quota rule. The default value is '*', which means all hosts. To exclude a host or hostgroup from the rule, the name can be prefixed with '!'.

to

This mandatory field defines the quota for resource attributes for this rule. The quota is expressed by one or more comma-separated limit definitions referring to fixed or consumable resources (not load values). Two kinds of limit definition may be used:
Static limits

Static limits set static values as quotas. Each limit consists of a complex attribute followed by an "=" sign and a value specification consistent with the complex attribute's type (see complex(5)).

Dynamic limits

A dynamic limit is a simple algebraic expression used to derive the limit value. The formula can reference complex attributes, whose values are used in the calculation of the resulting limit. The formula expression syntax is that of a sum of weighted complex values, that is:

{w1|$complex1[*w1]}[{+|-}{w2|$complex2[*w2]}[{+|-}...]]

The weighting factors (w1, ...) are positive integers or floating point numbers in double precision. The complex values (complex1, ...) must be of numerical type (INT, DOUBLE, MEMORY, or TIME), as specified by the complex's type in the complex list (see complex(5)) and defined either on global, queue, or host level to resolve the value.
Note: Dynamic limits can only be configured for a host-specific rule, and must be defined for an expanded host list (or individual host). Also, if a load value corresponding to a complex used is not available, a large value is used for it to suggest an overloaded condition. Dynamic limits may slow the scheduler significantly.
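As a combined sketch of both kinds of limit, in two separate rule sets (the set names are hypothetical; it assumes virtual_free is configured as a consumable and that the hosts report the standard mem_total load value):

       {
          name     user_static
          enabled  true
          limit    users {*} to slots=16,h_vmem=8G
       }
       {
          name     host_dynamic
          enabled  true
          limit    hosts {*} to virtual_free=$mem_total*0.9
       }

Both sets are active at the same time, so for any given request the more restrictive of the two applicable limits takes effect.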

A complex form of limit may be used: "expanded" filters with the consumer list enclosed in braces ('{' '}'). This may be thought of as applying for each member of the list individually, as opposed to for all elements of a non-braced list in total. Alternatively, it is equivalent to an expansion into multiple instances of the rule, per the syntax which inspired it in shells such as bash(1). Thus

limit users {a, b} ... to ...
is equivalent to
limit users a ... to ...
limit users b ... to ...
where the text represented by the ellipses in each position is carried over to the expansion, and could be expanded itself. '{*}' represents a limit for each consumer of that type, as opposed to '*', which limits all the consumers together. E.g.
limit users * to slots=100
limits the total number of slots in use to 100, whereas
limit users {*} to slots=100
limits each user to 100 slots. ACLs and hostgroups in expanded lists are treated as if they are expanded into a list of their constituents before expanding the whole list. A '!' prefix is distributed through the expansion of ACLs or hostgroups, i.e.
limit users {!@acl,...} ...
where @acl has members user1, user2, ..., expands to
limit users {!user1,!user2,...} ...
and thus

limit users !user1 ...
limit users !user2 ...
...
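Expanded lists for different consumer types combine, so the limit applies to each resulting combination individually. For instance, the following hypothetical rule allows each user at most 8 slots on each host of the hostgroup @lx_hosts:

       limit users {*} hosts {@lx_hosts} to slots=8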

Formal Syntax

ALL: '*'
SEPARATOR: ','
STRING: [^\n]*
QUOTE: '"'
S_EXPANDER: '{'
E_EXPANDER: '}'
NOT: '!'
BOOL: [tT][rR][uU][eE]
| 1
| [fF][aA][lL][sS][eE]
| 0
NAME: [a-zA-Z][a-zA-Z0-9_-]*
LISTVALUE: ALL | [NOT]STRING
LIST: LISTVALUE [SEPARATOR LISTVALUE]*
FILTER: LIST | S_EXPANDER LIST E_EXPANDER
RESOURCEPAIR: STRING=STRING
RESOURCE: RESOURCEPAIR [SEPARATOR RESOURCEPAIR]*
rule: "limit" ["name" NAME] ["users" FILTER]
["projects" FILTER] ["pes" FILTER] ["queues" FILTER]
["hosts" FILTER] "to" RESOURCE NL
ruleset_attributes: "name" NAME NL
["enabled" BOOL NL]
["description" QUOTE STRING QUOTE NL]
ruleset: "{"
ruleset_attributes
rule+
"}" NL
rulesets: ruleset*

NOTES

Please note that resource quotas are not enforced as job resource limits. Limiting, for example, h_vmem in a resource quota set does not result in a memory limit being set for job execution; it is necessary to specify such a limit on the job request, or as the complex's default value. Thus

limit users {*} to h_vmem=2G
will not restrict the memory a job can actually allocate to 2G, only what it may request; it is the job's own h_vmem request that is enforced at execution time.
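To make the limit effective at execution time, the resource must therefore be requested at submission (the job script name is illustrative):

       qsub -l h_vmem=2G job.sh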

The most restrictive rule in a set should come first in the limit list so that the scheduler can dispatch jobs efficiently, rejecting queues from consideration as early as possible; subsequent rules in the list are not considered after one matches. This can be important in large clusters, in which RQS can significantly slow down scheduling.
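As a sketch of this ordering (the queue and set names are hypothetical), the narrower special.q rule is listed first, so the scheduler can discard that queue early for jobs exceeding its quota, while all other requests fall through to the cluster-wide rule:

       {
          name     ordered_slots
          enabled  true
          limit    queues special.q to slots=8
          limit    to slots=500
       }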

EXAMPLES

The following is the simplest form of a resource quota set. It restricts all users together to a maximum of 100 slots in use in the whole cluster. Similarly, "slots=0" could be used to prevent new jobs from starting, for example in order to drain the system.

=======================================================================
{

name max_u_slots
description "All users max use of 100 slots"
enabled true
limit to slots=100
}
=======================================================================

The next example restricts user1 and user2 to 6g of virtual_free, and all other users to 4g of virtual_free, on each host in the hostgroup @lx_hosts.

=======================================================================
{

name max_virtual_free_on_lx_hosts
description "resource quota for virtual_free restriction"
enabled true
limit users {user1,user2} hosts {@lx_hosts} to virtual_free=6g
limit users {*} hosts {@lx_hosts} to virtual_free=4g
}
=======================================================================

The next example shows the use of a dynamic limit. It restricts the total slot usage by all users on each host to twice the value of num_proc (the number of processor units) on the host. (It would be more usual to use "slots=$num_proc" to prevent over-subscription of nodes.)

=======================================================================
{

name max_slots_on_every_host
enabled true
limit hosts {*} to slots=$num_proc*2
}
=======================================================================

SEE ALSO

sge_intro(1), access_list(5), complex(5), host(5), hostgroup(5), qconf(1), qquota(1), project(5).

COPYRIGHT

See sge_intro(1) for a full statement of rights and permissions.

2012-04-02 SGE 8.1.3pre