
SGE_SHADOWD(8) Grid Engine Administrative Commands SGE_SHADOWD(8)

NAME

sge_shadowd - Grid Engine shadow master daemon

SYNOPSIS

sge_shadowd

DESCRIPTION

sge_shadowd is a lightweight process that can be run on so-called shadow master hosts in a Grid Engine cluster. It detects failure of the current Grid Engine master daemon, sge_qmaster(8), and starts up a new sge_qmaster(8) on the host on which sge_shadowd runs. If multiple shadow daemons are active in a cluster, they run a protocol which ensures that only one of them starts up a new master daemon.

Hosts suitable as shadow master hosts must have root read/write access to the shared directory $SGE_ROOT/$SGE_CELL/common, as well as to the master daemon spool directory (by default $SGE_ROOT/$SGE_CELL/spool/qmaster). The names of the shadow master hosts must be listed in the file $SGE_ROOT/$SGE_CELL/common/shadow_masters.
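The two requirements above can be checked ahead of time. The following Python sketch is illustrative only and is not part of Grid Engine: the function names are hypothetical, the shadow_masters file is assumed to hold one host name per line, and host names are compared case-insensitively without any alias or domain resolution.

```python
import os
from pathlib import Path


def read_shadow_masters(path):
    """Return the host names listed in a shadow_masters file.

    Assumption: one host name per line; blank lines are skipped.
    """
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def is_shadow_candidate(hostname, shadow_masters_file, common_dir, spool_dir):
    """Check that hostname is listed and both directories are readable and writable.

    The access check applies to the calling user, so it is only meaningful
    when run as the user that will start sge_shadowd.
    """
    listed_hosts = {h.lower() for h in read_shadow_masters(shadow_masters_file)}
    listed = hostname.lower() in listed_hosts
    accessible = all(
        os.access(d, os.R_OK | os.W_OK) for d in (common_dir, spool_dir)
    )
    return listed and accessible
```

Run on each prospective shadow master host, such a check reports whether the host is listed and can reach both shared directories before sge_shadowd is started there.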

RESTRICTIONS

sge_shadowd may only be started by root.

ENVIRONMENT VARIABLES

SGE_ROOT
	Specifies the location of the Grid Engine standard configuration files.
SGE_CELL
	If set, specifies the default Grid Engine cell. To address a Grid Engine cell, sge_shadowd uses (in order of precedence):

The name of the cell specified in the environment variable SGE_CELL, if it is set.

The name of the default cell, i.e. default.

SGE_DEBUG_LEVEL
	If set, specifies that debug information should be written to stderr. The value also defines the level of detail of the debug information.
SGE_QMASTER_PORT
	If set, specifies the TCP port on which sge_qmaster(8) is expected to listen for communication requests. Most installations instead define that port with a services map entry for the service "sge_qmaster".
SGE_DELAY_TIME
	Controls how long sge_shadowd pauses after a takeover bid fails. This value is used only when multiple sge_shadowd instances are contending to become the master. The default is 600 seconds.
SGE_CHECK_INTERVAL
	Controls the interval between sge_shadowd checks of the heartbeat file. The default is 60 seconds.
SGE_GET_ACTIVE_INTERVAL
	Controls the interval between attempts by an sge_shadowd instance to take over when the heartbeat file has not changed. The default is 240 seconds.
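The cell precedence and the three timing defaults described above can be summarized in a short Python sketch. It is not the daemon's implementation. It takes the timing variables to be named SGE_CHECK_INTERVAL, SGE_GET_ACTIVE_INTERVAL and SGE_DELAY_TIME. The fallback to the default on an unparsable value and the way seconds_until_next_step combines the three intervals are assumptions made for illustration, and all function and class names are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ShadowSettings:
    cell: str
    check_interval: int       # seconds between heartbeat file checks
    get_active_interval: int  # seconds between takeover attempts
    delay_time: int           # seconds to pause after a failed takeover bid


def _int_env(env, name, default):
    """Read a non-negative integer from env; fall back to default otherwise.

    Assumption: the fallback on bad input is illustrative, not documented.
    """
    raw = env.get(name)
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value >= 0 else default


def load_settings(env=None):
    """Resolve the cell name and timing values from the environment."""
    env = os.environ if env is None else env
    return ShadowSettings(
        # SGE_CELL wins if set; otherwise the default cell "default" is used.
        cell=env.get("SGE_CELL", "default"),
        check_interval=_int_env(env, "SGE_CHECK_INTERVAL", 60),
        get_active_interval=_int_env(env, "SGE_GET_ACTIVE_INTERVAL", 240),
        delay_time=_int_env(env, "SGE_DELAY_TIME", 600),
    )


def seconds_until_next_step(settings, heartbeat_changed, last_bid_failed):
    """Pick the wait that applies to the current situation (assumed model)."""
    if heartbeat_changed:
        return settings.check_interval       # master alive: keep polling
    if last_bid_failed:
        return settings.delay_time           # lost the contest: back off
    return settings.get_active_interval      # stale heartbeat: try again later
```

With an empty environment, load_settings yields the cell "default" and the intervals 60, 240 and 600 seconds given above.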

FILES

<sge_root>/<cell>/common
	Default configuration directory
<sge_root>/<cell>/common/shadow_masters
	Shadow master hostname file.
<sge_root>/<cell>/spool/qmaster
	Default master daemon spool directory
<sge_root>/<cell>/spool/qmaster/heartbeat
	The heartbeat file.
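
The default locations listed above all derive from the Grid Engine root directory and the cell name. The Python sketch below shows that derivation. The function name and the example root /opt/sge are hypothetical, and the spool path reflects only the default location, which an installation may change.

```python
from pathlib import PurePosixPath


def grid_engine_paths(sge_root, cell="default"):
    """Build the default file locations for a given root directory and cell."""
    cell_dir = PurePosixPath(sge_root) / cell
    common = cell_dir / "common"
    spool = cell_dir / "spool" / "qmaster"  # default spool location only
    return {
        "common": common,
        "shadow_masters": common / "shadow_masters",
        "qmaster_spool": spool,
        "heartbeat": spool / "heartbeat",
    }


# Hypothetical root directory, used purely as an example.
paths = grid_engine_paths("/opt/sge")
print(paths["heartbeat"])
```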

SEE ALSO

sge_intro(1), sge_conf(5), sge_qmaster(8)

COPYRIGHT

See sge_intro(1) for a full statement of rights and permissions.

2007-11-08 SGE 8.1.3pre