SD_NOTIFY(3) | sd_notify | SD_NOTIFY(3) |
NAME
sd_notify, sd_notifyf, sd_pid_notify, sd_pid_notifyf, sd_pid_notify_with_fds, sd_pid_notifyf_with_fds, sd_notify_barrier, sd_pid_notify_barrier - Notify service manager about start-up completion and other service status changes
SYNOPSIS
#include <systemd/sd-daemon.h>
int sd_notify(int unset_environment, const char *state);
int sd_notifyf(int unset_environment, const char *format, ...);
int sd_pid_notify(pid_t pid, int unset_environment, const char *state);
int sd_pid_notifyf(pid_t pid, int unset_environment, const char *format, ...);
int sd_pid_notify_with_fds(pid_t pid, int unset_environment, const char *state, const int *fds, unsigned n_fds);
int sd_pid_notifyf_with_fds(pid_t pid, int unset_environment, const int *fds, size_t n_fds, const char *format, ...);
int sd_notify_barrier(int unset_environment, uint64_t timeout);
int sd_pid_notify_barrier(pid_t pid, int unset_environment, uint64_t timeout);
DESCRIPTION
sd_notify() may be called by a service to notify the service manager about state changes. It can be used to send arbitrary information, encoded in an environment-block-like string. Most importantly, it can be used for start-up or reload completion notifications.
If the unset_environment parameter is non-zero, sd_notify() will unset the $NOTIFY_SOCKET environment variable before returning (regardless of whether the function call itself succeeded or not). Further calls to sd_notify() will then silently do nothing, and the variable is no longer inherited by child processes.
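The environment handling described above can be pictured with a small stand-in for the lookup step. This is only a sketch under stated assumptions: read_notify_socket() is a hypothetical helper, not a libsystemd function, and it models nothing beyond reading $NOTIFY_SOCKET and optionally unsetting it before returning.

```c
#define _GNU_SOURCE 1
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies $NOTIFY_SOCKET into buf.
 * Returns 1 if the variable was copied, 0 if it is not set, and a
 * negative errno-style value if buf is too small. */
static int read_notify_socket(int unset_environment, char *buf, size_t size) {
        const char *e = getenv("NOTIFY_SOCKET");
        int r;

        if (!e)
                r = 0;                  /* variable not set: nothing to do */
        else if (strlen(e) >= size)
                r = -E2BIG;             /* caller's buffer is too small */
        else {
                strcpy(buf, e);         /* copy before the variable is removed */
                r = 1;
        }

        /* Unset regardless of whether the lookup above succeeded, so that
         * later calls and child processes no longer see the variable. */
        if (unset_environment)
                unsetenv("NOTIFY_SOCKET");

        return r;
}
```

Once the variable has been unset, a second call returns 0, which mirrors the "silently do nothing" behaviour described for further sd_notify() calls.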
The state parameter should contain a newline-separated list of variable assignments, similar in style to an environment block. A trailing newline is implied if none is specified. The string may contain any kind of variable assignments, but see the next section for a list of assignments understood by the service manager.
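As an illustration of the state string layout, the following sketch joins individual "KEY=value" assignments with newlines. build_state() is a hypothetical helper introduced here for illustration and is not part of libsystemd. It omits the trailing newline, since that is implied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: joins n assignments of the form "KEY=value" with
 * newlines into buf. Returns 0 on success, -1 if an assignment is
 * malformed or the buffer is too small. */
static int build_state(char *buf, size_t size, const char *const *assignments, size_t n) {
        size_t used = 0;

        if (!buf || size == 0)
                return -1;
        buf[0] = '\0';

        for (size_t i = 0; i < n; i++) {
                const char *a = assignments[i];
                size_t len;

                /* Each entry needs a non-empty key before '=' and must not
                 * contain a newline, because newline is the separator. */
                if (!a || a[0] == '=' || !strchr(a, '=') || strchr(a, '\n'))
                        return -1;

                len = strlen(a);
                /* Room for an optional separator, the entry and the NUL byte. */
                if (used + (i > 0) + len + 1 > size)
                        return -1;

                if (i > 0)
                        buf[used++] = '\n';
                memcpy(buf + used, a, len);
                used += len;
                buf[used] = '\0';
        }

        return 0;
}
```

The resulting buffer could then be passed as the state argument of sd_notify().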
Note that systemd will accept status data sent from a service only if the NotifyAccess= option is correctly set in the service definition file. See systemd.service(5) for details.
Note that sd_notify() notifications may be attributed to units correctly only if either the sending process is still around at the time PID 1 processes the message, or if the sending process is explicitly runtime-tracked by the service manager. The latter is the case if the service manager originally forked off the process, i.e. on all processes that match NotifyAccess=main or NotifyAccess=exec. Conversely, if an auxiliary process of the unit sends an sd_notify() message and immediately exits, the service manager might not be able to properly attribute the message to the unit, and thus will ignore it, even if NotifyAccess=all is set for it.
Hence, to eliminate all race conditions involving lookup of the client's unit and attribution of notifications to units correctly, sd_notify_barrier() may be used. This call acts as a synchronization point and ensures all notifications sent before this call have been picked up by the service manager when it returns successfully. Use of sd_notify_barrier() is needed for clients which are not invoked by the service manager; for all other clients this synchronization mechanism is unnecessary for attribution of notifications to the unit.
sd_notifyf() is similar to sd_notify() but takes a printf()-like format string plus arguments.
sd_pid_notify() and sd_pid_notifyf() are similar to sd_notify() and sd_notifyf() but take a process ID (PID) to use as originating PID for the message as first argument. This is useful to send notification messages on behalf of other processes, provided the appropriate privileges are available. If the PID argument is specified as 0, the process ID of the calling process is used, in which case the calls are fully equivalent to sd_notify() and sd_notifyf().
sd_pid_notify_with_fds() is similar to sd_pid_notify() but takes an additional array of file descriptors. These file descriptors are sent along with the notification message to the service manager. This is particularly useful for sending "FDSTORE=1" messages, as described below. The additional arguments are a pointer to the file descriptor array plus the number of file descriptors in the array. If the number of file descriptors is passed as 0, the call is fully equivalent to sd_pid_notify(), i.e. no file descriptors are passed. Note that file descriptors sent to the service manager on a message without "FDSTORE=1" are immediately closed on reception.
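File descriptors travel with a datagram as SCM_RIGHTS ancillary data on an AF_UNIX socket. The sketch below shows that general mechanism only. send_state_with_fds() and MAX_FDS are hypothetical names, the socket is assumed to be already connected, and the credentials and originating PID that the real functions handle are left out.

```c
#define _GNU_SOURCE 1
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define MAX_FDS 8 /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: sends state as one datagram on the connected socket
 * sock, attaching n_fds descriptors as SCM_RIGHTS ancillary data.
 * Returns 1 on success or a negative errno-style value on failure. */
static int send_state_with_fds(int sock, const char *state, const int *fds, unsigned n_fds) {
        union {
                struct cmsghdr align; /* forces correct alignment of buf */
                char buf[CMSG_SPACE(sizeof(int) * MAX_FDS)];
        } control;
        struct iovec iov;
        struct msghdr msg;

        if (!state || (n_fds > 0 && !fds))
                return -EINVAL;
        if (n_fds > MAX_FDS)
                return -E2BIG;

        memset(&msg, 0, sizeof(msg));
        iov.iov_base = (void *) state;
        iov.iov_len = strlen(state);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        if (n_fds > 0) {
                struct cmsghdr *cmsg;

                memset(&control, 0, sizeof(control));
                msg.msg_control = control.buf;
                msg.msg_controllen = CMSG_SPACE(sizeof(int) * n_fds);

                cmsg = CMSG_FIRSTHDR(&msg);
                cmsg->cmsg_level = SOL_SOCKET;
                cmsg->cmsg_type = SCM_RIGHTS;
                cmsg->cmsg_len = CMSG_LEN(sizeof(int) * n_fds);
                memcpy(CMSG_DATA(cmsg), fds, sizeof(int) * n_fds);
        }

        /* With n_fds == 0 no ancillary data is attached at all. */
        if (sendmsg(sock, &msg, MSG_NOSIGNAL) < 0)
                return -errno;

        return 1;
}
```

In the sketch, passing n_fds as 0 sends a plain datagram, analogous to how sd_pid_notify_with_fds() degrades to sd_pid_notify().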
sd_pid_notifyf_with_fds() is a combination of sd_pid_notify_with_fds() and sd_notifyf(), i.e. it accepts both a PID and a set of file descriptors as input, and processes a format string to generate the state string.
sd_notify_barrier() allows the caller to synchronize against reception of previously sent notification messages and uses the BARRIER=1 command. It takes a relative timeout value in microseconds which is passed to ppoll(2). A value of UINT64_MAX is interpreted as infinite timeout.
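To make the timeout semantics concrete, here is a minimal sketch of how a relative microsecond value might be turned into the struct timespec that ppoll(2) takes. usec_to_ppoll_timeout() is a hypothetical helper and is not claimed to match the library's internal code. It relies only on the documented rule that UINT64_MAX means an infinite timeout, and on ppoll() blocking indefinitely when given a NULL timeout.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: converts a relative timeout in microseconds into the
 * form expected by ppoll(2). Returns NULL for UINT64_MAX (wait forever),
 * otherwise fills in *ts and returns it. */
static const struct timespec *usec_to_ppoll_timeout(uint64_t usec, struct timespec *ts) {
        if (usec == UINT64_MAX)
                return NULL;

        ts->tv_sec = (time_t) (usec / 1000000ULL);             /* whole seconds */
        ts->tv_nsec = (long) ((usec % 1000000ULL) * 1000ULL);  /* remainder in ns */
        return ts;
}
```

A five second barrier timeout, for example, is written as 5 * 1000000 microseconds and becomes tv_sec = 5, tv_nsec = 0.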
sd_pid_notify_barrier() is just like sd_notify_barrier(), but allows specifying the originating PID for the notification message.
WELL-KNOWN ASSIGNMENTS
The following assignments have a defined meaning:
READY=1
RELOADING=1
Added in version 217.
STOPPING=1
Added in version 217.
MONOTONIC_USEC=...
Added in version 253.
STATUS=...
Added in version 233.
NOTIFYACCESS=...
Added in version 254.
ERRNO=...
Added in version 233.
BUSERROR=...
Added in version 233.
EXIT_STATUS=...
Added in version 254.
MAINPID=...
Added in version 233.
WATCHDOG=1
WATCHDOG=trigger
Added in version 243.
WATCHDOG_USEC=...
Added in version 233.
EXTEND_TIMEOUT_USEC=...
Added in version 236.
FDSTORE=1
The service manager will accept messages for a service only if its FileDescriptorStoreMax= setting is non-zero (defaults to zero, see systemd.service(5)). The service manager will set the $FDSTORE environment variable for services that have the file descriptor store enabled, see systemd.exec(5).
If FDPOLL=0 is not set and the file descriptors are pollable (see epoll_ctl(2)), then any EPOLLHUP or EPOLLERR event seen on them will result in their automatic removal from the store.
Multiple sets of file descriptors may be sent in separate messages, in which case the sets are combined. The service manager removes duplicate file descriptors (those pointing to the same object) before passing them to the service.
This functionality should be used to implement services that can restart after an explicit request or a crash without losing state. Application state can either be serialized to a file in /run/, or better, stored in a memfd_create(2) memory file descriptor. Use sd_pid_notify_with_fds() to send messages with "FDSTORE=1". It is recommended to combine FDSTORE= with FDNAME= to make it easier to manage the stored file descriptors.
For further information on the file descriptor store see the File Descriptor Store[1] overview.
Added in version 219.
FDSTOREREMOVE=1
Added in version 236.
FDNAME=...
The name may consist of arbitrary ASCII characters except control characters or ":". It may not be longer than 255 characters. If a submitted name does not follow these restrictions, it is ignored.
Note that if multiple file descriptors are submitted in a single message, the specified name will be used for all of them. In order to assign different names to submitted file descriptors, submit them in separate messages.
Added in version 233.
FDPOLL=0
Added in version 246.
BARRIER=1
Added in version 246.
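The naming restrictions listed for FDNAME= above can be checked on the sending side before a message is submitted. The following is a sketch only: fdname_looks_valid() and FDNAME_MAX_LEN are hypothetical names, and the function encodes just the rules stated in the text (ASCII only, no control characters, no ":", at most 255 characters). The text does not say whether an empty name is acceptable, so the sketch lets it through.

```c
#include <stdbool.h>
#include <stddef.h>

#define FDNAME_MAX_LEN 255 /* limit quoted in the text above */

/* Hypothetical helper: returns true if name satisfies the documented
 * FDNAME= restrictions. */
static bool fdname_looks_valid(const char *name) {
        size_t len = 0;

        if (!name)
                return false;

        for (const unsigned char *p = (const unsigned char *) name; *p; p++, len++) {
                if (len >= FDNAME_MAX_LEN)      /* this would be character 256 */
                        return false;
                if (*p < 0x20 || *p >= 0x7f)    /* control, DEL or non-ASCII byte */
                        return false;
                if (*p == ':')
                        return false;
        }

        return true;
}
```

Since an invalid name is ignored rather than rejected with an error, a check of this kind is one way to notice the problem early.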
The notification messages sent by services are interpreted by the service manager. Unknown assignments are ignored. Thus, it is safe (but often without effect) to send assignments which are not in this list. The protocol is extensible, but care should be taken to ensure private extensions are recognizable as such. Specifically, it is recommended to prefix them with "X_" followed by some namespace identifier. The service manager also sends some messages to its notification socket, which may then be consumed by a supervising machine or container manager further up the stack. The service manager sends a number of extension fields, for example X_SYSTEMD_UNIT_ACTIVE=; for details see systemd(1).
RETURN VALUE
On failure, these calls return a negative errno-style error code. If $NOTIFY_SOCKET was not set and hence no status message could be sent, 0 is returned. If the status was sent, these functions return a positive value. In order to support both service managers that implement this scheme and those which do not, it is generally recommended to ignore the return value of this call. Note that the return value simply indicates whether the notification message was enqueued properly; it does not reflect whether the message could be processed successfully. Specifically, no error is returned when an attempt is made to store a file descriptor using FDSTORE=1 but the service is not actually configured to permit storing of file descriptors (see above).
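The three return value classes can be handled as in the sketch below. The enum and both functions are hypothetical names introduced for illustration. The logging helper never aborts, in line with the recommendation not to treat the return value as fatal.

```c
#include <stdio.h>
#include <string.h>

enum notify_outcome {
        NOTIFY_FAILED,    /* negative errno-style code */
        NOTIFY_NO_SOCKET, /* 0: $NOTIFY_SOCKET was not set */
        NOTIFY_SENT,      /* positive: message was enqueued */
};

/* Hypothetical helper: maps a return value r to one of the three classes. */
static enum notify_outcome classify_notify_result(int r) {
        if (r < 0)
                return NOTIFY_FAILED;
        if (r == 0)
                return NOTIFY_NO_SOCKET;
        return NOTIFY_SENT;
}

/* Hypothetical helper: logs the outcome to out and lets the caller carry on. */
static void log_notify_result(FILE *out, int r) {
        switch (classify_notify_result(r)) {
        case NOTIFY_FAILED:
                fprintf(out, "notification failed: %s\n", strerror(-r));
                break;
        case NOTIFY_NO_SOCKET:
                fprintf(out, "$NOTIFY_SOCKET not set, nothing sent\n");
                break;
        case NOTIFY_SENT:
                fprintf(out, "notification enqueued (not necessarily processed)\n");
                break;
        }
}
```

A caller might write log_notify_result(stderr, sd_notify(0, "READY=1")) and continue regardless of the outcome.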
NOTES
Functions described here are available as a shared library, which can be compiled against and linked to with the libsystemd pkg-config(1) file.
The code described here uses getenv(3), which is declared to be not multi-thread-safe. This means that the code calling the functions described here must not call setenv(3) from a parallel thread. It is recommended to only do calls to setenv() from an early phase of the program when no other threads have been started.
These functions send a single datagram with the state string as payload to the socket referenced in the $NOTIFY_SOCKET environment variable. If the first character of $NOTIFY_SOCKET is "/" or "@", the string is understood as an AF_UNIX or Linux abstract namespace socket (respectively), and in both cases the datagram is accompanied by the process credentials of the sending service, using SCM_CREDENTIALS.
If the string starts with "vsock:" then the string is understood as an AF_VSOCK address, which is useful for hypervisors/VMMs or other processes on the host to receive a notification when a virtual machine has finished booting. Note that in case the hypervisor does not support SOCK_DGRAM over AF_VSOCK, SOCK_SEQPACKET will be used instead. "vsock-stream", "vsock-dgram" and "vsock-seqpacket" can be used instead of "vsock" to force usage of the corresponding socket type. The address should be in the form: "vsock:CID:PORT". Note that unlike other uses of vsock, the CID is mandatory and cannot be "VMADDR_CID_ANY". Note that PID1 will send the VSOCK packets from a privileged port (i.e. lower than 1024), as an attempt to address concerns that unprivileged processes in the guest might try to send malicious notifications to the host, driving it to make destructive decisions based on them.
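The address forms described above can be told apart with a small parser. This is a sketch under stated assumptions: struct notify_addr and parse_notify_socket() are hypothetical names, the sketch does not record which vsock socket type was requested, and it rejects a CID of UINT32_MAX on the assumption that this is the numeric value of VMADDR_CID_ANY.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum notify_addr_kind { NOTIFY_ADDR_UNIX_PATH, NOTIFY_ADDR_UNIX_ABSTRACT, NOTIFY_ADDR_VSOCK };

struct notify_addr {
        enum notify_addr_kind kind;
        const char *unix_name; /* path, or abstract name without the leading '@' */
        uint32_t cid, port;    /* only meaningful for NOTIFY_ADDR_VSOCK */
};

/* Parses a decimal 32-bit field that must be followed by terminator. */
static int parse_u32_field(const char *s, char terminator, const char **rest, uint32_t *ret) {
        char *end;
        unsigned long long v;

        if (*s < '0' || *s > '9') /* rejects empty fields, signs and whitespace */
                return -EINVAL;
        errno = 0;
        v = strtoull(s, &end, 10);
        if (errno != 0 || v > UINT32_MAX || *end != terminator)
                return -EINVAL;
        *ret = (uint32_t) v;
        *rest = end;
        return 0;
}

/* Hypothetical helper: classifies the value of $NOTIFY_SOCKET.
 * Returns 0 on success or a negative errno-style value. */
static int parse_notify_socket(const char *s, struct notify_addr *ret) {
        static const char *const prefixes[] = {
                "vsock:", "vsock-stream:", "vsock-dgram:", "vsock-seqpacket:",
        };

        if (!s || !ret || s[0] == '\0')
                return -EINVAL;
        memset(ret, 0, sizeof(*ret));

        if (s[0] == '/') {
                ret->kind = NOTIFY_ADDR_UNIX_PATH;
                ret->unix_name = s;
                return 0;
        }
        if (s[0] == '@') {
                ret->kind = NOTIFY_ADDR_UNIX_ABSTRACT;
                ret->unix_name = s + 1;
                return 0;
        }

        for (size_t i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
                size_t n = strlen(prefixes[i]);
                const char *p;

                if (strncmp(s, prefixes[i], n) != 0)
                        continue;

                /* Expect CID:PORT after the prefix; the CID is mandatory. */
                if (parse_u32_field(s + n, ':', &p, &ret->cid) < 0 ||
                    parse_u32_field(p + 1, '\0', &p, &ret->port) < 0)
                        return -EINVAL;
                if (ret->cid == UINT32_MAX) /* assumed value of VMADDR_CID_ANY */
                        return -EINVAL;
                ret->kind = NOTIFY_ADDR_VSOCK;
                return 0;
        }

        return -EAFNOSUPPORT;
}
```

For an abstract socket, the caller would replace the leading "@" with a NUL byte when filling in sun_path.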
Standalone Implementations
Note that, while using this library should be preferred in order to avoid code duplication, it is also possible to reimplement the simple readiness notification protocol without external dependencies, as demonstrated in the following self-contained examples in several languages:
C
/* SPDX-License-Identifier: MIT-0 */

/* Implement the systemd notify protocol without external dependencies.
 * Supports both readiness notification on startup and on reloading,
 * according to the protocol defined at:
 * https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html
 * This protocol is guaranteed to be stable as per:
 * https://systemd.io/PORTABILITY_AND_STABILITY/ */

#define _GNU_SOURCE 1
#include <errno.h>
#include <inttypes.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h> /* strlen(), memcpy(), strerror() */
#include <sys/socket.h>
#include <sys/un.h>
#include <time.h>
#include <unistd.h>

#define _cleanup_(f) __attribute__((cleanup(f)))

static void closep(int *fd) {
  if (!fd || *fd < 0)
    return;

  close(*fd);
  *fd = -1;
}

static int notify(const char *message) {
  union sockaddr_union {
    struct sockaddr sa;
    struct sockaddr_un sun;
  } socket_addr = {
    .sun.sun_family = AF_UNIX,
  };
  size_t path_length, message_length;
  _cleanup_(closep) int fd = -1;
  const char *socket_path;

  /* Verify the argument first */
  if (!message)
    return -EINVAL;

  message_length = strlen(message);
  if (message_length == 0)
    return -EINVAL;

  /* If the variable is not set, the protocol is a noop */
  socket_path = getenv("NOTIFY_SOCKET");
  if (!socket_path)
    return 0; /* Not set? Nothing to do */

  /* Only AF_UNIX is supported, with path or abstract sockets */
  if (socket_path[0] != '/' && socket_path[0] != '@')
    return -EAFNOSUPPORT;

  path_length = strlen(socket_path);
  /* Ensure there is room for NUL byte */
  if (path_length >= sizeof(socket_addr.sun.sun_path))
    return -E2BIG;

  memcpy(socket_addr.sun.sun_path, socket_path, path_length);

  /* Support for abstract socket */
  if (socket_addr.sun.sun_path[0] == '@')
    socket_addr.sun.sun_path[0] = 0;

  fd = socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0);
  if (fd < 0)
    return -errno;

  if (connect(fd, &socket_addr.sa, offsetof(struct sockaddr_un, sun_path) + path_length) != 0)
    return -errno;

  ssize_t written = write(fd, message, message_length);
  if (written != (ssize_t) message_length)
    return written < 0 ? -errno : -EPROTO;

  return 1; /* Notified! */
}

static int notify_ready(void) {
  return notify("READY=1");
}

static int notify_reloading(void) {
  /* A buffer with length sufficient to format the maximum UINT64 value. */
  char reload_message[sizeof("RELOADING=1\nMONOTONIC_USEC=18446744073709551615")];
  struct timespec ts;
  uint64_t now;

  /* Notify systemd that we are reloading, including a CLOCK_MONOTONIC timestamp in usec
   * so that the program is compatible with a Type=notify-reload service. */

  if (clock_gettime(CLOCK_MONOTONIC, &ts) < 0)
    return -errno;

  if (ts.tv_sec < 0 || ts.tv_nsec < 0 ||
      (uint64_t) ts.tv_sec > (UINT64_MAX - (ts.tv_nsec / 1000ULL)) / 1000000ULL)
    return -EINVAL;

  now = (uint64_t) ts.tv_sec * 1000000ULL + (uint64_t) ts.tv_nsec / 1000ULL;

  if (snprintf(reload_message, sizeof(reload_message), "RELOADING=1\nMONOTONIC_USEC=%" PRIu64, now) < 0)
    return -EINVAL;

  return notify(reload_message);
}

static int notify_stopping(void) {
  return notify("STOPPING=1");
}

static volatile sig_atomic_t reloading = 0;
static volatile sig_atomic_t terminating = 0;

static void signal_handler(int sig) {
  if (sig == SIGHUP)
    reloading = 1;
  else if (sig == SIGINT || sig == SIGTERM)
    terminating = 1;
}

int main(int argc, char **argv) {
  struct sigaction sa = {
    .sa_handler = signal_handler,
    .sa_flags = SA_RESTART,
  };
  int r;

  /* Setup signal handlers */
  sigemptyset(&sa.sa_mask);
  sigaction(SIGHUP, &sa, NULL);
  sigaction(SIGINT, &sa, NULL);
  sigaction(SIGTERM, &sa, NULL);

  /* Do more service initialization work here ... */

  /* Now that all the preparations steps are done, signal readiness */
  r = notify_ready();
  if (r < 0) {
    fprintf(stderr, "Failed to notify readiness to $NOTIFY_SOCKET: %s\n", strerror(-r));
    return EXIT_FAILURE;
  }

  while (!terminating) {
    if (reloading) {
      reloading = false;

      /* As a separate but related feature, we can also notify the manager
       * when reloading configuration. This allows accurate state-tracking,
       * and also automated hook-in of 'systemctl reload' without having to
       * specify manually an ExecReload= line in the unit file. */
      r = notify_reloading();
      if (r < 0) {
        fprintf(stderr, "Failed to notify reloading to $NOTIFY_SOCKET: %s\n", strerror(-r));
        return EXIT_FAILURE;
      }

      /* Do some reconfiguration work here ... */

      r = notify_ready();
      if (r < 0) {
        fprintf(stderr, "Failed to notify readiness to $NOTIFY_SOCKET: %s\n", strerror(-r));
        return EXIT_FAILURE;
      }
    }

    /* Do some daemon work here ... */
    sleep(5);
  }

  r = notify_stopping();
  if (r < 0) {
    fprintf(stderr, "Failed to report termination to $NOTIFY_SOCKET: %s\n", strerror(-r));
    return EXIT_FAILURE;
  }

  /* Do some shutdown work here ... */

  return EXIT_SUCCESS;
}
Python
#!/usr/bin/env python3
# SPDX-License-Identifier: MIT-0
#
# Implement the systemd notify protocol without external dependencies.
# Supports both readiness notification on startup and on reloading,
# according to the protocol defined at:
# https://www.freedesktop.org/software/systemd/man/latest/sd_notify.html
# This protocol is guaranteed to be stable as per:
# https://systemd.io/PORTABILITY_AND_STABILITY/

import errno
import os
import signal
import socket
import sys
import time

reloading = False
terminating = False

def notify(message):
    if not message:
        raise ValueError("notify() requires a message")

    socket_path = os.environ.get("NOTIFY_SOCKET")
    if not socket_path:
        return

    if socket_path[0] not in ("/", "@"):
        raise OSError(errno.EAFNOSUPPORT, "Unsupported socket type")

    # Handle abstract socket.
    if socket_path[0] == "@":
        socket_path = "\0" + socket_path[1:]

    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM | socket.SOCK_CLOEXEC) as sock:
        sock.connect(socket_path)
        sock.sendall(message)

def notify_ready():
    notify(b"READY=1")

def notify_reloading():
    microsecs = time.clock_gettime_ns(time.CLOCK_MONOTONIC) // 1000
    notify(f"RELOADING=1\nMONOTONIC_USEC={microsecs}".encode())

def notify_stopping():
    notify(b"STOPPING=1")

def reload(signum, frame):
    global reloading
    reloading = True

def terminate(signum, frame):
    global terminating
    terminating = True

def main():
    print("Doing initial setup")
    global reloading, terminating

    # Set up signal handlers.
    print("Setting up signal handlers")
    signal.signal(signal.SIGHUP, reload)
    signal.signal(signal.SIGINT, terminate)
    signal.signal(signal.SIGTERM, terminate)

    # Do any other setup work here.

    # Once all setup is done, signal readiness.
    print("Done setting up")
    notify_ready()

    print("Starting loop")
    while not terminating:
        if reloading:
            print("Reloading")
            reloading = False

            # Support notifying the manager when reloading configuration.
            # This allows accurate state tracking as well as automatically
            # enabling 'systemctl reload' without needing to manually
            # specify an ExecReload= line in the unit file.
            notify_reloading()

            # Do some reconfiguration work here.

            print("Done reloading")
            notify_ready()

        # Do the real work here ...
        print("Sleeping for five seconds")
        time.sleep(5)

    print("Terminating")
    notify_stopping()

if __name__ == "__main__":
    sys.stdout.reconfigure(line_buffering=True)
    print("Starting app")
    main()
    print("Stopped app")
ENVIRONMENT
$NOTIFY_SOCKET
EXAMPLES
Example 1. Start-up Notification
When a service has finished starting up, it might issue the following call to notify the service manager:
sd_notify(0, "READY=1");
Example 2. Extended Start-up Notification
A service could send the following after completing initialization:
sd_notifyf(0, "READY=1\n"
              "STATUS=Processing requests...\n"
              "MAINPID=%lu",
              (unsigned long) getpid());
Example 3. Error Cause Notification
A service could send the following shortly before exiting, on failure:
sd_notifyf(0, "STATUS=Failed to start up: %s\n"
              "ERRNO=%i",
              strerror_r(errnum, (char[1024]){}, 1024),
              errnum);
Example 4. Store a File Descriptor in the Service Manager
To store an open file descriptor in the service manager, in order to continue operation after a service restart without losing state, use "FDSTORE=1":
sd_pid_notify_with_fds(0, 0, "FDSTORE=1\nFDNAME=foobar", &fd, 1);
Example 5. Eliminating race conditions
When the client sending the notifications is not spawned by the service manager, it may exit too quickly and the service manager may fail to attribute them correctly to the unit. To prevent such races, use sd_notify_barrier() to synchronize against reception of all notifications sent before this call is made.
sd_notify(0, "READY=1");
/* set timeout to 5 seconds */
sd_notify_barrier(0, 5 * 1000000);
HISTORY
sd_pid_notify(), sd_pid_notifyf(), and sd_pid_notify_with_fds() were added in version 219.
sd_notify_barrier() was added in version 246.
sd_pid_notifyf_with_fds() and sd_pid_notify_barrier() were added in version 254.
SEE ALSO
systemd(1), sd-daemon(3), sd_listen_fds(3), sd_listen_fds_with_names(3), sd_watchdog_enabled(3), daemon(7), systemd.service(5)
NOTES
1. File Descriptor Store
systemd 256.7