SAUNAFS-URAFT.CFG(5)
NAME¶
saunafs-uraft.cfg - main configuration file for saunafs-uraft
DESCRIPTION¶
The file saunafs-uraft.cfg contains the configuration of the SaunaFS HA suite. This configuration is consumed by the saunafs-uraft daemon.
SYNTAX¶
Syntax is:
OPTION = VALUE
Lines starting with # character are ignored.
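For instance, a fragment in this syntax (option names from this page, values illustrative):

```
# This line is a comment and is ignored.
URAFT_STATUS_PORT = 9428
URAFT_ID = 0
```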
OPTIONS¶
Configuration options:
URAFT_NODE_ADDRESS
Address of a node in the cluster, given as a hostname or IP address with an optional port. This option should be specified multiple times, once for every node in the cluster.
Example:
URAFT_NODE_ADDRESS = node1:9427
URAFT_NODE_ADDRESS = node2
URAFT_NODE_ADDRESS = 192.168.0.1:9427
URAFT_ID
Identifies the node on which this uraft instance runs. Node numbers start from 0 and follow the same order as the URAFT_NODE_ADDRESS entries. For example, if this configuration resides on the node with hostname node2 (the second entry above), its URAFT_ID should be set to 1.
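The ID lookup described above can be sketched as a shell snippet. The node list and hostname below are the hypothetical ones from the example; in a real script you would take HOST from hostname(1) and keep NODES in the same order as the URAFT_NODE_ADDRESS entries.

```shell
# Hostnames from URAFT_NODE_ADDRESS, in config-file order (assumed values).
NODES="node1 node2 node3"
# Normally: HOST=$(hostname -s); fixed here for illustration.
HOST="node2"

# Walk the list; URAFT_ID is the zero-based position of this host.
ID=0
for n in $NODES; do
  if [ "$n" = "$HOST" ]; then
    break
  fi
  ID=$((ID + 1))
done
echo "URAFT_ID = $ID"
```

With the list above, node2 is the second entry, so the snippet prints URAFT_ID = 1.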
LOCAL_MASTER_ADDRESS
LOCAL_MASTER_MATOCL_PORT
URAFT_PORT
URAFT_STATUS_PORT
URAFT_FLOATING_IP
URAFT_FLOATING_NETMASK
URAFT_FLOATING_IFACE
Important: If the network interface used for the floating IP becomes unavailable, the failover
mechanism will not restore it. As a result, the floating IP cannot be recovered, causing SaunaFS
services that depend on it to become unavailable.
The failover mechanism is based on the URAFT_FLOATING_IP_CHECK_PERIOD option.
See its documentation for more details.
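Since the failover mechanism will not restore the floating IP after the interface itself recovers, an operator may need to re-add the address manually on the current Leader. A sketch using the standard ip(8) syntax, assuming the floating IP, netmask, and interface from Example 1 below (saunafs-uraft may manage the address differently internally):

```shell
# Values assumed from Example 1; take them from your saunafs-uraft.cfg.
FLOATING_IP=10.0.0.100
FLOATING_NETMASK=24
FLOATING_IFACE=eth0

# Build the ip(8) command that re-adds the floating address; print it
# here rather than executing, so it can be reviewed first.
CMD="ip addr add ${FLOATING_IP}/${FLOATING_NETMASK} dev ${FLOATING_IFACE}"
echo "$CMD"
```

Run the printed command (as root, on the current Leader only) once the interface is back up.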
ELECTION_TIMEOUT_MIN
ELECTION_TIMEOUT_MAX
URAFT_ELECTOR_MODE
When enabled, this node participates in voting for a new leader but cannot itself be elected leader. When a node runs in elector mode, the metadata service is not required, which reduces its memory consumption.
HEARTBEAT_PERIOD
LOCAL_MASTER_CHECK_PERIOD
How often the daemon checks whether the local master is alive. This value drives how frequently the daemon calls saunafs-uraft-helper isalive on metadata-capable nodes. (default: 250)
URAFT_FLOATING_IP_CHECK_PERIOD
How often the daemon checks whether the floating IP is alive. A value of 0 disables floating IP monitoring and its automatic recovery mechanism. Adjust this setting based on network stability and failover requirements. (default: 500)
URAFT_CHECK_CMD_PERIOD
URAFT_GETVERSION_TIMEOUT
Timeout for the metadata version query performed by saunafs-uraft-helper. If this timeout is too low, the daemon may repeatedly log timeouts and keep a node blocked for promotion (blocked_promote=1), preventing elections. (default: 100)
URAFT_PROMOTE_TIMEOUT
URAFT_DEMOTE_TIMEOUT
URAFT_DEAD_HANDLER_TIMEOUT
QUORUM_LOSS_GRACE_HEARTBEATS
This parameter defines how many heartbeats may be missed in a row before the system declares quorum loss and demotes the current Leader. Increasing this value helps tolerate short-lived network glitches or transient latency spikes, reducing unnecessary demotions and improving overall cluster stability. The default value is 5.
TIMING RELATIONSHIPS¶
Raft-style leader election relies on the relationship between heartbeat and election timeouts. At a high level: the Leader sends a heartbeat every HEARTBEAT_PERIOD; a follower that stops receiving heartbeats waits a random election timeout chosen between ELECTION_TIMEOUT_MIN and ELECTION_TIMEOUT_MAX before starting an election.
Practical guidance:
ELECTION_TIMEOUT_MIN >= 10 * HEARTBEAT_PERIOD
ELECTION_TIMEOUT_MAX should stay within a small multiple of ELECTION_TIMEOUT_MIN, but not so close that elections synchronize.
demotion window ~= QUORUM_LOSS_GRACE_HEARTBEATS * HEARTBEAT_PERIOD
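The guidance above can be sanity-checked mechanically. A sketch using the Example 1 values from this page (shell arithmetic only; adapt the variables to your own configuration):

```shell
# Timing values assumed from Example 1 (LAN configuration).
HEARTBEAT_PERIOD=20
ELECTION_TIMEOUT_MIN=400
ELECTION_TIMEOUT_MAX=600
QUORUM_LOSS_GRACE_HEARTBEATS=10

# Rule of thumb: ELECTION_TIMEOUT_MIN >= 10 * HEARTBEAT_PERIOD.
RATIO_OK=0
if [ "$ELECTION_TIMEOUT_MIN" -ge $((10 * HEARTBEAT_PERIOD)) ]; then
  RATIO_OK=1
fi

# Demotion window ~= QUORUM_LOSS_GRACE_HEARTBEATS * HEARTBEAT_PERIOD.
DEMOTION_WINDOW=$((QUORUM_LOSS_GRACE_HEARTBEATS * HEARTBEAT_PERIOD))

echo "ratio ok: $RATIO_OK, demotion window: $DEMOTION_WINDOW"
```

Here 400 >= 10 * 20 holds, and the Leader tolerates roughly a 200-unit window of missed heartbeats before demotion.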
EXAMPLES¶
The examples below show complete configurations for three common environments. Each node uses the same file contents except for URAFT_ID (and optionally URAFT_ELECTOR_MODE).
Example 1: LAN / Low Packet Loss¶
This configuration targets low-latency LAN environments where packet loss is rare and leader churn is unlikely.
# Cluster membership
URAFT_NODE_ADDRESS = 10.0.0.11:9427
URAFT_NODE_ADDRESS = 10.0.0.12:9427
URAFT_NODE_ADDRESS = 10.0.0.13:9427

# Per-node setting (0, 1, or 2 depending on which node)
URAFT_ID = 0

# Local metadata service
LOCAL_MASTER_ADDRESS = localhost
LOCAL_MASTER_MATOCL_PORT = 9421
LOCAL_MASTER_CHECK_PERIOD = 250

# Raft timing
HEARTBEAT_PERIOD = 20
ELECTION_TIMEOUT_MIN = 400
ELECTION_TIMEOUT_MAX = 600
QUORUM_LOSS_GRACE_HEARTBEATS = 10

# Helper timeouts
URAFT_GETVERSION_TIMEOUT = 200

# Status endpoint
URAFT_STATUS_PORT = 9428

# Floating IP
URAFT_FLOATING_IP = 10.0.0.100
URAFT_FLOATING_NETMASK = 24
URAFT_FLOATING_IFACE = eth0
URAFT_FLOATING_IP_CHECK_PERIOD = 500
Example 2: WAN/VPN / Higher Loss and Jitter¶
This configuration targets deployments where voting nodes communicate across a WAN or where the floating IP is effectively reached via VPN.
Symptoms of overly aggressive timing in these environments include leader ping-pong (rapid leadership churn) and prolonged periods without a stable Leader. The strategy is to slow the timing down: a longer heartbeat period, wider election timeouts, a larger quorum-loss grace, and more generous helper timeouts.
# Cluster membership (example with 5 voters)
URAFT_NODE_ADDRESS = 172.18.0.4:9427
URAFT_NODE_ADDRESS = 172.18.0.5:9427
URAFT_NODE_ADDRESS = 172.18.0.6:9427
URAFT_NODE_ADDRESS = 172.18.0.7:9427
URAFT_NODE_ADDRESS = 172.18.0.8:9427

# Per-node setting
URAFT_ID = 0

# Optional: elector nodes (set URAFT_ELECTOR_MODE=1 on electors)
URAFT_ELECTOR_MODE = 0

# Local metadata service
LOCAL_MASTER_ADDRESS = localhost
LOCAL_MASTER_MATOCL_PORT = 9421
LOCAL_MASTER_CHECK_PERIOD = 500

# Raft timing (slower but more stable over lossy links)
HEARTBEAT_PERIOD = 100
ELECTION_TIMEOUT_MIN = 3000
ELECTION_TIMEOUT_MAX = 4500
QUORUM_LOSS_GRACE_HEARTBEATS = 30

# Helper timeouts (avoid timeouts during transient stalls)
URAFT_GETVERSION_TIMEOUT = 1000

# Status endpoint
URAFT_STATUS_PORT = 9428

# Floating IP
URAFT_FLOATING_IP = 172.18.0.10
URAFT_FLOATING_NETMASK = 32
URAFT_FLOATING_IFACE = lo
URAFT_FLOATING_IP_CHECK_PERIOD = 1000
Example 3: 5-Node Topology (3 Metadata + 2 Electors)¶
This example shows a common production topology: three metadata-capable nodes plus two lightweight elector-only nodes. Keeping the metadata service off the elector nodes reduces the risk that a metadata node becomes unavailable due to resource pressure, while still increasing the number of voters to improve resilience.
Important: Elector nodes are still voting members. They must have stable network connectivity to the metadata nodes; otherwise they can contribute to leader churn.
All nodes share the same base configuration. Per-node differences are:
# Cluster membership (5 voters total)
URAFT_NODE_ADDRESS = 10.0.0.11:9427  # metadata
URAFT_NODE_ADDRESS = 10.0.0.12:9427  # metadata
URAFT_NODE_ADDRESS = 10.0.0.13:9427  # metadata
URAFT_NODE_ADDRESS = 10.0.0.21:9427  # elector-only
URAFT_NODE_ADDRESS = 10.0.0.22:9427  # elector-only

# Per-node setting
URAFT_ID = 0

# Set to 0 on metadata nodes (IDs 0-2), set to 1 on elector nodes (IDs 3-4)
URAFT_ELECTOR_MODE = 0

# Local metadata service (used only on metadata nodes)
LOCAL_MASTER_ADDRESS = localhost
LOCAL_MASTER_MATOCL_PORT = 9421
LOCAL_MASTER_CHECK_PERIOD = 250

# Raft timing (LAN-friendly defaults)
HEARTBEAT_PERIOD = 20
ELECTION_TIMEOUT_MIN = 400
ELECTION_TIMEOUT_MAX = 600
QUORUM_LOSS_GRACE_HEARTBEATS = 10

# Helper timeouts
URAFT_GETVERSION_TIMEOUT = 200

# Status endpoint
URAFT_STATUS_PORT = 9428

# Floating IP (managed by the Leader, which must be a metadata node)
URAFT_FLOATING_IP = 10.0.0.100
URAFT_FLOATING_NETMASK = 24
URAFT_FLOATING_IFACE = eth0
URAFT_FLOATING_IP_CHECK_PERIOD = 500
TROUBLESHOOTING¶
Repeated "Isalive timeout" lines
The local master liveness check is timing out; verify that the local metadata service responds in time and see LOCAL_MASTER_CHECK_PERIOD.
Nodes stuck with blocked_promote=1
The most common causes are repeated isalive timeouts or a repeated dead status. If the log also shows getversion timeouts, consider raising URAFT_GETVERSION_TIMEOUT.
Leader churn / ping-pong leadership
Usually a sign of overly aggressive timing; see the TIMING RELATIONSHIPS section and consider raising ELECTION_TIMEOUT_MIN, ELECTION_TIMEOUT_MAX, and QUORUM_LOSS_GRACE_HEARTBEATS.
REPORTING BUGS¶
Report bugs to the GitHub repository <https://github.com/leil-io/saunafs> as an issue.
COPYRIGHT¶
Copyright 2008-2009 Gemius SA
Copyright 2013-2019 Skytechnology sp. z o.o.
Copyright 2023 Leil Storage OÜ
SaunaFS is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 3.
SaunaFS is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with SaunaFS. If not, see <http://www.gnu.org/licenses/>.
SEE ALSO¶
2026-03-27