
Keepalived VRRP: Automatic Failover for Load Balancers

Riku Tanaka · 25 min read

You set up HAProxy or Nginx as a load balancer, and now your application servers have redundancy. But the load balancer itself is a single point of failure. If it goes down, everything behind it becomes unreachable. Keepalived solves this by providing automatic failover between two or more load balancer nodes using the VRRP protocol. Failover completes within seconds and requires zero client-side configuration changes.

This guide covers the VRRP protocol in depth, complete Keepalived configurations for both master and backup nodes, health check scripting, virtual IP management, split-brain prevention, unicast mode for cloud environments, notification and alerting, integration with HAProxy and Nginx, multi-VIP setups, and a thorough production checklist.

Why You Need Failover

A single load balancer means one server failure takes your entire application offline. Even with the most reliable hardware, you will eventually face kernel panics, NIC failures, misconfigured firewall rules, botched upgrades, or simple power supply failures. The question is not whether your load balancer will fail -- it is when.

The cost of downtime depends on your business, but even a five-minute outage can mean lost revenue, damaged trust, and SLA violations. For any production system that matters, load balancer failover is not optional.

The Standard Architecture

The pattern is simple and proven:

                    DNS: example.com
                           |
                    Virtual IP (192.168.1.100)
                           |
                 +---------+---------+
                 |                   |
            LB1 (MASTER)       LB2 (BACKUP)
            192.168.1.10        192.168.1.20
            Keepalived          Keepalived
            HAProxy             HAProxy
                 |                   |
         +-------+-------+   +------+------+
         |       |       |   |      |      |
       App1    App2    App3  (same backends)

Clients and DNS records point to the virtual IP (VIP), not to any physical server. Keepalived ensures exactly one node holds that VIP at any time. When the master becomes unavailable, the backup takes ownership of the VIP within seconds. Clients reconnect transparently -- they never know a failover happened.

Failover Timeline

Here is what happens during a typical failover event:

Time   Event
T+0s   HAProxy crashes on LB1 (MASTER)
T+2s   Keepalived health check detects HAProxy is down (interval 2s)
T+4s   Second check confirms failure (fall threshold met)
T+6s   Third check confirms -- effective priority drops below BACKUP
T+7s   LB2 stops receiving VRRP advertisements from LB1
T+8s   LB2 sends gratuitous ARP, claims VIP 192.168.1.100
T+8s   LB2 begins serving traffic as new MASTER
T+8s   Network switches update MAC tables
T+9s   Existing TCP connections are reset; clients reconnect

Total failover time: approximately 6-10 seconds depending on health check intervals and advertisement timing. For most web applications, this is barely noticeable to users, since browsers and most HTTP clients retry requests that fail at connection time.

VRRP Protocol Deep Dive

VRRP (Virtual Router Redundancy Protocol) is defined in RFC 5798. It was designed specifically for router redundancy but works perfectly for load balancer failover. Understanding the protocol helps you debug issues and tune parameters.

Core Concepts

  • Virtual IP (VIP): A floating IP address that is not permanently assigned to any physical interface. The active MASTER node binds it to its network interface. When failover occurs, the new MASTER binds the VIP and sends a gratuitous ARP to update network switches.

  • VRID (Virtual Router ID): An identifier from 1 to 255 that groups nodes into a failover set. All nodes competing for the same VIP must share the same VRID. You can run multiple VRIDs on the same network for different VIPs.

  • Priority: A value from 1 to 254 that determines which node becomes MASTER. The highest priority wins. Default is 100. The value 255 is reserved for the IP address owner (the node whose physical IP matches the VIP).

  • Advertisement Interval: How often the MASTER announces it is alive. Default is 1 second. Shorter intervals mean faster failover detection but more network traffic.

  • Preemption: Whether a higher-priority node reclaims the VIP after recovering from a failure. Preemption is enabled by default but is often disabled in production to avoid unnecessary double-failovers.

How VRRP Elections Work

  1. All nodes in a VRID start up and compare priorities.
  2. The node with the highest effective priority (base priority plus/minus health check adjustments) becomes MASTER.
  3. The MASTER multicasts VRRP advertisements to 224.0.0.18 (or unicasts to configured peers) every advert_int seconds.
  4. BACKUP nodes listen for these advertisements. If a BACKUP misses (3 * advert_int) + skew_time worth of advertisements, it assumes the MASTER is down.
  5. The BACKUP with the highest priority transitions to MASTER, binds the VIP, and sends gratuitous ARPs.
  6. If multiple BACKUPs exist, the highest-priority one wins. Ties are broken by the highest physical IP address.

The skew time prevents simultaneous transitions when multiple BACKUPs detect the MASTER is down. It is calculated as (256 - priority) / 256 seconds, meaning higher-priority nodes transition faster.
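As a sanity check, the master-down interval for a given priority can be computed with a short shell snippet (awk handles the fractional math):

```shell
# Master-down interval = (3 * advert_int) + skew_time, where for VRRPv2
# skew_time = (256 - priority) / 256 seconds.
advert_int=1
for priority in 254 110 100; do
    awk -v p="$priority" -v a="$advert_int" 'BEGIN {
        skew = (256 - p) / 256
        printf "priority=%3d  skew=%.3fs  master_down=%.3fs\n", p, skew, 3 * a + skew
    }'
done
```

Higher-priority nodes compute a shorter master-down interval, so they win the race to transition first.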

VRRP Packet Structure

A VRRP advertisement contains:

Field              Size      Content
Version            4 bits    VRRP version (2 for RFC 3768, 3 for RFC 5798)
Type               4 bits    Always 1 (ADVERTISEMENT)
Virtual Router ID  8 bits    VRID (1-255)
Priority           8 bits    Sender's priority (1-254)
Count IP Addrs     8 bits    Number of VIPs
Auth Type          8 bits    Authentication type (VRRPv2 only; removed in v3)
Advert Int         8 bits    Advertisement interval in seconds (VRRPv3 uses a 12-bit field in centiseconds)
Checksum           16 bits   Standard IP checksum
IP Addresses       Variable  List of VIPs

VRRP uses IP protocol number 112 (not TCP or UDP). This is important for firewall rules -- you must allow protocol 112, not a specific port.

Installing Keepalived

Ubuntu / Debian

sudo apt update
sudo apt install -y keepalived

RHEL / Rocky / Alma

sudo dnf install -y keepalived

Building from Source (Latest Features)

sudo apt install -y build-essential libssl-dev libnl-3-dev libnl-genl-3-dev
wget https://www.keepalived.org/software/keepalived-2.2.8.tar.gz
tar xzf keepalived-2.2.8.tar.gz
cd keepalived-2.2.8
./configure --prefix=/usr --sysconfdir=/etc --enable-json
make -j$(nproc)
sudo make install

Post-Installation Setup

Enable and start the service:

sudo systemctl enable keepalived
sudo systemctl start keepalived

Allow VRRP traffic through the firewall:

# iptables
sudo iptables -A INPUT -p vrrp -j ACCEPT
sudo iptables -A INPUT -d 224.0.0.18/32 -j ACCEPT

# firewalld
sudo firewall-cmd --add-protocol=vrrp --permanent
sudo firewall-cmd --add-rich-rule='rule family="ipv4" destination address="224.0.0.18" accept' --permanent
sudo firewall-cmd --reload

# ufw
sudo ufw allow in proto vrrp

Enable non-local IP binding so the node can bind the VIP even though it is not its own address:

# /etc/sysctl.d/99-keepalived.conf
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1

Apply with sudo sysctl -p /etc/sysctl.d/99-keepalived.conf.

Configuring MASTER and BACKUP Nodes

The configuration file lives at /etc/keepalived/keepalived.conf.

MASTER Node (LB1)

# /etc/keepalived/keepalived.conf on LB1

global_defs {
    router_id LB1
    enable_script_security
    script_user root
    max_auto_priority -1
    vrrp_garp_master_delay 5
    vrrp_garp_master_repeat 3
    vrrp_garp_master_refresh 60
    vrrp_garp_master_refresh_repeat 2
}

vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
    weight -20
    fall 3
    rise 2
    user root
}

vrrp_script chk_http {
    script "/etc/keepalived/check_http.sh"
    interval 3
    weight -30
    fall 2
    rise 3
    timeout 2
    user root
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 110
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass MySecret123
    }

    virtual_ipaddress {
        192.168.1.100/24 dev eth0 label eth0:vip
    }

    track_script {
        chk_haproxy
        chk_http
    }

    track_interface {
        eth0 weight -50
    }

    notify_master "/etc/keepalived/notify.sh MASTER VI_1"
    notify_backup "/etc/keepalived/notify.sh BACKUP VI_1"
    notify_fault  "/etc/keepalived/notify.sh FAULT VI_1"
    notify_stop   "/etc/keepalived/notify.sh STOP VI_1"

    # SMTP notifications (optional)
    smtp_alert
}

BACKUP Node (LB2)

# /etc/keepalived/keepalived.conf on LB2

global_defs {
    router_id LB2
    enable_script_security
    script_user root
    max_auto_priority -1
    vrrp_garp_master_delay 5
    vrrp_garp_master_repeat 3
    vrrp_garp_master_refresh 60
    vrrp_garp_master_refresh_repeat 2
}

vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
    weight -20
    fall 3
    rise 2
    user root
}

vrrp_script chk_http {
    script "/etc/keepalived/check_http.sh"
    interval 3
    weight -30
    fall 2
    rise 3
    timeout 2
    user root
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1

    authentication {
        auth_type PASS
        auth_pass MySecret123
    }

    virtual_ipaddress {
        192.168.1.100/24 dev eth0 label eth0:vip
    }

    track_script {
        chk_haproxy
        chk_http
    }

    track_interface {
        eth0 weight -50
    }

    notify_master "/etc/keepalived/notify.sh MASTER VI_1"
    notify_backup "/etc/keepalived/notify.sh BACKUP VI_1"
    notify_fault  "/etc/keepalived/notify.sh FAULT VI_1"
    notify_stop   "/etc/keepalived/notify.sh STOP VI_1"

    smtp_alert
}

The differences between MASTER and BACKUP configs are minimal: state, priority, and router_id. Everything else is identical. This makes configuration management straightforward.

Configuration Parameter Reference

Parameter          MASTER Value  BACKUP Value  Purpose
state              MASTER        BACKUP        Initial role on startup
priority           110           100           Election priority (higher wins)
router_id          LB1           LB2           Unique identifier in logs
virtual_router_id  51            51            Must match on both nodes
advert_int         1             1             Must match on both nodes
auth_pass          MySecret123   MySecret123   Must match on both nodes
interface          eth0          eth0          Must be the same network segment

Gratuitous ARP Tuning

The vrrp_garp_* parameters control how aggressively the new MASTER announces its ownership of the VIP:

Parameter                        Value  Purpose
vrrp_garp_master_delay           5      Delay (seconds) before first GARP after becoming MASTER
vrrp_garp_master_repeat          3      Number of GARP packets to send initially
vrrp_garp_master_refresh         60     Send periodic GARPs every 60 seconds
vrrp_garp_master_refresh_repeat  2      Number of GARP packets per refresh

Periodic GARP refresh is important in environments where upstream routers and hosts have short ARP cache timeouts, or where switches age out MAC table entries quickly. Without it, a device may "forget" the VIP's MAC mapping and fail to deliver traffic to the MASTER.

Virtual IP Addresses

The VIP is the address your DNS records point to. When the MASTER takes over, it sends a gratuitous ARP announcing it now owns that IP, causing network switches to update their MAC tables immediately.

Single VIP

virtual_ipaddress {
    192.168.1.100/24 dev eth0 label eth0:vip
}

The label eth0:vip makes the VIP visible with a clear name in ip addr show output, which helps during debugging.

Multiple VIPs

You can assign multiple VIPs for different services:

virtual_ipaddress {
    192.168.1.100/24 dev eth0 label eth0:web
    192.168.1.101/24 dev eth0 label eth0:api
    192.168.1.102/24 dev eth0 label eth0:admin
}

Multi-Interface VIPs

For environments with separate networks for different traffic types:

virtual_ipaddress {
    192.168.1.100/24 dev eth0 label eth0:vip
    10.0.0.100/24 dev eth1 label eth1:internal
}

Active-Active with Multiple VRRP Instances

Instead of having one active and one standby node, you can run active-active by splitting VIPs across two VRRP instances. Each node is MASTER for one VIP and BACKUP for the other:

# On LB1:
vrrp_instance VI_WEB {
    state MASTER
    virtual_router_id 51
    priority 110
    virtual_ipaddress {
        192.168.1.100/24
    }
}

vrrp_instance VI_API {
    state BACKUP
    virtual_router_id 52
    priority 100
    virtual_ipaddress {
        192.168.1.101/24
    }
}
# On LB2:
vrrp_instance VI_WEB {
    state BACKUP
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.1.100/24
    }
}

vrrp_instance VI_API {
    state MASTER
    virtual_router_id 52
    priority 110
    virtual_ipaddress {
        192.168.1.101/24
    }
}

With this setup, DNS routes www.example.com to 192.168.1.100 (handled by LB1) and api.example.com to 192.168.1.101 (handled by LB2). Both nodes are actively serving traffic, and either can absorb the other's load during a failure.

Health Check Scripts

The vrrp_script block defines health checks that influence the MASTER election. When a check fails, the node's effective priority drops by the weight value. If this drops the MASTER below the BACKUP's priority, failover occurs.

Basic Process Check

vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2      # Check every 2 seconds
    weight -20      # Subtract 20 from priority on failure
    fall 3          # Mark as failed after 3 consecutive failures
    rise 2          # Mark as recovered after 2 consecutive successes
    timeout 2       # Script timeout in seconds
    user root       # User to run the script as
}

The killall -0 command checks if a process exists without sending any signal. Exit code 0 means the process is running; non-zero means it is not found.
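The same zero-signal trick works with the shell builtin kill -0, which makes the idea easy to test interactively:

```shell
# Signal 0 performs all the checks of sending a signal without delivering one:
# exit status 0 means the process exists, non-zero means it does not.
sleep 5 &
pid=$!
kill -0 "$pid" 2>/dev/null && echo "process $pid is running"
kill "$pid"
wait "$pid" 2>/dev/null
kill -0 "$pid" 2>/dev/null || echo "process $pid is gone"
```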

HTTP Health Check

For a deeper check that verifies the load balancer is actually serving traffic, not just that the process exists:

#!/bin/bash
# /etc/keepalived/check_http.sh

# Check if HAProxy is responding to HTTP requests
response=$(curl -s -o /dev/null -w "%{http_code}" \
    http://127.0.0.1:80/health \
    --max-time 2 \
    --connect-timeout 1)

if [ "$response" = "200" ]; then
    exit 0
else
    logger -t keepalived "HTTP health check failed: status=$response"
    exit 1
fi

HAProxy Stats Socket Check

The most thorough check queries HAProxy's stats socket to verify backends are available:

#!/bin/bash
# /etc/keepalived/check_haproxy_backends.sh

# Check if HAProxy is running
/usr/bin/killall -0 haproxy 2>/dev/null || exit 1

# Check if at least one backend server is UP
backend_status=$(echo "show stat" | socat stdio /run/haproxy/admin.sock 2>/dev/null | \
    grep "web_servers" | grep -c "UP")

if [ "$backend_status" -ge 1 ]; then
    exit 0
else
    logger -t keepalived "No healthy backends in web_servers pool"
    exit 1
fi

Nginx Health Check

For Keepalived paired with Nginx instead of HAProxy:

#!/bin/bash
# /etc/keepalived/check_nginx.sh

# Check if Nginx master process is running
if ! /usr/bin/pgrep -x nginx > /dev/null 2>&1; then
    logger -t keepalived "Nginx process not found"
    exit 1
fi

# Check if Nginx is actually responding
response=$(curl -s -o /dev/null -w "%{http_code}" \
    http://127.0.0.1/nginx_status \
    --max-time 2 \
    --connect-timeout 1)

if [ "$response" = "200" ]; then
    exit 0
else
    logger -t keepalived "Nginx status check failed: status=$response"
    exit 1
fi

Make all scripts executable:

chmod +x /etc/keepalived/check_http.sh
chmod +x /etc/keepalived/check_haproxy_backends.sh
chmod +x /etc/keepalived/check_nginx.sh

Weight Calculation

Understanding how weights interact with priorities is essential for correct failover behavior:

Scenario               MASTER Priority  Weight  Effective Priority  BACKUP Priority  Result
All healthy            110              0       110                 100              LB1 is MASTER
HAProxy down (-20)     110              -20     90                  100              LB2 takes over
HTTP check down (-30)  110              -30     80                  100              LB2 takes over
Both checks fail       110              -50     60                  100              LB2 takes over

The rule: set weights so that the worst-case effective priority of the MASTER is always below the BACKUP's priority. If MASTER is 110 and BACKUP is 100, any weight of -11 or lower (more negative) will trigger failover on a single check failure.

Warning: if the weight is too small (e.g., -5 with priorities 110/100), a health check failure will not cause failover because the MASTER's effective priority (105) is still higher than the BACKUP (100). Always verify your math.
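A tiny helper makes the arithmetic explicit. This is an illustrative script, not a Keepalived feature -- it simply sums check weights into an effective priority and compares it against the BACKUP:

```shell
# Verify that a given set of check weights can actually demote the MASTER.
# Usage: check_weights MASTER_PRIORITY BACKUP_PRIORITY WEIGHT [WEIGHT...]
check_weights() {
    local master=$1 backup=$2
    shift 2
    local eff=$master w
    for w in "$@"; do
        eff=$((eff + w))
    done
    if [ "$eff" -lt "$backup" ]; then
        echo "OK: effective priority $eff < backup $backup -> failover"
    else
        echo "BAD: effective priority $eff >= backup $backup -> no failover"
    fi
}

check_weights 110 100 -20    # one check fails: failover occurs
check_weights 110 100 -5     # weight too small: no failover
```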

Track Script and Track Interface

Track Script

The track_script directive inside vrrp_instance links health checks to the VRRP instance:

vrrp_instance VI_1 {
    track_script {
        chk_haproxy weight -20
        chk_http weight -30
    }
}

You can override weights in the track_script block, which takes precedence over the weight defined in the vrrp_script block.

Track Interface

Monitor network interfaces. If a tracked interface goes down, the node immediately transitions to FAULT state:

vrrp_instance VI_1 {
    track_interface {
        eth0 weight -50
        eth1 weight -30
    }
}

Interface tracking is faster than script-based checks because Keepalived receives kernel notifications instantly when an interface state changes.

Track File

Keepalived can also track the contents of a file, which is useful for manual failover triggers:

vrrp_track_file manual_override {
    file /etc/keepalived/manual_priority
    weight -100
}

vrrp_instance VI_1 {
    track_file {
        manual_override
    }
}

Write a value to trigger manual failover:

# Force this node to give up MASTER
echo 1 > /etc/keepalived/manual_priority

# Return to normal
echo 0 > /etc/keepalived/manual_priority
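The priority effect is multiplicative: Keepalived multiplies the file's integer value by the configured weight and adds the result to the base priority. A quick shell check of the math, using the values from the config above:

```shell
# track_file arithmetic: effective = base + (file_value * weight)
base=110
weight=-100
for file_value in 0 1; do
    echo "file=$file_value -> effective priority $((base + file_value * weight))"
done
```

With the file set to 1, priority 110 drops to 10 -- far below any healthy peer, so the VIP moves away.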

Notification Scripts

Notification scripts run when state transitions occur. Use them for logging, alerting, running post-failover tasks, or triggering external automation:

#!/bin/bash
# /etc/keepalived/notify.sh

STATE=$1
INSTANCE=$2
HOSTNAME=$(hostname)
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
LOGFILE="/var/log/keepalived-state.log"

# Log the transition
echo "$TIMESTAMP - $HOSTNAME transitioning to $STATE for $INSTANCE" >> "$LOGFILE"

case $STATE in
    "MASTER")
        logger -t keepalived "$HOSTNAME is now MASTER for $INSTANCE"

        # Send Slack notification
        if [ -n "$SLACK_WEBHOOK_URL" ]; then
            curl -s -X POST "$SLACK_WEBHOOK_URL" \
                -H 'Content-type: application/json' \
                -d "{\"text\":\"$HOSTNAME is now MASTER for $INSTANCE (VIP 192.168.1.100)\"}" \
                > /dev/null 2>&1
        fi

        # Send PagerDuty event
        # curl -s -X POST https://events.pagerduty.com/v2/enqueue ...

        # Ensure HAProxy is running on this node
        systemctl is-active --quiet haproxy || systemctl start haproxy
        ;;

    "BACKUP")
        logger -t keepalived "$HOSTNAME is now BACKUP for $INSTANCE"

        # Optional: verify HAProxy is also running on BACKUP
        # (should be running and ready to serve if it becomes MASTER)
        systemctl is-active --quiet haproxy || systemctl start haproxy
        ;;

    "FAULT")
        logger -t keepalived "$HOSTNAME is in FAULT state for $INSTANCE"

        # Attempt to recover
        systemctl restart haproxy
        sleep 2

        # If recovery worked, Keepalived will detect it via health checks
        # and transition back to BACKUP or MASTER
        ;;

    "STOP")
        logger -t keepalived "Keepalived stopped on $HOSTNAME for $INSTANCE"
        ;;
esac

Make it executable:

chmod +x /etc/keepalived/notify.sh

Preempt vs Nopreempt

By default, VRRP is preemptive -- if the original MASTER recovers, it reclaims the VIP from the BACKUP. This causes two failover events: one when the MASTER fails, and another when it recovers. Each failover disrupts active connections.

Preempt Mode (Default)

T0: LB1 is MASTER (priority 110)
T1: LB1 HAProxy crashes
T2: LB2 becomes MASTER (failover #1 -- connections disrupted)
T5: LB1 HAProxy is restarted
T6: LB1 reclaims MASTER (failover #2 -- connections disrupted again)

Nopreempt Mode (Recommended)

Setting nopreempt prevents a recovered higher-priority node from reclaiming the VIP. Important: both nodes must use state BACKUP for nopreempt to take effect:

# LB1:
vrrp_instance VI_1 {
    state BACKUP       # Not MASTER
    nopreempt
    priority 110       # Higher priority still wins on simultaneous startup
}

# LB2:
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    priority 100
}

With nopreempt:

T0: LB1 becomes MASTER on startup (higher priority)
T1: LB1 HAProxy crashes
T2: LB2 becomes MASTER (failover #1 -- connections disrupted)
T5: LB1 HAProxy is restarted
T6: LB1 becomes BACKUP -- VIP stays on LB2 (no second disruption)

Nopreempt is recommended for production because it minimizes unnecessary failovers. The VIP stays on whichever node is currently serving successfully.

Critical note: both nodes must have state BACKUP for nopreempt to work correctly. If you set one to state MASTER, it will always preempt regardless of the nopreempt directive.

Split-Brain Scenarios and Unicast Mode

Split-brain is the most dangerous failure mode in any HA system. It occurs when both nodes believe they are MASTER and both hold the same VIP. This causes IP conflicts, unpredictable routing, and data corruption if both nodes process writes.

What Causes Split-Brain

  1. Network partition: The link between LB1 and LB2 fails, but both can still reach clients. Neither sees the other's VRRP advertisements, so both promote themselves.
  2. Multicast filtering: A switch or firewall blocks multicast traffic (224.0.0.18). VRRP advertisements do not reach the peer.
  3. CPU starvation: The MASTER is so overloaded that it cannot send VRRP advertisements on time. The BACKUP assumes it is dead.
  4. Firewall misconfiguration: Protocol 112 (VRRP) is blocked between the nodes.

Unicast Mode

The most reliable prevention is to switch from multicast to unicast. Each node sends VRRP advertisements directly to its peer's IP:

# LB1:
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 110
    nopreempt

    unicast_src_ip 192.168.1.10
    unicast_peer {
        192.168.1.20
    }

    virtual_ipaddress {
        192.168.1.100/24
    }
}
# LB2:
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    nopreempt

    unicast_src_ip 192.168.1.20
    unicast_peer {
        192.168.1.10
    }

    virtual_ipaddress {
        192.168.1.100/24
    }
}

Unicast mode is required in most cloud environments (AWS, GCP, Azure, DigitalOcean) where multicast traffic is not supported between instances.

Gateway-Aware Fencing

A fencing script adds an extra layer of protection. Before a node promotes itself to MASTER, it checks whether it can reach the network gateway. If it cannot, the node is likely isolated and should not take the VIP:

#!/bin/bash
# /etc/keepalived/check_fence.sh
#
# Logic:
# - If we CAN reach the gateway, we are properly connected to the network.
# - If we CANNOT reach the gateway, WE might be the isolated node,
#   so we should drop our priority and NOT become MASTER.

GATEWAY="192.168.1.1"
PEER="192.168.1.20"
DNS_SERVER="8.8.8.8"

# Check multiple network endpoints to avoid false positives
gateway_ok=0
peer_ok=0
dns_ok=0

ping -c 1 -W 1 $GATEWAY > /dev/null 2>&1 && gateway_ok=1
ping -c 1 -W 1 $PEER > /dev/null 2>&1 && peer_ok=1
ping -c 1 -W 1 $DNS_SERVER > /dev/null 2>&1 && dns_ok=1

total=$((gateway_ok + peer_ok + dns_ok))

if [ $total -ge 2 ]; then
    # We can reach at least 2 out of 3 endpoints -- network is healthy
    exit 0
elif [ $gateway_ok -eq 0 ] && [ $dns_ok -eq 0 ]; then
    # Cannot reach gateway or DNS -- we are likely isolated
    logger -t keepalived "FENCE: Cannot reach gateway or DNS, dropping priority"
    exit 1
else
    # Ambiguous state -- err on the side of caution
    exit 0
fi

Register the script in keepalived.conf and add chk_fence to the instance's track_script block:

vrrp_script chk_fence {
    script "/etc/keepalived/check_fence.sh"
    interval 5
    weight -100
    fall 2
    rise 3
}

The weight of -100 ensures that an isolated node will always have lower priority than any healthy node, preventing split-brain.

Additional Split-Brain Defenses

Defense           How It Works                                              Limitation
Unicast mode      Direct peer communication instead of multicast            Requires knowing peer IPs statically
Gateway fencing   Check network connectivity before promoting               Adds latency to failover
VRRP sync groups  Group multiple instances that must fail together          Complexity
ARP monitoring    Verify ARP table updates after failover                   Platform-dependent
External witness  Third node or service confirms which node should be MASTER  Additional infrastructure

Testing Failover

Test failover systematically before trusting it in production. Never deploy Keepalived without testing every failure mode.

Test 1: Service Failure

# On LB1 (MASTER), verify VIP is present
ip addr show eth0 | grep 192.168.1.100

# Stop HAProxy to simulate a crash
sudo systemctl stop haproxy

# On LB2, verify VIP has moved (should take 6-10 seconds)
watch -n 1 'ip addr show eth0 | grep 192.168.1.100'

# Check keepalived logs
sudo journalctl -u keepalived -f

# Restore HAProxy on LB1
sudo systemctl start haproxy

# With nopreempt: VIP stays on LB2
# With preempt: VIP moves back to LB1

Test 2: Full Server Reboot

# Reboot the MASTER
sudo reboot

# On LB2, verify it becomes MASTER within seconds
# After LB1 comes back, verify it stays as BACKUP (with nopreempt)

Test 3: Network Interface Failure

# On LB1, bring down the network interface
sudo ip link set eth0 down

# Verify LB2 takes over immediately (track_interface)
# Bring interface back up
sudo ip link set eth0 up

Test 4: Keepalived Service Failure

# Kill keepalived on the MASTER
sudo systemctl stop keepalived

# VIP should be released immediately
# LB2 should take over

Test 5: Simultaneous Startup

# Stop keepalived on both nodes
sudo systemctl stop keepalived  # on LB1
sudo systemctl stop keepalived  # on LB2

# Start both at the same time
# The higher-priority node should become MASTER
sudo systemctl start keepalived  # on both nodes simultaneously

Test 6: Load Under Failover

# Run continuous HTTP requests during failover
while true; do
    curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" http://192.168.1.100/health
    sleep 0.1
done

# Trigger failover and observe:
# - How many requests fail (expect 1-3)
# - How long until recovery (expect under 10 seconds)

Integration with Configuration Management

Keep configurations synchronized between nodes using automation. Manual configuration drift is the number one cause of failover failures.

Ansible Playbook

---
- name: Configure Keepalived HA Pair
  hosts: load_balancers
  become: true

  vars:
    vip: "192.168.1.100"
    vrid: 51
    auth_pass: "{{ vault_keepalived_pass }}"

  tasks:
    - name: Install Keepalived
      apt:
        name: keepalived
        state: present

    - name: Deploy HAProxy config
      template:
        src: haproxy.cfg.j2
        dest: /etc/haproxy/haproxy.cfg
        validate: "haproxy -c -f %s"
      notify: reload haproxy

    - name: Deploy Keepalived config
      template:
        src: keepalived.conf.j2
        dest: /etc/keepalived/keepalived.conf
      notify: restart keepalived

    - name: Deploy health check scripts
      copy:
        src: "{{ item }}"
        dest: "/etc/keepalived/{{ item }}"
        mode: "0755"
      loop:
        - check_http.sh
        - check_fence.sh
        - notify.sh
      notify: restart keepalived

    - name: Enable sysctl settings
      sysctl:
        name: "{{ item.key }}"
        value: "{{ item.value }}"
        sysctl_file: /etc/sysctl.d/99-keepalived.conf
        reload: true
      loop:
        - { key: "net.ipv4.ip_nonlocal_bind", value: "1" }
        - { key: "net.ipv4.ip_forward", value: "1" }

  handlers:
    - name: reload haproxy
      systemd:
        name: haproxy
        state: reloaded

    - name: restart keepalived
      systemd:
        name: keepalived
        state: restarted

Keepalived Jinja2 Template

# keepalived.conf.j2
global_defs {
    router_id {{ inventory_hostname }}
    enable_script_security
    script_user root
}

vrrp_script chk_haproxy {
    script "/usr/bin/killall -0 haproxy"
    interval 2
    weight -20
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface {{ ansible_default_ipv4.interface }}
    virtual_router_id {{ vrid }}
    priority {{ 110 if inventory_hostname == groups['load_balancers'][0] else 100 }}
    advert_int 1

    unicast_src_ip {{ ansible_default_ipv4.address }}
    unicast_peer {
{% for host in groups['load_balancers'] %}
{% if host != inventory_hostname %}
        {{ hostvars[host].ansible_default_ipv4.address }}
{% endif %}
{% endfor %}
    }

    authentication {
        auth_type PASS
        auth_pass {{ auth_pass }}
    }

    virtual_ipaddress {
        {{ vip }}/24
    }

    track_script {
        chk_haproxy
    }

    notify_master "/etc/keepalived/notify.sh MASTER VI_1"
    notify_backup "/etc/keepalived/notify.sh BACKUP VI_1"
    notify_fault  "/etc/keepalived/notify.sh FAULT VI_1"
}

Monitoring and Observability

Log Monitoring

# Watch state transitions in real time
sudo journalctl -u keepalived -f

# Check current VRRP state
sudo journalctl -u keepalived | grep -i "entering"

# View custom state log
tail -f /var/log/keepalived-state.log

Prometheus Monitoring

Use the keepalived_exporter to expose metrics:

# Install
wget https://github.com/mehdy/keepalived-exporter/releases/latest/download/keepalived-exporter-linux-amd64
chmod +x keepalived-exporter-linux-amd64
sudo mv keepalived-exporter-linux-amd64 /usr/local/bin/keepalived-exporter

Key metrics to alert on:

Metric                                 Alert Condition         Meaning
keepalived_vrrp_state                  Changed unexpectedly    VRRP state transition occurred
keepalived_vrrp_state{state="master"}  Both nodes show master  Split-brain detected
keepalived_script_status               0 (failed)              Health check script failing
keepalived_gratuitous_arp_delay        High value              ARP announcement delays
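The split-brain condition can be expressed as a Prometheus alerting rule. This is a sketch only: the exact metric and label names vary between keepalived-exporter versions, so verify them against your exporter's /metrics output before deploying:

```yaml
groups:
  - name: keepalived
    rules:
      - alert: KeepalivedSplitBrain
        # Fires when more than one node reports MASTER for the same VRRP
        # instance. The "iname" grouping label is an assumption -- check
        # what your exporter actually exposes.
        expr: sum(keepalived_vrrp_state{state="master"}) by (iname) > 1
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "Possible VRRP split-brain: multiple MASTERs detected"
```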

Quick Status Check Script

#!/bin/bash
# /usr/local/bin/keepalived-status.sh

echo "=== Keepalived Status ==="
echo ""

# Service status
systemctl is-active --quiet keepalived && echo "Service: RUNNING" || echo "Service: STOPPED"

# VIP status
if ip addr show | grep -q "192.168.1.100"; then
    echo "VIP: HELD (this node is MASTER)"
else
    echo "VIP: NOT HELD (this node is BACKUP)"
fi

# HAProxy status
systemctl is-active --quiet haproxy && echo "HAProxy: RUNNING" || echo "HAProxy: STOPPED"

# Recent state changes
echo ""
echo "=== Recent State Changes ==="
tail -5 /var/log/keepalived-state.log 2>/dev/null || echo "No state log found"

Production Checklist

Before going live with Keepalived failover, verify every item on this list:

Configuration

  • Identical HAProxy/Nginx configurations on all LB nodes
  • Virtual router ID is unique on the network segment
  • Authentication password is set and matches on all nodes
  • nopreempt enabled to avoid unnecessary double-failovers
  • Both nodes set to state BACKUP when using nopreempt

Network

  • Unicast mode enabled (required for cloud, recommended everywhere)
  • Firewall rules allow VRRP protocol (112) between nodes
  • net.ipv4.ip_nonlocal_bind = 1 set in sysctl
  • Gratuitous ARP parameters tuned for your switch environment

Health Checks

  • Health check scripts are executable and tested independently
  • Weight values calculated correctly: MASTER effective priority after failure must be less than BACKUP priority
  • fall and rise thresholds balance speed vs. stability
  • Scripts have timeouts to prevent hanging

Alerting

  • Notification scripts working and sending alerts to your on-call channel
  • Alerts fire on state transitions (especially unexpected MASTER changes)
  • Split-brain detection alert configured (both nodes reporting MASTER)

Testing

  • Tested service failure (HAProxy/Nginx crash)
  • Tested full server reboot of the MASTER
  • Tested network interface failure (ip link set down)
  • Tested Keepalived service failure
  • Tested simultaneous startup of both nodes
  • Tested failback behavior (with nopreempt, VIP stays)
  • Measured failover time under load

Monitoring

  • Logs being collected from journalctl -u keepalived and /var/log/keepalived-state.log
  • VIP ownership tracked in monitoring dashboard
  • HAProxy/Nginx health tracked separately from Keepalived state

Key Takeaways

  • Keepalived with VRRP gives you sub-ten-second failover for load balancers with minimal complexity. It is the standard solution used across the industry.
  • Use nopreempt with both nodes set to state BACKUP to avoid unnecessary VIP flapping when the original MASTER recovers. This is the single most impactful configuration choice.
  • Always use unicast mode in cloud environments where multicast is not available. Even in on-premise environments, unicast is more predictable.
  • Health check scripts should verify the actual service is working (HTTP response, backend availability), not just that a process ID exists. A running process that does not respond is worse than a crashed one.
  • Set weight values so that a failed health check drops the MASTER's effective priority below the BACKUP. Always verify your math: MASTER_PRIORITY - WEIGHT must be less than BACKUP_PRIORITY.
  • Implement gateway-aware fencing to prevent split-brain. If a node cannot reach the network gateway, it should not hold the VIP.
  • Test every failure mode before going to production: service crash, full reboot, network interface down, Keepalived stop, simultaneous startup. Untested failover is not failover -- it is hope.
  • Keep configurations synchronized between nodes using Ansible, Salt, or Puppet. Manual configuration drift is the top cause of failover failures.
  • Monitor VRRP state transitions. An unexpected MASTER change is often the first sign of a deeper network or application problem.
  • Active-active setups with multiple VRRP instances let you utilize both load balancer nodes under normal conditions, not just during failures.