
VRRP health-check script stops working when setting up a sync group
Closed, Resolved / Public / BUG

Description

Tested in VyOS 1.3-beta-202112120443 & VyOS 1.4-rolling-202112160318

A VRRP health-check script stops working once its VRRP group is added to a sync group.
To reproduce:

set interfaces ethernet eth0 address '1.1.1.37/24'
set interfaces ethernet eth0 description 'outside'

set high-availability vrrp group TST virtual-address '1.1.1.36/24'
set high-availability vrrp group TST health-check failure-count '3'
set high-availability vrrp group TST health-check interval '2'
set high-availability vrrp group TST health-check script '/config/scripts/vrrp-health.sh'
set high-availability vrrp group TST interface 'eth0'
set high-availability vrrp group TST priority '100'
set high-availability vrrp group TST vrid '1'

Health-check script works. Next:

set interfaces ethernet eth1 address '10.1.1.2/24'
set interfaces ethernet eth1 description 'inside'

set high-availability vrrp group TST_LAN virtual-address '10.45.1.1/24'
set high-availability vrrp group TST_LAN interface 'eth1'
set high-availability vrrp group TST_LAN priority '100'
set high-availability vrrp group TST_LAN vrid '2'

Health-check script works. Next:

set high-availability vrrp sync-group SYNCgrp member 'TST'
set high-availability vrrp sync-group SYNCgrp member 'TST_LAN'
set high-availability vrrp sync-group SYNCgrp transition-script master '/config/scripts/vrrp-master.sh'

The health-check script stops working.

Corresponding keepalived config:

vyos@vyos:~$ cat /run/keepalived/keepalived.conf
# Autogenerated by VyOS
# Do not edit this file, all your changes will be lost
# on next commit or reboot

global_defs {
    dynamic_interfaces
    script_user root
    notify_fifo /run/keepalived/keepalived_notify_fifo
    notify_fifo_script /usr/libexec/vyos/system/keepalived-fifo.py
}

vrrp_script healthcheck_TST {
    script "/config/scripts/vrrp-health.sh"
    interval 2
    fall 3
    rise 1
}
vrrp_instance TST {
    state BACKUP
    interface eth0
    virtual_router_id 1
    priority 100
    advert_int 1
    preempt_delay 0
    virtual_ipaddress {
        1.1.1.36/24
    }
    track_script {
        healthcheck_TST
    }
}
vrrp_instance TST_LAN {
    state BACKUP
    interface eth1
    virtual_router_id 2
    priority 100
    advert_int 1
    preempt_delay 0
    virtual_ipaddress {
        10.45.1.1/24
    }
}

vrrp_sync_group SYNCgrp {
    group {
        TST
        TST_LAN
    }
}
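
The `vrrp_script healthcheck_TST` block above carries the CLI values `interval '2'` and `failure-count '3'` as `interval 2` and `fall 3`, with `rise 1`. As a rough illustration of what those counters mean, here is a minimal Python sketch of the fall/rise logic, assuming that `fall` consecutive failing runs mark the check as failed and `rise` consecutive passing runs mark it as healthy again. The function name `track_state` and the assumption that the check starts out healthy are illustrative only; this is not keepalived's code.

```python
def track_state(results, fall=3, rise=1, initially_up=True):
    """Return the healthy/failed state after each script run.

    results: iterable of booleans, True when the script exited with 0.
    fall:    consecutive failures needed to mark the check as failed.
    rise:    consecutive successes needed to mark it as healthy again.
    initially_up is an assumption made for this sketch.
    """
    up = initially_up
    failures = successes = 0
    states = []
    for ok in results:
        if ok:
            successes += 1
            failures = 0
            if not up and successes >= rise:
                up = True
        else:
            failures += 1
            successes = 0
            if up and failures >= fall:
                up = False
        states.append(up)
    return states


if __name__ == "__main__":
    # One run every 2 seconds: three failures in a row flip the state,
    # a single success flips it back.
    print(track_state([True, False, False, False, True]))
```

With `interval 2` and `fall 3`, a permanently failing script would be declared failed after roughly three runs, i.e. about six seconds under these assumptions.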

Details

Difficulty level: Unknown (require assessment)
Version: -
Why the issue appeared? Will be filled on close
Is it a breaking change? Unspecified (possibly destroys the router)
Issue type: Bug (incorrect behavior)

Related Objects

Mentioned In
1.3.3
1.3.1

Event Timeline

Unknown Object (User) created this task. Dec 16 2021, 6:39 AM
Unknown Object (User) created this object in space S1 VyOS Public.
Viacheslav changed the subtype of this task from "Task" to "Bug".
Unknown Object (User) changed the task status from Open to Confirmed. Dec 16 2021, 5:27 PM
Unknown Object (User) added a subscriber: Unknown Object (User).

When a sync group is configured, keepalived logs the warning below. It looks like the health-check script needs to be attached to the sync_group instead:

Dec 16 15:22:53 vyos Keepalived_vrrp[4766]: Warning - script healthcheck_XXX is not used
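
Judging by the warning above, a script tracked only inside a `vrrp_instance` that also belongs to a `vrrp_sync_group` ends up unused. A minimal Python sketch for spotting that combination in a generated config follows. The helper name `unused_healthchecks` is hypothetical, the parser only understands the simple one-token-per-line block layout shown in this task, and its comment stripping is naive.

```python
def unused_healthchecks(conf_text):
    """Return {instance: [scripts]} for instances that have a track_script
    block and are also listed as members of a vrrp_sync_group."""
    tracked = {}     # instance name -> script names in its track_script block
    members = set()  # instance names listed in any sync group
    stack = []       # open blocks, e.g. ["vrrp_instance TST", "track_script"]
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive comment stripping
        if not line:
            continue
        if line.endswith("{"):
            stack.append(line[:-1].strip())
        elif line == "}":
            if stack:
                stack.pop()
        elif len(stack) == 2:
            kind, _, name = stack[0].partition(" ")
            name = name.strip()
            if kind == "vrrp_instance" and stack[1] == "track_script":
                tracked.setdefault(name, []).append(line.split()[0])
            elif kind == "vrrp_sync_group" and stack[1] == "group":
                members.add(line.split()[0])
    return {inst: s for inst, s in tracked.items() if inst in members}


SAMPLE = """
vrrp_instance TST {
    interface eth0
    track_script {
        healthcheck_TST
    }
}
vrrp_instance TST_LAN {
    interface eth1
}
vrrp_sync_group SYNCgrp {
    group {
        TST
        TST_LAN
    }
}
"""

if __name__ == "__main__":
    print(unused_healthchecks(SAMPLE))
```

On a router, the text could be read from `/run/keepalived/keepalived.conf`, the path shown in the description.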
Unknown Object (User) added a comment. Dec 17 2021, 3:30 AM

Didn't notice this message, thanks!
Maybe we should add a corresponding sync_group command to the CLI?

sync-groups have transition scripts, too.

Unknown Object (User) added a comment. Dec 17 2021, 12:19 PM

Yes, but sync-groups don't have health-check scripts.
The best solution in this case is to implement health-check support for sync-groups and write a migration script.
We should not use the health-check configured for a group if that group belongs to a sync-group.
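
One way to picture the proposal in the comment above is a renderer that emits `track_script` inside the `vrrp_sync_group` block rather than inside the member instances. The Python sketch below is an illustration under that assumption only. The function `render_sync_group` and its parameters are hypothetical and are not taken from the VyOS templates. The `healthcheck_<group>` naming follows the generated config shown in the description.

```python
def render_sync_group(name, members, health_checks):
    """Render a vrrp_sync_group block as text.

    members:       ordered list of vrrp_instance names in the sync group.
    health_checks: set of instance names that have a vrrp_script named
                   healthcheck_<instance>.
    """
    lines = [f"vrrp_sync_group {name} {{", "    group {"]
    lines += [f"        {m}" for m in members]
    lines.append("    }")
    # Attach the members' health-check scripts to the sync group itself.
    scripts = [f"healthcheck_{m}" for m in members if m in health_checks]
    if scripts:
        lines.append("    track_script {")
        lines += [f"        {s}" for s in scripts]
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_sync_group("SYNCgrp", ["TST", "TST_LAN"], {"TST"}))
```

For the configuration in this task, the sketch prints a `SYNCgrp` block whose `group` section lists `TST` and `TST_LAN`, followed by a `track_script` section containing `healthcheck_TST`. The matching `track_script` block would then be dropped from `vrrp_instance TST`.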

Unknown Object (User) changed the task status from Confirmed to Backport candidate. Edited Dec 22 2021, 10:00 AM
Unknown Object (User) claimed this task.
Unknown Object (User) changed the subtype of this task from "Bug" to "Task".
Unknown Object (User) removed a subscriber: Viacheslav.

Duplicate PR:
https://github.com/vyos/vyos-1x/pull/1118
Request revoked

@Viacheslav PR is correct

Unknown Object (User) changed the task status from Backport candidate to Confirmed. Dec 22 2021, 10:05 AM
Unknown Object (User) changed the subtype of this task from "Task" to "Bug". Dec 22 2021, 10:11 AM
Unknown Object (User) added a subscriber: Viacheslav. Dec 23 2021, 4:34 AM
Viacheslav changed the task status from Confirmed to In progress. Dec 27 2021, 2:08 PM
Viacheslav moved this task from Need Triage to Finished on the VyOS 1.4 Sagitta board.
Unknown Object (User) added a comment. Jan 3 2022, 7:44 AM

Checked in 1.3-rolling-202201030317, health-check works

dmbaturin changed Issue type from Unspecified (please specify) to Bug (incorrect behavior). Mar 21 2022, 8:09 AM

Running '1.4-rolling-202303270317', I'm experiencing the opposite behaviour: a VRRP health-check script in a VRRP group that is a member of a VRRP sync group stops working (the VRRP group transitions to the 'FAULT' state immediately when keepalived starts). If I remove the 'track_script' block from the generated '/run/keepalived/keepalived.conf' and restart keepalived (sudo systemctl restart keepalived), the health-check script works as expected again. Any pointers? Or should I create a new issue with the appropriate details?