VRRP health-check script is not applied correctly in keepalived.conf
Closed, Resolved | Public | BUG

Authored By

	faekz0r
	Feb 6 2024, 5:46 PM

Description

VRRP goes into the fault state, possibly because the keepalived configuration generated by VyOS conflicts with itself.

Everything works fine until the vrrp sync-group configuration is added.

VyOS appears to reference the health-check script in both the "vrrp" vrrp_instance and the vrrp_sync_group, as seen in the keepalived configuration below.

VRRP and conntrack configuration:

set high-availability vrrp group vrrp address 169.254.0.254/24
set high-availability vrrp group vrrp health-check failure-count '1'
set high-availability vrrp group vrrp health-check interval '1'
set high-availability vrrp group vrrp health-check script '/config/scripts/bgp-check.sh'
set high-availability vrrp group vrrp interface 'gnv0'
set high-availability vrrp group vrrp preempt-delay '30'
set high-availability vrrp group vrrp priority '200'
set high-availability vrrp group vrrp track interface 'eth1'
set high-availability vrrp sync-group vrrp member 'vrrp'
set high-availability vrrp sync-group vrrp transition-script backup '/config/scripts/vrrp-states.sh BACKUP'
set high-availability vrrp sync-group vrrp transition-script fault '/config/scripts/vrrp-states.sh BACKUP'
set high-availability vrrp sync-group vrrp transition-script master '/config/scripts/vrrp-states.sh MASTER'
set high-availability vrrp sync-group vrrp transition-script stop '/config/scripts/vrrp-states.sh BACKUP'

set service conntrack-sync failover-mechanism vrrp sync-group 'vrrp'
set service conntrack-sync interface gnv0

/run/keepalived/keepalived.conf:

# Autogenerated by VyOS
# Do not edit this file, all your changes will be lost
# on next commit or reboot

# Global definitions configuration block
global_defs {
    dynamic_interfaces
    script_user root
    notify_fifo /run/keepalived/keepalived_notify_fifo
    notify_fifo_script /usr/libexec/vyos/system/keepalived-fifo.py
}

vrrp_script healthcheck_vrrp {
    script "/config/scripts/bgp-check.sh"
    interval 1
    fall 1
    rise 1
}
vrrp_instance vrrp {
    state BACKUP
    interface gnv0
    virtual_router_id 1
    priority 200
    advert_int 1
    preempt_delay 30
    mcast_src_ip 169.254.0.1
    virtual_ipaddress {
        169.254.0.254/24
    }
    track_interface {
        eth1
    }
    track_script {
        healthcheck_vrrp
    }
}

vrrp_sync_group vrrp {
    group {
        vrrp
    }

    track_script {
        healthcheck_vrrp
    }
    notify_master "/usr/libexec/vyos/vyos-vrrp-conntracksync.sh master vrrp"
    notify_backup "/usr/libexec/vyos/vyos-vrrp-conntracksync.sh backup vrrp"
    notify_fault "/usr/libexec/vyos/vyos-vrrp-conntracksync.sh fault vrrp"
}

show vrrp log:

Feb 06 14:15:29 Keepalived_vrrp[2326]: (vrrp) track_script healthcheck_vrrp is configured on VRRP instance and sync group. Remove vrrp instance config
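
For anyone who wants to check a generated config for this condition before keepalived complains, here is a minimal Python sketch. It is not part of VyOS or keepalived; the function names parse_blocks and find_conflicts are made up for illustration. It assumes the layout seen in the generated file above: one statement per line, opening braces at the end of a line, and `#` as the only comment marker.

```python
import re
from collections import defaultdict

# Only these two top-level block types matter for the check.
HEADER = re.compile(r"^(vrrp_instance|vrrp_sync_group)\s+(\S+)\s*\{$")


def parse_blocks(conf_text):
    """Collect 'group' and 'track_script' entries per instance / sync group."""
    blocks = {}      # (kind, name) -> {"group": [...], "track_script": [...]}
    current = None   # entries of the top-level block being read, if relevant
    section = None   # name of the nested sub-block, if any
    depth = 0
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # naive comment stripping
        if not line:
            continue
        if line.endswith("{"):
            if depth == 0:
                m = HEADER.match(line)
                current = (blocks.setdefault((m.group(1), m.group(2)),
                                             defaultdict(list)) if m else None)
            elif depth == 1:
                section = line[:-1].strip()
            depth += 1
        elif line == "}":
            depth -= 1
            if depth == 1:
                section = None
            elif depth == 0:
                current = None
        elif depth == 2 and current is not None and section in ("group", "track_script"):
            current[section].append(line.split()[0])
    return blocks


def find_conflicts(conf_text):
    """Return (sync_group, instance, script) for scripts tracked in both places."""
    blocks = parse_blocks(conf_text)
    conflicts = []
    for (kind, sg_name), sg in blocks.items():
        if kind != "vrrp_sync_group":
            continue
        for member in sg["group"]:
            inst = blocks.get(("vrrp_instance", member))
            if inst is None:
                continue
            for script in inst["track_script"]:
                if script in sg["track_script"]:
                    conflicts.append((sg_name, member, script))
    return conflicts


# Trimmed copy of the generated config from this task.
SAMPLE = """
vrrp_script healthcheck_vrrp {
    script "/config/scripts/bgp-check.sh"
    interval 1
}
vrrp_instance vrrp {
    interface gnv0
    track_script {
        healthcheck_vrrp
    }
}
vrrp_sync_group vrrp {
    group {
        vrrp
    }
    track_script {
        healthcheck_vrrp
    }
}
"""

print(find_conflicts(SAMPLE))  # [('vrrp', 'vrrp', 'healthcheck_vrrp')]
```

Running it against the full /run/keepalived/keepalived.conf should flag the same triple that the log line above complains about.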

The issue appears to be caused by this part of the keepalived.conf.j2 template: https://github.com/vyos/vyos-1x/blob/da465d26b524fb26e0e9085e80a3ccaa6435eaa9/data/templates/high-availability/keepalived.conf.j2#L131

The template logic should probably be adjusted so that track_script is configured in only one place: either at the individual VRRP instance level or within the sync group, but not both.

Something like the following might fix it with no further edits, but I am not entirely sure:

{% if group_config.health_check is vyos_defined and group_config.health_check.script is not vyos_defined %}
    track_script {
        healthcheck_{{ name }}
    }
{% endif %}
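
To make the "only one place" rule concrete outside of Jinja2, here is a rough Python sketch of the placement decision. It is not the VyOS template and not necessarily the change that was merged for this task. The function name place_track_scripts is hypothetical; the dict keys health_check, script and member are assumed to mirror the names used in the snippet above and in the CLI. For sync-group members it puts the script at the sync-group level, following the hint in the keepalived log ("Remove vrrp instance config").

```python
def place_track_scripts(groups, sync_groups):
    """Decide where each health-check script is referenced.

    Returns (instance_level, sync_level):
      instance_level: {group name: script name} for groups outside any sync group
      sync_level:     {sync-group name: [script names]} for member groups
    """
    # Map each member group to the sync group that contains it.
    member_of = {
        member: sg_name
        for sg_name, sg_cfg in sync_groups.items()
        for member in sg_cfg.get("member", [])
    }

    instance_level, sync_level = {}, {}
    for name, cfg in groups.items():
        if "script" not in cfg.get("health_check", {}):
            continue  # no health-check script, nothing to track
        script = f"healthcheck_{name}"
        if name in member_of:
            # Member of a sync group: reference the script there only.
            sync_level.setdefault(member_of[name], []).append(script)
        else:
            # Standalone group: reference the script on the instance.
            instance_level[name] = script
    return instance_level, sync_level


# "solo" is a made-up second group that is not in any sync group.
groups = {
    "vrrp": {"health_check": {"script": "/config/scripts/bgp-check.sh"}},
    "solo": {"health_check": {"script": "/config/scripts/bgp-check.sh"}},
}
sync_groups = {"vrrp": {"member": ["vrrp"]}}

print(place_track_scripts(groups, sync_groups))
# ({'solo': 'healthcheck_solo'}, {'vrrp': ['healthcheck_vrrp']})
```

With the configuration from this task, the script would then appear only in the vrrp_sync_group block and not in the vrrp_instance block.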

Details

Version: 1.5
Is it a breaking change?: Perfectly compatible
Issue type: Bug (incorrect behavior)

Related Objects

Mentioned In: rVYOSONEX19fecd46a298: vrrp: T6020: vrrp health-check script not applied correctly in keepalived.conf
rVYOSONEXaafdc29b444a: Merge pull request #3122 from HollyGurza/T6020-sagitta
rVYOSONEX2943d9bb0f65: vrrp: T6020: vrrp health-check script not applied correctly in keepalived.conf
rVYOSONEXef5c61b26e60: vrrp: T6020: vrrp health-check script not applied correctly in keepalived.conf
rVYOSONEX311791ab5368: Merge pull request #2966 from HollyGurza/T6020

Event Timeline

faekz0r created this task. (Feb 6 2024, 5:46 PM)

faekz0r updated the task description.

Viacheslav triaged this task as High priority. (Feb 6 2024, 6:11 PM)

pasik subscribed. (Feb 6 2024, 8:03 PM)

https://github.com/vyos/vyos-1x/pull/2966

dmbaturin changed the edit policy from "Custom Policy" to "Maintainers (Project)". (Feb 12 2024, 11:22 AM)

HollyGurza changed the task status from Open to In progress. (Feb 12 2024, 11:31 AM)

HollyGurza claimed this task.

dmbaturin edited projects, added VyOS 1.4 Sagitta (1.4.0-epa1); removed VyOS 1.4 Sagitta. (Feb 15 2024, 5:39 PM)

Restricted Repository Identity mentioned this in rVYOSONEX311791ab5368: Merge pull request #2966 from HollyGurza/T6020. (Mar 7 2024, 4:20 PM)

Restricted Repository Identity mentioned this in rVYOSONEXef5c61b26e60: vrrp: T6020: vrrp health-check script not applied correctly in keepalived.conf. (Mar 7 2024, 4:20 PM)

Restricted Repository Identity mentioned this in rVYOSONEX2943d9bb0f65: vrrp: T6020: vrrp health-check script not applied correctly in keepalived.conf.

HollyGurza closed this task as Unknown Status. (Mar 8 2024, 3:53 AM)

Viacheslav moved this task from Open to Backport Candidates on the VyOS 1.5 Circinus board. (Mar 9 2024, 9:44 AM)

Restricted Repository Identity mentioned this in rVYOSONEXaafdc29b444a: Merge pull request #3122 from HollyGurza/T6020-sagitta. (Mar 12 2024, 10:33 AM)

Restricted Repository Identity mentioned this in rVYOSONEX19fecd46a298: vrrp: T6020: vrrp health-check script not applied correctly in keepalived.conf. (Mar 12 2024, 10:33 AM)

dmbaturin renamed this task from "vrrp health-check script not applied correctly in keepalived.conf" to "VRRP health-check script is not applied correctly in keepalived.conf". (Mar 12 2024, 4:22 PM)

dmbaturin changed the task status from Unknown Status to Resolved.

dmbaturin edited projects, added VyOS 1.4 Sagitta (1.4.0-epa2); removed VyOS 1.4 Sagitta (1.4.0-epa1).

dmbaturin removed a project: VyOS 1.5 Circinus. (May 11 2024, 4:48 PM)
