
Keepalived VRRP IPv6 group enters FAULT state
Closed, Resolved | Public | BUG

Description

To reproduce, set the IPv6 address and the VRRP configuration in a single commit:

set interfaces ethernet eth1 address '2001:db8::3/125'
set interfaces ethernet eth1 address '192.0.2.3/24'

set high-availability vrrp group 10 address 2001:db8::2/125
set high-availability vrrp group 10 description '**IPv6**'
set high-availability vrrp group 10 hello-source-address '2001:db8::3'
set high-availability vrrp group 10 interface 'eth1'
set high-availability vrrp group 10 peer-address '2001:db8::4'
set high-availability vrrp group 10 preempt-delay '5'
set high-availability vrrp group 10 priority '200'
set high-availability vrrp group 10 rfc3768-compatibility
set high-availability vrrp group 10 vrid '10'

set high-availability vrrp group 11 address 192.0.2.2/24
set high-availability vrrp group 11 description '**IPv4**'
set high-availability vrrp group 11 hello-source-address '192.0.2.3'
set high-availability vrrp group 11 interface 'eth1'
set high-availability vrrp group 11 peer-address '192.0.2.4'
set high-availability vrrp group 11 preempt-delay '5'
set high-availability vrrp group 11 priority '200'
set high-availability vrrp group 11 rfc3768-compatibility
set high-availability vrrp group 11 vrid '11'

One group ends up in an unexpected FAULT state, which clears only after VRRP is restarted:

vyos@r14# run show vrrp 
  Name  Interface      VRID  State      Priority  Last Transition
------  -----------  ------  -------  ----------  -----------------
    10  eth1v10v6        10  FAULT           200  48s
    11  eth1v11v4        11  MASTER          200  45s
[edit]
vyos@r14# 


vyos@r14# run restart vrrp
[edit]
vyos@r14# 
[edit]
vyos@r14# 
[edit]
vyos@r14# run show vrrp 
  Name  Interface      VRID  State      Priority  Last Transition
------  -----------  ------  -------  ----------  -----------------
    10  eth1v10v6        10  MASTER          200  4s
    11  eth1v11v4        11  MASTER          200  4s
[edit]
vyos@r14#

The log shows the key error:
bind unicast_src 2001:db8::3 failed 99 - Cannot assign requested address

Aug 31 13:49:22 r14 Keepalived[4766]: Startup complete
Aug 31 13:49:22 r14 Keepalived_vrrp[4771]: bind unicast_src 2001:db8::3 failed 99 - Cannot assign requested address
Aug 31 13:49:22 r14 Keepalived_vrrp[4771]: (10): entering FAULT state (src address not configured)
Aug 31 13:49:22 r14 Keepalived_vrrp[4771]: (10) Entering FAULT STATE
Aug 31 13:49:22 r14 Keepalived_vrrp[4771]: (11) Entering BACKUP STATE (init)
Aug 31 13:49:22 r14 keepalived-fifo.py[4781]: Starting FIFO pipe for Keepalived
Aug 31 13:49:22 r14 keepalived-fifo.py[4781]: Attempt to load keepalived configuration aborted due to a commit in progress (attempt 1/20)
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: Loaded configuration: {'group': {'10': {'address': {'2001:db8::2/125': {}}, 'description': '**IPv6**', 'hello_source_address': '2001:db8::3', 'interface': 'eth1', 'peer_address': '2001:db8::4', 'preempt_delay': '5', 'priority': '200', 'rfc3768_compatibility': {}, 'vrid': '10'}, '11': {'address': {'192.0.2.2/24': {}}, 'description': '**IPv4**', 'hello_source_address': '192.0.2.3', 'interface': 'eth1', 'peer_address': '192.0.2.4', 'preempt_delay': '5', 'priority': '200', 'rfc3768_compatibility': {}, 'vrid': '11'}}}
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: PIPE already exist: /run/keepalived/keepalived_notify_fifo
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: Message reading start
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: Message processing start
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: Received message: INSTANCE "10" FAULT 200
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: INSTANCE 10 changed state to FAULT
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: Received message: INSTANCE "10" FAULT 200
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: INSTANCE 10 changed state to FAULT
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: Received message: INSTANCE "11" BACKUP 200
Aug 31 13:49:23 r14 keepalived-fifo.py[4781]: INSTANCE 11 changed state to BACKUP
Aug 31 13:49:25 r14 Keepalived_vrrp[4771]: (11) Entering MASTER STATE

Details

Difficulty level: Normal (likely a few hours)
Version: VyOS 1.4-rolling-202308310021, VyOS 1.3-stable-202308240442
Why the issue appeared?: Will be filled on close
Is it a breaking change?: Perfectly compatible
Issue type: Bug (incorrect behavior)

Event Timeline

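At the moment the VRRP configuration is applied, the eth1 IPv6 address is still in the tentative state, so keepalived cannot bind to it yet:

inet6 2001:db8::3/125 scope global tentative

Shortly afterwards the address leaves the tentative state:

inet6 2001:db8::3/125 scope global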

Tentative address
    An address whose uniqueness on a link is being verified. When an address is configured on a network interface (either manually or automatically), the address is initially in the tentative state. Such an address is not considered to be assigned to an interface. An interface discards received packets addressed to a tentative address, but accepts Neighbor Discovery packets related to Duplicate Address Detection (DAD) for the tentative address.
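The difference between the two address lines above can be detected programmatically. The following is a minimal sketch, not the change made in VyOS: it assumes the text comes from something like `ip -6 addr show dev eth1`, and the function name `address_state` is hypothetical.

```python
def address_state(ip_output: str, address: str) -> str:
    """Classify `address` as 'tentative', 'ready', or 'missing'.

    `ip_output` is assumed to contain lines shaped like:
        inet6 <addr>/<prefix> scope <scope> [flags...]
    """
    for line in ip_output.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0] != "inet6":
            continue  # not an IPv6 address line
        if fields[1].split("/")[0] != address:
            continue  # a different address on the same interface
        # Everything after the address is scope and flags.
        return "tentative" if "tentative" in fields[2:] else "ready"
    return "missing"


# The two states quoted in this task.
BEFORE = "inet6 2001:db8::3/125 scope global tentative"
AFTER = "inet6 2001:db8::3/125 scope global"

print(address_state(BEFORE, "2001:db8::3"))
print(address_state(AFTER, "2001:db8::3"))
```

A caller could poll such a check with a short timeout before starting keepalived, so that the bind to the hello source address is attempted only once the address is usable.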
Viacheslav claimed this task.
Viacheslav moved this task from Need Triage to Finished on the VyOS 1.3 Equuleus (1.3.4) board.
Viacheslav reopened this task as Needs testing. (Edited Sep 5 2023, 8:31 PM)

The change does not pass the smoketest for virtual-server, which is handled by the same high-availability script but has no 'vrrp' section in its configuration:

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/high-availability.py", line 198, in <module>
    apply(c)
  File "/usr/libexec/vyos/conf_mode/high-availability.py", line 179, in apply
    for group, group_config in ha['vrrp']['group'].items():
                               ~~^^^^^^^^
KeyError: 'vrrp'



[[high-availability]] failed
Commit failed


======================================================================
FAIL: test_01_ha_virtual_server (__main__.TestHAVirtualServer.test_01_ha_virtual_server)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/libexec/vyos/tests/smoke/cli/test_ha_virtual_server.py", line 36, in tearDown
    self.assertTrue(process_named_running(PROCESS_NAME))
AssertionError: None is not true

----------------------------------------------------------------------
Ran 2 tests in 9.924s

FAILED (failures=1, errors=1)

The fix is simple: fall back to empty dictionaries when the 'vrrp' or 'group' key is absent:

for group, group_config in ha.get('vrrp', {}).get('group', {}).items():

PR https://github.com/vyos/vyos-1x/pull/2204/commits/5f2926cf04e8a569bb25cd4121179d12b9e04c6c
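The effect of the chained `.get()` calls can be reproduced in isolation. This is a minimal sketch under stated assumptions: the dictionaries below are illustrative stand-ins for the real configuration, and the helper names `groups_strict` and `groups_safe` are hypothetical and do not appear in high-availability.py.

```python
def groups_strict(ha: dict) -> list:
    # Original lookup: raises KeyError when 'vrrp' is not configured.
    return list(ha['vrrp']['group'].items())


def groups_safe(ha: dict) -> list:
    # Fixed lookup: a missing 'vrrp' or 'group' key yields an empty list.
    return list(ha.get('vrrp', {}).get('group', {}).items())


# Stand-in for a configuration with only a virtual server (no VRRP).
only_virtual_server = {'virtual_server': {}}

# Stand-in for a configuration with one VRRP group.
with_vrrp = {'vrrp': {'group': {'10': {'interface': 'eth1', 'vrid': '10'}}}}

try:
    groups_strict(only_virtual_server)
except KeyError as exc:
    print(f"strict lookup failed: KeyError {exc}")

print(groups_safe(only_virtual_server))
print(groups_safe(with_vrrp))
```

With the safe lookup, the loop body simply does not run when no VRRP groups exist, so a virtual-server-only configuration can commit.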

dmbaturin changed Is it a breaking change? from Unspecified (possibly destroys the router) to Perfectly compatible.