
BFD does not work with OSPFv3 via WireGuard
Closed, Wontfix · Public · Bug

Description

I have a simple setup with a WireGuard tunnel and unnumbered ends. Both OSPF and OSPFv3 work fine. However, when I enable BFD on OSPFv3, it doesn't work: the session never comes online and remains stuck in a state that differs between the two ends (BFD with OSPF works fine).
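
For reference, the relevant part of the setup can be expressed roughly like this. This is a minimal sketch, not my literal config: the addresses and interface name are the ones from the outputs below, and the config paths for the BFD toggles are assumed from the 1.4-style syntax, so they may differ on 1.3:

set interfaces wireguard wg424 address '23.153.128.145/31'
set interfaces wireguard wg424 address '2620:18:6000:cd00::1/128'
# BFD toggles on the IGPs (paths assumed, 1.4-style syntax)
set protocols ospf interface wg424 bfd        # IPv4 session - comes up fine
set protocols ospfv3 interface wg424 bfd      # IPv6 session - the one stuck in init/down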

Here is the status of the end that had BFD enabled first:

r4:~$ show protocols bfd peers 
Session count: 2
SessionId  LocalAddress                             PeerAddress                             Status         
=========  ============                             ===========                             ======         
2276869063 fe80::f1ce:78ff:fead:fb5f                fe80::f124:a7ff:fe2c:b392               init           
4055480072 23.153.128.145                           23.153.128.144                          up
r4:~$ show protocols bfd peer fe80::f124:a7ff:fe2c:b392
BFD Peer:
        peer fe80::f124:a7ff:fe2c:b392 local-address fe80::f03f:e2ff:fe93:5982 vrf default interface wg424
                ID: 2276869063
                Remote ID: 343292835
                Status: init
                Diagnostics: ok
                Remote diagnostics: ok
                Peer Type: dynamic
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo transmission interval: 50ms
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 1000ms
                        Transmission interval: 1000ms
                        Echo transmission interval: 50ms
wg424: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1412 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none 
    inet 23.153.128.145/31 scope global wg424
       valid_lft forever preferred_lft forever
    inet6 2620:18:6000:cd00::1/128 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::f1ce:78ff:fead:fb5f/64 scope link 
       valid_lft forever preferred_lft forever

and this is from the second end:

r24:~$ show protocols bfd peers 
Session count: 2
SessionId  LocalAddress                             PeerAddress                             Status         
=========  ============                             ===========                             ======         
343292835  unknown                                  fe80::f1ce:78ff:fead:fb5f               down           
1693997098 23.153.128.144                           23.153.128.145                          up
r24:~$ show protocols bfd peer fe80::f1ce:78ff:fead:fb5f
BFD Peer:
        peer fe80::f1ce:78ff:fead:fb5f local-address fe80::f124:a7ff:fe2c:b392 vrf default interface wg244
                ID: 343292835
                Remote ID: 0
                Status: down
                Downtime: 32 minute(s), 5 second(s)
                Diagnostics: ok
                Remote diagnostics: ok
                Peer Type: dynamic
                Local timers:
                        Detect-multiplier: 3
                        Receive interval: 300ms
                        Transmission interval: 300ms
                        Echo transmission interval: 50ms
                Remote timers:
                        Detect-multiplier: 3
                        Receive interval: 1000ms
                        Transmission interval: 1000ms
                        Echo transmission interval: 0ms
wg244: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1412 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none 
    inet 23.153.128.144/31 scope global wg244
       valid_lft forever preferred_lft forever
    inet6 2620:18:6000:aa24::1/128 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::fc7d:ceff:fee2:d20a/64 scope link 
       valid_lft forever preferred_lft forever

Details

Version: VyOS 1.3-rolling-202105011026
Is it a breaking change? Perfectly compatible
Issue type: Bug (incorrect behavior)

Event Timeline

Attaching packet captures from both ends, filtered with the bfd && ipv6.addr == fe80::/64 rule.
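
For anyone who wants to reproduce the captures: they can be taken on the routers themselves and the display filter above applied afterwards. A sketch, assuming tcpdump is used on the box (single-hop BFD control packets go over UDP port 3784; the output file name is just a placeholder):

sudo tcpdump -ni wg424 -w /tmp/bfd-wg424.pcap 'ip6 and udp port 3784'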

BTW, it appears the fe80::... addresses used in the BFD dialogue do not belong to any of the actual interfaces. It could be by design, but this is something I discovered while trying to troubleshoot the session. It could also be the reason for the mismatch between the session ends, if the other side does not know the peer address.
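
A quick way to cross-check that on either router is to compare what the kernel actually has on the tunnel with what bfdd thinks it is using, e.g. from the VyOS shell:

ip -6 addr show dev wg424 scope link      # link-local address really present on the interface
vtysh -c 'show bfd peers'                 # local/peer addresses bfdd uses for the sessions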

As WireGuard has no MAC address, an EUI-64 link-local address is calculated artificially: https://github.com/vyos/vyos-1x/blob/current/python/vyos/ifconfig/wireguard.py#L161

Maybe BFDd uses a different method for the MAC calculation?

BFDd may be creating those addresses automatically. In theory, it doesn't matter what they are as long as both ends have a way of learning them. I'm not sure whether this is a general issue or one specific to WireGuard, but right now BFD doesn't work with OSPFv3 over WireGuard.
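
One way to see which link-local address OSPFv3 itself learned for the neighbour (and would register with BFD) is FRR's neighbour detail output, for example:

vtysh -c 'show ipv6 ospf6 neighbor detail'    # lists the neighbour's link-local address learned from hellos

Comparing that address with the peer address shown by bfdd should indicate whether the mismatch comes from ospf6d or from bfdd.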

Time Exceeded (3)

Frame 509: 120 bytes on wire (960 bits), 120 bytes captured (960 bits)
Raw packet data
Internet Protocol Version 6, Src: fe80::fc7d:ceff:fee2:d20a, Dst: fe80::f03f:e2ff:fe93:5982
    Payload Length: 80
    Next Header: ICMPv6 (58)
    Hop Limit: 2
    Source: fe80::fc7d:ceff:fee2:d20a
    Destination: fe80::f03f:e2ff:fe93:5982
Internet Control Message Protocol v6
    Type: Time Exceeded (3)
    Code: 0 (hop limit exceeded in transit)
    Checksum: 0x61c6 [correct]
    [Checksum Status: Good]
    Reserved: 00000000
    Internet Protocol Version 6, Src: fe80::f03f:e2ff:fe93:5982, Dst: fe80::f124:a7ff:fe2c:b392
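
The ICMPv6 error above is "hop limit exceeded in transit"; since single-hop BFD control packets are normally sent with a hop limit of 255, it may be worth pulling the addresses and hop limits of the BFD packets out of the attached captures, for example (the capture file name is a placeholder):

tshark -r bfd-capture.pcap -Y 'bfd' -T fields -e ipv6.src -e ipv6.dst -e ipv6.hlim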

The "Time exceeded" likely means that message is sent to a black hole. There are two bfd sessions running through the same tunnel - one for OSPF and the other for OSPFv3. Timeout settings are the same and the first one is established instantly and is running happily. The OSPFv3 one is not so much:

Here is the ping (router names are r4 and r24):

r4:~$ ping 2620:18:6000:aa24::1 interface wg424
PING 2620:18:6000:aa24::1(2620:18:6000:aa24::1) from 2620:18:6000:cd00::1 wg424: 56 data bytes
64 bytes from 2620:18:6000:aa24::1: icmp_seq=1 ttl=64 time=12.0 ms
64 bytes from 2620:18:6000:aa24::1: icmp_seq=2 ttl=64 time=12.2 ms
64 bytes from 2620:18:6000:aa24::1: icmp_seq=3 ttl=64 time=12.3 ms
64 bytes from 2620:18:6000:aa24::1: icmp_seq=4 ttl=64 time=12.6 ms
64 bytes from 2620:18:6000:aa24::1: icmp_seq=5 ttl=64 time=12.3 ms
64 bytes from 2620:18:6000:aa24::1: icmp_seq=6 ttl=64 time=12.4 ms
64 bytes from 2620:18:6000:aa24::1: icmp_seq=7 ttl=64 time=12.2 ms
^C
--- 2620:18:6000:aa24::1 ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 15ms
rtt min/avg/max/mdev = 11.960/12.279/12.624/0.223 ms

same with link-local:

r4:~$ ping fe80::f43f:3fff:fe91:9d08 interface wg424
ping6: Warning: source address might be selected on device other than wg424.
PING fe80::f43f:3fff:fe91:9d08(fe80::f43f:3fff:fe91:9d08) from :: wg424: 56 data bytes
64 bytes from fe80::f43f:3fff:fe91:9d08%wg424: icmp_seq=1 ttl=64 time=12.0 ms
64 bytes from fe80::f43f:3fff:fe91:9d08%wg424: icmp_seq=2 ttl=64 time=12.4 ms
64 bytes from fe80::f43f:3fff:fe91:9d08%wg424: icmp_seq=3 ttl=64 time=12.1 ms
64 bytes from fe80::f43f:3fff:fe91:9d08%wg424: icmp_seq=4 ttl=64 time=12.1 ms
64 bytes from fe80::f43f:3fff:fe91:9d08%wg424: icmp_seq=5 ttl=64 time=12.3 ms
64 bytes from fe80::f43f:3fff:fe91:9d08%wg424: icmp_seq=6 ttl=64 time=12.1 ms
^C
--- fe80::f43f:3fff:fe91:9d08 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 12ms
rtt min/avg/max/mdev = 12.049/12.170/12.368/0.124 ms

and in the opposite direction:

r24:~$ ping 2620:18:6000:cd00::1 interface wg244
PING 2620:18:6000:cd00::1(2620:18:6000:cd00::1) from 2620:18:6000:aa24::1 wg244: 56 data bytes
64 bytes from 2620:18:6000:cd00::1: icmp_seq=1 ttl=64 time=11.9 ms
64 bytes from 2620:18:6000:cd00::1: icmp_seq=2 ttl=64 time=12.2 ms
64 bytes from 2620:18:6000:cd00::1: icmp_seq=3 ttl=64 time=12.2 ms
64 bytes from 2620:18:6000:cd00::1: icmp_seq=4 ttl=64 time=12.4 ms
64 bytes from 2620:18:6000:cd00::1: icmp_seq=5 ttl=64 time=12.0 ms
^C
--- 2620:18:6000:cd00::1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 9ms
rtt min/avg/max/mdev = 11.924/12.133/12.355/0.197 ms

and link-local:

r24:~$ ping fe80::f1dc:bcff:fe24:d827 interface wg244
ping6: Warning: source address might be selected on device other than wg244.
PING fe80::f1dc:bcff:fe24:d827(fe80::f1dc:bcff:fe24:d827) from :: wg244: 56 data bytes
64 bytes from fe80::f1dc:bcff:fe24:d827%wg244: icmp_seq=1 ttl=64 time=12.3 ms
64 bytes from fe80::f1dc:bcff:fe24:d827%wg244: icmp_seq=2 ttl=64 time=12.3 ms
64 bytes from fe80::f1dc:bcff:fe24:d827%wg244: icmp_seq=3 ttl=64 time=12.0 ms
64 bytes from fe80::f1dc:bcff:fe24:d827%wg244: icmp_seq=4 ttl=64 time=12.0 ms
64 bytes from fe80::f1dc:bcff:fe24:d827%wg244: icmp_seq=5 ttl=64 time=12.0 ms
64 bytes from fe80::f1dc:bcff:fe24:d827%wg244: icmp_seq=6 ttl=64 time=12.3 ms
^C
--- fe80::f1dc:bcff:fe24:d827 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 11ms
rtt min/avg/max/mdev = 11.987/12.152/12.325/0.141 ms

So, 12 ms is well within the 300 ms window.

syncer edited projects, added VyOS 1.4 Sagitta; removed VyOS 1.3 Equuleus.

Can you please retry with the latest 1.4 image? The EUI-64 address generation changed and is now "stable/predictive", like on Ethernet interfaces.

dmbaturin set Issue type to Unspecified (please specify).
dmbaturin changed Is it a breaking change? from Unspecified (possibly destroys the router) to Perfectly compatible.
dmbaturin changed Issue type from Unspecified (please specify) to Bug (incorrect behavior).
syncer claimed this task.
syncer subscribed.

This was abandoned.