vrf_zones blocking ipv6 traffic
Confirmed, High, Public, BUG

Description

I left a comment on a closed ticket, https://vyos.dev/T3655#178710, which seemed entirely relevant, but @Viacheslav replied that it describes an unspecified, unrelated problem. Feel free to change the title of this bug report if it does not describe the problem accurately. Here is a demonstration of the problem:

table inet vrf_zones {
        map ct_iface_map {
                typeof iifname : ct zone
                elements = { "HE" : 132,
                             "WAN" : 128,
                             "eth0" : 128,
                             "tun0" : 132,
                             "eth1" : 256,
                             "eth2" : 384,
                             "veth0" : 132,
                             "veth1" : 256,
                             "VMNET" : 256,
                             "FASTNETMON" : 384 }
        }

        chain vrf_zones_ct_in {
                type filter hook prerouting priority raw; policy accept;
                counter packets 37682 bytes 9857007 ct original zone set iifname map @ct_iface_map
        }

        chain vrf_zones_ct_out {
                type filter hook output priority raw; policy accept;
                counter packets 10822 bytes 1502078 ct original zone set oifname map @ct_iface_map
        }
}

With this table loaded, IPv4 works but IPv6 does not:

vyos@vyos:~$ sudo ip vrf exec VMNET ping 198.18.5.0
PING 198.18.5.0 (198.18.5.0) 56(84) bytes of data.
64 bytes from 198.18.5.0: icmp_seq=1 ttl=64 time=0.070 ms
^C
--- 198.18.5.0 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.070/0.070/0.070/0.000 ms
vyos@vyos:~$ sudo ip vrf exec VMNET ping6 2001:470:1f15:1ed:1::1
PING 2001:470:1f15:1ed:1::1(2001:470:1f15:1ed:1::1) 56 data bytes
^C
--- 2001:470:1f15:1ed:1::1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1039ms

Deleting the table fixes it:

vyos@vyos:~$ sudo nft delete table inet vrf_zones
vyos@vyos:~$ sudo ip vrf exec VMNET ping6 2001:470:1f15:1ed:1::1
PING 2001:470:1f15:1ed:1::1(2001:470:1f15:1ed:1::1) 56 data bytes
64 bytes from 2001:470:1f15:1ed:1::1: icmp_seq=1 ttl=64 time=0.063 ms
64 bytes from 2001:470:1f15:1ed:1::1: icmp_seq=2 ttl=64 time=0.051 ms
^C
--- 2001:470:1f15:1ed:1::1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1014ms
rtt min/avg/max/mdev = 0.051/0.057/0.063/0.006 ms

I'm at a bit of a loss as to how to fix this without deleting the table. Maybe the problem arises from the fact that I'm simply connecting two VRFs together with a veth pair:

set interfaces virtual-ethernet veth0 peer-name veth1
set interfaces virtual-ethernet veth1 peer-name veth0
set interfaces virtual-ethernet veth0 address 2001:470:1f15:1ed:1::1/80
set interfaces virtual-ethernet veth1 address 2001:470:1f15:1ed:1::2/80
set interfaces virtual-ethernet veth0 address 198.18.5.0/23
set interfaces virtual-ethernet veth1 address 198.18.4.1/23
set interfaces virtual-ethernet veth0 vrf HE
set interfaces virtual-ethernet veth1 vrf VMNET
vyos@vyos# sudo ip vrf exec VMNET ping6 2001:470:1f15:1ed:1::1
PING 2001:470:1f15:1ed:1::1(2001:470:1f15:1ed:1::1) 56 data bytes
^C
--- 2001:470:1f15:1ed:1::1 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1033ms

[edit]
vyos@vyos# sudo nft delete table inet vrf_zones
[edit]
vyos@vyos# sudo ip vrf exec VMNET ping6 2001:470:1f15:1ed:1::1
PING 2001:470:1f15:1ed:1::1(2001:470:1f15:1ed:1::1) 56 data bytes
64 bytes from 2001:470:1f15:1ed:1::1: icmp_seq=1 ttl=64 time=0.063 ms
64 bytes from 2001:470:1f15:1ed:1::1: icmp_seq=2 ttl=64 time=0.040 ms
^C
--- 2001:470:1f15:1ed:1::1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1053ms
rtt min/avg/max/mdev = 0.040/0.051/0.063/0.011 ms
[edit]
vyos@vyos#

Details

Version
1.5-rolling-202402270022
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Bug (incorrect behavior)

Event Timeline

n.fort changed the task status from Open to Confirmed. (Mar 6 2024, 1:32 PM)

@paigeadelethompson I noticed that issue some weeks ago too. @n.fort is already trying to fix it, so the VyOS team is working on it.

c-po triaged this task as High priority.
c-po edited a custom field.

Something I just figured out is that the minute I do:

set nat source rule 100 outbound-interface eth0
set nat source rule 100 translation address masquerade

IPv6 stops working, and deleting the NAT rule and committing doesn't fix it. After adding masquerade, a rule that was not there before appears in each of these two chains:

chain vrf_zones_ct_in {
              type filter hook prerouting priority raw; policy accept;
              counter packets 288 bytes 67582 ct original zone set iifname map @ct_iface_map
      }

      chain vrf_zones_ct_out {
              type filter hook output priority raw; policy accept;
              counter packets 92 bytes 9056 ct original zone set oifname map @ct_iface_map
      }

When you delete the NAT rule and commit, those two rules are still there.

It only seems to affect IPv6 though, and I'm not really sure why, because the rules do seem to be matching IPv6:

/sbin/conntrack -f ipv6 -L
tcp      6 111 SYN_SENT src=fc01::b dst=fc01::a sport=54876 dport=179 zone-orig=102 [UNREPLIED] src=fc01::a dst=fc01::b sport=179 dport=54876 mark=0 use=1
tcp      6 110 SYN_SENT src=fc00::a dst=fc00::b sport=36484 dport=179 zone-orig=101 [UNREPLIED] src=fc00::b dst=fc00::a sport=179 dport=36484 mark=0 use=1
tcp      6 110 SYN_SENT src=fc01::a dst=fc01::b sport=43862 dport=179 zone-orig=101 [UNREPLIED] src=fc01::b dst=fc01::a sport=179 dport=43862 mark=0 use=1
tcp      6 111 SYN_SENT src=fc00::b dst=fc00::a sport=42132 dport=179 zone-orig=100 [UNREPLIED] src=fc00::a dst=fc00::b sport=179 dport=42132 mark=0 use=1
udp      17 29 src=fe80::52eb:1aff:fe77:e5ff dst=ff02::2 sport=8888 dport=8888 zone-orig=100 [UNREPLIED] src=ff02::2 dst=fe80::52eb:1aff:fe77:e5ff sport=8888 dport=8888 mark=0 use=1
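
To see at a glance which zones the IPv6 flows land in, the listing above can be tallied by its zone-orig= field. This is only a sketch: the helper name count_by_zone is made up for illustration, and it assumes the zone appears as a zone-orig=N token, as it does in the output above.

```shell
#!/bin/sh
# Tally conntrack entries by original-direction zone.
# Reads `conntrack -L` style lines on stdin and prints "zone count" pairs,
# sorted numerically by zone. count_by_zone is a hypothetical helper name.
count_by_zone() {
    awk '
        {
            zone = "none"    # entries without a zone-orig= token are counted as none
            for (i = 1; i <= NF; i++)
                if ($i ~ /^zone-orig=/) { split($i, kv, "="); zone = kv[2] }
            n[zone]++
        }
        END { for (z in n) print z, n[z] }
    ' | sort -n
}

# Usage on the router (needs root):
#   sudo conntrack -f ipv6 -L 2>/dev/null | count_by_zone
```

Comparing the per-zone counts against the zones assigned in ct_iface_map shows whether both directions of a flow were tagged with the zone you expected.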

diff before / after (a -> b):

63a64
>               counter ct original zone set iifname map @ct_iface_map
67a69,98
>               counter ct original zone set oifname map @ct_iface_map
>       }
> }
> table ip vyos_nat {
>       chain PREROUTING {
>               type nat hook prerouting priority dstnat; policy accept;
>               counter jump VYOS_PRE_DNAT_HOOK
>       }
> 
>       chain POSTROUTING {
>               type nat hook postrouting priority srcnat; policy accept;
>               counter jump VYOS_PRE_SNAT_HOOK
>               oifname "eth0" counter masquerade comment "SRC-NAT-100"
>       }
> 
>       chain VYOS_PRE_DNAT_HOOK {
>               return
>       }
> 
>       chain VYOS_PRE_SNAT_HOOK {
>               return
>       }
> }
> table ip vyos_static_nat {
>       chain PREROUTING {
>               type nat hook prerouting priority dstnat; policy accept;
>       }
> 
>       chain POSTROUTING {
>               type nat hook postrouting priority srcnat; policy accept;
138a170,174
>       chain PREROUTING_HELPER {
>               type filter hook prerouting priority filter - 5; policy accept;
>               counter jump VYOS_CT_HELPER
>       }
> 
147a184,188
>       chain OUTPUT_HELPER {
>               type filter hook output priority filter - 5; policy accept;
>               counter jump VYOS_CT_HELPER
>       }
> 
167c208
<               return
---
>               accept

Then, after deleting NAT and committing (b -> c):

< table ip vyos_nat {
<       chain PREROUTING {
<               type nat hook prerouting priority dstnat; policy accept;
<               counter jump VYOS_PRE_DNAT_HOOK
<       }
< 
<       chain POSTROUTING {
<               type nat hook postrouting priority srcnat; policy accept;
<               counter jump VYOS_PRE_SNAT_HOOK
<               oifname "eth0" counter masquerade comment "SRC-NAT-100"
<       }
< 
<       chain VYOS_PRE_DNAT_HOOK {
<               return
<       }
< 
<       chain VYOS_PRE_SNAT_HOOK {
<               return
<       }
< }
< table ip vyos_static_nat {
<       chain PREROUTING {
<               type nat hook prerouting priority dstnat; policy accept;
<       }
< 
<       chain POSTROUTING {
<               type nat hook postrouting priority srcnat; policy accept;
<       }
< }

diff between a and c:

63a64
>               counter ct original zone set iifname map @ct_iface_map
67a69
>               counter ct original zone set oifname map @ct_iface_map
138a141,145
>       chain PREROUTING_HELPER {
>               type filter hook prerouting priority filter - 5; policy accept;
>               counter jump VYOS_CT_HELPER
>       }
> 
147a155,159
>       chain OUTPUT_HELPER {
>               type filter hook output priority filter - 5; policy accept;
>               counter jump VYOS_CT_HELPER
>       }
> 
167c179
<               return
---
>               accept
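
For anyone who wants to reproduce the comparison above, here is a minimal sketch of how such snapshots can be taken. The function name snapshot and the file names are placeholders, and the use of stateless output is an assumption about how the diffs above were produced (they show counters without packet and byte values).

```shell
#!/bin/sh
# NFT can be overridden (for example NFT=echo) to dry-run the command.
NFT=${NFT:-sudo nft}

# Write the current ruleset to the file named in $1.
# -s (stateless) omits counter values so they do not show up as differences.
# snapshot is a hypothetical helper name.
snapshot() {
    $NFT -s list ruleset > "$1"
}

# Usage on the router (file names are placeholders):
#   snapshot a.nft        # before adding the NAT rule
#   (add the NAT rule, commit)
#   snapshot b.nft
#   (delete the NAT rule, commit)
#   snapshot c.nft
#   diff a.nft c.nft      # shows what survived the delete
```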

In my experience working with ct zone (and admittedly my experience could be a total one-off), once you start tagging everything with a ct zone you have to specify that zone when matching later. Otherwise the match defaults to zone 0 (the default ct zone) and never matches. So in VYOS_CT_HELPER, where you have

tcp dport 21 ct helper set "ftp_tcp" return

you would need to specify

tcp dport 21 ct zone 100 ct helper set "ftp_tcp" return

Another example:

add    rule    inet filter input                    ct zone 19006 ct state established                        accept
add    rule    inet filter output                   ct zone 19006 ct state new                                accept
add    rule    inet filter output                   ct zone 19006 ct state established                        accept

I could not get it to work any other way. Unless you avoid matching on ct altogether, the match always falls back to the default zone. I also could not use the ct zone ID in concatenations because it would not serialize / deserialize correctly. I imagine it is possible, but the segfaults and incorrect values made it obvious that this needs to be better defined in nftables and its associated libraries, at least for that use if not in general. There is very little documentation on ct zones. I wanted to understand the philosophy behind the original and reply zones, but I couldn't find any information about it.

Side note: if you flush the ruleset and only add:

table inet vrf_zones {
        map ct_iface_map {
                typeof iifname : ct zone
                elements = { "VM" : 102,
                             "eth0" : 100,
                             "eth1" : 102,
                             "CORE" : 101,
                             "veth0" : 101,
                             "veth1" : 100,
                             "veth2" : 101,
                             "veth3" : 102,
                             "INTERNET" : 100 }
        }

        chain vrf_zones_ct_in {
                type filter hook prerouting priority raw; policy accept;
                counter packets 800 bytes 233466 ct original zone set iifname map @ct_iface_map
        }

        chain vrf_zones_ct_out {
                type filter hook output priority raw; policy accept;
                counter packets 164 bytes 21824 ct original zone set oifname map @ct_iface_map
        }
}

In theory this table would be inconsequential by itself, but it actually does block IPv6 traffic on its own. I feel this shouldn't be the case, given that the default behavior without any other chains should be to pass / accept. Clearly it's not, and I think that's because ct zone is left hanging, doing its own thing, unless you tell it explicitly what to do for every zone. Unless it has something to do with the reply zone (which, as near as I can tell, is being left unused), I'd say it's perhaps a bit broken? Not really sure.

I decided to dig into this a little more and try to trace this out:

sudo nft add chain inet vrf_zones trace_chain { type filter hook prerouting priority -301\; }
sudo nft add rule inet vrf_zones trace_chain meta nftrace set 1
trace id 72ad1891 inet vrf_zones trace_chain packet: iif "veth2" ether saddr 2a:35:be:9a:30:2c ether daddr 06:c9:2e:2e:44:e8 ip6 saddr fc01::b ip6 daddr fc01::a ip6 dscp cs0 ip6 ecn not-ect ip6 hoplimit 64 ip6 flowlabel 191398 ip6 nexthdr ipv6-icmp ip6 length 64 icmpv6 type echo-request icmpv6 code no-route icmpv6 parameter-problem 3452502291 icmpv6 taddr ::a4d1:1e66:0:0:a606:800 
trace id 72ad1891 inet vrf_zones vrf_zones_ct_in packet: iif "veth2" ether saddr 2a:35:be:9a:30:2c ether daddr 06:c9:2e:2e:44:e8 ip6 saddr fc01::b ip6 daddr fc01::a ip6 dscp cs0 ip6 ecn not-ect ip6 hoplimit 64 ip6 flowlabel 191398 ip6 nexthdr ipv6-icmp ip6 length 64 icmpv6 type echo-request icmpv6 code no-route icmpv6 parameter-problem 3452502291 icmpv6 taddr ::a4d1:1e66:0:0:a606:800 
trace id ace0e54e inet vrf_zones trace_chain packet: iif "CORE" ether saddr 2a:35:be:9a:30:2c ether daddr 06:c9:2e:2e:44:e8 ip6 saddr fc01::b ip6 daddr fc01::a ip6 dscp cs0 ip6 ecn not-ect ip6 hoplimit 64 ip6 flowlabel 191398 ip6 nexthdr ipv6-icmp ip6 length 64 icmpv6 type echo-request icmpv6 code no-route icmpv6 parameter-problem 3452502291 icmpv6 taddr ::a4d1:1e66:0:0:a606:800 
trace id ace0e54e inet vrf_zones vrf_zones_ct_in packet: iif "CORE" ether saddr 2a:35:be:9a:30:2c ether daddr 06:c9:2e:2e:44:e8 ip6 saddr fc01::b ip6 daddr fc01::a ip6 dscp cs0 ip6 ecn not-ect ip6 hoplimit 64 ip6 flowlabel 191398 ip6 nexthdr ipv6-icmp ip6 length 64 icmpv6 type echo-request icmpv6 code no-route icmpv6 parameter-problem 3452502291 icmpv6 taddr ::a4d1:1e66:0:0:a606:800

Unfortunately, this doesn't help me understand why the problem did not occur before vrf_zones_ct_in and vrf_zones_ct_out were applied.
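
For repeat runs of the trace above, a small wrapper along these lines may help. It is a sketch rather than a tested procedure: the function names trace_on and trace_off are made up, and restricting the trace to IPv6 with meta nfproto ipv6 is my own addition to cut down on noise. The chain name and priority are the ones used above.

```shell
#!/bin/sh
# NFT can be overridden (for example NFT=echo) to dry-run the commands.
NFT=${NFT:-sudo nft}

# Add a temporary chain that marks IPv6 packets for tracing.
# Priority -301 runs just before the raw priority (-300) used by vrf_zones_ct_in.
trace_on() {
    $NFT add chain inet vrf_zones trace_chain '{ type filter hook prerouting priority -301 ; }'
    $NFT add rule inet vrf_zones trace_chain meta nfproto ipv6 meta nftrace set 1
}

# Remove the temporary chain again; flush first so the delete also works
# where a chain must be empty before it can be removed.
trace_off() {
    $NFT flush chain inet vrf_zones trace_chain
    $NFT delete chain inet vrf_zones trace_chain
}

# Usage on the router:
#   trace_on
#   sudo nft monitor trace      # in a second terminal, then run the ping6
#   trace_off
```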