This is a follow-up to my forum discussion about what I believe is a bug: https://forum.vyos.io/t/why-do-my-outgoing-tcp-connections-fail-when-icmp-and-incoming-connections-are-ok/11185
The following MRE reproduces it (complete config on 1.3-rolling-202305190616):
interfaces {
    ethernet eth0 {
        vif 2 {
            address dhcp
            description sonic
            vrf vrf_sonic
        }
        vif 3 {
            address 10.227.79.2/24
        }
    }
    loopback lo {
    }
}
policy {
    local-route {
        rule 101 {
            destination 0.0.0.0/0
            set {
                table local
            }
        }
        rule 102 {
            destination 0.0.0.0/0
            set {
                table main
            }
        }
        rule 104 {
            destination 0.0.0.0/0
            set {
                table 170
            }
        }
    }
}
system {
    config-management {
        commit-revisions 100
    }
    conntrack {
        modules {
            ftp
            h323
            nfs
            pptp
            sip
            sqlnet
            tftp
        }
    }
    console {
        device ttyS0 {
            speed 115200
        }
    }
    host-name Test1
    login {
        user vyos {
            authentication {
                xxx
            }
        }
    }
    name-server eth0.2
    ntp {
        server time1.vyos.net {
        }
        server time2.vyos.net {
        }
        server time3.vyos.net {
        }
    }
    syslog {
        global {
            facility all {
                level info
            }
            facility protocols {
                level debug
            }
        }
    }
    time-zone America/Los_Angeles
}
vrf {
    bind-to-all
    name vrf_sonic {
        table 170
    }
}
Setup
- eth0.2 gets its IP address and default route via DHCP. It is enslaved to VRF vrf_sonic so that the default route lands in table 170 (and not in the main table).
- A local policy is created that takes precedence over the l3mdev (and other) rules: for every packet, the local table is consulted first, then the main table, and finally table 170 (which contains the default route).
- vrf bind-to-all is set so that response packets for locally generated traffic from processes that are NOT bound to the VRF device are still accepted, even though they arrive through the VRF-enslaved device (eth0.2).
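For anyone who wants to reproduce this, the three points above correspond roughly to the following configuration-mode commands. This is a sketch derived mechanically from the config tree in this report; the exact syntax may differ on other VyOS versions.

```
# Uplink VLAN: address and default route via DHCP, placed in the VRF
set interfaces ethernet eth0 vif 2 address dhcp
set interfaces ethernet eth0 vif 2 description sonic
set interfaces ethernet eth0 vif 2 vrf vrf_sonic
set interfaces ethernet eth0 vif 3 address 10.227.79.2/24

# Local policy: consult local, then main, then table 170 for all destinations
set policy local-route rule 101 destination 0.0.0.0/0
set policy local-route rule 101 set table local
set policy local-route rule 102 destination 0.0.0.0/0
set policy local-route rule 102 set table main
set policy local-route rule 104 destination 0.0.0.0/0
set policy local-route rule 104 set table 170

# VRF with its own table, plus bind-to-all
set vrf bind-to-all
set vrf name vrf_sonic table 170
```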
Desired outcome
The config above should behave exactly as if no VRF and no routing table 170 were used in the first place.
In that case, the default route would land directly in the main table.
What fails
First, confirm config is as expected:
$ show vrf
VRF name    state    mac address        flags                     interfaces
--------    -----    -----------        -----                     ----------
vrf_sonic   up       12:9e:9b:30:bf:9a  noarp,master,up,lower_up  eth0.2

$ show interfaces
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
eth0             -                                 u/u
eth0.2           135.180.59.5/21                   u/u  sonic
eth0.3           10.227.79.2/24                    u/u
lo               127.0.0.1/8                       u/u
                 ::1/128

$ show ip route table 170
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup

VRF default table 170:
S>* 0.0.0.0/0 [210/0] via 135.180.56.1, eth0.2, weight 1, 00:14:26
C>* 135.180.56.0/21 is directly connected, eth0.2, 00:14:27

$ ip rule
101:    from all lookup local
102:    from all lookup main
104:    from all lookup vrf_sonic
1000:   from all lookup [l3mdev-table]
2000:   from all lookup [l3mdev-table] unreachable
32765:  from all lookup local
32766:  from all lookup main
32767:  from all lookup default
ICMP (ping) works:
$ ping 8.8.8.8 count 2
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=113 time=4.24 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=113 time=5.12 ms

--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 3ms
rtt min/avg/max/mdev = 4.239/4.680/5.122/0.446 ms
UDP (DNS) works as well:
$ dig @8.8.8.8 www.google.com. +short
142.251.32.36
However, TCP fails:
$ curl www.google.com
curl: (7) Failed to connect to www.google.com port 80: Connection timed out
When run in the context of the VRF, the same request succeeds:
$ sudo ip vrf exec vrf_sonic curl www.google.com
[...]
call(this);</script></body></html>
Starting tcpdump in parallel with "curl www.google.com" reveals:
$ tcpdump -n -i eth0.2 'host 142.250.191.36'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0.2, link-type EN10MB (Ethernet), capture size 262144 bytes
23:20:03.556659 IP 135.180.59.5.53818 > 142.250.191.36.80: Flags [S], seq 1275972659, win 64240, options [mss 1460,sackOK,TS val 3546963390 ecr 0,nop,wscale 7], length 0
23:20:03.561245 IP 142.250.191.36.80 > 135.180.59.5.53818: Flags [S.], seq 4072092939, ack 1275972660, win 65535, options [mss 1412,sackOK,TS val 3812757636 ecr 3546963390,nop,wscale 8], length 0
23:20:03.561328 IP 135.180.59.5.53818 > 142.250.191.36.80: Flags [R], seq 1275972660, win 0, length 0
23:20:04.581079 IP 135.180.59.5.53818 > 142.250.191.36.80: Flags [S], seq 1275972659, win 64240, options [mss 1460,sackOK,TS val 3546964414 ecr 0,nop,wscale 7], length 0
23:20:04.584924 IP 142.250.191.36.80 > 135.180.59.5.53818: Flags [S.], seq 4088086403, ack 1275972660, win 65535, options [mss 1412,sackOK,TS val 3812758659 ecr 3546964414,nop,wscale 8], length 0
23:20:04.585031 IP 135.180.59.5.53818 > 142.250.191.36.80: Flags [R], seq 1275972660, win 0, length 0
The SYN packet clearly goes out correctly and reaches the Google server, which responds with a SYN-ACK that the VyOS box receives.
However, VyOS then responds with a RST (!) packet instead of an ACK.
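To make the SYN / SYN-ACK / RST pattern easier to spot in a longer capture, the tcpdump text output can be tallied per direction and flag. The helper below is only a sketch: summarize_flags is a hypothetical name, not a tcpdump or VyOS feature, and it assumes the default one-line-per-packet IPv4 format shown in the capture above.

```shell
#!/bin/sh
# Hypothetical helper: count packets per "src > dst flags" from tcpdump text
# output read on stdin. Assumes lines of the form:
#   <time> IP <src> > <dst>: Flags [<flags>], ...
summarize_flags() {
    awk '/ Flags \[/ {
        dst = $5
        sub(/:$/, "", dst)            # drop the trailing colon after dst
        flag = $7
        gsub(/^\[|\],?$/, "", flag)   # "[S.]," becomes "S."
        count[$3 " > " dst " " flag]++
    }
    END { for (k in count) print count[k], k }' | sort
}

# Example usage (not executed here; needs capture privileges):
#   sudo tcpdump -l -n -c 6 -i eth0.2 'host 142.250.191.36' | summarize_flags
```

For the six packets captured above, this should report the same count for S (outgoing), S. (incoming) and R (outgoing), which is the signature of the local stack rejecting every SYN-ACK.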
This is likely because the originating socket was not bound to the VRF device (it belongs to the default VRF), while the response arrived on the VRF-enslaved interface (eth0.2). However, this is exactly the scenario that vrf bind-to-all should account for.
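One way to narrow this down is to look at the kernel's per-protocol l3mdev accept settings. The Linux kernel exposes tcp_l3mdev_accept, udp_l3mdev_accept and raw_l3mdev_accept under net.ipv4. That vrf bind-to-all maps to these sysctls on this build is an assumption and has not been verified here. The helper name show_l3mdev is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the per-protocol l3mdev accept sysctls.
# Assumption: these are the knobs behind "vrf bind-to-all"; not verified here.
show_l3mdev() {
    for proto in tcp udp raw; do
        f="/proc/sys/net/ipv4/${proto}_l3mdev_accept"
        if [ -r "$f" ]; then
            printf '%s_l3mdev_accept=%s\n' "$proto" "$(cat "$f")"
        else
            # Kernel built without L3 master device support, or not Linux
            printf '%s_l3mdev_accept=unavailable\n' "$proto"
        fi
    done
}

show_l3mdev
```

If all three report 1 and TCP still fails, the difference is probably not a missing setting. The kernel documentation describes tcp_l3mdev_accept in terms of listening sockets, which may be relevant to why incoming connections work while outgoing ones do not, but that reading is not confirmed here.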
And indeed, it works for ICMP and UDP but it fails for TCP.
Hence this seems to be a clear bug which breaks source routing scenarios and should be fixed.