I run a VyOS instance in an AWS VPC as a VPN gateway (to HQ) and firewall. It has two interfaces.
The public interface has an Elastic IP and sits in the public subnet; the second interface sits in the private subnet. All traffic from instances in the private subnet to the internet or to HQ is routed through the private interface of the VyOS instance.
Since vyos-1.2.0-rolling+201810220337 (kernel 4.14.65) we see about 8-10% packet loss from instances in the private subnet (the test instance runs openSUSE Leap 15) to HQ and to the internet, i.e. for all traffic routed through VyOS.
The packet loss only occurs with packets smaller than 240 bytes; larger packets are not affected.
I switched to VyOS 1.2.0-rc3 (kernel 4.14.65): same problem.
I reverted to vyos-1.2.0-201810210337 (kernel 4.18.11): the problem is gone.
Back on VyOS 1.2.0-rc3 (kernel 4.14.65): same problem again.
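To narrow down the size dependence, a sweep over several ICMP payload sizes could be run from the test instance. This is only a sketch: the function names loss_pct and sweep are made up for illustration, it assumes the iputils ping shipped with openSUSE Leap 15, and the sizes in the usage line are arbitrary. Note that -s sets the ICMP payload size, so the IPv4 packet on the wire is 28 bytes larger; the 240-byte figure above may refer to either.

```shell
#!/bin/sh
# Extract the loss percentage from the ping summary read on stdin,
# e.g. "100 packets transmitted, 91 received, 9% packet loss, time 19912ms".
loss_pct() {
    sed -n 's/.* \([0-9.][0-9.]*\)% packet loss.*/\1/p'
}

# Ping TARGET once per given payload size and print "size loss%" per line.
# -q prints only the summary, -c 100 sends 100 probes, -i 0.2 sends 5 per second.
sweep() {
    target=$1; shift
    for size in "$@"; do
        loss=$(ping -q -c 100 -i 0.2 -s "$size" "$target" | loss_pct)
        echo "$size $loss"
    done
}
```

Example call: sweep 1.1.1.1 56 100 200 212 240 300 1000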
In VyOS I see dropped TX packets on public and private interfaces:
[email protected]:/var/log$ sudo ifconfig
eth0      Link encap:Ethernet  HWaddr 06:2a:e8:xx:xx:xx
          inet addr:172.16.101.16  Bcast:172.16.101.31  Mask:255.255.255.224
          inet6 addr: fe80::42a:e8ff:fe45:5dd6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:403401 errors:0 dropped:0 overruns:0 frame:0
          TX packets:475753 errors:0 dropped:6133 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:88966065 (84.8 MiB)  TX bytes:130760175 (124.7 MiB)

eth1      Link encap:Ethernet  HWaddr 06:3a:56:yy:yy:yy
          inet addr:172.16.100.10  Bcast:172.16.100.255  Mask:255.255.255.0
          inet6 addr: fe80::43a:56ff:fe8a:8de7/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:358583 errors:0 dropped:0 overruns:0 frame:0
          TX packets:298060 errors:0 dropped:4788 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:75408931 (71.9 MiB)  TX bytes:59339562 (56.5 MiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:262 errors:0 dropped:0 overruns:0 frame:0
          TX packets:262 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:23284 (22.7 KiB)  TX bytes:23284 (22.7 KiB)
Every time I test with small packets (for example ICMP echo requests smaller than 240 bytes), the TX dropped counter increases on the public and the private interface in parallel. I watched the counter live with:
watch -tn 1 "ifconfig -a | grep -A 5 eth1 | grep 'TX packets' | sed 's/^.* dropped:\\([0-9]\\{1,\\}\\) .*\$/\1/g'"
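The kernel also exposes the same counter under /sys/class/net/<iface>/statistics/tx_dropped, which avoids parsing ifconfig output. The following is a minimal sketch of how the drops could be counted over a test window. The helper names tx_dropped and drops_during are hypothetical, and the SYSFS_NET variable is an assumption added only so the functions can be tried without a real interface.

```shell
#!/bin/sh
# Print the current TX drop counter of one interface, read from sysfs.
# SYSFS_NET defaults to /sys/class/net and is overridable for dry runs.
tx_dropped() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/statistics/tx_dropped"
}

# Print how many TX drops were counted on an interface while a command ran.
# The command's own output is discarded; only the counter delta is printed.
drops_during() {
    iface=$1; shift
    before=$(tx_dropped "$iface")
    "$@" >/dev/null 2>&1
    after=$(tx_dropped "$iface")
    echo $((after - before))
}
```

For example, running drops_during eth1 sleep 30 on the VyOS instance while the test instance sends small pings would print the number of drops on eth1 during those 30 seconds.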
Here is an MTR report from the test instance in the private subnet to IP 1.1.1.1:
                          My traceroute  [vUNKNOWN]
aws-vm-test01 (172.16.100.12)                      2018-10-25T11:45:18+0200
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                            Packets               Pings
    Host                  Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 172.16.100.10          0.0%   120    0.5   0.5   0.3   1.4   0.1
 2. ???
 3. ???
 4. ???
 5. ???
 6. ???
 7. 100.65.10.33          10.8%   120    1.8   1.1   0.8  14.1   1.3
 8. ???
 9. ???
10. 52.93.7.108            7.5%   120    3.4  11.5   2.7  50.4  12.1
11. 52.93.7.29             5.9%   119    1.4   1.6   1.2   9.8   0.9
12. inex1.as13335.net     10.9%   119    2.0   2.7   1.5  42.7   4.7
13. one.one.one.one        9.2%   119    1.6   1.6   1.4   4.0   0.3
If you have specific questions, I can test again and report further information.