All Stories
Aug 7 2023
Nope, now I had to do
@aalmenar could you test this patch?
Information that can be useful for this feature request:
If that were PPPoE I'd have thought of ARP, but here, with a fixed number of L2TP tunnels (22 tunnels from LACs), I don't think the ARP cache overflows the table.
Some more information which I can't yet tie to a failure reason, but it looks strange: just before the issue we see the LAC drop an L2TP tunnel for some reason and start sending SCCRQ with tid=0 as if it had just started up. After a while the accel-ppp daemon drops the old tunnels and starts new ones for a few LACs. I guess this causes massive (thousands of) route updates between zebra and the kernel. Sometimes the system can withstand this, sometimes it can't.
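One way to watch that churn on the next occurrence (just a suggestion, not something from the original report) is to log netlink route events with iproute2 while the issue develops:

ip -timestamp monitor route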
I tried digging through Google to see if somebody else has encountered the same, but I couldn't find any obvious hints (except for the zebra nexthop-group keep 1 workaround already mentioned).
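For anyone else landing here, that workaround would be applied roughly like this (a sketch; the exact syntax may differ between FRR versions):

vtysh -c 'configure terminal' -c 'zebra nexthop-group keep 1'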
I checked the FRR version in the recent rolling release - it is still a release candidate. Is it worth upgrading from 8.5.2? As for the possibility - yes, sure, we can build the latest image.
I added a comment to https://github.com/FRRouting/frr/issues/12239, so hopefully there are some other commands or steps, beyond the debug commands, to hunt this thing down.
Adding what was available this time. I will try to turn on debugs next time if we get another chance. Yes, the behavior was identical to the previous one.
And the logs look the same as in your original post?
After 19 hours of production run since yesterday, the failure occurred again despite the workaround being applied. Routes are cleared from the kernel for some reason. During the run we observed a few L2TP tunnel drops followed by 600 to 6000 session drops. The reason is not clear for now, but I'm not sure this should kill zebra functionality this way.
Fixed
set qos interface eth1 egress 'VyOS-HTB'
set qos policy shaper VyOS-HTB bandwidth '100mbit'
set qos policy shaper VyOS-HTB class 10 bandwidth '40%'
set qos policy shaper VyOS-HTB class 10 description 'dscp_EF_ipprec_5_GETS'
set qos policy shaper VyOS-HTB class 10 match AF11 ip dscp 'AF11'
set qos policy shaper VyOS-HTB class 10 priority '1'
set qos policy shaper VyOS-HTB class 10 queue-type 'fair-queue'
set qos policy shaper VyOS-HTB class 20 bandwidth '30%'
set qos policy shaper VyOS-HTB class 20 description 'dscp_AF4x_ipprec_4'
set qos policy shaper VyOS-HTB class 20 match ef ip dscp 'EF'
set qos policy shaper VyOS-HTB class 20 priority '2'
set qos policy shaper VyOS-HTB class 20 queue-type 'fair-queue'
set qos policy shaper VyOS-HTB default bandwidth '20%'
set qos policy shaper VyOS-HTB default queue-type 'fq-codel'
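For what it's worth, the qdisc and class tree this produces in the kernel can be inspected with tc (eth1 as configured above):

tc qdisc show dev eth1
tc -s class show dev eth1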
Don't count on it - the way things work on the internet is that there are a lot of people complaining about stuff but very few who actually do something about it :-)
The way I use it is a bit unusual. I have ESXi installed on the host, and since ESXi has no driver for it, I pass it through to VyOS and then bridge it with a vmxnet interface, so that hosts in the same virtual switch can use that interface instead of the USB one I use for ESXi remote access.
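For reference, the bridge part of that setup in VyOS CLI looks roughly like this (eth1 for the passed-through USB NIC and eth2 for the vmxnet interface are hypothetical names):

set interfaces bridge br0 member interface eth1
set interfaces bridge br0 member interface eth2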
@c-po Tried with the latest rolling 1.4-rolling-202308060317; rpki doesn't start automatically, one must do:
The latest rolling uses FRR 9.0 - could you re-test it, please?
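After booting the new rolling image, the FRR version in use can be confirmed from the shell with:

vtysh -c 'show version'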
Aug 6 2023
Running into this as well on 1.4-rolling-202307260317.
Let's keep this one open for some more time and see whether the issue gets resolved or not.
If it crashes, it should be reported upstream to kernel.org (and to the maintainer of the r8169 driver), since VyOS uses the latest Linux Kernel LTS (version 6.1.43 as of writing):
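For the upstream report, the driver and firmware details are usually asked for; they can be gathered like this (eth0 being a placeholder for the affected NIC):

ethtool -i eth0
dmesg | grep -i r8169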
Aug 5 2023
There is a Bugzilla entry open for this issue: https://bugzilla.netfilter.org/show_bug.cgi?id=1697
I can confirm that updating the blacklist is now VRF-aware and functional:
PR created: https://github.com/vyos/vyos-1x/pull/2135
Added task https://vyos.dev/T5440 to fix the issue where the preconfig-script doesn't show up in /config/scripts after a system upgrade (add system image).
The reason *I* use chrony with my Linux QEMU guests is that it supports using kvm_ptp to get the KVM hypervisor's time as a sync source, so I don't need the VM to chat with NTP servers.
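For reference, a minimal sketch of that setup inside the guest, assuming the KVM PTP clock shows up as /dev/ptp0 once the ptp_kvm module is loaded (the device path may differ on your system):

# load the KVM PTP clock driver in the guest
modprobe ptp_kvm
# then in chrony.conf, use the PTP hardware clock as the reference:
refclock PHC /dev/ptp0 poll 2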