
BGP peers dropping randomly
Closed, Resolved | Public | BUG

Description

We are running 1.2.5-epa and are seeing BGP peers drop at random. Mostly IPv6, sometimes IPv4.
The previous router was running VyOS 1.1.8 and did not have this problem with the same peers.

BGP log shows these lines:

Apr 02 xxxx:xxxx:43 tdcg-ams-er02 bgpd[1538]: %NOTIFICATION: sent to neighbor xxxx:xxxx:0:1::123:3 4/0 (Hold Timer Expired) 0 bytes
Apr 02 xxxx:xxxx:36 tdcg-ams-er02 bgpd[1538]: bgp_update_receive: rcvd End-of-RIB for IPv6 Unicast from xxxx:xxxx:0:1::123:3 in vrf default
Apr 02 xxxx:xxxx:34 tdcg-ams-er02 bgpd[1538]: %ADJCHANGE: neighbor xxxx:xxxx:0:1::123:3(Unknown) in vrf default Up
Apr 02 xxxx:xxxx:33 tdcg-ams-er02 bgpd[1538]: [EC 33554503] xxxx:xxxx:0:1::123:3 unrecognized capability code: 71 - ignored
Apr 02 xxxx:xxxx:32 tdcg-ams-er02 bgpd[1538]: %ADJCHANGE: neighbor xxxx:xxxx:0:1::123:3(Unknown) in vrf default Down BGP Notification send
Apr 02 xxxx:xxxx:32 tdcg-ams-er02 bgpd[1538]: %NOTIFICATION: sent to neighbor xxxx:xxxx:0:1::123:3 4/0 (Hold Timer Expired) 0 bytes
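To see which peers are affected and how often, the "Hold Timer Expired" notifications can be tallied per neighbor. The following is only a rough sketch: it assumes the bgpd log lines keep the format shown above, and the function name and sample list are made up for illustration.

```python
import re
from collections import Counter

# Matches bgpd lines of the form shown in this report:
#   "... %NOTIFICATION: sent to neighbor <addr> 4/0 (Hold Timer Expired) 0 bytes"
HOLD_TIMER_RE = re.compile(
    r"%NOTIFICATION: sent to neighbor (?P<peer>\S+) 4/0 \(Hold Timer Expired\)"
)


def count_hold_timer_expiries(lines):
    """Return a Counter mapping neighbor address -> number of hold timer expiries sent."""
    counts = Counter()
    for line in lines:
        match = HOLD_TIMER_RE.search(line)
        if match:
            counts[match.group("peer")] += 1
    return counts


# Three of the (redacted) lines from the log excerpt above.
SAMPLE_LINES = [
    "Apr 02 xxxx:xxxx:43 tdcg-ams-er02 bgpd[1538]: %NOTIFICATION: sent to neighbor xxxx:xxxx:0:1::123:3 4/0 (Hold Timer Expired) 0 bytes",
    "Apr 02 xxxx:xxxx:34 tdcg-ams-er02 bgpd[1538]: %ADJCHANGE: neighbor xxxx:xxxx:0:1::123:3(Unknown) in vrf default Up",
    "Apr 02 xxxx:xxxx:32 tdcg-ams-er02 bgpd[1538]: %NOTIFICATION: sent to neighbor xxxx:xxxx:0:1::123:3 4/0 (Hold Timer Expired) 0 bytes",
]

if __name__ == "__main__":
    for peer, count in count_hold_timer_expiries(SAMPLE_LINES).most_common():
        print(f"{peer}: {count} hold timer expiries")
```

In practice the lines would come from wherever bgpd logs on the router rather than from a hard-coded list.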

Details

Difficulty level
Unknown (require assessment)
Version
1.2.5-epa
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

There are no other reports of this issue.
We need a valid reproduction procedure; otherwise there is not much we can do.

Upgraded the firmware of the X710 adapters from 6.0 to 6.8; waiting for Dell to get 7.0 and 7.2 ready. For now the sessions have been stable for 18 hours, so I am a little optimistic that this was a firmware issue and not bgpd.

@syncer Re-opening this. We just had the exact same incident on a different router, with an IPv6 BGP session on an RJ45 connection, so that would rule out the Intel X710 card as the cause of this issue.

Any L2/L3 issue affecting TCP between the BGP speakers will cause this message. Looking forward to a TCP dump of the traffic when it occurs.

Merijn reopened this task as Needs testing. Apr 28 2020, 3:34 PM

@thomas-mangin the sessions are still stable, for 7 days now. The only thing that changed was the max_size limit. No packet loss has been observed on the IPv6 connections during this time either.

Some statistics from ipv6 bgp summary

RIB entries 167048, using 29 MiB of memory
Peers 59, using 1206 KiB of memory
Peer groups 12, using 768 bytes of memory

This router currently receives 2 full tables of 86758 routes,
6 partial tables of 22000 to 45000 routes (some are full tables but with less-preferred paths), and some 50 peers with <1000 routes each.

It would really help testing this setting if someone else has about the same setup in number of peers and routes.

We've got full IPv4 and IPv6 routing tables on our VyOS boxes, and we *definitely* needed to increase net.ipv6.route.max_size (we picked 256k to give us some headroom).

Marek Isalski Today at 6:31 PM

I'm pretty sure that the contents of /proc/sys/net/ipv6/route/max_size is not a cache, but the actual number of slots that the routing table can contain. And I believe this because without changing it on our core routers, we were blackholing IPv6 traffic (we were learning the routes into FRR, but they were not being installed in the Linux kernel…).

Worse: if routes flapped, sometimes our loopback routes would be de-installed, something was learned from eBGP, and now the loopback would not end up reprogrammed into the Linux kernel's route table. This meant we had internal routing problems during our testing of VyOS back in April 2019. It then dawned on me: max_size needed to be increased (I had seen this before on a Quagga-based deployment, but hadn't expected VyOS to have a low number of IPv6 routes).
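One way to check whether a router is close to the limit described here is to compare max_size with the number of IPv6 routes the kernel currently holds. This is only a rough sketch: it assumes, as the comment argues, that max_size caps installed routes, and it assumes that counting the lines of /proc/net/ipv6_route is a usable approximation of the kernel's own counter. The function name is made up for illustration.

```python
from pathlib import Path

# Path quoted in the comment above.
MAX_SIZE_PATH = Path("/proc/sys/net/ipv6/route/max_size")
# Assumption: one line per IPv6 route entry known to the kernel.
ROUTE_TABLE_PATH = Path("/proc/net/ipv6_route")


def route_headroom(max_size_text, route_table_text):
    """Return (max_size, routes_in_use, free_slots) from the raw file contents."""
    max_size = int(max_size_text.strip())
    in_use = sum(1 for line in route_table_text.splitlines() if line.strip())
    return max_size, in_use, max_size - in_use


if __name__ == "__main__":
    try:
        max_size, in_use, free = route_headroom(
            MAX_SIZE_PATH.read_text(), ROUTE_TABLE_PATH.read_text()
        )
    except (OSError, ValueError) as exc:
        # Not on Linux, IPv6 disabled, or unexpected file contents.
        print(f"could not read kernel route data: {exc}")
    else:
        print(f"max_size={max_size} in_use={in_use} free={free}")
```

If the free count is near zero or negative while FRR reports more routes than the kernel has, that would be consistent with the blackholing described above.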

We deliberately set our routing table sizes to this in our production deployment:
sysctl:

custom:
    net.ipv4.route.max_size: 2097152
    net.ipv6.route.max_size: 262144
    net.ipv4.conf.all.log_martians: 0
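
For anyone not using the same configuration tooling, the values above can be rendered in plain `key = value` sysctl.conf syntax. The sketch below only formats the three settings from this comment. The function name is hypothetical, and where such a file should live on VyOS, or whether it survives an upgrade, is not something this thread confirms.

```python
# The three settings quoted in the comment above.
SETTINGS = {
    "net.ipv4.route.max_size": 2097152,
    "net.ipv6.route.max_size": 262144,
    "net.ipv4.conf.all.log_martians": 0,
}


def to_sysctl_conf(settings):
    """Render a mapping of sysctl keys to values as sysctl.conf-style lines."""
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


if __name__ == "__main__":
    print(to_sysctl_conf(SETTINGS), end="")
```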

Marek Isalski 1 hour ago
The reason why I don't set them both to 2 billion is that, especially with IPv6, the Linux kernel inserts a temporary "cache" entry into the IPv6 route table radix tree. This uses up RAM. When there is no RAM left, the kernel might OOPS. This was the subject of the CVE-2018-19299 bug in MikroTik's RouterOS, which had set the IPv6 max_size so high that the routers became vulnerable to remote unauthenticated DoS: an attacker only had to make them transit IPv6 packets to many destination addresses.

Marek Isalski 1 hour ago
So instead I set them to be big enough that I don't have to worry for the next 2-3 years, based on the current rate of growth of the DFZ. If somebody needs more routes (e.g. they actually have millions of /32s on their internal network), they can obviously sysctl it themselves. But to me, it seemed a sensible default to start with a router which is "production-ready" for the DFZ, while not potentially being able to DoS itself through RAM exhaustion.
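
The sizing argument can be made concrete with a simple compound-growth estimate. This is a back-of-the-envelope sketch, not the commenter's actual method: the 86758-route IPv6 full table and the 262144 limit come from this thread, while the 25% annual growth rate is purely a placeholder assumption. It also ignores the temporary cache entries mentioned above, which consume slots too.

```python
import math


def years_until_full(current_routes, max_size, annual_growth):
    """Years until current_routes, growing by annual_growth per year, reaches max_size."""
    if current_routes <= 0 or annual_growth <= 0:
        raise ValueError("current_routes and annual_growth must be positive")
    if current_routes >= max_size:
        return 0.0
    # Solve current_routes * (1 + annual_growth) ** years == max_size for years.
    return math.log(max_size / current_routes) / math.log(1 + annual_growth)


if __name__ == "__main__":
    # 0.25 is a hypothetical growth rate, not a measured one.
    years = years_until_full(86758, 262144, 0.25)
    print(f"about {years:.1f} years of headroom at the assumed growth rate")
```

Plugging in a measured growth rate instead of the placeholder gives a quick way to revisit the limit as the table grows.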

Marek Isalski 1 hour ago
The "cache" entries are cleaned up when the IPv6 table is full (and also periodically)… but there is still the limit of not being able to install more than max_size routes in the kernel — whether they've come from FRR, kernel routes for interfaces, or manually via ip route add

Just to confirm: increasing route.max_size fixed this issue completely. I think this can be closed, but maybe we should make these settings the default before closing.

Unknown Object (User) added a subscriber: Unknown Object (User). Aug 20 2020, 6:06 PM