
BGPD crash in VyOS 1.2.5
Closed, Invalid · Public · BUG

Description

Log from today. I think the first lines are not related, but we were only changing some route-map rules at the time. The only difference from our other routers that I can find is a typo in a prefix in a prefix-list, which produced the warning on the first line of the log.
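For reference, the warning in question reads as if the host bits of the mistyped prefix were masked off so that the address matches the /24 length. A minimal sketch of that normalization, assuming Python's standard ipaddress module as a stand-in (normalize_prefix is a hypothetical helper for illustration, not an FRR or VyOS function):

```python
import ipaddress


def normalize_prefix(text):
    """Return (normalized_prefix, changed) for a prefix written as 'addr/len'.

    strict=False tells ipaddress to mask off any host bits instead of
    raising ValueError, which mirrors the adjustment reported in the log.
    """
    network = ipaddress.ip_network(text, strict=False)
    normalized = str(network)
    return normalized, normalized != text


# The mistyped prefix from the prefix-list and its corrected form.
print(normalize_prefix("10.1.11.1/24"))  # ('10.1.11.0/24', True)
print(normalize_prefix("10.1.11.0/24"))  # ('10.1.11.0/24', False)
```

Running candidate prefixes through a check like this before committing them would flag the typo up front; whether the typo has anything to do with the crash is not established here.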

Apr 24 09:09:34 router1 bgpd[1210]: Prefix-list AS65001-own-prefixes-v4 prefix changed from 10.1.11.1/24 to 10.1.11.0/24 to match length
Apr 24 09:22:04 router1 bgpd[1210]: bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.1.11.1 in vrf default
Apr 24 09:41:06 router1 bgpd[1210]: bgp_update_receive: rcvd End-of-RIB for IPv4 Unicast from 10.1.11.1 in vrf default
Apr 24 10:21:40 router1 bgpd[1210]: Received signal 11 at 1587723700 (si_addr 0x8, PC 0x7fd15fe57316); aborting...
Apr 24 10:21:40 router1 bgpd[1210]: Backtrace for 18 stack frames:
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_backtrace_sigsafe+0x67) [0x7fd15fe3a3c7]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(zlog_signal+0x113) [0x7fd15fe3a823]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x73b85) [0x7fd15fe5bb85]
Apr 24 10:21:40 router1 bgpd[1210]: /lib/x86_64-linux-gnu/libpthread.so.0(+0xf890) [0x7fd15ec62890]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(route_map_upd8_dependency+0x1c6) [0x7fd15fe57316]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(route_map_add_match+0x1d7) [0x7fd15fe57aa7]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(generic_match_add+0x1d) [0x7fd15fe57b8d]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x3182f) [0x7fd15fe1982f]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(cmd_execute_command+0xe7) [0x7fd15fe1bab7]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(cmd_execute+0xb7) [0x7fd15fe1bc57]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x86995) [0x7fd15fe6e995]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x86c26) [0x7fd15fe6ec26]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x894e6) [0x7fd15fe714e6]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(thread_call+0x60) [0x7fd15fe69530]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(frr_run+0xd8) [0x7fd15fe38658]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/frr/bgpd(main+0x322) [0x555dfb569ca2]
Apr 24 10:21:40 router1 bgpd[1210]: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fd15e8c9b45]
Apr 24 10:21:40 router1 bgpd[1210]: /usr/lib/frr/bgpd(+0x3ad9c) [0x555dfb56bd9c]
Apr 24 10:21:40 router1 bgpd[1210]: in thread vtysh_read scheduled from lib/vty.c:2645
Apr 24 10:21:40 router1 watchfrr[1174]: [EC 268435457] bgpd state -> down : read returned EOF
Apr 24 10:21:45 router1 watchfrr.sh[89343]: Cannot stop bgpd: pid 1210 not running
Apr 24 10:21:47 router1 watchfrr[1174]: bgpd state -> up : connect succeeded
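In case it helps whoever looks at this with debug symbols: the faulting PC from the "Received signal 11" line can be turned into an offset inside libfrr.so.0. The sketch below assumes that the unnamed frames of the form (+0x...) give offsets from the library's load address; that is an assumption about the backtrace format, although all unnamed libfrr frames in this log agree on the same page-aligned base. FRAME_RE and load_base are hypothetical names used only for illustration.

```python
import re

# Matches "(symbol+0xOFF) [0xADDR]"; the symbol part is empty for unnamed frames.
FRAME_RE = re.compile(
    r"\((?P<sym>[^+)]*)\+0x(?P<off>[0-9a-f]+)\) \[0x(?P<addr>[0-9a-f]+)\]"
)


def load_base(frame_lines):
    """Derive one library's load base from its unnamed backtrace frames.

    Assumption: for frames without a symbol name, the offset is relative
    to the load address of the object. All lines passed in must belong to
    the same library, otherwise the bases will not agree.
    """
    bases = set()
    for line in frame_lines:
        match = FRAME_RE.search(line)
        if match and match.group("sym") == "":
            bases.add(int(match.group("addr"), 16) - int(match.group("off"), 16))
    if len(bases) != 1:
        raise ValueError("frames do not agree on a single load base")
    return bases.pop()


# Unnamed libfrr.so.0 frames copied from the log above.
frames = [
    "/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x73b85) [0x7fd15fe5bb85]",
    "/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x3182f) [0x7fd15fe1982f]",
    "/usr/lib/x86_64-linux-gnu/frr/libfrr.so.0(+0x86995) [0x7fd15fe6e995]",
]
base = load_base(frames)
faulting_pc = 0x7FD15FE57316  # PC reported on the "Received signal 11" line

print(hex(base))                # 0x7fd15fde8000
print(hex(faulting_pc - base))  # 0x6f316
```

If the assumption holds, the resulting offset could be passed to a tool such as addr2line together with a libfrr.so.0 build that has matching debug symbols, to find the source line inside route_map_upd8_dependency.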

Details

Difficulty level
Unknown (require assessment)
Version
1.2.5
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

We have no RPKI filtering active yet, so https://github.com/FRRouting/frr/issues/5458 does not seem to be related.

@Viacheslav it happened again yesterday, but the stack trace was different. This time the log complained that bgpd did not respond, and the FRR watch process (watchfrr) tried to restart it, which of course did not help the situation.
I will continue to monitor, but I think we can close this issue and wait for more details when it happens again.