BGP crash
Closed, Resolved, Public, BUG

Description

Currently running a recent nightly build

I was removing two BGP peers, and the commit killed the entire BGP process. VRRP kept running and this router stayed primary, which made it a bit ugly.

All I did was enter two commands to delete BGP neighbors and then commit:

delete protocols bgp 65003 neighbor 2001:504:1::a500:5976:1
delete protocols bgp 65003 neighbor 198.32.160.107
commit

and all hell broke loose. BGP completely stopped.

Had to reboot and it came back ok.

Version: VyOS 1.2.0-rolling+201811081606
Built by: [email protected]
Built on: Thu 08 Nov 2018 16:06 UTC
Build ID: 1f00e88a-f512-452e-809a-d4a169b4a618

Architecture: x86_64
Boot via: installed image
System type: bare metal

Details

Difficulty level: Unknown (requires assessment)
Version: VyOS 1.2.0-rolling+201811081606
Why the issue appeared: Will be filled on close

Event Timeline

syncer triaged this task as Normal priority. Nov 11 2018, 7:37 PM
syncer edited projects, added VyOS 1.2 Crux (VyOS 1.2.0-rc8); removed VyOS 1.2 Crux.

Deleting neighbors works in general, so we need an exact procedure that reproduces the crash.

vyos@vyos-test-1# set protocols bgp 64444 neighbor 2001:db8::1 remote-as 64566
[edit]
vyos@vyos-test-1# set protocols bgp 64444 neighbor 192.0.2.1 remote-as 64577
[edit]
vyos@vyos-test-1# commit

vyos@vyos-test-1# delete protocols bgp 64444 neighbor 2001:db8::1 
[edit]
vyos@vyos-test-1# delete protocols bgp 64444 neighbor 192.0.2.1 
[edit]
vyos@vyos-test-1# commit

vyos@vyos-test-1# run show ip bgp summary 

IPv4 Unicast Summary:
BGP router identifier 19.46.2.253, local AS number 64444 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 1, using 21 KiB of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
10.217.32.254   4      64840       0      55        0    0    0    never       Active

Total number of neighbors 1
dmbaturin changed the task status from Open to On hold. Nov 28 2018, 11:35 PM
dmbaturin claimed this task.

I'm putting this on hold until we receive a procedure that reliably reproduces the problem.

I think we can close this task.
Nothing like that has happened in the last few months.