VPP: daemon crashes with SIGSEGV (faulting address) after route-no-paths was enabled by default
Closed, Resolved | Public | BUG

Description

After route-no-paths was enabled by default, the VPP daemon crashes with SIGSEGV and a faulting address is logged.
It occurs after a system update, and sometimes on a newly installed system as well.

To reproduce

delete vpp
commit

set vpp interfaces gre gre2 remote '192.0.2.25'
set vpp interfaces gre gre2 source-address '192.0.2.1'
set vpp settings interface eth1 driver 'dpdk'
set vpp settings unix poll-sleep-usec '12'
commit

set vpp interfaces gre gre2 kernel-interface 'vpptun12'
commit && ip link show dev vpptun12

Logs after the second commit:

Feb 03 18:55:27 r16 vyos-configd[792]: Sending reply: SUCCESS with output
Feb 03 18:55:27 r16 vyos-configd[792]: Received message: {"type": "node", "last": true, "data": "VYOS_TAGNODE_VALUE=gre2/usr/libexec/vyos/conf_mode/vpp_interfaces_gre.py"}
Feb 03 18:55:27 r16 vpp[49748]: received signal SIGSEGV, PC 0x7f3f28603883, faulting address 0x7f4ee8179e90
Feb 03 18:55:27 r16 vpp[49748]: #0  0x00007f3f27c3c162 0x7f3f27c3c162
Feb 03 18:55:27 r16 vpp[49748]: #1  0x00007f3f2795c050 0x7f3f2795c050
Feb 03 18:55:27 r16 vpp[49748]: #2  0x00007f3f28603883 0x7f3f28603883
Feb 03 18:55:27 r16 vpp[49748]: #3  0x00007f3f27bdda9b 0x7f3f27bdda9b
Feb 03 18:55:27 r16 vpp[49748]: #4  0x00007f3f27be3045 vlib_main + 0x1195
Feb 03 18:55:27 r16 vpp[49748]: #5  0x00007f3f27c3b58a 0x7f3f27c3b58a
Feb 03 18:55:27 r16 vpp[49748]: #6  0x00007f3f27b9895c 0x7f3f27b9895c
Feb 03 18:55:27 r16 (udev-worker)[50001]: Network interface NamePolicy= disabled on kernel command line.
Feb 03 18:55:27 r16 netplugd[1075]: eth1: state INNING flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001002 BROADCAST,MULTICAST
Feb 03 18:55:27 r16 vyos-configd[792]: Sending reply: SUCCESS with output
Feb 03 18:55:27 r16 vyos-configd[792]: scripts_called: ['vpp', 'vpp_interfaces_gre_gre2']
Feb 03 18:55:27 r16 systemd[1]: vpp.service: Main process exited, code=killed, status=6/ABRT
Feb 03 18:55:27 r16 systemd[1]: vpp.service: Failed with result 'signal'.
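
The journal above prints unsymbolized frames, so it helps to know which frame corresponds to the crash PC. The following is a minimal sketch, not part of VyOS or VPP, that extracts the signal, PC and faulting address from such lines and reports the index of the stack frame whose address equals the PC. The function name `find_faulting_frame` and the regular expressions are assumptions based only on the log format shown in this task.

```python
import re

# Signal line that VPP writes to the journal when it crashes.
CRASH_RE = re.compile(
    r"received signal (?P<signal>SIG[A-Z]+), "
    r"PC (?P<pc>0x[0-9a-f]+), "
    r"faulting address (?P<addr>0x[0-9a-f]+)"
)
# Raw stack frames that follow, e.g. "#2  0x00007f3f28603883 0x7f3f28603883".
FRAME_RE = re.compile(r"#(?P<index>\d+)\s+(?P<pc>0x[0-9a-f]+)")


def find_faulting_frame(log_lines):
    """Return (signal, pc, faulting_address, frame_index) or None.

    frame_index is the first stack frame after the signal line whose
    address equals the crash PC, or None when no frame matches.
    """
    crash = None
    for line in log_lines:
        if crash is None:
            # Still looking for the signal line.
            crash = CRASH_RE.search(line)
            continue
        frame = FRAME_RE.search(line)
        if frame and int(frame["pc"], 16) == int(crash["pc"], 16):
            return crash["signal"], crash["pc"], crash["addr"], int(frame["index"])
    if crash:
        return crash["signal"], crash["pc"], crash["addr"], None
    return None


sample = [
    "vpp[49748]: received signal SIGSEGV, PC 0x7f3f28603883, faulting address 0x7f4ee8179e90",
    "vpp[49748]: #0  0x00007f3f27c3c162 0x7f3f27c3c162",
    "vpp[49748]: #1  0x00007f3f2795c050 0x7f3f2795c050",
    "vpp[49748]: #2  0x00007f3f28603883 0x7f3f28603883",
]
print(find_faulting_frame(sample))
# ('SIGSEGV', '0x7f3f28603883', '0x7f4ee8179e90', 2)
```

For the log in this task, the PC matches frame #2, which is the frame to resolve first when symbols are available.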

Everything works fine if we use the same config with the ignore-kernel-routes option:

set vpp settings lcp ignore-kernel-routes

Details

Version: VyOS 1.5-rolling-202502030007
Is it a breaking change?: Unspecified (possibly destroys the router)
Issue type: Bug (incorrect behavior)

Event Timeline

Viacheslav triaged this task as High priority.

The bug does not reproduce via the API:

from vyos.vpp.interface import GREInterface
a = GREInterface(ifname='gre0', source_address='192.0.2.1', remote='203.0.113.25', tunnel_type='l3', kernel_interface='vpp-gre0')
a.add()
a.kernel_add()

Test with gdb

# 01 first console
delete vpp
commit

set vpp settings interface eth1 driver 'dpdk'
set vpp settings unix poll-sleep-usec '12'
commit

sudo gdb -p $(pgrep vpp)
# press 'c'


# 02 another console/ssh
set vpp interfaces gre gre2 remote '192.0.2.25'
set vpp interfaces gre gre2 source-address '192.0.2.1'
set vpp interfaces gre gre2 kernel-interface vpptun12

commit && ip link show

GDB

vyos@r14# set vpp settings interface eth1 driver 'dpdk'
[edit]
vyos@r14# set vpp settings unix poll-sleep-usec '12'
[edit]
vyos@r14# commit
[edit]
vyos@r14# 
[edit]
vyos@r14# sudo gdb -p $(pgrep vpp)


GNU gdb (Debian 13.1-3) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 25303
[New LWP 25321]
[New LWP 25414]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f582b8ed505 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) 
(gdb) 
(gdb) c
Continuing.

Thread 1 "vpp_main" received signal SIGSEGV, Segmentation fault.
0x00007f582c51006b in ip4_mtrie_16_lookup_step (dst_address_byte_index=2, dst_address=<optimized out>, current_leaf=<optimized out>)
    at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vnet/ip/ip4_mtrie.h:215
215	/__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vnet/ip/ip4_mtrie.h: No such file or directory.
(gdb) 
Continuing.

Thread 1 "vpp_main" received signal SIGABRT, Aborted.
0x00007f582b8a8ebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) backtrace
#0  0x00007f582b8a8ebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f582b859fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f582b844472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x000055f7a66c4e8a in os_exit (code=<optimized out>) at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vpp/vnet/main.c:464
#4  <signal handler called>
#5  0x00007f582c51006b in ip4_mtrie_16_lookup_step (dst_address_byte_index=2, dst_address=<optimized out>, current_leaf=<optimized out>)
    at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vnet/ip/ip4_mtrie.h:215
#6  ip4_fib_forwarding_lookup (addr=<optimized out>, fib_index=3) at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vnet/fib/ip4_fib.h:146
#7  ip4_local_check_src (is_receive_dpo=1, error0=<synthetic pointer>, last_check=<synthetic pointer>, ip0=<optimized out>, b=0x1002618180)
    at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vnet/ip/ip4_forward.c:1544
#8  ip4_local_inline (vm=0x7f57eb6cd700, node=<optimized out>, frame=<optimized out>, head_of_feature_arc=<optimized out>, is_receive_dpo=1)
    at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vnet/ip/ip4_forward.c:1837
#9  0x00007f582badf37b in dispatch_node (last_time_stamp=<optimized out>, frame=0x7f57eea787c0, dispatch_state=VLIB_NODE_STATE_POLLING, type=VLIB_NODE_TYPE_INTERNAL, node=0x7f57ebcade40, vm=0x7f57eb6cd700)
    at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vlib/main.c:949
#10 dispatch_pending_node (vm=vm@entry=0x7f57eb6cd700, pending_frame_index=pending_frame_index@entry=4, last_time_stamp=<optimized out>)
    at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vlib/main.c:1106
#11 0x00007f582bae4985 in vlib_main_or_worker_loop (is_main=1, vm=<optimized out>) at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vlib/main.c:1581
#12 vlib_main_loop (vm=<optimized out>) at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vlib/main.c:1706
#13 vlib_main (vm=<optimized out>, vm@entry=0x7f57eb6cd700, input=input@entry=0x7f57e4466fa0)
    at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vlib/main.c:2017
#14 0x00007f582bb3d23a in thread0 (arg=140015588660992) at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vlib/unix/main.c:659
#15 0x00007f582ba982bc in clib_calljmp () at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vppinfra/longjmp.S:123
#16 0x00007ffee3ee7ca0 in ?? ()
#17 0x00007f582bb3e876 in vlib_unix_main (argc=<optimized out>, argv=<optimized out>)
    at /__w/vyos-reusable-workflows/vyos-reusable-workflows/vyos-build/scripts/package-build/vpp/vpp/src/vlib/unix/main.c:755
#18 0x000000000003e000 in ?? ()
#19 0x000000000003e000 in ?? ()
#20 0x000000000003e000 in ?? ()
#21 0x000000000000b8b4 in ?? ()
#22 0x000000000000b8b4 in ?? ()
#23 0x0000000000001000 in ?? ()
#24 0x0000000600000001 in ?? ()
#25 0x0000000000049a90 in ?? ()
#26 0x000000000004aa90 in ?? ()
#27 0x000000000004aa90 in ?? ()
#28 0x0000000000001db0 in ?? ()
#29 0x0000000000002f40 in ?? ()
#30 0x0000000000001000 in ?? ()
#31 0x0000000600000002 in ?? ()
#32 0x0000000000049ce0 in ?? ()
#33 0x000000000004ace0 in ?? ()
#34 0x000000000004ace0 in ?? ()
#35 0x00000000000001f0 in ?? ()
#36 0x00000000000001f0 in ?? ()
#37 0x0000000000000008 in ?? ()
#38 0x0000000400000004 in ?? ()
#39 0x0000000000000238 in ?? ()
#40 0x0000000000000238 in ?? ()
#41 0x0000000000000238 in ?? ()
#42 0x0000000000000024 in ?? ()
#43 0x0000000000000024 in ?? ()
#44 0x0000000000000004 in ?? ()
--Type <RET> for more, q to quit, c to continue without paging-- c
#45 0x000000046474e550 in ?? ()
#46 0x0000000000045790 in ?? ()
#47 0x0000000000045790 in ?? ()
#48 0x0000000000045790 in ?? ()
#49 0x0000000000000a2c in ?? ()
#50 0x0000000000000a2c in ?? ()
#51 0x0000000000000004 in ?? ()
#52 0x000000066474e551 in ?? ()
#53 0x0000000000000000 in ?? ()
(gdb) 
(gdb) 
(gdb) c
Continuing.
Couldn't get registers: No such process.
(gdb) [Thread 0x7f582b6cdf40 (LWP 25303) exited]
[Thread 0x7f582b6cdf40 (LWP 25414) exited]
[Thread 0x7f57e20776c0 (LWP 25321) exited]
[New process 25303]

Program terminated with signal SIGABRT, Aborted.
The program no longer exists.
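
Most of the frames after #17 in the backtrace above are unresolved (`?? ()`) and carry no information. As a rough sketch, the helper below keeps only frames that gdb resolved to a function name. The name `symbolized_frames` and the regular expression are assumptions derived from the backtrace format shown here, and the helper is not part of any gdb or VyOS tooling.

```python
import re

# Matches frames such as
#   "#5  0x00007f582c51006b in ip4_mtrie_16_lookup_step (...)"
# and inlined frames without an address such as
#   "#6  ip4_fib_forwarding_lookup (...)".
FRAME_RE = re.compile(r"^#(?P<index>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\S+) \(")


def symbolized_frames(backtrace_text):
    """Return [(index, function)] for frames that resolved to a symbol."""
    frames = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line)
        # Skip "?? ()" frames; "<signal handler called>" never matches.
        if match and match["func"] != "??":
            frames.append((int(match["index"]), match["func"]))
    return frames


sample_bt = """\
#0  0x00007f582b8a8ebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f582b859fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#4  <signal handler called>
#5  0x00007f582c51006b in ip4_mtrie_16_lookup_step (dst_address_byte_index=2)
#6  ip4_fib_forwarding_lookup (addr=<optimized out>, fib_index=3) at ip4_fib.h:146
#18 0x000000000003e000 in ?? ()
"""
for index, func in symbolized_frames(sample_bt):
    print(index, func)
```

Applied to the full backtrace, this leaves the path from `vlib_unix_main` through `ip4_local_inline` and `ip4_local_check_src` down to `ip4_mtrie_16_lookup_step`, where the SIGSEGV is raised.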

Small note: to reproduce it, you need at least one route that is not related to a VPP interface, for example a default route via eth0 while the VPP interface is eth1.

delete vpp
commit

set interfaces ethernet eth0 address '192.168.122.14/24'
set interfaces ethernet eth1 address '192.0.2.1/30'
set protocols static route 0.0.0.0/0 next-hop 192.168.122.1

set vpp interfaces gre gre2 remote '192.0.2.25'
set vpp interfaces gre gre2 source-address '192.0.2.1'
set vpp settings interface eth1 driver 'dpdk'
set vpp settings unix poll-sleep-usec '12'

set vpp interfaces gre gre2 kernel-interface 'vpptun12'
commit && ip link show dev vpptun12
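
To check whether a system meets the precondition above (at least one kernel route that does not use a VPP interface), the routing table can be inspected before committing the GRE configuration. The sketch below assumes its input is the JSON text printed by `ip -j route show` and that each entry carries `dst` and `dev` keys. The function name `routes_outside_vpp` is hypothetical.

```python
import json


def routes_outside_vpp(ip_route_json, vpp_interfaces):
    """Return destinations of routes whose output device is not a VPP interface."""
    routes = json.loads(ip_route_json)
    return [
        route["dst"]
        for route in routes
        if route.get("dev") not in vpp_interfaces
    ]


# Sample shaped like `ip -j route show` output for the setup in this task.
sample_routes = json.dumps([
    {"dst": "default", "gateway": "192.168.122.1", "dev": "eth0", "protocol": "static"},
    {"dst": "192.168.122.0/24", "dev": "eth0", "protocol": "kernel"},
    {"dst": "192.0.2.0/30", "dev": "eth1", "protocol": "kernel"},
])
print(routes_outside_vpp(sample_routes, {"eth1"}))
# ['default', '192.168.122.0/24']
```

A non-empty result means the system has routes of the kind described in the note above, so the crash may be reproducible there.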

CI can pass this test because it has no additional routes for VPP to import; the failure does not depend on whether the system was updated or newly installed.

For example, the first two test runs below were made without any routes, and the third after adding a default route:

vyos@r16:~$ /usr/libexec/vyos/tests/smoke/cli/test_vpp.py
test_01_vpp_basic (__main__.TestVPP.test_01_vpp_basic) ... ok
test_02_vpp_vxlan (__main__.TestVPP.test_02_vpp_vxlan) ... ok
test_03_vpp_gre (__main__.TestVPP.test_03_vpp_gre) ... ok
test_04_vpp_geneve (__main__.TestVPP.test_04_vpp_geneve) ... skipped 'Skipping this test geneve index always is 0'
test_05_vpp_loopback (__main__.TestVPP.test_05_vpp_loopback) ... ok
test_06_vpp_bonding (__main__.TestVPP.test_06_vpp_bonding) ... skipped 'Skipping temporary bonding, sometimes get recursion T7117'
test_07_vpp_bridge (__main__.TestVPP.test_07_vpp_bridge) ... ok
test_08_vpp_ipip (__main__.TestVPP.test_08_vpp_ipip) ... ok
test_09_vpp_xconnect (__main__.TestVPP.test_09_vpp_xconnect) ... ok
test_10_vpp_driver_options (__main__.TestVPP.test_10_vpp_driver_options) ... ok
test_11_vpp_cpu_settings (__main__.TestVPP.test_11_vpp_cpu_settings) ... ok

----------------------------------------------------------------------
Ran 11 tests in 135.879s

OK (skipped=2)
vyos@r16:~$ 
vyos@r16:~$ 
vyos@r16:~$ /usr/libexec/vyos/tests/smoke/cli/test_vpp.py
test_01_vpp_basic (__main__.TestVPP.test_01_vpp_basic) ... ok
test_02_vpp_vxlan (__main__.TestVPP.test_02_vpp_vxlan) ... ok
test_03_vpp_gre (__main__.TestVPP.test_03_vpp_gre) ... ok
test_04_vpp_geneve (__main__.TestVPP.test_04_vpp_geneve) ... skipped 'Skipping this test geneve index always is 0'
test_05_vpp_loopback (__main__.TestVPP.test_05_vpp_loopback) ... ok
test_06_vpp_bonding (__main__.TestVPP.test_06_vpp_bonding) ... skipped 'Skipping temporary bonding, sometimes get recursion T7117'
test_07_vpp_bridge (__main__.TestVPP.test_07_vpp_bridge) ... ok
test_08_vpp_ipip (__main__.TestVPP.test_08_vpp_ipip) ... ok
test_09_vpp_xconnect (__main__.TestVPP.test_09_vpp_xconnect) ... ok
test_10_vpp_driver_options (__main__.TestVPP.test_10_vpp_driver_options) ... ok
test_11_vpp_cpu_settings (__main__.TestVPP.test_11_vpp_cpu_settings) ... ok

----------------------------------------------------------------------
Ran 11 tests in 133.499s

OK (skipped=2)
vyos@r16:~$ 
vyos@r16:~$ 
vyos@r16:~$ conf
[edit]
vyos@r16# set protocols static route 0.0.0.0/0 next-hop 192.168.122.1
[edit]
vyos@r16# commit
[edit]
vyos@r16# exit
Warning: configuration changes have not been saved.
exit
vyos@r16:~$ 
vyos@r16:~$ 
vyos@r16:~$ /usr/libexec/vyos/tests/smoke/cli/test_vpp.py
test_01_vpp_basic (__main__.TestVPP.test_01_vpp_basic) ... ok
test_02_vpp_vxlan (__main__.TestVPP.test_02_vpp_vxlan) ... ok
test_03_vpp_gre (__main__.TestVPP.test_03_vpp_gre) ... ERROR
test_03_vpp_gre (__main__.TestVPP.test_03_vpp_gre) ... FAIL
test_04_vpp_geneve (__main__.TestVPP.test_04_vpp_geneve) ... skipped 'Skipping this test geneve index always is 0'
test_05_vpp_loopback (__main__.TestVPP.test_05_vpp_loopback) ... ERROR
test_06_vpp_bonding (__main__.TestVPP.test_06_vpp_bonding) ... skipped 'Skipping temporary bonding, sometimes get recursion T7117'
test_07_vpp_bridge (__main__.TestVPP.test_07_vpp_bridge) ... ERROR
test_08_vpp_ipip (__main__.TestVPP.test_08_vpp_ipip) ... ERROR
test_09_vpp_xconnect (__main__.TestVPP.test_09_vpp_xconnect) ... ERROR
test_10_vpp_driver_options (__main__.TestVPP.test_10_vpp_driver_options) ... ERROR
test_11_vpp_cpu_settings (__main__.TestVPP.test_11_vpp_cpu_settings) ... ERROR
tearDownClass (__main__.TestVPP) ... ERROR

One possible cause is the lack of an IP address (such as the tunnel source IP) on the VPP interface.
The tests passed 5 times with an IP address on eth1.
The latest test, without an IP address on eth1, fails.
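
The hypothesis above can be expressed as a simple pre-check: is the tunnel source address actually assigned to an interface? The sketch below uses only the standard `ipaddress` module. The function name `source_is_configured` is hypothetical, and the check only illustrates the suspected condition; it is not the fix applied in VyOS.

```python
import ipaddress


def source_is_configured(source_address, interface_addresses):
    """Return True if source_address is assigned to one of the interfaces.

    interface_addresses is an iterable of CIDR strings such as '192.0.2.1/30'.
    """
    source = ipaddress.ip_address(source_address)
    return any(
        ipaddress.ip_interface(cidr).ip == source
        for cidr in interface_addresses
    )


# With the address on eth1 (the passing case).
print(source_is_configured("192.0.2.1", ["192.168.122.14/24", "192.0.2.1/30"]))
# Without an address on eth1 (the failing case).
print(source_is_configured("192.0.2.1", ["192.168.122.14/24"]))
```

The first call corresponds to the configuration where the smoketests passed, the second to the configuration where they failed.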

The missing address is not a problem for API calls.
Via API calls, everything works regardless of whether the address exists:

>>> from vyos.vpp.interface import GREInterface
>>> a = GREInterface(ifname='gre0', source_address='192.0.2.1', remote='203.0.113.25', tunnel_type='l3', kernel_interface='vpp-gre0')
>>> a.add()
>>> a.kernel_add()
>>> 

vyos@r14:~$ 
vyos@r14:~$ ip link show dev vpp-gre0
257: vpp-gre0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN mode DEFAULT group default qlen 500
    link/none 
vyos@r14:~$

Smoketest results are attached, with and without an IP address on eth1.

Viacheslav changed the task status from Open to In progress. Mar 20 2025, 11:57 AM
Viacheslav assigned this task to denys.haryachyy.
Viacheslav moved this task from Need Triage to Completed on the VyOS Rolling board.