Brief description
Starting VPP on a VM crashes the Linux kernel with a NULL pointer dereference in hv_uio_channel_cb (module uio_hv_generic), and the VM becomes completely unresponsive.
Testing environment
- Cloud: Azure
- Instance type: Standard D4pls v6 (4 vcpus, 8 GiB memory)
- Accelerated networking: enabled
- Physical NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex Virtual Function] [15b3:101a] (rev 80) (firmware 16.30.5026 (MSF0000000041))
How to reproduce
- Prepare the system for VPP:
set system option kernel cpu disable-nmi-watchdog
set system option kernel cpu isolate-cpus '2-3'
set system option kernel cpu nohz-full '2-3'
set system option kernel cpu rcu-no-cbs '2-3'
set system option kernel disable-hpet
set system option kernel disable-mce
set system option kernel disable-mitigations
set system option kernel disable-softlockup
set system option kernel memory hugepage-size 2M hugepage-count '1024'
- Reboot.
- Configure VPP and try to commit:
set vpp settings cpu corelist-workers '3'
set vpp settings cpu main-core '2'
set vpp settings interface eth0 driver 'dpdk'
set vpp settings memory main-heap-size '1G'
set vpp settings unix poll-sleep-usec '1000'
After the commit, the following appears on the OOB console:
# commit
[ 275.953481] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0
[ 275.958710] Mem abort info:
[ 275.960411] ESR = 0x0000000096000005
[ 275.962605] EC = 0x25: DABT (current EL), IL = 32 bits
[ 275.965599] SET = 0, FnV = 0
[ 275.967334] EA = 0, S1PTW = 0
[ 275.969178] FSC = 0x05: level 1 translation fault
[ 275.971965] Data abort info:
[ 275.973574] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 275.976734] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 275.979828] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 275.983317] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001bca64000
[ 275.987068] [00000000000000a0] pgd=08000001bd017003, p4d=08000001bd017003, pud=0000000000000000
[ 275.992330] Internal error: Oops: 0000000096000005 [#1] SMP
[ 275.995537] Modules linked in: vhost_net vhost vhost_iotlb tap tun xfrm_user xfrm_algo uio_pci_generic uio_hv_generic uio vfio_pci vfio_pci_core irqbypass vfio_iommu_type1 vfio xt_tcpudp nft_compat nf_nat_tftp nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_h323 nf_conntrack_h323 nf_nat_ftp nf_conntrack_ftp nft_ct nft_chain_nat nf_nat nf_tables nfnetlink_cthelper nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nfnetlink af_packet binfmt_misc aes_ce_blk aes_ce_cipher crct10dif_ce polyval_ce polyval_generic ghash_ce sha3_ce sha3_generic sha512_ce sha512_arm64 sha2_ce sha256_arm64 sha1_ce hv_balloon hv_utils evdev sg tcp_bbr sch_fq_codel mpls_iptunnel mpls_router ip_tunnel br_netfilter bridge stp llc fuse efi_pstore configfs ip_tables x_tables autofs4 usb_storage ohci_hcd uhci_hcd ehci_hcd squashfs lz4_decompress loop overlay ext4 crc16 mbcache jbd2 nls_cp437 vfat fat efivarfs nls_ascii mlx5_ib ib_uverbs ib_core mlx5_core mlxfw pci_hyperv pci_hyperv_intf sd_mod hv_storvsc scsi_transport_fc
[ 275.995597] scsi_mod scsi_common hv_netvsc hyperv_keyboard hv_vmbus
[ 276.052259] CPU: 3 PID: 4156 Comm: vpp_wk_0 Not tainted 6.6.117-vyos #1
[ 276.056199] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 06/10/2025
[ 276.062645] pstate: 624010c5 (nZCv daIF +PAN -UAO +TCO -DIT +SSBS BTYPE=--)
[ 276.066717] pc : hv_uio_channel_cb+0x10/0x28 [uio_hv_generic]
[ 276.070120] lr : vmbus_isr+0x26c/0x30c [hv_vmbus]
[ 276.072808] sp : ffff80008001bf00
[ 276.074767] x29: ffff80008001bf00 x28: ffff0001b189c100 x27: ffffd6b0125ae20c
[ 276.078927] x26: ffff00018535dea0 x25: ffff00018535dc00 x24: ffff295262144000
[ 276.083068] x23: ffff0002befc84a8 x22: ffffd6b05ce844a8 x21: 0000000000000001
[ 276.087298] x20: ffff000186f0b200 x19: 000000000000000c x18: 0000000000000000
[ 276.091509] x17: ffff295262144000 x16: ffffd6b05c8344d0 x15: 0000000000000000
[ 276.095632] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 276.099745] x11: 0000000000000040 x10: ffff00010000a468 x9 : ffff00010000a460
[ 276.103961] x8 : ffff000100400948 x7 : 0000000000000000 x6 : 0000000000000000
[ 276.108157] x5 : ffff000100400920 x4 : 000000000000000c x3 : 0000000000000000
[ 276.112264] x2 : 0000000000001000 x1 : 0000000000000001 x0 : 0000000000000000
[ 276.116602] Call trace:
[ 276.118066] hv_uio_channel_cb+0x10/0x28 [uio_hv_generic]
[ 276.121167] vmbus_isr+0x26c/0x30c [hv_vmbus]
[ 276.123735] vmbus_percpu_isr+0x10/0x20 [hv_vmbus]
[ 276.126567] handle_percpu_devid_irq+0xa4/0x234
[ 276.129228] handle_irq_desc+0x40/0x58
[ 276.131463] generic_handle_domain_irq+0x1c/0x28
[ 276.134543] gic_handle_irq+0x50/0x134
[ 276.136796] call_on_irq_stack+0x30/0x48
[ 276.139085] do_interrupt_handler+0xd4/0xd8
[ 276.141542] el0_interrupt+0x58/0x17c
[ 276.143758] __el0_irq_handler_common+0x18/0x24
[ 276.146333] el0t_64_irq_handler+0x10/0x1c
[ 276.148742] el0t_64_irq+0x190/0x194
[ 276.150730] Code: d503233f a9bf7bfd 910003fd f9400800 (f9405000)
[ 276.154236] ---[ end trace 0000000000000000 ]---
[ 276.196421] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[ 276.200695] SMP: stopping secondary CPUs
[ 276.203019] Kernel Offset: 0x56afdbcf0000 from 0xffff800080000000
[ 276.206551] PHYS_OFFSET: 0x0
[ 276.208221] CPU features: 0x1,001c0001,f0024443,1501faab
[ 276.211446] Memory Limit: none
[ 276.290498] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
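For anyone triaging the trace, the ESR value in the "Mem abort info" block can be checked by hand. The following is a minimal sketch, not part of the original report, that splits the value into the fields the kernel prints. The function name decode_esr and the lookup table are illustrative. The bit positions are the standard ESR_ELx layout for data aborts, and only the four basic fault classes are handled.

```python
# Sketch: decode the AArch64 ESR value printed in the oops above.
# decode_esr and FAULT_KINDS are illustrative names, not kernel APIs.

# Upper four bits of the 6-bit fault status code; the low two bits are the
# page-table level. Deliberately partial: other FSC encodings are not covered.
FAULT_KINDS = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR_ELx value for a data abort into its main fields."""
    ec = (esr >> 26) & 0x3F      # exception class, bits [31:26]
    il = (esr >> 25) & 0x1       # instruction length flag, bit [25]
    iss = esr & 0x1FFFFFF        # instruction specific syndrome, bits [24:0]
    wnr = (iss >> 6) & 0x1       # write-not-read, bit [6] of the ISS
    fsc = iss & 0x3F             # fault status code, bits [5:0] of the ISS
    kind = FAULT_KINDS.get(fsc >> 2)
    level = fsc & 0x3
    fault = f"level {level} {kind}" if kind else f"unknown FSC {fsc:#04x}"
    return {
        "ec": ec,
        "il_bits": 32 if il else 16,
        "iss": iss,
        "wnr": wnr,
        "fsc": fsc,
        "fault": fault,
    }


if __name__ == "__main__":
    fields = decode_esr(0x96000005)  # value from the "ESR =" line of the log
    print(f"EC = {fields['ec']:#x}, IL = {fields['il_bits']} bits")
    print(f"ISS = {fields['iss']:#010x}, WnR = {fields['wnr']}")
    print(f"FSC = {fields['fsc']:#04x}: {fields['fault']}")
```

For 0x96000005 this yields EC 0x25, a 32-bit instruction, WnR 0 (a read), and FSC 0x05 (level 1 translation fault), which agrees with the decoded lines in the log and is consistent with a read through a NULL base pointer at offset 0xa0.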
Note
Until this PR is merged, access to the OOB console requires patching the console device template first:
sudo sed -i '/^syntax:expression:/ s/ttyS\[0-9\]+|/ttyS[0-9]+|ttyAMA[0-9]+|/' /opt/vyatta/share/vyatta-cfg/templates/system/console/device/node.def
Then apply this configuration:
set system console device ttyAMA0