
VPP CGNAT commit failure and runtime errors with DPDK (driver ice) on custom build
Closed, Resolved | Public | BUG

Description


Environment

VyOS: custom build with VPP enabled

Kernel boot params:

nosmt mitigations=off isolcpus=0-23 nohz_full=0-23 rcu_nocbs=0-23 rcu_nocb_poll audit=0 \
intel_idle.max_cstate=0 processor.max_cstate=0 intel_pstate=disable \
intel_iommu=on iommu=pt modules-load=vfio,vfio_iommu_type1,vfio_pci

NIC: Intel (driver ice)

RAM: 64GB (hugepages configured)

Configuration

interfaces {
    ethernet eth0 {
        address 172.29.40.30/30
        hw-id 80:30:e0:3b:61:18
        vrf mgmt
    }
    ethernet eth4 {
        hw-id b4:96:91:b3:ae:96
        mtu 1500
    }
    ethernet eth5 {
        hw-id b4:96:91:b3:ae:97
        mtu 1500
    }
    loopback lo { }
}

protocols {
    static {
        route 0.0.0.0/0 {
            next-hop 206.0.9.21 { }
        }
        route 100.99.0.0/24 {
            next-hop 172.29.49.1 { }
        }
    }
}

vrf {
    name mgmt {
        protocols {
            static {
                route 0.0.0.0/0 {
                    next-hop 172.29.40.29 { }
                }
            }
        }
        table 100
    }
}

system {
    host-name CGNAT
    option { reboot-on-upgrade-failure 5 }
    sysctl {
        parameter net.core.rmem_default { value 134217728 }
        parameter net.core.rmem_max     { value 536870912 }
        parameter net.core.wmem_default { value 134217728 }
        parameter net.core.wmem_max     { value 536870912 }
    }
}

vpp {
    interfaces {
        bonding bond0 {
            kernel-interface vpptun10
            member {
                interface eth4
                interface eth5
            }
            mode 802.3ad
        }
    }
    kernel-interfaces {
        vpptun10 {
            vif 144 { address 172.29.49.2/30 }
            vif 145 { address 206.0.9.22/30 }
        }
    }
    nat {
        cgnat {
            interface {
                inside  vpptun10.144
                outside vpptun10.145
            }
            rule 10 {
                inside-prefix  100.99.0.0/24
                outside-prefix 206.0.15.248/29
            }
        }
    }
    settings {
        cpu {
            corelist-workers 2-20
            main-core 1
        }
        interface eth4 { driver dpdk }
        interface eth5 { driver dpdk }
    }
}
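
As a rough sanity check on rule 10, the ratio of inside to outside addresses bounds how many ports each subscriber can get. The sketch below assumes a deterministic mapping in which every outside address offers ports 1024 to 65535 and those ports are split evenly among the inside addresses; the port range, the even split, and the helper name ports_per_inside_host are assumptions for illustration, not something confirmed in this task.

```python
import ipaddress

# Assumed usable port range per outside address (1024-65535).
# This is an assumption for illustration, not a value taken from the task.
USABLE_PORTS = 65535 - 1024 + 1


def ports_per_inside_host(inside_prefix, outside_prefix):
    """Return the number of ports available to each inside address,
    assuming all outside ports are divided evenly (hypothetical helper)."""
    inside = ipaddress.ip_network(inside_prefix)
    outside = ipaddress.ip_network(outside_prefix)
    total_ports = outside.num_addresses * USABLE_PORTS
    return total_ports // inside.num_addresses


# Prefixes taken from rule 10 in the configuration above.
print(ports_per_inside_host("100.99.0.0/24", "206.0.15.248/29"))
```

Under these assumptions the /24 to /29 mapping is a 32:1 ratio, which leaves each inside address roughly two thousand ports.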

Steps to reproduce

Boot system with the kernel parameters above.

Apply the configuration shown.

Run commit.

Expected behavior

Commit should succeed.

VPP should start with CGNAT enabled.

Interfaces should initialize with MTU and bonding as configured.

Actual behavior

Commit hangs or crashes with repeated CRITICAL:VyOS StdErr spam in vyos-configd.

VPP logs show:

set interface mtu: unknown input

rte_eth_dev_set_mtu failed (rv -16)

Secondary MAC Addresses not supported for interface index 0

clib_c11_violation: s1 NULL / s2 NULL
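
The rv -16 in the rte_eth_dev_set_mtu line looks like a negated errno value, which is how DPDK ethdev calls usually report failures. Assuming that convention applies to this log line, a minimal sketch to decode it (the helper name decode_rv is hypothetical):

```python
import errno
import os


def decode_rv(rv):
    """Map a negative return value to (errno name, description).

    Assumes rv is a negated errno value; that is an assumption about
    this log line, not something the task confirms.
    """
    code = -rv
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


name, text = decode_rv(-16)
print(name, "-", text)
```

On Linux, errno 16 is EBUSY. That would be consistent with an MTU change being attempted while the port is running, but this reading has not been verified against the failing system.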

Additional info

Logs:

journalctl -u vyos-configd -n 200
journalctl -u vpp -n 200

show repeated crashes on commit.

Problem persists across reboots.

vyos@vyos:~$ show ver
Version:          VyOS 1.5-rolling-202508230455
Release train:    current
Release flavor:   generic

Built by:         [email protected]
Built on:         Sat 23 Aug 2025 04:55 UTC
Build UUID:       2f9818cc-6532-4aab-8e79-38bc0810b838
Build commit ID:  9d533529aacd33

Architecture:     x86_64
Boot via:         installed image
System type:      bare metal
Secure Boot:      disabled

Hardware vendor:  HPE
Hardware model:   ProLiant DL360 Gen10
Hardware S/N:     xxxx
Hardware UUID:    xxxx-xxxx-xx-xxxx-xxxx

Details

Version
VyOS 1.5-rolling-202508230455
Is it a breaking change?
Perfectly compatible
Issue type
Bug (incorrect behavior)

Event Timeline

c-po updated the task description.
interface {
    inside  vpptun10.144
    outside vpptun10.145
}

This needs to be checked, but VPP probably sees those interfaces as BondEthernet or tap interfaces.

set vpp interfaces bonding bond0 kernel-interface 'vpptun10'
set vpp kernel-interfaces vpptun10 vif 145 address '192.0.2.5/24'
set vpp settings interface eth0 driver 'dpdk'
set vpp settings interface eth1 dpdk-options promisc
set vpp settings interface eth1 driver 'dpdk'
set vpp settings memory default-hugepage-size '2M'
set vpp settings memory main-heap-page-size 'default-hugepage'
set vpp settings memory main-heap-size '4G'
set vpp settings statseg page-size '2M'
set vpp settings statseg size '1G'
set vpp settings unix poll-sleep-usec '222'

interfaces

vyos@r14# run show vpp interfaces 
Kernel        Dataplane          Type    IP Address         MAC                  MTU  State
------------  -----------------  ------  -----------------  -----------------  -----  -------
              BondEthernet0      bond                       02:fe:b2:6a:1a:45   9000  up
              BondEthernet0.145  bond                       00:00:00:00:00:00   9000  up
...
vpptun10      tap4098            virtio                     02:fe:1b:56:41:6b   9000  up
vpptun10.145  tap4098.145        virtio                     00:00:00:00:00:00      0  up

VPP also does not see 192.0.2.5 among its interface addresses or in the FIB:

vpp# show interface addr
BondEthernet0 (up):
BondEthernet0.145 (up):
eth0 (up):
  L3 192.168.122.14/24
eth1 (up):
  L3 192.0.2.1/30
eth1.11 (up):
  L3 10.0.11.1/30
eth1.12 (up):
  L3 10.0.12.1/30
eth1.13 (up):
  L3 10.0.13.1/30
eth1.14 (up):
  L3 10.0.14.1/30
eth1.15 (up):
  L3 10.0.15.1/30
eth1.16 (up):
  L3 10.0.16.1/30
eth1.17 (up):
  L3 10.0.17.1/30
eth1.18 (up):
  L3 10.0.18.1/30
eth1.19 (up):
  L3 10.0.19.1/30
eth1.20 (up):
  L3 10.0.20.1/30
eth1.21 (up):
  L3 10.0.21.1/30
eth1.22 (up):
  L3 10.0.22.1/30
eth1.23 (up):
  L3 10.0.23.1/30
eth1.24 (up):
  L3 10.0.24.1/30
eth1.25 (up):
  L3 10.0.25.1/30
eth1.26 (up):
  L3 10.0.26.1/30
local0 (dn):
tap4096 (up):
tap4097 (up):
tap4097.11 (up):
tap4097.12 (up):
tap4097.13 (up):
tap4097.14 (up):
tap4097.15 (up):
tap4097.16 (up):
tap4097.17 (up):
tap4097.18 (up):
tap4097.19 (up):
tap4097.20 (up):
tap4097.21 (up):
tap4097.22 (up):
tap4097.23 (up):
tap4097.24 (up):
tap4097.25 (up):
tap4097.26 (up):
tap4098 (up):
tap4098.145 (up):
vpp# 
vpp# 
vpp# show ip fib 192.0.2.5
ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto flowlabel ] epoch:0 flags:none locks:[adjacency:1, default-route:1, lcp-rt:1, ]
0.0.0.0/0 fib:0 index:0 locks:3
  lcp-rt-dynamic refs:1 src-flags:added,contributing,active,
    path-list:[192] locks:2 flags:shared, uPRF-list:190 len:1 itfs:[1, ]
      path:[314] pl-index:192 ip4 weight=1 pref=20 attached-nexthop:  oper-flags:resolved,
        192.168.122.1 eth0
      [@0]: ipv4 via 192.168.122.1 eth0: mtu:1500 next:5 flags:[] 52540016840852540077fa360800

  default-route refs:1 entry-flags:drop, src-flags:added,
    path-list:[0] locks:1 flags:drop, uPRF-list:0 len:0 itfs:[]
      path:[0] pl-index:0 ip4 weight=1 pref=0 special:  cfg-flags:drop,
        [@0]: dpo-drop ip4

 forwarding:   unicast-ip4-chain
  [@0]: dpo-load-balance: [proto:ip4 index:1 buckets:1 uRPF:190 to:[0:0]]
    [0] [@5]: ipv4 via 192.168.122.1 eth0: mtu:1500 next:5 flags:[] 52540016840852540077fa360800
ipv4-VRF:1010, fib_index:1, flow hash:[src dst sport dport proto flowlabel ] epoch:0 flags:none locks:[lcp-rt:1, ]
0.0.0.0/0 fib:1 index:7 locks:2
  default-route refs:1 entry-flags:drop, src-flags:added,contributing,active,
    path-list:[17] locks:2 flags:drop, uPRF-list:7 len:0 itfs:[]
      path:[17] pl-index:17 ip4 weight=1 pref=0 special:  cfg-flags:drop,
        [@0]: dpo-drop ip4

 forwarding:   unicast-ip4-chain
  [@0]: dpo-load-balance: [proto:ip4 index:8 buckets:1 uRPF:7 to:[0:0]]
    [0] [@0]: dpo-drop ip4
vpp#

@guerralucasdaniel7 Could you please share the topology map with all interfaces and networks included?

While there might be a bug or missing feature for that specific scenario, the overall configuration is quite confusing, and there’s a high chance it’s conceptually incorrect. The topology map would help clarify and resolve this.

natali-rs1985 changed the task status from Open to In progress. Oct 7 2025, 3:53 PM
natali-rs1985 claimed this task.
natali-rs1985 changed Is it a breaking change? from Behavior change to Perfectly compatible.

Hello,
Sorry for the delay in getting back to you.
I'll rebuild the scenario, design and update the configurations, and post updates shortly, including the network design, hardware, and applications.
I can test with a CGNAT of approximately 15,000 clients, with 2,000 ports allocated per client, to measure performance at approximately 25G/s of aggregate traffic.
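
For planning that test, the outside pool size follows from the client count and the per-client port allocation. The sketch below assumes each outside address offers 64,512 usable ports (1024 to 65535) and that a client's ports all come from a single outside address; both are assumptions, and outside_addresses_needed is a hypothetical helper.

```python
import math

# Assumed usable ports per outside address (1024-65535); not confirmed by the task.
USABLE_PORTS = 64512


def outside_addresses_needed(clients, ports_per_client):
    """Estimate how many outside addresses a CGNAT pool needs,
    assuming each client's ports come from one outside address."""
    clients_per_address = USABLE_PORTS // ports_per_client
    if clients_per_address == 0:
        raise ValueError("ports_per_client exceeds the ports of one address")
    return math.ceil(clients / clients_per_address)


# Figures from the comment above: 15,000 clients, 2,000 ports each.
print(outside_addresses_needed(15000, 2000))
```

With these assumptions, 32 clients fit on one outside address, so 15,000 clients need 469 outside addresses, far more than the /29 used in the original configuration.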

Hi team.
I installed the new version of VyOS.
Could you suggest a configuration scheme that I could test?