
Container network loses VRF on container restart
Closed, Resolved | Public | BUG

Description

Container networks set via set container network <name> lose their VRF association when the last member container is restarted.

run add container image alpine:latest
set vrf name test table 999
set container network test prefix 192.168.0.0/24
set container network test vrf test
set container name test image alpine:latest
set container name test network test
commit
run restart container test

Workaround 1: re-trigger the network configuration (e.g., add and then remove no-name-server) to restore the VRF association.
Workaround 2: don't let the network go empty, e.g., by keeping a container attached to it that only runs sleep.
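
A minimal sketch of workaround 1 in configuration mode, assuming the network is named test as in the steps above. Any option that causes the network configuration to be re-applied should have the same effect; no-name-server is only the example mentioned here, and the option may not be available on every release.

```
configure
# Toggle an option on the network so its configuration is re-applied.
set container network test no-name-server
commit
# Revert the option; the VRF association should remain in place afterwards.
delete container network test no-name-server
commit
exit
```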

Originally observed on 1.5-rolling-202411260813, but it still seems to be an issue in the current rolling release 2025.04.01-0021-rolling.

Details

Version
2025.04.01
Is it a breaking change?
Perfectly compatible
Issue type
Bug (incorrect behavior)

Event Timeline

Viacheslav triaged this task as Normal priority. Apr 1 2025, 9:00 AM
c-po changed the task status from Open to Confirmed. Nov 1 2025, 9:07 AM

Some more background information:

set interfaces ethernet eth0 vif 10 address '172.16.33.201/24'
set interfaces ethernet eth0 vif 10 vrf 'MGMT'
set vrf name MGMT protocols static route 0.0.0.0/0 next-hop 172.16.33.254 interface 'eth0.10'
set vrf name MGMT table '100'

set container name test image 'alpine:latest'
set container name test network test mac '02:02:04:a9:ce:42'
set container network test prefix '192.168.0.0/24'
set container network test vrf 'MGMT'

Check VRF routing

vyos@vyos:~$ show ip route vrf MGMT
Codes: K - kernel route, C - connected, L - local, S - static,
       R - RIP, O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric, t - Table-Direct,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

VRF MGMT:
S>* 0.0.0.0/0 [1/0] via 172.16.33.254, eth0.10, weight 1, 00:28:19
C>* 172.16.33.0/24 is directly connected, eth0.10, weight 1, 00:28:21
L>* 172.16.33.201/32 is directly connected, eth0.10, weight 1, 00:28:21
C>* 192.168.0.0/24 is directly connected, pod-test, weight 1, 00:00:13
L>* 192.168.0.1/32 is directly connected, pod-test, weight 1, 00:00:13  <<-- this is the POD
vyos@vyos:~$ ping 192.168.0.1 vrf MGMT
PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.
64 bytes from 192.168.0.1: icmp_seq=1 ttl=64 time=0.022 ms
64 bytes from 192.168.0.1: icmp_seq=2 ttl=64 time=0.036 ms

The issue occurs during container restart:

Nov 01 10:22:10 kernel: pod-test: port 1(veth0) entered disabled state
Nov 01 10:22:10 systemd[1]: vyos-container-test.service: Failed with result 'exit-code'.
Nov 01 10:22:10 systemd[1]: Stopped VyOS Container test.
Nov 01 10:22:10 systemd[1]: Starting VyOS Container test...
Nov 01 10:22:10 kernel: pod-test: port 1(veth0) entered blocking state
Nov 01 10:22:10 kernel: pod-test: port 1(veth0) entered disabled state
Nov 01 10:22:10 kernel: veth0: entered allmulticast mode
Nov 01 10:22:10 kernel: veth0: entered promiscuous mode
Nov 01 10:22:10 kernel: pod-test: port 1(veth0) entered blocking state
Nov 01 10:22:10 kernel: pod-test: port 1(veth0) entered forwarding state

So the interface is removed from the kernel and re-created.

It's also noted in the comment at https://github.com/vyos/vyos-1x/blob/89e4e09ece8e54303f3d5e1f1ab4927e2351b24a/src/conf_mode/container.py#L676-L691

Networks are started only as soon as there is a consumer. If only a network is created in the first place, no need to assign it to a VRF as there's no consumer, yet.

So if there is only one pod, or if all pods are restarted, the VRF assignment is lost and needs to be re-applied.
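
To check by hand whether the bridge is still bound to its VRF after a restart, something like the following can be used. This is only an illustrative sketch built on plain iproute2, not the logic used in vyos-1x; the helper names link_master and vrf_fix_cmd are made up for this example, and pod-test and MGMT are the names from the configuration above.

```shell
#!/bin/sh
# Extract the "master" device from one line of `ip -o link show` output.
# Prints nothing when the interface has no master.
link_master() {
    printf '%s\n' "$1" | sed -n 's/.* master \([^ ]*\) .*/\1/p'
}

# Print the iproute2 command that would put the bridge back into the VRF,
# or print nothing when it is already a member of that VRF.
vrf_fix_cmd() {
    bridge=$1
    vrf=$2
    link_line=$3
    if [ "$(link_master "$link_line")" != "$vrf" ]; then
        echo "ip link set dev $bridge master $vrf"
    fi
}

# Example (on the router, as root):
#   vrf_fix_cmd pod-test MGMT "$(ip -o link show dev pod-test)"
```

The function only prints the command instead of running it, so the result can be reviewed first. A change made this way bypasses the VyOS configuration system, so it is a diagnostic aid rather than a fix.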

c-po renamed this task from Container network loses VRF on container restart to Container network looses VRF on container restart. Nov 5 2025, 8:46 PM
c-po changed the task status from Confirmed to In progress.
c-po moved this task from Backlog to Finished on the VyOS 1.4 Sagitta (1.4.4) board.
dmbaturin renamed this task from Container network looses VRF on container restart to Container network loses VRF on container restart. Thu, Dec 4, 7:38 PM
dmbaturin changed Is it a breaking change? from Unspecified (possibly destroys the router) to Perfectly compatible.