
FRRouting Configuration Loss on Abnormal Service Restart
Closed, Resolved | Public | BUG

Description

Summary

When the FRRouting (FRR) service on VyOS terminates abnormally, for example through an out-of-memory (OOM) kill, systemd automatically restarts it. During this restart, the FRR configuration file /run/frr/config/frr.conf is overwritten with the default minimal configuration. If the service was already running with an active configuration, the entire routing configuration is lost.

This issue is critical for production systems, where unexpected service restarts should not cause routing outages.

Affected Component

  • FRRouting (FRR) Service Management
  • File: /etc/systemd/system/frr.service.d/override.conf
  • Relevant Code: Link to Code

Steps to Reproduce

  1. Configure VyOS with the following example:
set interfaces ethernet eth0 address '192.0.2.1/24'
set protocols bgp address-family ipv4-unicast network 192.0.2.0/24
set protocols bgp system-as '65001'
commit
  2. Confirm that /run/frr/config/frr.conf contains:
frr version 10.2.2
frr defaults traditional
hostname vyos
service integrated-vtysh-config
!
router bgp 65001
 no bgp ebgp-requires-policy
 no bgp default ipv4-unicast
 no bgp network import-check
 !
 address-family ipv4 unicast
  network 192.0.2.0/24
 exit-address-family
exit
!
  3. Use the following script to simulate an OOM event:
#!/bin/bash
# Run as root. Make bgpd the preferred victim of the OOM killer.
BGPDPID=$(pidof bgpd)
echo 1000 > /proc/$BGPDPID/oom_score_adj
# SysRq "f" invokes the OOM killer once.
echo f > /proc/sysrq-trigger
  4. After the system recovers, check /run/frr/config/frr.conf and /etc/frr/frr.conf.
  5. Observe that /etc/frr/frr.conf contains only the default minimal configuration:
log syslog
log facility local7
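
To make the final check less dependent on reading the file by eye, the comparison could be scripted. The following is only a sketch: the function name frr_conf_has_bgp is hypothetical and not part of VyOS or FRR, and it assumes that a line starting with "router bgp" is a sufficient sign that the BGP configuration survived.

```shell
#!/bin/bash
# Hypothetical helper (not part of VyOS or FRR): reports whether an FRR
# configuration file still contains a BGP section.
# Usage: frr_conf_has_bgp <path-to-frr.conf>
frr_conf_has_bgp() {
    local conf="$1"
    # A missing file counts as "no BGP configuration".
    [ -f "$conf" ] || return 1
    # Assumption: a line beginning with "router bgp <asn>" marks the section.
    grep -Eq '^router bgp [0-9]+' "$conf"
}

# Example on a VyOS system, after the simulated OOM event:
#   frr_conf_has_bgp /run/frr/config/frr.conf \
#       && echo "BGP config present" || echo "BGP config lost"
```

Run against the file from step 2 the function should succeed; run against the minimal two-line file shown above it should fail.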

Expected Behavior

Upon service restart, if /run/frr/config/frr.conf already exists, it should be preserved.

Actual Behavior

FRR service restart logic indiscriminately overwrites /run/frr/config/frr.conf, leading to configuration loss.

Proposed Solution

Add a check in the service override to verify if /run/frr/config/frr.conf already exists. If it exists, do not recreate or overwrite it.
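
The proposed check can be sketched as a standalone shell function, which makes it possible to try the logic against a temporary directory. This is an illustration under stated assumptions, not the actual service override: the function name ensure_frr_conf is hypothetical, the path is passed as a parameter instead of being hard-coded, and ownership, permissions, and the bind mount to /etc/frr/frr.conf are deliberately left out.

```shell
#!/bin/bash
# Hypothetical helper: create the minimal FRR configuration only when no
# configuration file exists yet.
# $1 is the config path (on VyOS this would be /run/frr/config/frr.conf).
ensure_frr_conf() {
    local conf="$1"
    # An existing file is left untouched, so a restart keeps the running config.
    if [ -f "$conf" ]; then
        return 0
    fi
    mkdir -p "$(dirname "$conf")" || return 1
    # The two default lines this ticket reports as the minimal configuration.
    printf 'log syslog\nlog facility local7\n' > "$conf"
}

# Example: ensure_frr_conf /tmp/frr-test/config/frr.conf
```

Calling the function twice in a row should be harmless: the first call writes the defaults, and the second returns without touching the file.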

Details

Version: 1.4.2, circinus, 2025.04.28-0020-rolling
Is it a breaking change? Perfectly compatible
Issue type: Bug (incorrect behavior)

Event Timeline

zsdc changed the task status from Open to Confirmed.
zsdc triaged this task as Normal priority.

Adding a simple existence check to the ExecStartPre command in the service override seems to do the job:

[Unit]
After=vyos-router.service

[Service]
LimitNOFILE=4096
ExecStartPre=/bin/bash -c 'if [ ! -f /run/frr/config/frr.conf ]; then \
             mkdir -p /run/frr/config; \
             echo "log syslog" > /run/frr/config/frr.conf; \
             echo "log facility local7" >> /run/frr/config/frr.conf; \
             chown frr:frr /run/frr/config/frr.conf; \
             chmod 664 /run/frr/config/frr.conf; \
             mount --bind /run/frr/config/frr.conf /etc/frr/frr.conf; \
fi;'
dmbaturin moved this task from Backlog to Finished on the VyOS 1.4 Sagitta (1.4.3) board.
dmbaturin moved this task from Open to Finished on the VyOS 1.5 Circinus (1.5-stream-2025-Q2) board.
dmbaturin moved this task from Need Triage to Completed on the VyOS Rolling board.