Summary
When the FRRouting (FRR) service on VyOS terminates abnormally (e.g., through an out-of-memory (OOM) kill), systemd automatically restarts the service. During this restart, the FRR configuration file /run/frr/config/frr.conf is overwritten. If the service was running with an active configuration, that routing configuration is lost completely.
This issue is critical for production systems, where unexpected service restarts should not cause routing outages.
Affected Component
- FRRouting (FRR) Service Management
- File: /etc/systemd/system/frr.service.d/override.conf
- Relevant Code: Link to Code
Steps to Reproduce
- Configure VyOS with the following example:
    set interfaces ethernet eth0 address '192.0.2.1/24'
    set protocols bgp address-family ipv4-unicast network 192.0.2.0/24
    set protocols bgp system-as '65001'
    commit
- Confirm that /run/frr/config/frr.conf contains:
    frr version 10.2.2
    frr defaults traditional
    hostname vyos
    service integrated-vtysh-config
    !
    router bgp 65001
     no bgp ebgp-requires-policy
     no bgp default ipv4-unicast
     no bgp network import-check
     !
     address-family ipv4 unicast
      network 192.0.2.0/24
     exit-address-family
    exit
    !
- Use the following script to simulate an OOM event:
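    #!/bin/bash
    # Run as root. Mark bgpd as the preferred OOM-killer victim,
    # then invoke the OOM killer manually through sysrq.
    BGPDPID=$(pidof -s bgpd)
    echo 1000 > "/proc/${BGPDPID}/oom_score_adj"
    echo f > /proc/sysrq-trigger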
- After the system recovers, check /run/frr/config/frr.conf and /etc/frr/frr.conf.
- Observe that /etc/frr/frr.conf contains only the default minimal configuration:
    log syslog
    log facility local7
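The final check can also be scripted instead of inspected by eye. The following is a minimal sketch, not part of VyOS: the function name `frr_conf_is_default` is hypothetical, and it assumes that "default minimal configuration" means exactly the two log lines shown above, ignoring blank lines and `!` separators.

```shell
#!/bin/bash
# Hypothetical helper (not part of VyOS or FRR): succeed when the given file
# holds nothing but the default minimal configuration quoted in this report.
frr_conf_is_default() {
    local conf="$1"
    local body

    # Drop blank lines and '!' separator lines before comparing.
    body=$(grep -Ev '^[[:space:]]*(!.*)?$' "$conf" || true)

    [ "$body" = "$(printf 'log syslog\nlog facility local7')" ]
}
```

Calling `frr_conf_is_default /etc/frr/frr.conf` after the simulated OOM event would then return success if the routing configuration was lost.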
Expected Behavior
Upon service restart, if /run/frr/config/frr.conf already exists, it should be preserved.
Actual Behavior
FRR service restart logic indiscriminately overwrites /run/frr/config/frr.conf, leading to configuration loss.
Proposed Solution
Add a check in the service override (/etc/systemd/system/frr.service.d/override.conf) that tests whether /run/frr/config/frr.conf already exists. If it does, do not recreate or overwrite it.
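A minimal sketch of such a guard is shown below. It is an illustration under stated assumptions, not the actual VyOS override: the function name `ensure_frr_conf` and its optional path argument are hypothetical, and the default content is taken from the minimal configuration observed in this report.

```shell
#!/bin/bash
# Hypothetical guard (names are assumptions, not the actual VyOS code):
# seed the default FRR configuration only when no config file exists yet.
ensure_frr_conf() {
    local conf="${1:-/run/frr/config/frr.conf}"

    mkdir -p "$(dirname "$conf")"

    # An existing file may hold the active routing configuration: keep it.
    if [ -e "$conf" ]; then
        return 0
    fi

    # First start: write the default minimal configuration.
    printf 'log syslog\nlog facility local7\n' > "$conf"
}
```

With logic of this shape, a restart after an OOM kill would find the existing file and leave it untouched, while the first start after boot would still create the defaults. Whether an empty or partially written file should also be treated as "existing" is a design choice this sketch leaves open.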