Summary
A deployment of 400 VyOS routers (4 sites, each a single IS-IS domain of 100 routers) running 2025.03.30.0020-rolling shows unbounded growth in isisd memory usage. After about 30 minutes of full-mesh adjacency, isisd grows past 2 GB and is OOM-killed, which leads to FRR restarts and route flaps.
Affected Versions
VyOS build: 2025.03.30.0020-rolling
Topology & Scale
4 sites, each a single IS-IS domain over one shared L2 segment
100 routers per segment, full-mesh IS-IS adjacency
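For a sense of scale, the adjacency count implied by this topology can be worked out with a short sketch. It assumes every router forms an adjacency with every other router on its segment, which is how the report describes the setup. The function and constant names are illustrative only.

```python
def full_mesh_adjacencies(routers: int) -> int:
    """Distinct router pairs on one segment: n * (n - 1) / 2."""
    return routers * (routers - 1) // 2


ROUTERS_PER_SEGMENT = 100  # from the report
SITES = 4                  # from the report

# Adjacencies each router maintains (one per neighbour on the segment).
per_router = ROUTERS_PER_SEGMENT - 1
# Distinct adjacencies on one shared L2 segment.
per_segment = full_mesh_adjacencies(ROUTERS_PER_SEGMENT)
# Across all four independent IS-IS domains.
total = per_segment * SITES

print(per_router, per_segment, total)  # 99 4950 19800
```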
Analysis
The logs point to a memory leak in the IS-IS daemon’s SPF vertex-adjacency code path. During large-scale LSP processing, tens of millions of “vertex adjacency” entries (about 36.4 million in the qmem output below) are built up in the fragment cache and never freed. As a result, isisd consumes over 2 GB of RAM (evidenced by ~1.6 GB of transparent huge-page allocations) before the OOM killer terminates it.
Key Logs & Metrics
Apr 17 11:45:15 … watchfrr: Thread Starvation – wakeup_send_echo() >4 s late
Apr 17 11:52:27 … kernel: Out of memory: Killed process 1665 (isisd) total-vm:1405008kB, anon-rss:1395804kB
Apr 17 11:52:29 … systemd: frr.service: Failed with result 'oom-kill'
Apr 17 11:52:34 … watchfrr: all daemons up, restarting FRR 10.2.2
# Mem-Info: Node 0 anon_thp=1605632kB for isisd
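To make the kB figures in the kernel line easier to compare, a minimal sketch like the one below can pull them out and convert them to GiB. The regular expression is written only against the line as quoted in this report. `parse_oom_line` and `kb_to_gib` are hypothetical helper names, not part of any FRR or VyOS tooling.

```python
import re

# The kernel OOM line as quoted in this report (host field elided in the original).
OOM_LINE = (
    "kernel: Out of memory: Killed process 1665 (isisd) "
    "total-vm:1405008kB, anon-rss:1395804kB"
)

OOM_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?total-vm:(?P<total_vm>\d+)kB"
    r".*?anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_line(line: str):
    """Return pid, process name and memory figures (in kB), or None if no match."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "name": m["name"],
        "total_vm_kb": int(m["total_vm"]),
        "anon_rss_kb": int(m["anon_rss"]),
    }


def kb_to_gib(kb: int) -> float:
    """Convert kB (1024-byte units, as the kernel reports them) to GiB."""
    return kb / (1024 * 1024)


info = parse_oom_line(OOM_LINE)
print(info["name"], info["pid"], f"{kb_to_gib(info['anon_rss_kb']):.2f} GiB anon-rss")
```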
The suspected reason for this issue is:
--- qmem isisd ---
Type                      : Current#  Size       Total      Max#    MaxBytes
ISIS SPF Vertex Adjacency : 36419888    64  2625814848  36419911  2625817304
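The qmem row can be sanity-checked with a small sketch that parses it and derives the total footprint and the effective bytes per entry. It assumes the five numeric columns appear in the order shown in the header (Current#, Size, Total, Max#, MaxBytes) and that Size is numeric. `parse_qmem_row` is a hypothetical helper name.

```python
import re

# The row exactly as quoted in this report.
QMEM_ROW = "ISIS SPF Vertex Adjacency : 36419888 64 2625814848 36419911 2625817304"

ROW_RE = re.compile(
    r"^(?P<type>.+?)\s*:\s*"
    r"(?P<current>\d+)\s+(?P<size>\d+)\s+(?P<total>\d+)\s+"
    r"(?P<max_count>\d+)\s+(?P<max_bytes>\d+)\s*$"
)


def parse_qmem_row(row: str):
    """Split one qmem row into its type name and integer columns, or return None."""
    m = ROW_RE.match(row)
    if m is None:
        return None
    fields = {k: int(v) for k, v in m.groupdict().items() if k != "type"}
    fields["type"] = m["type"]
    return fields


row = parse_qmem_row(QMEM_ROW)
total_gib = row["total"] / 2**30               # total bytes expressed in GiB
bytes_per_entry = row["total"] / row["current"]  # effective cost per live entry

print(f"{row['type']}: {row['current']} entries, "
      f"{total_gib:.2f} GiB, {bytes_per_entry:.1f} bytes/entry")
```

The live count sits within a few dozen entries of its recorded maximum (36419888 against 36419911), which fits the report's reading that these entries accumulate rather than being released.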