
Keepalived memory utilisation issue when constantly getting its state in JSON format
Closed, Resolved | Public | BUG



We monitor keepalived/VRRP by retrieving its state in JSON format every 10 seconds. This is done by sending a specific signal to the keepalived process, as described in the keepalived man pages.
Since we started doing this, we have noticed keepalived's memory utilisation growing constantly until it consumes all available memory and causes the firewall to crash. The keepalived version that ships with VyOS is 2.0.10. We upgraded it to 2.1.5, which is backported to Debian buster, and the issue disappeared.
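
A minimal sketch of this kind of polling, for anyone trying to reproduce the issue. It is not the reporter's actual monitoring code. It assumes keepalived was built with JSON support, that `keepalived --signum=JSON` prints the signal number, that the PID file is /run/keepalived.pid, and that the dump is written to /tmp/keepalived.json. The function names, the JSON layout (a list of objects with a "data" member holding "iname" and "state"), and the state-number mapping are also assumptions and should be checked against the installed version.

```python
import json
import os
import subprocess
import time

# Assumed default locations; adjust for the local system.
PID_FILE = "/run/keepalived.pid"
JSON_FILE = "/tmp/keepalived.json"

# Assumed mapping of keepalived's numeric VRRP states.
STATE_NAMES = {0: "INIT", 1: "BACKUP", 2: "MASTER", 3: "FAULT"}


def json_signal_number():
    """Ask keepalived which signal number triggers the JSON dump."""
    out = subprocess.run(
        ["keepalived", "--signum=JSON"],
        check=True, capture_output=True, text=True,
    )
    return int(out.stdout.strip())


def request_json_dump(signum):
    """Send the JSON signal to the parent keepalived process."""
    with open(PID_FILE) as f:
        pid = int(f.read().strip())
    os.kill(pid, signum)


def summarize(json_text):
    """Map each VRRP instance name to a readable state name."""
    result = {}
    for entry in json.loads(json_text):
        data = entry.get("data", {})
        state = data.get("state")
        result[data.get("iname", "?")] = STATE_NAMES.get(state, str(state))
    return result


def poll_forever(interval=10):
    """Request a dump, read it back, print the summary, then wait."""
    signum = json_signal_number()
    while True:
        request_json_dump(signum)
        time.sleep(1)  # give keepalived a moment to write the file
        with open(JSON_FILE) as f:
            print(summarize(f.read()))
        time.sleep(interval)
```

Running `poll_forever()` as root on an affected 2.0.10 system should generate the same signal traffic as the monitoring described above, one dump roughly every 10 seconds.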

● keepalived.service - Keepalive Daemon (LVS and VRRP)
   Loaded: loaded (/lib/systemd/system/keepalived.service; disabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/keepalived.service.d
   Active: active (running) since Wed 2021-05-12 05:57:56 UTC; 5h 17min ago
 Main PID: 15486 (keepalived)
    Tasks: 5 (limit: 546)
   Memory: 134.5M
   CGroup: /system.slice/keepalived.service
           ├─15486 /usr/sbin/keepalived --dont-fork --snmp
           ├─15499 /usr/sbin/keepalived --dont-fork --snmp
           └─15566 python3 /usr/libexec/vyos/system/ /run/keepalived_notify_fifo
              total        used        free      shared  buff/cache   available
Mem:          484Mi       367Mi        15Mi       8.0Mi       101Mi        92Mi
Swap:            0B          0B          0B
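
The output above is a single snapshot. To confirm steady growth, the resident set size of the keepalived processes can be sampled over time. The following is a rough sketch under the assumption of a Linux /proc filesystem; the helper names are hypothetical, and the growth rate is a simple first-to-last estimate rather than a proper fit.

```python
import re


def parse_vmrss_kib(status_text):
    """Return VmRSS in KiB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kib(pid):
    """Read the current resident set size of a process in KiB."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kib(f.read())


def growth_kib_per_hour(samples):
    """Estimate growth from (seconds, rss_kib) samples, first to last."""
    (t0, r0), (t1, r1) = samples[0], samples[-1]
    if t1 == t0:
        return 0.0
    return (r1 - r0) * 3600 / (t1 - t0)
```

Calling `read_rss_kib()` for each keepalived PID at a fixed interval and passing the collected samples to `growth_kib_per_hour()` gives a quick indication of whether memory use is flat or climbing.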


I built 2.0.10 with --mem-check to get the memory logs, but no issue was reported.

---[ Keepalived memory dump for (VRRP Child process) ]---

---[ Keepalived memory dump summary for (VRRP Child process) ]---
Total number of bytes not freed...: 0
Number of entries not freed.......: 0
Maximum allocated entries.........: 271
Maximum memory allocated..........: 29217
Number of mallocs.................: 2348
Number of reallocs................: 380
Number of bad entries.............: 0
Number of buffer overrun..........: 0
Number of 0 size allocations......: 0

=> Program seems to be memory allocation safe...


Difficulty level
Unknown (require assessment)
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Package upgrade

Event Timeline

@krox2 The next rolling release will ship with keepalived 2.1.5. Can you check?

Viacheslav changed the task status from Open to Needs testing. May 13 2021, 7:32 AM

@Viacheslav We have been running the new rolling release in the lab since 24th May with no issues. Thanks for the help.

c-po moved this task from Need Triage to Finished on the VyOS 1.3 Equuleus board.
c-po moved this task from Backlog to Finished on the VyOS 1.4 Sagitta board.