
atop logs are not limited in size
Closed, Resolved | Public | BUG

Description

Atop stores accounting logs in two places: /run/atop/ and /var/log/atop/. By default, data from the first location is moved to the second once a day. Data in /var/log/atop/ is also rotated so that only the last 28 days are kept (the default value, which we did not change).
The problem is that in rare cases the files can grow large enough to take more space than expected, and we have no option to set a maximum size for either location where atop stores data.

To avoid potential problems on low-memory and low-storage deployments, we need to allow controlling the maximum size of both current and historical log files. It seems that two additional logrotate rules may help here.

Details

Difficulty level
Normal (likely a few hours)
Version
1.4-rolling-202108240117, 1.3-beta-202108230342
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Behavior change
Issue type
Bug (incorrect behavior)

Related Objects

Status    Subtype          Assigned  Task
Resolved  BUG              zsdc
Resolved  FEATURE REQUEST  zsdc

Event Timeline

How about a CLI command such as: set system syslog atop file 5
That would mean keeping the latest 5 files.

Also, I can't find anything in /run/atop/:

vyos@r1-roll# ls -la /run/atop
ls: cannot access '/run/atop': No such file or directory
[edit]
vyos@r1-roll#
syncer added a subscriber: syncer.

Let's do it, but not in the syslog section.

syncer triaged this task as Normal priority. Oct 17 2021, 1:41 PM

@zsdc Please provide options to solve it.

zsdc changed the task status from Open to Confirmed. Oct 29 2021, 11:09 AM

After some investigation, we found several ways to solve or at least mitigate the problem. From my point of view, the following one is optimal for both developers and customers.

Depending on the package version, atop uses a crontab entry or a systemd timer to rotate logs and restart the service (it must be restarted for rotation).
Logrotate, in turn, uses its own logic, which we and customers can control more precisely. The problem is that these two rotation mechanisms are almost incompatible, so combining them leads to multiple problems and unreasonably complex logic.

Therefore, I think we should disable all the services, cron items, and timers that come with the atop package (because they are all tightly connected), and then do the following:

  1. Create our own atop systemd service or override the settings in the default one. This lets us control the exec options as we want.
  2. Add CLI settings that control the number of log files to keep, the maximum size of each file, and the maximum number of days to keep log files.
  3. Based on these CLI options, generate a logrotate config file that monitors the atop log file and rotates it when necessary. Restarting the service during rotation will also be done by logrotate (currently this is done by the atop scripts).
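
Steps 2 and 3 above could be sketched roughly as follows. This is only an illustration, not the code from any actual fix: the function name render_atop_logrotate, its parameters, the default values, the log path pattern, and the atop.service unit name are all assumptions. The logrotate directives themselves (daily, maxsize, rotate, maxage, missingok, nocompress, nocreate, postrotate/endscript) are standard ones.

```python
from typing import Optional


def render_atop_logrotate(
    log_glob: str = "/var/log/atop/*.log",  # assumed path pattern
    rotate_count: int = 10,                 # hypothetical default
    max_size: str = "10M",                  # hypothetical default
    max_age_days: Optional[int] = None,     # optional "days to keep" limit
    service: str = "atop.service",          # assumed unit name
) -> str:
    """Render a logrotate stanza for atop logs from CLI-like options."""
    if rotate_count < 0:
        raise ValueError("rotate_count must not be negative")

    lines = [
        f"{log_glob} {{",
        "    daily",
        # Rotate as soon as the file exceeds this size on any logrotate run
        f"    maxsize {max_size}",
        # Number of rotated files to keep
        f"    rotate {rotate_count}",
    ]
    if max_age_days is not None:
        # Remove rotated files older than this many days
        lines.append(f"    maxage {max_age_days}")
    lines += [
        "    missingok",
        "    nocompress",
        # Do not create a new file; atop is expected to do it on restart
        "    nocreate",
        "    postrotate",
        f"        systemctl restart {service}",
        "    endscript",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_atop_logrotate(rotate_count=5, max_size="20M", max_age_days=28))
```

With maxsize, logrotate rotates a file whenever it is found to exceed the given size, even if the daily interval has not passed yet, so the limit is only checked as often as logrotate itself runs.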

With this, we can set a default size and make it possible to change it if necessary.

Nuances that we need to know:

  • The logrotate timer should run more often (currently it runs daily); otherwise, log files may grow too fast and take too much space before rotation.
  • The size of an individual log (and the total) can exceed the configured value, because logrotate triggers rotation only after the size has passed it. Unfortunately, there is no way to change this, regardless of rotation strategy or tools.
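
To make the second nuance concrete, here is a small back-of-the-envelope sketch. It assumes a constant log growth rate, which is a simplification, and all the numbers in the example are hypothetical. In the worst case a file is just under the limit at one logrotate run and keeps growing for a full interval until the next run.

```python
def worst_case_file_size(max_size: int, growth_per_sec: int, interval_sec: int) -> int:
    """Upper bound (bytes) for one log file between two logrotate runs.

    Assumes the file was just under max_size at the previous check and
    grew at a constant rate for one whole interval.
    """
    return max_size + growth_per_sec * interval_sec


def worst_case_total_size(
    max_size: int, growth_per_sec: int, interval_sec: int, rotate_count: int
) -> int:
    """Upper bound (bytes) for rotate_count kept files plus the active one."""
    per_file = worst_case_file_size(max_size, growth_per_sec, interval_sec)
    return (rotate_count + 1) * per_file


if __name__ == "__main__":
    limit = 10 * 1024 * 1024  # hypothetical 10 MiB limit
    rate = 1024               # hypothetical growth of 1 KiB per second
    for name, interval in (("hourly", 3600), ("daily", 86400)):
        size = worst_case_file_size(limit, rate, interval)
        print(f"{name}: up to {size} bytes per file")
```

Under these assumptions, the overshoot is proportional to the interval between logrotate runs, which is why running the timer more often keeps the real size closer to the configured one.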
zsdc changed the task status from Confirmed to In progress. Nov 9 2021, 5:02 PM

Hardcoded version of the fix for 1.4:
https://github.com/vyos/vyos-1x/pull/1068
https://github.com/vyos/vyos-build/pull/201

Adding the CLI after this is a trivial task. We can start with two entries: file size and file count.

Viacheslav changed the task status from In progress to Needs testing. Jan 9 2022, 4:39 PM
dmbaturin added a subscriber: dmbaturin.

A backport to 1.3 is not worth the trouble since the issue is low-impact.