The following configuration takes 430 seconds to boot on my APU4C4 (quad core 1GHz, 4GB ECC RAM) board using VyOS 1.2.4.
Boot time increases to 1000 seconds running VyOS 1.3 rolling.
I hope this configuration helps performance testing.
c-po | |
Mar 1 2020, 2:48 PM |
F538228: br1.svg | |
May 20 2020, 10:16 AM |
F538239: image.png | |
May 20 2020, 10:16 AM |
F537693: bootchart-20200520-0404.svg | |
May 20 2020, 6:13 AM |
F478492: config.boot | |
Mar 1 2020, 2:48 PM |
The following configuration takes 430 seconds to boot on my APU4C4 (quad core 1GHz, 4GB ECC RAM) board using VyOS 1.2.4.
Boot time increases to 1000 seconds running VyOS 1.3 rolling.
I hope this configuration helps performance testing.
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Wontfix | FEATURE REQUEST | dmbaturin | T1230 Improving Boot Time for Large Firewall Configurations | ||
Resolved | BUG | erkin | T2088 Increased boot time from 1.2.4 -> 1.3 rolling by 100% |
One of the offeners is
vyos@vyos# show system config-management commit-archive location tftp://172.16.20.200/CONFIGS/BR1.wueIII
removing this line will reduce the boot time to 700 seocnds (down by 30%) .. weird
0-18 kernel boot
18-41 system starting inc FRR
70-120 the python most of the time is spend/wasted in cli-shell-api - so I would think reading the configuration file. If we can optimise / reduce this number of calls it would be very good.
140-220 is more or less firewall setup with vyatta-firewall / vyatta-upset.pl / ip6tables
rounding numbes:
So 22% in cli-shell-api
So 36% in firewall
Please see attached the full log of the boot process. I try to supply a PNG later on but I guess it will be really huge.
Every call to /bin/cli-shell-api --show-working-only --show-show-defaults --show-ignore-edit showConfig takes multiples seconds (6?)
/usr/libexec/vyos/conf_mode/system-timezone.py call it twice.
/usr/libexec/vyos/conf_mode/nat.py call it twice
/usr/libexec/vyos/conf_mode/interfaces-loopback.py call it twice ... etc.
Looking at the ... seconds it takes to set the hostname ...
The code for /usr/libexec/vyos/conf_mode/host_name.py is spent waiting for vyos-hostd to come back
sudo apt install python3-pip sudo pip3 install -U pip setuptools wheel virtualenv pip3 install git+https://github.com/evanhempel/python-flamegraph.git python3 -m flamegraph -o perf.log /usr/libexec/vyos/conf_mode/host_name.py ./flamegraph.pl perf.log > unix-time.svg
Possibly it's in cases where it's first called in get_config(), then also later in verify() or apply() - the object is function local scope and isn't saved in a global variable. It would be a simple fix if Config() were initialised in main() and passed as an argument to any function that needs it.
I've created some proof of concept code that i think could help on this issue. https://github.com/runborg/vyos-1x/blob/main-cfg/src/conf_mode/main.py this is a conf-mode executor that handles multiple conf mode scripts. The reason i think this could seriously help on this issue is that as this is all running inside a single python tnterpreter, its able to load the config object once and pass it to all needed conf_mode scripts without a need for reinitialization.
The only big bottleneck with this script is that it only executes as a single vyatta-cfg priority so all executed scripts needs to be in a continous order without old bash dependencies.
As the current "priority map" there aren't a loot of concurrent python blocks, but i think many of the remaining bash/perl scripts could be moved to new places. https://pastebin.com/z6ZvkJKB
I did a quick test with 1.3.0-epa3 and this config loads in about 6 minutes in my VM (2 GB memory, 1 core at 1,6 GHz) at boot, some WireGuard config errors notwithstanding. After all the performance improvements, I think we can consider this resolved at this point.
Sorry, but I dont think this is fixed; I just attempted to upgrade my working 1.2.8 configuration to 1.3.
The 1.2.8 configuration will (eventually, after a very long time) fail with an out-of-memory condition when attempting to load on 1.3.
[ 812.814956] Out of memory: Killed process 1348 (python3) total-vm:3779656kB, anon-rss:3425936kB, file-rss:1808kB, shmem-rss:0kB, UID:0 pgtables:6784kB oom_score_adj:0 [ 813.965717] vyos-router[813]: Starting VyOS router: migrate rl-system firewall configure/usr/libexec/vyos/init/vyos-router: line 104: 1348 Killed sg ${GROUP} -c "$vyos_libexec_dir/vyos-boot-config-loader.py $BOOTFILE" [ 814.089133] vyos-router[813]: failed!
This is under KVM, so I just doubled the memory (from 4G to 8G) and tried. It takes longer, but eventually bombs. Watching in virt-top shows memory usage slowly increasing:
667221 libvirt+ 20 0 10.6g 8.1g 28044 S 137.7 1.6 48:39.51 qemu-sy+
On 1.2.8, the exact same configuration loaded in just about 5 minutes. I'm still waiting for the login prompt on 1.3 now, after ~20 minutes. Have to fall back to 1.2.8, unfortunately, perhaps I might try a 1.4 rolling when I get back from holiday.