Improve CLI value validator performance
Closed, Resolved | Public | FEATURE REQUEST

Assigned To

Authored By

	thomas-mangin
	May 7 2020, 8:41 PM

Description

Currently, using Python for the validators carries a high cost. A new Python interpreter must be forked for each test, and this is not cheap:

vyos@vyos# time bash -c ""

real	0m0.001s
user	0m0.001s
sys	0m0.000s
[edit]
vyos@vyos# time python3 -c ""

real	0m0.019s
user	0m0.011s
sys	0m0.007s
[edit]
vyos@vyos# time python3 -c "import vyos.ifconfig"

real	0m0.173s
user	0m0.104s
sys	0m0.031s

So bash is roughly 20x faster per call than a bare python3, and over 100x faster than python3 once vyos.ifconfig is imported. However, even the bash code is not as fast as compiled C would be, as it still forks and execs another application.
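A rough way to repeat this measurement from Python itself, averaged over several runs, could look like the sketch below. The helper name mean_startup_seconds is made up for illustration, and json is only a stand-in for a heavier import such as vyos.ifconfig, which may not be installed where this is run.

```python
import subprocess
import sys
import time


def mean_startup_seconds(argv, runs=5):
    """Return the mean wall-clock time to spawn argv and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(argv, check=True)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    # sys.executable is the interpreter running this script.
    bare = mean_startup_seconds([sys.executable, "-c", ""])
    # json is a stand-in (assumption) for a larger library such as vyos.ifconfig.
    with_import = mean_startup_seconds([sys.executable, "-c", "import json"])
    print(f"bare interpreter: {bare * 1000:.1f} ms per call")
    print(f"with an import:   {with_import * 1000:.1f} ms per call")
```

The absolute numbers will differ from the ones above, since this also pays for the subprocess module's own overhead, but the gap between the two lines shows the import cost.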

Python performance is mostly "good enough" for most cases when the interpreter is running. If the python code is running in long-lived processes, then the initialisation impact of python becomes a moot point.

For example, using a Unix socket and a simple echo program from https://pymotw.com/3/socket/uds.html gives the following results:

vyos@vyos:~$ time bash -c 'echo "test" | nc -U uds_socket | head -1'
test

real	0m0.004s
user	0m0.003s
sys	0m0.000s

This includes forking the shell, nc and head, and running echo; even so, the performance is close to that of bare bash.
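For readers who do not want to follow the link, a long-lived echo server on a Unix socket can be sketched with the standard socketserver module as below. This is not the pymotw code, only a minimal equivalent; the names EchoHandler, echo_once and demo are made up for this example, and the socket is created in a temporary directory rather than as ./uds_socket.

```python
import os
import socket
import socketserver
import tempfile
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Echo every line back until the client closes its sending side.
        for line in self.rfile:
            self.wfile.write(line)


def echo_once(path, message):
    """Connect to the socket at path, send one line, return the echoed line."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(message.encode() + b"\n")
        client.shutdown(socket.SHUT_WR)
        with client.makefile("r") as reply:
            return reply.readline().rstrip("\n")


def demo():
    """Start the server in a background thread, send one message, stop it."""
    path = os.path.join(tempfile.mkdtemp(), "uds_socket")
    with socketserver.UnixStreamServer(path, EchoHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            return echo_once(path, "test")
        finally:
            server.shutdown()


if __name__ == "__main__":
    print(demo())
```

The interpreter is started once, when the server comes up; each request afterwards only pays for a connect and a round trip on the socket.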

So why would we want to use a long-lived Python process rather than C/OCaml/Bash for validation?
1 - it keeps the number of languages in the project down. Python is a widely known language.
2 - it allows the same validation code to be shared between configuration mode, operational mode and the validators, which brings consistency
3 - the C code could be adapted to talk to the Unix socket directly, saving an expensive fork
4 - long-lived code opens the door to other optimisations
5 - the framework will also be available for configuration or operational mode, where the tools are more complex and better written in Python

For the optimisation (4): for example, templating is currently used to generate the configuration files for third-party applications. As with forking, the initialisation cost is high, and a long-lived program would improve performance.

I believe (5) is, however, the most compelling point, as the long-term direction of the project should be considered. Currently, the XML is used to generate some files, which are then used by some C code ... It would make sense for the XML to be used by the same Python code that is used for the configuration. Once all the logic has moved into Python, this becomes possible. It may even remove the need to run as a daemon, as no forking would be required for anything and the initialisation cost may be acceptable. Some other features also become available, but that is off-topic here (not performance-related).

The project already uses multiple languages and not all contributors are fluent in all of them. I can count Perl, XML, Shell, C, Python, OCaml (and surely a few DSLs). Python is probably the language best known to likely contributors.

Having all the code in Python also opens other options, such as using entry points to generate a single application for each validator.

I propose to use this ticket to:

discuss the pros and cons of each approach
share numbers and performance measurements for the different solutions, for objective decision making
but leave out the other possibilities and use T2407 for those instead

Part of this discussion has already taken place in:

Details

Difficulty level: Unknown (require assessment)
Version: -
Why the issue appeared?: Will be filled on close
Is it a breaking change?: Unspecified (possibly destroys the router)
Issue type: Internal change (not visible to end users)

Related Objects

Mentioned In: T2418: Interfaces completion (list_interfaces.py) is slow
Mentioned Here: T5388: Something is fishy with commit and boot times when more than a few hundred static routes are being used
T2407: alternate installation for the vyos-1x python code
T2425: Rewrite all policy zebra filters to XML/Python style

Event Timeline

thomas-mangin created this task. May 7 2020, 8:41 PM

pasik subscribed. May 8 2020, 9:34 AM

I have implemented a "validator" program: an entry point that locates a named Python program and runs it. It relies on Python's import mechanism at startup, so the setup time is very high.

The same code was then used to create a Unix socket daemon to evaluate performance. The daemon can be started using:

unix_daemon -v

The repository can be downloaded from:
https://github.com/thomas-mangin/vyos-1x/tree/T2433

1x test:

Calling the old/removed numeric.py Python validator code once (same code, different location)

vyos@vyos:~$ time python3 /usr/lib/python3/dist-packages/vyos/validators/numeric.py --positive 1

real	0m0.025s
user	0m0.017s
sys	0m0.008s

Calling the same code via a dispatcher

vyos@vyos:~$ time validator numeric --positive 1

real	0m0.106s
user	0m0.091s
sys	0m0.014s

(0.106+0.091+0.014)/(0.025+0.017+0.008)

= 4.22

So the dispatcher has a setup time which makes the program 4-5x slower. The reason is the time it takes to load and parse the code from the vyos repo that it uses (plus a little dispatching), none of which the standalone numeric.py needs. So we can see this as the impact of loading a "largish" library in Python.

1000x test:
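To make the dispatch step concrete, here is a minimal sketch of locating a validator module by name with importlib and timing the import. It is not the code from the T2433 branch: the default package name vyos.validators is inferred from the path used above, and the convention that each module exposes a main(args) function is an assumption made for this example.

```python
import importlib
import time


def timed_import(module_name):
    """Import module_name and return (module, seconds spent importing)."""
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    return module, time.perf_counter() - start


def dispatch(name, args, package="vyos.validators"):
    """Locate the validator module called name and run its main(args).

    The package default and the main(args) convention are assumptions
    for this sketch, not the layout of the actual dispatcher.
    """
    module, elapsed = timed_import(f"{package}.{name}")
    return module.main(args), elapsed


if __name__ == "__main__":
    # json is used here only because it is always available.
    _, seconds = timed_import("json")
    print(f"importing json took {seconds * 1000:.2f} ms")
```

Running `python3 -X importtime -c "import vyos.ifconfig"` gives a per-module breakdown of where that startup time goes.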

validator code

vyos@vyos:~$ time sudo sh -c ' for ((n=0;n<1000;n++)); do validator numeric --positive 1; done'

real	1m38.290s
user	1m21.231s
sys	0m16.103s

That's painful!

numeric.py Python code

vyos@vyos:~$ time sudo sh -c ' for ((n=0;n<1000;n++)); do python3 /usr/lib/python3/dist-packages/vyos/validators/numeric.py --positive 1; done'

real	0m17.268s
user	0m14.894s
sys	0m2.250s

Using OCaml numeric

vyos@vyos:~$ time sudo sh -c ' for ((n=0;n<1000;n++)); do /usr/libexec/vyos/validators/numeric --positive 1; done'

real	0m0.583s
user	0m0.496s
sys	0m0.081s

(17.268+14.894+2.250)/(0.583+0.496+0.081)

= 29.67

So on my router, OCaml is about 30x faster than the Python code (which matches the tests done by others).

Using the Python numeric.py "unmodified" behind a Unix socket server, to remove the Python setup cost.

vyos@vyos:~$ time sudo sh -c ' for ((n=0;n<1000;n++)); do echo numeric --positive 1 | nc -U ./validator.socket > /dev/null; done'

real	0m1.304s
user	0m0.569s
sys	0m0.130s

(1.304+0.569+0.130)/(0.583+0.496+0.081)

= 1.73
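The ratios in this comment can be recomputed from the raw `time` output with a few lines of Python. The sketch below uses the same real+user+sys sum as the calculations above, and also shows the ratio of the real (wall-clock) column alone for comparison; the dictionary keys and function names are labels chosen for this example.

```python
# (real, user, sys) in seconds, copied from the `time` output above.
TIMINGS = {
    "numeric.py, 1 call": (0.025, 0.017, 0.008),
    "dispatcher, 1 call": (0.106, 0.091, 0.014),
    "numeric.py, 1000 calls": (17.268, 14.894, 2.250),
    "OCaml numeric, 1000 calls": (0.583, 0.496, 0.081),
    "socket daemon, 1000 calls": (1.304, 0.569, 0.130),
}


def summed_ratio(slow, fast):
    """Ratio of real+user+sys, the figure used in this comment."""
    return sum(TIMINGS[slow]) / sum(TIMINGS[fast])


def wall_clock_ratio(slow, fast):
    """Ratio of the real column only, i.e. elapsed wall-clock time."""
    return TIMINGS[slow][0] / TIMINGS[fast][0]


if __name__ == "__main__":
    pairs = [
        ("dispatcher, 1 call", "numeric.py, 1 call"),
        ("numeric.py, 1000 calls", "OCaml numeric, 1000 calls"),
        ("socket daemon, 1000 calls", "OCaml numeric, 1000 calls"),
    ]
    for slow, fast in pairs:
        print(f"{slow} vs {fast}: "
              f"summed {summed_ratio(slow, fast):.2f}x, "
              f"wall clock {wall_clock_ratio(slow, fast):.2f}x")
```

Since real is elapsed time and, for a sequential loop, already covers the CPU time reported as user and sys, the wall-clock ratio may be the more conservative figure: on these numbers the daemon comes out at about 2.2x the OCaml time rather than 1.7x.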

So using Python via a daemon is still slower than OCaml, BUT:

there has been no attempt to optimise the command-line parsing (it still uses argparse) or the complex current syntax
the figure includes the cost of nc, which could be avoided by connecting from the calling C code directly

So I would argue that it should be possible to use Python code for all the checks and make it as fast as OCaml.
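Both remaining costs can be illustrated with a small sketch: a daemon that maps a request line to a check through a lookup table instead of argparse, and a client that connects to the socket itself instead of going through nc. This is not the unix_daemon from the T2433 branch. The one-line request format and the ok/fail replies are assumptions made for this example, as are all the function and class names, and only the numeric --positive check is implemented.

```python
import os
import socket
import socketserver
import tempfile
import threading


def numeric_positive(value):
    """Return True when value is an ASCII decimal integer greater than zero."""
    return value.isascii() and value.isdigit() and int(value) > 0


# Lookup table instead of argparse: (validator name, flag) -> check function.
CHECKS = {("numeric", "--positive"): numeric_positive}


def validate(request):
    """Evaluate one request line such as 'numeric --positive 1'."""
    parts = request.split()
    if len(parts) != 3:
        return False
    check = CHECKS.get((parts[0], parts[1]))
    return check is not None and check(parts[2])


class ValidatorHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request per connection; the reply format is an assumption.
        request = self.rfile.readline().decode(errors="replace")
        self.wfile.write(b"ok\n" if validate(request) else b"fail\n")


def ask(path, request):
    """Send one request straight to the daemon socket, with no nc in between."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(request.encode() + b"\n")
        with client.makefile("r") as reply:
            return reply.readline().strip()


def demo():
    """Run the daemon in a background thread and send it two requests."""
    path = os.path.join(tempfile.mkdtemp(), "validator.socket")
    with socketserver.UnixStreamServer(path, ValidatorHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            return [ask(path, "numeric --positive 1"),
                    ask(path, "numeric --positive -1")]
        finally:
            server.shutdown()


if __name__ == "__main__":
    print(demo())
```

The ask function plays the role that the calling C code would take over: open the socket, write one line, read one line, with no extra process forked per validation.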

thomas-mangin mentioned this in T2418: Interfaces completion (list_interfaces.py) is slow. May 30 2020, 3:45 PM

erkin set Issue type to Internal change (not visible to end users). Aug 30 2021, 6:18 AM

erkin removed a subscriber: Active contributors.

syncer edited projects, added VyOS 1.3 Equuleus (1.3.0); removed VyOS 1.3 Equuleus. Nov 6 2021, 11:24 AM

syncer edited projects, added VyOS 1.3 Equuleus (1.3.3); removed VyOS 1.3 Equuleus (1.3.0). Aug 29 2022, 7:05 AM

syncer edited projects, added VyOS 1.3 Equuleus (1.3.4); removed VyOS 1.3 Equuleus (1.3.3). Jul 12 2023, 9:42 PM

syncer edited projects, added VyOS 1.3 Equuleus (1.3.5); removed VyOS 1.3 Equuleus (1.3.4). Aug 25 2023, 9:30 PM

Viacheslav edited projects, added VyOS 1.4 Sagitta; removed VyOS 1.3 Equuleus (1.3.5). Aug 28 2023, 1:24 PM

Apachez subscribed. Aug 28 2023, 1:28 PM

Viacheslav triaged this task as Normal priority. Jan 20 2024, 12:59 AM

The most frequently used validators are already in OCaml now; for the rest we'll need to create separate tasks.

dmbaturin renamed this task from Increase performance using unix socket to Improve CLI value validator performance. Mar 12 2024, 6:09 PM

dmbaturin edited projects, added VyOS 1.4 Sagitta (1.4.0-epa1); removed VyOS 1.4 Sagitta.

Is this related to the long commit and boot times when one has more than a handful of routes or firewall rules, as described in https://vyos.dev/T5388?
