Prometheus already has a number of exporters, listed here:
https://prometheus.io/docs/instrumenting/exporters/
We need to create one for VyOS
You can scrape SNMP into Prometheus. I'm not sure whether you want any gauges that SNMP doesn't cover.
https://wiki.vyos.net/wiki/SNMP
install snmp-exporter
/etc/prometheus/prometheus.yml :

    ...
    scrape_configs:
      - job_name: 'snmp'
        static_configs:
          - targets:
            - 10.11.22.33
        metrics_path: /snmp
        relabel_configs:
          - source_labels: [__address__]
            target_label: __param_target
          - source_labels: [__param_target]
            target_label: instance
          - target_label: __address__
            replacement: 127.0.0.1:9116  # The SNMP exporter's real hostname:port.
    ...
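To make the relabeling in that config concrete, here is a small Python sketch of the URL Prometheus should end up requesting for the target. The function name snmp_scrape_url is made up for illustration; the address, port and path are the ones from the config.

```python
from urllib.parse import urlencode


def snmp_scrape_url(target, exporter="127.0.0.1:9116", metrics_path="/snmp"):
    """Build the URL Prometheus requests once the relabel rules are applied."""
    # Rule 1 copies __address__ (the device) into the ?target= URL parameter.
    # Rule 3 replaces __address__ with the SNMP exporter's own host:port.
    return f"http://{exporter}{metrics_path}?{urlencode({'target': target})}"


print(snmp_scrape_url("10.11.22.33"))
# → http://127.0.0.1:9116/snmp?target=10.11.22.33
```

So the router itself is never scraped over HTTP; the exporter is, and it talks SNMP to the router on Prometheus's behalf.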
SNMP can be used as a workaround, but it is not suitable for much more than a couple of metrics because it is very inefficient. Moreover, the Prometheus node exporter provides many more metrics out of the box.
Yeah, I'm going to second the motion on this one. I've been down the road of SNMP before, and just navigating the available metrics and trying to figure out what they are is less than straightforward. I mean, I've been at this sysadmin thing for a bit over 10 years now and I couldn't manage something useable out of it. It might be because I'm not a great sysadmin, but a lot of people are not great sysadmins. 😛 Having this feature would be great for accessibility and discoverability.
If we were to turn this into a user story, I believe it could look like this:
As a VyOS administrator,
I want to be able to scrape Prometheus metrics off of the router,
So that I can monitor the health of the router
I think it could also be broken down into the following tasks:
From my experience, none of those tasks are particularly complex, it's just a bit of a list to process... and it's certainly not getting done overnight.
Other points of interest / opinion:
Quite interesting, I support this. In fact, some information cannot be captured very well through SNMP.
PS: Please note that Prometheus depends on a Go build environment.
We should avoid having a constellation of exporters, but favour having a single one. I feel like starting and stopping those would be pretty icky.
Handling multiple services is pretty easy with systemd. Having a super-exporter is an anti-pattern for Prometheus.
Most of the metrics users would want out of VyOS are available in the node_exporter.
Also, starting with SNMP data doesn't seem to make a lot of sense. Reproducing SNMP is a bit of a non-goal in my opinion. Prometheus users are going to expect Prometheus-native data.
https://github.com/tynany/frr_exporter
I'm recording the exporter's address on this task so it doesn't get forgotten.
@jack9603301 Do you know of a version of that FRR exporter that doesn't fork sub processes?
Please forgive me for not understanding what you mean
The frr_exporter linked uses os/exec to run an external binary, /usr/bin/vtysh. This is not a great way to build an exporter, as it can lead to a fork bomb. There is also the overhead of calling the external binary to gather data.
Just let systemd manage it.
No, that's not the problem. The exporter itself could potentially create thousands of sub processes if something were to go wrong.
There is a large amount of overhead in this process. Looking over some of the issues in the frr_exporter's bug tracker shows it's pretty slow and problematic.
Most Prometheus data is generated by the exporter. It is not collected and pushed in real time. When Prometheus queries, it reads the relevant metrics through the port the exporter exposes. Therefore, I don't think it is possible to create thousands of sub-processes or threads. What do you think?
I'm not sure you understand how this works.
Prometheus is a polling-based metrics collection system. When it scrapes the exporter, the exporter has to return the data.
The way the exporter works is that it uses exec to launch the external command to gather the data. This happens in real time for each scrape.
Because Prometheus is polling, and multiple Prometheus servers or humans can hit the /metrics endpoint, the collector must allow for concurrent scrapes.
So each concurrent scrape will fork new sub processes. And looking at the code, the exporter actually calls the sub process multiple times per scrape.
If those processes get stuck, they could build up, even if there are timeouts passed via context handling.
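One common way to limit this, sketched in Python under assumptions: serialize the expensive collection behind a lock and reuse the result for a short time, so N concurrent scrapes trigger at most one external call. CachedCollector and its collect callable are hypothetical names; this is not how frr_exporter works today.

```python
import threading
import time


class CachedCollector:
    """Share one collection result between concurrent scrapes."""

    def __init__(self, collect, ttl=10.0):
        self._collect = collect        # hypothetical callable that gathers the data
        self._ttl = ttl                # seconds a result is reused before refreshing
        self._lock = threading.Lock()
        self._value = None
        self._stamp = None             # monotonic time of the last collection

    def get(self):
        # Only one thread at a time runs the expensive collection; the others
        # wait on the lock and then reuse the fresh result.
        with self._lock:
            now = time.monotonic()
            if self._stamp is None or now - self._stamp >= self._ttl:
                self._value = self._collect()
                self._stamp = time.monotonic()
            return self._value
```

This caps the number of sub-processes at one per refresh, but it does not help if that one call hangs: scrapes would then queue up behind the lock instead.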
I think I understand what you mean. Don't worry. I'm also a user of Prometheus. I know how Prometheus works.
I think a feasible solution is to set a timeout.
Once the time limit is exceeded, a SIGKILL signal is sent.
Timeouts and SIGKILL don't always work. If a process is stuck on I/O, it will not exit.
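For reference, this is roughly what the timeout-plus-SIGKILL approach looks like in Python. It is only a sketch: run_with_timeout is a made-up helper, and the slow command is a stand-in, not vtysh.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout):
    """Run a command; return its stdout, or None if it exceeded the timeout."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # By this point subprocess.run() has killed the child (SIGKILL on POSIX)
        # and waited for it to exit. A child stuck in uninterruptible I/O cannot
        # act on the signal, so that wait is where the caller would block.
        return None
    return done.stdout


# Stand-in for a slow external command (an assumption for this sketch).
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_timeout(slow, timeout=0.2))
# → None
```

In the normal case the child is killed and reaped. The problem case described above is the one where the kill is delivered but the process cannot exit yet.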
Forking is _very_ bad, especially for an embedded router OS. Until this is fixed, I would _highly_ recommend against including this exporter. It is too dangerous to use.
That is true. I just want to record this so it isn't forgotten: another solution is to modify FRR ourselves and maintain the change in parallel with the official version of FRR. In other words, we can patch FRR or maintain a separate branch, build our own version, and get the metrics directly from its code. This work needs someone to do it, though.
It is more efficient to obtain monitoring data directly from inside the service than from an external plug-in.
If you can, make a patch, and the automated build script will then apply it to the FRR source tree when compiling.
The best possible solution would be for FRR to support Prometheus directly, rather than require an exporter.
I agree. So if someone understands the code structure of FRR, we can implement the exporter inside FRR itself, following the Prometheus exposition format, and then generate a patch file. The automated build script can then apply it and package the result as a DEB.
https://git.freestone.net/cramer/frr-prometheus-stats
Hi guys, I found an interesting script through FRRouting's GitHub repo. It came up because someone wrote a script and submitted the following bug report:
https://github.com/FRRouting/frr/issues/5445
The address of this prometheus exporter script is as follows:
https://git.freestone.net/cramer/frr-prometheus-stats/-/raw/master/frr-prometheus-stats.py
It may be useful as a reference.
This means that maybe we can build our own exporter based on python3.
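As a rough idea of how small a python3 exporter can be, here is a standard-library-only sketch. It is not the code from the linked script; collect(), the metric name frr_example_peers_up and port 9342 are placeholders invented for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render {metric_name: value} in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect():
    # Hypothetical placeholder: a real exporter would query FRR here.
    return {"frr_example_peers_up": 2}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9342):
    # The port is an arbitrary choice for this sketch.
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling serve() and pointing a scrape job at 127.0.0.1:9342 would be enough to try it. How collect() gets its data is the real question, and it runs into the same forking concerns discussed above if it shells out.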
Does anyone at least have an example of how to use the snmp exporter? For example a snmp.yml or generate one with the given mibs?
I do agree having an exporter would be really nice
@anthr76 we have a ready Telegraf exporter, maybe it will work for you?
https://docs.vyos.io/en/latest/configuration/service/monitoring.html
Prometheus-client is already in 1.4:
https://docs.vyos.io/en/latest/configuration/service/monitoring.html#prometheus-client
I wouldn't call telegraf a very good option. It does a very bad job of producing Prometheus metrics.
Nothing special, just the configuration described in our documentation.
Prometheus Client exposes all metrics on /metrics (default) to be polled by a Prometheus server
This feature is missing documentation: https://docs.vyos.io/en/latest/configuration/service/monitoring.html