This has been on my extended todo list for a long time now, but there were always higher priority issues. I'm glad someone is working on it!
Jun 8 2020
Jun 7 2020
@mrozentsvayg look at https://community.openvpn.net/openvpn/ticket/360 (this is documented in the code comment right above your change as well). OpenVPN on Linux in server mode with standard protocols doesn't listen on IPv6, just IPv4. We need to force it to bind to an IPv6 socket using these undocumented *6 protocols, then it'll listen on both IPv4 and IPv6. This wouldn't be necessary if OpenVPN listened on IPv6 with the default protocols or autodetected whether the local or remote IPs are v4 or v6 and chose the correct socket type, but it doesn't. Complain on the above ticket 360. We're just working within the limitations imposed by OpenVPN.
Jun 6 2020
Could the upgrade ask the user whether to enable the migration failure fallback mechanism on the first boot of the new image? If the user chooses to enable it, they should be able to select an old, known-good image to fall back to (the mechanism would run only on the first boot).
I think 'configure; load; commit' is important to make debugging easier and faster. There are issues with the vyos-config-debug method: it needs a full reboot to test every change, which can take minutes (and one may not fix the bug even in the first 10 tries, depending on how sleep deprived one is), and it lacks an easy way to see the scripts' stdout/stderr (it is discarded unless we enable airbag's debug log, which is yet another thing to keep in mind). The standard traceback that's logged to /tmp may not be enough to catch the exact error, so we need to print some variables to look at them or something like that. But mainly the issue is that rebooting is much slower than just doing load/commit.
Can't we just read config.boot into the session config of configtree or config? Wasn't that exactly what was done before? I'm 100% sure there was a function in config.py that read the config.boot file in case the config system wasn't initialised.
Right, that's obvious. The issue is that we need to know *which* image to switch to, but it's easy to solve, as I'll describe.
Ah right, then all the things about replacing config.boot aren't necessary. I was thinking that /config was permanent between images, I don't know how I forgot that it lives inside each image separately.
Sorry if I've misunderstood, but please allow me to give my opinion: when you install a new image from the old image to upgrade, the config.boot configuration may fail to load normally at startup (especially with rolling updates).
Please test with the latest 1.3 rolling image if the bug is now fixed.
Jun 5 2020
@thomas-mangin that's great if the POC already has the above - I'm on board with making it the replacement for vbash in that case. I can live without word jumping and line deletion (for now) if it has tab completion and history (I will miss reverse history search a lot as it doesn't search just the beginning of the line but the whole line for the pattern, I'm assuming prompt-toolkit just searches the beginning?). I do need to test it when I get some time to see if anything is still missing.
To add my 2 cents:
Another possibility would be to modify the VTunIf's bridgeable parameter when creating it. That wouldn't require a different name for tap and all the config migration that comes with it, but I don't know if it's possible.
I also don't think we can find out whether an interface is tun or tap from the up/down script (no parameter or environment variable seems to tell it this), so it'd be good if tun and tap had different names for this reason too. It would require migrating the name of existing interfaces (from vtun to vtap if the tunnel type is tap) under interfaces openvpn, interfaces bridge member, service router-advert, firewall, nat, and possibly other places as well.
What happens when one remote-host is IPv4 and one is IPv6? The proposed fix would leave the protocol as udp6 in that case and the error would still be there.
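To make the mixed case concrete, here is a minimal sketch that only detects which address families are present among the configured remotes. It assumes the remote-host values arrive as a list of strings; the function name `remote_families` is hypothetical and not part of vyos-1x. It does not decide which 'proto' to emit, it just shows that a single udp6/udp choice cannot cover a list containing both families.

```python
import ipaddress


def remote_families(remote_hosts):
    """Return the set of IP versions (4, 6) found among literal IP remotes.

    Hostnames are reported as None, because their address family is only
    known after name resolution.
    """
    families = set()
    for host in remote_hosts:
        try:
            families.add(ipaddress.ip_address(host).version)
        except ValueError:
            # Not a literal IP address, e.g. a DNS name.
            families.add(None)
    return families
```

A result of `{4, 6}` is exactly the situation asked about above: no single protocol suffix matches every remote.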
duplicate of T2468
Writing the config parsing, editing, committing and saving shouldn't be that hard.
Re-implementing completion with the current set of features would be very hard.
A workaround would be to simply call interfaces-ethernet.py from the bond script itself after it removes it from the bond.
This is a consequence of migrating bond member config from the interface node to the bond node; the same problem exists for bridges as well.
Also regarding bridging: the current VTunIf says vtun is bridgeable. I doubt bridging a tun interface is wanted, as that's a purely L3 tunnel. As the bridge member config syntax places the members under the bridge interface, the bridge interface determines if an interface is bridgeable by looking at its class definition. Thus, to make openvpn in tun mode not bridgeable and tap mode bridgeable, those would need to be 2 different classes with different interface names ('vtun' and 'vtap'?). A hackish way is possible by making the bridge code check the openvpn config directly, but I highly dislike hackish solutions. Even T2241 was a 'hackish' solution that was necessary due to a previous bridge syntax migration done without thinking about its consequences (moving the bridge member config under the bridge node makes syntactical sense, but it requires hackish workarounds like T2241 with the current way the config system operates).
Looking at the above errors:
Indeed, I didn't test client mode with the IPv6 patch. I assumed openvpn would use 'proto' for the listening socket only and not for the client socket (since it could detect which family the remote-host address is, it could select the correct socket), but it honors the 'proto' in the config, so my assumption was wrong. I appreciate the help.
Jun 2 2020
A significant part of the old config system is the bash-completion integration as well. I assume this is not integrated with bash but is a separate console that you start and takes over all stdin/stdout? Is it possible to implement the same completion output as there is now?
May 30 2020
I also couldn't immediately grasp how your suggestion would fit into the whole design concept of VyOS - now that I've had some time to think about it, it indeed makes sense. I was having problems seeing how having a complex daemon just for validators would make sense while still leaving everything else as is - it makes sense if everything is moved there, and only because of the current state of having to "marry" vyatta-cfg with Python and Python's slow initialization times. I guess in the future, if VyConf gets to a place where it can replace vyatta-cfg, it would still make sense to run everything in a single daemon.
@thomas-mangin I think there was a misunderstanding between us. The disagreement we had regarding the way to implement vyatta-cfg validators was because the validators are an integral part of vyatta-cfg operation. They are also simple and small, as they only need to validate the types and constraints of config nodes. As they are tied closely to vyatta-cfg, which operates by executing a new process for each config node, that execution needs to be very fast. I was against your solution (a validator daemon in Python listening on a socket file and a companion client in a language that's faster to start up) just because it seemed needlessly complicated for what it needs to achieve. Node validation in vyatta-cfg is a case of simple constraints, not complex interdependencies that would require a higher level language. As we later do the complex validation in the configuration scripts, which are written in Python themselves, all the complexity can already be put there. Now you may be wondering why this validation is done in two places: it's because of the legacy of vyatta-cfg. In the old days of vyatta, many config nodes didn't have corresponding scripts at all; they were self-contained and applied the config directly using system utilities and simple shell scripts that were part of the node definitions themselves. In that case, the config node validators were the only validation of a value that was done, and each config node could specify its own shell snippet or script to validate its own value. This made sense in that design concept.
It is also still an integral part of the shell environment: in config mode, a set command with an invalid value will return an error immediately, as its validator returns an error. The configuration script can catch an error only when a commit is triggered.
Now that we are tacking a completely different design concept onto that, things become complex. If the new design says "all new code must be Python", but we're marrying this new code with the old vyatta-cfg core (vyatta-cfg is still the heart and core of VyOS, with Python being the "worker"), things will become very suboptimal, complex and bizarre in some places that wouldn't need to be that way and could be left simple. The above is an example of this complexity due to a design choice.
Possibly it's in cases where it's first called in get_config(), then also later in verify() or apply() - the object is function local scope and isn't saved in a global variable. It would be a simple fix if Config() were initialised in main() and passed as an argument to any function that needs it.
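A rough sketch of that fix, to show the shape of it. The `Config` class here is a hypothetical stand-in that only counts how often it is constructed, and the config path and key names are made up for illustration; the real conf-mode scripts and the real Config class differ.

```python
class Config:
    """Hypothetical stand-in for the real config object."""

    instances = 0  # counts how many times the object was constructed

    def __init__(self):
        Config.instances += 1
        # Placeholder data instead of a real config session.
        self._data = {"service example listen-address": "192.0.2.1"}

    def return_value(self, path):
        return self._data.get(path)


def get_config(conf):
    # Uses the object it was handed instead of constructing its own.
    return {"listen_address": conf.return_value("service example listen-address")}


def verify(conf, settings):
    # Receives the same object, so no second initialisation happens here.
    if settings["listen_address"] is None:
        raise ValueError("listen-address is required")


def main():
    conf = Config()  # initialised exactly once
    settings = get_config(conf)
    verify(conf, settings)
    return settings
```

With this layout, any function that needs the config takes it as a parameter, so the initialisation cost is paid once per script run.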
@c-po IMO the script should be kept but fixed so it builds all valid packages. Otherwise there's no way to build our own packages with one command. Sure we can build them one by one by manually cloning each repository, but that's automated by this script. There's a task I already opened for it.
I'm working on a larger set of patches for DNS, a fix for this will be included
This does not compile Python scripts without a .py extension (there are several in src/services and src/utils that start with #!/usr/bin/env python3).
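One way to also catch the extensionless scripts is to select files by their shebang line as well as by suffix. The sketch below is only an illustration: the function names are hypothetical, and it syntax-checks each file by compiling it in memory with the built-in `compile()` rather than writing bytecode, which may not be what the build step in question does.

```python
from pathlib import Path


def is_python_script(path):
    """Return True for *.py files and for files with a python3 shebang."""
    if path.suffix == ".py":
        return True
    try:
        with path.open("rb") as handle:
            first_line = handle.readline(200)
    except OSError:
        return False
    return first_line.startswith(b"#!") and b"python3" in first_line


def compile_tree(root):
    """Compile every Python script under root in memory; return their paths.

    A SyntaxError from compile() propagates to the caller, so a broken
    script fails the check.
    """
    checked = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and is_python_script(path):
            compile(path.read_bytes(), str(path), "exec")
            checked.append(path)
    return checked
```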
May 28 2020
I haven't looked at how dhcp6c gets started currently. VyOS uses systemd to manage the services, but none of them should be set to enabled, they're all started manually via VyOS scripts. It's possible it's done differently in this case, I'm not going to speculate on something I don't know. I assumed it got started the same, when the interface script starts it.
On the dependency problem, I don't know how dhcp6c behaves when it's started with configured nonexistent interfaces. If it does cause a failure to start, that is an issue that needs to be fixed via another way. I'm not the implementer of this code so I'm not going to speculate on the best way to do it.
That's basically re-implementing and duplicating code from the ethernet script. It would work for bonding and for the link-local, but I'm thinking there may be other attributes that enslaving it to a bond (or bridge) may have changed (MTU?) that I don't know if they're changed back by the kernel after unenslaving it. It would quickly become a kludge.
You'd also need to do the same in the bridge interface, but there any interface type can be enslaved, so you'd first need to get the interface's config path (via Section). You'd end up with 2 slightly different pieces of code that duplicate code from the interface scripts (or rather, I think it's been moved into configdict.intf_from_dict).
It is possible, but I don't like it at all.
Sure, a new task would be very welcome so there's less spam in this task.
Why do you want to postpone dhcp6c startup? All the requirements and dependencies are there when the interface scripts start it. The interface is brought up before it's started. Other than waiting for a pppoe connection; yes, that would be worthwhile. Each interface script has a priority so that other interfaces they depend on are configured before the one that depends on them. That's set in the priority tag in the XML definitions and handled by vyatta-cfg. They're started sequentially by their priority value, not all at once.
@gadams I agree it's confusing. Changing the syntax isn't hard, we just have to choose the most user-friendly syntax and behavior. It can even be accomplished without changing the syntax, by:
a) if 'dhcpv6-options delegate' is set, do the same as for 'parameters-only', plus start dhclient by add_addr('dhcpv6')
or
b) start dhclient if either 'dhcpv6-parameters' or 'address dhcpv6' is set but only assign an address in the 'address dhcpv6' case, may be the simplest option.
@gadams have you tried the above 2 settings: 'address dhcpv6' and 'dhcpv6-options parameters-only' without your patch to see if the client doesn't assign an address in that case?
@tbr thanks for clarifying that, I agree. So the way to do that would be to set 'address dhcpv6' and 'dhcpv6-options parameters-only'. That is slightly confusing at first, as the combination of those 2 options shouldn't actually assign an address. I haven't tried it but that's how I expect it should work, I don't use PD currently. If it does work my comments regarding new methods in scripts are entirely unneeded.
This is difficult to solve with the current config syntax where the bond and bridge members are under the bonding and bridge nodes. When modifying bond or bridge members, only interfaces-bonding.py or interfaces-bridge.py is called, which can't modify the interfaces themselves, as all the interface logic (adding and deleting addresses) is in the interface script itself (e.g. interfaces-ethernet.py). The thing that says which script to run is the owner attribute of the interface node, which is run by vyatta-cfg scripts on commit.
Maybe we should add new methods to the Interface or DHCP class to allow starting just DHCPv6-PD without assigning an address to it? The way it's done now is by assigning an address with the value "dhcpv6" to the interface through the add/del_addr methods of Interface class. There needs to be a separate method for DHCPv6-PD without addressing (and generate a dhcpc config that doesn't assign the address, of course).
May 27 2020
ps afx shows frr processes still running. systemctl stop vyos-router; pkill -f '*.frr*.'; systemctl start vyos-router makes vyos-router start successfully, but with these errors:
May 26 2020
Will be done as part of T2486
May 25 2020
Upstream says this is normal behavior when DNSSEC is enabled, so the workaround that I'm working on (addNTA) is actually the proper fix.
May 24 2020
@jack9603301 Do you have a working RA on that interface? You can set service router-advert interface <if> prefix ::/64 for RA to advertise all prefixes on the interface. That way if the DHCPv6-PD prefix changes it will send advertisements for the new prefix automatically.
If this can be solved by a kernel update, there was talk about maybe having different build "flavors" in the past - one with all the hardware nic drivers, one without. The minimal image could then have the latest (5.x) kernel.
There's T2085 which prevents us from testing any newer kernel ourselves as it's built by Jenkinsfiles in the CI, we'd need to manually do the steps the CI does to build a kernel. I proposed a shared script solution for these repositories in that task that could be called from both the CI and vyos-build, this would allow anyone to build all packages, including the kernel, through vyos-build, just for cases like this.
May 23 2020
May 22 2020
May 21 2020
In T2486#64335, @jjakob wrote: Also, this is reproducible with pdns-recursor from upstream master (4.4.0) so upgrading won't help.
In T2486#64339, @jack9603301 wrote: I can summarize the following solutions, and maybe there are other solutions:
a) Fix the bug yourself
b) Use other storage mechanisms to resolve records to bypass
c) Self parsing hosts
If you mean we should maintain our own fork of PowerDNS, I'm against that. PowerDNS is open source and anyone can submit patches to it, the same as VyOS. If you want to try fixing the bug in pdns-recursor, you can clone pdns, debug it, build it, test it and submit the patch at https://github.com/PowerDNS/pdns. Of course you have to abide by their contribution guidelines that are listed there. They also have an IRC channel at OFTC #powerdns.
You mean that once pdns-recursor has recursively forwarded a request to the back-end recursive resolver, queries for the static entries in /etc/hosts will always return NXDOMAIN?
The full description and way to reproduce is at https://github.com/PowerDNS/pdns/issues/9136 since this is a pdns-recursor bug. But in essence, after pdns-recursor startup or restart, requests that come in to pdns-recursor (service dns forwarding in VyOS) for a domain from /etc/hosts work normally. Then a request for any other domain comes in, that gets forwarded via forward-zones-recurse (service dns forwarding name-server), for example google.com, that request gets resolved without errors, but causes this bug to manifest. After that, a request for any hostname from /etc/hosts returns NXDOMAIN.
I think the way to do this is in src/conf-mode/interfaces-ethernet.py in apply(): don't change the interface's MAC if eth['is_bond_member'] is set.
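Roughly, the guard could look like the sketch below. Only the `eth['is_bond_member']` key comes from the comment above; the helper name `apply_mac`, the `'mac'` key and the `set_mac` callback are assumptions made to keep the example self-contained, and the real apply() is structured differently.

```python
def apply_mac(eth, set_mac):
    """Set the configured MAC unless the interface is enslaved to a bond.

    eth is the interface config dict; set_mac is a callable that applies
    the address (hypothetical, standing in for the real interface API).
    Returns True if the MAC was changed.
    """
    if eth.get("is_bond_member"):
        # The bond controls the member's MAC, so leave it untouched.
        return False
    if eth.get("mac"):
        set_mac(eth["mac"])
        return True
    return False
```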
May 20 2020
I think this should be fixed by whoever broke it, no? I don't have the time to do any real work right now. Maybe in a week or 2.
Not really, the change to nobody:nogroup was by c-po in https://github.com/vyos/vyos-1x/commit/f371946044696737d1649d9119665b96430d2328
The commit by me you referenced just fixed a bug that resulted from that change.
For get_bridge_member_config, ifname_from_config and maceui64 to be able
to be moved into ifconfig.interfaces T2366 needs to be done first,
otherwise functionality will break.
Definitely, I'm not saying NPT should be removed, just discouraged in favour of using routed public prefixes where available. If the user chooses to use NPT, the option should definitely still be there.
That's a case where having the ability to assign addresses from the received prefix via DHCPv6 on the internal interface would allow internal hosts to get managed addresses from the prefix automatically without the use of NPT or SLAAC. But that isn't implemented yet AFAIK.
I agree. NPTv6 was acceptable as a stopgap measure as VyOS didn't support DHCPv6-PD. Now that we have that (even though it's still young and needs testing), NPTv6 should be actively discouraged in the docs except for unavoidable cases, e.g. where the ISP wants to only give the client a single /64 but the client wants multiple L2 segments with IPv6, each needing its own /64 segment - unless there is a better alternative way to solve that I don't know of, other than demanding a /56 from the ISP or switching ISPs.
We definitely shouldn't be setting permissions on the socket to 777 or 666 - whoever has write access to it can modify the DNS configuration (pdns-recursor) and can thus inject malicious DNS records or add himself as a DNS forwarder and do MITM attacks.
May 17 2020
After a new reboot, the DHCP nameservers were correctly added to resolv.conf and powerdns recursor.conf. I had system name-server and service dns forwarding name-server set to a static IP. But after deleting these two static nameserver nodes, the DHCP nameservers are missing from both resolv.conf and recursor.conf.
May 16 2020
After 2 release dhcp interface eth1 and one renew dhcp interface eth1, I now have 2 dhclients running, so there is a bug in the op-mode release/renew code.
4079 ? Ss 0:00 /sbin/dhclient -4 -nw -cf /var/lib/dhcp/dhclient_eth1.conf -pf /var/lib/dhcp/dhclient_eth1.pid -lf /var/lib/dhcp/dhclient_eth1.leases eth1
4305 ? Ss 0:00 /sbin/dhclient -q -nw -cf /var/lib/dhcp/dhclient_eth1.conf -pf /var/lib/dhcp/dhclient_eth1.pid -lf /var/lib/dhcp/dhclient_eth1.leases eth1
Passing passwords via command line arguments is very bad practice. Curl has a -u option, if passed just the user it prompts for the password on stdin. This can simply be passed via shell redirection.
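As an illustration of keeping the password out of the process list, here is a sketch that uses a different curl feature than the -u prompt: `--config -`, which makes curl read its options from stdin. The function name `build_curl_call` is hypothetical, and this is not taken from the code under discussion. It only builds the argument list and the stdin payload, so it can be inspected without running curl.

```python
def build_curl_call(user, password, url):
    """Return (argv, stdin_text) for a curl call with no password in argv."""

    def quote(value):
        # curl config files take double-quoted values with backslash escapes.
        return value.replace("\\", "\\\\").replace('"', '\\"')

    # "--config -" tells curl to read further options from stdin.
    argv = ["curl", "--config", "-", url]
    stdin_text = 'user = "{}:{}"\n'.format(quote(user), quote(password))
    return argv, stdin_text
```

The result could then be run with something like `subprocess.run(argv, input=stdin_text, text=True, check=True)`, so the credentials travel over a pipe and never appear in `ps` output.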