
Cannot create bond SR-IOV XCP-ng X540-T2
Closed, Invalid | Public | BUG

Description

I have installed VyOS on Xenserver to work as a virtual L3 switch. I want to dedicate 4 interfaces to VyOS, passing them as SR-IOV interfaces and LACP bonding them with a downstream L2 switch.
When I create the bond it works fine for one interface, but it breaks when I try to add a second one.
In particular

set interfaces bonding bond0 member interface eth0
commit
save

works fine, but

set interfaces bonding bond0 member interface eth1
commit

results in

PermissionError: [Errno 1] Operation not permitted

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/interfaces-bonding.py", line 209, in <module>
    apply(c)
  File "/usr/libexec/vyos/conf_mode/interfaces-bonding.py", line 200, in apply
    b.update(bond)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/bond.py", line 401, in update
    self.add_port(interface)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/bond.py", line 262, in add_port
    ret = self.set_interface('bond_add_port', f'+{interface}')
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/control.py", line 182, in set_interface
    return self._set_sysfs(self.config, name, value)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/control.py", line 166, in _set_sysfs
    self._sysfs_set[name]['location'].format(**config), value)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/control.py", line 132, in _write_sysfs
    f.write(str(value))
PermissionError: [Errno 1] Operation not permitted



[[interfaces bonding bond0]] failed
Commit failed

The order in which I add the interfaces does not matter, and using various combinations of the 4 interfaces yields the same result.

However, the main issue seems to be that VyOS cannot set the MAC address of an SR-IOV interface. When running the commands above, I get the following in the guest dmesg

[  260.102308] ixgbevf 0000:00:06.0 eth0: NIC Link is Up 1 Gbps
[  260.108878] bond0: (slave eth0): Enslaving as a backup interface with an up link
[  260.153678] bond0: (slave eth1): Error -1 calling set_mac_address
[  260.161615] bond0: (slave eth1): Error -1 calling set_mac_address

and in the host dmesg

[35888.875860] ixgbe 0000:06:00.0 eth4: VF 0 attempted to set a new MAC address but it already has an administratively set MAC address CA:6D:EC:51:71:66
[35888.875865] ixgbe 0000:06:00.0 eth4: Check the VF driver and if it is not using the correct MAC address you may need to reload the VF driver

For completeness I include every piece of information in the log

Report Time:      2020-12-19 22:28:24
Image Version:    VyOS 1.3-rolling-202012190217
Release Train:    equuleus

Built by:         [email protected]
Built on:         Sat 19 Dec 2020 02:17 UTC
Build UUID:       b48a4b33-3129-4c7e-a597-d7936427320d
Build Commit ID:  09e7d7c379cfee

Architecture:     x86_64
Boot via:         installed image
System type:      Xen HVM guest

Hardware vendor:  Xen
Hardware model:   HVM domU
Hardware S/N:     1be847d5-9283-8688-ccf9-2fa97d0e4e58
Hardware UUID:    1be847d5-9283-8688-ccf9-2fa97d0e4e58

Details

Difficulty level: Unknown (require assessment)
Version: VyOS 1.3-rolling-202012190217
Why the issue appeared?: Will be filled on close
Is it a breaking change?: Unspecified (possibly destroys the router)
Issue type: Bug (incorrect behavior)

Event Timeline

The issue turned out to be at the host level. By default, SR-IOV VFs are not allowed to change their own MAC addresses or send packets with a different MAC address, and since LACP bonding requires this, the configuration would just fail.

One needs to enable trust and disable spoof checking for each VF (on the XCP-ng host, ip link set dev ethN vf X trust on and ip link set dev ethN vf X spoofchk off). The settings are reset at each reboot, so it is necessary to create a script that sets them at boot (if somebody has a better solution, please tell me).
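For anyone hitting the same thing, here is a minimal sketch of such a boot script. It only wraps the two ip link commands quoted above. The PF name eth4 and VF index 0 are taken from the host dmesg in the description; the second VF index is a placeholder, and the names VF_MAP, build_commands and apply_settings are my own. It assumes Python 3 is available on the host, and how the script is hooked into boot is left open.

```python
#!/usr/bin/env python3
"""Re-apply SR-IOV VF trust/spoofchk settings on the host (sketch)."""
import subprocess
import sys

# Assumption: map of physical function name -> VF indices passed to the guest.
# "eth4" / VF 0 come from the host dmesg above; VF 1 is a placeholder.
VF_MAP = {"eth4": [0, 1]}


def build_commands(vf_map):
    """Return the ip link commands needed for every VF in vf_map."""
    commands = []
    for pf, vfs in vf_map.items():
        for vf in vfs:
            base = ["ip", "link", "set", "dev", pf, "vf", str(vf)]
            commands.append(base + ["trust", "on"])      # let the VF change its MAC
            commands.append(base + ["spoofchk", "off"])  # allow frames with another source MAC
    return commands


def apply_settings(vf_map, dry_run=True):
    """Print the commands (dry run) or execute them, stopping on the first failure."""
    for cmd in build_commands(vf_map):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only touch the system when --apply is given; otherwise just show the commands.
    apply_settings(VF_MAP, dry_run="--apply" not in sys.argv)
```

Run without arguments, it only prints the commands so they can be checked first; with --apply (as root on the host) it executes them.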

While this is not a VyOS problem, the error message is beyond unhelpful: it complains about not being allowed to write to some file, which has nothing to do with the real problem. In my opinion this should be fixed.

Thanks for the feedback and telling us about how to solve this issue.

The error message is indeed bad, but it is generated by the OS kernel. We should find a general way to handle such errors more gracefully.
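One possible direction, purely as a sketch and not the actual VyOS code: catch the EPERM raised by the sysfs write and re-raise it with a hint pointing at the kernel log. The names SysfsWriteError, describe_sysfs_error and write_sysfs are hypothetical; only the write pattern (open the file, write str(value)) mirrors what the traceback above shows.

```python
import errno


class SysfsWriteError(Exception):
    """Hypothetical exception carrying a more descriptive message."""


def describe_sysfs_error(err, path, value):
    """Return a hint for errors we can explain, or None for anything else."""
    if err.errno == errno.EPERM:
        return (
            f"kernel rejected writing '{value}' to {path} "
            "(Operation not permitted); this usually comes from the device "
            "driver rather than file permissions, check dmesg for details"
        )
    return None


def write_sysfs(path, value):
    """Write value to a sysfs file, translating EPERM into a clearer error."""
    try:
        # The kernel may reject the value on write or on flush at close,
        # so the whole 'with' block stays inside the try.
        with open(path, "w") as f:
            f.write(str(value))
    except OSError as err:
        hint = describe_sysfs_error(err, path, value)
        if hint is None:
            raise  # unknown failure: keep the original exception
        raise SysfsWriteError(hint) from err
```

With something like this, the failing commit above would point the user to dmesg, where the set_mac_address error is visible, instead of ending in a bare PermissionError.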

erkin set Issue type to Bug (incorrect behavior). Aug 29 2021, 11:58 AM
erkin removed a subscriber: Active contributors.