Add support for all-active Ethernet Multihoming as an alternative implementation to classical MLAG using our EVPN capabilities
Description
Details
- Version
- -
- Is it a breaking change?
- Perfectly compatible
Related Objects
- Mentioned In
- rVYOSONEXf1b577eb4c41: op-mode: bgp: T5698: fix "rd" route-distinguisher help string
- rVYOSONEX2fc97aea3814: op-mode: bgp: T5698: add "es-vrf" and "next-hops" CLI commands
- rVYOSONEX527928154c91: Merge pull request #2445 from c-po/sagitta
- rVYOSONEX7a4d59acaf62: Merge pull request #2444 from vyos/mergify/bp/sagitta/pr-2416
- rVYOSONEXf4b1df3c8407: op-mode: bgp: T5698: add "es-vrf" and "next-hops" CLI commands
- rVYOSONEX43288b57d8dc: op-mode: bgp: T5698: fix "rd" route-distinguisher help string
- rVYOSONEX91a65d295550: bgp: T5698: add support for EVPN Multihoming
- rVYOSONEX062ac6bc4c04: bond: T5698: add support for EVPN Multihoming
- rVYOSONEX937685608e61: bond: T5698: add support for EVPN Multihoming
- rVYOSONEX1d67620e6567: bgp: T5698: add support for EVPN Multihoming
- rVYOSONEX031a5c8a1b1a: Merge pull request #2416 from c-po/evpn-mh-t5698
Event Timeline
Both single-active and all-active should be supported when it comes to EVPN Multihoming, along with preempt and dont-preempt for single-active.
It would also be handy if the documentation stated the upper limit on the number of devices that can share the same ESI with VyOS (if all of them are VyOS boxes).
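For reference, the knobs FRR (the routing stack VyOS builds on) exposes for EVPN MH today are geared towards all-active operation, and the single-active and preempt/dont-preempt settings asked for above appear to be exactly what is missing. A minimal interface-level sketch of the existing FRR settings, with illustrative values only (the es-id and system MAC below are made up; all peers sharing the ESI must agree on both):

! illustrative FRR configuration for an all-active Ethernet Segment
interface bond0
 ! local discriminator (1-16777215); together with the system MAC
 ! below it forms the type-3 ESI for this segment
 evpn mh es-id 3
 ! system MAC shared by all peers on the segment
 evpn mh es-sys-mac 02:42:12:30:00:20
 ! designated-forwarder election preference (higher is preferred)
 evpn mh es-df-pref 50000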
@Apachez this request stems from this issue: https://forum.vyos.io/t/evpn-vxlan-esi-lag-duplicate-packets/12581
In this case, the end devices are connected to Juniper switches using ESI-LAG (config in the post); VyOS itself isn't using ESI-LAG at all (it just has a pair of NICs to the Juniper switches and it's using ECMP with OSPF/BGP). The EVPN routes for the host 10.100.65.132 (which is connected to the ESI-LAG) look like this:
* i[2]:[40302]:[48]:[26:72:5b:39:36:e3]:[32]:[10.100.65.132] 10.100.0.2 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8
* i 10.100.0.2 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8
* i 10.100.0.2 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8
*>i 10.100.0.2 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8
* i 10.100.0.2 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8
* i 10.100.0.2 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8
....
* i[2]:[40302]:[48]:[26:72:5b:39:36:e3]:[32]:[10.100.65.132] 10.100.0.3 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8 ND:Proxy
* i 10.100.0.3 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8 ND:Proxy
* i 10.100.0.3 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8 ND:Proxy
* i 10.100.0.3 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8 ND:Proxy
*>i 10.100.0.3 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8 ND:Proxy
* i 10.100.0.3 0 0 i ESI:00:42:12:30:00:00:00:20:00:03 RT:39136:268475758 ET:8 ND:Proxy
Above, 10.100.0.2 and 10.100.0.3 are the two switches the host is connected to.
Any traffic to 10.100.65.132 from VyOS shows it leaving the vxlan interface once but getting duplicated and sent to both 10.100.0.2 and 10.100.0.3.
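One guess at the mechanism (not confirmed anywhere in this thread): if zebra learns the ESI-attached MAC from both VTEPs and installs both as destinations for it in the VXLAN FDB, the kernel will replicate unicast frames to every dst entry. That is easy to check on the VyOS box with iproute2; the MAC is taken from this example and the two entries shown are hypothetical:

# list remote destinations for the end host's MAC on the VTEP
bridge fdb show dev vxlan0 | grep 26:72:5b:39:36:e3
# two entries like these would explain the duplication:
#   26:72:5b:39:36:e3 dst 10.100.0.2 self extern_learn
#   26:72:5b:39:36:e3 dst 10.100.0.3 self extern_learn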
@shthead: Yes, but when it comes to multihoming there are some additional settings that should exist as well:
https://www.arista.com/en/um-eos/eos-vxlan-configuration#topic_ckc_dh4_ynb
redundancy single-active
vs.
redundancy all-active
but also:
designated-forwarder election hold-time
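For what it's worth, the closest equivalent I'm aware of in FRR is a global startup delay rather than a per-segment hold-time, and I don't know whether VyOS exposes it on its own CLI; as a rough analog only:

! FRR global setting: do not take over as designated forwarder
! for this many seconds after startup (rough analog of a DF
! election hold-time)
evpn mh startup-delay 180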
If you check the forum post, the interface on the Juniper side is configured with all-active (under the ESI configuration for the ae interface). On the QFX 5120 platform you cannot set the "single-active" option for the interface; only all-active is available.
but also:
designated-forwarder election hold-time
The designated forwarder election timer doesn't impact the above scenario (at least for Juniper).
@shthead: I'm talking about features in VyOS. I don't care what others such as Juniper do or don't do.
When doing EVPN Multihoming you also need the ability to select how your end of the EVPN Multihoming should function when it comes to single-active vs all-active. This also matters for interoperability, where the MH setup might consist of more than one vendor/model.
You also need to be able to set how long a dead time you want via the "designated-forwarder election hold-time".
So this is just my €0.05 when it comes to support for EVPN Multihoming in VyOS. Without the ability to adjust the above (single/all-active and election hold-time), EVPN MH is, to me, nonexistent in VyOS. Arista, for example, struggled for some time before they got their EVPN MH to work as expected with the expected features (plenty of features were missing in the first years).
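To make the request concrete, hypothetical FRR-style syntax for the two missing knobs might look like the following; neither command exists today, this is purely illustrative:

interface bond0
 ! hypothetical: select the redundancy mode per Ethernet Segment
 evpn mh redundancy single-active
 ! hypothetical: per-segment dead time before a new DF election
 evpn mh df-election hold-time 10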
I think we may be talking about different things. The situation I am testing has a pair of QFX switches. An end server has a pair of interfaces in a bond, with each interface going to one QFX. VyOS is configured with an interface to each QFX but not in a bond (just two different VLANs). The designated forwarder election for the segment to the end server is handled by the QFXs.
If both links from the server to the QFXs are up, traffic exiting VyOS towards the end server is duplicated over both interfaces it has to the QFXs. Here is what a tcpdump of a ping looks like (this happens for any protocol, though):
01:37:35.736682 eth1 In IP 10.100.0.2.34705 > 10.100.0.10.4789: VXLAN, flags [I] (0x08), vni 302
IP 10.100.65.132 > 10.100.65.135: ICMP echo request, id 30619, seq 1, length 64
01:37:35.736683 eth1.2018 In IP 10.100.0.2.34705 > 10.100.0.10.4789: VXLAN, flags [I] (0x08), vni 302
IP 10.100.65.132 > 10.100.65.135: ICMP echo request, id 30619, seq 1, length 64
01:37:35.736719 vxlan0 P IP 10.100.65.132 > 10.100.65.135: ICMP echo request, id 30619, seq 1, length 64
01:37:35.736726 br0 In IP 10.100.65.132 > 10.100.65.135: ICMP echo request, id 30619, seq 1, length 64
01:37:35.736727 br0.302 In IP 10.100.65.132 > 10.100.65.135: ICMP echo request, id 30619, seq 1, length 64
01:37:35.736746 br0.302 Out IP 10.100.65.135 > 10.100.65.132: ICMP echo reply, id 30619, seq 1, length 64
01:37:35.736748 br0 Out IP 10.100.65.135 > 10.100.65.132: ICMP echo reply, id 30619, seq 1, length 64
01:37:35.736752 vxlan0 Out IP 10.100.65.135 > 10.100.65.132: ICMP echo reply, id 30619, seq 1, length 64
01:37:35.736764 eth0.2019 Out IP 10.100.0.10.53993 > 10.100.0.3.4789: VXLAN, flags [I] (0x08), vni 302
IP 10.100.65.135 > 10.100.65.132: ICMP echo reply, id 30619, seq 1, length 64
01:37:35.736767 eth0 Out IP 10.100.0.10.53993 > 10.100.0.3.4789: VXLAN, flags [I] (0x08), vni 302
IP 10.100.65.135 > 10.100.65.132: ICMP echo reply, id 30619, seq 1, length 64
01:37:35.736774 eth1.2018 Out IP 10.100.0.10.53993 > 10.100.0.2.4789: VXLAN, flags [I] (0x08), vni 302
IP 10.100.65.135 > 10.100.65.132: ICMP echo reply, id 30619, seq 1, length 64
01:37:35.736776 eth1 Out IP 10.100.0.10.53993 > 10.100.0.2.4789: VXLAN, flags [I] (0x08), vni 302
IP 10.100.65.135 > 10.100.65.132: ICMP echo reply, id 30619, seq 1, length 64
The IPs/interfaces in that tcpdump are:
- 10.100.0.2 QFX1 loopback
- 10.100.0.3 QFX2 loopback
- 10.100.0.10 VyOS loopback
- eth1 Trunk port to QFX1
- eth0 Trunk port to QFX2
- eth1.2018 Point to point to QFX1
- eth0.2019 Point to point to QFX2
- 10.100.65.135/27 br0.302 interface on VyOS
- 10.100.65.132/27 bond0 on end server (connected to both QFX)
- vxlan0 VTEP on VyOS with VNI 302 to VLAN 302
If I take one of the interfaces down on the end server (so only a single link to a QFX is up), traffic is no longer duplicated. The issue happens for traffic forwarded through VyOS as well.
If I go to a completely separate network device, I can add an L3 interface for VLAN/VNI 302 and use that for routing traffic, and the duplicate traffic issue disappears; it's something in how VyOS handles the traffic that causes this.
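For anyone reproducing this: the op-mode work referenced above (the "es-vrf" and "next-hops" CLI commands) maps onto FRR show commands that reveal how the segment and its next-hops are resolved. Assuming direct vtysh access on the VyOS box:

show evpn es detail          # Ethernet Segments known to zebra, incl. DF state
show evpn next-hops vni all  # ES next-hop groups per VNI
show bgp l2vpn evpn es       # type-4 (Ethernet Segment) routes in BGP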