I use two VyOS Appliances which connect to an Active-Active AZURE VPNGw type1.
Each VyOS appliance has 8 IPSec tunnels: two tunnels to each of the 4 Azure VPN Gateways.
I consulted not only the Azure example configurations for Cisco and Ubiquiti EdgeOS but also the guide at https://cloudnetworking.io/2019/08/21/azure-vpn-vyos/
There are several posts describing this issue:
- https://forum.vyos.io/t/interface-vti0-deactivated-when-ike-sa-delete-failed/2053/10
- https://community.ui.com/questions/AWS-IPSec-VPN-w-VTI-unable-to-keep-stable-connection/dcbbdafe-16d0-4649-ade9-655376452239
I now work around the problem with a custom shell script which checks every VTI for the A/D state and resets the matching tunnel on demand:
set system task-scheduler task azure-reset executable path '/root/reset_azure.sh'
set system task-scheduler task azure-reset interval '1m'
vyos@vyos:~$ cat /root/reset_azure.sh
#!/bin/vbash
source /opt/vyatta/etc/functions/script-template
for vti in $(run show interfaces vti | grep "A/D" | awk '{print $1}')
do
    tunnel=$(run show configuration commands | grep "bind '$vti'" | awk '{print $6}')
    logger "Resetting IPSec tunnel $vti -> $tunnel"
    run reset vpn ipsec-peer $tunnel
done
exit
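The peer lookup in the script relies on grep and on the peer address being the sixth word of the matching configuration command. A slightly stricter variant could compare whole fields instead. The following is only a sketch: the helper name peer_for_vti is hypothetical, and it assumes the bind commands have the form set vpn ipsec site-to-site peer <address> vti bind '<vti>', which is what the script's use of field 6 suggests.

```shell
#!/bin/sh
# Sketch: print the IPsec peer bound to a given VTI.
# peer_for_vti is a hypothetical helper; it reads the output of
# "show configuration commands" on stdin and takes the VTI name as $1.
# Assumed line format (inferred from the script, not verified):
#   set vpn ipsec site-to-site peer <address> vti bind '<vti>'
peer_for_vti() {
    awk -v vti="$1" -v q="'" '
        # Compare whole fields so that only "vpn ipsec ... peer ... bind"
        # commands whose last field is exactly the quoted VTI name match.
        $1 == "set" && $2 == "vpn" && $3 == "ipsec" && $5 == "peer" &&
        $(NF-1) == "bind" && $NF == q vti q { print $6 }
    '
}
```

Inside the reset script this could be used as tunnel=$(run show configuration commands | peer_for_vti "$vti").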
Looking at the logs from the most recent hang, one can see that reauthentication works perfectly on vti42, but vti22 does not come up again and the script above kicks in:
Dec 14 13:46:01 VMU-02-AZURE CRON[27183]: pam_unix(cron:session): session opened for user root by (uid=0)
Dec 14 13:46:01 VMU-02-AZURE CRON[27184]: (root) CMD (sg vyattacfg "/root/reset_azure.sh")
Dec 14 13:46:01 VMU-02-AZURE sg[27184]: user 'root' (login 'root' on ???) switched to group 'vyattacfg'
Dec 14 13:46:01 VMU-02-AZURE sg[27184]: user 'root' (login 'root' on ???) returned to group 'root'
Dec 14 13:46:01 VMU-02-AZURE CRON[27183]: pam_unix(cron:session): session closed for user root
Dec 14 13:46:25 VMU-02-AZURE charon[2247]: 07[IKE] reauthenticating IKE_SA peer-xxx.xxx.229.18-tunnel-vti[71]
Dec 14 13:46:25 VMU-02-AZURE charon[2247]: 07[IKE] deleting IKE_SA peer-xxx.xxx.229.18-tunnel-vti[71] between zzz.zzz.32.190[zzz.zzz.32.190]...xxx.xxx.229.18[xxx.xxx.229.18]
Dec 14 13:46:25 VMU-02-AZURE charon[2247]: 06[IKE] IKE_SA deleted
Dec 14 13:46:25 VMU-02-AZURE sudo[27376]: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/ip link set vti42 down
Dec 14 13:46:25 VMU-02-AZURE sudo[27376]: pam_unix(sudo:session): session opened for user root by (uid=0)
Dec 14 13:46:25 VMU-02-AZURE netplugd[974]: vti42: ignoring event
Dec 14 13:46:25 VMU-02-AZURE sudo[27376]: pam_unix(sudo:session): session closed for user root
Dec 14 13:46:25 VMU-02-AZURE charon[2247]: 06[IKE] initiating IKE_SA peer-xxx.xxx.229.18-tunnel-vti[78] to xxx.xxx.229.18
Dec 14 13:46:25 VMU-02-AZURE charon[2247]: 05[IKE] establishing CHILD_SA peer-xxx.xxx.229.18-tunnel-vti{101720} reqid 5
Dec 14 13:46:25 VMU-02-AZURE charon[2247]: 04[IKE] IKE_SA peer-xxx.xxx.229.18-tunnel-vti[78] established between zzz.zzz.32.190[zzz.zzz.32.190]...xxx.xxx.229.18[xxx.xxx.229.18]
Dec 14 13:46:25 VMU-02-AZURE charon[2247]: 04[IKE] CHILD_SA peer-xxx.xxx.229.18-tunnel-vti{101720} established with SPIs c8023ec0_i b41943aa_o and TS 0.0.0.0/0 === 0.0.0.0/0
Dec 14 13:46:25 VMU-02-AZURE sudo[27388]: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/ip link set vti42 up
Dec 14 13:46:25 VMU-02-AZURE sudo[27388]: pam_unix(sudo:session): session opened for user root by (uid=0)
Dec 14 13:46:25 VMU-02-AZURE netplugd[974]: vti42: ignoring event
Dec 14 13:46:25 VMU-02-AZURE sudo[27388]: pam_unix(sudo:session): session closed for user root
Dec 14 13:46:39 VMU-02-AZURE charon[2247]: 15[IKE] reauthenticating IKE_SA peer-yyy.yyy.89.238-tunnel-vti[74] actively
Dec 14 13:46:39 VMU-02-AZURE charon[2247]: 15[IKE] deleting IKE_SA peer-yyy.yyy.89.238-tunnel-vti[74] between zzz.zzz.32.190[zzz.zzz.32.190]...yyy.yyy.89.238[yyy.yyy.89.238]
Dec 14 13:46:39 VMU-02-AZURE charon[2247]: 08[IKE] IKE_SA deleted
Dec 14 13:46:39 VMU-02-AZURE sudo[27398]: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/ip link set vti22 down
Dec 14 13:46:39 VMU-02-AZURE sudo[27398]: pam_unix(sudo:session): session opened for user root by (uid=0)
Dec 14 13:46:39 VMU-02-AZURE netplugd[974]: vti22: ignoring event
Dec 14 13:46:39 VMU-02-AZURE sudo[27398]: pam_unix(sudo:session): session closed for user root
Dec 14 13:46:39 VMU-02-AZURE charon[2247]: 08[IKE] initiating IKE_SA peer-yyy.yyy.89.238-tunnel-vti[79] to yyy.yyy.89.238
Dec 14 13:46:39 VMU-02-AZURE charon[2247]: 14[IKE] establishing CHILD_SA peer-yyy.yyy.89.238-tunnel-vti{101721} reqid 7
Dec 14 13:46:40 VMU-02-AZURE charon[2247]: 10[IKE] IKE_SA peer-yyy.yyy.89.238-tunnel-vti[79] established between zzz.zzz.32.190[zzz.zzz.32.190]...yyy.yyy.89.238[yyy.yyy.89.238]
Dec 14 13:46:40 VMU-02-AZURE charon[2247]: 10[IKE] CHILD_SA peer-yyy.yyy.89.238-tunnel-vti{101721} established with SPIs ce67c986_i 7890fcd6_o and TS 0.0.0.0/0 === 0.0.0.0/0
Dec 14 13:46:40 VMU-02-AZURE sudo[27410]: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/ip link set vti22 up
Dec 14 13:46:40 VMU-02-AZURE sudo[27410]: pam_unix(sudo:session): session opened for user root by (uid=0)
Dec 14 13:46:40 VMU-02-AZURE netplugd[974]: vti22: ignoring event
Dec 14 13:46:40 VMU-02-AZURE sudo[27410]: pam_unix(sudo:session): session closed for user root
Dec 14 13:46:40 VMU-02-AZURE charon[2247]: 06[IKE] closing CHILD_SA peer-yyy.yyy.89.238-tunnel-vti{101712} with SPIs c24cfd41_i (364 bytes) 58386366_o (2338 bytes) and TS 0.0.0.0/0 === 0.0.0.0/0
Dec 14 13:46:40 VMU-02-AZURE sudo[27420]: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/ip link set vti22 down
Dec 14 13:46:40 VMU-02-AZURE sudo[27420]: pam_unix(sudo:session): session opened for user root by (uid=0)
Dec 14 13:46:40 VMU-02-AZURE netplugd[974]: vti22: ignoring event
Dec 14 13:46:40 VMU-02-AZURE sudo[27420]: pam_unix(sudo:session): session closed for user root
Dec 14 13:46:40 VMU-02-AZURE charon[2247]: 08[IKE] deleting IKE_SA peer-yyy.yyy.89.238-tunnel-vti[73] between zzz.zzz.32.190[zzz.zzz.32.190]...yyy.yyy.89.238[yyy.yyy.89.238]
Dec 14 13:46:40 VMU-02-AZURE charon[2247]: 08[IKE] IKE_SA deleted
Dec 14 13:46:41 VMU-02-AZURE ntpd[1634]: Deleting interface #26 vti22, fe80::200:5efe:50f6:20be#123, interface stats: received=0, sent=0, dropped=0, active_time=106678 secs
Dec 14 13:46:41 VMU-02-AZURE ntpd[1634]: peers refreshed
Dec 14 13:47:01 VMU-02-AZURE CRON[27422]: pam_unix(cron:session): session opened for user root by (uid=0)
Dec 14 13:47:01 VMU-02-AZURE CRON[27423]: (root) CMD (sg vyattacfg "/root/reset_azure.sh")
Dec 14 13:47:01 VMU-02-AZURE sg[27423]: user 'root' (login 'root' on ???) switched to group 'vyattacfg'
Dec 14 13:47:02 VMU-02-AZURE root[27718]: Resetting IPSec tunnel vti22 -> yyy.yyy.89.238
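I have labbed the issue (and hooked a tcpdump on it), but unfortunately without success so far.
At first glance it looks like a race condition: after the new CHILD_SA is established and vti22 is set up, closing the old CHILD_SA triggers another ip link set vti22 down, and the interface stays down.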
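To spot this pattern in a longer log without reading every line, a small filter can report the last ip link set command issued for each VTI. This is only a sketch: the function name last_vti_state is hypothetical, and it assumes the sudo log lines end with the interface name followed by up or down, as in the excerpt above.

```shell
#!/bin/sh
# Sketch: print the last "ip link set <vti> up|down" seen per VTI.
# last_vti_state is a hypothetical helper; it reads syslog lines on stdin.
last_vti_state() {
    awk '
        /COMMAND=\/sbin\/ip link set vti[0-9]+ (up|down)[ \t]*$/ {
            # Interface and state are the last two fields of the line;
            # later lines overwrite earlier ones, so the final value wins.
            state[$(NF-1)] = $NF
        }
        END {
            for (ifname in state) print ifname, state[ifname]
        }
    ' | sort
}
```

Fed with the log excerpt above (for example via last_vti_state < logfile, where logfile is wherever the syslog output is stored), an interface whose last command was down even though its IKE_SA and CHILD_SA were re-established is a candidate for the race.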