Page MenuHomeVyOS Platform

Protocol failover stopped working after suspend + resume
Closed, WontfixPublicBUG

Description

I'm building a lab with VyOS in the GNS3 on my laptop computer and have faced strange VyOS behavior in condition where the laptop goes sleep and then being woked up after a couple of minutes.

I've noticed that a VyOS VMs with configured failover protocol can lost all static routes in such case.

The problem can appears on first sleep/resume cycle but can appears only on second (and maybe third-forth) cycle.

Here is one my VM, called ISP2. It's running latest 1.4 nightly build.
The VM is in problem state now after being resumed.

The VM has configured static routes:

vyos@ISP2# show protocols static | commands 
set route 1.1.1.104/32 next-hop 192.168.14.1
set route 1.1.1.105/32 next-hop 192.168.15.1
set route 1.1.1.204/32 next-hop 192.168.14.1
set route 1.1.1.205/32 next-hop 192.168.15.1

But ip route command wont show these routes:

vyos@ISP2# ip r
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.2 
192.168.14.0/24 dev eth1 proto kernel scope link src 192.168.14.2 
192.168.15.0/24 dev eth2 proto kernel scope link src 192.168.15.2

If I'll change the static route config (e.g. add or remove a static route) - ip route command will show all of the routes (as it should be):

vyos@ISP2# set protocols static route 2.2.2.2/32 blackhole 
[edit]
vyos@ISP2# commit
[edit]
vyos@ISP2# ip r
1.1.1.104 nhid 49 via 192.168.14.1 dev eth1 proto static metric 20 
1.1.1.105 nhid 50 via 192.168.15.1 dev eth2 proto static metric 20 
1.1.1.204 nhid 49 via 192.168.14.1 dev eth1 proto static metric 20 
1.1.1.205 nhid 50 via 192.168.15.1 dev eth2 proto static metric 20 
blackhole 2.2.2.2 proto static metric 20 
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.2 
192.168.14.0/24 dev eth1 proto kernel scope link src 192.168.14.2 
192.168.15.0/24 dev eth2 proto kernel scope link src 192.168.15.2

After the laptop being suspended and resumed *twice* (first time for 6 minutes, second time for 3 minutes) the problem appears:

vyos@ISP2# ip r
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.2 
192.168.14.0/24 dev eth1 proto kernel scope link src 192.168.14.2 
192.168.15.0/24 dev eth2 proto kernel scope link src 192.168.15.2

I've tried to strace the vyos-failover.py and found that the script trying to open /tmp/vyos.log.debug and /tmp/vyos.command.debug files. After I've created these (empty) files, new file was created: /tmp/developer-log.

Part of /tmp/developer-log:

vyos@ISP2# nl developer-log | head -n 250
     1  2023-12-21 12:17:45 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
     2  2023-12-21 12:17:45 DEBUG/LOG     returned (out):
     3  [{"dst":"default","flags":[]}]
     4  2023-12-21 12:17:45 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
     5  2023-12-21 12:17:46 DEBUG/LOG     returned (out):
     6  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
     7  --- 1.1.1.105 ping statistics ---
     8  2 packets transmitted, 2 received, 0% packet loss, time 1002ms
     9  rtt min/avg/max/mdev = 15.628/35.215/54.803/19.587 ms
    10  2023-12-21 12:17:56 DEBUG/LOG     cmd 'ip --detail --json route show protocol failover 0.0.0.0/0'
    11  2023-12-21 12:17:56 DEBUG/LOG     returned (out):
    12  [{"type":"unicast","dst":"default","gateway":"192.168.15.1","dev":"eth2","scope":"global","metric":1,"flags":[]}]
    13  2023-12-21 12:17:56 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
    14  2023-12-21 12:17:56 DEBUG/LOG     returned (out):
    15  []
    16  2023-12-21 12:17:56 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
    17  2023-12-21 12:17:57 DEBUG/LOG     returned (out):
    18  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
    19  --- 1.1.1.104 ping statistics ---
    20  2 packets transmitted, 2 received, 0% packet loss, time 1001ms
    21  rtt min/avg/max/mdev = 14.582/14.894/15.206/0.312 ms
    22  2023-12-21 12:17:57 DEBUG/LOG     cmd 'ip route add 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1 proto failover'
    23  2023-12-21 12:17:57 DEBUG/LOG     returned (out):
    24  RTNETLINK answers: File exists
    25  2023-12-21 12:17:57 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
    26  2023-12-21 12:17:58 DEBUG/LOG     returned (out):
    27  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
    28  --- 1.1.1.104 ping statistics ---
    29  2 packets transmitted, 2 received, 0% packet loss, time 1002ms
    30  rtt min/avg/max/mdev = 13.189/14.417/15.645/1.228 ms
    31  2023-12-21 12:18:08 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
    32  2023-12-21 12:18:08 DEBUG/LOG     returned (out):
    33  [{"dst":"default","flags":[]}]
    34  2023-12-21 12:18:08 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
    35  2023-12-21 12:18:09 DEBUG/LOG     returned (out):
    36  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
    37  --- 1.1.1.105 ping statistics ---
    38  2 packets transmitted, 2 received, 0% packet loss, time 1002ms
    39  rtt min/avg/max/mdev = 13.573/14.959/16.345/1.386 ms
    40  2023-12-21 12:18:19 DEBUG/LOG     cmd 'ip --detail --json route show protocol failover 0.0.0.0/0'
    41  2023-12-21 12:18:19 DEBUG/LOG     returned (out):
    42  [{"type":"unicast","dst":"default","gateway":"192.168.15.1","dev":"eth2","scope":"global","metric":1,"flags":[]}]
    43  2023-12-21 12:18:19 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
    44  2023-12-21 12:18:19 DEBUG/LOG     returned (out):
    45  []
    46  2023-12-21 12:18:19 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
    47  2023-12-21 12:18:20 DEBUG/LOG     returned (out):
    48  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
    49  --- 1.1.1.104 ping statistics ---
    50  2 packets transmitted, 2 received, 0% packet loss, time 1002ms
    51  rtt min/avg/max/mdev = 14.626/14.629/14.633/0.003 ms
    52  2023-12-21 12:18:20 DEBUG/LOG     cmd 'ip route add 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1 proto failover'
    53  2023-12-21 12:18:20 DEBUG/LOG     returned (out):
    54  RTNETLINK answers: File exists
    55  2023-12-21 12:18:20 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
    56  2023-12-21 12:18:21 DEBUG/LOG     returned (out):
    57  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
    58  --- 1.1.1.104 ping statistics ---
    59  2 packets transmitted, 2 received, 0% packet loss, time 1001ms
    60  rtt min/avg/max/mdev = 13.061/13.909/14.757/0.848 ms
    61  2023-12-21 12:18:31 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
    62  2023-12-21 12:18:31 DEBUG/LOG     returned (out):
    63  [{"dst":"default","flags":[]}]
    64  2023-12-21 12:18:31 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
    65  2023-12-21 12:18:32 DEBUG/LOG     returned (out):
    66  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
    67  --- 1.1.1.105 ping statistics ---
    68  2 packets transmitted, 2 received, 0% packet loss, time 1001ms
    69  rtt min/avg/max/mdev = 14.959/15.430/15.902/0.471 ms
    70  2023-12-21 12:25:52 DEBUG/LOG     cmd 'ip --detail --json route show protocol failover 0.0.0.0/0'
    71  2023-12-21 12:25:53 DEBUG/LOG     returned (out):
    72  [{"type":"unicast","dst":"default","gateway":"192.168.15.1","dev":"eth2","scope":"global","metric":1,"flags":[]}]
    73  2023-12-21 12:25:53 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
    74  2023-12-21 12:25:53 DEBUG/LOG     returned (out):
    75  []
    76  2023-12-21 12:25:53 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
    77  2023-12-21 12:25:54 DEBUG/LOG     returned (out):
    78  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
    79  --- 1.1.1.104 ping statistics ---
    80  2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1001ms
    81  2023-12-21 12:25:54 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
    82  2023-12-21 12:25:56 DEBUG/LOG     returned (out):
    83  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
    84  --- 1.1.1.104 ping statistics ---
    85  2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 1038ms
    86  2023-12-21 12:25:56 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
    87  2023-12-21 12:25:56 DEBUG/LOG     returned (out):
    88  []
    89  2023-12-21 12:26:06 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
    90  2023-12-21 12:26:06 DEBUG/LOG     returned (out):
    91  [{"dst":"default","flags":[]}]
    92  2023-12-21 12:26:06 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
    93  2023-12-21 12:26:07 DEBUG/LOG     returned (out):
    94  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
    95  --- 1.1.1.105 ping statistics ---
    96  2 packets transmitted, 2 received, 0% packet loss, time 1002ms
    97  rtt min/avg/max/mdev = 13.489/13.770/14.051/0.281 ms
    98  2023-12-21 12:26:17 DEBUG/LOG     cmd 'ip --detail --json route show protocol failover 0.0.0.0/0'
    99  2023-12-21 12:26:17 DEBUG/LOG     returned (out):
   100  [{"type":"unicast","dst":"default","gateway":"192.168.15.1","dev":"eth2","scope":"global","metric":1,"flags":[]}]
   101  2023-12-21 12:26:17 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
   102  2023-12-21 12:26:17 DEBUG/LOG     returned (out):
   103  []
   104  2023-12-21 12:26:17 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
   105  2023-12-21 12:26:18 DEBUG/LOG     returned (out):
   106  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
   107  --- 1.1.1.104 ping statistics ---
   108  2 packets transmitted, 2 received, 0% packet loss, time 1002ms
   109  rtt min/avg/max/mdev = 12.893/13.859/14.826/0.966 ms
   110  2023-12-21 12:26:18 DEBUG/LOG     cmd 'ip route add 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1 proto failover'
   111  2023-12-21 12:26:18 DEBUG/LOG     returned (out):
   112  RTNETLINK answers: File exists
   113  2023-12-21 12:26:18 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
   114  2023-12-21 12:26:20 DEBUG/LOG     returned (out):
   115  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
   116  --- 1.1.1.104 ping statistics ---
   117  2 packets transmitted, 2 received, 0% packet loss, time 1002ms
   118  rtt min/avg/max/mdev = 12.816/13.749/14.682/0.933 ms
   119  2023-12-21 12:30:38 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
   120  2023-12-21 12:30:38 DEBUG/LOG     returned (out):
   121  [{"dst":"default","flags":[]}]
   122  2023-12-21 12:30:38 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
   123  2023-12-21 12:30:40 DEBUG/LOG     returned (out):
   124  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
   125  --- 1.1.1.105 ping statistics ---
   126  2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 1002ms
   127  2023-12-21 12:30:40 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
   128  2023-12-21 12:30:40 DEBUG/LOG     returned (out):
   129  [{"dst":"default","flags":[]}]
   130  2023-12-21 12:30:40 DEBUG/LOG     cmd 'ip route del 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1 proto failover'
   131  2023-12-21 12:30:50 DEBUG/LOG     cmd 'ip --detail --json route show protocol failover 0.0.0.0/0'
   132  2023-12-21 12:30:50 DEBUG/LOG     returned (out):
   133  []
   134  2023-12-21 12:30:50 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
   135  2023-12-21 12:30:50 DEBUG/LOG     returned (out):
   136  []
   137  2023-12-21 12:30:50 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
   138  2023-12-21 12:30:52 DEBUG/LOG     returned (out):
   139  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
   140  --- 1.1.1.104 ping statistics ---
   141  2 packets transmitted, 0 received, 100% packet loss, time 1043ms
   142  2023-12-21 12:30:52 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
   143  2023-12-21 12:30:54 DEBUG/LOG     returned (out):
   144  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
   145  --- 1.1.1.104 ping statistics ---
   146  2 packets transmitted, 0 received, 100% packet loss, time 1044ms
   147  2023-12-21 12:30:54 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
   148  2023-12-21 12:30:54 DEBUG/LOG     returned (out):
   149  []
   150  2023-12-21 12:31:04 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
   151  2023-12-21 12:31:04 DEBUG/LOG     returned (out):
   152  []
   153  2023-12-21 12:31:04 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
   154  2023-12-21 12:31:06 DEBUG/LOG     returned (out):
   155  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
   156  --- 1.1.1.105 ping statistics ---
   157  2 packets transmitted, 0 received, 100% packet loss, time 1023ms
   158  2023-12-21 12:31:06 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
   159  2023-12-21 12:31:08 DEBUG/LOG     returned (out):
   160  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
   161  --- 1.1.1.105 ping statistics ---
   162  2 packets transmitted, 0 received, 100% packet loss, time 1038ms
   163  2023-12-21 12:31:08 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
   164  2023-12-21 12:31:08 DEBUG/LOG     returned (out):
   165  []
   166  2023-12-21 12:31:18 DEBUG/LOG     cmd 'ip --detail --json route show protocol failover 0.0.0.0/0'
   167  2023-12-21 12:31:18 DEBUG/LOG     returned (out):
   168  []
   169  2023-12-21 12:31:18 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
   170  2023-12-21 12:31:18 DEBUG/LOG     returned (out):
   171  []
   172  2023-12-21 12:31:18 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
   173  2023-12-21 12:31:20 DEBUG/LOG     returned (out):
   174  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
   175  --- 1.1.1.104 ping statistics ---
   176  2 packets transmitted, 0 received, 100% packet loss, time 1061ms
   177  2023-12-21 12:31:20 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
   178  2023-12-21 12:31:22 DEBUG/LOG     returned (out):
   179  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
   180  --- 1.1.1.104 ping statistics ---
   181  2 packets transmitted, 0 received, 100% packet loss, time 1021ms
   182  pipe 2
   183  2023-12-21 12:31:22 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
   184  2023-12-21 12:31:22 DEBUG/LOG     returned (out):
   185  []
   186  2023-12-21 12:31:32 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
   187  2023-12-21 12:31:32 DEBUG/LOG     returned (out):
   188  []
   189  2023-12-21 12:31:32 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
   190  2023-12-21 12:31:34 DEBUG/LOG     returned (out):
   191  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
   192  --- 1.1.1.105 ping statistics ---
   193  2 packets transmitted, 0 received, 100% packet loss, time 1021ms
   194  2023-12-21 12:31:34 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
   195  2023-12-21 12:31:36 DEBUG/LOG     returned (out):
   196  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.
       
   197  --- 1.1.1.105 ping statistics ---
   198  2 packets transmitted, 0 received, 100% packet loss, time 1044ms
   199  2023-12-21 12:31:36 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
   200  2023-12-21 12:31:36 DEBUG/LOG     returned (out):
   201  []
   202  2023-12-21 12:31:46 DEBUG/LOG     cmd 'ip --detail --json route show protocol failover 0.0.0.0/0'
   203  2023-12-21 12:31:46 DEBUG/LOG     returned (out):
   204  []
   205  2023-12-21 12:31:46 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
   206  2023-12-21 12:31:46 DEBUG/LOG     returned (out):
   207  []
   208  2023-12-21 12:31:46 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
   209  2023-12-21 12:31:48 DEBUG/LOG     returned (out):
   210  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
   211  --- 1.1.1.104 ping statistics ---
   212  2 packets transmitted, 0 received, 100% packet loss, time 1017ms
   213  2023-12-21 12:31:48 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.104 -I eth1 -n -c 2 -W 1'
   214  2023-12-21 12:31:50 DEBUG/LOG     returned (out):
   215  PING 1.1.1.104 (1.1.1.104) from 192.168.14.2 eth1: 56(84) bytes of data.
       
   216  --- 1.1.1.104 ping statistics ---
   217  2 packets transmitted, 0 received, 100% packet loss, time 1044ms
   218  2023-12-21 12:31:50 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.14.1 dev eth1 metric 1'
   219  2023-12-21 12:31:50 DEBUG/LOG     returned (out):
   220  []
   221  2023-12-21 12:32:00 DEBUG/LOG     cmd 'ip --json route show protocol failover 0.0.0.0/0 via 192.168.15.1 dev eth2 metric 1'
   222  2023-12-21 12:32:00 DEBUG/LOG     returned (out):
   223  []
   224  2023-12-21 12:32:00 DEBUG/LOG     cmd '/usr/bin/ping -q 1.1.1.105 -I eth2 -n -c 2 -W 1'
   225  2023-12-21 12:32:02 DEBUG/LOG     returned (out):
   226  PING 1.1.1.105 (1.1.1.105) from 192.168.15.2 eth2: 56(84) bytes of data.

I've managed to reproduce the problem on two VMs (inside GNS3) with different VyOS version.

Software versions:

  • VyOS 1.4-rolling-202312201553
  • VyOS 1.4-rolling-202311021324
  • Host OS: Linux 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64 GNU/Linux
  • GNS3: 2.2.41

Config of the VM:

vyos@ISP2# show 
 interfaces {
     ethernet eth0 {
         address 10.0.0.2/24
         description MGMT
         hw-id 0c:41:45:fc:00:00
     }
     ethernet eth1 {
         address 192.168.14.2/24
         description WAN1
         hw-id 0c:41:45:fc:00:01
     }
     ethernet eth2 {
         address 192.168.15.2/24
         description WAN2
         hw-id 0c:41:45:fc:00:02
     }
     loopback lo {
     }
 }
 protocols {
     failover {
         route 0.0.0.0/0 {
             next-hop 192.168.14.1 {
                 check {
                     target 1.1.1.104
                 }
                 interface eth1
             }
             next-hop 192.168.15.1 {
                 check {
                     target 1.1.1.105
                 }
                 interface eth2
             }
         }
     }
     ospf {
         area 0 {
             network 10.0.0.0/16
         }
         parameters {
             router-id 10.0.0.2
         }
     }
     static {
         route 1.1.1.104/32 {
             next-hop 192.168.14.1 {
             }
         }
         route 1.1.1.105/32 {
             next-hop 192.168.15.1 {
             }
         }
         route 1.1.1.204/32 {
             next-hop 192.168.14.1 {
             }
         }
         route 1.1.1.205/32 {
             next-hop 192.168.15.1 {
             }
         }
     }
 }
 service {
     ntp {
         allow-client {
             address 0.0.0.0/0
             address ::/0
         }
         server time1.vyos.net {
         }
         server time2.vyos.net {
         }
         server time3.vyos.net {
         }
     }
 }
 system {
     config-management {
         commit-revisions 100
     }
     console {
         device ttyS0 {
             speed 115200
         }
     }
     host-name ISP2
     login {
         user vyos {
             authentication {
                 encrypted-password $6$YK13LxhA3f5LVtkB$q9xbGL1Lgi5Krd/f0ZsIAF6h3iLBSFKoDxMRHQUM.TyhuyJfkjnAbn1uFAh16NTEYE9K.DXIqhnSkJ4pzrmfu/
                 plaintext-password ""
             }
         }
     }
     name-server 8.8.8.8
     syslog {
         global {
             facility all {
                 level info
             }
             facility local7 {
                 level debug
             }
         }
     }
     time-zone Europe/Moscow
 }

Details

Difficulty level
Unknown (require assessment)
Version
1.4-rolling-202312201553
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Bug (incorrect behavior)

Event Timeline

@Viacheslav
First of all, thanks for the failover feature!
If you have enough time, please take a look at this bugreport.

Viacheslav triaged this task as Normal priority.Jan 20 2024, 2:03 PM

Try to check if the routes exist in the FRR config.

vtsyh -c "show run staticd"

Needs re-check as FRR was updated. Also, update GNS3 to the latest version.
It is not clear how to reproduce it without GNS3 on the some VM

Hi!
Recently I've note that this bug is not affects only protocol failover, but also protocol static routers.

Here is KVM VM with VyOS 1.4.0-epa3:

vyos@R1009# show protocols 
 static {
     route 0.0.0.0/0 {
         next-hop 10.1.3.1 {
         }
     }
 }
[edit]
vyos@R1009:~$ ip r
10.1.3.0/24 dev eth0 proto kernel scope link src 10.1.3.2
vyos@R1009:~$ vtysh -c "show run staticd"
Building configuration...

Current configuration:
!
frr version 9.1
frr defaults traditional
hostname R1009-User
log syslog
log facility local7
hostname R1009
service integrated-vtysh-config
!
end

To get the 0.0.0.0/0 works I need to to change a static routing setting (e.g. add a metric to the next-hop or add another static route).

I've modified the protocol ospf settings to check if it will affect static routing or not.
The static routing are not affected (still no route to 0.0.0.0/0):

vyos@R1009# set protocols ospf area 0 network 10.0.0.0/8
[edit]
vyos@R1009# commit
[edit]
vyos@R1009# ip r
10.1.3.0/24 dev eth0 proto kernel scope link src 10.1.3.2 
[edit]
vyos@R1009# ip r
10.0.0.3 nhid 28 via 10.1.3.1 dev eth0 proto ospf metric 20 
10.1.1.0/24 nhid 28 via 10.1.3.1 dev eth0 proto ospf metric 20 
10.1.2.0/24 nhid 28 via 10.1.3.1 dev eth0 proto ospf metric 20 
10.1.3.0/24 dev eth0 proto kernel scope link src 10.1.3.2 
[edit]

Logs from VM:

vyos@R1009:~$ show log | grep "Jun 15" | grep -v -e CRON -e logrotate -e atop -e tmpfiles
Jun 15 06:00:25 watchfrr[1381]: [HKQ2F-8D0MY][EC 100663315] Thread Starvation: {(thread *)0x56063bdf8940 arg=0x56063bdef930 timer  r=-3230.162  wakeup_send_echo() &(dmn)->t_wakeup from ../watchfrr/watchfrr.c:695} was scheduled to pop greater than 4s ago
Jun 15 06:00:25 bgpd[1409]: [HKQ2F-8D0MY][EC 100663315] Thread Starvation: {(thread *)0x55f26904bfd0 arg=0x0 timer  r=-3230.616  bgp_sync_label_manager() &bm->t_bgp_sync_label_manager from ../bgpd/bgp_labelpool.c:576} was scheduled to pop greater than 4s ago
Jun 15 06:00:25 pim6d[1441]: [HKQ2F-8D0MY][EC 100663315] Thread Starvation: {(thread *)0x55ddf2be1240 arg=0x55ddf2be15c0 timer  r=-3227.717  wheel_timer_thread() &wheel->timer from ../lib/wheel.c:50} was scheduled to pop greater than 4s ago
Jun 15 06:00:25 chronyd[2786]: Forward time jump detected!
Jun 15 10:05:40 chronyd[2786]: Forward time jump detected!
Jun 15 10:05:40 pim6d[1441]: [HKQ2F-8D0MY][EC 100663315] Thread Starvation: {(thread *)0x55ddf2cec1e0 arg=0x55ddf2be15c0 timer  r=-14684.690 wheel_timer_thread() &wheel->timer from ../lib/wheel.c:50} was scheduled to pop greater than 4s ago
Jun 15 10:05:40 watchfrr[1381]: [HKQ2F-8D0MY][EC 100663315] Thread Starvation: {(thread *)0x56063bdeebb0 arg=0x56063bdef690 timer  r=-14705.893 wakeup_send_echo() &(dmn)->t_wakeup from ../watchfrr/watchfrr.c:695} was scheduled to pop greater than 4s ago
Jun 15 10:05:40 bgpd[1409]: [HKQ2F-8D0MY][EC 100663315] Thread Starvation: {(thread *)0x55f269518ab0 arg=0x0 timer  r=-14705.661 bgp_sync_label_manager() &bm->t_bgp_sync_label_manager from ../bgpd/bgp_labelpool.c:576} was scheduled to pop greater than 4s ago
Jun 15 10:05:42 kernel: Process accounting resumed

Note for myself:

Static routes can be re-applied without reboot by issuing:

$ sudo python3 /usr/libexec/vyos/conf_mode/protocols_static.py
syncer claimed this task.
syncer subscribed.

Suspend/Resume is not supported operation