User Details
- User Since: Aug 22 2020, 1:26 PM (187 w, 5 d)
Thu, Mar 21
May 8 2023
Feb 27 2023
As a temporary workaround, I use the script below. For some reason /etc/rc.local no longer runs automatically on VyOS 1.3.2, so for now I run it manually after each reboot. Until it has run, Phicomm routers keep disconnecting, because a failed IPV6CP negotiation incorrectly triggers termination of the entire PPPoE session. I have two PPPoE servers at different locations for redundancy, and both rebooting at the same time is very unlikely, so I can live with this for now.
Aug 8 2022
See also https://github.com/accel-ppp/accel-ppp/issues/57
With this patch applied, the PPPoE session with the Phicomm router now stays up. The missing part after "else" is removing the IPv6 configuration from the ppp interface (I'm not sure how to do that properly).
diff --git a/accel-pppd/ppp/ppp_ipv6cp.c b/accel-pppd/ppp/ppp_ipv6cp.c
index 1194b31..2bac31b 100644
--- a/accel-pppd/ppp/ppp_ipv6cp.c
+++ b/accel-pppd/ppp/ppp_ipv6cp.c
@@ -738,7 +738,10 @@ static void ipv6cp_recv(struct ppp_handler_t*h)
 			if (conf_ppp_verbose)
 				log_ppp_info2("recv [IPV6CP TermReq id=%x]\n", hdr->id);
 			ppp_fsm_recv_term_req(&ipv6cp->fsm);
-			ap_session_terminate(&ipv6cp->ppp->ses, TERM_USER_REQUEST, 0);
+			if (conf_ipv6 == IPV6_REQUIRE)
+				ap_session_terminate(&ipv6cp->ppp->ses, TERM_USER_REQUEST, 0);
+			else
+				ppp_layer_passive(ipv6cp->ppp, &ipv6cp->ld);
 			break;
 		case TERMACK:
 			if (conf_ppp_verbose)
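The decision this patch makes can be sketched in isolation roughly as follows. This is only an illustration, not accel-ppp code: the enum and function names are hypothetical stand-ins, and only the "terminate the whole session only when IPv6 is required" rule is taken from the patch itself.

```c
/* Hypothetical stand-ins for the server-side IPv6 policy. Only the
 * "require" case appears in the patch; the other values are assumptions
 * added to make the example complete. */
enum ipv6_policy {
	POLICY_IPV6_DENY,
	POLICY_IPV6_ALLOW,
	POLICY_IPV6_PREFER,
	POLICY_IPV6_REQUIRE
};

/* What to do when the client sends an IPV6CP Terminate-Request. */
enum term_req_action {
	ACTION_TERMINATE_SESSION, /* tear down the whole PPPoE session */
	ACTION_IPV6CP_PASSIVE     /* stop only IPv6CP, keep the session (IPv4) up */
};

/* Mirrors the if/else added by the patch: a client that gives up on
 * IPv6 loses the whole session only when IPv6 is mandatory. */
enum term_req_action on_ipv6cp_term_req(enum ipv6_policy policy)
{
	if (policy == POLICY_IPV6_REQUIRE)
		return ACTION_TERMINATE_SESSION;
	return ACTION_IPV6CP_PASSIVE;
}
```

With any policy other than "require", a client such as the Phicomm router that terminates IPV6CP would keep its IPv4 session. Cleaning up the IPv6 configuration on the ppp interface is not covered by this sketch.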
Aug 7 2022
Log messages - http://91.224.224.43/phicomm/phicomm6.log
PPPoE server config:
Jul 4 2022
Jun 2 2022
Jun 1 2022
Oct 14 2020
Just my thoughts: there are situations where rp_filter is not sufficient, and it was not clear to me how to handle them cleanly with the zone firewall, so I ended up adding a few iptables commands to rc.local instead.
Oct 2 2020
Sep 9 2020
Aug 31 2020
Even with customer routes redistributed by OSPF instead of iBGP, it has just crashed again:
I tried unit-cache earlier, but it seems to have issues too: I've seen duplicate routes when the same client (all have a static IP assigned by RADIUS based on username) connects to a different PPPoE server and the old route is not removed, as if FRR did not see the cached (not removed) PPPoE interfaces as removed. I haven't investigated this in more detail, though, as it's a production setup and I can't experiment too much on live customers.
I'm considering going back to redistributing the PPPoE customers' /32 routes in OSPF instead of iBGP. It was set up that way for a few years (using MikroTik, before moving to VyOS), but I recently changed it following "BGP Best Current Practices" http://www.bgp4all.com.au/pfs/_media/workshops/05-bgp-bcp.pdf which recommends using OSPF only for infrastructure, not customers. That seems logical to me, as BGP was designed for much larger routing tables (all of the Internet), but perhaps OSPF is still good enough for just a few hundred customers.
Aug 30 2020
I've just had two different routers (one bare metal and one VM) crash at roughly the same time, triggered by many PPPoE sessions disconnecting at once due to a short power failure (the routers themselves had power all the time, but power was interrupted for about a minute to a switch on the network between the routers and the PPPoE clients). The stack traces are very similar (absolute addresses differ, but the functions and the offsets within them are the same). And again, each time watchfrr restarted bgpd, but it did not work until a reboot. No problems so far with two other BGP routers running a similar configuration but without any dynamic interfaces (only OSPF and BGP, no PPPoE servers).
Aug 28 2020
Aug 27 2020
It crashed again after 5 days on 1.2.6-epa1, in the same function, again when a dynamic PPPoE interface was deleted.
It happens less frequently after the former customers who repeatedly failed authentication have been physically disconnected.
Again, BGP no longer works after watchfrr has restarted the bgpd process. Everything works again after a reboot.
Aug 25 2020
Aug 22 2020
Maybe related - https://github.com/FRRouting/frr/issues/6439