Closing IPV6CP by client closes PPPoE link completely, even if IPv6 is optional
Open, HighPublicBUG
Actions

Assigned To

None

Authored By

	marekm
	Aug 7 2022, 3:21 PM

Description

Some background - https://forum.vyos.io/t/pppoe-server-dual-stack-ipv4-ipv6-buggy-soho-routers-that-refuse-ipv6/9010/4
traffic capture - http://91.224.224.43/phicomm/phicomm6.cap
RFC1661 - 3.7 Link Termination Phase - Implementation Note:

The closing of the link by LCP is sufficient.  There is no need
for each NCP to send a flurry of Terminate packets.  Conversely,
the fact that one NCP has Closed is not sufficient reason to cause
the termination of the PPP link, even if that NCP was the only NCP
currently in the Opened state.

Some cheap buggy unsupported SOHO routers (specifically, Phicomm KE2M) as PPPoE clients don't support IPv6, but try to negotiate it anyway and then fail, client sends IPV6CP terminate request, and VyOS PPPoE server responds by terminating LCP, even though it could still work fine as IPv4 only. This makes IPv6 deployment difficult, as changing ppp-options ipv6 "deny" to "allow" breaks existing IPv4 customers who happen to have such buggy routers.

Details

Difficulty level: Unknown (require assessment)
Version: 1.3.1-S1
Why the issue appeared?: Will be filled on close
Is it a breaking change?: Behavior change
Issue type: Bug (incorrect behavior)

Related Objects

Mentioned In: T6224: More easily produceable source builds

Event Timeline

marekm created this task.Aug 7 2022, 3:21 PM

pasik subscribed.Aug 7 2022, 5:45 PM

Log messages - http://91.224.224.43/phicomm/phicomm6.log
PPPoE server config:

set service pppoe-server authentication mode 'radius'
set service pppoe-server authentication protocols 'pap'
set service pppoe-server authentication protocols 'chap'
set service pppoe-server authentication protocols 'mschap'
set service pppoe-server authentication protocols 'mschap-v2'
set service pppoe-server authentication radius acct-timeout '0'
set service pppoe-server authentication radius nas-ip-address '91.224.224.11'
set service pppoe-server authentication radius server 91.224.224.XXX key '********'
set service pppoe-server authentication radius source-address '91.224.224.11'
set service pppoe-server authentication radius timeout '5'
set service pppoe-server client-ip-pool subnet '185.38.221.112/28'
set service pppoe-server client-ipv6-pool delegate 2a04:7840:dd70::/48 delegation-prefix '56'
set service pppoe-server client-ipv6-pool prefix 2a04:7840:dd71::/48 mask '64'
set service pppoe-server gateway-address '91.224.224.11'
set service pppoe-server interface eth2.49
set service pppoe-server interface eth2.50
set service pppoe-server interface eth2.55
set service pppoe-server interface eth2.59
set service pppoe-server mtu '1500'
set service pppoe-server name-server '91.224.224.55'
set service pppoe-server name-server '91.224.224.44'
set service pppoe-server ppp-options ipv4 'require'
set service pppoe-server ppp-options ipv6 'allow'
set service pppoe-server ppp-options ipv6-accept-peer-intf-id
set service pppoe-server ppp-options ipv6-intf-id 'random'
set service pppoe-server ppp-options ipv6-peer-intf-id 'calling-sid'
set service pppoe-server ppp-options lcp-echo-failure '3'
set service pppoe-server ppp-options lcp-echo-interval '10'
set service pppoe-server ppp-options min-mtu '1280'
set service pppoe-server ppp-options mppe 'deny'

client pools only defined to suppress a warning during "commit", all IPv4/v6 addresses/prefixes are actually static assigned by RADIUS.
Simply change ppp-options ipv6 "allow" to "deny" - and all customers with Phicomm routers stop disconnecting, work fine as IPv4 only.
The setup is proven working otherwise - 10/10 on test-ipv6.com and 19/20 on ipv6-test.com (-1 for lack of IPv6 revDNS) via properly configured MikroTik router as PPPoE client.

See also https://github.com/accel-ppp/accel-ppp/issues/57
Testing this patch, PPPoE session with the Phicomm router now stays up, the missing part after "else" is to remove IPv6 configuration from ppp interface (not sure how to do it properly).

diff
diff --git a/accel-pppd/ppp/ppp_ipv6cp.c b/accel-pppd/ppp/ppp_ipv6cp.c
index 1194b31..2bac31b 100644
--- a/accel-pppd/ppp/ppp_ipv6cp.c
+++ b/accel-pppd/ppp/ppp_ipv6cp.c
@@ -738,7 +738,10 @@ static void ipv6cp_recv(struct ppp_handler_t*h)
                        if (conf_ppp_verbose)
                                log_ppp_info2("recv [IPV6CP TermReq id=%x]\n", hdr->id);
                        ppp_fsm_recv_term_req(&ipv6cp->fsm);
-                       ap_session_terminate(&ipv6cp->ppp->ses, TERM_USER_REQUEST, 0);
+                       if (conf_ipv6 == IPV6_REQUIRE)
+                               ap_session_terminate(&ipv6cp->ppp->ses, TERM_USER_REQUEST, 0);
+                       else
+                               ppp_layer_passive(ipv6cp->ppp, &ipv6cp->ld);
                        break;
                case TERMACK:
                        if (conf_ppp_verbose)

syncer edited projects, added VyOS 1.3 Equuleus (1.3.2); removed VyOS 1.3 Equuleus.Aug 14 2022, 1:05 PM

syncer edited projects, added VyOS 1.3 Equuleus (1.3.3); removed VyOS 1.3 Equuleus (1.3.2).Aug 29 2022, 12:14 PM

As a temporary workaround, I use the script below. For some reason /etc/rc.local no longer runs automatically on VyOS 1.3.2, so I run it manually after each reboot for now. Until it is run, Phicomm routers keep disconnecting due to failed IPV6CP negotiation incorrectly triggering complete PPPoE session termination. I have two PPPoE servers at different locations for redundancy, both rebooting at the same time is very unlikely, so I can live with it for now.

#! /bin/sh
set -e

# block ipv6cp from broken phicomm routers (MAC prefix: d8c8e9, fc7c02)
PPPOE_IFACES="eth2 eth4"

for i in $PPPOE_IFACES ; do
  tc qdisc del dev $i ingress || true
  tc qdisc add dev $i ingress
  tc filter add dev $i parent ffff: u32 match u32 0xd8c8e900 0xffffff00 at -8 match u16 0x8864 0xffff at -2 match u16 0x8057 0xffff at 6 action drop
  tc filter add dev $i parent ffff: u32 match u32 0xfc7c0200 0xffffff00 at -8 match u16 0x8864 0xffff at -2 match u16 0x8057 0xffff at 6 action drop
done

To verify the script worked:

$ tc filter show dev eth2 ingress
filter parent ffff: protocol all pref 49151 u32 chain 0
filter parent ffff: protocol all pref 49151 u32 chain 0 fh 801: ht divisor 1
filter parent ffff: protocol all pref 49151 u32 chain 0 fh 801::800 order 2048 key ht 801 bkt 0 terminal flowid ??? not_in_hw
  match fc7c0200/ffffff00 at -8
  match 00008864/0000ffff at -4
  match 00008057/0000ffff at 4
        action order 1: gact action drop
         random type none pass val 0
         index 2 ref 1 bind 1

filter parent ffff: protocol all pref 49152 u32 chain 0
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? not_in_hw
  match d8c8e900/ffffff00 at -8
  match 00008864/0000ffff at -4
  match 00008057/0000ffff at 4
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1

syncer edited projects, added VyOS 1.3 Equuleus (1.3.4); removed VyOS 1.3 Equuleus (1.3.3).Jul 12 2023, 9:38 PM

syncer edited projects, added VyOS 1.3 Equuleus (1.3.5); removed VyOS 1.3 Equuleus (1.3.4).Aug 25 2023, 9:29 PM

syncer edited projects, added VyOS 1.3 Equuleus (1.3.6); removed VyOS 1.3 Equuleus (1.3.5).Dec 17 2023, 11:36 PM

dmbaturin triaged this task as High priority.Jan 10 2024, 10:35 PM

dmbaturin added projects: VyOS 1.4 Sagitta, VyOS 1.5 Circinus.

syncer edited projects, added VyOS 1.3 Equuleus (1.3.7); removed VyOS 1.3 Equuleus (1.3.6).Feb 10 2024, 9:17 AM

marekm mentioned this in T6224: More easily produceable source builds.Apr 29 2024, 8:32 PM

syncer edited projects, added VyOS 1.4 Sagitta (1.4.0-GA); removed VyOS 1.4 Sagitta.May 7 2024, 8:02 AM

syncer edited projects, added VyOS 1.3 Equuleus (1.3.8); removed VyOS 1.3 Equuleus (1.3.7).May 13 2024, 7:31 PM

dmbaturin edited projects, added VyOS 1.3 Equuleus (1.3.9); removed VyOS 1.3 Equuleus (1.3.8).Jun 20 2024, 10:17 AM

dmbaturin edited projects, added VyOS 1.4 Sagitta (1.4.1); removed VyOS 1.3 Equuleus (1.3.9), VyOS 1.4 Sagitta (1.4.0-GA), VyOS 1.5 Circinus.Tue, Jul 2, 7:28 PM

Hello @dmbaturin - it's now about 2 years since I've discovered this bug, any chances to review my patch?
I don't claim it's 100% correct, in simple tests it does prevent LCP termination but the closing IPV6CP part (after "else") needs review by someone with better understanding of accel-ppp internals.
I still have a few users of these buggy routers in my network, can help with testing and possibly backporting the fix to Equuleus.
Thanks!

@marekm Can you add the PR to the accel-ppp repo? I guess it will be better to fix it in upstream.
https://github.com/accel-ppp/accel-ppp

I think it's not yet ready for PR - needs review by someone more experienced with accel-ppp internals.
Issue already reported there - https://github.com/accel-ppp/accel-ppp/issues/57
Isn't upstream effectively the same now (accel-ppp owned by VyOS)?
Some issues I reported in accel-ppp were moved to discussions but met with silence otherwise. It might (or not, I don't know) have to do with some personal disagreements - someone other than me might have better luck.

Add PR on accell-ppp repo or patch in the vyos-build via PR https://github.com/vyos/vyos-build/tree/current/packages/linux-kernel/patches/accel-ppp
There are no other options for review.

Closing IPV6CP by client closes PPPoE link completely, even if IPv6 is optionalOpen, HighPublicBUGActions

Description

Details

Related Objects

Event Timeline

Closing IPV6CP by client closes PPPoE link completely, even if IPv6 is optional
Open, HighPublicBUG
Actions