Page MenuHomeVyOS Platform

Firewall refactor
Closed, ResolvedPublicFEATURE REQUEST

Description

Task for firewall re-writing.

Details

Difficulty level
Hard (possibly days)
Version
vyos-1.4-rolling-202304130846
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Internal change (not visible to end users)

Event Timeline

n.fort changed the task status from Open to In progress.Apr 14 2023, 3:11 PM
n.fort claimed this task.
n.fort changed Difficulty level from Unknown (require assessment) to Hard (possibly days).
n.fort changed Version from - to vyos-1.4-rolling-202304130846.

1:
Shouldnt set firewall global-options resolver-cache have "enable" and "disable" as options?

2:
Migration can be improved:

Original config (pre 1.4-230814):

firewall {
    state-policy {
        established {
            action accept
        }
        invalid {
            action drop
        }
        related {
            action accept
        }
    }
}

The above ended up with (after migration, post 1.4-230814):

    ipv4 {
        forward {
            filter {
                default-action accept
                rule 1 {
                    action accept
                    state {
                        established enable
                    }
                }
                rule 2 {
                    action drop
                    state {
                        invalid enable
                    }
                }
                rule 3 {
                    action accept
                    state {
                        related enable
                    }
                }
            }
        }
        input {
            filter {
                default-action accept
                rule 1 {
                    action accept
                    state {
                        established enable
                    }
                }
                rule 2 {
                    action drop
                    state {
                        invalid enable
                    }
                }
                rule 3 {
                    action accept
                    state {
                        related enable
                    }
                }
            }
        }
        output {
            filter {
                default-action accept
                rule 1 {
                    action accept
                    state {
                        established enable
                    }
                }
                rule 2 {
                    action drop
                    state {
                        invalid enable
                    }
                }
                rule 3 {
                    action accept
                    state {
                        related enable
                    }
                }
            }
        }
    }
    ipv6 {
        forward {
            filter {
                default-action accept
                rule 1 {
                    action accept
                    state {
                        established enable
                    }
                }
                rule 2 {
                    action drop
                    state {
                        invalid enable
                    }
                }
                rule 3 {
                    action accept
                    state {
                        related enable
                    }
                }
            }
        }
        input {
            filter {
                default-action accept
                rule 1 {
                    action accept
                    state {
                        established enable
                    }
                }
                rule 2 {
                    action drop
                    state {
                        invalid enable
                    }
                }
                rule 3 {
                    action accept
                    state {
                        related enable
                    }
                }
            }
        }
        output {
            filter {
                default-action accept
                rule 1 {
                    action accept
                    state {
                        established enable
                    }
                }
                rule 2 {
                    action drop
                    state {
                        invalid enable
                    }
                }
                rule 3 {
                    action accept
                    state {
                        related enable
                    }
                }
            }
        }
    }
}

2.1:
Suggestion that established/related merges to a single rule such as:

rule X {
    action accept
    state {
        established enable
        related enable
    }
}

2.2:
For security reasons invalid should be processed before established/related (which is processed before anything else) due to that nftables is first-match/topdown execution.

So suggestion that the original config should be migrated into something like:

xxx {
    filter {
        default-action accept
        rule 1 {
            action drop
            state {
                invalid enable
            }
        }
        rule 2 {
            action accept
            state {
                established enable
                related enable
            }
        }
        rule XXX {
        # other rules...
        }
    }
}

3:
Bonus: Personally I find it odd to have a firewall that defaults to "default-action accept" on the other hand one could argue that VyOS is primaryly a router and not a firewall and a router will by default enable all traffic.

2.1:
Suggestion that established/related merges to a single rule such as:

--> I would say that almost always legacy configuration for state policies is the same as the example you have shared: established and related accept, invalid drop. In that case, you are able to join established and related into one rule.
But we can't predict and asume that every configuration will be like that. So we need to treat them independently, to ensure that migrated config is equals to legacy.

2.2:
For security reasons invalid should be processed before established/related (which is processed before anything else) due to that nftables is first-match/topdown execution.

--> I don't see a security breach there. Either if invalid rules is 1st, 2nd or 3rd, will be treated properly (let say drop). In most of cases, rules that matches established state will get more hits than rule that matches invalid state. So, for performance, it is better to define established before invalid rule

3:

Once again, this is done for backwards compatibility. The goal is to get the same (or as similar as possible) behavior after upgrading to vyos version with new cli.

Now we have this included in the nightly builds, is there any documentation on how these refactored rules should be modified? Just bumped my version and was completely lost

2.2: Invalid shall ALWAYS be processed BEFORE established/related/other rules otherwise it will not serve it purpose.

The way the migration turns it into rule 1, 2 and 3 I still argue that the invalid ruleset should be processed FIRST and THEN you can have established as rule 2 and related as rule 3.

From the first-match/top-down perspective (which iptables and nfttables execute rules) any malformed packet trying to circumvent the firewall (illegal TCP flags combo etc which the Invalid will detect) shall be dropped first.

Whatever is left should match against Established (for performance) as rule2 and then Related as rule3.

And THEN we will process other custom rules as rule4 etc. which then finally ends up with the "default-action" when there are no more rules to process.

2.2: Invalid shall ALWAYS be processed BEFORE established/related/other rules otherwise it will not serve it purpose.

You will never have a packet state that is both established and invalid. There is no security issue at play here. Any potential performance hit should definitely lean towards established being processed first.

If there would never be such then "INVALID" wouldnt exist as an option.

If there would never be such then "INVALID" wouldnt exist as an option.

An invalid packet might still match a later rule if the second rule wasn't there.
The point is an invalid packet cannot also match established, so there's no security risk in ordering the established allow rule before the invalid drop rule

As seen on slack and I think on the forum.

VRRP seems to be broken after the refactoring of the firewall.

Possibly other services affected aswell.

User claims clearing the ruleset with nft flush ruleset will make VRRP work as expected.

It's me. I'm User! Happy to provide before (1.4-rolling-202308060317) and after (1.4-rolling-202308180646) VyOS configs and nft dumps for analysis. I can reliable reproduce/correct the issue by switching between those two images.

I don't know how to dump, nftables, but happy to if it helps!

A dirty workaround would be to include a "hidden" (as in it exists in nft but not displayed in the vyos-config itself) CoPP table which includes the port(s) needed for:

ICMP, NTP, DNS, SNMP, SSH, DHCP, DHCP6, RA, VRRP, BGP, RIP, RIPNG, OSPF, OSPF6, ISIS, BABEL, LDPD, EIGRP, BFD, Wireguard, OpenVPN and so on.

That is services runned on the VyOS for INPUT and OUTPUT chains.

For security reasons each port should only be open if the service is actually being enabled.

For example TCP22 in incoming (and outgoing for SCP purpose) should only be enabled if "service ssh" is enabled.

Here is for example the Arista default control plane ACL:

https://arista.my.site.com/AristaCommunity/s/article/default-control-plane-acl-explained

https://nlnog.net/static/live/nlnog_live_sep_2020_joris.pdf

I disagree with that. Cause only why bgp is running, we don't need the port to be reachable on all interfaces or for all source IP's.

Comparing with other vendors thats what you use the ACL for.

That is the CoPP default ACL itself is very "open" (or "basic" is perhaps a better word) as it allows for lets say TCP179 (BGP) for input and output chain.

If you want to limit this behaviour you apply an ACL thats either binded to the BGP process or acts on the interface itself (lets say eth5).

Some vendors will also add a hidden ACL based on the neighbor settings in your BGP config to automatically protect the BGP-process from evil hosts.

There are also examples of vendors who adds hidden ACL's in order to protect for example MGMT-services so the user wont cut off the branch they sit on.

n.fort changed the task status from In progress to Needs testing.Aug 23 2023, 11:16 AM

I would disagree with a hidden ruleset.

There’s no consensus among vendors. Juniper uses a default deny for all management traffic rather than allow. You must specifically permit inbound traffic for management traffic in each zone.

These CoPP rules are primarily to protect dedicated hardware control plane CPUs that are usually only powerful enough to run legitimate management traffic. i don’t believe that’s currently relevant to VyOS running on commodity hardware.

Additionally, VyOS is very popular in education for students to learn networking. Hidden rules/defaults leads to confusion with new learners who expect to get exactly what they see. Once one has the basics down, they can better learn the nuances of specific vendor implementation.

Intuitively, if a user explicitly enables a feature with inherent management traffic (BGP, VRRP); their expectation would be for the feature they just enabled to work. In my opinion, the current behavior violates principle of least surprise. It also seems different from 1.3 (and earlier 1.4 behavior?)

I also see the security and wanting users to explicitly understand what they're doing arguments. For example, a user may have a compliance requirement to document all open ports -- said user may not realize that functioning VRRP implies certain open ports -- which may then fail to be documented.

Does VyOS have any capability to have implicit rules which show up when viewed but aren't editable (ie enabling vrrp creates a visible-but-uneditable rule allowing the traffic -- disabling vrrp deletes this rule.)

If not, maybe a pragmatic work-around would be adding a warning to the UI when enabling keepalived? The warning could link to an explanation and further steps in the documentation.

I don't get which exact issue with VRRP
but if you have an issue, please add a separate task.
Provide sudo nft list ruleset before and after the bug version. Bug with VRRP definitely not clear for now.

@giga1699 There are already plenty of hidden stuff going on if you take a look at the output of nft -s list ruleset.

Having CoPP (or call it something else for that matter, for example VYOS-SERVICES or something) must be included to deal with the case if the admin sets default-action deny (which is how a proper firewall should be configured, one could on the other hand argue that VyOS is a router and not a firewall and therefor would have default-action allow).

That is when/if lets say NTP is being configured then VyOS should automatically add nft rules to allow for outbound NTP traffic aswell as inbound if such NTP configuration is set. And this should be done automatically without cluttering the vyos-config itself. Same for outbound DNS if resolver is configured, same for inbound and outbound DHCP if DHCP-client or DHCP-server is configured. Same for VRRP if that is configured and so on. An advanved edition for lets say BGP would to interpret "neighbour x.x.x.x" and use x.x.x.x as valid srcip with dstip <vyos-ip> and dstport tcp179 for INPUT and the other way around for OUTPUT.

Also allowing localhost to speak to itself should be part of the "hidden" ruleset:

{
 action "accept"
 description "Allow localhost"
 inbound-interface {
  interface-name "lo"
 }
}

{
 action "accept"
 description "Allow localhost"
 outbound-interface {
  interface-name "lo"
 }
}

Arista, PaloAlto, Checkpoint among others utilize this method of "hidden" ruleset to deal with services the firewall itself is setup to use, for example VRRP etc.

@jworrell You can see what rules vyos creates on its own when you do nft -s list ruleset which will bring you the full ruleset used by the firewall. For example when you enable one of the helpers (which I personally think should be disabled by default) then various ports are added to the ruleset.

There is currently a task regarding remains/leftovers from rpc helper which should be removed when the rpc helper is disabled.

@jworrell I agree that if an administrator turns on a service it should be functional. If no firewall is configured, and a security ruleset isn't required for the use case, there's no issue with something being in place that allows that traffic for extra comfort. However, if security rules are in place it should be the burden of the administrator to define how that management traffic should be handled. This would be consistent with previous versions of VyOS that if you applied a default-deny to the local direction of an interface, you would need to specify any management traffic for the interface explicitly. By introducing hidden allows, this would violate the principle of least surprise that you mentioned.

@Apachez Just because an administrator enables a service, doesn't mean they want a hole punched in their firewall everywhere. Especially if they had set a default deny in place. Your proposal that if BGP gets enabled all interfaces can talk on TCP/179 seems overly broad, and potentially opens up that process to a remote vulnerability where it shouldn't be open. Just because SSH is enabled, doesn't mean it should be open on all interfaces. Just because a DNS resolver is enabled, doesn't mean TCP_UDP/53 traffic should be sent out to the internet. Why would you default to expanding the attack surface?

Even in your advanced example of including source IP, there exists the potential that since you aren't filtering on source interface(s) that a lower cost route to that IP address gets introduced into your routing table through a different interface and now you're leaking data where it shouldn't be going. This could be compared to your recent post about TunnelCrack.

It's interesting that you would propose such a security hole given your previous stances on the priority of proper security. Hiding an implicit allow that punches holes in the firewall without being explicitly displayed in the configuration presents a security risk to young administrators, or those without experience with VyOS. If anything, the implicit deny (fail secure) method that Juniper SRX devices are using is more appropriate for a device utilizing security features, with an explicit stanza saying the administrator wants that service available for that security zone.

I wouldn't have a problem with rules if there was a default allow in place, or the firewall itself wasn't being used since it would be a moot point, but I expect a default deny to actually deny unless something is explicitly configured as an allow. I shouldn't have to go back and add deny rules to cover up some automated hole punching in the firewall that I may or may not know exists, or that the automated approach was overly broad and opened up too much. Plus, how do you reconcile potential conflicting rules? If I put in a deny, will it get overridden by your automated allow? That would not be desirable behavior.

An administrator shouldn't have to drop into the CLI to manually evaluate nftables unless they configured something custom in nftables themselves. Any configuration that opens up security holes should be explicit if it's automated since it directly affects the security of the system, and should be included in the output of an appropriate show command.

The hidden CONNTRACK rules aren't affecting if traffic will pass through the firewall or not, so those being hidden aren't the security concern that a hidden hole in the firewall would present. Much like NAT rules don't need to be viewed from nftables since the configuration would be seen in a show command. MSS adjust would also be explicitly viewed in the configuration.

I agree that the helper rules shouldn't be enabled by default, at least not if any default deny exists.

@giga1699 Again, if I as an administrator enable BGP and configure it with "neighbor x.x.x.x" I expect this to work without having to setting up multiple additional firewall rules on my own. Same goes with if I enable DHCP-server on the VyOS - I expect it to work.

My proposal is NOT to enable BGP on all interfaces, my proposal is that the config-engine should analyze which "neighbor x.x.x.x" is configured for the BGP process and add those for TCP179 traffic in INPUT and OUTPUT chains. This way if an srcip which isnt defined by the neighbor-statement try to speak to TCP179 that traffic will be dropped (and logged).

If possible this could be configured through a global-option if the VyOS should aid in creating needed firewall rules or if VyOS should ignore and left the administrator on its own to figure out why things isnt working.

A first step would however be to enable/disable based on ports based on if the service is configured or not. And as a 2nd step add more logic (such as the above regarding "neighbor x.x.x.x").

For DNS if I configure "name-server <internet IP-address>" I expect it to work without having to manually add additional rules for INPUT and OUTPUT chains.

The result of forcing the customers to setup their own rules will be that they will just "ehh f**k it!" and sets "default-action: accept" which is a much worser option than having VyOS aid in making sure that the services the admin have choosen to enable will actually do their work and well do their work :-)

Also today VyOS defaults to "action-drop: accept" which means that its already vulnerable to the attacks you are thinking of simply because there is no current filtering at all towards local services.

The hidden helper rulesets do affect if traffic is allowed or not. By your take enabling a helper should not affect the nftables ruleset - the admin must configure the rules they want on their own?

It boils down to if VyOS should be seen as a router of if it should be seen as a firewall product (but even firewall products such as PaloAlto Networks, Checkpoint among others use hidden rulesets to make sure that vital services wont get butched by the admin by accident).

The firewall will not be autoconfigured by bgpd or something else. We are not going to do it.

Then perhaps add it as an global-option or similar to make life easier for the admin to not having to dig into how each service should have the firewall configured in order to make it work properly?

Something like (default is disable):

set firewall auto-ruleset bgp enable
set firewall auto-ruleset dhcp-client enable
set firewall auto-ruleset dhcp-server enable
set firewall auto-ruleset dns-client enable
set firewall auto-ruleset dns-server enable
set firewall auto-ruleset ntp-client enable
set firewall auto-ruleset ntp-server enable
set firewall auto-ruleset ssh-client enable
set firewall auto-ruleset ssh-server enable
set firewall auto-ruleset vrrp enable

along with (default is none):

set firewall auto-ruleset bgp interface 'eth0 eth1'
set firewall auto-ruleset dhcp-client interface 'eth0 eth1'
set firewall auto-ruleset dhcp-server interface 'eth0 eth1'
set firewall auto-ruleset dns-client interface 'eth2 eth3'
set firewall auto-ruleset dns-server interface 'eth2 eth3'
set firewall auto-ruleset ntp-client interface 'eth0 eth5'
set firewall auto-ruleset ntp-server interface 'eth0 eth5'
set firewall auto-ruleset ssh-client interface 'eth7 eth8'
set firewall auto-ruleset ssh-server interface 'eth7 eth8'
set firewall auto-ruleset vrrp enable interface 'eth9'

@Apachez I would also not want this. Example bgp on eth0 with one peer. I would not like to see to have the bgp port open for all source ips, only for the configured peers and not more.
To make it better to manage for the admins I would like to see a syntax like in junos:

prefix-list bgp-neighbors {
     apply-path "routing-instances PROD protocols bgp group <*> neighbor <*>";
 }

to create dynamic network groups

@rherold Well thats how it is today with default-action:accept where ALL ports are open to ALL services on ALL interfaces.

But after this discussion I will move that subdiscussion into its own task: https://vyos.dev/T5509

Found some anomalies regarding show firewall command (I assume related to the refactoring) which I have reported in https://vyos.dev/T5513

sarthurdev changed the status of subtask T5080: Conntrack enabled by default from In progress to Needs testing.Aug 27 2023, 8:21 AM