
BPF comes to firewalls - jasonma
https://lwn.net/SubscriberLink/747551/7b37e1ce398c30c0/
======
majke
At Cloudflare we use BPF extensively in our firewall. We have blogged about
this in the past:

* [https://blog.cloudflare.com/bpf-the-forgotten-bytecode/](https://blog.cloudflare.com/bpf-the-forgotten-bytecode/)

* [https://blog.cloudflare.com/introducing-the-bpf-tools/](https://blog.cloudflare.com/introducing-the-bpf-tools/)

* [https://blog.cloudflare.com/introducing-the-p0f-bpf-compiler...](https://blog.cloudflare.com/introducing-the-p0f-bpf-compiler/)

We even have a piece of software called "floodgate" which is pretty much a
custom runtime for iptables compiled down to BPF. It uses proprietary
driver magic to offload the firewall rules engaged during large L3 attacks,
and runs them with high performance in userspace. More:

* [https://blog.cloudflare.com/meet-gatebot-a-bot-that-allows-u...](https://blog.cloudflare.com/meet-gatebot-a-bot-that-allows-us-to-sleep/)

More BPF integration in iptables is a very good idea. The interesting bit is
how to deal with more complex things like hashlimits, limits, ipsets and
conntrack integration.

------
comex
So the Berkeley Packet Filter is now being used for packet filtering? What a
wonderful development! What will they think of next? :)

~~~
metalliqaz
Yes the title caught me off guard as well. As a BSD user for many years, it is
strange to think of it as "new".

~~~
bodyfour
eBPF is far evolved from "classic" BPF, though. There is a historical
relationship (hence the name) but they're very different animals at this
point.

Also, the traditional use of BPF wasn't to filter network traffic, but only to
sieve data flowing to userland tools like tcpdump.
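
To make the "sieve" role concrete, here is a toy interpreter for a small subset of classic BPF, written as a sketch rather than a copy of any kernel code. The instruction names are loosely modeled on the mnemonics tcpdump prints, the program is hand-written for illustration (it only handles IPv4 over Ethernet), and `run_filter` and `make_packet` are hypothetical helper names.

```python
import struct

ACCEPT = 262144  # snapshot length handed to userland; 0 means "drop"

# Each instruction is (op, jt, jf, k). Jump offsets are relative to the
# next instruction, as in classic BPF. Roughly: "tcp dst port 80".
TCP_DST_PORT_80 = [
    ("ldh", 0, 0, 12),       # 0: A = EtherType
    ("jeq", 0, 8, 0x0800),   # 1: not IPv4 -> jump to 10 (reject)
    ("ldb", 0, 0, 23),       # 2: A = IP protocol
    ("jeq", 0, 6, 6),        # 3: not TCP -> reject
    ("ldh", 0, 0, 20),       # 4: A = flags + fragment offset
    ("jset", 4, 0, 0x1FFF),  # 5: non-first fragment -> reject
    ("ldxb_msh", 0, 0, 14),  # 6: X = IPv4 header length in bytes
    ("ldh_ind", 0, 0, 16),   # 7: A = 16-bit word at X + 16 (TCP dst port)
    ("jeq", 0, 1, 80),       # 8: port != 80 -> reject
    ("ret", 0, 0, ACCEPT),   # 9: accept
    ("ret", 0, 0, 0),        # 10: reject
]


def run_filter(program, packet):
    """Run the toy program over raw packet bytes; return bytes to keep."""
    a = x = 0
    pc = 0
    while pc < len(program):
        op, jt, jf, k = program[pc]
        pc += 1
        if op == "ret":
            return k
        if op == "jeq":
            pc += jt if a == k else jf
        elif op == "jset":
            pc += jt if a & k else jf
        elif op == "ldxb_msh":
            if k >= len(packet):
                return 0  # out-of-bounds load rejects the packet
            x = 4 * (packet[k] & 0x0F)
        elif op in ("ldb", "ldh", "ldh_ind"):
            size = 1 if op == "ldb" else 2
            offset = k + x if op == "ldh_ind" else k
            if offset + size > len(packet):
                return 0
            a = int.from_bytes(packet[offset:offset + size], "big")
        else:
            raise ValueError(f"unsupported op: {op}")
    return 0


def make_packet(dst_port, proto=6):
    """Build a minimal Ethernet + IPv4 + TCP-shaped frame for testing."""
    eth = bytes(12) + struct.pack("!H", 0x0800)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, proto, 0,
                     bytes(4), bytes(4))
    l4 = struct.pack("!HHIIBBHHH", 12345, dst_port, 0, 0, 0x50, 0x02,
                     8192, 0, 0)
    return eth + ip + l4


print(run_filter(TCP_DST_PORT_80, make_packet(80)))  # 262144
print(run_filter(TCP_DST_PORT_80, make_packet(22)))  # 0
```

The return value only says how many bytes of the packet to pass up to the capturing tool; nothing here drops or forwards traffic, which is the distinction being drawn above.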

------
nialv7
For anyone who doesn't know: nftables has its own byte code interpreter for
filtering too. And at the time nftables was created, bpf already existed in
Linux.

The author's argument against using bpf is that it's hard to reproduce the
human readable rules from bpf byte code. Wonder how bpfilter is going to
solve this problem.

------
callesgg
What we are missing from linux today is a networking subsystem that allows for
configurable efficient hardware offloading.

Without hacking up everything, like it is done in consumer routers these days.

Will BPF handle that better than nftables?

~~~
chrissnell
In this age where major security flaws are being discovered in hardware, do we
really want to offload network filtering to what might be black box hardware
with limited/no auditability?

~~~
viraptor
There are various levels of offloading. There are basic operations, like
checksum offloading, that have existed for a long time and are optional. Not
sure anyone would complain about the auditability of those. (You would see
errors on the next hop if it didn't work correctly.)

~~~
kardos
I interpreted OP's point as moving that kind of complexity to a black box
would open it up to being compromised and then leveraged as a backdoor into
your system, or as a botnet, etc.

------
loudmax
> Developers should be careful, though; this could prove to be a slippery
> slope leading toward something that starts to look like a microkernel
> architecture.

That's an interesting warning. Pushing more tasks out of the kernel would seem
like a good idea to me. I thought Torvalds' argument against a microkernel
design was more about performance than complexity. Is that incorrect?

~~~
mavhc
I assumed this was a joke, based on early Linux vs microkernel debates.

------
danwent
For those interested in a really technical deep-dive on BPF, check out:
[http://cilium.readthedocs.io/en/latest/bpf/](http://cilium.readthedocs.io/en/latest/bpf/)

If you have questions on BPF and Cilium for advanced firewalling, you can also
ask them on the cilium slack:
[http://www.cilium.io/slack](http://www.cilium.io/slack)

------
jandrese
After many years of learning the many options to iptables, I can only say it's
about time. The iptables command is one of the least ergonomic ones I use
regularly.

That said, while BPF syntax is great for simple cases, the boolean logic gets
pretty messy in a hurry if you want to do something weird.

Simple case comparison:

    iptables: iptables -t filter -A FORWARD -p tcp --dport 80 -j ACCEPT

    theoretical bpf: allow forward tcp dst port 80

Note that the capitalization in the first command is not arbitrary, it must be
there for the command to work, exactly as shown. Also, I didn't switch to the
double dash option on dport frivolously, there is no single dash option for
this incredibly common feature. Plus the manpages are split into a bunch of
different parts and the SEE ALSO section at the bottom is woefully incomplete,
making it difficult to track down exactly the page you need.
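
As a rough sketch of how the two forms above line up, the following maps the theoretical one-line syntax onto the iptables arguments from the comparison. The grammar is entirely hypothetical (it only covers `<action> <chain> <proto> dst port <n>`), `translate` is an illustrative name, and mapping `deny` to `DROP` is an assumption that does not appear in the comment.

```python
# Hypothetical mapping from the toy syntax's verbs to iptables targets.
# "allow" -> ACCEPT comes from the comparison above; "deny" is assumed.
ACTIONS = {"allow": "ACCEPT", "deny": "DROP"}


def translate(rule):
    """Translate '<action> <chain> <proto> dst port <n>' to iptables argv."""
    words = rule.split()
    if len(words) != 6 or words[3:5] != ["dst", "port"]:
        raise ValueError(f"unsupported rule: {rule!r}")
    action, chain, proto, _, _, port = words
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if proto not in ("tcp", "udp"):
        # --dport is only meaningful together with a port-carrying protocol
        raise ValueError(f"unsupported protocol: {proto!r}")
    port_number = int(port)
    if not 0 <= port_number <= 65535:
        raise ValueError(f"port out of range: {port_number}")
    return [
        "iptables", "-t", "filter",
        "-A", chain.upper(),  # built-in chain names are upper case
        "-p", proto,
        "--dport", str(port_number),
        "-j", ACTIONS[action],
    ]


print(" ".join(translate("allow forward tcp dst port 80")))
# iptables -t filter -A FORWARD -p tcp --dport 80 -j ACCEPT
```

Returning an argument list rather than a string keeps the result safe to hand to a process launcher without shell quoting. It also shows how little of iptables the short form covers: marks, NAT and state tracking have no place in this grammar yet.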

That said, incorporating all of the features from iptables into a BPF syntax
is going to require a considerable expansion over what you get with tcpdump.
Things like marked packets, NAT, state tracking, etc... all need to be grafted
onto the language somehow. And of course everything needs to be well
documented because a lot of sysadmins are going to need to learn this in a
hurry and bad documentation will make them hate it and insist on keeping
iptables instead, warts and all.

~~~
vbernat
There are two man pages: iptables(8) and iptables-extensions(8). What other
man pages are you referring to?

~~~
jandrese
I had remembered the conntrack stuff being in a different manpage, but maybe
that's faulty memory on my part.

~~~
vbernat
The iptables-extensions(8) manual page is fairly recent. AFAIK, previously,
there was no manual page, but maybe it was scattered over various pages.

------
tinco
What do sysadmins generally use BPF or other more advanced firewall systems
for?

I've administered only a few production systems, and the firewalls I
configured were always very simple. Reject all traffic incoming except for
port 22/23, 80/443 and outgoing except to certain package management
systems, that sort of thing.

I admit I've done some slightly more complex things to rewrite things to
implement some virtual network thing, but I don't think I did that in
production.

------
tytso
One potential advantage of the new BPF code for firewalls is that it may make it
easier to excise code owned by a certain copyright troll....

~~~
ciupicri
Are you referring to Patrick McHardy, a former contributor to Netfilter?

[https://www.theregister.co.uk/2017/10/18/linux_kernel_commun...](https://www.theregister.co.uk/2017/10/18/linux_kernel_community_enforcement_statement/)

------
arca_vorago
"The Linux kernel currently supports two separate network packet-filtering
mechanisms: iptables and nftables. For the last few years, it has been
generally assumed that nftables would eventually replace the older iptables
implementation; few people expected that the kernel developers would, instead,
add a third packet filter. But that would appear to be what is happening with
the newly announced bpfilter mechanism. Bpfilter may eventually replace both
iptables and nftables, but there are a lot of questions that will need to be
answered first."

Goshdarnit. I've been trying to get ahead of the curve and have been learning
and implementing nftables, and now you're telling me I might need to learn
something else! Such is the life of a sysadmin I suppose.

"The use of BPF enables the writing of firewall rules in C"

Have you seen the rulesets people write in other firewall languages? This
seems scary to me.

"One of the core design features for bpfilter is the ability to translate
existing iptables rules into BPF programs."

nftables also does this, but I suggest not using it and writing fresh rules
instead.

"even though it would be likely to supplant nftables relatively quickly.
Instead, Miller said in the discussion that nftables failed to address the
performance problems in Linux's packet-filtering implementation, driving users
toward user-space networking technologies instead. There is a real possibility
that nftables could end up being one of those experiments that is able to shed
some light on the problem space but never takes over in the real world."

Ok, well I have some questions here. First, show me the benchmarks. Also,
nftables is still much faster than iptables in my benchmarks, so it has
largely delivered. Of course it's difficult to compete with an asic offload,
but I do see how there could be lots of potential with bpf if it offloads to
the interface. That said, the real potential I see is for nftables and bpf to
coexist in the future as a replacement for iptables or for iptables and bpf.
nftables solves a lot of real problems and working with it has been really
enjoyable for me compared to years of iptables rules (I always refused to use
the layer-on-top-of-iptables abstractions, so I'm talking about pure
iptables.) A quick glance at bpf seems like it would be worth it for extremely
high requirement cases where the investment would be worth it (just noticed
the cloudflare comment for example), but for the rest of us mortals in IT depts
with limited budgets, time, and knowledge workers, it seems a bit too heavy to
just start implementing.

I could be wrong, but I really hope nft succeeds despite this.

~~~
marios
I'd really like it if instead of having "kernel developers add a third packet
filter", said developers would sit down a bit and agree on how to manage
firewalling at the userspace level. iptables feels like a tool that was
developed to test out netfilter rules (that is, the kernel part) but not
really for actual use. That would explain why there are so many frontends that
attempt to abstract away the ugliness. I've seen too many "firewall setups"
that are just a script calling iptables for each rule. While this works, it
can be dangerous as well: if you edit the file and mess up a rule, there's a
risk that only part of the ruleset is loaded. Hopefully, the operator won't
have locked himself out in the process. Of course, there's the iptables-
persistent package for atomically loading a ruleset (in Debian at least). The
problem is that there's also a netfilter-persistent package. What's the
difference between the two?
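
A minimal sketch of the alternative to one `iptables` call per rule: build the whole table as text in the `iptables-save` format and hand it to `iptables-restore` in one go, so a typo fails the load instead of leaving half a ruleset in place. The function name `build_ruleset` and the specific policy (default drop, loopback and established traffic allowed) are assumptions for illustration, not something taken from any of the packages mentioned.

```python
def build_ruleset(allowed_tcp_ports):
    """Return a filter-table ruleset in iptables-save format as one string."""
    lines = [
        "*filter",
        ":INPUT DROP [0:0]",      # default policy: drop incoming
        ":FORWARD DROP [0:0]",
        ":OUTPUT ACCEPT [0:0]",
        "-A INPUT -i lo -j ACCEPT",
        "-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    for port in allowed_tcp_ports:
        port_number = int(port)
        if not 0 < port_number <= 65535:
            raise ValueError(f"port out of range: {port_number}")
        lines.append(f"-A INPUT -p tcp --dport {port_number} -j ACCEPT")
    lines.append("COMMIT")  # the table is applied as a unit at COMMIT
    return "\n".join(lines) + "\n"


ruleset = build_ruleset([22, 80, 443])
print(ruleset, end="")

# Loading it needs root and is left commented out here. With --test,
# iptables-restore only parses the input and does not commit it:
#
#   import subprocess
#   subprocess.run(["iptables-restore", "--test"],
#                  input=ruleset, text=True, check=True)
```

Validating the ports before emitting any text means a bad entry raises in Python and nothing reaches the kernel at all.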

How is one supposed to pick? Then, there's also nftables. I've only glanced
at it, and it looks promising but it seems there are some things that are
missing. A comment in the article says that TCP MSS clamping has been added
only recently. By the looks of it, it appears to be "almost there" but not
quite ... which is a shame.

I'm hoping whatever implementation ends up prevailing will solve the various
technical problems (performance, features w.r.t filtering capabilities) but
will also provide a sane way to manage it. I feel kind of sad with the current
situation. With my developer hat on, I am continuously impressed with the
networking features available on Linux (Netfilter, XDP, ...). With my operator
hat on, I find the general lack of usable tools as well as the inconsistency
maddening.

------
ciupicri
systemd implemented eBPF-based per-unit IP access lists and accounting [1] in
version 235.

[1]:
[https://github.com/systemd/systemd/pull/6764](https://github.com/systemd/systemd/pull/6764)

------
vegasbrianc
This is what project Cilium does -
[https://github.com/cilium/cilium](https://github.com/cilium/cilium)

~~~
deltaprotocol
From your link:

> A new Linux kernel technology called BPF is at the foundation of Cilium. It
> supports dynamic insertion of BPF bytecode into the Linux kernel at various
> integration points such as: network IO, application sockets, and tracepoints
> to implement security, networking and visibility logic. BPF is highly
> efficient and flexible.

------
rjsw
The NPF firewall in NetBSD also uses BPF as its rule engine along with a JIT
for several CPU architectures.

------
INTPenis
This reminds me of a cloudflare blog post I read a few years back about the
xt_bpf module for iptables.

[https://blog.cloudflare.com/bpf-the-forgotten-bytecode/](https://blog.cloudflare.com/bpf-the-forgotten-bytecode/)

I wonder if the projects are related.

~~~
borplk
[https://news.ycombinator.com/item?id=16420328](https://news.ycombinator.com/item?id=16420328)

------
mrmondo
Really pleased to see BPF making further progress, well done and thank you to
those involved in the implementation, testing and review process across the
various projects involved.

------
ComodoHacker
Well, more user-controlled code executed by the kernel means more fun! And
more work for security researchers.

------
snvzz
I'd look at Dragonfly's stack first, as that's outperforming Linux at the
moment.

~~~
mdekkers
_I'd look at Dragonfly's stack first, as that's outperforming Linux at the
moment._

Apples, Oranges, etc.

~~~
mdekkers
Instead of senseless downvotes, how about some comments instead? Dragonfly
isn't Linux, it is a BSD, therefore the OP is comparing apples to oranges. If
you disagree, say why, don't just hit the downvotes. HN is rapidly becoming
the Reddit for technosnobs.

~~~
seeekr
I would guess that both of you received downvotes because both comments failed
to provide any explanation for the view expressed. I think it's good practice
on HN to briefly introduce a technology where it's safe to assume that a
majority are not very likely to be familiar with it, so that the comment
becomes meaningful on its own.

If you had stated: "You are comparing apples to oranges here because Dragonfly
only exists on BSD, not on Linux, and as such may not be a viable choice for
most users." I think you would have received no or substantially fewer
downvotes. If you had also provided a quick explanation (or more) of what
"Dragonfly" is, you would have gotten upvotes instead, because that would
have taught a number of us something we did not know yet.

