Also, Katran is very cool. Facebook is doing really cutting edge work in the networking space.
I'm curious where you work. I've been doing eBPF in production for a while as well, but our control plane is Go / C.
Unlike alternatives such as DPDK, eBPF programs are tightly integrated into the kernel. They let you write kernel code that the kernel can safely run. They are simple to write and easy to install (it's just a syscall), and there is tooling in Rust and other languages (check out BCC) that lets you compile to eBPF bytecode. You're also not limited to networking: eBPF programs can hook into many places in the kernel, and there are already efforts to build new system performance tools using eBPF.
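To give a feel for what "kernel code that the kernel can safely run" looks like, here is a rough user-space sketch of the shape of a small packet filter. It is only a model: `pkt_ctx`, `verdict` and `drop_udp` are made-up names for illustration, not the kernel API. A real XDP program would include <linux/bpf.h>, take a `struct xdp_md *` and return `XDP_DROP` or `XDP_PASS`. The part that carries over is the explicit bounds check before any packet byte is read, which is the kind of thing the in-kernel verifier insists on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins, NOT the kernel API. A real XDP program
 * would include <linux/bpf.h>, take a struct xdp_md *, and return XDP_DROP
 * or XDP_PASS. */
enum verdict { VERDICT_DROP, VERDICT_PASS };

struct pkt_ctx {
    const uint8_t *data;     /* first byte of the frame */
    const uint8_t *data_end; /* one past the last byte of the frame */
};

enum { ETH_HDR_LEN = 14, IPV4_MIN_HDR_LEN = 20, PROTO_UDP = 17 };

/* Drop IPv4/UDP frames and pass everything else. */
enum verdict drop_udp(const struct pkt_ctx *ctx)
{
    assert(ctx != NULL && ctx->data != NULL && ctx->data <= ctx->data_end);
    size_t len = (size_t)(ctx->data_end - ctx->data);

    /* Bounds check before touching any packet byte. Frames too short to
     * hold an Ethernet header plus a minimal IPv4 header are passed on. */
    if (len < (size_t)(ETH_HDR_LEN + IPV4_MIN_HDR_LEN))
        return VERDICT_PASS;

    /* EtherType is stored big-endian at bytes 12..13 of the frame. */
    const uint8_t *eth = ctx->data;
    uint16_t ethertype = (uint16_t)((eth[12] << 8) | eth[13]);
    if (ethertype != 0x0800) /* not IPv4 */
        return VERDICT_PASS;

    const uint8_t *ip = eth + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4) /* IP version nibble */
        return VERDICT_PASS;

    /* Byte 9 of the IPv4 header is the protocol field. */
    return ip[9] == PROTO_UDP ? VERDICT_DROP : VERDICT_PASS;
}
```

In the real thing you'd compile something like this to eBPF bytecode with clang and attach it to an interface; the logic itself stays about this small.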
Do you consider eBPF (XDP) to be the clear next step in this space?
Check this out, it spells out why DPDK isn't a great solution.
The downsides of XDP are that it requires newer kernel releases, and that XDP eBPF programs do not provide chaining out of the box and are limited in size (they can be chained, you just have to do it yourself).
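For what "do it yourself" chaining can look like, here is a minimal user-space sketch of the dispatcher pattern: a root function walks a table of small stages until one returns a final verdict. All names (`chain_verdict`, `eth_frame`, `run_chain`, the two stages) are hypothetical and only for illustration. In actual eBPF the usual mechanism is tail calls through a `BPF_MAP_TYPE_PROG_ARRAY` map via `bpf_tail_call()`, which this plain C loop only approximates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts: CHAIN_NEXT means "no decision, try the next stage". */
enum chain_verdict { CHAIN_DROP, CHAIN_PASS, CHAIN_NEXT };

struct eth_frame {
    const uint8_t *data;
    size_t len;
};

typedef enum chain_verdict (*stage_fn)(const struct eth_frame *f);

/* Stage 1: drop frames too short to carry an Ethernet header. */
enum chain_verdict drop_runts(const struct eth_frame *f)
{
    return f->len < 14 ? CHAIN_DROP : CHAIN_NEXT;
}

/* Stage 2: drop IPv6 (EtherType 0x86DD), defer on everything else. */
enum chain_verdict drop_ipv6(const struct eth_frame *f)
{
    if (f->len < 14)
        return CHAIN_NEXT; /* too short to classify, let others decide */
    uint16_t ethertype = (uint16_t)((f->data[12] << 8) | f->data[13]);
    return ethertype == 0x86DD ? CHAIN_DROP : CHAIN_NEXT;
}

/* Root dispatcher: run stages in order and stop at the first final verdict.
 * If every stage defers, the frame is passed. */
enum chain_verdict run_chain(const stage_fn *stages, size_t n,
                             const struct eth_frame *f)
{
    assert(stages != NULL && f != NULL);
    for (size_t i = 0; i < n; i++) {
        enum chain_verdict v = stages[i](f);
        if (v != CHAIN_NEXT)
            return v;
    }
    return CHAIN_PASS;
}
```

Keeping each stage tiny is also one way to live with the program size limit, since the limit applies per program rather than to the whole chain.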
I'm not saying DPDK is perfect. As a user, here are some of the most annoying things about DPDK:
- 100% CPU, even in down time. There has been recent work on power management for DPDK, but it is still quite limited.
- Debugging is difficult. Valgrind doesn't even work ootb.
- Very limited tool set compared to Linux. For example, no tcpdump.
- Setup is cumbersome. Hugepages must be allocated soon after reboot, NICs must be bound to uio, ...
- Ad-hoc Layer 4-7.
That said, some of the packet processing libraries that come with DPDK are awesome. Once you get through the first few hurdles, the dev experience is actually quite nice. I think combining DPDK with XDP is very promising.
There is also a recent talk on how AF_XDP can be optimised further to get closer to DPDK speed (http://vger.kernel.org/lpc-networking2018.html#session-11), which concludes "DPDK still faster for 'notouch', but AF XDP on par when data is touched". And the latter is what matters.
I think combining the two definitely has huge potential:
- All the Linux tooling can be used again, since the in-kernel drivers are used.
- No 100% busy polling needed; it's definitely not a must for all workloads.
- Easy setup, since the kernel drivers are simply used as-is.
- Vendors only have to maintain their kernel drivers, not DPDK ones, so less cost.
- Users can simply switch from one NIC to another without any hassle.
- The DPDK libraries for application development can be fully reused.
Sounds like a big win to me for both worlds.
It requires kernel 4.18+, and you can use my guide to build a test application very quickly: https://github.com/pavel-odintsov/fastnetmon/wiki/af_xdp-tes...
Edit: Oh boy. That's so obviously not a criticism of the talk.
On the other hand, Cilium.io made a good summary of that.