
How to drop 10M packets per second - signa11
https://blog.cloudflare.com/how-to-drop-10-million-packets/
======
blattimwind
I recently heard about VPP for the first time; VPP is one of those user-space
networking frameworks. Apparently it's the fastest thing currently around by a
large margin ("10 Mpps per core? Too slow for me!"); there are hints that
Cisco's ASR9000 routers use the same core tech to do all routing in software.

~~~
baybal2
Important note: DPDK and VPP are not faster by virtue of running in
userspace. It is more accurate to say that they provide the fastest path
between the interrupt handler and the userspace code. The application then
has to reimplement most of the network stack functions that would otherwise
run in the kernel. The speedup is there because the application author can
freely slim the network stack down to the minimum. Beyond that, DPDK put
deliberate effort into better driver buffer management, ensuring a zero-copy
path, and interrupt and syscall efficiency.

~~~
monocasa
AFAIK, you don't even run interrupts off the NIC if you're doing DPDK; the
overhead is too high. You poll instead, and just dedicate a core.
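
To make the polling idea concrete, here is a minimal single-threaded sketch
of a burst-style receive ring. It is not DPDK code: `rx_ring`, `ring_produce`
and `ring_rx_burst` are hypothetical names, and `ring_produce` merely stands
in for the NIC writing packets into memory. DPDK's real entry point for this
pattern is `rte_eth_rx_burst()`; a real driver would also need memory
barriers, which are left out here.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8 /* power of two, so indices wrap with a simple mask */

/* Hypothetical receive ring shared by the "NIC" and the polling core. */
struct rx_ring {
    uint32_t slots[RING_SIZE]; /* each slot stands in for one packet */
    size_t head;               /* next slot the "NIC" will write */
    size_t tail;               /* next slot the poller will read */
};

/* Stand-in for the NIC delivering a packet. Returns 0 if the ring is full. */
int ring_produce(struct rx_ring *r, uint32_t pkt)
{
    if (r->head - r->tail == RING_SIZE)
        return 0;
    r->slots[r->head & (RING_SIZE - 1)] = pkt;
    r->head++;
    return 1;
}

/* Non-blocking burst receive: take up to max packets and return how many
 * were available. Returning 0 is normal; the caller just polls again. */
size_t ring_rx_burst(struct rx_ring *r, uint32_t *out, size_t max)
{
    size_t n = 0;
    while (n < max && r->tail != r->head) {
        out[n++] = r->slots[r->tail & (RING_SIZE - 1)];
        r->tail++;
    }
    return n;
}
```

The dedicated core would call `ring_rx_burst` in a tight `for (;;)` loop and
process whatever comes back. No interrupt is involved, which is why the core
stays at 100% even when the link is idle.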

~~~
baybal2
I'm unsure. What I'm imagining is that the NICs DPDK is made for have an
option to raise interrupts like "hey, we have just used up 50% of the buffer!
Driver, go offload packets to RAM in bulk now".

~~~
jsolson
DPDK's drivers are (as far as I know?) all poll-mode:
[https://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html](https://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html)

~~~
shaklee3
Correct

------
barbegal
"With XDP we can drop 10 million packets per second on a single CPU." is not
really correct: the packets are being dropped on the network card, which has
its own processor. The packets never get transferred to the main CPU.

~~~
rayiner
That’s not true. While there are NICs that can drop packets in hardware (e.g.
Chelsio), that is not what XDP does. XDP runs eBPF programs in a
driver-implemented receive callback. The receive callback happens at the
lowest level of the Linux network stack, when an interrupt is delivered to a
processor and a packet descriptor is queued by the NIC, but before the generic
packet handling metadata (SKB) is allocated. So XDP operates on the packet in
place, but the code runs on the host CPU.

~~~
iajrz
When you say "in place" you mean the data is still on the NIC? Or is the data
'elsewhere in RAM'?

I'm curious because this would be the difference between "scale with more
cores" or "scale with more NICs".

~~~
penagwin
If I read that correctly the data is still on the NIC. The CPU receives an
interrupt - notifying it of a new packet, which it then decides to drop? Thus
the CPU is the one dropping packets, but the data (most of it at least) is
still on the NIC.

Please correct me if I'm wrong.

~~~
emmericp
This all happens after the packet is transferred over PCIe to the CPU/memory
(even the interrupt happens after DMA transfer). Otherwise the XDP code
wouldn't have access to the packet to filter it; the example doesn't just drop
all packets, it drops all packets matching a /24 subnet which happens to be
all packets here ;)
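
For illustration, the per-packet decision can be sketched as a plain
userspace C function. This is not an eBPF program and not the article's code:
`filter_packet`, the `verdict` enum, the choice of the destination address
and the 198.18.0.0/24 target are all assumptions made for the sketch. What it
does share with a real XDP program is the shape: it only sees a `data` /
`data_end` pair pointing at packet bytes already in host memory, and it must
bounds-check before reading any header.

```c
#include <stddef.h>
#include <stdint.h>

/* Named after the kernel's XDP_DROP / XDP_PASS actions, but local to this
 * sketch. */
enum verdict { VERDICT_DROP, VERDICT_PASS };

#define ETH_HDR_LEN    14
#define ETHERTYPE_IPV4 0x0800
#define IPV4_MIN_HDR   20

/* Assumed target subnet: 198.18.0.0/24, in host byte order. */
#define DROP_NET  0xC6120000u
#define DROP_MASK 0xFFFFFF00u

static uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decide the fate of one Ethernet frame in [data, data_end). */
enum verdict filter_packet(const uint8_t *data, const uint8_t *data_end)
{
    /* Too short to hold Ethernet + minimal IPv4 header: leave it alone. */
    if ((size_t)(data_end - data) < ETH_HDR_LEN + IPV4_MIN_HDR)
        return VERDICT_PASS;

    /* EtherType sits at byte offset 12 of the Ethernet header. */
    if (load_be16(data + 12) != ETHERTYPE_IPV4)
        return VERDICT_PASS;

    /* IPv4 destination address sits at byte offset 16 of the IP header. */
    uint32_t daddr = load_be32(data + ETH_HDR_LEN + 16);

    return ((daddr & DROP_MASK) == DROP_NET) ? VERDICT_DROP : VERDICT_PASS;
}
```

The function reads the destination address out of the packet itself, so the
bytes have to be in RAM before it can run.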

~~~
penagwin
Ah ok! I was kinda confused by that xD

------
ck2
now find a way for me to enter an unknown/untrusted website that is using your
"ddos protection" without having to allow inline javascript to execute

because as you clearly know/demonstrate, the web can't be trusted by default

------
JensRex
I'm an electrician. I can help you drop packets really quick. Available for
hire.

