Thanks for posting these links. On page 20 of the slide deck, the author enumerates problems with existing user-mode frameworks, and the relevant bullet points state:
• Limited support for interrupts
• Interrupts not considered useful at >= 0.1 Mpps
Does anyone have any insight into why interrupts aren't considered "useful" at this rate? Is this a reference to NAPI and the threshold at which it's better to poll?
At its core, this is a trade-off between saving power and optimizing for latency.
Let's say you are receiving around 100k packets per second. You can fire 100k interrupts per second, sure, no problem. But you'll probably be running at 100% CPU load.
NAPI doesn't really help here. You'll still see quite a high CPU load with NAPI (if you measure it correctly; CONFIG_IRQ_TIME_ACCOUNTING is often disabled), just not the horrible pre-NAPI livelocks.
What will help is a driver that limits the interrupt rate (Intel drivers do this by default; the option is called ITR). Now you've got high latency instead (for some definition of high: you'll see 40-100 µs with ixgbe).
Note that the interrupts are now basically just a hardware timer. We don't need to use a NIC for hardware timers.
This is of course only true at quite high packet rates (0.1 Mpps was maybe the wrong figure, let's say 1 Mpps).
Interrupts are great at low packet rates.
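To make the trade-off concrete, here is a minimal sketch in plain C of the usual hybrid strategy: busy-poll while packets keep arriving, and only re-arm the interrupt and block once the queue has been empty for a while. The helpers rx_poll_burst(), rx_irq_enable(), rx_irq_disable() and wait_for_rx_interrupt() are hypothetical stand-ins for whatever driver or framework you actually use. At 1 Mpps the loop never reaches the blocking path, so the interrupt is pure overhead; at low rates it lets the core sleep.

```c
/* Sketch of a hybrid poll/interrupt RX loop.
 * rx_poll_burst(), rx_irq_enable(), rx_irq_disable() and
 * wait_for_rx_interrupt() are hypothetical placeholders. */
#include <stddef.h>

#define BURST 64
#define IDLE_POLLS_BEFORE_SLEEP 1024

struct pkt;                                            /* opaque packet handle */
size_t rx_poll_burst(struct pkt **pkts, size_t max);   /* non-blocking poll */
void   rx_irq_enable(void);
void   rx_irq_disable(void);
void   wait_for_rx_interrupt(void);                    /* blocks until IRQ fires */
void   process(struct pkt *p);

void rx_loop(void)
{
    struct pkt *pkts[BURST];
    unsigned idle = 0;

    for (;;) {
        size_t n = rx_poll_burst(pkts, BURST);
        for (size_t i = 0; i < n; i++)
            process(pkts[i]);

        if (n > 0) {                    /* traffic keeps us in polling mode */
            idle = 0;
            continue;
        }

        if (++idle < IDLE_POLLS_BEFORE_SLEEP)
            continue;                   /* brief lulls: keep spinning */

        /* Queue has been empty for a while: fall back to interrupts. */
        rx_irq_enable();
        size_t m = rx_poll_burst(pkts, BURST);   /* re-check to avoid a race */
        if (m == 0)
            wait_for_rx_interrupt();
        rx_irq_disable();
        for (size_t i = 0; i < m; i++)
            process(pkts[i]);
        idle = 0;
    }
}
```

The re-check after enabling the interrupt is needed because a packet can arrive between the last empty poll and the IRQ actually being armed.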
All the DPDK stuff about power saving is also quite interesting. Power saving is a relevant topic and DPDK is still quite bad at it. I think dynamically adding and removing worker threads, as well as controlling the CPU frequency from the application (which knows how loaded it actually is!), is currently the most promising approach.
Unfortunately, most DPDK applications allocate threads and cores statically at the moment.
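As a rough illustration of the idea (not how most DPDK apps do it today), here's a sketch of a worker loop that scales its own core's frequency based on what it observes, loosely along the lines of DPDK's l3fwd-power example. It uses librte_power (rte_power_init / rte_power_freq_up / rte_power_freq_down); the thresholds are made up, and EAL init, port/queue setup and error handling are omitted.

```c
/* Rough sketch: per-lcore frequency scaling driven by observed load.
 * Thresholds are made up; EAL/port setup and error handling omitted. */
#include <rte_ethdev.h>
#include <rte_lcore.h>
#include <rte_mbuf.h>
#include <rte_power.h>

#define BURST 32

/* Matches lcore_function_t, so it could be launched via rte_eal_remote_launch(). */
static int worker(void *arg)
{
    uint16_t port = *(uint16_t *)arg;
    unsigned int lcore = rte_lcore_id();
    struct rte_mbuf *bufs[BURST];
    unsigned int empty_polls = 0;

    rte_power_init(lcore);              /* attach this lcore to a cpufreq governor */

    for (;;) {
        uint16_t n = rte_eth_rx_burst(port, 0, bufs, BURST);

        if (n == 0) {
            /* Mostly idle: after many empty polls, step the frequency down. */
            if (++empty_polls > 10000) {
                rte_power_freq_down(lcore);
                empty_polls = 0;
            }
            continue;
        }
        empty_polls = 0;

        if (n == BURST)
            rte_power_freq_up(lcore);   /* full bursts: we're falling behind */

        for (uint16_t i = 0; i < n; i++) {
            /* ... actual packet processing would go here ... */
            rte_pktmbuf_free(bufs[i]);
        }
    }
    return 0;
}
```

The same feedback signals (full bursts vs. long runs of empty polls) are what you'd use to decide when to park or wake additional worker threads.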
From the draft paper: "It should be noted that the Snabb project [16] has similar design goals, ixy tries to be one order of magnitude simpler than Snabb. For example, Snabb targets 10,000 lines of code [15], we target 1,000 lines of code and Snabb builds on Lua with LuaJIT instead of C limiting accessibility."
Snabb aims at practicality while Ixy aims at education, as I understand it, though I haven't read either one yet.
The most expensive thing will be the switch; the cabling is not that expensive.
For a 10G card to work you'll need a pair of SFP+ transceivers: one for the card and another for the other device you're connecting to (a switch, a router, another computer...). The transceivers can be chosen for copper or fiber depending on your needs, and each one runs from $50 to $100. Then you buy fiber or copper Cat6a cable.
But if your two devices are close together (less than 10 meters apart), you can use a twinaxial DAC (Direct Attach Cable): you won't need the transceivers, you can just connect the two devices' SFP+ ports directly with this cable, and it's fairly cheap, for example around €12 for a 3-meter run.
> You can do physical loopback by connecting the two ports together.
This! And this is way better, because you can run all your experiments (perf, etc.) on the same machine. For example, you can run the client on one CPU core and the server on another, and have packets looped over the physical interface.
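If you go that route, pinning the two sides to different cores is the only real trick. Here's a minimal sketch with pthreads; client_fn/server_fn are placeholders for the actual traffic generator and receiver.

```c
/* Sketch: pin a client thread and a server thread to different cores
 * so they drive the two looped-back ports independently.
 * client_fn()/server_fn() are placeholders for your actual traffic code. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

static void *client_fn(void *arg) { (void)arg; /* generate traffic on port 0 */ return NULL; }
static void *server_fn(void *arg) { (void)arg; /* receive/echo on port 1 */ return NULL; }

static pthread_t spawn_pinned(void *(*fn)(void *), int core)
{
    pthread_t t;
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(core, &set);

    pthread_create(&t, NULL, fn, NULL);
    pthread_setaffinity_np(t, sizeof(set), &set);  /* pin to one core */
    return t;
}

int main(void)
{
    pthread_t c = spawn_pinned(client_fn, 1);   /* client on core 1 */
    pthread_t s = spawn_pinned(server_fn, 2);   /* server on core 2 */

    pthread_join(c, NULL);
    pthread_join(s, NULL);
    return 0;
}
```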
The problem with these NICs is that they are PCI or even ISA. Also, they are quite rare compared to, e.g., an igb/e1000e NIC. A driver for an igb/e1000e NIC would be quite similar to the ixgbe driver.
However, since the main goal of using these NICs seems to be support in virtual environments, just using virtio is better.
Demystifying Network Cards
- https://events.ccc.de/congress/2017/Fahrplan/events/9159.htm...
- https://streaming.media.ccc.de/34c3/relive/9159