First of all, what a disorganized article. Who the hell edited this?
Secondly, the article posits that the kernel is the problem, which seems right. But then it makes a leap to claiming that the way to fix it is to write your own driver.
Who said? I certainly don't agree. You are still running a multi-user kernel, but you've now sacrificed its stability by running a parallel network stack. Linux wasn't written to work that way, and it's hard to know what you're getting into when you subvert your OS like that. This article would be much better if it talked about the other options out there. For example...
Why not take out the middle-man entirely with something like OpenMirage (http://www.openmirage.org/)? Get rid of the lowest-common-denominator performance you get using a general-purpose OS and make your application the OS. Link in whatever OS-like services (network, filesystem, etc) you may need. Talk to the hardware directly without going through a middle-man and without ugly hacks.
Well, he's mostly suggesting using a user-space networking stack provided by your vendor (Intel's DPDK). But writing your own userspace driver from scratch is actually very simple -- I've done it for Intel's 1Gbps and 10Gbps cards before the DPDK was made publicly available. We're talking ~1000 lines of user-space code, and less than a month of work. Writing your own userspace networking stack is of course more complicated, depending on exactly which level of protocol support you need. Supporting, say, ARP or EtherChannel is trivial, while a production-quality and interoperable TCP stack will be man-years of work.
But having written systems doing 10M TCP connections at 10Gbps, I strongly believe that you want to relegate the kernel to the role of a management system. Having the system split across the kernel and user-space will lead to unacceptable performance, pretty much no matter where you split things. And having the full application in the kernel would be a nightmare for development, debugging, and deployment. (And I sure as hell am not going to choose a pre-alpha research project over Linux as the base of telecoms systems.)
> First of all, what a disorganized article. Who the hell edited this?
I have to agree. It didn't really turn out as well as I wanted. The talk is really good, but I could never make this read smoothly.
Fortunately I think the content is excellent and that will hopefully save the day.
My take is along the lines of the earlier embedded systems comment. This is the sort of thing you would have done in a real-time OS snagging packets directly in an ISR and queuing them to a task for later processing. The processor and memory suggestions are just common practice in that arena.
But that kind of sucks for numerous reasons. It's great having a widely supported OS that you can leverage while replacing the bits you need to meet your goals. That's pretty genius in my book.
>I have to agree. It didn't really turn out as well as I wanted. The talk is really good, but I could never make this read smoothly.
Props for stepping forward and saying this :-)
I think it's because you're trying to capture the content of a talk largely by presenting the bullet points from its slide deck, but in doing so you lose a lot of the context that came from the presentation itself.
Your summary at the bottom is great -- I think it should be the whole article. This would offer viewers a snapshot of what they can expect out of the video so that they don't miss anything (for example, a very common takeaway I'm seeing from this and from people I show it to is that this is "just" a network stack hack -- no mention of his proposal for multi-core optimization, etc).
There is one more alternative, the rump kernel, mentioned in the comments on the original posting about "direct" network processing. Recent developments were discussed on the LuaJIT mailing list (the thread is well worth reading in the "minus OS" context, by the way).
The answer is right there in the article:
"Yet, what you have is code running on Linux that you can debug normally, it’s not some sort of weird hardware system that you need custom engineer for. You get the performance of custom hardware that you would expect for your data plane, but with your familiar programming and development environment."
Maybe Mirage will pan out, but for right now it's in the pre-alpha stage. We've heard about the glories of microkernels before (GNU Hurd, anyone?).
I see. So the argument is that because something that was considered crazy panned out against the odds, therefore everything considered crazy is actually a good bet?
That seems like dubious logic.
Also, unlike Google's bet on commodity hardware, I don't see the millions of dollars of savings to be had if the microkernel idea works out. At best you'll get an incremental performance benefit and easier debugging. It'd be different if there were no open-source OSes, so that rolling your own was the only alternative to paying millions in licensing fees.