Nice, except recvmmsg() is broken.


    The timeout argument does not work as intended.  The timeout is
    checked only after the receipt of each datagram, so that if up to
    vlen-1 datagrams are received before the timeout expires, but then no
    further datagrams are received, the call will block forever.
Which makes it useless for any application that wants to service data in a short time frame. The only way around it is to use a "self clocking" method. If you want to receive packets at least every 10ms, set a 10ms timeout... and then be sure to send yourself a packet every 10ms.
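A rough sketch of the "send yourself a packet" half of that workaround, assuming a UDP socket bound to loopback. The names open_loopback_udp, send_tick and TICK_BYTE are made up for illustration. A real application would call send_tick() from a timer or a separate thread every 10ms and would have to filter the tick datagrams out of its real traffic.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical marker: a one-byte datagram the application treats as a tick. */
#define TICK_BYTE 0x00

/* Create a UDP socket bound to 127.0.0.1 on an ephemeral port.
 * Returns the descriptor, or -1 on failure. */
int open_loopback_udp(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send one tick datagram to the socket's own bound address, so that a
 * blocked receive call on the same socket wakes up.
 * Returns 0 on success, -1 on failure. */
int send_tick(int fd)
{
    struct sockaddr_in self;
    socklen_t len = sizeof self;
    if (getsockname(fd, (struct sockaddr *)&self, &len) < 0)
        return -1;

    unsigned char tick = TICK_BYTE;
    ssize_t n = sendto(fd, &tick, 1, 0, (struct sockaddr *)&self, len);
    return n == 1 ? 0 : -1;
}
```

Each tick counts as a received datagram, which gives recvmmsg() a chance to check its timeout and return instead of waiting indefinitely for real traffic.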

I've done similar tests with UDP applications. It's possible to get 500K pps on a multi-core system with a test application that isn't too complex and doesn't use too many tricks. The problem is that the system spends 80% to 90% of its time in the kernel doing I/O, so you have no time left to run your application.

Another alternative is pcap and PF_RING, as seen here: https://github.com/robertdavidgraham/robdns

That might be useful. Previous discussion on robdns: https://news.ycombinator.com/item?id=8802425

The Snabb switch parallelisation experiments are getting to 35 million packets per second doing real work (encap/decap) in Linux userspace. [1]

[1] https://groups.google.com/forum/m/#!topic/snabb-devel/_vKQgC...

Tunnel encapsulation is real work, but not all real work can be mapped to tunnel encapsulation.

The point of all that work to context switch into processes to handle small amounts of network I/O is that very often THE CORRECT SOFTWARE ARCHITECTURE is for multiple address-space-separated processes to be doing small amounts of network I/O. That I/O "means something" to a larger data model being implemented by the software.

It's true that for some tasks that "look like routing" there's no point to having that kind of external data model. The packets are the data being operated on. So there's little value in process separation and you might as well DMA them all streamwise into a single process to do it. And that's great stuff, but AFAICT it's really not what the linked article is about.

Ultimately, all those packets are going to end up in conventional processes, because that's where conventional processing needs to happen. There are very good reasons why we like our page-protected address space separation in this world!

> Nice, except recvmmsg() is broken. [...] The only way around it is to use a "self clocking" method.

Apparently you can also use the SO_RCVTIMEO socket option, which is a way to specify a timeout for all receive operations on the socket.
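A minimal sketch of setting that option, assuming a 10ms budget. The helper names set_recv_timeout_ms and recv_timed_out are made up for illustration. It uses plain recv() to show the timeout firing. How the option interacts with a partially filled recvmmsg() batch is not shown here and is worth testing on your kernel.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Set a receive timeout, in milliseconds, on a socket.
 * Returns 0 on success, -1 on failure (errno set by setsockopt). */
int set_recv_timeout_ms(int fd, long ms)
{
    struct timeval tv;
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Given the return value of a receive call that has just failed or
 * succeeded, report whether it failed because the timeout expired. */
int recv_timed_out(ssize_t n)
{
    return n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

With the option set, a blocking receive on an idle socket returns -1 with errno set to EAGAIN or EWOULDBLOCK once the timeout expires, so the caller can go back to its other work.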
