
Not really. Last I checked, stock Ubuntu was choking at around 60k concurrent connections for no apparent reason, and Fedora could handle a lot more. This was a couple of years back, but I'd demand numbers before assuming the situation has changed.



The entire purpose is to NOT route stuff through the kernel.

TFA explains how to do it, and I've done it myself. You can pass a flag to the Linux kernel at boot that limits it to the first N cores; I usually use 2. The remaining cores sit completely idle, and Linux will not schedule any threads on them.
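For reference, the boot flag in question is the 'isolcpus' kernel parameter (named explicitly further down the thread). A minimal sketch for a GRUB2 setup, assuming an 8-core box where cores 2-7 are to be reserved for the app (the core numbers are just an example):

    # /etc/default/grub -- keep the Linux scheduler off cores 2-7,
    # leaving cores 0-1 for the kernel and everything else
    GRUB_CMDLINE_LINUX_DEFAULT="isolcpus=2-7"

    # then regenerate the grub config, e.g. 'update-grub' on Debian/Ubuntu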

Then you build an app that works more or less like Snabb Switch, talking directly to the Ethernet adapter and bypassing the kernel (so Ubuntu vs. Fedora vs. whatever isn't relevant in the least).

So, you launch your app as a normal userland app. Schedule each of your app's threads on the remaining CPU cores however you want (I schedule one thread per core). Linux will not schedule its own threads or threads from any other process on those cores, so you own them completely; it'll never context switch to another thread.
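A minimal sketch of the per-thread pinning in C, assuming cores 2-7 were isolated as above (core numbers, thread count, and names are placeholders, not from the post):

    /* build: cc -O2 -pthread pin.c */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <stdio.h>

    /* Pin the calling thread to a single core; with isolcpus=2-7 the
       kernel won't put anything else there, so the thread owns it. */
    static int pin_to_core(int core)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(core, &set);
        return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    }

    static void *worker(void *arg)
    {
        int core = *(int *)arg;
        if (pin_to_core(core) != 0)
            fprintf(stderr, "failed to pin to core %d\n", core);
        /* ... poll the NIC rings / do the packet work from here ... */
        return NULL;
    }

    int main(void)
    {
        enum { FIRST_ISOLATED = 2, NUM_WORKERS = 6 };  /* example numbers */
        pthread_t threads[NUM_WORKERS];
        int cores[NUM_WORKERS];

        for (int i = 0; i < NUM_WORKERS; i++) {
            cores[i] = FIRST_ISOLATED + i;             /* one thread per core */
            pthread_create(&threads[i], NULL, worker, &cores[i]);
        }
        for (int i = 0; i < NUM_WORKERS; i++)
            pthread_join(threads[i], NULL);
        return 0;
    }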

That means when you SSH in, it's running on a thread on cores 1 or 2 only. Same with every other Linux process but your own. Other than sucking up available memory bandwidth and potentially thrashing your L2/L3 cache, these other processes don't impact your own app at all.

Thus, even though you're running stock Linux, and SSH and gdb work, and you've got a normal userland app, your app is the ONLY app running on the remaining cores, and you're talking directly to the hardware. It's just as fast as doing everything without a kernel, except it costs you 2 cores. IMO, it's more than worth it for the convenience.

This approach is so easy that there's really no reason not to do it. There were so many situations in the past where I wanted the performance of those single-app kernels, but it just wasn't worth the dev effort. That's no longer true.


Linux will often still schedule kernel threads to run on those cores, so they are not totally isolated. There are also cache effects: if your architecture shares caches between cores, it can be worth "wasting" a neighboring core to keep ssh and gdb from thrashing your cache.

Oh, and don't forget to set up IRQ affinity so that none of those cores end up handling interrupts.

There is interesting research by Siemens that takes this kind of isolation a step further and uses virtualization extensions to isolate resources (cores, for example):

https://github.com/siemens/jailhouse


> Linux will often still schedule kernel threads to run on those cores, so they are not totally isolated.

Have a look at cpuset[0]. You can forcibly move kernel threads into a given set (even pinned threads), and then force that set onto whatever cores you want.

[0] http://www.kernel.org/doc/man-pages/online/pages/man7/cpuset...
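For anyone curious, a rough sketch of what that looks like with the raw cpuset filesystem, along the lines of the examples in cpuset(7) (mount point and core numbers are just examples):

    # mount the cpuset pseudo-filesystem (example mount point)
    mkdir -p /dev/cpuset
    mount -t cpuset cpuset /dev/cpuset

    # create a "system" set confined to cores 0-1 and memory node 0
    mkdir /dev/cpuset/system
    echo 0-1 > /dev/cpuset/system/cpus
    echo 0   > /dev/cpuset/system/mems

    # herd every existing task (including movable kernel threads) into
    # the system set; tasks that can't be moved will report an error
    for pid in $(cat /dev/cpuset/tasks); do
        echo $pid > /dev/cpuset/system/tasks
    done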


Ah, good stuff, thanks for posting that. I'll have to look into it; I haven't tried cpusets yet.

In the past I tried migrating kernel threads; the migration did work, but the system became unstable, so I gave up on that.


> Oh, and don't forget to set up IRQ affinity so that none of those cores end up handling interrupts.

Perhaps this is what you meant, but this is straightforward: disabling 'irqbalance' is one simple way to do it. Alternatively, you can configure it to cooperate with 'isolcpus' by using 'FOLLOW_ISOLCPUS'.
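For the first option, on a systemd-based box that's just the following (older init systems use 'service' and 'chkconfig' instead):

    # stop irqbalance and keep it from coming back at boot, so IRQ
    # affinities stay wherever you (or the kernel defaults) put them
    systemctl stop irqbalance
    systemctl disable irqbalance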


Disabling it just disables the auto-balancing; the IRQ layout becomes static, which might not be the exact configuration you want.


Yes, if you don't want all interrupts to be handled by CPU0 you'll need a different approach. But doing anything else may be more difficult than it sounds. Do you know if the bug mentioned here is fixed?

http://serverfault.com/questions/380935/how-to-ban-hardware-...

http://code.google.com/p/irqbalance is returning a 403 for me, so I haven't been able to check.


Setting IRQ affinity for network controllers to either follow or not follow the "high priority" processes seems to do the trick.

I usually do stuff described here:

https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.tx...

Sometimes following ends up better (same cache), and sometimes isolating ends up better.
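In case it helps anyone, the mechanics in that doc boil down to writing hex CPU bitmasks into /proc/irq/<n>/smp_affinity (the interface name and IRQ number below are just examples; NICs with MSI-X will have several IRQs to pin):

    # find the IRQ(s) your NIC is using
    grep eth0 /proc/interrupts

    # pin, say, IRQ 24 to core 2 only: the value is a hex CPU bitmask,
    # and core 2 is bit 2, i.e. 0x4
    echo 4 > /proc/irq/24/smp_affinity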


This sounds really cool, but I didn't see where the original post talks about this hybrid setup with the Linux kernel (I read the first part and skimmed the rest, which I had already read back when it came out).

Do you know of other resources for this approach? I have seen cpusets but not used them.


Do you have any data about what the bottleneck was?



