IPC Buffer Sizes

throwaway984393 · on Nov 18, 2021

IPC buffers are absolutely complicated and specific to the kernel and IPC method. I've worked on several applications that required various kernel tuning to perform well, and I still can't remember how any of it works. You can't really just know how buffers are going to work, because they might change in the next kernel release.

The only reliable way to understand the buffers affecting your individual application is to do functional performance testing. You have to re-create the exact scenario you want to run "in production", slam it with different kinds of way-too-big requests, and record every facet of the system up to the point it fails. Then you can do a deep dive to discover which part caused a problem and tweak settings until it works as needed.
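As a small illustration of measuring rather than assuming, here is a minimal sketch that fills a fresh pipe in non-blocking mode and reports how many bytes the kernel accepted before the write would have blocked. The function name `probe_pipe_capacity` and the 4096-byte chunk size are made up for this example; it assumes a Unix-like system and Python 3.5+ (for `os.set_blocking`). It only probes an idle pipe, so it is a starting point, not a substitute for testing the real workload.

```python
import os


def probe_pipe_capacity(chunk_size=4096):
    """Fill a fresh non-blocking pipe and return how many bytes it accepted."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode makes a full buffer raise instead of hanging.
        os.set_blocking(write_fd, False)
        chunk = b"\0" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # The kernel buffer is full; nothing is draining the read end.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe accepted", probe_pipe_capacity(), "bytes before blocking")
```

The number you get depends on the OS, the kernel version, and any tuning in effect, which is the point: run it on the machine you care about.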

mgaunard · on Nov 18, 2021

If you have special performance requirements, you should build your own queue in userland, where you can not only control how big it is but also completely avoid context switches and system calls.
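To make the shape of such a queue concrete, here is a minimal sketch of a fixed-capacity single-producer/single-consumer ring buffer. The class name `SpscRing` and its methods are invented for this example, and Python is used only to show the structure. A real cross-process version would place the slots and counters in shared memory and use atomic loads and stores with appropriate memory ordering, none of which this sketch attempts.

```python
class SpscRing:
    """Fixed-capacity single-producer/single-consumer ring buffer (sketch)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0  # total items popped; only the consumer advances it
        self._tail = 0  # total items pushed; only the producer advances it

    def try_push(self, item):
        """Store item and return True, or return False if the ring is full."""
        if self._tail - self._head == self._capacity:
            return False
        self._slots[self._tail % self._capacity] = item
        self._tail += 1
        return True

    def try_pop(self):
        """Return (True, item), or (False, None) if the ring is empty."""
        if self._head == self._tail:
            return False, None
        index = self._head % self._capacity
        item = self._slots[index]
        self._slots[index] = None  # drop the reference held by the slot
        self._head += 1
        return True, item


if __name__ == "__main__":
    ring = SpscRing(capacity=2)
    print(ring.try_push("a"), ring.try_push("b"), ring.try_push("c"))
    print(ring.try_pop())
```

The capacity is whatever the application picks, and neither operation ever blocks or enters the kernel; what to do when `try_push` returns `False` (drop, retry, back off) is left to the caller.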

oshiar53-0 · on Nov 19, 2021

Except you still need some kind of signaling mechanism to avoid busy-waiting.

wahern · on Nov 19, 2021

Interesting developments on that front: "User-space interrupts", https://lwn.net/Articles/871113/

Also see Google's fast userspace context switching, which would basically let you implement pipes the way they were originally designed and implemented in Unix, as a kind of coroutine that implicitly passed control flow to the other process when you filled or drained a buffer.

oshiar53-0 · on Nov 19, 2021

Hey, that's the microkernel dream come true.

nly · on Nov 19, 2021

In low latency engineering it's exceedingly common just to pin a core to an IPC task (send or receive) and let it spin on your queue at 100% CPU utilisation.

The kernel CPU scheduler typically adds too much latency (5-10us) to make things like condition variable synchronisation useful.

oshiar53-0 · on Nov 19, 2021

Hence "to avoid busy-waiting."

Some applications with softer low-latency requirements spin for a certain amount of time before going to sleep to await data arrival. Also, an application that is hit with a constant stream of data to process wouldn't even get a chance to block.
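The spin-then-sleep pattern can be sketched roughly as follows. Everything here (the class name `SpinThenBlockQueue`, the `spin_seconds` budget, the use of `threading.Event` as the wakeup mechanism) is an assumption made for illustration, and it assumes a single consumer. In this simplified version the producer signals on every `put`; a tuned implementation would try to skip that when the consumer is known to be spinning.

```python
import threading
import time
from collections import deque


class SpinThenBlockQueue:
    """Single-consumer queue that polls briefly before sleeping (sketch)."""

    def __init__(self, spin_seconds=0.0005):
        self._items = deque()
        self._ready = threading.Event()
        self._spin_seconds = spin_seconds

    def put(self, item):
        self._items.append(item)
        self._ready.set()  # wake the consumer in case it went to sleep

    def get(self):
        # Phase 1: busy-poll until the spin budget runs out.
        deadline = time.perf_counter() + self._spin_seconds
        while time.perf_counter() < deadline:
            try:
                return self._items.popleft()
            except IndexError:
                pass
        # Phase 2: sleep until signalled. Clearing the event before the
        # re-check means a put() that lands in between is not missed.
        while True:
            self._ready.clear()
            try:
                return self._items.popleft()
            except IndexError:
                self._ready.wait()


if __name__ == "__main__":
    q = SpinThenBlockQueue()
    threading.Timer(0.05, q.put, args=("late item",)).start()
    print(q.get())
```

Under a constant stream of data the consumer keeps returning from the first phase and never reaches the blocking path.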

At the other extreme end of RT computing, you usually don't have an OS that blocks using privileged instructions in the first place. You just...schedule things.

mgaunard · on Nov 19, 2021

There are lots of different mechanisms to use for this depending on your need.

You can use a futex, but that still requires the producer to issue system calls. You could also just have the consumer poll at whatever frequency you please (either spin continuously or sleep for a while between checks).

1vuio0pswjnm7 · on Nov 19, 2021

"As you can tell, there is some significant variance across the different forms of IPC and OS variants, and to understand the true limits, you basically need to dive deep into the network buffers and OS-specific handling, influenced by a number of possibly system-specific variables or settings, including default max values, hard-code kernel limits, run-time configurable settings such as sysctl(8)s etc. etc."

NetBSD allows one to experiment with smaller hard-coded limits.

When compiling the kernel, uncomment:

   #options        PIPE_SOCKETPAIR # smaller, but slower pipe(2)