The only reliable way to understand the buffers affecting your individual application is to do functional performance testing. You have to re-create the exact scenario you want to run "in production", slam it with different kinds of way-too-big requests, and record every facet of the system up to the point where it fails. Then you can do a deep dive to discover which part caused the problem and tweak settings until it works as needed.
Also see Google's fast userspace context switching, which would basically let you implement pipes the way they were originally designed and implemented in Unix, as a kind of coroutine that implicitly passed control flow to the other process when you filled or drained a buffer.
The kernel CPU scheduler typically adds too much latency (5-10us) for things like condition variable synchronisation to be useful.
Some applications with softer low-latency requirements tend to spin for a certain amount of time before going to sleep to await data arrival. Also, if such an application is hit with a constant stream of data to process, it won't even get a chance to block.
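A minimal sketch of that spin-then-block pattern (consumer_wait/producer_publish and the SPIN_LIMIT threshold are illustrative, not anyone's real tuning; the right value has to come out of measurement):

    /* Spin-then-block consumer, assuming a single producer that
     * publishes one item at a time. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>

    #define SPIN_LIMIT 10000

    static atomic_bool ready = false;
    static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cv  = PTHREAD_COND_INITIALIZER;

    void consumer_wait(void)
    {
        /* Fast path: burn a bounded number of cycles hoping data shows
         * up before we would pay the scheduler's wake-up latency. */
        for (int i = 0; i < SPIN_LIMIT; i++) {
            if (atomic_load_explicit(&ready, memory_order_acquire))
                return;                 /* data arrived while spinning */
        }
        /* Slow path: give the CPU back and let the producer wake us. */
        pthread_mutex_lock(&mtx);
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            pthread_cond_wait(&cv, &mtx);
        pthread_mutex_unlock(&mtx);
    }

    void producer_publish(void)
    {
        atomic_store_explicit(&ready, true, memory_order_release);
        pthread_mutex_lock(&mtx);       /* pairs with the consumer's wait */
        pthread_cond_signal(&cv);
        pthread_mutex_unlock(&mtx);
    }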
At the other extreme of RT computing, you usually don't have an OS that blocks using privileged instructions in the first place. You just...schedule things.
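In that world the "scheduler" can be as small as a cyclic executive; a toy sketch, with empty stubs standing in for hardware-specific work:

    /* Toy "no OS, nothing ever blocks" scheduler: a cyclic executive. */
    #include <stdbool.h>

    static void poll_input(void)   { /* drain a device FIFO */ }
    static void control_step(void) { /* run the periodic control loop */ }
    static bool tick_elapsed(void) { return true; /* would poll a hardware timer */ }

    int main(void)
    {
        for (;;) {
            poll_input();
            if (tick_elapsed())
                control_step();
            /* No syscalls and no blocking: worst-case latency is the
             * longest trip around this loop. */
        }
    }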
You can use a futex, but that still requires the producer to issue system calls.
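Roughly what that looks like on Linux; a hedged sketch where the futex() wrapper and the avail counter are invented for illustration. The point is that producer_publish ends in a futex(2) syscall on every publish, even when nobody is asleep:

    /* Futex-based wakeup on Linux.  The consumer can sleep cheaply, but
     * the producer still has to enter the kernel to issue FUTEX_WAKE. */
    #define _GNU_SOURCE
    #include <linux/futex.h>
    #include <stdatomic.h>
    #include <stdint.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static _Atomic uint32_t avail = 0;      /* items published so far */

    static long futex(uint32_t *uaddr, int op, uint32_t val)
    {
        return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
    }

    void consumer_wait(uint32_t last_seen)
    {
        /* Sleep until avail moves past the value we last observed;
         * FUTEX_WAIT only blocks if *uaddr still equals last_seen. */
        while (atomic_load(&avail) == last_seen)
            futex((uint32_t *)&avail, FUTEX_WAIT, last_seen);
    }

    void producer_publish(void)
    {
        atomic_fetch_add(&avail, 1);
        /* The wake-up is a system call every time. */
        futex((uint32_t *)&avail, FUTEX_WAKE, 1);
    }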
You could also just have the consumer poll at whatever frequency you please (either continuously spin, or go to sleep for a time).
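A sketch of that purely polling consumer; POLL_INTERVAL_NS is a made-up knob (zero means continuous spinning, anything larger trades latency for CPU), and only the consumer side ever enters the kernel:

    /* Polling consumer: check a shared flag, optionally napping between
     * checks.  The producer only has to store to data_ready; it never
     * makes a system call. */
    #define _POSIX_C_SOURCE 199309L
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <time.h>

    #define POLL_INTERVAL_NS 50000          /* 50us; 0 => continuous spin */

    static atomic_bool data_ready = false;

    void poll_for_data(void)
    {
        const struct timespec nap = { 0, POLL_INTERVAL_NS };

        while (!atomic_load_explicit(&data_ready, memory_order_acquire)) {
            if (POLL_INTERVAL_NS > 0)
                nanosleep(&nap, NULL);      /* sleep, then check again */
            /* else: spin continuously */
        }
        /* ...consume the data... */
    }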
NetBSD allows one to experiment with smaller hard-coded limits.
When compiling the kernel, uncomment:
#options PIPE_SOCKETPAIR # smaller, but slower pipe(2)