

Non-blocking buffered file read operations - mtanski
https://lwn.net/Articles/612483/

======
mtanski
The point is to reduce the latency introduced between event loops and IO
thread pools. Not every application can use sendfile(), for a number of
reasons (post-processing, TLS, ...)

I'll be talking about this at the LSF MM/FS summit in Boston on Monday if any
of you guys are in attendance.

~~~
nteon
very neat. It definitely seems useful to be able to do a speculative
read/write from an eventloop, and only defer the IO to another thread if it
would block.

------
JoeAltmaier
I'm ignorant - is reattempting the read(NOBLOCK) the way to confirm the block
is available? Does that amount to polling? Can I issue multiple async reads
and then wait for the first of them to complete?

As described, I can't tell if it's possible to have a single read-completion
thread that processes any async reads as they complete, which is a powerful
design pattern. Or better yet, to specify a completion callback.

~~~
mtanski
What this approach does is read the data for you only if it's in the kernel
page cache. If it's not, it's your responsibility to get it somehow (maybe
through a regular read call). Today it does not enqueue any kind of read-ahead.

There's a long history (12+ years) of various buffered disk IO proposals
along similar lines to what you're asking (callbacks, notifications). All of
them failed to go upstream for a number of reasons: 1) they pushed too much
complexity into the kernel (complicating the regular code paths), 2) they
caused performance regressions in sync read/write, or 3) they were based on
functionality that the upstream developers did not want (tasklets).

With that background of various failed attempts, my approach focuses on a
common use case in modern servers: a network event loop and a disk IO thread
pool. That's pretty much how everybody works around the lack of async disk
IO. The big problem with that is additional latency. That latency comes from
synchronization (between two threads) and queuing (cached data stuck in a
queue behind long-running requests).

My solution tackles that latency by letting us answer in the same network
thread whenever the data is cached by the kernel.

I tried this with Samba (kernel guys like this use case) and I was able to
get close to sync-like latency / throughput, vs. ~23% lower numbers with a
pure thread pool. I've observed similar numbers in our application as well.

Finally, using this you can implement what you're proposing in user space
with a thread pool, without giving up too much performance.

