

Ask HN: I/O of Web Server - ejanus

I read a couple of articles and papers and came to the conclusion that the event-driven pattern is the way to go if one is targeting a Unix platform. This is because the Unix (Linux) kernel does not have asynchronous I/O (unlike Windows). I checked some open-source web server code and noticed that some servers still use the synchronous Unix multiplexing calls (select and epoll), bare bones.
However, in the last few hours I stumbled on an article written by a top dog at Mailinator which argued that threads are still faster than the event-driven (Reactor) approach. He argued that context switching and memory footprints are just theoretical concerns, and that what one should avoid is a thread pool; threads should be started up front.
Now my question is: which one is best performance-wise in your experience, and which is easier to code and maintain? I would also like to go through more papers, articles, and blogs in order to fully understand the limitations of I/O and the best way to handle it.
======
trentnelson
The first 80 or so slides of this deck go into detail regarding "async I/O":
[https://speakerdeck.com/trent/pyparallel-how-we-removed-the-...](https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores).

> He argued that context switching and memory foot prints are just theoretical
> stuff.

What? Can you cite your source?

~~~
ejanus
Thanks for the link. This is the article I cited:
[http://www.mailinator.com/tymaPaulMultithreaded.pdf](http://www.mailinator.com/tymaPaulMultithreaded.pdf).

------
zbuf
Your post is very broad, but I understand the core question you're asking. I
have done some work with streaming servers in the past on BSD and Linux and,
whilst my knowledge could be dated, I'm currently sitting at home with a cold,
so let me share some information that might be helpful.

Last I checked, it's true that on Linux there is no non-blocking API for disk
I/O: read(), write(), or touching mmap()ed pages has no option but to pause
the process in the 'D' state. Network I/O can be made non-blocking, with lots
of options around epoll() for edge- or level-triggered notification, so no
worries there.

So if you are going to write a fully event-driven/co-routine program that
streams from disk and out of a network, you'll need to work with some options
to do the disk I/O while the process continues to run.

There are several ways to keep the main thread/process non-blocking while
worker threads or processes do the disk I/O.

You can hand control of the socket descriptor to a worker thread for
sendfile(), which is likely to be the simplest to develop and the fastest if
your application suits it.

Alternatively, you can totally abstract away the I/O by having worker threads
sendfile() data down a pipe, ready for splice() by the main event-driven
thread. Effectively the pipe descriptor now becomes your 'async' disk I/O,
and you can probably write your own nice API on top of it. This becomes
effective if your main thread needs to do some nasty things like inserting
extra data within the stream.

If you're doing anything more than pure data movement then the main thread may
be CPU-bound, in which case a useful solution is to fork() a process per CPU,
all of which run accept() on the same socket. This assigns a client to a
process on connection, and is a lovely cheap way to get nice concurrency for
very little development.

Once you do have multiple forks, you may find that having a few threads
per CPU, combining non-blocking network I/O with blocking (sync) disk I/O,
is valid for your task; this does well at improving the "average"
responsiveness of the server, but not the "worst-case" responsiveness.

In some practical cases, you may be able to use a combination of mmap(),
posix_madvise() and mincore() as a poor man's way of despatching lots of
I/O into the kernel's I/O scheduler from one process. But this comes with
few guarantees, and there's no way to be woken when the data is 'ready'.

The suggestion to use a thread per client may be valid today, I don't know.
But threads have historically been a heavyweight solution (if most are idle
at any one time, waiting on some I/O). Certainly it used not to be advisable
to run, e.g., 100,000 threads because of the memory that takes up, whereas
those sorts of numbers can easily be accommodated using the methods above.

If the number of clients is small and they are all independent, then I would
consider using a process per client for the additional crash protection it
gives (over threads).

In general, I have seen such applications more as a 'broker' which spends most
of its time idle while it waits for data from either the disk or network end.

Oh, and maybe libaio will help, though I have not used this myself. It aims to
be an async disk I/O API, but IIRC its implementation on Linux was as some
form of worker threads.

I believe a good reference for this is:
[http://www.kegel.com/c10k.html](http://www.kegel.com/c10k.html). Despite its
age, I skim read this just now and I think a lot of its advice is still
relevant.

~~~
SamReidHughes
> Last I checked it's true that on Linux there is no non-blocking API for disk
> I/O,

There is (see io_submit and friends), but the interface is incomplete and it's
often buggy. A thread pool works far better, and the performance overhead (for
anything in your 2.5" or 3.5" form factor) is minimal to non-existent.

------
checker659
What about epoll / kqueue?

