

Scalable Network Programming - kirubakaran
http://www.scribd.com/vacuum?url=http://bulk.fefe.de/scalable-networking.pdf

======
huhtenberg
<http://www.kegel.com/c10k.html> is more concise and it has a better coverage
of available options. It is _the_ canonical reference for anyone asking "how
to handle 10000 clients".

~~~
neilc
It's a great read, although it hasn't been updated since 2006, so it should be
read with care.

------
signa11
oldie but a goldie. obligatory non-scribd link (<http://bulk.fefe.de/scalable-networking.pdf>). if you are in this business, get 'Network Algorithmics'. read it, be one with it! it is the SICP of networking.

~~~
yagibear
"sicp of networking" is a bit over the top. IMHO it gives undue emphasis to
Varghese's research interest in packet classifiers, and is filled with typos.

------
davidw
It's been a while since I've checked, but I think he's mistaken that Apache is
one process per connection. I think it fires up multiple processes just
because that's best when each request takes a while to serve. He doesn't really
cover that aspect of things much. Serving static files quickly is, if not quite
a solved problem, not terribly hard. Doing applications, with significant
processing overhead, is a bit more difficult.

~~~
aristus
So is anyone interested in an essay about that? I've got a few topics on the
hook. I used to run a search engine that served 600 million queries per month.

~~~
bluelu
Now that sounds interesting. Why did you quit, or did you sell your search
engine? I'm doing some search stuff myself (I haven't launched so far) and it
would be nice to get some tips.

However I do not expect to reach 600 million queries per month, at least not
in the first month ;)

~~~
aristus
I only ran the initial architecture & buildout and then moved to sysop as they
hired people. The rest of the business was not my business. :) After a good
few years we hit financial trouble and were bought up by Yahoo for pennies.

~~~
bluelu
Must have been an interesting job though. :)

------
strlen
Should be a mandatory read for anyone who wants to talk about "scaling".
There's a lot of armchair discussion of this topic from people who have never
looked at UNIX networking below the surface (or don't even know about
select(), epoll, and the various UNIX IPC mechanisms).
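For anyone who hasn't looked below the surface, here's roughly what the event-driven style looks like: a minimal single-process echo server sketch in Python, using the stdlib `selectors` module (which wraps epoll on Linux and kqueue on BSD). The function and its `max_events` cutoff are illustrative, not from the paper:

```python
import selectors
import socket

def run_event_loop(listen_sock, max_events):
    """One process, one selector, many connections: the selector tells us
    which sockets are ready, so we never block on any single client."""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD
    listen_sock.setblocking(False)
    sel.register(listen_sock, selectors.EVENT_READ, data="listener")
    handled = 0
    while handled < max_events:
        for key, _ in sel.select(timeout=1):
            if key.data == "listener":
                # Listening socket is readable: a client waits in the backlog.
                conn, _ = key.fileobj.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="client")
            else:
                # Client socket is readable: echo back whatever arrived.
                chunk = key.fileobj.recv(4096)
                if chunk:
                    key.fileobj.sendall(chunk)
                else:  # empty read means the peer closed the connection
                    sel.unregister(key.fileobj)
                    key.fileobj.close()
                handled += 1
    sel.close()
```

The `max_events` cutoff is only there so the sketch terminates; a real server loops forever.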

------
Hexstream
Executive summary of the first 53 pages: epoll is great, the other solutions
range from not-so-good to terrible.

------
axod
Interesting read, but I didn't see the case that you just don't use processes
_or_ threads covered. Did I miss it?

~~~
strlen
The discussion of epoll/select does talk about the event-driven scenario and
there's the discussion of the simplest web server (one process for all
connections).

Memcached is one example of a purely event driven application. Another example
(also by Brad Fitzpatrick) is perlbal (written in Perl and using epoll).

~~~
axod
Sure, I just expected the doc to go a little more like:

1 process per connection (bad) -> 1 thread per connection (better?) -> 1
process (bingo!)

~~~
neilc
~1 process per core, that is (unless the daemon is _completely_ I/O bound).
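To make "one process per core" concrete, here's a hedged sketch of the classic prefork pattern (Unix-only, names illustrative): every forked worker accepts on the same inherited listening socket, and the kernel spreads incoming connections across the pool:

```python
import os
import socket

def echo_and_close_forever(listen_sock):
    """Worker body: accept connections on the shared socket and echo one
    message per connection. All forked children run this same loop."""
    while True:
        conn, _ = listen_sock.accept()
        data = conn.recv(4096)
        if data:
            conn.sendall(data)
        conn.close()

def prefork(listen_sock, workers=None):
    """Fork roughly one worker per core. Each child inherits the listening
    socket, so the kernel load-balances accept() across the pool."""
    workers = workers or os.cpu_count() or 1
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:            # child: serve until killed
            echo_and_close_forever(listen_sock)
            os._exit(0)         # unreachable; keeps the child contained
        pids.append(pid)        # parent: keep pids so we can reap the pool
    return pids
```

A real daemon would also waitpid() on dead children and restart them; this sketch leaves supervision out.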

~~~
rcoder
This is only true if you think your application can do a better job of
scheduling than the underlying OS kernel. In many (if not most) cases, this is
false.

For HTTP servers hosting static content, you may be able to out-perform the OS
thread scheduler. For most non-trivial apps, you're probably wrong.

Small fork()-ed processes can still compete with clever poll(), /dev/epoll, or
kqueue()-based servers, if you can keep each instance lightweight.

~~~
neilc
_This is only true if you think your application can do a better job of
scheduling than the underlying OS kernel. In many (if not most) cases, this is
false._

Why is it false? As the other reply notes, you have more domain knowledge than
the kernel's scheduler does. You also don't have to pay the overhead of
entering the kernel to context switch (which is why context switches between
userspace threads are cheaper than between kernel threads). With fork()-based
servers, you additionally need to flush the TLB on each switch (depending on
the CPU architecture).

I'd be curious to see links that support your last claim: AFAIK, it is fairly
well-known that event-based daemons using epoll/kqueue are the most performant
technique for writing scalable network servers. See C10K, etc.

~~~
rcoder
The problem is that, in the real world, monolithic polling network servers
have to manage all the transaction-specific context for your application in a
single binary.

I'm not challenging the fact that the C10K architecture wins if we're talking
about static content. Real webapps don't live and die by their static content
performance, though: the critical path is through dynamic content generation,
which means that select()/poll() et al. don't buy you much, unless you can
hook your database client events and application thread scheduling into that
loop as well.

Green (a.k.a. userspace) threads are one good solution, but fork() isn't the
performance-killer that people seem to think it is, either.
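To make the green-threads point concrete, here's a toy sketch (not any particular library's API): cooperative "threads" built on Python generators, where a context switch is an ordinary userspace function call, with no kernel entry and no TLB flush:

```python
def scheduler(tasks):
    """Round-robin scheduler for generator-based green threads. Switching
    between tasks is just a next() call in userspace."""
    ready = list(tasks)
    results = []
    while ready:
        task = ready.pop(0)
        try:
            next(task)            # run the task up to its next yield point
            ready.append(task)    # still alive: requeue at the back
        except StopIteration as stop:
            results.append(stop.value)  # task returned: collect its result
    return results

def worker(name, steps):
    """A cooperative task: each yield voluntarily hands control back."""
    for _ in range(steps):
        yield
    return f"{name}: did {steps} steps"
```

Running `scheduler([worker("a", 2), worker("b", 1)])` interleaves the two tasks fairly, and the shorter one finishes first.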

