

Blocking I/O: it's not just for pansies - epall
http://paultyma.blogspot.com/2008/03/writing-java-multithreaded-servers.html

======
jlindley
About the title, "Blocking I/O: it's not just for pansies".

Whether the title wording is ironic or sincere -- and I'm guessing it's used
jokingly -- using it presumes a very specific shared outlook with the reader
that is, likely as not, wrong. Certainly wrong in my case.

I'm not claiming the title is homophobic. I claim it's distracting to some
portion of readers and serves to introduce the actual content of the linked
submission poorly. The submission title does not nearly reflect the attitude
or writing in the actual linked post which is, in comparison: specific,
technical, and non-abrasive.

~~~
solutionyogi
Relax. It's just a title. Let's talk about the article, which (even though
it's from 2008) is actually interesting.

~~~
axod
It's also unashamedly biased. I'd take it all with a massive dose of salt.

------
kolektiv
Important to remember that we're talking about threads and IO in one specific
VM case, etc. The same assumptions won't necessarily hold in, for example, a
.NET case, as the memory cost of a thread is more significant (in some
qualified cases).

Fundamentally, most of these points come with so many caveats, were you to
extrapolate them to general programming technique, that you're much better off
simply saying "benchmark your actual cases and try the alternatives".

~~~
borisk
With both Sun JVM 1.6 and .NET, on Windows x64 you get by default a 1MB stack
per thread.

------
IgorPartola
Also, programming in C, it's very easy to use a thread-per-connection model,
especially if synchronization is not necessary. If you want a pool of threads
it gets a bit more complex, but if you always spin up a new thread for each
connection, it's easy as pie (and in my case it doesn't matter, since the time
spent creating a new thread is negligible compared to the amount of time it'll
spend doing useful work).
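For illustration, here's what thread-per-connection looks like as a minimal
sketch (in Java, the language of the linked article, rather than C; the port
and class name are arbitrary, and error handling is pared down):

```java
import java.io.*;
import java.net.*;

// Minimal thread-per-connection echo server: one new thread per accepted
// socket, no pooling and no shared state, so no synchronization is needed.
public class ThreadPerConnectionServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(9431)) {
            while (true) {
                Socket client = server.accept();
                new Thread(() -> {
                    try (Socket s = client;
                         BufferedReader in = new BufferedReader(
                             new InputStreamReader(s.getInputStream()));
                         PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                        String line;
                        while ((line = in.readLine()) != null) {
                            out.println(line); // echo the line back
                        }
                    } catch (IOException e) {
                        // client went away; the try-with-resources closes the socket
                    }
                }).start();
            }
        }
    }
}
```

Each connection gets its own blocking read loop, which is the whole appeal:
the per-connection logic reads top to bottom with no callback inversion.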

~~~
vladev
If you want to keep the easy thread-per-connection model even with 1M
connections you might have a look at Erlang - it takes the best of both worlds
(threads and async IO) via its processes (so you get a process-per-
connection). I read somewhere that the overhead of a process is just about 300
bytes.

------
mkramlich
I realize there are lots of special cases and exceptions to what I'm about to
say but I've come to 2 good general rules of thumb when it comes to designing
concurrency:

1. _prefer processes over threads_ -- because your architecture can scale
horizontally across multiple boxes more easily, because you become free to
write each piece in a different language, and because mutable shared memory is
problematic in very subtle, counter-intuitive ways

2. _prefer events over processes or threads_ -- because you can handle much
higher concurrent IO traffic on a single machine, due in part to reduced
memory use
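As a rough sketch of what "events" means in practice, here is a minimal
single-threaded event loop using java.nio's Selector (the port, buffer size,
and class name are arbitrary, and a real server would also have to handle
partial writes, which this omits):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;

// One thread multiplexes every connection through a Selector instead of
// dedicating a thread (and a stack) to each socket.
public class EventLoopEcho {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9432));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(4096);
        while (true) {
            selector.select(); // block until some channel is ready
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    if (client == null) continue;
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buf.clear();
                    int n = client.read(buf);
                    if (n == -1) { client.close(); continue; }
                    buf.flip();
                    client.write(buf); // echo back (assumes the write completes)
                }
            }
            selector.selectedKeys().clear();
        }
    }
}
```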

------
jwegan
I've found results similar to those stated in the article. In my case I found
that the Tomcat Http11NIO connector was actually slower than the default
Http11 connector: in benchmarking my server, the default Http11 had about 20%
higher throughput than Http11NIO. From what I understand (and I may be wrong,
since I didn't delve too deep), it is because there is a high cost to calling
methods through the Java Native Interface (JNI).

------
strlen
Couple of things:

1) Recent versions of the 1.6 JDK will use epoll on Linux, so the benchmark
should be re-evaluated; poll() is known not to be very scalable. There is an
issue, however: NIO only supports level-triggered (not edge-triggered) epoll.

2) This doesn't cover the case of threadpool starvation. I.e., there are
multiple connections, some are very fast, some are very slow.

A prime example of this would be a client for a WAN-distributed database or a
WAN-distributed file system: most operations are local (5 ms), some operations
are remote (80 ms). Remote operations linger longer and longer in a fixed-size
threadpool, leaving less and less room in it and causing longer wait times for
incoming operations. You can even have this without WAN distribution, e.g.,
80% of operations require no random disk seeks (they are retrieved from cache)
while 20% require them (an orders-of-magnitude slower operation even with
elevator scheduling).
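The starvation effect is easy to reproduce with a deliberately tiny fixed-size
pool (a hypothetical sketch; the 2-thread pool size and the 5 ms/80 ms timings
are just illustrative values matching the example above):

```java
import java.util.concurrent.*;

// Slow "remote" tasks occupy every thread in a small fixed pool, so even a
// cheap "local" task has to queue behind them before it can run at all.
public class StarvationDemo {
    static void work(long millis) {
        try { Thread.sleep(millis); } catch (InterruptedException e) { }
    }

    static long measureFastTaskWait() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Occupy both pool threads with slow "remote" operations (80 ms each).
        for (int i = 0; i < 2; i++) pool.submit(() -> work(80));
        // A fast "local" operation (5 ms) must now wait for a free thread.
        long submitted = System.nanoTime();
        Future<Long> fast = pool.submit(() -> {
            long waitedMs = (System.nanoTime() - submitted) / 1_000_000;
            work(5);
            return waitedMs;
        });
        long waited = fast.get();
        pool.shutdown();
        return waited;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("fast task queued for ~" + measureFastTaskWait() + " ms");
    }
}
```

The 5 ms task ends up waiting roughly the full 80 ms of a slow task before it
even starts, which is the starvation being described.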

------
JoelPM
Depends on what you're doing. I read that article a couple years ago and
thought "Take that you libevent C zealots." And while I still firmly believe
that the JVM is optimized and mature enough to hold its own against platforms
that have traditionally been considered faster, I still think there are cases
where event-based I/O will out-perform thread-per-request. For instance, if
most of what you're doing is shuttling bytes back and forth (like in a proxy).
I also wouldn't be surprised if a thread-per-request app took a bigger hit
when virtualized (like EC2) than an event-based app.

~~~
agazso
If you start a thread for each request in Java, you can easily run out of
memory when serving a large number of connections. This is because Java uses
kernel threads, and each thread's stack is allocated when the thread is
created, usually at around 1MB in size.

This means that you will need around 10GB of memory to serve 10,000 requests,
while an event-based app can serve the same number of requests with minimal
memory.
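For what it's worth, that per-thread cost can be trimmed: the four-argument
Thread constructor accepts a stack-size hint in bytes (which the JVM is free
to ignore), and the -Xss flag changes the default for all threads. A small
sketch (class and thread names are made up):

```java
public class SmallStackThread {
    // Ask for a ~64 KB stack instead of the platform default (~1 MB);
    // the JVM treats the value as a hint and may round it up or ignore it.
    static boolean runOnSmallStack() throws InterruptedException {
        final boolean[] ran = { false };
        Thread t = new Thread(null, () -> ran[0] = true,
                              "small-stack-worker", 64 * 1024);
        t.start();
        t.join();
        return ran[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker ran: " + runOnSmallStack());
    }
}
```

A smaller stack only postpones the problem, of course: shrinking 1MB to 64KB
still leaves you with per-connection memory that an event-based app avoids.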

You also have to take into account that thread creation and context switching
are really expensive operations (contrary to what the OP is saying), so the
thread-per-request app is adequate only for serving a small number of
requests; for large numbers you will need to use the event-based approach.

You might get by with a threadpool-based approach, but take into account that
if your protocol is not stateless, then you will need shared state, which
means you have to use concurrent data structures, which might complicate the
code.

------
bsaunder
This post is over 2 years old. It seems likely that the world has changed at
least a bit since it was posted (slow things tend to get optimized).

------
pcestrada
As a side note, I really enjoyed attending the SD West conferences in the
past. Alas, the conference has morphed into something focused on Cloud
Computing and lost its platform-independent focus.

------
hubb
interesting stuff. it also gave me that weird feeling when a thing or person
you've never heard of before crops up in several places in one day, as just an
hour ago i was reading this:
[http://www.azulsystems.com/events/javaone_2002/microbenchmar...](http://www.azulsystems.com/events/javaone_2002/microbenchmarks.pdf)

