

Kernel Summit 2009: How Google uses Linux - gtani
http://lwn.net/SubscriberLink/357658/1a5bf93645d9fe6e/

======
pmjordan
_There are also doubts internally about how much of this stuff will be
actually useful to the rest of the world._

This seems the wrong approach. If it's useful to Google, it might not be
useful to the rest of the world in exactly that form, but I'm sure outside
kernel devs would have opinions on it and could suggest a version that helps
everyone.

 _Google makes a lot of use of the out-of-memory (OOM) killer to pare back
overloaded systems. That can create trouble, though, when processes holding
mutexes encounter the OOM killer. Mike wonders why the kernel tries so hard,
rather than just failing allocation requests when memory gets too tight._

This is something that really bothers me about Linux too, and I'm not Google;
I run a workstation and some servers. Processes with runaway memory
allocations can easily grind the system to a halt by causing crazy amounts of
swapping. If you don't catch it in time, you basically need to hard reset.
_This shouldn't be possible from user space._

 _Mike concluded with a couple of "interesting problems." One of those is that
Google would like a way to pin filesystem metadata in memory. The problem here
is being able to bound the time required to service I/O requests. The time
required to read a block from disk is known, but if the relevant metadata is
not in memory, more than one disk I/O operation may be required. That slows
things down in undesirable ways. Google is currently getting around this by
reading file data directly from raw disk devices in user space,_

Ouch!

 _but they would like to stop doing that._

No kidding.

~~~
psranga
For the memory allocation problem, do you have any ideas about how other OSes
address it?

How about ulimit? Could that be useful?

------
josephruscio
"Google manages its kernel code with Perforce."

Did not see that one coming.

~~~
sid0
I thought it was common knowledge that Google uses p4 for practically
everything.

------
dschobel
_And load balancing matters: Google runs something like 5000 threads on
systems with 16-32 cores._

Even in the most IO-bound of domains or in the case where 4.9k are sleeping,
does it really make sense to ever have that many threads/core?

~~~
jbellis
Sure. Modern Linux can handle hundreds of thousands of threads without
breaking a sweat.

See also [http://paultyma.blogspot.com/2008/03/writing-java-
multithrea...](http://paultyma.blogspot.com/2008/03/writing-java-
multithreaded-servers.html) for an interesting (java-centric) analysis.

~~~
agazso
While the kernel is certainly able to handle hundreds of thousands of
threads, the main limit is memory, because by default (in libc) around 2M of
memory is allocated for the stack of each thread.

With the pthread library you can create threads with a smaller stack size,
but that constrains some aspects of your program (deep recursion, large local
objects, etc.).

In my experience it is possible to get by with a 64K stack size in an
event-driven program if you are careful, so you can run ~500,000 threads in
64G of memory if you want to. Nevertheless, you are better off designing your
program more cleverly: use only a handful of threads and put them in a
blocking state only when you must :)

