

Obscure kernel settings to increase performance of Linux TCP servers - fosk
http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.kernel.obscure.html

======
ams6110
In general you don't want to screw around with kernel settings. You're just as
likely to break something as you are to improve anything. You need to know
what you are doing when you venture into this territory, and if you don't,
there's not enough info in this post to really guide you.

~~~
retr0h
Ideally, that's why you tune on a system that you can "break", in an
environment that replicates the bottleneck/issue you are trying to
improve/fix.

Tuning can be fun. Almost as fun as fixing obscure problems with modem AT
strings. ;)

------
leef
The one I have been bitten by is the TCP retransmission timeout (RTO). This is
the time TCP will wait for an ACK before retransmitting data. This number has
to be somewhat conservative: if the RTO is less than the normal round-trip
latency, TCP will constantly retransmit packets that were never lost.

Normally the minimum RTO is 200ms. This is conservative in case you have
long-distance TCP connections. However, when your systems are all in the same
data center, where round-trip times are sub-millisecond, the default is two
orders of magnitude too high and should be tuned down.

~~~
ballard
Looks like incast. Here's an in-depth article:
[http://www.snookles.com/slf-blog/2012/01/05/tcp-incast-what-is-it/](http://www.snookles.com/slf-blog/2012/01/05/tcp-incast-what-is-it/)
Check out these patches, extracted from the original paper's site, for the
2.6/3.0-rc kernel lines:
[https://github.com/vrv/linux-microsecondrto](https://github.com/vrv/linux-microsecondrto)

------
IvyMike
I believe this list is up to date as of July 2002.

~~~
thaumaturgy
Worse than that even.

> The default is 0, since this feature is not implemented yet (kernel version
> 2.2.12).

2.2.12 was last modified in August of 1999. 2.2.14 was released by January
2000, so it's not like 2.2.12 was around forever.

Don't go screwing around with your kernel settings based on this document.

------
aray
These all seem like prime candidates for some kind of learning algorithm (at
the very least a naive one), like TCP's window-size scaling, just for settings
that are too complex to bother tuning by hand.

~~~
npsimons
At the same time, the default settings have been tweaked by domain experts
over decades to be reasonably good for the general case. Not everyone will be
keen to shove a complex, untested, undebugged learning algorithm into their
kernel. Still, an interesting proposal...

~~~
lostapathy
You wouldn't necessarily have to run it in-kernel, as these are all
configurable from /proc. With a sufficiently complete test environment, you
could use machine learning to safely tune many of the params, then just run
the resulting config in production.
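
A minimal out-of-kernel sketch of that idea: map sysctl names to their /proc
paths, try candidate values, and keep whichever scores best on a benchmark you
supply. The candidate values and the toy benchmark here are purely
illustrative assumptions, not recommendations.

```python
def sysctl_path(name: str) -> str:
    """Translate a sysctl name like 'net.ipv4.tcp_rmem' to its /proc path."""
    return "/proc/sys/" + name.replace(".", "/")

def tune(name, candidates, benchmark, dry_run=True):
    """Try each candidate value for one sysctl; return the best scorer.

    benchmark(value) must return a number where higher is better. With
    dry_run=True nothing is written; we only score the candidates.
    """
    path = sysctl_path(name)
    best_value, best_score = None, float("-inf")
    for value in candidates:
        if not dry_run:
            with open(path, "w") as f:  # requires root on a real system
                f.write(str(value))
        score = benchmark(value)
        if score > best_score:
            best_value, best_score = value, score
    return best_value

# Toy benchmark that happens to peak at 4096 (purely illustrative)
best = tune("net.core.somaxconn", [1024, 4096, 16384],
            benchmark=lambda v: -abs(v - 4096))
```

A real version would plug a load-test run in as the benchmark and restore the
original value on failure.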

------
ballard
As said elsewhere... don't adjust the "richness" controls on your kernel
unless there's an actual scalability problem costing you $$. Most of the time
your special "tweaks" will end up using more memory or slowing performance.
sysctls should be sensible by default on RH (cent/sci/cern/aws) and Ubuntu
based kernels.

The two most commonly encountered problems are incast TCP collapse and slow
clients.

[https://everythingisdata.wordpress.com/2009/09/25/fine-grained-tcp-retransmissions/](https://everythingisdata.wordpress.com/2009/09/25/fine-grained-tcp-retransmissions/)

[http://www.pdl.cmu.edu/Incast/](http://www.pdl.cmu.edu/Incast/)

[http://conferences.sigcomm.org/sigcomm/2009/workshops/wren/papers/p73.pdf](http://conferences.sigcomm.org/sigcomm/2009/workshops/wren/papers/p73.pdf)

(For TCP testing: iperf... remember to account for protocol overhead, and for
bits vs. bytes)
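
For that overhead bookkeeping, a quick back-of-the-envelope calculation,
assuming a standard 1500-byte MTU, 20-byte IP and TCP headers (no options),
and 38 bytes of on-the-wire Ethernet overhead (preamble, header, FCS,
inter-frame gap):

```python
MTU = 1500
IP_HDR, TCP_HDR = 20, 20   # assumes no IP/TCP options (timestamps add 12)
ETH_OVERHEAD = 38          # preamble 8 + header 14 + FCS 4 + interframe gap 12

payload = MTU - IP_HDR - TCP_HDR   # 1460 bytes of goodput per frame
wire = MTU + ETH_OVERHEAD          # 1538 bytes on the wire
efficiency = payload / wire        # ~0.949

# A 1 Gbit/s link therefore tops out around 949 Mbit/s of TCP goodput,
# or roughly 118 MB/s after dividing by 8 for the bits -> bytes step.
goodput_mbit = 1000 * efficiency
goodput_mbyte = goodput_mbit / 8
```

So an iperf result near 940 Mbit/s on gigabit is the link saturated, not a
problem to tune away.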

Handling slow clients is app-specific, so there's no general advice other than
don't get clever. Clever -> !debuggable -> !maintainable -> !scalable.

------
zurn
I don't know where the poster got "performance" for the title; the TLDP doc
just calls these "obscure settings".

Actually, it seems someone just copy-pasted the whole set of sysctls from the
kernel's ip-sysctl.txt, since the list includes the most common settings, such
as net/ipv4/ip_forward, net/ipv4/tcp_keepalive_time and
net/ipv4/ip_local_port_range.

------
krakensden
This is worth noting apart from the crowd:

/proc/sys/net/ipv4/tcp_abort_on_overflow

because figuring out when you need it is awful.
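
The usual symptom is the listen backlog overflowing silently: the kernel drops
the final handshake step and clients see long stalls instead of an error. A
sketch of how you might spot it and flip the setting (counter names vary a
bit across kernel versions):

```shell
# Count listen-queue overflows/drops since boot
netstat -s | grep -i listen
# or via nstat: look for TcpExtListenOverflows / TcpExtListenDrops
nstat -az | grep -i listen

# Compare Recv-Q against the configured backlog on listening sockets
ss -lnt

# With this enabled, the kernel sends an RST instead of silently
# dropping, so clients fail fast rather than hanging
sysctl -w net.ipv4.tcp_abort_on_overflow=1
```

Whether failing fast is better than retrying depends entirely on what the
clients do with a reset.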

