In order for a collision to take place, we’d have to get a new connection from an existing client, AND that client would have to use the same port number that it used for the earlier connection, AND our server would have to assign the same port number to this connection as it did before.
Ephemeral ports aren't assigned to inbound connections, they're used for outbound connections. So, for the client-to-nginx connection, both the server IP and port are fixed (the port will be either 80 or 443) - only the client IP and port change, so for a collision all you need is for a client to re-use the same port on its side quickly.
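To make the 4-tuple point concrete, here's a minimal sketch (local addresses only, server port picked by the kernel): two simultaneous connections from the same client IP to the same server must differ in the client-side ephemeral port, since the kernel identifies each TCP connection by (client IP, client port, server IP, server port).

```python
import socket

# A listener standing in for nginx; port 0 lets the kernel pick one.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(2)
addr = srv.getsockname()

# Two concurrent connections from the "client": same source IP, same
# destination IP and port -- so the source ports are forced to differ.
c1 = socket.socket(); c1.connect(addr)
c2 = socket.socket(); c2.connect(addr)
p1 = c1.getsockname()[1]
p2 = c2.getsockname()[1]
print(p1, p2)  # two distinct ephemeral ports

c1.close(); c2.close(); srv.close()
```

A collision is only possible once one of those connections is torn down and its tuple is sitting in TIME_WAIT while the client tries to reuse the same source port.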
For the nginx to node connection, both IPs and the server port are fixed, leaving only the ephemeral port used by nginx to vary. You don't have to worry about out-of-order packets here though, since the connection is loopback.
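The pool that nginx's ephemeral port comes from is the kernel's local port range, which bounds how many concurrent nginx→node connections you can have before tuples start colliding. A Linux-specific way to check it:

```python
# Linux-specific: read the kernel's ephemeral port range
# (commonly 32768-60999 on modern kernels).
with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
    low, high = map(int, f.read().split())

print(low, high, "=>", high - low + 1, "usable ephemeral ports")
```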
Note that only the side of the connection that initiates the close goes into TIME_WAIT - the other side goes into a much shorter LAST_ACK state.
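You can watch this happen from userspace. A sketch (Linux-specific, reads `/proc/net/tcp`; state code `06` is TIME_WAIT): the side that calls close() first ends up in TIME_WAIT, while the passive closer's socket disappears almost immediately.

```python
import socket, time

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)

c = socket.socket()
c.connect(srv.getsockname())
conn, _ = srv.accept()
cport = c.getsockname()[1]   # the client's ephemeral port

c.close()      # active close: client sends FIN first -> TIME_WAIT
conn.close()   # passive close: brief LAST_ACK, then the socket is gone
time.sleep(0.2)

# Map local port -> TCP state code from /proc/net/tcp
states = {}
with open("/proc/net/tcp") as f:
    for line in f.readlines()[1:]:
        parts = line.split()
        local_port = int(parts[1].split(":")[1], 16)
        states[local_port] = parts[3]

print(states.get(cport))  # '06' == TIME_WAIT
srv.close()
```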
So just to be sure, some noob questions: the 'ephemeral ports' and 'TIME_WAIT state' tricks are here to handle the connections from nginx to Node.js (not the client-to-nginx ones)?
Sockets from client to nginx are uniquely identified by the client IP and the client port. On each client request, nginx creates a new socket to Node.js?
There can be more than one Node.js instance running? Is that the main goal of nginx here, or are there additional benefits?
> Edit, ok: "nginx is used for almost everything: gzip encoding, static file serving, HTTP caching, SSL handling, load balancing and spoon feeding clients" http://blog.argteam.com/coding/hardening-node-js-for-product...
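To the multiple-instances question: yes, that's one of the standard setups. A hypothetical sketch (ports and paths made up) of nginx load-balancing across two Node.js instances while handling gzip and static files itself:

```nginx
# Two local Node.js instances behind one nginx front end
upstream node_backend {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
}

server {
    listen 80;
    gzip on;

    location /static/ {
        root /var/www;               # nginx serves static files directly
    }
    location / {
        proxy_pass http://node_backend;
    }
}
```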
Really surprised this article doesn't mention tcp_tw_reuse or tcp_tw_recycle. These have a more substantial impact than simply adjusting TIME_WAIT, as those ports will still sit in a FIN_WAIT state for a long time before reuse as well.
I've been playing around with these settings on very loaded machines:
# Retry SYN/ACK only three times, instead of five
net.ipv4.tcp_synack_retries = 3
# Try to close things only twice
net.ipv4.tcp_orphan_retries = 2
# FIN-WAIT-2 for only 5 seconds
net.ipv4.tcp_fin_timeout = 5
# Increase syn socket queue size (default: 512)
net.ipv4.tcp_max_syn_backlog = 2048
# One hour keepalive with fewer probes (default: 7200 & 9)
net.ipv4.tcp_keepalive_time = 3600
net.ipv4.tcp_keepalive_probes = 5
# Max packets the input can queue
net.core.netdev_max_backlog = 2500
# Keep fragments for 15 sec (default: 30)
net.ipv4.ipfrag_time = 15
# Use H-TCP congestion control
net.ipv4.tcp_congestion_control = htcp
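For reference, a sketch of how settings like these are applied (requires root; the block above would normally live in a sysctl config file):

```shell
# Apply one setting immediately
sysctl -w net.ipv4.tcp_fin_timeout=5

# Persist across reboots: add the settings to /etc/sysctl.conf
# (or a file under /etc/sysctl.d/), then reload
sysctl -p
```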
_"A large part of this is due to the fact that nginx only uses HTTP/1.0 when it proxies requests to a back end server, and that means it opens a new connection on every request rather than using a persistent connection"_
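nginx can avoid this: with `proxy_http_version 1.1` and an upstream `keepalive` pool, it reuses upstream connections instead of opening one per request, which also stops burning an ephemeral port each time. A sketch (backend address hypothetical):

```nginx
upstream node_backend {
    server 127.0.0.1:3000;   # hypothetical backend
    keepalive 64;            # idle upstream connections cached per worker
}

server {
    location / {
        proxy_pass http://node_backend;
        proxy_http_version 1.1;          # keepalive requires HTTP/1.1
        proxy_set_header Connection "";  # drop the default "close" header
    }
}
```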
There's some good info in here. We ran a flash hotel sale a while back. It only lasted 60 seconds, but at about 800 booking req/second. Discovered many of the same issues, but I never quite got iptables stable (hash table flooding, then other issues), so I ended up getting it to ignore some of the traffic. Will try out the solutions in here next time to see how it goes.
There are ~32k ephemeral ports. Typical servers have ~32G of memory. It's certainly not hard to imagine a request architecture where a single request can be handled in less than 1MB of per-request memory.
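The back-of-envelope arithmetic behind that claim:

```python
# 32 GiB of memory spread across ~32k concurrent connections
mem = 32 * 2**30      # bytes
ports = 32 * 1024     # ~32k ephemeral ports
per_request = mem // ports
print(per_request)    # 1048576 bytes, i.e. 1 MiB per in-flight request
```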
So we've made a few changes since I'd initially done the ephemeral port tuning, the most important being switching to unix domain sockets rather than TCP. With that, we probably no longer need the ephemeral port setting.
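A sketch of that setup (socket path hypothetical): proxying over a unix domain socket takes ports, and therefore ephemeral port exhaustion, out of the nginx-to-node hop entirely.

```nginx
upstream node_backend {
    server unix:/var/run/node.sock;   # hypothetical socket path
}

server {
    location / {
        proxy_pass http://node_backend;
    }
}
```

On the Node.js side, the server listens on the same path, e.g. `server.listen('/var/run/node.sock')` instead of a TCP port.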