
Maintaining 65k open connections in a single Ruby process - WJW
http://www.wjwh.eu/posts/2018-10-29-double-hijack.html
======
jsnell
> The maximum amount of connections achieved was 65523, just 13 short of 2^16.
> This is the theoretical maximum number of connections for a single IP
> address on a single port,

That's not right. Connections are identified by the 4-tuple of (source ip,
source port, destination ip, destination port). The server ip and port are
fixed in this example, but there's still 2^48 combinations of client ip/port.
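
For what it's worth, the arithmetic can be sketched in a few lines of Ruby. This assumes IPv4 and ignores reserved addresses and ports, so it is an upper bound on the address space rather than a number any real box would reach; `tuple_count` is a made-up helper name.

```ruby
# Upper bound on distinct TCP 4-tuples when the server IP and port are fixed.
# Assumption: IPv4 only, every client address and port counted as usable.
CLIENT_IP_BITS   = 32 # IPv4 source address
CLIENT_PORT_BITS = 16 # TCP source port

def tuple_count(ip_bits = CLIENT_IP_BITS, port_bits = CLIENT_PORT_BITS)
  (2**ip_bits) * (2**port_bits)
end

puts "ports per client IP: #{2**CLIENT_PORT_BITS}"
puts "client IP/port combinations: #{tuple_count}"
```

So 65536 is the ceiling for one client IP talking to one server IP and port, not for the server as a whole.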

> It is amazing that a problem which took huge engineering efforts to solve
> back in the early 2000s can now be solved easily by anyone in just a handful
> lines of code.

But you didn't solve the problem! The whole point of C10K was the need for
scalable ways to figure out which connections could be operated on (i.e. had
something to receive, or could be written to after having previously been
blocked for writes). Keeping an array of sockets and just repeatedly iterating
over it with the assumption that all those sockets are active is the opposite
of scalable.
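
A rough sketch of the difference, using Ruby's built-in `IO.select` with local socket pairs standing in for client connections. `readable_sockets` is a hypothetical helper, not code from the article, and `select` itself scans every descriptor it is given, so this only illustrates asking the kernel which sockets are ready; it is not one of the scalable mechanisms (epoll, kqueue) that C10K was about.

```ruby
require "socket"

# Ask the kernel which sockets have data waiting, instead of attempting
# a read on every socket in the array. Returns [] if nothing is ready
# before the timeout.
def readable_sockets(sockets, timeout = 0.1)
  ready, = IO.select(sockets, nil, nil, timeout)
  ready || []
end

# Demo: five idle "connections", only two of which receive data.
pairs       = Array.new(5) { UNIXSocket.pair }
server_side = pairs.map(&:first)
client_side = pairs.map(&:last)

client_side[1].write("hello")
client_side[3].write("world")

readable_sockets(server_side).each do |sock|
  puts sock.read_nonblock(1024)
end
```

Only the two sockets with pending data are touched; the three idle ones are never read.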

~~~
WJW
Can you explain more how you get to 2^48? I thought the protocol specification
only reserved two bytes for the port and so getting over 65536 connections
would either require using more IP addresses or more ports. This seemed
supported by answers like
[https://www.quora.com/What-is-the-maximum-number-of-concurre...](https://www.quora.com/What-is-the-maximum-number-of-concurrent-tcp-connections-system-can-support).
If you can enlighten me, I'd be much obliged.

You are definitely right that this is not a "full" solution for the c10k
problem, but it is an interesting start IMO. Maybe for next month I'll
implement basic epoll functionality to make an echo server or something :)

~~~
masklinn
> Can you explain more how you get to 2^48? I thought the protocol
> specification only reserved two bytes for the port and so getting over 65536
> connections would either require using more IP addresses or more ports.

32-bit IP * 16-bit port?

> You are definitely right that this is not a "full" solution for the c10k
> problem, but it is an interesting start IMO.

It's not a solution to C10K at all, because the C10K problem is outdated
and was trivialised a decade back. WhatsApp was doing 2 million connections on
a single box (with resources to spare) back in 2012:
[https://blog.whatsapp.com/196/1-million-is-so-2011](https://blog.whatsapp.com/196/1-million-is-so-2011)
and folks were working on C10M:
[http://highscalability.com/blog/2013/5/13/the-secret-to-10-m...](http://highscalability.com/blog/2013/5/13/the-secret-to-10-million-concurrent-connections-the-kernel-i.html)

> Maybe for next month I'll implement basic epoll functionality to make an
> echo server or something :)

Epoll is bad, why would you want to do that?

~~~
meguest
> epoll is bad

Source please! I'm genuinely asking: what is bad about it?

~~~
masklinn
[https://idea.popcount.org/2017-02-20-epoll-is-fundamentally-...](https://idea.popcount.org/2017-02-20-epoll-is-fundamentally-broken-12/)
and
[https://idea.popcount.org/2017-03-20-epoll-is-fundamentally-...](https://idea.popcount.org/2017-03-20-epoll-is-fundamentally-broken-22/)
are a pretty good baseline.

------
verdverm
Reminds me of
[https://medium.freecodecamp.org/million-websockets-and-go-cc...](https://medium.freecodecamp.org/million-websockets-and-go-cc58418460bb)

------
romed
As they say: is that a lot?

~~~
kbenson
Well, if it's TCP/UDP connections, there's only 65k ports available to assign,
so it would be _all_ the ports. That's a lot of ports, but I'm not sure why
it's supposed to be particularly hard. Off to read the article now...

Edit: As masklinn noted in a reply to me[1], echoing jsnell's point, that's
the ports for a single IP. So yeah, it's not _all_ the ports, but it is all
the ports for a single source IP, unless I'm missing something. You can of
course assign multiple source IPs to extend this, but it is _a_ limit to be
dealt with.

1: And then deleted it, at least as of now, even though I think it was a good
point.

~~~
anothergoogler
I think you're confused about network programming in general. In a post like
this, the assumption is that the concurrency is happening over a single
network interface. It doesn't matter really. The most conspicuous limiting
factor on a POSIX-compliant system is the number of available file
descriptors. When you listen() on an address and accept() a connection, the
networking stack allocates a file descriptor for that connection. Then you can
handle more connections.
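
A small sketch of that in Ruby, assuming a Unix-like system with a usable loopback interface; `fd_limits` and `listener_and_connection_fds` are illustrative names, not anything from the article.

```ruby
require "socket"

# Soft and hard limits on open file descriptors for this process.
def fd_limits
  Process.getrlimit(Process::RLIMIT_NOFILE)
end

# Accept one loopback connection and return the descriptor numbers of the
# listening socket and of the accepted connection.
def listener_and_connection_fds
  server = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
  client = TCPSocket.new("127.0.0.1", server.addr[1])
  conn   = server.accept
  [server.fileno, conn.fileno]
ensure
  [conn, client, server].each { |s| s&.close }
end

soft, hard = fd_limits
puts "fd limit: soft=#{soft} hard=#{hard}"

listener_fd, conn_fd = listener_and_connection_fds
puts "listener fd=#{listener_fd}, accepted connection fd=#{conn_fd}"
```

Every accepted connection costs one descriptor on top of the listener, so the soft limit printed here caps the connection count long before port numbers do.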

[http://pubs.opengroup.org/onlinepubs/9699919799/functions/ac...](http://pubs.opengroup.org/onlinepubs/9699919799/functions/accept.html)

[http://pubs.opengroup.org/onlinepubs/9699919799/functions/li...](http://pubs.opengroup.org/onlinepubs/9699919799/functions/listen.html)

Beej's guide is a popular intro, but I'd bet any text on network programming
covers sockets.

[https://beej.us/guide/bgnet/](https://beej.us/guide/bgnet/)

~~~
kbenson
Yes, I was actually mistaken in assuming a server would use a port only once
per IP for all connections, and not just once per client IP (and even further,
per client IP and client port combo).

That makes much more sense. In fact, I'd been pondering this since I
commented, because my prior mental model didn't explain how servers that keep
connections open can still handle a lot of requests.
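
A quick way to see it, sketched in Ruby over loopback (`accepted_port_pairs` is a name made up for this example): several clients connect to one listening port, and every accepted socket reports the same local port but a different remote port.

```ruby
require "socket"

# Open `count` client connections to a single listening port and return
# [server_port, client_port] as seen from each accepted server-side socket.
def accepted_port_pairs(count)
  server  = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
  port    = server.addr[1]
  clients = Array.new(count) { TCPSocket.new("127.0.0.1", port) }
  conns   = Array.new(count) { server.accept }
  conns.map { |c| [c.local_address.ip_port, c.remote_address.ip_port] }
ensure
  [*conns, *clients, server].compact.each(&:close)
end

accepted_port_pairs(3).each do |server_port, client_port|
  puts "server port #{server_port} <- client port #{client_port}"
end
```

The server side never uses up a port per connection; only the client's ephemeral port changes.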

> Beej's guide is a popular intro, but I'd bet any text on network programming
> covers sockets.

I highly suspect I knew this at some point, but forgot it over the past 15-20
years since it didn't have direct relevance to most of my projects since that
time. :/

------
jashmatthews
Neat idea! If you're interested in production quality high concurrency Ruby
check out
[https://github.com/postrank-labs/goliath/](https://github.com/postrank-labs/goliath/)
which can support hundreds of thousands of concurrent connections.

~~~
WJW
That looks really interesting! I built this during a hackathon and ran out of
time before managing to get beyond the 2^16 barrier. I'll definitely take a
look at Goliath to see if I can borrow some tricks to push it even further. :)

------
edoo
The issue was solved with epoll back in 2004. All modern socket processing
should be using async IO. This is just an example of how to hold the sockets
open and use it more like a TCP server. Ruby might be a good choice for this
if you only know Ruby.

~~~
protomyth
[https://illumos.org/man/5/epoll](https://illumos.org/man/5/epoll)

_While a best effort has been made to mimic the Linux semantics, there are
some semantics that are too peculiar or ill-conceived to merit accommodation.
In particular, the Linux epoll facility will -- by design -- continue to
generate events for closed file descriptors where/when the underlying file
description remains open. For example, if one were to fork(2) and subsequently
close an actively epoll'd file descriptor in the parent, any events generated
in the child on the implicitly duplicated file descriptor will continue to be
delivered to the parent -- despite the fact that the parent itself no longer
has any notion of the file description! This epoll facility refuses to honor
these semantics; closing the EPOLL_CTL_ADD'd file descriptor will always
result in no further events being generated for that event description._

Bryan Cantrill on epoll:
[https://youtu.be/l6XQUciI-Sc?t=3424](https://youtu.be/l6XQUciI-Sc?t=3424)

------
salim_semaoune
I think that the `conn_waiting_area` variable is undefined in the source code.

