During the decades since BSD sockets were first introduced, the way
they are used has changed significantly. In the beginning, the user
was supposed to fork a new process for each connection and do all the
work using simple blocking calls. Nowadays they are expected to keep
a pool of connections, check them via functions like poll() or
kqueue(), and dispatch any work to be done to one of the worker
threads in a thread pool. In other words, the user is supposed to do
both network and CPU scheduling.

To address this problem, this memo assumes that there already exists
an efficient concurrency implementation where forking a new
lightweight process takes at most hundreds of nanoseconds and a
context switch takes tens of nanoseconds. Note that such concurrency
systems are already deployed in the wild. One well-known example is
Golang's goroutines, but there are others available as well.
I don't think the idea of requiring Go-style lightweight threading is viable. Lightweight threading has many nice properties, but also inherently requires more expensive operations to deal with split stacks, and tends to use more memory than a manual approach. It's also rather poorly supported in general by existing languages and OSes (other than Go), while any new OS-level socket API should be as universally accessible as the existing one. In particular, many scripting-ish languages either have no concept of threading at all or only support isolated threads whose communication is limited to relatively slow message passing. In theory the language implementations could write a C shim to expose a non-blocking interface around blocking underlying operations - this is what libuv does today for file I/O, for instance - but the result would be a lot of unnecessary overhead.
(I'm not challenging you, just asking what you think the semantics should look like.)
Worth adding for everyone else, though I'm sure you know: high-speed network implementations already pull a lot of this stuff into userland.
