
Goroutines, Nonblocking I/O, and Memory Usage - eklitzke
https://eklitzke.org/goroutines-nonblocking-io-and-memory-usage
======
atombender
It always struck me as odd that Go has select{}, but no way to select on a
file descriptor such as a socket. There's literally no way to poll a file
descriptor: You have to read from it, and deal with the result.

That also means reads aren't interruptible (unless you count closing the
descriptor as interrupting, which is a blunt hammer indeed). AFAIK the only
way to do this is to call SetReadDeadline() on the connection with a smallish
value, then have the interrupter block on a channel that's signaled when the
deadline is reached.
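
Roughly, that trick looks like this (a sketch only; the 100ms interval, the
stop channel, and the function name are made up for illustration):

    package connutil

    import (
        "errors"
        "net"
        "time"
    )

    // readInterruptible keeps the read deadline short so the blocked Read
    // wakes up periodically and gets a chance to notice the stop channel.
    func readInterruptible(conn net.Conn, stop <-chan struct{}, buf []byte) (int, error) {
        for {
            select {
            case <-stop:
                return 0, errors.New("read interrupted")
            default:
            }
            conn.SetReadDeadline(time.Now().Add(100 * time.Millisecond))
            n, err := conn.Read(buf)
            if n > 0 || err == nil {
                return n, err
            }
            if ne, ok := err.(net.Error); ok && ne.Timeout() {
                continue // deadline expired with no data; check stop and retry
            }
            return n, err
        }
    }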

Go's concurrency model is pretty friendly, but it's also unfriendly in many
surprising ways. (Don't get me started on the whole nil channel thing, or the
problem of channel closure/ownership, or channel performance overall.) It
doesn't surprise me in the least (though it did surprise the Go team) that Go fell
flat among the C/C++ system programmer demographic.

Anyone want to chip in about how Rust, Swift and Nim compare here (regarding
the article)?

~~~
nothrabannosir
> You have to read from it, and deal with the result.

They try to keep it simple. As with all I/O, it blocks your current goroutine,
so if you want it non-blocking you spawn a new goroutine to do it in, pass
that goroutine a channel, have it proxy the result onto the channel, and
select on _that_ in your main routine.

The semantics for reading sockets are different from reading channels (compare
the types; one is multi valued, the other single valued---EDIT: no they're
not, exactly, see [1]), so it kind of makes sense you can't share the same
construct.

I'm not a Go apologist by any means; I think it has a tremendous number of
flaws. But I/O, (a)synchronicity and concurrency are not among them. They are
simple and consistent, and (most importantly) orthogonal building blocks which
you can combine at your leisure to create more complex constructs.

(Besides, how do you interrupt a read, anyway? Or any syscall? You can't do
that in C, either...?)

[1] Edit: fair is fair, Go channel reads are (of course) also multi-valued.
However, the types still don't exactly match, since you can't pass arguments
(a buffer) to a channel read. You'd need some extra hoops to jump through, as
other commenters suggest, which starts becoming magic and breaks down the
simplicity and orthogonality of the constructs.
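
For instance, passing a buffer "into" a read over channels ends up looking
something like this (a sketch with made-up names, just to show the hoops):

    package hoops

    import "net"

    // readReq carries the caller's buffer in and the result back out.
    type readReq struct {
        buf  []byte
        resp chan readResult
    }

    type readResult struct {
        n   int
        err error
    }

    // reader owns the connection and serves read requests sent on reqs.
    func reader(conn net.Conn, reqs <-chan readReq) {
        for req := range reqs {
            n, err := conn.Read(req.buf)
            req.resp <- readResult{n, err}
        }
    }

The caller sends a readReq containing its own buffer and then selects on resp
alongside anything else it cares about. It works, but it's extra plumbing.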

~~~
lobster_johnson
Sorry for being unclear. You don't interrupt the _read_ as such. You interrupt
the select.

The usual trick on POSIX is to use a self-pipe, since select()/epoll() only
work on file descriptors and not, say, pthread primitives. Windows is slightly
better here: WaitForMultipleObjects supports waiting on any primitive that
supports blocking, so you wait on the socket + on an event object, and
signaling the event object will unblock the wait. Windows also has completion
ports, and the APIs are generally more extensive (not a fan of Windows, but
they got this part very right).

I do understand why Go is the way it is. But it's important to recognize that
it's a compromise in design that inherently reduces your flexibility — such as
in this particular case. I don't think there's anything technically
_preventing_ Go from supporting non-blocking I/O, though.

Channels and FDs/sockets are both conceptually multi-valued; what did you mean
by that? One major difference is that reads can fail with an error, and
channel reads can't.

Edit: Also, I didn't mean that select{} on a file descriptor ought to actually
return data. It would be more appropriate for it to yield a true/false value
to signal that the FD has data, so you'd do "case <-fd: fd.Read(bufSize)".

~~~
echlebek
How is your "case <-fd: fd.Read(bufSize)" not a race condition?

~~~
atombender
Race condition? It's no different from select() followed by recv() in C. But
I'm talking about semantics here, not suggesting that this would necessarily
be how you'd write it. The main idea was allowing select{} to work on non-
channel sources of notifications, such as file descriptors.

------
zimbatm
This would be solved if the "select" keyword also worked on I/O objects.
It could gain special io.ReadReady and io.WriteReady channels that return a
boolean.

This would also simplify most of the code that tries to pull events from
different sources, including I/O. Right now each I/O source needs to be
wrapped in its own goroutine (adding error handling makes it worse):

    
    
        fromIO := make(chan []byte)
        go func() {
           for {
              buf := myPool.Get().([]byte)
              n, _ := myIO.Read(buf) // error handling elided
              fromIO <- buf[:n]
           }
        }()
    
        for {
          select {
          case a := <-fromIO:
            // do one thing with a
          case b := <-fromOtherChan:
            // something else with b
          }
        }
    

After the change:

    
    
        for {
          select {
          case <-myIO.ReadReady:
            buf := myPool.Get().([]byte)
            myIO.Read(buf)
          case b := <-myOtherChan:
            // ... use b
          }
        }

~~~
bkeroack
I/O isn't an object, it's an interface (Go has no objects). There is nothing
magical about I/O or any interface--they are just functions--so this change
would require extra magic to be implemented, which I strongly oppose.

One of the beautiful things about Go is that the designers carefully chose to
confine language magic to only a few fundamental areas (channels, goroutines,
arguably maps) and resisted going beyond that. Go approximates "C plus
Concurrency and Better Data Structures" and nothing more.

~~~
gpderetta
On the contrary, the magic is required because things like select and channels
and coroutines are implemented at the language level in Go. If they were a
library with proper extension points (and, sure, possibly language sugar on
top), it would be doable.

~~~
amscanne
That is very nearly what they are.

The compiler actually just transforms everything channel-related into function
calls within the runtime package. The majority (probably > 99%) of the
implementation of channels and select is in Go itself.

E.g., see:
[https://golang.org/src/runtime/chan.go](https://golang.org/src/runtime/chan.go)

------
twotwotwo
It's true goroutines waiting on the network are going to sit on a few KB
(small stack, buffer, user-mode sched bookkeeping), and if you multiply that
out by a million or something you're talking gigs. But I think much of the
appeal of the standard Go way, and similar approaches in other languages, is
you can write more or less as if you were doing simple synchronous I/O, and the
runtime work that other folks have done gives you something decent with AIO,
small stacks, user-space thread switching, etc. without you having to think
much about it.

I think the author says something similar put differently in the first
paragraph.

Something like rakoo's trick is interesting, with the caveat zzzcpan added
that it takes at least a tiny buf, not zero-len, but if you don't actually hit
this wall I'd charge forward the regular way.

------
Matthias247
You actually can observe the same behavior in most async APIs which follow a
"pull model". E.g. if you do "await socket.ReadAsync(buffer...)" in C# you
would also need to have preallocated the buffer and keep it alive for the
whole duration of the async operation. Same is true for C++ boost asio async
reads. Even if you use the pure C WinAPI IOCP methods you have to preallocate
the buffers.

The readiness model avoids this, as described in the article. But it also
leads to a more event-driven programming model.

I think the preallocation might be a problem if the use case is a server that
holds an enormous number of mostly inactive connections (e.g. a websocket
server). For most servers I don't see a problem with it.

This method also has a benefit for performance: You often can totally avoid
dynamic memory allocation during runtime with this model. You allocate the
receive buffer once for the connection and then reuse it for the whole
lifetime. In the push model you either need to allocate a fresh buffer for the
new data or retrieve one from a pool, which is still more expensive.
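
In Go terms, the per-connection reuse pattern is just something like this
(a sketch; the buffer size and handler name are illustrative):

    package server

    import "net"

    // handle allocates the receive buffer once and reuses it for every Read
    // over the lifetime of the connection.
    func handle(conn net.Conn) {
        defer conn.Close()
        buf := make([]byte, 32*1024) // one allocation per connection
        for {
            n, err := conn.Read(buf)
            if n > 0 {
                process(buf[:n]) // must copy anything it keeps past the next Read
            }
            if err != nil {
                return
            }
        }
    }

    func process(p []byte) { /* application logic */ }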

~~~
porpoisemonkey
> This method also has a benefit for performance: You often can totally avoid
> dynamic memory allocation during runtime with this model. You allocate the
> receive buffer once for the connection and then reuse it for the whole
> lifetime.

There is a security concern here; you'd better be sure to flush the buffers
correctly or you could cause memory to leak between sessions.

If I were writing a server to handle a sensitive task such as executing
payment transactions I'd probably take the less memory efficient approach of
allocating a new buffer for every connection to lower the risk of data
spilling between connections.

~~~
gpderetta
Allocating a new buffer doesn't give any guarantee, as you might be given back
a just-deallocated buffer by the allocator with its original content mostly
intact. You need to explicitly scrub any sensitive data yourself before
deallocating. At that point you might as well just reuse the buffer yourself.

------
rakoo
Read() being a blocking call, wouldn't some hack like this work?

    
    
        _, err := conn.Read([]byte{})
        if err != nil {
            return err
        }
    
        // we know there is something to read
        buf := pool.Get().([]byte)
        n, err := conn.Read(buf)
    
        // process n, err and buf as needed
        // if there is more to read, you may need to loop over conn.Read
        
        // after some inactivity timeout, return the full-capacity slice
        pool.Put(buf[:cap(buf)])

~~~
zzzcpan
No, it checks for zero-length buffers and returns immediately. You would have
to read at least a byte and copy it into a new buffer.

Won't matter much though, as the memory usage of a single goroutine is quite
significant and doubling it by having a buffer preallocated is not something
to care about.
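
Something like this, presumably (a sketch; the pool and names are made up, and
the second Read can still block if the peer really only sent a single byte):

    package onebyte

    import (
        "net"
        "sync"
    )

    var pool = sync.Pool{
        New: func() interface{} { return make([]byte, 32*1024) },
    }

    // readLazily blocks on a one-byte read, so an idle goroutine only pins a
    // single byte, and only fetches a pooled buffer once data starts arriving.
    func readLazily(conn net.Conn) ([]byte, error) {
        var first [1]byte
        if _, err := conn.Read(first[:]); err != nil {
            return nil, err
        }
        buf := pool.Get().([]byte)
        buf[0] = first[0]
        n, err := conn.Read(buf[1:]) // usually returns whatever else is buffered
        if err != nil {
            return buf[:1], err
        }
        return buf[:1+n], nil
    }

The caller should re-expand the returned slice (b[:cap(b)]) before handing it
back with pool.Put once it's done with the data.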

~~~
masklinn
> Won't matter much though, as the memory usage of a single goroutine is quite
> significant

2K as of 1.4 (was 4K before 1.2, 8K in 1.2 and 1.3).

But yeah it's a good point, the intrinsic memory overhead of goroutines means
even if buffer pooling worked the memory use of the system would still mostly
follow the "naive" estimate.

You may want to message the author to remind them of that; they may not have
thought of that concern.

------
VanillaCafe
> Suppose that typically 5% of the client connections are actually active, and
> the other 95% are idle with no pending reads or writes.

I suppose the intent is to then use that 95% memory savings for other work.

So what's the point of making this optimization? The system is going to be
hugged to death if the number of client connections approaches 100%, because
the system will not have enough available memory.

If 100% client connections is causing memory problems, rather than keeping a
pool of buffers so that inactive client connections have less footprint, it
seems like a better solution is to decrease the number of available client
connections. Then, provision more servers or more memory if more client
connections are required.

~~~
cakoose
Provisioning for the worst possible case is sometimes necessary (e.g. hard
real-time use cases), but can also be very expensive.

Since HTTP connections are typically kept alive between requests, it's very
common for them to be idle. It could be the case that 99.9% of the time, X
bytes of memory is enough to avoid transfer delays, but to cover the 100%
case, you'd need 10X that.

I think it's reasonable to decide that avoiding delays in the 100% case is not
worth 10X the cost in memory.

------
kornish
Instead of supplying a buffer to each read call yourself, you can use
something like a sync.Pool, which gives metered and concurrency-safe access to
a buffer:
[https://golang.org/pkg/sync/#Pool](https://golang.org/pkg/sync/#Pool).
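
Something along these lines, presumably (a sketch; the buffer size is
arbitrary):

    package poolread

    import (
        "net"
        "sync"
    )

    var bufPool = sync.Pool{
        New: func() interface{} { return make([]byte, 32*1024) },
    }

    // readWithPool borrows a buffer around a single Read and returns it to the
    // pool afterwards. Note the buffer still has to be checked out before Read
    // blocks, which is the objection raised in the replies below.
    func readWithPool(conn net.Conn, handle func([]byte)) error {
        buf := bufPool.Get().([]byte)
        defer bufPool.Put(buf)
        n, err := conn.Read(buf)
        if n > 0 {
            handle(buf[:n])
        }
        return err
    }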

~~~
kevinherron
I think you misunderstand. It's not that you can't get a pooled buffer in Go.
The problem is that you don't know if/when you need that buffer without
calling Read(), and the call to Read() requires a buffer that's ready to use.

~~~
vendakka
Exactly. The sync.Pool would only be useful if net.Conn.Read accepted a Writer
instead of a []byte.

~~~
masklinn
That wouldn't make much sense, since the Reader interface requires a []byte,
but the specific connection could implement WriterTo:
[https://golang.org/pkg/io/#WriterTo](https://golang.org/pkg/io/#WriterTo)

(TCPConn implements ReaderFrom, but not WriterTo)

~~~
vendakka
I wasn't very clear. A hypothetical conn.Read(w Writer) would call Write on w
instead of filling a byte array. Write in turn allocates as needed. WriterTo
is a cleaner way of doing this.

~~~
masklinn
> A hypothetical conn.Read(w Writer) would call Write on w instead of filling
> a byte array.

Yes, I understand the purpose — hence suggesting WriterTo, which is supposed
to do that — I simply noted that conn.Read(w Writer) would mean Conn no longer
satisfies io.Reader, since io.Reader's Read takes a buffer.

~~~
vendakka
Ah right.

------
kbwt
Hmm, there will still be a buffer in the kernel that the NIC DMAs into.

You will also need to buffer incoming data from a stream protocol like TCP to
combine consecutive reads into complete application layer packets.

A better approach can be seen with Registered I/O on Windows, which lets user-
mode programs register buffers with the kernel so that they can be locked for
the NIC to DMA into: [https://technet.microsoft.com/en-
us/library/hh997032(v=ws.11...](https://technet.microsoft.com/en-
us/library/hh997032\(v=ws.11\).aspx)

~~~
tedunangst
Those DMA buffers are shared. You don't have a different DMA buffer for each
of a million sockets.

~~~
kbwt
You're right. But there will realistically still be some protocol-level
buffering in user-mode for every socket. Although that might be a good use-
case for a hash table of buffers, as most of the sockets will not have
outstanding partial packets most of the time in typical (frequent small
message) applications.

------
jules
Is this inherent in Go's concurrency model or just a limitation of the
standard library? Couldn't the standard library provide a function that takes
a buffer pool as an argument instead of a buffer?

~~~
neild
It's a standard library limitation.

One could imagine a function that takes a set of net.Conns and returns the
ones that are available for reading. You'd then pass those off to a goroutine
for processing with blocking I/O as usual.

------
saman_b
Since I/O multiplexing in Go is based on edge-triggered epoll, you always need
to do a read/write syscall (it happens underneath) before blocking the
goroutine, in order to arm epoll to poll that specific FD. This approach also
relies on parking/unparking the goroutine, and there is no way around that,
since it is part of the runtime.

One way Go could make it nonblocking is to do the read/write and return the
result immediately, but that kind of call requires a callback function to be
provided, so instead of parking/unparking the goroutine, the callback is
invoked (either directly or via "go cb_function"). It can be tricky to make
this work, since the initial goroutine might not be allowed to call read on
the FD anymore (maybe it exits if the read was unsuccessful). In addition, you
still need to provide a buffer to read into; the one-byte trick works, though,
and the buffer management can be done either when the read succeeds or in the
callback. This approach also makes the programming asynchronous, similar to
event-driven programming.

------
esert
Golang already provides bindings for the polling syscalls (e.g.
[https://gcmurphy.wordpress.com/2012/11/30/using-epoll-in-
go/](https://gcmurphy.wordpress.com/2012/11/30/using-epoll-in-go/)). Someone
can write a library to do what you describe.
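
For example, a Linux-only sketch using the standard library's raw epoll
bindings, waiting for readability before committing a buffer to the read
(error handling abbreviated; the target host and sizes are placeholders, and
SyscallConn needs a reasonably recent Go):

    package main

    import (
        "fmt"
        "net"
        "syscall"
    )

    func main() {
        conn, err := net.Dial("tcp", "example.com:80")
        if err != nil {
            panic(err)
        }
        defer conn.Close()

        epfd, err := syscall.EpollCreate1(0)
        if err != nil {
            panic(err)
        }
        defer syscall.Close(epfd)

        // Register the connection's fd for readability without dup'ing it.
        raw, _ := conn.(*net.TCPConn).SyscallConn()
        raw.Control(func(fd uintptr) {
            ev := syscall.EpollEvent{Events: syscall.EPOLLIN, Fd: int32(fd)}
            err = syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, int(fd), &ev)
        })
        if err != nil {
            panic(err)
        }

        fmt.Fprint(conn, "GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")

        // Block here, holding no read buffer, until the kernel reports data.
        events := make([]syscall.EpollEvent, 1)
        if _, err := syscall.EpollWait(epfd, events, -1); err != nil {
            panic(err)
        }

        buf := make([]byte, 4096) // only now pay for the buffer
        n, _ := conn.Read(buf)
        fmt.Printf("read %d bytes\n", n)
    }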

------
pierrebai
One way to solve the problem is to have read return a buffer instead of
providing the buffer, so that the internal read multiplexing code can do the
buffer-pool trick under the hood. As long as all your readers ask for the same
maximum size for the read, the pool can be shared. Internally, the
implementation does a select and requests a buffer of the requested size from
the pool.

------
snippet22
This would help. [https://blog.quickmediasolutions.com/2015/09/13/non-
blocking...](https://blog.quickmediasolutions.com/2015/09/13/non-blocking-
channels-in-go.html)

