
Show HN: Molecule – Streaming, zero-allocation protobuf decoding in Go - richieartoul
https://github.com/richardartoul/molecule
======
ignoramous
I've come across a fair share of zero-alloc implementations of various things,
like u/poitrus' zerolog [0] and u/jorangreef's cuckoo-hashtable [1], for
instance.

Given that JavaScript runs on restrictive clients, it must have libraries in
spades for this kind of thing, but I couldn't find many. I'm really
interested in techniques that are generally employed in languages like Java,
Golang, Rust, and C/C++ that are worth translating over to JavaScript,
especially now that TypedArrays are a reality. I'm pretty sure there's more
to zero-alloc than just _Buffers_, _Object pools_, and _Recyclables_, but I
can't seem to find good resources to get started with this.

[0] [https://github.com/rs/zerolog](https://github.com/rs/zerolog)

[1] [https://github.com/ronomon/hash-table/blob/master/README.md](https://github.com/ronomon/hash-table/blob/master/README.md)

~~~
kitd
Giving different views and projections onto underlying memory is really the
key tactic for processing data without allocations. A slice (i.e. a view onto
an array) is a pointer to underlying memory (or to another slice), along with
a length (plus a capacity in Go, but that's not often needed). Underneath,
the bytes don't move. You simply pass around the views (or views of views) of
what you want your code to actually see.

Don't know about Rust, but Go has slices built into the language, and C++ has
them in the standard library. Learning idiomatic slice-handling makes working
with data sans allocation much, much easier.
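The view-of-views idea above can be shown in a few lines of Go (a minimal sketch; the data is just illustrative):

```go
package main

import "fmt"

func main() {
	// One backing array; every "view" below shares its memory.
	data := []byte("header:payload")

	// Slicing creates a new {pointer, length, capacity} header,
	// not a copy of the bytes.
	header := data[:6]
	payload := data[7:]

	// Mutating through a view is visible through the original
	// slice, proving no bytes were copied or moved.
	payload[0] = 'P'

	fmt.Println(string(header)) // header
	fmt.Println(string(data))   // header:Payload
}
```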

~~~
monocasa
> Don't know about Rust, but Go has slices built into the language, and C++
> has them in the standard library.

Rust is very big on slices as well.

------
thedance
You can beat the pants off this, even though it beats the standard proto
library. Most of the time here is being spent calling out-of-line functions
through Go's stack-based calling convention, because the Go compiler hates
inlining. For example, DecodeTagAndWireType is CALLing DecodeVarint, even
though that's really where you want it to have been inlined.

If your use case is really simple and you want the fastest possible proto
decoding, you can hand-roll it pretty easily.

    
    
      message point { fixed32 x = 1; fixed32 y = 2; }
    

Then your code is structured like:

    
    
      var x uint32
      var y uint32
      for len(buf) > 0 {
        switch buf[0] {
        case 1<<3 | 5: // Field 1, wire type 5 (fixed32)
          x = binary.LittleEndian.Uint32(buf[1:])
          buf = buf[5:]
        case 2<<3 | 5: // Field 2, wire type 5 (fixed32)
          y = binary.LittleEndian.Uint32(buf[1:])
          buf = buf[5:]
        default:
          // Freak out somehow
        }
      }
    

Improve the error handling of this code to the extent that it suits your use
case.
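One way to do that hardening, sketched with bounds checks and illustrative error values (the function name decodePoint is mine, not from the library):

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// decodePoint hand-decodes { fixed32 x = 1; fixed32 y = 2; },
// rejecting truncated input and unknown tags instead of panicking.
func decodePoint(buf []byte) (x, y uint32, err error) {
	for len(buf) > 0 {
		if len(buf) < 5 { // 1 tag byte + 4 bytes of fixed32
			return 0, 0, errors.New("truncated field")
		}
		switch buf[0] {
		case 1<<3 | 5: // field 1, wire type 5 (fixed32)
			x = binary.LittleEndian.Uint32(buf[1:])
		case 2<<3 | 5: // field 2, wire type 5 (fixed32)
			y = binary.LittleEndian.Uint32(buf[1:])
		default:
			return 0, 0, fmt.Errorf("unexpected tag 0x%x", buf[0])
		}
		buf = buf[5:]
	}
	return x, y, nil
}

func main() {
	// Wire encoding of point{x: 7, y: 9}:
	// tag 0x0d (1<<3|5), 7 little-endian; tag 0x15 (2<<3|5), 9 little-endian.
	buf := []byte{0x0d, 7, 0, 0, 0, 0x15, 9, 0, 0, 0}
	x, y, err := decodePoint(buf)
	fmt.Println(x, y, err) // 7 9 <nil>
}
```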

~~~
richieartoul
Yeah, that's a good point. The use case I wrote this for involved decoding a
protobuf message that has only 3 fields, but where the total size is several
MiB. For my workload this optimization is meaningless; the most important
thing is to avoid allocating huge slices. But for lots of small protobufs
like the one you defined, a hand-rolled solution will definitely be faster.

Do you think I could just manually inline the decoding code as much as
possible in the molecule library, or does it need a different API?

Also if you open a PR I'll merge it :)

~~~
thedance
Heh OK, see my other post for the simplest possible improvement.

------
kodablah
This looks great for reusing protobuf definitions in a high-performance
situation. I would be curious how the performance of this compares to
flatbuffers given a reader or byte slice, as the use cases seem similar.

------
rapsey
If you're using Go and need to optimize to get rid of allocations, why are
you using Go?

~~~
shanemhansen
Realistically, if you're able to serve 10k requests/s, you want to get to
20k, and you have the option of:

a) rewriting in C++

b) pooling some objects

Which one is the right call?
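Option (b) often amounts to a few lines with Go's sync.Pool (a sketch; the 4 KiB buffer size and the handler shape are my assumptions, not anyone's real code):

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool recycles scratch buffers across requests, so steady-state
// serving does no per-request heap allocation for them.
var bufPool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 4096) },
}

// handle stands in for a request handler that needs scratch space.
func handle(payload string) int {
	buf := bufPool.Get().([]byte)[:0] // reuse capacity, reset length
	defer bufPool.Put(buf)

	buf = append(buf, payload...) // fits in the pooled 4 KiB here
	return len(buf)
}

func main() {
	fmt.Println(handle("hello")) // 5
}
```

(Production code often pools *[]byte instead of []byte to avoid the interface-conversion allocation in Put, but the idea is the same.)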

~~~
dirtydroog
Both

