All of these should come with the disclaimer that your use case, among other things, will determine how much your mileage varies. For instance, when I benchmarked (with wrk) my JSON-based API, with a payload of 145-200 bytes plus a minimal header, it came out ahead of gRPC with protobufs. I was getting ~180,000 requests per second, including serialization & deserialization, on my laptop (a ThinkPad T25). You're already saturating a 1 Gbps connection at a third of that throughput, and when you factor in the extra leeway for processing, not to mention the smaller box you can run it on, it's pretty nice.
This wasn't in Java, the environments are different, the code is different, the benchmarking tool is different, and so on, so maybe it's not directly applicable. But the point stands: you should investigate whether the tech is really worth investing in if you optimize elsewhere. I mean, I've tried similar things with Python and have only gotten 20-40k rps across various frameworks.
Please take everything above with a grain of salt and do your own research.
That reminds me of how a few years back, I was able to beat the pants off all the XML parsing libs I could find using a regex in Perl. Like 7-10 times faster.
The trick is that it was a very simple XML format (Apache Lucene/SOLR), where each record was only one level deep and consisted of a set of tags, each with a type, a name, and a value.
The code was damn simple too. Something along these lines:
my %records;
# $content holds the raw Solr XML response; advance pos() past everything before the first <doc>.
$content =~ /^.*?(?=<doc>)/gsmi;
while ($content =~ m{<doc>(.*?)</doc>}gsmi) {
    my $record = $1;
    my %r;    # fields of the current record
    $r{$1} = $2 while $record =~ m{<(?:arr|date|str|bool|int|float) name="([^"]+)">([^<]+)</[^>]+>}gsmi;
    $records{ $r{id} } = {%r};    # keyed by the record's id field
}
I might have been able to find a faster solution by configuring an XML parsing library correctly, but I suspect data marshaling would have eaten most of the gains. I probably could speed it up even more by combining the record start/end matching with the content parsing and making a state machine, but this is already damn fast and about as conceptually simple as you can get, which has its own appeal.
The moral is the same though: sometimes, if your data is simple enough, you can easily beat industry-standard solutions with minimal effort, because throwing away the flexibility they provide (and that your case doesn't need) can yield speed.
Except it's a completely different scenario. A micro-optimization for your specific case will almost always beat a generic approach, but that's not the goal: you end up using libs/tools that nobody else uses, that aren't language-independent, and that aren't well supported.
tbh I'm curious about how you did it because grpc / protobuf is highly optimized since it's the main target at Google.
If you use a length prefix you can easily skip parts of the message. You don't need to replace escape sequences, and thus can work with slices into the original message. In practice length prefix vs escape sequences is strongly correlated with binary vs text protocol, but there are a few exceptions (netstring/bencode).
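To make that concrete, here's a minimal Kotlin sketch of a made-up [tag][length][payload] layout (deliberately not any real wire format, and simpler than protobuf's varint tags): because the length is known up front, you can skip fields you don't care about and hand out a zero-copy slice of the original buffer, with no escape sequences to rewrite.

import java.nio.ByteBuffer

// Toy framing: each field is [1-byte tag][4-byte length][payload bytes].
fun findField(msg: ByteBuffer, wantedTag: Int): ByteBuffer? {
    val buf = msg.duplicate()                  // independent cursor over the same bytes
    while (buf.remaining() >= 5) {
        val tag = buf.get().toInt()
        val len = buf.getInt()
        if (tag == wantedTag) {
            val slice = buf.slice()            // zero-copy view into the original buffer
            slice.limit(len)
            return slice
        }
        buf.position(buf.position() + len)     // skip the payload without parsing it
    }
    return null
}

With a delimiter/escape-based text format you'd have to scan every byte and unescape before you could return anything.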
Another important consideration is field names. If you use string field names, they take up more space and are more expensive to access (a dictionary with string keys). Though compression and precomputed perfect-hashing-based dictionaries can mitigate those downsides surprisingly well. Simple integer keys like in protobuf are a bit cheaper to look up and more compact as well. And offset-based approaches like capnproto can reduce those costs even further, though they come with their own limitations.
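A rough back-of-the-envelope illustration of the size difference (the numbers are illustrative, not a measurement): a protobuf tag packs the field number and wire type into a single varint byte for small field numbers, while JSON repeats the quoted field name in every record.

// Protobuf tag byte = (field_number << 3) | wire_type.
fun tagByte(fieldNumber: Int, wireType: Int): Int = (fieldNumber shl 3) or wireType

fun main() {
    println("field 1, varint: 0x%02x".format(tagByte(1, 0)))   // 0x08 -> one byte per key
    println("field 2, varint: 0x%02x".format(tagByte(2, 0)))   // 0x10
    val json = """{"userId":42,"score":17}"""
    println("JSON record size: ${json.toByteArray().size} bytes") // 24 bytes, 15 of them quoted key names
}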
In general there are many good-enough solutions, but if you want the ideal solution you have to think about many details (zero-copy, single-pass write, random access read, human readability, extensibility etc.) with contradictory requirements on the format.
One thing I hate about gRPC with Kotlin is the getter/setter cruft it generates for Request/Response payloads. I've always wanted to use basic POJOs instead. Although I've seen some unmaintained Kotlin generators for gRPC, there is no battle-tested solution, so this post just might solve my problem. Jackson is performant enough for me.
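For what it's worth, a plain Kotlin data class with jackson-module-kotlin is about as close to the "basic POJO" experience as it gets; a minimal sketch (the payload type and field names here are made up for illustration):

import com.fasterxml.jackson.module.kotlin.jacksonObjectMapper
import com.fasterxml.jackson.module.kotlin.readValue

// Hypothetical request payload as a plain data class, no generated builders.
data class CreateUserRequest(val name: String, val email: String)

fun main() {
    val mapper = jacksonObjectMapper()
    val json = mapper.writeValueAsString(CreateUserRequest("Ada", "ada@example.com"))
    println(json)                                          // {"name":"Ada","email":"ada@example.com"}
    val back: CreateUserRequest = mapper.readValue(json)   // reified readValue from jackson-module-kotlin
    println(back)
}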
I think he is referring to the fact that you use a fluent builder interface when building protobuf messages, so you chain together a bunch of `.setFoo(myFoo).setBar(myBar).build()` calls to build your protobuf, while in Kotlin it is idiomatic to write `foo = myFoo` and `bar = myBar`. There are ways around this with `.apply` in Kotlin, but it all feels really clunky.
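Something like the following sketch, assuming a hypothetical generated SearchRequest message with a string query field and an int32 page_size field (so this won't compile without that codegen):

// Java-style generated fluent builder:
val req1 = SearchRequest.newBuilder()
    .setQuery("kotlin grpc")
    .setPageSize(20)
    .build()

// Kotlin synthetic properties plus apply read more idiomatically,
// but it's still the same builder underneath:
val req2 = SearchRequest.newBuilder().apply {
    query = "kotlin grpc"
    pageSize = 20
}.build()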
That said, if you want a systematic, don't-make-me-think approach to backward and forward compat, I think protobuf/gRPC is the way to go; you just need to learn a few rules for how to do it and you're set. In terms of static typing, protobuf doesn't really get you that much, because in order to get backward/forward compat you can't have required fields, so everything needs to be null-checked or have default values.
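A small sketch of what that looks like on the consuming side (UserResponse and Profile are hypothetical generated types, and the optional-field presence check assumes proto3 with the optional keyword): scalar fields fall back to their defaults, while message and optional fields get generated has...() checks.

// Hypothetical proto3 schema, shown as a comment:
//   message UserResponse {
//     string name = 1;              // missing on the wire -> "" default
//     optional string nickname = 2; // explicit presence -> hasNickname()
//     Profile profile = 3;          // message field -> hasProfile()
//   }
fun describe(resp: UserResponse): String {
    val name = resp.name.ifEmpty { "unknown" }                      // default value, never null
    val nick = if (resp.hasNickname()) resp.nickname else name      // explicit presence check
    val city = if (resp.hasProfile()) resp.profile.city else "n/a"  // nested message may be absent
    return "$name ($nick) from $city"
}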