
MsgPack vs. JSON: Cut your client-server exchange traffic by 50% - muellerwolfram
http://indiegamr.com/cut-your-data-exchange-traffic-by-up-to-50-with-one-line-of-code-msgpack-vs-json/
======
cheald
JSON's appeal is that it is both compact and human readable/editable. If
you're willing to sacrifice all semblance of readability/editability, then
sure, let's all do the binary marshaling format dance like it's 1983.

Additionally, if you're sending data to a browser, then you're cutting the
legs out from under native JSON.parse implementations (Internet Explorer 8+,
Firefox 3.1+, Safari 4+, Chrome 3+, and Opera 10.5+). The copy claims "half
the parsing time" (just because of the smaller data size), but I'm
_exceptionally_ skeptical of that claim, since this just moves parsing out of
native code and back into JavaScript.

I whipped up a quick example to demonstrate my point:
<https://gist.github.com/2905882>
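
The test is shaped roughly like this (a sketch rather than the gist verbatim;
it assumes a browser msgpack library that exposes
msgpack.pack/msgpack.unpack):

        // Build a payload large enough for the timings to be meaningful.
        var data = [];
        for (var i = 0; i < 10000; i++) {
          data.push({ id: i, name: "item" + i, price: i * 1.5 });
        }

        console.time("JSON.stringify");
        var json = JSON.stringify(data);
        console.timeEnd("JSON.stringify");

        console.time("msgpack.pack");
        var packed = msgpack.pack(data);
        console.timeEnd("msgpack.pack");

        console.time("JSON.parse");
        JSON.parse(json);
        console.timeEnd("JSON.parse");

        console.time("msgpack.unpack");
        msgpack.unpack(packed);
        console.timeEnd("msgpack.unpack");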

My results (using Chrome's console):

        JSON size: 386kb
        MsgPack size: 332kb
        Difference: -14%

        Encoding with JSON.stringify: 4 ms
        Encoding with msgpack.pack: 73 ms
        Difference: 18.25x slower

        Decoding with JSON.parse: 4 ms
        Decoding with msgpack.unpack: 13 ms
        Difference: 3.25x slower

So MsgPack wins the pre-gzip size comparison by 14%, but then is nearly
_twenty times slower_ at encoding the data, and is over three times slower at
decoding it.

Furthermore, once you add in gzip:

        out.json.gz: 4291 bytes
        out.mpak.gz: 4073 bytes

So, our grand savings is 218 bytes in exchange for 78 ms of extra
encode/decode time. At 28.8 kbit/s, those 218 bytes amount to only about 60
ms of transfer time, so it'd still be faster to use JSON's native
encoding/decoding facilities even if your client were on a 28.8k modem.

~~~
haberman
> I'm exceptionally skeptical of those claims

I can attest that these guys are known to post bogus benchmark numbers. The
claim on their home page ("4x faster than Protocol Buffers in this test!") is
incredibly misleading; their benchmark is set up in such a way that Protocol
Buffers are _copying_ the strings but Message Pack is just _referencing_ them.
It's true that the open-source release of Protocol Buffers doesn't have an
easy way of referencing vs. copying, but it is still highly misleading not to
mention that you're basically benchmarking memcpy().

One of these days I'll re-run their benchmark with my protobuf parser which
can likewise just reference strings. I am pretty sure I will trounce their
numbers.

~~~
ralph
I pointed out the Protocol Buffers comparison to a friend on IRC. He came back
with:

"They rolled their own "high-level" serializer and deserializer for PB, built
on top of the lower-level stuff the documentation advises you not to use.
Using the recommended interfaces, PB is faster than in their test. It is
still slower than msgpack. Not sure why they'd make their test look cooked
when they win in a fair test anyway. Further examination shows that the test
is _mainly_ a test of std::string, for protobuf. std::string is required to
use contiguous storage, whilst msgpack's test uses a msgpack-specific rope
type."

------
catch23
Having actually used MsgPack, it's nice that it's compact, but bad that it
doesn't handle utf-8 properly. The reason the data is smaller is that it
basically uses fewer bytes depending on the value, e.g. if the numerical
value is "5" then a single byte can represent it, whereas JSON will always
use floats to represent integer values. If you know exactly what your data
in the JSON might be, MsgPack is nice; otherwise it can be a pain in the
butt if you're sending arbitrary data from users.
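
For reference, the integer rules look roughly like this (a simplified sketch
based on the MessagePack format spec; unsigned values under 2^32 only):

        // MessagePack picks the smallest representation it can for each
        // integer (simplified sketch; unsigned values only).
        function packUint(n) {
          if (n < 0x80)    return [n];                       // positive fixint: 1 byte total
          if (n < 0x100)   return [0xcc, n];                 // uint8:  marker + 1 byte
          if (n < 0x10000) return [0xcd, n >>> 8, n & 0xff]; // uint16: marker + 2 bytes
          return [0xce, n >>> 24, (n >>> 16) & 0xff,         // uint32: marker + 4 bytes
                  (n >>> 8) & 0xff, n & 0xff];
        }

        packUint(5);     // [0x05]             -- 1 byte, same as the JSON text "5"
        packUint(12345); // [0xcd, 0x30, 0x39] -- 3 bytes vs. 5 bytes of JSON text

So the savings come from mid-sized numbers and from structural bytes (keys,
markers), not from tiny values.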

Here's a nice review of different "competitive" formats to JSON:
<http://qconsf.com/dl/qcon-sanfran-2011/slides/SastryMalladi_DealingWithPerformanceChallengesOptimizedSerializationTechniques.pdf>

According to their review, msgpack is good for small packets of data, but
bad for big ones. Binary JSON formats like MsgPack are only good if you know
your exact usage pattern; otherwise they bring along too many restrictions
to be competitive. The best transport mechanism is still JSON.

~~~
ithkuil
"whereas JSON will always use floats to represent integer values."

It might use floats to represent integers in memory, but when JSON is
transferred over the wire it's a textual encoding, so each digit will
consume 8 bits.

~~~
catch23
That's true, but I guess one should then mention that a 5-digit number
consumes less space in a binary encoding than in textual format. MsgPack
recognizes this and stores that 5-digit number using the minimum number of
bytes required (12345 takes 3 bytes in MsgPack versus 5 bytes as JSON text),
but this wouldn't be possible in the usual JSON string format.

------
fserb
Saying that cutting a 30-byte message in half reduces client-server exchange
traffic by 50% is misleading (and wrong).

The average HTTP header is around 700 bytes, plus 40 bytes of TCP/IP
headers; saving 15 bytes on a message that ships with ~740 bytes of overhead
is roughly a 2% reduction. And since the gains diminish with more data
(MsgPack seems to focus on the structure of the message), I imagine the real
gain will be between 1-2% of traffic.

The extra complexity is simply not worth it.

~~~
ch0wn
Also, if you're using it in the browser, JSON is already there while you need
to load a library in order to use msgpack.

------
zippie
The recent trend of new projects trying to create a new binary/more
compressed JSON protocol is troublesome ... and I strongly feel like these
projects overlook gzip and "what's there" in favor of "look at my new shiny
library". I had the same grievance w/RJSON previously
(<http://news.ycombinator.org/item?id=4068555>).

These projects are failing to realize the broader picture: gzip and JSON
parsing are built in and the subject of many optimizations by browser
vendors. The network/request/response cost of putting an RJSON/MsgPack-like
library into the mix - creating a dependency on the library - negates any
payload savings I might have had.

------
steve8918
We have now come full circle in terms of how data is marshaled between client
and server.

I've now seen it go from XDR (the marshaling used by ONC-RPC) to XML to JSON
and now to a binary form of JSON, which is basically like XDR.

~~~
troels
I'm convinced this is a very elaborate troll.

------
MehdiEG
This sort of comparison is fairly meaningless in many real-world situations,
as others have already pointed out. You'll find that with most payloads,
compressed JSON will actually be significantly smaller than non-compressed
MsgPack (or Thrift / Protocol Buffer / Other Binary Serialisation Format).

And compressed MsgPack (or Thrift / Protocol Buffer / Other Binary
Serialisation Format) will often be roughly the same size as compressed JSON.

Of course, there's also the performance benefit of faster serialisation /
deserialisation, but again, it won't make much of a difference in many real-
world situations.

That said, our API supports both JSON and Google Protocol Buffer (using
standard content negotiation via HTTP headers) and we actually use it with
Google Protocol Buffer from our mobile app. It's not so much for the message
size benefit (which is non-existent in practice) but more for development
convenience (and the potentially slightly better performance is an added
benefit we get for free).

We've got one .proto file containing the definition of all the messages our
API returns / accepts. Whenever we make a change to it and click Save, the
Visual Studio Protobuf plugin we use kicks in and automatically generates /
updates the strongly-typed C# classes for all the API messages.

On the Xcode side, a simple Cmd-B causes a 2-liner build script we wrote to
copy across the latest .proto file for the API and automatically generate
strongly-typed Objective-C classes for all our API messages in a couple of
seconds. Beautifully simple.

It then lets us code against strongly typed classes and take advantage of code
completion and compile-time type checks instead of working against arrays or
dictionaries. It also means that there's no magic or automatic type guessing
going on that inevitably ends up breaking in all sorts of weird ways - our
API contract is well-defined and strongly-typed.

And while the API design is still in flux, working against strongly-typed
classes means that a simple compile will tell us whenever a change we made to
an API message causes existing code to break.

Last but not least, if we ever need to check what the API returns in a human-
readable form, all we have to do is call the API with Accept:
application/json and we can troubleshoot things from there.
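
For illustration, the switch looks something like this (hypothetical
endpoint; only the Accept header changes between the JSON and protobuf
cases, and the protobuf media type is whatever your API settles on):

        // Hypothetical endpoint; the same URL serves both representations,
        // selected purely by the Accept header.
        var xhr = new XMLHttpRequest();
        xhr.open("GET", "https://api.example.com/v1/orders", true);
        xhr.setRequestHeader("Accept", "application/json");
        xhr.onload = function () {
          console.log(JSON.parse(xhr.responseText));
        };
        xhr.send();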

It all makes working with the API from both sides really quick and easy, and
making changes to the messages trivial. It's certainly possible to have a
nice workflow with JSON as well, but I have to say that I quite like what we
ended up with using Protocol Buffers.

~~~
wsc981
"for development convenience"

I wonder what the development convenience is when using protobuf instead of
JSON. Could you elaborate?

~~~
MehdiEG
That's what my whole comment was all about. Automatic generation of strongly-
typed classes. Code completion. Compile-time type checking. Well-defined and
strongly-typed API contract. Anything you want me to clarify?

------
latch
Word of warning, the node.js implementation has many bugs and pull requests
are ignored. Find someone's fixed fork to point to.

------
viraptor
I looked at it lately while working in Python, but was a bit disappointed
that they didn't include more types (not necessary, but some sugar for sets,
lists, UUIDs would be nice - they have lots of codes to spare at the moment)
or a distinction between utf8 and bytes. The lack of that leads to:

        >>> msgpack.loads(msgpack.dumps(u"ąę"))
        '\xc4\x85\xc4\x99'

~~~
dguaraglia
Yep, the moment it couldn't encode a datetime I knew it wasn't going to work
for me.

------
stevenleeg
What's the difference between this and BSON, another binary-based JSON
implementation?

~~~
LeafStorm
The main differences are:

1. MessagePack includes fewer data types. Several of the BSON-only data
types are specific to MongoDB, but others - such as separate types for
binary data and UTF-8 text - are IMO quite useful.

2. MessagePack includes several optimizations in the "type byte" to
efficiently represent small values.

A while back I designed my own serialization format that combined features
from both, but I never finished an implementation.

~~~
olsn
do you have a specification or even some prototype on github? would be
interesting to see

~~~
LeafStorm
I dug the spec I wrote off my backup hard disk and posted it here:
<https://gist.github.com/2907123>

I also planned to define a standard set of tags for the "tagged value" type,
but never quite got around to that.

------
mkramlich
CSV/JSON/YAML/XML/similar + zip/gzip = best of both worlds in many cases. But
don't add the compression step until you're sure you really need it. It may
not matter. If/when it does matter you may have a kind of Maserati problem on
your hands anyway.

I'm really hoping all the "Every Byte Is Sacred" kinds of wire formats will
die off. Moore's Law-like effects are making computing resources beefier and
more abundant over time. No similar effect for our human cognitive capacity or
work environments. Meatware is more expensive than hardware, in the general
case.

------
macspoofing
>Additionally, the whole thing look less readable when you look at it – so as
a small bonus this will probably repell some script-kiddies, who are trying to
intercept your JSON- or XML-calls.

This of course is not a problem because we're all using SSL here.

------
mibbitier
You'll get far better gains by removing HTTP headers and other fun things.

------
JosephRedfern
Surely this depends on how long your actual content is. If your JSON values
are short, I can see the advantages of this. If you've got a lot of data, then
I can't see this reducing traffic by 50%.

------
muellerwolfram
has anyone tried using msgpack with backbone, or any other js framework?

~~~
catch23
you won't be able to use msgpack with backbone, unless you're using backbone
on the server side and you have something to decode the binary json blobs.
msgpack is not a json replacement on the client side.

~~~
muellerwolfram
Using msgpack on the server with Rails, for example, shouldn't be a problem.
Using this gem, <https://github.com/nzifnab/msgpack-rails>, it seems like
it's not much different than letting Rails return XML, JSON or HTML.

I just thought that it might be possible to transform the received msgpack
data into JSON, so Backbone can use it.

~~~
catch23
That's what I mean. MsgPack is strictly a wire format for server-side
applications. You won't gain much using it as the wire format for browsers,
especially if you use compression, which would probably negate the benefits
of MsgPack anyway. You'd also need something to read/write byte arrays
inside the browser; I think Firefox has some non-standard mechanisms for
doing this, but I doubt there's cross-browser compatibility here.
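
Where the browser does support it, reading raw bytes looks something like
this (a sketch; assumes a msgpack library that accepts byte arrays):

        // XHR2-style binary read, where responseType is supported
        // (cross-browser support was exactly the sticking point at the time).
        var xhr = new XMLHttpRequest();
        xhr.open("GET", "/data.mpack", true);
        xhr.responseType = "arraybuffer";
        xhr.onload = function () {
          var bytes = new Uint8Array(xhr.response);
          var data = msgpack.unpack(bytes); // assumes the library takes a byte array
        };
        xhr.send();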

------
taligent
This has actually been covered before, last year:

<http://news.ycombinator.com/item?id=2571729>

One interesting comment:

 _Yup. We tested messagepack a while back in our environment. Speed difference
was 2% versus gzipped JSON. Not enough to invest time in changing a working
system and losing human readability._

Also looks like MsgPack has very poor serialize/deserialize speeds on the JS
end:

<https://gist.github.com/1101623>

