
It looks cool, but be very careful about using it (in a browser, at least). Browsers have super-fast JSON parsers and parsing msgpack is much, much slower than parsing good old JSON.

This great comment (from 201 days ago) is exactly about this issue:

http://news.ycombinator.com/item?id=4091051

I quote a bit of it here:

    JSON size: 386kb
    MsgPack size: 332kb
    Difference: -14%

    Encoding with JSON.stringify 4 ms
    Encoding with msgpack.pack 73 ms
    Difference: 18.25x slower

    Decoding with JSON.parse 4 ms
    Decoding with msgpack.unpack 13 ms 
    Difference: 3.25x slower

and also this comment: http://news.ycombinator.com/item?id=4093077

    MessagePack is not smaller than gzip'd JSON
    MessagePack is not faster than JSON when a browser is involved
    MessagePack is not human readable
    MessagePack has issues with UTF-8
    Small packets also have significant TCP/IP overhead


sigh. Why do we always want to declare a winner? Anyone who has tried to store binary data in JSON knows that MessagePack does indeed have a use case. And anyone who is working with browsers knows that JSON/JS is the king there.

> MessagePack is not smaller than gzip'd JSON

But gzip'd MessagePack is smaller than gzip'd JSON, so?

> MessagePack is not faster than JSON when a browser is involved

Using it in the browser was never the use case for MessagePack. JSON these days is used for far more than browser-server communication.

> MessagePack is not human readable

Fair enough. Although I don't consider JSON without newlines to be "human readable" either. I'll have to pipe it through a beautifier to read it.
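For what it's worth, turning minified JSON back into something readable is a one-liner with Python's standard library (the sample document here is made up for illustration; `python -m json.tool` does the same from a shell):

```python
import json

minified = '{"name":"btaylor","id":220439,"tags":["a","b"]}'

# Re-indent the compact JSON so it is readable again.
pretty = json.dumps(json.loads(minified), indent=2)
print(pretty)
```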

> MessagePack has issues with UTF-8

MessagePack only stores bytes. It's up to you to decide how to interpret those bytes.

> Small packets also have significant TCP/IP overhead.

And?


I think developers typically like to declare winners for very, very good reasons. Winners are often very useful: for example, in settling on a standard once a good or great solution is found, so a community can thrive around it, lots of open tools and open code can be built for it, developers can build upon it with a roadmap in mind, sites can publish with it knowing they can reach a maximum audience, and so on.

Who really wants to have even a dozen different 'standards' for data interchange? The gradual move from XML to JSON demonstrated perfectly well how messy it can all get, just trying to support two quasi standards.


Choice is a good thing. Do you use a fork to open doors, eat, scratch your foot, kill flies, and start a fire, just because you have to use the one-and-ultimate tool?

JSON is not the best for every scenario, just as msgpack is not the best fit for everything.

Nature is pretty wise on that matter: it "launches" several possibilities, because the environment is always changing. In a particular scenario, one of the possibilities (biological agents) can succeed.

But the one that succeeded in a particular scenario will fail in another, where a subject that failed before is now the better option. Adaptation.

So choice is a good thing. Why are people so lazy, and don't like to think? Everything must come ready-made.

It's our duty to choose the best tool to solve a given problem. It's cool that I can even use something "obsolete" from the 1800s to solve a 2012 problem.

If we keep thinking "oh, that old unwanted thing from the past", we will never see it.

Winners are only winners in a given scenario. It's good to have choice. Even against all boards and standards, I prefer to decide for myself rather than have others make that decision for me.

This is technology, not fashion.


Thank you for taking the time to write that, instead of linking to that one comic that everyone links to in these threads about competing standards.

Although I'm not sure why you say "quasi" standard. XML and JSON seem like pretty robust standards to me, and they should both be around for a long time.


> Why do we always want to declare a winner?

To be fair, MessagePack seems to be throwing down the gauntlet with their tagline. The parent comment also makes some important points that anyone considering the two should be aware of.


sigh. Why do we always have to misconstrue messages that conflict with our personal beliefs? :)

The GP was not declaring a winner. He was responding to the overreaching title of the link. Yes, there are similarities to JSON, and yes, it may be faster and smaller, but only in certain cases. It is very relevant for the GP to point out that in the most common JSON case (browser communication), those claims are basically false or only marginal.

This is reasoned argumentation, not arbitrary judgement to declare The One.


>> MessagePack has issues with UTF-8

> MessagePack only stores bytes. It's up to you to decide how to interpret those bytes.

Then it's not really "like JSON", is it? I'm not saying that it is a bad thing, or that it isn't a better option for many current uses of JSON, but "like JSON but fast and small" is a misleading pitch.


How's that? Just use UTF-8 for your strings and you're on par with JSON, no?


JSON can't carry binary data because it must be UTF-8. Glancing over the spec, msgpack commits the complementary sin: it can't carry UTF-8 data because it can only carry binary data. Yes, it is possible to stuff some UTF-8 bytes into a binary, but you still semantically have a binary, which could be anything. You must bring additional external information to the party to know what that binary hunk actually is.

In theory this may sound like no big deal; in practice I've observed in similar cases it's a disaster. Average developers routinely muck this up (and that's just me being conservative, above-average ones can choke on this problem too). msgpack really ought to have a dedicated string type, and either declare that this string is always a particular encoding, or give a way to declare what encoding the string is in. (The second is more flexible and arguably more correct, but in something like this where there's going to be dozens of libraries trying to implement it, it is virtually guaranteed that a number of them will muck the variable encoding support up badly, so in practice I'd go with mandatory UTF-8 too.)
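To make the point concrete, here is a minimal sketch (not msgpack itself, just an illustration) of what a dedicated string type looks like on the wire: a tag byte declaring "UTF-8 text follows", a length, then the bytes. Without the tag, a decoder receiving the same payload bytes has no way to know whether they are text or an opaque blob. The tag values below are hypothetical:

```python
import struct

TAG_STR = 0xA0  # hypothetical tag: "UTF-8 string follows"
TAG_BIN = 0xC4  # hypothetical tag: "opaque binary follows"

def encode_str(s: str) -> bytes:
    data = s.encode("utf-8")
    return struct.pack(">BI", TAG_STR, len(data)) + data

def encode_bin(b: bytes) -> bytes:
    return struct.pack(">BI", TAG_BIN, len(b)) + b

def decode(buf: bytes):
    tag, length = struct.unpack_from(">BI", buf, 0)
    payload = buf[5:5 + length]
    # The tag tells the decoder how to interpret the payload.
    return payload.decode("utf-8") if tag == TAG_STR else payload

# Same payload bytes, different meaning depending on the tag:
assert decode(encode_str("héllo")) == "héllo"
assert decode(encode_bin("héllo".encode("utf-8"))) == "héllo".encode("utf-8")
```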


For me JSON's strength is that you can throw typed data in simple structures straight at it and be confident of getting the same out the other end without worrying about the exact details of the encoding. If you exceed the supported types you have to work harder but in great many cases that is not necessary.

If MsgPack doesn't let me throw strings, numbers plus arrays and dictionaries containing arbitrary further structures without me having to worry about defining the encoding it isn't on par.

Obviously any sane data format can contain any data but not equally easily or efficiently.


I did a quick test on some JSON data I use in my webapps; ~100k-1M blobs, mostly numbers. gzipped MessagePack data was just about the same size as gzipped JSON data. If you care about size on the wire over HTTP, then MessagePack may not be an improvement. It's amazing how well gzip compresses the stupid ASCII encoding of numbers in JSON.
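This kind of measurement is easy to reproduce with nothing but the standard library. A quick sketch (zlib here rather than gzip proper; the container overhead differs by a few bytes, but the compression is the same DEFLATE; the payload is synthetic):

```python
import json
import zlib

# A number-heavy payload, similar in spirit to the blobs described above.
data = {"points": [[i, i * i % 9973] for i in range(5000)]}

raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
packed = zlib.compress(raw)

# DEFLATE finds the repetitive structure in ASCII digits very effectively.
print(f"JSON: {len(raw)} bytes, deflated: {len(packed)} bytes")
```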


Doesn't have to be stupid. A normal integer in binary representation ALWAYS consumes 4 bytes. With string representation you have 1000 numbers that will beat that as they only need <=3 characters. Numbers in that range are actually used very frequently.


Funny you should mention that. Msgpack uses a variable-length binary encoding for integers. It can, for example, fit integers in the interval [-32, 127] into a single byte. And unlike that fixed four-byte encoding you mention, it can also handle 64-bit integers.
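The fixint trick is easy to see in a sketch. The single-byte ranges below match the published msgpack format (positive fixint 0x00–0x7f, negative fixint 0xe0–0xff); everything larger is simplified here to a single 64-bit fallback rather than msgpack's full ladder of widths:

```python
import struct

def pack_int(n: int) -> bytes:
    # Positive fixint: a value in 0..127 is its own byte.
    if 0 <= n <= 127:
        return bytes([n])
    # Negative fixint: -32..-1 stored in one byte, 0xe0..0xff.
    if -32 <= n <= -1:
        return bytes([n & 0xFF])
    # Simplified fallback: 0xd3 marks a signed 64-bit big-endian int.
    return b"\xd3" + struct.pack(">q", n)

assert len(pack_int(100)) == 1     # fits in a single byte
assert len(pack_int(-5)) == 1      # so do small negatives
assert len(pack_int(10**12)) == 9  # tag byte + 8 payload bytes
```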

(I'm not disagreeing with you, or anything. I just like talking about binary encodings.)


There is also zigzag encoding, something Protocol Buffers (I believe) introduced for general usage. Thrift later copied it for the compact encoding.

It's pretty interesting.
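ZigZag maps signed integers to unsigned ones so that values near zero, positive or negative, stay small before variable-length encoding. A sketch of the 64-bit version as described in the Protocol Buffers encoding docs:

```python
def zigzag_encode(n: int) -> int:
    # Interleaves negatives and positives: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4
    return (n << 1) ^ (n >> 63)

def zigzag_decode(z: int) -> int:
    return (z >> 1) ^ -(z & 1)

assert [zigzag_encode(n) for n in (0, -1, 1, -2, 2)] == [0, 1, 2, 3, 4]
assert all(zigzag_decode(zigzag_encode(n)) == n for n in range(-1000, 1000))
```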


> A normal integer in binary representation ALWAYS consumes 4 bytes

That's not true. Most languages have 1-byte integers, 2-byte integers and 4-byte integers.


Truly awesome languages use binary-coded decimal.

Ok, I'll show myself out.


> Why do we always want to declare a winner?

You are right. But I came to that page after reading about a faster and smaller JSON at HN, thus I was expecting to find a faster and smaller JSON.

It's just another case of bad marketing hurting a brand.


Oh for crying out loud. In a lot of use cases, msgpack can be treated as a smaller, faster JSON. Let's do a quick experiment. Here's a small, very ordinary JSON document:

http://graph.facebook.com/btaylor

Using Python's json and simplejson modules, which are quite zippy, the encoding time was about 11.4 us, and the decoding time was about 8 us. With msgpack, the encoding and decoding times were 2.7 us and 1.7 us, respectively. The encoded size was 187 bytes with JSON, and 140 bytes with msgpack. Zlib compression brought those down to 127 and 125 bytes, while bringing the total encode-and-compress time up to 25 us for JSON and 16 us for msgpack.
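Those numbers are easy to sanity-check with a stdlib-only harness along these lines (swap `json.dumps`/`json.loads` for `msgpack.packb`/`msgpack.unpackb` if you have msgpack-python installed; the document below is a made-up stand-in for the Graph API response, not the actual payload):

```python
import json
import timeit
import zlib

# A small document shaped roughly like the profile mentioned above.
doc = {"id": "220439", "name": "Bret Taylor", "username": "btaylor",
       "link": "http://www.facebook.com/btaylor", "gender": "male"}

encoded = json.dumps(doc).encode("utf-8")

# Average per-call time in microseconds over many iterations.
enc_us = timeit.timeit(lambda: json.dumps(doc), number=10000) / 10000 * 1e6
dec_us = timeit.timeit(lambda: json.loads(encoded), number=10000) / 10000 * 1e6

print(f"encode: {enc_us:.1f} us, decode: {dec_us:.1f} us, "
      f"size: {len(encoded)} B, zlib: {len(zlib.compress(encoded))} B")
```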


My first thought was "maybe it would be a useful JSON replacement for Unity" (which still lacks native or even solid third-party JSON support). I'm not sure how big a deal the lack of seamless UTF-8 support will be in practice (in most cases no big deal, I think), but it does look promising owing to its C# implementation.

I do agree the tagline is bound to annoy.


> sigh. Why do we always want to declare a winner?

You may want to ask whoever wrote the copy for MessagePack's website; they are kind of the ones declaring:

> It's like JSON. but fast and small.


> Using it in the browser was never the use case for MessagePack.

Hopefully it will be soon, with degradation for old browsers. I'd like to be able to load an avatar as binary png data with the rest of the data.


Assuming you are using a browser (other clients support this too)... For now, can't you just encode the image in your JSON using base64 and use the 'data' URI scheme to decode it? Sure, the encoded data can't be more than 32k in some browsers, but that is a huge avatar.
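That round trip can be sketched with the standard library (the PNG bytes and field names here are illustrative stand-ins):

```python
import base64
import json

# Stand-in for real PNG bytes read from disk.
avatar_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 64

# Embed the image as a data: URI inside ordinary JSON.
payload = json.dumps({
    "user": "btaylor",
    "avatar": "data:image/png;base64,"
              + base64.b64encode(avatar_png).decode("ascii"),
})

# The client splits off the prefix and decodes the rest.
uri = json.loads(payload)["avatar"]
decoded = base64.b64decode(uri.split(",", 1)[1])
assert decoded == avatar_png  # round-trips, at roughly 33% size overhead
```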


IE8 has the 32k limitation. I don't think anything else does.


Doh, you are correct. I had that piece of info reversed. Thanks for clarifying.


Or a JSON add-on for your browser, like http://jsonview.com/


The JavaScript msgpack codec is a couple of years old now, by the looks of the github repository. I see a custom base64 function, and no use of typed arrays. I wonder if these numbers might change a little given more modern features like typed arrays and fetching arraybuffers via XHR2.

Biggest issue for me with JSON is needing to base64 any binary data, so msgpack could actually be quite useful.


I noticed that opportunity too. If anyone is looking for a fun project...

The next thing would be to have practices and/or libraries that switch techniques based on browser support.



Question out of curiosity: did you notice any performance improvements from using DataView and ArrayBuffer?


I never used the "official" js codec because it didn't exist when I started. My codec has been optimized for nodejs and does rather well there. I recently did a jsperf for the browser port (msgpack-js-vs-json) and while slower than the nodejs version, it looks a bit faster than the "official" one.


The Pinterest testimonial on the msgpack website is a very good example of a use case: serializing and deserializing for memcache. They don't claim to be using it for browser-server communication.


For those who are interested in the Pinterest use case (Memcached + MessagePack), see this url too: http://engineering.pinterest.com/posts/2012/memcache-games/


Did you guys see this video? http://vimeo.com/53261709 GitHub uses MessagePack for their internal RPC protocol.


Another use is storing a lot of small items in Redis. It's compact in memory, and can be manipulated using Redis 2.6's Lua scripting engine, which is a very handy combination.


That benchmark you referenced has nothing but strings for its data. That's not msgpack's strong point. Do another benchmark with nothing but small integers and arrays and you'll find msgpack kicks JSON's tail in both CPU usage and bandwidth, even when comparing the native JSON.parse with my pure JavaScript msgpack implementation using typed arrays.


I find it fascinating that people claim some computer coding is "human readable" and some is not. Pencil marks on a piece of dead tree are often "human readable". Bits on a hard drive (or on a FLASH drive or in DRAM or ...) are not.

Bits on a some kind of modern electronic storage device require a program (software) to interpret them.

When people say something is not "human readable" they mean not human readable by their favorite text editor, etc. Nothing would prevent a different program from making the binary blob "human readable".

So I claim that, unless we use e.g. Williams-Kilburn tube memory, which stored bits on the screen of a CRT http://www.computerhistory.org/revolution/memory-storage/8/3... and which were, in fact, "human readable", it does us all a disservice to say that the binary blob itself is somehow deficient just because your program does not interpret it the way you like!


I think that people who say that mean "without special interpretation tools": if you came across the format in the wild, you would immediately be able to understand it, at least a little, based on nothing more than an understanding of some simple protocols like SMTP and some training in common programming languages like JavaScript.

That said, personally, I find tagged-length binary protocols very easy to read with nothing more than a hex editor, and often feel that their simplicity of implementation (no state machine required for their parser) puts them much farther into the camp of "human usable" than most file formats people like to claim are "human readable".


A lot of these bit-shaving goose chases would be short-circuited if browsers exposed zlib!


What would be the advantage of exposing it, instead of compressing the data transparently in the background (like is already done)?


Freeing up the time of compulsive bit chasers?


The situation is pretty spotty across HTTP, TLS and WebSockets. On plain HTTP it mostly works from server to client, but that's about the only usable case.



