

In Depth Talk on Performance and Serialization (Protobuf, json, Thrift, etc.) - peterb
http://www.infoq.com/presentations/Dealing-with-Performance-Challenges-Optimized-Data-Formats

======
trjordan
Direct link to slides:
[http://qconsf.com/dl/qcon-sanfran-2011/slides/SastryMalladi_...](http://qconsf.com/dl/qcon-sanfran-2011/slides/SastryMalladi_DealingWithPerformanceChallengesOptimizedSerializationTechniques.pdf)

------
stock_toaster
I didn't see anything in the slides about compressing (gzip) any of the non-
binary formats (json, xml, etc). It would be interesting to see those in a
comparison like this too.

Also would have been neat to see tnetstrings <or insert favorite format here>.
;)
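
A rough sketch of the kind of gzip comparison meant here, using only the Python standard library. The `records` payload is a made-up stand-in, not the messages from the talk, so the numbers only illustrate the method.

```python
import gzip
import json

# Hypothetical sample payload; the talk's actual messages are not reproduced here.
records = [
    {"id": i, "name": "item", "price": 3.0, "in_stock": True}
    for i in range(200)
]

# Compact JSON (no spaces after separators), then gzip the encoded bytes.
raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
compressed = gzip.compress(raw)

print(f"raw JSON: {len(raw)} bytes, gzipped: {len(compressed)} bytes")
```

Repetitive payloads like this one compress well. A single small message may not, since gzip adds its own header and trailer, so a fair comparison would need to use realistic message sizes.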

------
BlueZeniX
I seriously wonder how it's possible that in his tests JSON got smaller than
MsgPack.

MsgPack is so close to JSON in structure, but so much more compact (1 byte
type header, with type and payload combined for small numbers), that the
result doesn't make any sense.

~~~
stock_toaster
Consider for example a message with a double in it. A double in msgpack is 9
bytes (according to their spec[1]). In json it is just however many bytes it
takes to represent the "number" in ascii. So if the number happens to be a
very small double, such as 3.0, it may be just 3 bytes to store the double in
json (depending on the encoder?), as opposed to 9 bytes for msgpack. Something
similar could be said for integers too. '3' is only one byte in json,
but would be 5 bytes in msgpack when trying to encode an int32.

The only explanation I could think of is that something similar is occurring
with the messages in the talk. Looking at the thrift description, it does
appear that there are int32s and doubles in the messages.

[1]:
[http://wiki.msgpack.org/display/MSGPACK/Format+specification...](http://wiki.msgpack.org/display/MSGPACK/Format+specification#Formatspecification-double)
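
The byte counts above can be checked with a small sketch. It builds the MessagePack encodings by hand with `struct` rather than using a MessagePack library, assuming the spec's fixed-width layouts (type byte `0xcb` for float 64, `0xd2` for int 32). The helper names are made up for illustration.

```python
import json
import struct


def msgpack_float64(value: float) -> bytes:
    # MessagePack float 64: type byte 0xcb, then 8 big-endian IEEE 754 bytes.
    return struct.pack(">Bd", 0xCB, value)


def msgpack_int32(value: int) -> bytes:
    # MessagePack int 32: type byte 0xd2, then 4 big-endian bytes.
    return struct.pack(">Bi", 0xD2, value)


def json_size(value) -> int:
    # Size of the value as Python's json module writes it.
    return len(json.dumps(value).encode("ascii"))


# (JSON bytes, MessagePack bytes) for each case discussed above.
sizes = {
    "double 3.0": (json_size(3.0), len(msgpack_float64(3.0))),
    "int 3": (json_size(3), len(msgpack_int32(3))),
}

for label, (json_bytes, msgpack_bytes) in sizes.items():
    print(f"{label}: JSON {json_bytes} bytes, MessagePack {msgpack_bytes} bytes")
```

The 5-byte figure only applies when the value is forced into the int 32 type. The format also defines a one-byte positive fixint for values 0 to 127, so an encoder that picks the smallest type would write 3 in a single byte.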

~~~
saurik
One thing that is important to point out is that JavaScript (and thereby
JSON) does not separate integers from floating-point numbers; so if you have
the double 3.0 it will be represented in JSON as "3", not "3.0". If
MessagePack does not have a mechanism for dropping down to int32 for integral
doubles, then you will actually see a 9:1 difference for this specific case
(although if you are often storing integers in your doubles maybe you should
be using int32 anyway, in which case 3.1 may have been a better general
example than 3.0 ;P).

~~~
stock_toaster
I think that depends on the library you are using. 3.0 (not truncated to
simply 3) is certainly a valid json number. However, point taken that it is a
meaningful distinction for javascript.

~~~
saurik
Oh, it is certainly valid; it is just that for purposes of comparing relative
space you should be looking at the shortest representation. "3.0000000" is
also valid, and is the same 9 bytes as MessagePack, but there is no reason
you'd use that to encode this specific number ;P.
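
To illustrate the point about representations: the sketch below shows that "3", "3.0" and "3.0000000" all parse to the same value, and that Python's `json` encoder happens to keep the ".0" for a float. `shortest_number` is a hypothetical helper, not part of any library, that picks the shorter of the two spellings.

```python
import json

# All three spellings are valid JSON numbers and compare equal once parsed.
spellings = ["3", "3.0", "3.0000000"]
parsed = [json.loads(s) for s in spellings]
lengths = [len(s) for s in spellings]


def shortest_number(value: float) -> str:
    # Hypothetical helper: drop the fractional part of an integral double
    # when that is shorter, as JavaScript's JSON.stringify(3.0) does.
    default = json.dumps(value)
    if value.is_integer():
        return min(str(int(value)), default, key=len)
    return default


print(lengths)               # byte length of each spelling
print(json.dumps(3.0))       # what Python's encoder emits for the float
print(shortest_number(3.0))  # integral double without the ".0"
print(shortest_number(3.1))  # non-integral values are left alone
```

Note that in Python "3" parses back as an int and "3.0" as a float, so the shorter form loses the type distinction on a round trip. In JavaScript that distinction does not exist in the first place.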

------
chad_walters
I recommend that anyone interested in this topic look closely at the raw data
provided in the deck on slides 38-41.

