

Fast deserialization in Python - bravura
http://blog.metaoptimize.com/2009/03/22/fast-deserialization-in-python/

======
jmillikin
I'm the author of jsonlib, and I registered specifically to post this message.
Please, please, _please_ do not use cjson!

First, it is unmaintained. The latest version available was posted on August
24, 2007. When you encounter one of its myriad bugs, you'll either have to
patch it yourself or pick another JSON library. Just skip the intermediate
step and use another library to begin with.

Second, it is _buggy_. In some cases, parsing text it just generated will
return a different value from what you passed in! It's almost entirely
ignorant of Unicode, and what little it tries to parse it gets wrong.
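
A round-trip check of the kind described above can be sketched roughly like this. It uses the standard library's json module purely as a stand-in (it is not cjson), and the `round_trips` helper and sample values are made up for illustration; swap in whichever library you want to vet.

```python
import json

def round_trips(value):
    """Return True if decoding the encoded value gives back an equal value."""
    return json.loads(json.dumps(value)) == value

# Illustrative inputs that tend to expose Unicode and escaping bugs.
samples = [
    u"caf\u00e9",                              # non-ASCII, inside the BMP
    u"\U0001d11e",                             # outside the BMP (surrogate pair in \u escapes)
    {u"key": [1, 2.5, None, True]},            # nested containers and scalars
    u'quote " and backslash \\ and slash /',   # characters that need escaping
]

# Collect every sample that does not survive encode -> decode unchanged.
failures = [s for s in samples if not round_trips(s)]
print("failures:", failures)
```

Any value that ends up in `failures` is a case where the library returned something different from what it was given.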

Third, it's exceedingly non-compliant. The text it parses and generates bears
only a passing resemblance to JSON. Libraries vary in how closely they conform
to the spec, depending on the personal preferences of their authors --
I prefer strict conformance, others less strict -- but cjson is _so_ different
as to be simply unusable.

Yes, it's fast. I know. I wrote jsonlib partly because I was unsatisfied with
simplejson's performance, and one goal (never truly achieved) was always to
surpass cjson. However, speed isn't everything. As the saying goes, "if I want
my math performed fast and wrong I'll ask my cat".

In my opinion, the _only_ Python JSON libraries worth considering are:

* simplejson -- it's in the standard library (as the json module, since Python 2.6), and should therefore be considered first and most thoroughly.

* jsonlib -- it's fast, well-tested, and standards-compliant.

* demjson -- has several options for reliable parsing of invalid input.

Last time I checked, jsonlib's and simplejson's C extensions were neck-and-neck
performance-wise. In some quick, unscientific tests, jsonlib read faster and
simplejson wrote faster. However, simplejson's extensions are only used for
certain subsets of input -- if you want to use an uncommon feature,
performance will degrade. jsonlib is implemented entirely in C, which
avoids this problem at the cost of complexity.
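
For anyone who wants to repeat that kind of quick, unscientific read-versus-write test, here is a minimal sketch. It assumes the standard library's json module as the library under test, and the payload and the `bench` helper are hypothetical; pass a different library's loads/dumps functions to compare.

```python
import json
import timeit

# Hypothetical payload; real results depend heavily on the shape of your data.
payload = {
    "items": [
        {"id": i, "name": "item-%d" % i, "score": i * 0.5}
        for i in range(1000)
    ]
}
text = json.dumps(payload)

def bench(func, arg, number=50):
    """Return the best of 3 timings, each covering `number` calls of func(arg)."""
    return min(timeit.repeat(lambda: func(arg), number=number, repeat=3))

read_time = bench(json.loads, text)      # deserialization ("reads")
write_time = bench(json.dumps, payload)  # serialization ("writes")
print("loads: %.4fs  dumps: %.4fs" % (read_time, write_time))
```

Taking the minimum over repeats reduces noise from other processes, but the numbers are still only meaningful for data shaped like your own.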

Apologies for the brain-dump, but even if you skip right over it, please
remember: don't use cjson.

------
piramida
It's really surprising that it beats cPickle with the binary protocol. Good stuff,
thanks.

~~~
bravura
My understanding is that cPickle can be slow because of introspection.
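
A rough way to check this on your own data is to time both deserializers side by side. This is only a sketch: it assumes the standard library's pickle (which took over from cPickle in Python 3) with binary protocol 2, uses the standard json module in place of the libraries from the article, and the sample data is made up.

```python
import json
import pickle
import timeit

# Made-up data; only built-in types, so both formats can represent it.
data = {
    "tokens": ["word%d" % i for i in range(5000)],
    "counts": list(range(5000)),
}

pickled = pickle.dumps(data, protocol=2)  # a binary pickle protocol
encoded = json.dumps(data)

# Best of 3 timings, each covering 20 deserializations.
pickle_time = min(timeit.repeat(lambda: pickle.loads(pickled), number=20, repeat=3))
json_time = min(timeit.repeat(lambda: json.loads(encoded), number=20, repeat=3))
print("pickle.loads: %.4fs  json.loads: %.4fs" % (pickle_time, json_time))
```

Which one wins will vary with the Python version and the data, so treat the output as a local measurement rather than a general result.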

