
No JSON/MAPS interoperability in 17.0? - rahij
http://erlang.org/pipermail/erlang-questions/2014-March/078228.html
======
jerf
Erlangers are particularly grumpy about JSON because, unlike many other
languages, Erlang lacks a decent default serialization and deserialization of
JSON. This is primarily because a string is a list of integers, making it hard
to distinguish between the two, and, somewhat less importantly but still
problematically, because of the clunkiness of the dictionaries. (This
message appears to be in the context of the soon-to-come native addition of
dictionaries, but as I imply, unfortunately that is not the biggest problem
with JSON deserialization; the nature of strings in Erlang is. Binaries are
not a precise conceptual match for a JSON string either, unfortunately; in my
environment I actually use binaries that way, because I can guarantee the
differences won't affect me since I control both ends of the protocol, but
that is not safe in general.)

This is not to say that the arguments are wrong, but I think you may not
understand why Erlangers feel the way they do until you realize that Erlangers
also don't generally get to experience the _advantages_ of JSON, either; when
you get none of the advantages and only the disadvantages, anything looks bad,
no matter how good it may be in the abstract.

This is a particularly rich irony in light of the fact that Erlang is
_conceptually_ the closest language I know to one that natively uses
JSON as its only data type. Erlang has no user-defined types, and everything
is an Erlang term... imagine if JS worked by only allowing JSON, so, no
objects, no prototypes, etc. The entire language basically works by sending
"ErlON" around as the messages, including between nodes. It's just that
there's no good mapping whatsoever between Erlang's "ErlON" and JSON.

~~~
rubyrescue
Erlang has amazing default serialization and deserialization.
term_to_binary(T) and binary_to_term(B)

~~~
jerf
(I edited the last paragraph in before seeing your post here.)

~~~
rubyrescue
The last paragraph is absolutely right. It's incredibly ironic...

------
NathanKP
The first requirement that I look for in a data interchange format is that it
should be readable, because I don't want to look at binary data responses. The
second thing I look at is support for encoding and decoding it.

The only three formats I've found so far that are readable and well supported
are XML, JSON, and YAML. XML is too hefty and wasteful. YAML has had a bad
history of insecure encoders and decoders but overall is my favorite data
format. However, it still has the downside of needing a special decoder since
browsers don't support it, and it requires specific indentation for its
hierarchical data format which is wasteful in its own way.

That just leaves JSON in my opinion. It's easily understood and read, and
native browser and Node.js encoding and decoding is more than fast enough.

~~~
mwcampbell
I always use "python -m json.tool" to pretty-print JSON data before looking at
it. I could just as easily do the same with a MessagePack or BERT
pretty-printer.
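
For reference, that tool is roughly a parse-and-re-emit over the stdlib `json`
module; a minimal sketch of the same pretty-printing done directly (the sample
document here is made up for illustration):

```python
import json

raw = '{"name": "msgpack", "versions": [1, 2], "binary": false}'
# Roughly what `python -m json.tool` does: parse, then re-emit with indentation.
pretty = json.dumps(json.loads(raw), indent=4)
print(pretty)
```

The same two-step works for any format with a parser and a pretty-printer, which
is the point: readability can live in the tooling rather than the wire format.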

------
jfoster
This essentially amounts to "JSON is slower and more CPU intensive than it
absolutely needs to be."

I can live with that. Any solution that people come up with, there will be
folks who can point out some sort of flaw in it. Sometimes the flaws are worth
paying some attention to, when fixing them might lead to some tangible
benefits. It's difficult to imagine a slightly less CPU intensive replacement
for JSON bringing any tangible benefits with it, though.

~~~
icebraining
The lack of hypertext is a valid complaint, in my opinion, for a language
designed in the web age. Sure, you can build a format on top of JSON that
treats certain values as URIs, but that kind of thing should be built-in.

~~~
Tobani
I don't quite follow the whole hypertext problem. I mean, URIs can be encoded
as strings; there's no problem there.

HTML doesn't handle URIs any differently. They're just string attributes on
elements (generally).

Is the problem that there aren't clients that can generically consume the JSON
and find the hyperlinks? Is the idea that the JSON should be renderable as a
site, kind of like the whole XML/XSLT idea?

I guess I just don't get what the problem is here. What would built-in URIs
look like?

~~~
icebraining
_HTML doesn't handle URIs any differently. They're just string attributes on
elements (generally)._

Sure it does, when it defines the semantics of the elements. HTML says "the
'href' attribute of the 'a' element is the URI of a linked resource". There's
no special syntax because it doesn't need one; URIs are identified by their
elements.

But in a more general format that doesn't assign particular semantics to
nodes, you need syntax that identifies URIs. For example, in Turtle - which is
a format that I quite like - you encode strings using double quotes around the
value, but URIs are enclosed in angle brackets.

Example:

    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix ex: <http://example.org/stuff/1.0/> .

    <http://www.w3.org/TR/rdf-syntax-grammar>
      dc:title "RDF/XML Syntax Specification (Revised)" ;
      ex:editor [
        ex:fullname "Dave Beckett";
        ex:homePage <http://purl.org/net/dajobe/>
      ] .

The advantage is that smart clients can parse that information and crawl the
documents, generating useful data even if they don't necessarily understand
this particular format. Think search engines, for example - they might not
understand that <span id="author"> identifies the author's name, but they can
still derive useful information from the web of pages that they crawl.

------
MagicWishMonkey
Using something other than JSON is a great idea if you don't want anyone to
ever bother using your API.

~~~
strmpnk
Content negotiation is a great way to expose multiple serialization formats.
There's no reason it has to be just one of the well-known formats. The main
tricky part is the equivalence of more complex structures in JSON; for
example, msgpack supports more than just string keys in maps.
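
A small illustration of that equivalence problem, using Python's stdlib `json`
as the encoder (other encoders may reject non-string keys outright instead of
coercing them):

```python
import json

# JSON object keys must be strings; msgpack maps have no such restriction.
# Python's stdlib encoder silently coerces integer keys to strings, so a
# map with non-string keys does not survive a JSON round trip intact.
original = {1: "one", 2: "two"}
round_tripped = json.loads(json.dumps(original))
print(round_tripped)  # keys come back as the strings "1" and "2"
```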

~~~
hapless
So now I get to support more than one format, with corresponding increases in
complexity...

...But only the JSON format will have any users, and I still have to deal with
anything I find objectionable about JSON.

~~~
strmpnk
In most systems where I've deployed multi-format support, it was a matter of a
few extra lines of code rather than a doubling of effort. Pretty much all JSON
values have obvious (i.e. automatic) equivalents in other formats. JSON is
truly the lowest common denominator, which is its strength.
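
A minimal sketch of what those few extra lines can look like: a table mapping
media types to serializer functions, dispatched on the Accept header. The
names here are illustrative, not from any particular framework:

```python
import json

# Map each supported media type to a serializer; the handler itself stays
# format-agnostic and adding a format is one more dictionary entry, e.g.
# "application/x-msgpack": msgpack.packb (third-party, shown as a comment).
SERIALIZERS = {
    "application/json": lambda value: json.dumps(value).encode("utf-8"),
}

def render(value, accept="application/json"):
    # Fall back to JSON, the lowest common denominator, for unknown types.
    serialize = SERIALIZERS.get(accept, SERIALIZERS["application/json"])
    return serialize(value)

body = render({"balance": "12.65334221"}, accept="application/json")
```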

------
hapless
Only an Erlang mailing list would prompt someone to complain that a document
format requires valid UTF-8.

This is a glimpse of the last place on earth where relying on US ASCII is
considered a positive good.

~~~
jnevill
I was thinking to myself while reading the link, "Why on earth would someone
think UTF-8 is a problem?" and then I saw it was an Erlang thread. To complain
about it being slow to validate while talking in an Erlang thread... Christ.
It must be frustrating trying to deal with JSON, which is full of strings, in
a language like Erlang.

~~~
rdtsc
JSON decoders decode strings to binaries.

[https://github.com/talentdeficit/jsx](https://github.com/talentdeficit/jsx)

Erlang has always been receiving and sending data. It just likes to deal with
well-defined binary messages, and it likes to encode/decode them into its own
representations (records, terms) at the boundary where they enter and leave
the system.

------
antirez
Length-prefixed formats that don't make assumptions about encoding are a very
good pick. The most important thing is that once you read the length, you can
just bulk-read the payload without actually parsing it; parsing is up to the
higher layer (if it's needed at all).

A general-purpose serialization format that requires per-character processing
is a terrible pick.
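
A minimal framing sketch of the idea (a hypothetical 4-byte big-endian length
header; real protocols vary in header size and endianness):

```python
import struct

# Length-prefixed framing: a 4-byte big-endian length, then the raw payload.
# A reader can bulk-read the payload without scanning it byte by byte, and
# the payload bytes are opaque - no encoding assumptions are made.
def frame(payload: bytes) -> bytes:
    return struct.pack(">I", len(payload)) + payload

def unframe(data: bytes) -> bytes:
    (length,) = struct.unpack_from(">I", data)
    return data[4:4 + length]

msg = frame(b'{"opaque": "payload"}')
payload = unframe(msg)
```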

~~~
rdtsc
Agreed. I had to implement base64 encoding to pass binary data through JSON;
yes, it works, but it seems terribly wasteful.
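
A quick sketch of that waste, using Python's stdlib (base64 emits 4 output
bytes for every 3 input bytes, roughly 33% growth, plus the encode/decode work
itself; the blob here is arbitrary test data):

```python
import base64
import json

# Shipping binary data through JSON means base64-wrapping it in a string.
blob = bytes(range(256)) * 12  # 3072 bytes of arbitrary binary data
wrapped = json.dumps({"blob": base64.b64encode(blob).decode("ascii")})
recovered = base64.b64decode(json.loads(wrapped)["blob"])
overhead = len(wrapped) / len(blob)  # ~1.34x before any compression
```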

------
nubs
> Its numbers representation is double-precision floating-point, meaning it's
> incredibly imprecise and limited.

Except that JSON actually just specifies the number type as an arbitrary
precision decimal number.

Many implementations use floating point numbers when decoding JSON, but that
is not inherent in JSON.

My biggest complaint about numbers in JSON is that floating-point numbers
often get turned into integers when encoding/decoding with some
implementations (e.g. (float)2 gets encoded as just 2, and when decoding, it
comes back as an integer rather than a floating-point number).

~~~
vesinisa
Well, regardless of the specification, most JSON deserializers approximate
numeric values as floats, at least in the default configuration. Enough so
that the Bitcoin exchange JSON APIs I know of all put numbers in strings: so
`{"balance": "12.65334221", "currency": "BTC"}` instead of
`{"balance": 12.65334221, "currency": "BTC"}`. It could easily be catastrophic
if there were any rounding errors in your currency calculations, and if the
bug arises already in your JSON deserializer, it might be hard to detect even
if your program internally employs BigNums (arbitrary precision) or the like.
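
For what it's worth, some decoders let you opt out of the float default;
Python's stdlib, for example, accepts a `parse_float` hook (a capability of
that one implementation, not something every language's decoder offers):

```python
import json
from decimal import Decimal

# parse_float routes every JSON number with a fractional part through
# Decimal instead of float, so currency values survive exactly.
doc = '{"balance": 12.65334221, "currency": "BTC"}'
exact = json.loads(doc, parse_float=Decimal)
```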

------
jcizzle
JSON and Erlang just don't get along in the way you'd like them to,
unfortunately. Some of that has to do with how Erlang handles (or doesn't
handle) strings. Some of it is that, until Maps were introduced, Erlang had no
data structure that closely mirrored JSON's structure.

The problem isn't really JSON - JSON is an exceptional format for what it is
supposed to do - the problem is that Erlang was created for a specific purpose
and that purpose wasn't to vend out strings/JSON over HTTP.

~~~
rdtsc
> Erlang handles (or doesn't handle) strings.

JSON strings are translated to binaries. There is no reason to keep repeating
the "strings are lists of integers" complaint. Maps will help.

> The problem isn't really JSON

I think the author disagrees. He talks about the problems of "JSON". There are
a few he notes:

* No binaries.

* No way to have hyperlinks.

* Limited floating-point number implementations.

~~~
jcizzle
You have to translate your strings to binaries before you try to encode to
JSON. A JSON encoder has no way of determining whether it should be encoding
an array or a string without this. Instead of one string type, you have
"strings" and <<"strings">>, and due to the lack of typing, it is actually
pretty tough to keep track of which one you have. It also is an exercise in
finger acrobatics to <<>> all your strings.

JSON doesn't solve every problem. It does solve, quite elegantly, the problem
of creating an endianness-free interchange format that is easily consumed,
constructed, and debugged.

------
denzquix
Since the author seems to have a problem with text-based formats in general,
here's a counterpoint by Mike Pall on the LuaJIT mailing list [1]:

"On a tangent: IMHO most binary serialization formats like BSON, MessagePack
or Protocol Buffers are misdesigns. They add lots of complications to save a
couple of bytes. But that does not translate into a substantial improvement of
the parsing speed compared to a heavily tuned parser for a much simpler
text-based format."

[1] [http://www.freelists.org/post/luajit/Adding-assembler-
code-t...](http://www.freelists.org/post/luajit/Adding-assembler-code-to-Lua-
programs,12)

------
arethuza
So don't use JSON when you need super high performance and be careful about
parsing JSON numbers?

It may not be perfect, but compared to the complexity of the XML world and the
opaqueness of binary formats JSON is a very pleasant compromise.

Shame about it not having comments though.... :-)

------
robgering
The author suggests using MessagePack, which I hadn't seen before but looks
really cool.

[http://msgpack.org/](http://msgpack.org/)

~~~
denzquix
But that also uses a double-precision floating-point number representation and
UTF-8 for strings, so the only benefit (according to the arguments given in
the OP) seems to be binary vs. text.

~~~
craigching
> But that also uses double precision floating point number representation

I don't think JSON requires the number representation to be double-precision;
it can be up to the parser to decide how to represent a number. I think most
people get this wrong because the number implementation in _JavaScript_ is
double-precision, but that's not necessarily true for JSON. Am I wrong about
that? Does JSON require the number format to be double-precision?

~~~
denzquix
I had to check but yeah, that's true, the ECMA spec even says explicitly in
its intro: "JSON is agnostic about numbers", they are "only a sequence of
digits".

(Unrelated observation from looking at the spec again: there's no
standard/recommended decoding behaviour if an object has several members with
the same key.)
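
For example, Python's stdlib decoder quietly keeps the last duplicate; that's
a choice of this one implementation, not a guarantee of the format, and other
decoders may keep the first, keep both, or reject the document outright:

```python
import json

# The JSON specs leave duplicate-key behavior undefined.
# Python's stdlib decoder silently lets the last value win.
doc = '{"key": "first", "key": "second"}'
decoded = json.loads(doc)
print(decoded)
```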

------
EC1
>I do not have a magical format to propose to replace it

So why did you even comment in the first place?

~~~
garretraziel
I don't agree with you. I think someone can point out a problem without
providing a solution at the same time.

~~~
barbs
He didn't just point out a problem, he actively tells people to "stop using
JSON" without providing an alternative.

------
peteretep

> It has to be valid UTF-8, meaning it's incredibly slow to validate.

If you make basic mistakes about what the format allows, the rest of your
comment is liable to be junk.

------
efuquen
> It's text-based, meaning it's incredibly slow to parse.

This is precisely the reason it is used so heavily: it's an easily readable
format to reason about. And _incredibly slow_ is a pretty relative term; when
you're waiting on database calls or doing other complex logic that is orders
of magnitude slower, who cares how "slow" JSON parsing is?

> It has to be valid UTF-8, meaning it's incredibly slow to validate.

I think being able to embed all sorts of different characters and languages
is, again, a plus. See the argument above about performance.

> Its numbers representation is double-precision floating-point, meaning it's
> incredibly imprecise and limited.

This argument I don't get. I might be missing something, but in my experience
you can just put in a plain old integer and any parser in any language will
extract it as an exact integer. Nobody is converting a "1" in JSON into a
double/float. Maybe somebody can elaborate on what the author might have
actually meant?
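
For what it's worth, both halves of this are true at once: decoders with a
real integer type keep integers exact, while a decoder that stores every
number as an IEEE-754 double cannot, once values pass 2^53. A sketch in
Python, whose stdlib decoder keeps integers exact:

```python
import json

# Python's decoder keeps JSON integers exact, so "put a plain old integer
# in" works here. A decoder that stores every number as a double cannot:
# 2**53 + 1 has no double representation and rounds to 2**53.
exact = json.loads("9007199254740993")  # 2**53 + 1, decoded as an int
as_double = float(9007199254740993)     # what a double-only decoder would keep
```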

So, really, the argument all boils down to: it's slow and wasteful. Well,
while that's true, I think it's been established time and time again that
Moore's law has made it possible to value programmer time over CPU overhead,
to a reasonable extent (i.e. unless the overhead you're adding outpaces
Moore's law and makes infrastructure particularly expensive). If you have a
format that is human-readable, easy to understand, and simple, that helps
tremendously in software development, and it would take order-of-magnitude
performance hits, not just 2x or 3x, to make it a bad tradeoff (and even then,
if you weren't getting a lot of traffic, who would care?).

We live in a web-based world, and that world is fundamentally based on a
text-based protocol (HTTP) and text-based messaging formats. There are plenty
of valid and good reasons why it happened the way it did.

Final note: I would like to see some empirical evidence comparing the wasted
CPU cycles and energy that JSON uses against what would be used if all
messages were sent with msgpack or something like it instead. My own
inclination is that the number would be dwarfed by the overall energy used in
computation, but I would prefer to see evidence rather than conjecture if
you're going to make a point like that.

~~~
craigching
>> Its numbers representation is double-precision floating-point, meaning it's
incredibly imprecise and limited.

> Maybe somebody can elaborate on what the author might have actually meant?

I think JSON numbers are not necessarily double-precision; at least, I don't
think I've ever seen them specifically described that way. In JavaScript,
however, numbers are double-precision; there are no real integers in
JavaScript. So maybe the author misrepresented JSON numbers because JavaScript
numbers are double-precision. But that isn't necessarily the case for JSON,
AFAIK.

------
HugoDias
OK, I understand your argument; now tell me a better format for building an
API. Thanks...

~~~
icebraining
Turtle!

------
Zikes
If msgpack is able to match JSON for serialization/deserialization, couldn't
you make an API in JSON and offer msgpack as an alternative format?

~~~
rhizome31
In a project I'm working on, we do that with Protocol Buffers. Our main client
uses protobuf but we still provide JSON for other clients. We also use the
JSON API for debugging.

------
prottmann
Here are the reasons why I use JSON:

* It's text-based, meaning it's readable and easy to parse.

* It has to be valid UTF-8, so there's no hassle with other encodings.

* Its number representation is double-precision floating-point; I can choose what I want.

Who cares? Use whatever solves your problem in the best way.

------
Thaxll
Servers aren't CPU-bound anymore; there's no need to worry about that...

~~~
sigmaml
It depends a lot on the service that you provide. I provide compute services.
Most of my servers are CPU-bound most of the time.

------
jackmaney
Yes, because XML is so much more compact.

Oh, wait....

~~~
efuquen
I don't see XML mentioned once; I'm sure he would say it's worse. He'd
probably advocate some sort of binary format, since he mentioned msgpack. Did
you even read the comment?

------
deividy
Oh, now that you mention it, I'll use XML. It's very cool and compatible with
new technologies, like this new thing from Microsoft called C#.

