
JSON or XML: Just Decide - urbanjunkie
http://www.mnot.net/blog/2012/04/13/json_or_xml_just_decide
======
justin_vanw
JSON is great, but it is not nearly as flexible as XML, partly because of
attributes. Also, because of its JS heritage and compatibility, lots of
common things are _not representable_ in JSON. This is mostly because object
keys MUST be strings.

Examples:

    
    
        In [1]: from simplejson import dumps
    
        In [2]: dumps({1: 5, '1': 0})
        Out[2]: '{"1": 0, "1": 5}'
    

Derp, good luck figuring out what that is supposed to mean.

    
    
        In [3]: dumps({None: None})
        Out[3]: '{"null": null}'
        In [4]: dumps({False: False})
        Out[4]: '{"false": false}'
    

Oh snap! That's an ugly bug waiting to happen!
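
A minimal sketch of the failure mode and one way to guard against it, assuming the standard-library `json` module rather than `simplejson` (the two coerce keys the same way for these cases). The helper name `strict_dumps` is hypothetical, not part of either library:

```python
import json


def strict_dumps(obj):
    """Serialize obj to JSON, but refuse dict keys that are not str.

    Hypothetical helper: json.dumps itself silently coerces int, float,
    bool and None keys to strings, which is the behaviour shown above.
    """
    def check(value):
        if isinstance(value, dict):
            for key, item in value.items():
                if not isinstance(key, str):
                    raise TypeError(f"non-string key {key!r} would be coerced")
                check(item)
        elif isinstance(value, (list, tuple)):
            for item in value:
                check(item)

    check(obj)
    return json.dumps(obj)


# Silent coercion: two distinct Python keys collapse into one after a
# round trip, because both are written as the JSON key "1".
round_tripped = json.loads(json.dumps({1: 5, "1": 0}))
```

With the guard in place, `strict_dumps({None: None})` raises `TypeError` instead of quietly producing a `"null"` key.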

Of course, Python is not immune to such ugliness:

    
    
        In [5]: {True: 'true', 1: '1'}
        Out[5]: {True: '1'}
    

WTF!
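
The reason, sketched below: `bool` is a subclass of `int`, so `True` and `1` compare equal and hash the same, and a dict treats them as one key. The `typed_key` workaround is a hypothetical helper for illustration only:

```python
# bool is a subclass of int, so True and 1 are the same dict key.
assert issubclass(bool, int)
assert True == 1 and hash(True) == hash(1)

collapsed = {True: 'true', 1: '1'}
# The first key object is kept, the last value wins.
assert len(collapsed) == 1


def typed_key(key):
    """Hypothetical workaround: tag each key with its type name."""
    return (type(key).__name__, key)


# ('bool', True) and ('int', 1) differ in their first element,
# so both entries survive.
separate = {typed_key(True): 'true', typed_key(1): '1'}
```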

My new favorite method of encoding data is MSGPack. It's efficient, fast,
available for all popular languages, and doesn't have inherited ugliness.
Disadvantages: not human readable (it's a compact binary format), and no
Unicode support. The Unicode support issue can be worked around by convention
(for example, always encode strings as UTF-8), but it's still very annoying.

I think we can all agree that XML is gross.

Another major issue with XML is bad programmers. XML is an interchange format,
it's meant to be used when you have to give out or accept data from the
'outside'. However, it is very rare to come across industry-created XML that
validates. And dealing with invalid XML is a complete shitshow.

~~~
wicknicks
I have spent equal amounts of time with XML and JSON. Here is what my
experience told me:

1] XML is painful to write.

2] Languages don't natively support it. It always requires additional
drivers/libraries.

3] It's non-trivial to store XML, too. With the solutions that were out there
we would always run into some requirement which the database didn't support
and which had to be done in the business logic. JSON databases (I use Mongo)
have a very clear interface about what they support and what they don't.

4] Finally, most websites support JSON, if they don't then I resort to the XML
interface. Tools like MsgPack and Google protocol buffers are great, but can
only be used in house. Not over HTTP.

In short, JSON wins for me.

~~~
justin_vanw
You can use MSGPack over HTTP just as easily as JSON. In fact, it performs
better in most browsers.

------
mindcrime
Use the right tool for the job. I wouldn't say that XML or JSON is always
right. But I will say that I believe XML has a better ecosystem around it;
with things like XSD, XSLT, XQuery/XPath, etc., and some pretty easy to use
data-binding frameworks like JAX-B. My feeling is that XML makes it a lot
easier to do certain classes of things that I want to do, like taking a
business event message off a queue, matching it against an XQuery expression,
routing it to the appropriate place based on that match, storing it in an XML
database where I can later locate it using XQuery, and then rendering it into a
web-based activity stream by applying an XSLT transform.

Sure, you could get there from here with JSON as well, but it sure seems more
natural using XML.
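
A rough sketch of the match-and-route step, assuming Python's standard-library `xml.etree.ElementTree`, which supports only a limited XPath subset and not XQuery. The element names, attribute names, and destination names are all made up for illustration:

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: (XPath expression, destination name).
# The first matching expression wins, so put specific rules first.
ROUTES = [
    ("./order[@priority='high']", "urgent"),
    ("./order", "orders"),
    ("./refund", "refunds"),
]


def route(message_xml):
    """Return the destination for one business event message."""
    root = ET.fromstring(message_xml)
    for path, destination in ROUTES:
        if root.find(path) is not None:
            return destination
    return "dead-letter"  # nothing matched
```

For example, `route("<event><order priority='high'/></event>")` picks the `urgent` destination. A real XQuery engine would allow far richer expressions than this subset.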

~~~
derickson
Using an XML native NoSQL database is a game changer when consuming XML and
producing XML HTTP APIs. A good XQuery engine makes many of the problems
people are listing about XML simply go away.

I like @mnot's point about providing "excellent client bindings" in common
languages.

~~~
mindcrime
What DB do you like for storing and querying XML? I've been using eXistDB
lately, but I'd be curious to hear if people are finding something else to be
better.

------
6ren
He's talking specifically about web APIs.

Web APIs tend to have simple, shallowly nested formats. In an informal survey,
the deepest nesting I found was 3 levels. JSON is simple, and has resisted all
efforts to complicate it, or to add to its stack. There is no popular schema
for JSON, no "JSLT", no visual JSON mapping tools. The only tooling is
databinding (and if you consider JSON as a subset of JavaScript, it arguably
has not even that).

The XML toolchain, especially XML Schema and XSLT, is highly engineered -
well, over-engineered. The designers threw in everything they could think of.
As a result, even enterprise tools don't need to support the whole spec.

I think it's fair to say that if you _need_ something more powerful (and
therefore more complicated) than JSON, you should use XML. It seems the very
existence of the XML toolchain helps _keep_ JSON simple: instead of demand for
complexity being channeled into over-tooling JSON, it is harmlessly diverted
to XML.

The deeper question is: _do our tasks really_ _NEED_ _that extra complexity?_
It seems related to loose dynamic typing vs. tight static typing (and
scripting vs. compiled). Maybe web APIs are an exception - or, because they
are very young, haven't yet needed the complexity that beset CORBA, then
XML... Or maybe they _are_ an exception, but it doesn't matter because
everything is becoming a web API anyway. Or... maybe we've finally got it
right...?

There are pervasive needs that JSON doesn't address. For example, there's a
problem with coupling between JSON and application data structures in that
they must be the same basic shape. So to give your JSON format the ideal shape
for consumers, you need to translate into a layer of objects first - and your
consumers need to do the same thing to get it into _their_ internal data
structures. Similarly, you aren't free to evolve; instead, you produce another
version, and all your clients must upgrade. Most web APIs are very very young,
yet have several versions already... The same problems occurred in XML (and
CORBA), and though JSON is an improvement in that it allows fields to be added
more easily, the tooling to support conversion/evolution hasn't grown up
around it (and _isn't_ growing).
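
A small sketch of the translation layer described above, assuming a made-up `Customer` type and a made-up wire format. The internal shape and the JSON shape differ, so both producer and consumer need hand-written mapping code:

```python
import json
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical internal shape: names stored separately."""
    first_name: str
    last_name: str


def to_wire(customer):
    """Producer side: map the internal object to the published JSON shape."""
    return json.dumps({
        "name": f"{customer.first_name} {customer.last_name}",
        "version": 1,
    })


def from_wire(text):
    """Consumer side: map the JSON shape back to an internal object.

    Fields the consumer does not know about are simply ignored, which is
    why adding a field is the one kind of change JSON absorbs easily.
    """
    data = json.loads(text)
    first, _, last = data["name"].partition(" ")
    return Customer(first_name=first, last_name=last)
```

Adding a field leaves `from_wire` working, but renaming or restructuring `name` breaks every consumer, which is the versioning problem in question.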

I think the answer is that JSON works great when the underlying features of
applications are changing quickly, because you can't "evolve" around this; you
need humans to rethink the basics. While "web APIs" continue in vigorous
growth, it will dominate. Maybe it will settle down and consolidate, once
everything has changed into a web API... or maybe continuous churn will become
the rule, as everything accelerates?

[Interestingly, relational algebra squarely addressed and solved these
problems 42 years ago. It's still going strong, though also under attack by
similar forces (NoSQL) allied with the loose dynamic typing of scripting
languages, and the need for so-called "web-scale" performance being greater
than the need for evolution/conversion... at present.]

~~~
Yarnage
> For example, there's a problem with coupling between JSON and application
> data structures in that they must be the same basic shape. So to give your
> JSON format the ideal shape for consumers, you need to translate into a layer
> of objects first - and your consumers need to do the same thing to get it into
> their internal data structures,

Don't you need to do the same thing with XML? How do you use XML data without
conforming it to your internal structures first? You can't just guess... I just
don't get the difference here. I'm trying to understand what XML can offer
that JSON cannot. Do you have an example?

> Similarly, you aren't free to evolve; instead, you produce another version,
> and all your clients must upgrade.

Same thing here; doesn't XML have the same issue? Upgrading formats can make
XML useless just as well as JSON. Also, just like JSON, you can upgrade them
without issue so I'm not seeing the distinct advantage of XML over JSON. Do
you have an example?

I'm not trying to be argumentative; I just haven't seen any examples showing
what makes XML better than JSON, just lots of "you can't do this in JSON", and
I can't find a way to make that true in my head...

~~~
6ren
Yes, XML has the same problem. The end of that paragraph reads:

> The same problems occurred in XML (and CORBA), and though JSON is an
> improvement in that it allows fields to be added more easily, the tooling to
> support conversion/evolution hasn't grown up around it (and isn't growing).

Maybe it's not clear, but the tooling to solve these problems _has_ grown up
around XML. Mainly XSLT, and visual mappers that operate on top of that.
Examples: MS's BizTalk mapper
<http://msdn.microsoft.com/en-us/library/aa547076.aspx>; MapForce
<http://www.altova.com/mapforce.html>; StylusStudio XML mapper
<http://www.stylusstudio.com/xml_to_xml_mapper.html>; both IBM and Oracle have
similar. This category of tooling has been around a decade or so (XSLT itself
started in 1997 <http://en.wikipedia.org/wiki/XSLT>). Have a look at the images
in those links - you'll immediately see what they do.

So to be precise, XML doesn't solve these problems, its tools do.

------
cpunks
Or... they're designed for different purposes and not even competitors. XML is
a markup language:

<http://en.wikipedia.org/wiki/Markup_language>

It was designed for documents. Try converting an HTML page to JSON. Try
something as simple as:

    
    
      <h1> Hello World! </h1> 
      <p> The most common introductory program is called 
      <i> Hello, World </i>. </p>
    

Go on. If you think JSON wins, just do it, and post it below.

The problem was when people started mis-applying XML to send data structures,
for RPC, and similar tasks. That's not what it was designed for. JSON is a
cross-language way of specifying common data structures, and is very good at
doing that.
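
For reference, one possible encoding of the snippet above, sketched with the standard-library `xml.etree.ElementTree` and `json` modules. The `[tag, child, child, ...]` array layout and the wrapping `<body>` element are assumptions made for this sketch, not any standard mapping, and stripping whitespace throws away information a real document might need:

```python
import json
import xml.etree.ElementTree as ET


def to_jsonable(elem):
    """Encode an element as [tag, child, child, ...].

    Text runs and child elements are interleaved in document order,
    which is what mixed content forces on a JSON representation.
    """
    node = [elem.tag]
    if elem.text and elem.text.strip():
        node.append(elem.text.strip())
    for child in elem:
        node.append(to_jsonable(child))
        # Text that follows a child element lives in child.tail.
        if child.tail and child.tail.strip():
            node.append(child.tail.strip())
    return node


# The snippet needs a single root element to parse as XML.
markup = (
    "<body><h1> Hello World! </h1>"
    "<p> The most common introductory program is called "
    "<i> Hello, World </i>. </p></body>"
)
encoded = json.dumps(to_jsonable(ET.fromstring(markup)))
```

The result is valid JSON, but the paragraph becomes an array of strings and nested arrays that nobody would want to write by hand.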

~~~
thezilch
Yes, JSON is no ML, but YAML is, which is a superset of JSON's
semantics/features. You're perfectly right that JSON is not always the right
tool for transporting a document, but I still wouldn't regard XML as the best
tool for any of those cases.

~~~
cpunks
Please post an example of the HTML above in YAML (or your preferred language)
that would be better than SGML or XML.

Sidenote: YAML doesn't consider itself a markup language. See the renaming:
<http://en.wikipedia.org/wiki/YAML>

------
Yarnage
This "article" is odd. I've worked with multiple systems and I don't see a
reason why one data model can't be bound to XML and JSON without being
awkward. It's so incredibly EASY to output and input with both, why not?
Personally I prefer JSON as I haven't found anything that can't be represented
within it.

I didn't see any example within the article regarding JSON formats that
generate awkward XML and vice versa. Does anyone have examples of that?

------
icebraining
I like Turtle:

    
    
        <http://www.w3.org/TR/rdf-syntax-grammar>
        dc:title "RDF/XML Syntax Specification (Revised)" ;
        ex:editor [
          ex:fullname "Dave Beckett";
          ex:homePage <http://purl.org/net/dajobe/>
        ] .
    

It's only useful if you drank the RDF kool-aid (like I have), though.

~~~
dajobe
You have good taste, since I invented Turtle. To keep this on topic, I've been
using JSON for data web APIs since that's what it's best at. It sucks at
markup and graphs, of course.

~~~
tptacek
We use JSON for graphs. What sucks about it? In what sense is XML better at
representing graphs?

~~~
dajobe
There's no way to point from one part of a JSON doc to another without
inventing a terminology or convention for marking the start (anchor) and end
of the arc (href). People use 'id' for one end, but there's no way to say a
JSON value is actually a reference (href), not just a string. XML has that
built in (ID IDREF) and so does HTML, but I didn't say XML was better, I said
JSON sucks at markup and graphs. JSON's handy for serializing trees of data
with no loops.
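
A sketch of the kind of convention that has to be invented, assuming an `id` field on each node and a made-up `{"$ref": ...}` marker for references. Neither is part of JSON itself, so every consumer has to know the convention and resolve it by hand:

```python
import json

# Two nodes pointing at each other: a loop, which plain JSON nesting
# cannot express. The "$ref" marker is an assumed convention.
GRAPH_JSON = """
{"nodes": [
  {"id": "a", "next": {"$ref": "b"}},
  {"id": "b", "next": {"$ref": "a"}}
]}
"""


def resolve(doc):
    """Replace every {"$ref": id} value with the node it names."""
    by_id = {node["id"]: node for node in doc["nodes"]}
    for node in doc["nodes"]:
        for key, value in list(node.items()):
            if isinstance(value, dict) and set(value) == {"$ref"}:
                node[key] = by_id[value["$ref"]]
    return by_id


graph = resolve(json.loads(GRAPH_JSON))
```

After resolving, `graph["a"]["next"]` is the same object as `graph["b"]`, but a generic JSON parser could not have known that without being told about the convention.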

~~~
Yarnage
What do you mean by "built in"? I don't see how or why an ID couldn't be used
in the same manner; the implementation is just a little different because they
store data differently.

XML is only a series of nodes and attributes. There isn't really anything else
special about it and it's trivial to represent it in JSON so I'm not sure I
follow your issue. Could you provide an example?

------
democracy
There is no excuse not to have both, plus something else (like SOAP web
services), if the end user wants it. It is not entirely the host's choice.

------
qbproger
YAML :)

~~~
mcot2
+1

------
officialchicken
Do you need validation? XML. Otherwise use JSON.

------
joe24pack
YAML? Easy to read, edit, and understand.

------
derfclausen
I wish JSON had syntax for comments.

~~~
rollypolly
Good point, but you can use a parser that supports comments, even if it's not
standard. You could also create a node to hold comments.
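
A minimal sketch of the comment-node approach, assuming a reserved `_comment` key (the key name is an arbitrary choice, not a standard) that is stripped after parsing with the standard-library `json` module:

```python
import json


def strip_comments(value, key="_comment"):
    """Recursively drop the reserved comment key from parsed JSON."""
    if isinstance(value, dict):
        return {k: strip_comments(v, key) for k, v in value.items() if k != key}
    if isinstance(value, list):
        return [strip_comments(item, key) for item in value]
    return value


# Hypothetical config document carrying its comments as ordinary data.
RAW = """
{
  "_comment": "development settings only",
  "port": 8080,
  "db": {"_comment": "local instance", "host": "localhost"}
}
"""
config = strip_comments(json.loads(RAW))
```

The drawback is that the comments are real data on the wire, so every consumer has to agree to ignore them.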

Then there's also YAML that supports comments.

------
nirvana
JSON. I've decided. Actually, I think I decided in 2001 or so when I decided
that XML was just a bear.

Seems most people have decided to go with JSON as well, and that XML is more
used for legacy systems and systems where there's some enterprise component
you have to interface with.

Frankly, I hope JSON wins, but if it doesn't, XML needs to have a resurgence
really quickly.

