
Stop Comparing JSON and XML - padraic7a
http://www.yegor256.com/2015/11/16/json-vs-xml.html
======
jtmarmon
This article is pretty garbage. compares almost nothing about xml and instead
compares the tooling around xml giving no thought to the tooling around json.

instead, let me give you my unsolicited opinion:

if you need to represent both the structure of your data and characteristics
within that structure, xml is great because attributes are a really good way
to do that. there's a reason most UIs are represented as XML.

if your data is just - well, data - use json. or better yet use edn

~~~
calanya
It's funny to say json and edn represent data, because they are both strictly
UTF8 encoded - raw bytes have to be encoded somehow (with neither standard
specifying a preferred method) to be represented. Why is this important?
Sometimes we want to embed a binary formatted piece of data (e.g. an image) as
part of our data.

~~~
frou_dh
Isn't that the point of the 'hashtags' in edn? If a #png arrives then your
program is going to throw an exception unless you've registered a handler that
reports success in decoding it?

In effect it doesn't matter that the encoding is arbitrary edn (e.g. a 150KB
png could be a solitary bigint literal) because the tags prevent false-
positive decodes and keep things "strongly typed".

------
arocks
It is a great example of "Worse is better"[1]. XML has a lot more functionality
than JSON but very few people can fully understand all the X* family of
specifications, whereas JSON is a _lot_ easier and handles the majority of use
cases.

There are many, many real life situations where a developer needs to choose
between using JSON, XML or YAML to store their configuration data, message
formats etc. Simply stating that JSON is good only in one scenario and XML
must be used in every other is over-simplification.

[1]:
[https://www.dreamsongs.com/RiseOfWorseIsBetter.html](https://www.dreamsongs.com/RiseOfWorseIsBetter.html)

~~~
cm2187
In all fairness, 99% of the time you will use a library to serialise and
deserialize. So I always found this xml vs json debate a bit sterile. As long
as it is possible to inspect the file visually occasionally, they're both good
enough. I don't know anyone who edits manually huge amounts of xml/json, and
if they do, there is probably a better way.

~~~
jcrawfordor
One consideration that's worth thinking about is that XML serialization is
larger (often not insignificantly) than the same data serialized as JSON.

On the other hand, XML can make error handling much easier by schema-
validating received documents and thus rejecting a large class of invalid
inputs at an early stage. This is particularly helpful if you're making
something interoperable, like a public API.

So even when you're just using a library, there are ramifications to the
decision.

~~~
cm2187
Correct. But in the order of things I care the most about, using angle or
curly brackets matters a lot less than problems like: can I serialize multi-
dimensional arrays, dictionaries, arrays of bytes, etc.

------
haberman
The entire article rests on this:

> JSON was not designed to have such features [as XPath, XML Schema, XSL,
> etc.]

But this claim is not justified, just stated. And the supposed inferiority of
e.g. JSONPath vs XPath is not justified either.

I would actually claim that JSONPath is _superior_ to XPath. JSONPath is much
simpler, easier to understand, easier to implement, and still fulfills the
most common use cases. Also JSONPath can be evaluated on a streaming input,
which is not possible for XPath in general (maybe some subset of XPath queries
could support streaming).

------
Finnucane
The main problem with comparing XML and JSON is that their use cases don't
perfectly overlap. I work with complex text documents, typically encoded in
TEI schema, and it would be insane to try to do this work in JSON. It's
possible, but the result would be an incomprehensible mess.

Unfortunately, a lot of programming languages have poor support for XML, and
standard libraries usually only give you XPath 1.0 compatibility. XPath 2.0
and XQuery 3.0/3.1 are far more powerful and flexible, but you need Saxon or a
good XML database to make proper use of them.

~~~
ingenter
I am yet to see an example that makes me say "this JSON file does not
represent the data very well, I wish they used XML instead". Representing
HTML/XML does not count.

~~~
falcolas
Encode a hash table where key order matters (like the Python OrderedDict). Or
a hash table with non-string keys (like a sky chart with XY coordinates). Or
any other data structure which doesn't conform to a simple list/hash table
format.

Right tool for the job. JSON is a great 80% encoding format, which handles
most cases. The problem is that folks try to make it handle all cases
(ironically, just like they did with XML).
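
A minimal Python sketch of the non-string-key case, assuming a hypothetical `sky_chart` dict keyed by (x, y) tuples. The standard `json` module rejects such a dict outright; the list-of-pairs workaround shown here is an illustrative convention, not anything the JSON spec defines, and it keeps key order as a side effect.

```python
import json

# Hypothetical sky chart keyed by (x, y) coordinates.
sky_chart = {(10, 20): "Sirius", (30, 40): "Vega"}


def encode_pairs(mapping):
    """Encode a mapping as an ordered JSON list of {"key", "value"} pairs."""
    return json.dumps([{"key": list(k), "value": v} for k, v in mapping.items()])


def decode_pairs(text):
    """Rebuild the dict, turning the list keys back into tuples."""
    return {tuple(item["key"]): item["value"] for item in json.loads(text)}


try:
    json.dumps(sky_chart)  # tuple keys are not valid JSON object keys
    direct_encoding_failed = False
except TypeError:
    direct_encoding_failed = True

round_tripped = decode_pairs(encode_pairs(sky_chart))
```

The cost is that every consumer has to know about the pair convention, which is the kind of ad hoc layering the comment is warning about.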

------
vanviegen
Sure, if you define "XML" to include algorithms that operate on it like XSLT
and XPath, but define "JSON" to be just the data format, then the former is
indeed more featureful.

~~~
tracker1
That was my first thought, that there's tooling around JSON to do all of the
above... for that matter straight python, js/node and other languages do very
well processing JSON... if your encoding ensures no unescaped cr/lf, then a
record per line in json + gz is awesome.
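
A rough sketch of the "record per line in json + gz" idea using only the Python standard library. The function names `write_records` and `read_records` are made up for illustration. It relies on `json.dumps` (without `indent`) escaping CR/LF inside strings, so each record stays on one physical line.

```python
import gzip
import json


def write_records(path, records):
    """Write one compact JSON document per line into a gzip file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            # json.dumps escapes newlines inside strings as \n, so the
            # only raw line break is the record separator added here.
            fh.write(json.dumps(record) + "\n")


def read_records(path):
    """Lazily yield records back, one line at a time."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)
```

Because `read_records` is a generator, a consumer can process a large file without holding the whole thing in memory.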

XML is literally too expressive, and you can't tell how to deserialize xml
with a good fragment... json you can (at least better)... the query/expression
syntax is even worse than learning/using a simple general purpose programming
language.

------
urvader
Stop comparing JSON with XML... let me compare them to show why.. ;)

------
mehrdada
The article misses the mark, but XML--the language itself, not the tooling--
does have an advantage: it is designed to be quite good at one thing that's
often neglected despite being in its name: extensibility. XML provides
enormous flexibility via namespaces and integrating many schemas that your
application may or may not understand in one document. When done correctly, it
can be a huge win in interoperability in heterogeneous environments and across
versions.

That said, you could very well argue its additional complexity is often not
worth the gain in most applications.

Of course, you can also define your own extensibility/interoperability
conventions with any data serialization format, but the point is XML has it
baked in the standard and already provides an accepted way of doing things
that everyone has implemented.

------
TeMPOraL
It's worth linking to the famous XML rant of Erik Naggum:

[http://www.schnada.de/grapt/eriknaggum-xmlrant.html](http://www.schnada.de/grapt/eriknaggum-xmlrant.html)

It's both very insightful and enjoyable to read.

------
scotty79
XML is an interesting case in point that syntax matters. XPath and XSL are so
awesome but barely anyone cares because XML.

~~~
xaduha
It's easy enough to create an alternative syntax for it, one-to-one
translation. For HTML (XHTML?) there are plenty examples already like Jade.

[http://jade-lang.com/](http://jade-lang.com/)

But people who are serious about XML-related technologies understand that
syntax of XML is mostly fine, interoperability and existing tools matter way
more. It would be hard for such an alternative syntax to catch on. People
often forget that it is a direct descendant of SGML, I imagine for similar
reasons - there were existing tools for it.

~~~
TeMPOraL
The reason it's "easy enough to create an alternative syntax for it, one-to-
one translation" is because an XML document is _a tree_. Just that. There's an
argument to be had about tooling & specification ecosystem, but that can be
replicated in other formats, and if all you want is hierarchical data
representation, it's not worth it to poke your eyes out while working with
the human-unreadable format that XML is.

~~~
xaduha
What's important about XML isn't XML itself. It's specifications, standards,
toolset. Just because it can be replicated doesn't mean it will be. That's probably
man-centuries of work.

~~~
TeMPOraL
I agree. Though a big part of the XML ecosystem is made of dead ends and very
domain-specific stuff. But either way, we need to be precise - let's evaluate
XML and XML ecosystem separately. If you're not heavily exploiting the latter,
XML is almost never a good choice - because alone by itself, it's just a tree
notation that sucks.

------
atilaneves
Ok, let's take the points one by one.

XPath: JSON doesn't need this in, say, Python or Javascript. You write normal
code once it's a Python/JS dict/array/list and you're done. I don't need yet
another language, I have general purpose ones that will do just fine.

Attributes and namespaces: these can sort of be faked, but fair enough. But
then you get discussions on what should be an attribute and what shouldn't...

Schema: pretty sure this exists for JSON

XSL: Ah, the poor man's Lisp macros... And, again, easier to do in code in a
scripting language.

~~~
krick
> XPath: JSON doesn't need this in, say, Python or Javascript

That's not exactly true. There are books in the library. Here's how I get some
book's main character's name: library[5]['mainCharacters'][3]['name']. Except
I want to know _all_ main character names in all books. This is a clear,
simple request, there's no need for me to put 2 cycles here if I don't have
to. So I would end up implementing my own XPath here anyway, to write declarative
stuff declaratively.

But then again, I don't have to, because JsonPath already exists and is just
fine.

~~~
kaoD
IMHO JSONPath is an unnecessary DSL when you have a sufficiently powerful and
expressive language. It basically abstracts a less-powerful version of the
essential higher-order functions.

E.g.:

> Here's how I get some book's main character's name:
> library[5]['mainCharacters'][3]['name']. Except I want to know all main
> character names in all books.
    
    
        library.flatMap(x => x.mainCharacters).map(x => x.name);
    

> This is a clear, simple request, there's no need for me to put 2 cycles here
> if I don't have to.

Using higher-order functions (and a functional programming style) it's still
declarative. Yes, there are still nested cycles but, just like in JSONPath,
they're hidden in the abstraction.

Plus, not using a DSL but a fully-fledged programming/scripting language, you
can store intermediate results, make more complex queries, etc. which is more
often than not what you need.

__

I've used JSONPath mostly in Bash scripts to quickly hack some JSON
manipulation, but I'm slowly transitioning into using more powerful scripting
languages and not using it anymore.

I'm not really _that_ familiar with JSONPath though. Are there any use cases
where it's really convenient?

~~~
krick
I would say, that being a DSL is a benefit by itself. This is how the above
example would look in Python:

    
    
        [character['name'] for book in library for character in book['mainCharacters']]
    

(Python also has maps and stuff, but it's considered non-idiomatic, plus
flatten would be more awkward.)

First of all, the syntax is quite different from what you use (Scala, I
suppose?). When we were using JSONPath we could just copy 1 line from one
project to another and that would be fine.

Moreover, both our implementations have the same problem: we assume every
book has mainCharacters and every character has name. With JSONPath it
doesn't matter: no 'mainCharacters' means the path doesn't match the pattern,
so it's just skipped. In our cases this means exceptions.

And what if we want to get all 'name' fields from whatever object at whatever
depth? Or 'name' of every object where 'color' is 'yellow'?

Now, if you consider dictionary structure much more nested (say, some AST) —
processing that without errors would be quite painful. And you also would end
up writing your own XPath (JSONPath), even with all your map's and reduce's.

Of course, should it get complicated enough we would end up needing to write
something custom anyway, but stuff like JSONPath just helps to keep things
simple when possible. That's it.
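
To make the point above concrete, here is a small Python sketch of what a hand-rolled replacement ends up looking like. Both helpers, `main_character_names` and `find_key`, are hypothetical names, and the `library` shape is assumed from the example earlier in the thread. The first tolerates missing keys instead of raising; the second collects a key at any depth, roughly what a recursive-descent path query does.

```python
def main_character_names(library):
    """Names of all main characters, skipping books or characters
    that lack the expected keys instead of raising KeyError."""
    return [
        character["name"]
        for book in library
        for character in book.get("mainCharacters", [])
        if "name" in character
    ]


def find_key(node, key):
    """Yield every value stored under `key` at any depth of nested
    dicts and lists, in document order."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)
```

Even this small amount of code is already a private, less capable path language, which is the trade-off being argued about.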

------
xchaotic
Actually the reverse is true: you should compare XML (and related tech) vs
JSON vs other options (say HTTP headers) and choose what's best for you. I've
seen projects that are all about editing books and stubbornly use JSON, where
XML would let them have schemas, and I've seen huge chunks of XML being
exchanged where a simple key value pair would suffice. Things like JSON Schema
indicate that people are using JSON for things not intended for it, likewise
XML can be put to bad use.

------
herge
The main strength of JSON is that it directly maps to simple data structures
that are first class citizens in many programming languages like Ruby, Python,
JavaScript, Golang, etc.

XML is a bit more complicated, and often needs dedicated libraries to manage.
Every time I try to get the nth text element of node X with elementtree, it's
a bit of a hassle.
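
For what it's worth, the hassle comes from how `xml.etree.ElementTree` stores mixed content: text before the first child lives in the node's `.text`, and text after each child lives in that child's `.tail`. A minimal sketch, with `direct_text_nodes` as a made-up helper name:

```python
import xml.etree.ElementTree as ET


def direct_text_nodes(node):
    """Return the text chunks that are direct children of `node`,
    in document order, ignoring whitespace-only chunks."""
    # Leading text is on the node itself; every later chunk is the
    # .tail of the child element that precedes it.
    chunks = [node.text] + [child.tail for child in node]
    return [chunk for chunk in chunks if chunk and chunk.strip()]


paragraph = ET.fromstring("<p>Hello <b>bold</b> world <i>it</i>!</p>")
texts = direct_text_nodes(paragraph)
second_text = texts[1]  # the "nth text element", here n = 2
```

Compare that with JSON, where the equivalent lookup is plain list indexing.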

------
white-flame
The metadata argument is rarely applicable. Metadata tends to be rich, and
it's horrible to represent rich data in opaque string attribute values.

Rich metadata is therefore represented as child nodes, giving them similar
child/sibling status as JSON children, and the semantic ambiguity of when to
use XML attributes vs child nodes remains.

------
javajosh
Nonsense. They look very different, and that matters, at the very least. In
particular I like that JSON has some sense of "type" built into the format,
when you omit quotes you know it's a number or a boolean. You can get that
(and better than that, really) with XML Schema (or old-school DTDs) but it's
baked into JSON. Plus, dealing with JSON in a JavaScript interpreter is about
as simple as it gets; only if you're programming in XML (e.g. with Ant) would
you ever say the same about XML. It's quite nice to be able to "dig in" to
JSON using normal, plain-vanilla JavaScript dereferencing tools and looping
constructs.
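
A quick Python illustration of the in-band typing point, using two made-up documents that carry the same values. Parsed JSON comes back as numbers, booleans and strings, while a schema-less XML parse hands back strings for everything.

```python
import json
import xml.etree.ElementTree as ET

# Same hypothetical record in both formats.
json_doc = json.loads('{"count": 3, "active": true, "label": "3"}')
json_types = {key: type(value).__name__ for key, value in json_doc.items()}

xml_doc = ET.fromstring('<item count="3" active="true"><label>3</label></item>')
# Attribute values and element text are always strings without a schema.
xml_types = {name: type(value).__name__ for name, value in xml_doc.attrib.items()}
xml_types["label"] = type(xml_doc.findtext("label")).__name__
```

Here `json_types` distinguishes the number `3` from the string `"3"`, whereas every entry in `xml_types` is `str`.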

As for the tooling around XML, that's okay, but it's almost always overkill,
and it almost always turns out the overhead of the tooling becomes a problem
in-and-of-itself.

Anyway, JSON is actually a better data format.

~~~
usrusr
I think your point could have been made even better if instead of
quotes/numbers/strings you would have illustrated the in-band typing using
square brackets and arity. Two bytes for "there can be more", so much better
than the plural/singular convention often seen in XML (rarely without some
creative breaches closely nearby).

In hindsight, the biggest advantage of XML over JSON was that it was painful
enough to make schemas popular, a quality JSON is lacking. Unlike schema
languages, which do exist.

To me, XML tooling lost quite a bit of its appeal when I realized that all the
typing available via the various schema languages is completely lost to the
world of xsl/xpath/xquery. I understand the reasons for that, but that does
not provide much consolation.

------
chipsy
There is a point, but I think the wrong things are compared to indicate that
it's "apples to oranges".

XML is designed towards a type of problem that is not an everyday programming
problem. It is designed for a full-fledged _schema_ - it builds on the
lessons of SGML and its predecessors such as GML[0], for people who need those
things, which has historically meant "documentation writers". DocBook and DITA
do not really have equals at what they do, which is semantic, textual content
with rich markup. (Yes, you like TeX. But it focuses on presentation for
typesetting, not semantic meaning.)

This means that in practice XML is really useful for describing an abstract,
pre-tokenized syntax. This is a useful tool from a language design
perspective; it lets you take an intermediate position between human-friendly
and machine-friendly formats, without going straight to binary data or writing
a full string parser. When computer language tooling emits an XML AST, they
give tool-writers who would like to manipulate or inspect the AST a major leg
up.

Simpler forms like sexps or JSON exact additional overheads on that problem
that can be nearly as bad as just writing a custom string parser, once you get
beyond the "strings, numbers, and simple containers" cases that are basically
about data serialization, not data parsing. You want to have nodes that have
unique names or attributes once you get into the parsing problem, but they are
superfluous if you have plain old data. And as soon as you get into mixing
different types of documents validation becomes a major concern and XML has
the right groundwork for that.

It's just that most people don't want or need to deal with data of that
complexity, especially since XML as a plaintext document just looks like
angle-bracket trash. For those needs they are better off writing in something
that a string parser can work with and then using XML as an intermediate, if
at all. Even for the documentation-writing situation, it's easier to write
Markdown for 98% of the prose and then convert it to add the last 2%.

Basically, XML has been used for far too many things that it shouldn't, and
the blame for that lies on some 90's-era hype machines that decided that XML
would be the buzzword of the future and pushed it into every technology. We
got some nice tooling out of it, but in the end, it's still most useful for a
certain kind of document markup.

[0]
[https://en.wikipedia.org/wiki/IBM_Generalized_Markup_Languag...](https://en.wikipedia.org/wiki/IBM_Generalized_Markup_Language)

~~~
TeMPOraL
XML works well for markup use - for documents with lots of text and sparse
semantic tagging. Matching opening and closing tags is actually a pretty nice
feature there.

RE AST, somehow Lisp managed to encode it in S-expressions since forever, and
there was never a problem. In fact, writing Lisp code is mostly writing an AST
directly. There is apparently no need for the additional syntactic sugar XML
adds.

The one thing JSON misses that actually is important is symbol type.
S-expressions have it, and at this point it's no longer "strings, numbers and
basic containers" - per code=data, you have everything you need to encode an
AST conveniently.

------
EdiX

        But XML has XPath, XMLSchema and XSLT and JSON doesn't
    

XML has those things because its data model is hostile to every programming
language in existence and you need tools designed specifically for it to
manipulate it concisely.

JSON fits well with the list/map/primitive data model that is common to most
scripting languages so those tools aren't needed, you can just use javascript
or python, something you use everyday anyway and doesn't look alien.

    
    
        A similar document would look like this in XML
    

... plus encoding and schema declaration and also all the garbage involved
with dealing with namespaces. Or, at least, don't sing praise to XMLSchema and
namespaces if you aren't going to put them in your example.

PS. XML is a markup language, if your data is a text document with markup then
XML is a good choice, otherwise leave it alone.

------
krick
So, what is the screenshot from "The Men Who Stare at Goats" there for?

------
jamesmalvi
This tool will help to convert json to xml
[http://jsonformatter.org](http://jsonformatter.org)

------
adnam
Not to mention that XML is generally more suitable for streamed i/o.

~~~
dozzie
Except that it is generally less suitable for streamed I/O than linewise JSON.
Remember that you don't need to stream a single big document. You can make a
stream of several separate documents.

~~~
adnam
Provided your data can be chunked in that way, yes.

------
vmorgulis
Looks like a troll...

