JSON is great, but it is not nearly as flexible as XML, partly because of attributes. Also, because of its JS heritage and compatibility, lots of common things are not representable in JSON. This is mostly because object keys MUST be strings.
Examples:
In [1]: from simplejson import dumps
In [2]: dumps({1: 5, '1': 0})
Out[2]: '{"1": 0, "1": 5}'
Derp, good luck figuring out what that is supposed to mean.
In [3]: dumps({None: None})
Out[3]: '{"null": null}'
In [4]: dumps({False: False})
Out[4]: '{"false": false}'
Oh snap! That's an ugly bug waiting to happen!
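The same coercion can be reproduced with the standard library `json` module (used here instead of simplejson so the snippet has no dependencies). A minimal sketch showing that the round trip does not give back the original keys:

```python
import json

# Non-string keys are coerced to strings on the way out...
encoded = json.dumps({1: 5, None: None, False: False})

# ...and come back as strings, so the round trip is lossy.
decoded = json.loads(encoded)

print(encoded)
print(decoded)
```

The decoded dict has the keys "1", "null" and "false", not 1, None and False.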
Of course, Python is not immune from such ugliness:
In [5]: {True: 'true', 1: '1'}
Out[5]: {True: '1'}
WTF!
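For anyone wondering why that happens: in Python, bool is a subclass of int, so True and 1 compare equal and hash the same, which makes them the same dict key. A short sketch:

```python
# bool is a subclass of int, so True and 1 are equal and hash alike.
same_key = (True == 1) and (hash(True) == hash(1))

d = {True: 'true', 1: '1'}

# The first key object (True) is kept; the second entry only replaces the value.
print(same_key)
print(d)
```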
My new favorite method of encoding data is MSGPack. It's efficient, fast, available for all popular languages, and doesn't have inherited ugliness. Disadvantages: it is not human readable (it's a compact binary format), and it has no Unicode support. The Unicode issue can be worked around by convention (for example, always encoding strings as UTF-8), but it is still very annoying.
I think we can all agree that XML is gross.
Another major issue with XML is bad programmers. XML is an interchange format; it's meant to be used when you have to give out or accept data from the 'outside'. However, it is very rare to come across industry-created XML that validates. And dealing with invalid XML is a complete shitshow.
I don't really understand your point here. Those JSON data dumps don't make any sense...and wouldn't make sense as XML either. Why wouldn't you enforce a schema? I've never worked on a system that didn't do validation and schema enforcement regardless of xsd, json, etc; you need to have well defined ways of laying out your data or it's going to be useless regardless of the format used.
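To make "enforce a schema" concrete for the JSON side, here is a minimal hand-rolled sketch. The `SCHEMA` mapping and the `validate` helper are made up for illustration; a real system would more likely use a proper schema language, and this only checks top-level keys and their types.

```python
import json

# Hypothetical, hand-rolled "schema": required key -> expected Python type.
SCHEMA = {"id": int, "name": str, "tags": list}

def validate(payload, schema):
    """Return a list of problems; an empty list means the payload conforms."""
    data = json.loads(payload)
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    problems = []
    for key, expected in schema.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif type(data[key]) is not expected:
            # Exact type check, so a JSON true is not accepted as an int.
            problems.append(f"wrong type for {key}")
    return problems

good = validate('{"id": 1, "name": "x", "tags": []}', SCHEMA)
bad = validate('{"id": "1", "name": "x"}', SCHEMA)
print(good)
print(bad)
```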
>JSON is great, but it is not nearly as flexible as XML, partly because of attributes.
Attributes can be stored in JSON as well. XML and JSON simply store data in different ways.
>Also, because of its JS heritage and compatibility, lots of common things are not representable in JSON.
Like what? You can actually put real values in their correct types with JSON. You can't do that with XML, so what types of items are not representable?
>This is mostly because object keys MUST be strings.
The same is true for XML so I'm not sure what your point is. Don't nodes and attribute names need to be strings in XML?
I have spent equal amounts of time with XML and JSON. Here is what my experience told me:
1] XML is painful to write.
2] Languages don't natively support it. It always requires additional drivers/libraries.
3] It's also non-trivial to store XML. With the solutions that were out there, we would always run into some requirement that the database didn't support and that had to be done in the business logic. JSON databases (I use Mongo) have a very clear interface about what they support and what they don't.
4] Finally, most websites support JSON, if they don't then I resort to the XML interface. Tools like MsgPack and Google protocol buffers are great, but can only be used in house. Not over HTTP.
But I mean, yes it does. Imagine if implementations were inconsistent as to what they allowed for JSON keys so that your browser couldn't understand the JSON your python code sent it. That scenario is prevented by the limitation being well documented.
If you are looking for typing or a schema, then JSON would be an issue. OTOH almost every XML/SOAP endpoint that I've used could be switched over to use JSON.
Use the right tool for the job. I wouldn't say that XML or JSON is always right. But I will say that I believe XML has a better ecosystem around it; with things like XSD, XSLT, XQuery/XPath, etc., and some pretty easy to use data-binding frameworks like JAX-B. My feeling is that XML makes it a lot easier to do certain classes of things that I want to do, like taking a business event message off a queue, match it against an XQuery expression, route it to the appropriate place based on that matching, store it in an XML database where I can later locate it using XQuery, and then render it into a web-based activity stream by applying an XSLT transform.
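As a rough illustration of the "match and route" step only, here is a sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset rather than XQuery. The message, the element names, the routing table and the queue names are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical event message; element and attribute names are made up.
MESSAGE = """
<event type="order">
  <customer tier="gold">ACME</customer>
  <total currency="USD">1200</total>
</event>
"""

# Hypothetical routing table: (path expression, destination).
ROUTES = [
    ("./customer[@tier='gold']", "priority-queue"),
    ("./total", "billing-queue"),
]

def route(xml_text, routes, default="dead-letter"):
    """Return the destination of the first expression that matches."""
    root = ET.fromstring(xml_text)
    for expr, destination in routes:
        if root.find(expr) is not None:
            return destination
    return default

destination = route(MESSAGE, ROUTES)
print(destination)
```

A full XQuery engine can express far richer matches than this; the point is just that the routing rule lives in a declarative expression instead of hand-written traversal code.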
Sure, you could get there from here with JSON as well, but it sure seems more natural using XML.
Using an XML native NoSQL database is a game changer when consuming XML and producing XML HTTP APIs. A good XQuery engine makes many of the problems people are listing about XML simply go away.
I like @mnot's point about providing "excellent client bindings" in common languages.
What DB do you like for storing and querying XML? I've been using eXistDB lately, but I'd be curious to hear if people are finding something else to be better.
That's cool to know, and I might just give it a look. But my perception is still that the supporting ecosystem around XML is more mature and comprehensive, at least for some use-cases.
It was designed for documents. Try converting an HTML page to JSON. Try something as simple as:
<h1> Hello World! </h1>
<p> The most common introductory program is called
<i> Hello, World </i>. </p>
Go on. If you think JSON wins, just do it, and post it below.
The problem was when people started mis-applying XML to send data structures, for RPC, and similar tasks. That's not what it was designed for. JSON is a cross-language way of specifying common data structures, and is very good at doing that.
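For what it's worth, one possible way to take up the challenge is an array-based convention where each element becomes [tag, child, child, ...] and text and elements keep their order. This is only a sketch: the `<body>` wrapper and the `to_json_tree` helper are made up, attributes are ignored, and whitespace is stripped. The result works but is clearly less pleasant to read than the markup.

```python
import json
import xml.etree.ElementTree as ET

# The fragment from above, wrapped in a made-up <body> root so that it
# parses as a single well-formed document.
HTML = """<body><h1> Hello World! </h1>
<p> The most common introductory program is called
<i> Hello, World </i>. </p></body>"""

def to_json_tree(elem):
    """Encode an element as [tag, child, child, ...], keeping text order."""
    node = [elem.tag]
    if elem.text and elem.text.strip():
        node.append(elem.text.strip())
    for child in elem:
        node.append(to_json_tree(child))
        # Text that follows a child element lives in child.tail.
        if child.tail and child.tail.strip():
            node.append(child.tail.strip())
    return node

tree = to_json_tree(ET.fromstring(HTML))
print(json.dumps(tree, indent=2))
```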
I feel that the real distinction is whether you need to validate the contents against a grammar. The document/data distinction is fairly arbitrary. As far as I know, JSON has no equivalent of the DTD.
Data representation is great in JSON. Actual layout information? Well it's doable but it's not as easy to see. This is why HTML is NOT XML and also why you wouldn't want to convert HTML to JSON.
So I'm not really sure what your point is. Are you trying to say that, because HTML (which is NOT XML) is awkward to represent in JSON, that XML wins?
Yes, JSON is no ML, but YAML is, which is a superset of JSON's semantics/features. You're perfectly right that JSON is not always the right tool for transporting a document, but I still wouldn't regard XML as the best tool for any of those cases.
This "article" is odd. I've worked with multiple systems and I don't see a reason why one data model can't be bound to XML and JSON without being awkward. It's so incredibly EASY to output and input with both, why not? Personally I prefer JSON as I haven't found anything that can't be represented within it.
I didn't see any example within the article regarding JSON formats that generate awkward XML and vice versa. Does anyone have examples of that?
You have good taste, since I invented Turtle. To keep this on topic, I've been using JSON for data web APIs since that's what it's best at. It sucks at: markup and graphs of course.
There's no way to point from one part of a JSON doc to another without inventing a terminology or convention for marking the start (anchor) and end of the arc (href). People use 'id' for one end but there's no way to say a json value is actually a reference (href) not just a string. XML has that built in (ID IDREF) and so does HTML, but I didn't say XML was better, I said JSON sucks at markup and graphs. JSON's handy for serializing trees of data with no loops.
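To illustrate the point about conventions, here is a sketch of what a consumer has to do by hand. The "id" anchor and the {"$ref": ...} placeholder are an assumed convention chosen for this example; JSON itself defines neither, and both helper functions are hypothetical.

```python
import json

# Assumed convention: an object with "id" is an anchor, and an object of
# the form {"$ref": "<id>"} points at it.
DOC = json.loads("""
{
  "people": [
    {"id": "alice", "name": "Alice", "manager": {"$ref": "bob"}},
    {"id": "bob", "name": "Bob", "manager": {"$ref": "alice"}}
  ]
}
""")

def collect_anchors(node, anchors):
    """Walk the tree and index every object that carries an "id"."""
    if isinstance(node, dict):
        if "id" in node:
            anchors[node["id"]] = node
        for value in node.values():
            collect_anchors(value, anchors)
    elif isinstance(node, list):
        for item in node:
            collect_anchors(item, anchors)
    return anchors

def resolve(node, anchors):
    """Replace {"$ref": id} placeholders in place with the anchored object."""
    items = node.items() if isinstance(node, dict) else enumerate(node)
    for key, value in list(items):
        if isinstance(value, dict) and set(value) == {"$ref"}:
            node[key] = anchors[value["$ref"]]
        elif isinstance(value, (dict, list)):
            resolve(value, anchors)

anchors = collect_anchors(DOC, {})
resolve(DOC, anchors)
alice = anchors["alice"]
print(alice["manager"]["name"])
```

Once resolved, the in-memory structure contains a loop (Alice's manager's manager is Alice), and `json.dumps(DOC)` raises a ValueError about a circular reference, which is the "trees with no loops" limitation in practice.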
What do you mean by "built in"? I don't see how or why an ID couldn't be used in the same manner; the implementation is just a little different because they store data differently.
XML is only a series of nodes and attributes. There isn't really anything else special about it and it's trivial to represent it in JSON so I'm not sure I follow your issue. Could you provide an example?
Holy %!#@, I never knew the inventor of Turtle posted on HN. That's awesome. Turtle does, indeed, rock. But I'm also a Semantic Web Koolaid drinker, so my viewpoints may be a bit out of line with the "mainstream."
Web APIs tend to have simple, shallowly nested formats. In an informal survey, the deepest nesting I found was 3 levels. JSON is simple, and has resisted all efforts to complicate it, or to add to its stack. There is no popular schema for JSON, no "JSLT", no visual JSON mapping tools. The only tooling is databinding (and if you consider JSON as a subset of JavaScript, it arguably has not even that).
The XML toolchain, especially XML Schema and XSLT, is highly engineered - well, over-engineered. The designers threw in everything they could think of. As a result, even enterprise tools don't need to support the whole spec.
I think it's fair to say that if you need something more powerful (and therefore more complicated) than JSON, you should use XML. It seems the very existence of the XML toolchain helps keep JSON simple: instead of demand for complexity being channeled into over-tooling JSON, it is harmlessly diverted to XML.
The deeper question is: do our tasks really NEED that extra complexity? It seems related to loose dynamic typing vs. tight static typing (and scripting vs. compiled). Maybe web APIs are an exception - or, because they are very young, haven't yet needed the complexity that beset CORBA, then XML... Or maybe they are an exception, but it doesn't matter because everything is becoming a web API anyway. Or... maybe we've finally got it right...?
There are pervasive needs that JSON doesn't address. For example, there's a problem with coupling between JSON and application data structures in that they must be the same basic shape. So to give your JSON format the ideal shape for consumers, you need to translate into a layer of objects first - and your consumers need to do the same thing to get it into their internal data structures. Similarly, you aren't free to evolve; instead, you produce another version, and all your clients must upgrade. Most web APIs are very very young, yet have several versions already... The same problems occurred in XML (and CORBA), and though JSON is an improvement in that it allows fields to be added more easily, the tooling to support conversion/evolution hasn't grown up around it (and isn't growing).
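A sketch of the hand-written translation layer this describes, with a consumer mapping two wire versions onto one internal shape. The `User` class, the v1 and v2 payload layouts and the "version" field are all invented for the example.

```python
import json
from dataclasses import dataclass

@dataclass
class User:
    # Internal shape, deliberately different from either wire format.
    full_name: str
    email: str

def user_from_wire(payload):
    """Translate a hypothetical v1 or v2 JSON payload into the internal User."""
    data = json.loads(payload)
    version = data.get("version", 1)
    if version == 1:
        # v1 (made up): flat "first"/"last" fields.
        name = f'{data["first"]} {data["last"]}'
        return User(full_name=name, email=data["email"])
    if version == 2:
        # v2 (made up): nested "name" object and a list of emails.
        return User(full_name=data["name"]["display"], email=data["emails"][0])
    raise ValueError(f"unsupported version: {version}")

v1 = user_from_wire('{"first": "Ada", "last": "Lovelace", "email": "ada@example.com"}')
v2 = user_from_wire('{"version": 2, "name": {"display": "Ada Lovelace"}, "emails": ["ada@example.com"]}')
print(v1 == v2)
```

Every consumer ends up maintaining a function like `user_from_wire`, and each new wire version adds a branch to it.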
I think the answer is that JSON works great when the underlying features of applications are changing quickly, because you can't "evolve" around this; you need humans to rethink the basics. While "web APIs" continue in vigorous growth, it will dominate. Maybe it will settle down and consolidate, once everything has changed into a web API... or maybe continuous churn will become the rule, as everything accelerates?
[Interestingly, relational algebra squarely addressed and solved these problems 42 years ago. It's still going strong, though also under attack by similar forces (NoSQL) allied with the loose dynamic typing of scripting languages, and the need for so-called "web-scale" performance being greater than the need for evolution/conversion... at present.]
>For example, there's a problem with coupling between JSON and application data structures in that they must be the same basic shape. So to give your JSON format the ideal shape for consumers, you need to translate into a layer of objects first - and your consumers need to do the same thing to get it into their internal data structures,
Don't you need to do the same thing with XML? How do you use XML data without conforming it to your internal structures first? You can't just guess... I just don't get the difference here. I'm trying to understand what XML can offer that JSON cannot. Do you have an example?
>Similarly, you aren't free to evolve; instead, you produce another version, and all your clients must upgrade.
Same thing here; doesn't XML have the same issue? Upgrading formats can make XML useless just as well as JSON. Also, just like JSON, you can upgrade them without issue, so I'm not seeing the distinct advantage of XML over JSON. Do you have an example?
I'm not trying to be argumentative; I just haven't seen any examples showing what makes XML better than JSON, just lots of "you can't do this in JSON", and I can't find a way to make that true in my head...
Yes, XML has the same problem. The end of that paragraph reads:
> The same problems occurred in XML (and CORBA), and though JSON is an improvement in that it allows fields to be added more easily, the tooling to support conversion/evolution hasn't grown up around it (and isn't growing).
JSON. I've decided. Actually, I think I decided in 2001 or so when I decided that XML was just a bear.
Seems most people have decided to go with JSON as well, and that XML is more used for legacy systems and systems where there's some enterprise component you have to interface with.
Frankly, I hope JSON wins, but if it doesn't, XML needs to have a resurgence really quickly.