
I'll risk that this is an unpopular opinion, but if you are having problems with XML, you are using it to solve the wrong problem.

I understand that writing XSLT and XML Schema can be difficult, and I see how typing out XML namespaces can be a pain, but every sentence about XML in that article is a joke. Those quotes are all intended to be funny, not objective. No one has actually brought any objective facts against XML. Because they can't. The fact is that it is widely used in many places. Can anyone tell me an alternative for serialising an object tree where you also need to preserve ordering and type information, store text longer than one line, or store any kind of formatting information? (And yes, you can use JSON to do that, but the resulting document will be 5x longer.)
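As a rough illustration of the formatting point above, here is a minimal sketch of mixed content (text interleaved with markup) in XML next to one possible JSON encoding of the same tree. The sample sentence, the `to_json_tree` helper and the `tag`/`children` layout are all made up for this example; other JSON encodings exist, and the size ratio depends entirely on the document.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample: a sentence with inline formatting (mixed content).
xml_doc = "<p>Some <b>bold</b> and <i>italic</i> text.</p>"


def to_json_tree(elem):
    """Convert an element into a dict, keeping text and children in order.

    Attributes are ignored to keep the sketch short.
    """
    node = {"tag": elem.tag, "children": []}
    if elem.text:
        node["children"].append(elem.text)
    for child in elem:
        node["children"].append(to_json_tree(child))
        if child.tail:  # text that follows the child inside the parent
            node["children"].append(child.tail)
    return node


tree = to_json_tree(ET.fromstring(xml_doc))
json_doc = json.dumps(tree)

print(len(xml_doc), xml_doc)
print(len(json_doc), json_doc)
```

With this particular encoding the JSON version comes out longer than the XML, but a different encoding or a different document would give a different ratio.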

(meta: Funny quotes bashing useful technologies are the cat video equivalent of HN. Last week's article beating OOP was the same pattern.)




Objective facts:

* XML is complicated enough that its parsers are commonly full of obscure bugs. JSON/YAML doesn't have this problem.

* XML is complicated enough that its parsers can have security vulnerabilities (e.g. see billion laughs for just one). JSON/YAML doesn't have this problem.

* XML is complicated enough that you can create an almost-but-not-quite valid encoding. The (already complicated enough) parsers have to deal with this and the ones that don't are considered broken. JSON/YAML doesn't have this problem.

* XML's complexity does not give you any additional benefit over YAML or JSON. Serializing/deserializing dates as strings is not a problem. It never was.

XSLT is just the shitty icing on the already crappy cake. A committee created a disastrous Turing-complete programming language to munge this already overcomplicated data format.
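For anyone who hasn't seen it, the "billion laughs" issue mentioned above is nested entity expansion: each entity refers several times to the one below it, so a tiny document expands into a huge string. The sketch below uses a deliberately small, harmless instance (3 levels, 4 references each); the helper names and the entity names `e0`..`eN` are made up for this example, and it assumes the parser behind Python's `xml.etree.ElementTree` expands internal entities, as it does by default.

```python
import xml.etree.ElementTree as ET


def nested_entity_doc(levels, fanout, payload="lol"):
    """Build a document whose entities reference each other in layers."""
    decls = [f'<!ENTITY e0 "{payload}">']
    for i in range(1, levels + 1):
        refs = f"&e{i - 1};" * fanout  # each layer repeats the layer below
        decls.append(f'<!ENTITY e{i} "{refs}">')
    dtd = "".join(decls)
    return f"<!DOCTYPE r [{dtd}]><r>&e{levels};</r>"


def expanded_size(levels, fanout, payload="lol"):
    """Size of the fully expanded text, computed without parsing anything."""
    return len(payload) * fanout ** levels


# Small and safe: 4**3 = 64 copies of "lol".
doc = nested_entity_doc(levels=3, fanout=4)
text = ET.fromstring(doc).text

print(len(doc), "characters of XML expand to", len(text), "characters of text")
# The classic attack shape (9 levels of 10 references) is only computed here:
print("classic shape would expand to", expanded_size(9, 10), "characters")
```

Growth is exponential in the number of levels, which is why parsers that guard against this cap entity expansion or refuse DTDs altogether.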


YAML allows deserialization into arbitrary native types, which most definitely is [1] an issue (see: the flood of Rails/YAML vulns a while back)

[1] http://blogs.teamb.com/craigstuntz/2013/02/04/38738/


That is an issue, but it's more of an education/naming issue since it is, after all, intentional.

I think it's really dumb that most YAML libraries have a load() and a safe_load(). If they had a load() and a dangerous_load() then the problem basically wouldn't exist.


XML is very simple. Maybe YAML is simpler but both are very simple. XML is tedious to write, verbose and repetitive, but not complicated nor complex. XSLT is also conceptually very simple. XML, XML schema tools and XSLT make a very powerful combination that has proven to be useful in a myriad of real world problems.


If only XML were actually as simple as you claim. If only there weren't a myriad of namespace, encoding, character literal substitution and other complexities.


If we eliminated everything where implementations have had obscure bugs or security vulnerabilities, there would literally be nothing left.

> XML's complexity does not give you any additional benefit over YAML or JSON.

This is so incredibly wrong, on every level, that it beggars belief and reads like something you would come across on a "beginning programmers" forum. As others have said, JSON/YAML thus far have seen limited usage (no, that configuration file on your app is not a complex example). But as it grows people are starting to ask questions like "Gosh, wouldn't it be nice if my perimeter or the source system via a metadata file could validate the JSON passed to us". "Wouldn't it be nice to be able to convert from one JSON form to another."

And the exact same complexity is arising...poorly, and with the same hiccups that the XML system went through.

I mean some of the comments are incredible. Like "JSON is simple enough that errors aren't big" -> Hey, sorry that those bank transfers got lost, but it turns out that we mistyped the account number field name and the destination system just ate it. JSON.

¯\(°_o)/¯

Sorry that the dates are completely wrong, but all of those years of discovery about time zones and regional settings...just make it some sort of string and they'll figure it out.

¯\(°_o)/¯


> Hey, sorry that those bank transfers got lost, but it turns out that we mistyped the account number field name and the destination system just ate it.

The JSON approach does not give you everything-and-the-kitchen-sink. A lot of people consider that a feature.

If you want to do schema validation on top of JSON messages, you're free to do it when you receive them - the data format does not prevent you from that, it merely does not advocate and standardize one-way-of-doing-it.
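A minimal sketch of what "validate when you receive them" could look like for the bank transfer example, using only the standard library. The field names, the `REQUIRED` table and `parse_transfer` are hypothetical, not any real banking API; a schema library would do the same job with less code.

```python
import json

# Hypothetical message layout: field name -> expected Python type.
REQUIRED = {"account_number": str, "amount": int}


def parse_transfer(raw):
    """Parse a transfer message, rejecting unknown, missing or mistyped fields."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("expected a JSON object")
    unknown = sorted(set(msg) - set(REQUIRED))
    missing = sorted(set(REQUIRED) - set(msg))
    if unknown or missing:
        raise ValueError(f"unknown fields {unknown}, missing fields {missing}")
    for name, expected in REQUIRED.items():
        # Exact type check so that JSON true/false is not accepted as an int.
        if type(msg[name]) is not expected:
            raise ValueError(f"field {name!r} must be {expected.__name__}")
    return msg


ok = parse_transfer('{"account_number": "12345", "amount": 100}')

try:
    # Mistyped field name: rejected instead of being silently dropped.
    parse_transfer('{"acount_number": "12345", "amount": 100}')
    typo_rejected = False
except ValueError as err:
    typo_rejected = True
    print("rejected:", err)
```

Rejecting unknown field names is the part that catches the typo; whether to be that strict is a design choice the receiver makes, not something the format decides.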

The fact that the various existing JSON schema solutions have not found a leader amongst themselves speaks loudly to the fact that it's a useless feature for most people, and the format is better off without it. Whatever the RFC would come up with, people would find fault in it... so if most users don't care, why force one solution over any other?

GP is foolish to think XML does not have benefits over JSON, but you're a lot more foolish to think those benefits (the ones you advertise, anyway) should be part of the language. You say "As JSON grows...", but that's exactly the thing: it doesn't grow. It's a simple data format and needs no new feature. Would trailing commas and comments be nice? They sure would. But we can live without them in the format itself... let alone schema validation which can be done externally.


>GP is foolish to think XML does not have benefits over JSON

I am? What benefits would those be?


> It's a simple data format and needs no new feature.

XSLT was developed entirely independently of XML. XML Schemas were developed entirely independently of XML.

XML itself is absurdly simple. It is the epitome of simple. But you build an ecosystem of tools and standards around it. And that is of course already happening in JSON -- JSON Schemas, for instance, are now a thing.


> XML itself is absurdly simple. It is the epitome of simple.

I can't possibly argue with you if you actually believe that. XML is not simple. XML has CDATA, DOCTYPEs, comments, attributes, significant whitespace and so much more which JSON does not have.


>If we eliminated everything where implementations have had obscure bugs or security vulnerabilities, there would literally be nothing left.

The point is that by eliminating this data format you get rid of those obscure bugs and security vulnerabilities and you lose nothing of value doing it.

>This is so incredibly wrong, on every level, that it beggars belief and reads like something you would come across on a "beginning programmers" forum.

I wouldn't find this quite so pathetic if I didn't have to school you on XML parser vulnerabilities.

>As others have said, JSON/YAML thus far have seen limited usage

What are you smoking? JSON is everywhere these days. More commonly used in new web APIs than XML for sure.

>But as it grows people are starting to ask questions like "Gosh, wouldn't it be nice if my perimeter or the source system via a metadata file could validate the JSON passed to us". "Wouldn't it be nice to be able to convert from one JSON form to another."

The first I hear occasionally, but it honestly isn't ever a problem. You can put validation in the code that parses the JSON. Invalid date sent? Return an error when your JavaScript/Python/Java returns an error parsing it. Name too long? Ditto. You don't need additional outside validation if your programming language doesn't suck.

The second question isn't one I have ever heard in 12 years of software development. Generally you want to do something useful with JSON input. That useful thing isn't normally "make more JSON that looks slightly different".

>And the exact same complexity is arising

Nope. Ain't no billion laughs vulns in any JSON parsers that I know of. No subtle parser bugs causing fucked up behavior down the line either.

>I mean some of the comments are incredible. Like "JSON is simple enough that errors aren't big" -> Hey, sorry that those bank transfers got lost, but it turns out that we mistyped the account number and the destination system just ate it. Json.

If you mistyped the account number on your banking system and it got caught by an XML validator your systems must be fucked.

That's the worst excuse for XML I've ever heard: that your systems are so terribly programmed that you must find user errors via validation of your data interchange format. Jesus.

>Sorry that the dates are completely wrong, but all of those years of discovery about time zones and regional settings...just make it some sort of string and they'll figure it out.

Essentially, yes. ISO 8601 and you're done. Where's the problem?
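To make the "ISO 8601 and you're done" claim concrete, here is a small sketch of a timestamp round-tripping through JSON as a string with an explicit UTC offset. The `created` field name and the sample date are made up for this example; it assumes both sides agree to always include the offset.

```python
import json
from datetime import datetime, timedelta, timezone

# Sender: an aware datetime at UTC+01:00, serialized as an ISO 8601 string.
sent = datetime(2014, 3, 1, 12, 30, tzinfo=timezone(timedelta(hours=1)))
wire = json.dumps({"created": sent.isoformat()})

# Receiver: parse the string back into an aware datetime.
received = datetime.fromisoformat(json.loads(wire)["created"])

print(wire)
print(received == sent)                   # same instant
print(received.astimezone(timezone.utc))  # normalized to UTC
```

The offset travels with the value, so the receiver doesn't have to guess a time zone. A naive timestamp without an offset would bring the ambiguity back.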


It's a glorious time in software development when people who make and use trivial web apps think that their domain dominates, and that their superficial knowledge reigns supreme.


Exactly this.

It's one thing to knowingly use simpler data formats or approaches (callback based concurrency model) to build systems that are small, _and will remain small_.

That's defensible.

But what I see is a bunch of new programmers not bothering to learn established systems, systems that have tackled a much larger problem domain, and deriding them as legacy garbage.

XSLT has its cruft, but let's see the JSON/YAML fanbois tackle the same problem domain with their toy formats, then we can compare like with like.


Or someone else has used it to solve the wrong problem.

Or your customer demands that you solve the wrong problem with XML.

A nice example: Simple configuration files which are best described as simply option=value or maybe json if someone wants to go really wild.

A customer comes and wants configuration files to be XML. Then your sales department agrees and now you have to implement XML files. The end result: Configuration files are no longer easily editable by humans. Yay!

Another example: Someone decided that using makefiles is too hard, so let's make the equivalent but with XML! I'm looking at you, Ant! Now they still have the same problems as makefiles but they are much harder to edit.


In my experience

* Configuration files are best made with YAML (it's the most human readable).

* APIs / other forms of serialization/deserialization over a network are best done with JSON (chop it in half and it will fail fast unlike yaml. still fairly readable tho).

* Programming languages (like ant) should not be written in either one ever (fortunately I've never heard of a YAML or JSON based language).

* XML does a bad to terrible job of all three.
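The "chop it in half and it will fail fast" point about JSON can be sketched with the standard library. The payload is a made-up example, and only the JSON side is shown here (the YAML behaviour is not demonstrated).

```python
import json

# Hypothetical payload, cut off halfway as if the connection had dropped.
payload = json.dumps({"account": "12345", "amount": 100, "note": "rent"})
truncated = payload[: len(payload) // 2]

try:
    json.loads(truncated)
    truncated_parsed = True
except json.JSONDecodeError as err:
    truncated_parsed = False
    print("rejected truncated input:", err)
```

Because the closing brace is missing, the parser reports an error instead of handing back a partial object.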


Agreed. How much XML was used to introduce DSLs and dynamic typing into Java?



