
Thank you for articulating this, but I’m familiar with these complaints. XML does give you a lot of freedom to format your data in different ways, which can get you into traps. I’ve run into those traps before, like the decision between attributes and child nodes.

This doesn’t add up to XML hate, for me. The way I would probably write the document is:

  <book>
    <title>XML Cookbook</title>
    <author>Jane Doe</author>
    <author>Tim Pickens</author>
  </book>
This is a fairly boring way to write out a document, and while you can bikeshed all you want, I don’t see the possible bikeshedding as a major drawback. The above is concise and easy to understand.

I wouldn’t use YAML as a basis for comparison. YAML has a fair number of oddities and inconsistencies that led me to stay away from it. XML is at least consistent and simple: there are not really any surprises to speak of, and there are plenty of tools for modifying XML documents even when you don’t have the schema. YAML does have a spec, but it’s complicated enough that different implementations are inconsistent with each other, and there seems to be some inertia at work here.
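
To illustrate the "no schema needed" point, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree` to add an author to the book document above. The helper name `add_author` and the extra author "Sam Example" are made up for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<book>
  <title>XML Cookbook</title>
  <author>Jane Doe</author>
  <author>Tim Pickens</author>
</book>"""


def add_author(xml_text: str, name: str) -> str:
    """Append an <author> child to the root element; no schema is consulted."""
    root = ET.fromstring(xml_text)
    author = ET.SubElement(root, "author")
    author.text = name
    return ET.tostring(root, encoding="unicode")


# "Sam Example" is a hypothetical author used only for this sketch.
updated = add_author(DOC, "Sam Example")
authors = [a.text for a in ET.fromstring(updated).findall("author")]
print(authors)
```

The element text always comes back as a string, so nothing here depends on guessing types from the content.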

There’s also the downright bizarre set of regexes that YAML uses to recognize bare strings as other types, which means that '3.3.0' is a string but '3.3' is a number. If I write 'ni' that’s a string, but 'no' is a boolean. I personally find it harder to read or author YAML due to all these rules. You also have to be a bit more careful to sanitize YAML input due to things like the way !! is handled by various libraries, or the way YAML allows object cycles. It gives you too much rope to hang yourself, has too many surprises, and too many footguns. The fact that YAML is a bit more concise just isn’t enough of an advantage.

    # Quiz: What value does this give you when parsed?
    MAC Address: 11:02:03:04:05:06
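
For anyone who wants to poke at the quiz, here is a rough sketch of regex-based implicit typing using only Python's `re` module. The patterns are simplified and only loosely modeled on YAML 1.1-style rules; they are my assumptions for illustration, not any library's actual resolver, and real parsers use longer rule sets and differ from each other. The names `guess_type` and `sexagesimal_value` are hypothetical.

```python
import re

# Simplified patterns, loosely modeled on YAML 1.1-style implicit typing.
# These are assumptions for illustration, not a real parser's rule set.
BOOL = re.compile(
    r"^(?:yes|Yes|YES|no|No|NO|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
FLOAT = re.compile(r"^[-+]?[0-9][0-9_]*\.[0-9_]*(?:[eE][-+][0-9]+)?$")
SEXAGESIMAL_INT = re.compile(r"^[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+$")
INT = re.compile(r"^[-+]?(?:0|[1-9][0-9_]*)$")


def guess_type(scalar: str) -> str:
    """Return the type a bare (unquoted) scalar would resolve to."""
    if BOOL.match(scalar):
        return "bool"
    if FLOAT.match(scalar):
        return "float"
    if SEXAGESIMAL_INT.match(scalar):
        return "int (base 60)"
    if INT.match(scalar):
        return "int"
    return "str"


def sexagesimal_value(scalar: str) -> int:
    """Evaluate an unsigned colon-separated base-60 integer."""
    total = 0
    for part in scalar.replace("_", "").split(":"):
        total = total * 60 + int(part)
    return total


for sample in ["3.3.0", "3.3", "ni", "no", "11:02:03:04:05:06"]:
    print(repr(sample), "->", guess_type(sample))
```

Under these rules the MAC address never gets a chance to be a string: every colon-separated group is a valid base-60 digit, so the whole thing matches the integer pattern first.
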
For data serialization, I would stick to something like Protocol Buffers. You get a text and binary format, a schema, consistency across implementations, and good tooling.

XML is workable in a lot of situations and in some cases the verbosity makes it a bit more self-documenting than e.g. JSON.

TOML would be my choice for config files that I maintain.




I've grown to like Avro, mostly because of its ability to support schema evolution for reader and writer independently. You get the usual niceties around binary wire format, schema, dynamic parsing and/or code generators etc.



