

8 Reasons why XML Sucks - arthurk
http://firstclassthoughts.co.uk/xml/why_xml_sucks.html

======
jrockway
_XML is character based_

No, XML is binary. The <?xml at the beginning of the file is the "magic
number" that lets the XML parser guess enough of the encoding to read the
"encoding='whatever'" part, and then the rest of the file. Everything is
binary though, never treat it as text!
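
To illustrate the point about handing the parser bytes rather than text, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The sample document and the helper name `parse_name` are made up for illustration. The same logical document is serialized in two different encodings, and the parser works out each one from the bytes themselves (the BOM or the `encoding` declaration).

```python
import xml.etree.ElementTree as ET


def parse_name(raw: bytes) -> str:
    """Parse raw XML bytes and return the text of the root element.

    The caller never decodes the bytes; the parser picks the encoding
    from the BOM and/or the XML declaration.
    """
    return ET.fromstring(raw).text


# Hypothetical sample documents: same content, different byte encodings.
latin1_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><name>café</name>'.encode(
    "iso-8859-1"
)
# Python's "utf-16" codec writes a BOM at the start of the output.
utf16_doc = '<?xml version="1.0" encoding="UTF-16"?><name>café</name>'.encode(
    "utf-16"
)

if __name__ == "__main__":
    print(parse_name(latin1_doc))
    print(parse_name(utf16_doc))
```

The two byte strings are completely different on disk, which is why decoding the file yourself with a guessed encoding before parsing is the risky path.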

The author claims that most editors use latin-1 character encodings; I don't
think this is correct. Most editors default to UTF-8 these days, and if your
UTF-16-encoded XML has a BOM at the beginning, the editor will use UTF-16
instead. So basically, this is not really a problem. You _can_ use your text
editor to edit XML.

In the end, it sounds like the author doesn't like file formats that take more
than the minimum possible space to store the data. XML is pretty verbose, but
since disk space is cheap and libraries do all of the generation/reading, I
don't really care. XML is far from the solution to every problem, but it is
much quicker than rolling my own format which nothing else understands. (BTW,
did the author ever try gzipping the XML? That should make it much smaller.
libxml2 can even operate directly on gzipped XML files.)

~~~
jrockway
So I tried gzipped XML versus just dumping out 4-byte longs; here's a script
that makes an XML file with 1000000 integers, and puts the same data in a
binary file next to it:

<http://scsys.co.uk:8001/16872>

The gzipped XML uses about 5.5M, the binary file uses around 4M. So yes, XML
uses more space, but you have to ask if saving 1.5M per 1000000 records is
worth inventing your own format that nobody else understands. Sometimes it might be,
other times dealing with XML's trade-offs might be a better use of your time.
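
The linked script is not reproduced here; what follows is an independent sketch of the same kind of comparison in Python. The element names (`numbers`, `n`), the use of random 32-bit values, and the function names are all assumptions, so the sizes it prints will not necessarily match the figures quoted above.

```python
import gzip
import random
import struct


def build_xml(values):
    """Serialize the integers as a simple XML document (UTF-8 bytes)."""
    parts = ["<?xml version='1.0' encoding='UTF-8'?>\n<numbers>\n"]
    parts.extend(f"<n>{v}</n>\n" for v in values)
    parts.append("</numbers>\n")
    return "".join(parts).encode("utf-8")


def build_binary(values):
    """Serialize the integers as raw little-endian 4-byte signed ints."""
    return struct.pack(f"<{len(values)}i", *values)


def compare_sizes(count, seed=0):
    """Return byte sizes of plain XML, gzipped XML, and raw binary."""
    rng = random.Random(seed)
    values = [rng.randint(-2**31, 2**31 - 1) for _ in range(count)]
    xml_bytes = build_xml(values)
    return {
        "xml": len(xml_bytes),
        "xml_gzip": len(gzip.compress(xml_bytes)),
        "binary": len(build_binary(values)),
    }


if __name__ == "__main__":
    # Raise the count to 1_000_000 to match the experiment described above.
    for label, size in compare_sizes(100_000).items():
        print(f"{label}: {size} bytes")
```

How close the gzipped XML gets to the binary file depends heavily on the data: small or repetitive integers compress far better than uniformly random ones.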

------
bct
Look, it's simple: XML is a _markup_ language. It's for documents. Don't use
it for data.

------
kenver
Some of those reasons seem a bit strange to pin on a technology that, from my
understanding, has never claimed to be good at encoding binary data or
non-hierarchical data.

XML is used to create custom mark-up languages for documents, so I don’t know
why you would put binary data in it; surely you’re missing the point if that’s
what you’re doing. Perhaps store a link to a binary file rather than encoding
the binary inside the XML.
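
As a rough illustration of what embedding costs, assume the binary is embedded as Base64 text (a common way to carry bytes inside XML, though the comment above does not name a specific encoding). The helper name `base64_overhead` is made up for this sketch.

```python
import base64
import os


def base64_overhead(payload: bytes) -> float:
    """Return encoded size divided by raw size for a Base64-encoded payload."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(base64.b64encode(payload)) / len(payload)


if __name__ == "__main__":
    # Every 3 raw bytes become 4 Base64 characters, so the ratio is about 4/3.
    payload = os.urandom(3000)
    print(f"overhead factor: {base64_overhead(payload):.3f}")
```

A link to an external file avoids that size increase, at the cost of the document no longer being self-contained.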

As for the verbosity of it, that’s a trade-off that you consciously make
when you choose to use XML. If you want to use binary data and have small
files, then use a binary format. With XML you pay in file size and efficiency,
but you also get the added advantage of interoperability across hardware and
software systems, which is probably the main reason you would want to use XML
in the first place.

As for human readability, I don’t think the creators of XML claimed it to be
readable like a book, but rather that a human being could look at it and see
language rather than gobbledygook! It might be hard to understand when there’s
loads of XML, but it’s still 'readable'.

Editor problems aren’t really the fault of XML either, that’s a bit like
saying a song sucks because you’ve got a rubbish CD player.

------
shaunxcode
It would be sweet if he showed his BML format to compare and contrast in
regards to his list of grievances.

------
danielrhodes
I agree with just about everything he said. I hate XML and find it extremely
tedious, with almost no payoff. I think it's useful in the context of XSLT
transformations, but as a protocol to exchange data between two services etc.,
it's far more work than it's worth.

------
DarkShikari
A ninth reason:

<http://www.tomychen.org/?p=8>

(Look at the includes)

