

You probably misunderstand XML - msbmsb
http://lemire.me/blog/archives/2010/11/17/you-probably-misunderstand-xml

======
dgreensp
Thumbs down.

He taught an entire course on XML, which he calls a "great meta-example on how
to deal with semi-structured data"? And his only defense of XML over JSON
is... it's worked ok for some file formats?

The only point in this whole article is that XML is not well-suited for RPCs,
though he fails to argue that it's well-suited for anything else.

One argument is that XML is better than JSON for use cases like XHTML, where
you heavily mix tags and content. I get the feeling XML wasn't really made for
this case, though; it was made for the JSON-like case. Processing XHTML with
E4X (the "XML for JavaScript" standard) is painful, and XML libraries in
general assume your document basically consists of a tree of tags, maybe with
text nodes at the leaves.

I was expecting some argument invoking the power of DTDs and XSLT or whatever
else, or the original point of XML that people overlook, and all I got was an
extremely weak defense of XML from someone who taught a whole course on it.

~~~
jamesbritt
"I get the feeling XML wasn't really made for this case, though, it was made
for the JSON-like case."

Back in 1997, XML was "SGML for the Web." It was a way to pass around
structured, plain-text, human-readable documents that did not require
expensive, buggy, incomplete parsers.

It then got misapplied as an RPC transport encoding, and tools vendors were
more than happy to start pushing specs, such as W3C Schemas, that demanded the
use of tools.

It started out to be simple, but, as things happen, got hijacked. But the
fault is with the misapplication, not XML itself.

~~~
jasonwatkinspdx
If you read the annotated xml spec, it really is quality work. I don't
necessarily agree with every design decision, but I think a lot of people look
at the complexity of xml applications and falsely blame xml itself.

Sadly there were some sensible early formats that were left behind. XML-RPC's
serialization is a bit verbose but otherwise is quite similar to JSON. Somehow
that got turned into SOAP and then eventually the WS- tar pit of complexity.
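For what it's worth, the "verbose but otherwise quite similar to JSON" claim can be checked directly from Python's standard library, which still ships an XML-RPC marshaller. (The method name "update" and the record fields below are invented for illustration.)

```python
import json
import xmlrpc.client

# The same record in both serializations.
record = {"user": "alice", "active": True}

as_json = json.dumps(record)
as_xmlrpc = xmlrpc.client.dumps((record,), methodname="update")

print(as_json)     # one compact line
print(as_xmlrpc)   # the same data, spelled out element by element
```

The XML-RPC form wraps each value in typed elements (`<string>`, `<boolean>`), which is exactly the verbosity being described; the data model underneath is the same handful of scalars, lists, and structs that JSON has.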

Likewise XML as a configuration file language can be quite elegant, almost
like a literate coding version of common .ini or .conf files. But instead of
this simple flat document littered with variables, xml config files in the
wild end up with deeply nested structure that contributes dubious value and
makes the files far less human friendly.
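A minimal sketch of the flat style being described, with a made-up config vocabulary; read this way, stdlib ElementTree handles a flat XML config about as easily as an .ini parser handles .ini:

```python
import xml.etree.ElementTree as ET

# A hypothetical flat config in the spirit the comment describes:
# one element per setting, no nesting.
FLAT_CONFIG = """
<config>
  <listen-port>8080</listen-port>
  <log-level>info</log-level>
  <data-dir>/var/lib/app</data-dir>
</config>
"""

root = ET.fromstring(FLAT_CONFIG)
# One dict comprehension recovers the whole key/value table.
settings = {child.tag: child.text for child in root}
print(settings)
```

The deeply nested configs found in the wild need recursive traversal instead of this one-liner, which is the "dubious value" trade-off in question.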

XML itself, with the possible exception of namespaces and a few other
features, is quite simple. I totally agree it's the applications that have
gotten out of hand, particularly in areas where XML is used as structured data
exchange rather than document markup.

------
adulau
My biggest issue with XML is "XML misunderstands Unix philosophy". You can't
easily use cut, awk and grep with XML without a million edge cases to handle.
There are some tools like XMLStarlet, xsltproc or xalan, but you can't safely
extract content from XML files with standard tools even if you use the XML
extension for gawk.

You could argue that XML documents are complex and cannot be described as
simple comma-separated values. Maybe, but many XML documents are just there to
store simple key/value data.

And now we have "jsawk" (<https://github.com/micha/jsawk>) for parsing JSON
in your terminal...
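To illustrate why line-oriented tools hit those edge cases: the same XML data can be laid out many different ways, and grep sees different lines where a parser sees one tree. A sketch with Python's stdlib parser (element names invented):

```python
import xml.etree.ElementTree as ET

# Two equivalent documents. A grep for 'key="host"' behaves differently on
# each layout; a parser does not care about line breaks.
ONE_LINE = '<settings><entry key="host">example.org</entry></settings>'
MULTI_LINE = """<settings>
  <entry
    key="host">example.org</entry>
</settings>"""

results = [
    {e.get("key"): e.text for e in ET.fromstring(doc).iter("entry")}
    for doc in (ONE_LINE, MULTI_LINE)
]
print(results)  # identical key/value pairs for both layouts
```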

~~~
ams6110
xsl and transforms are what you use to extract data from xml. xslt is the
coolest thing about xml IMHO.

the only real complaint I have is that xsl, being itself xml, is pretty
verbose and can be tedious to write.

~~~
JonnieCache
xslt is beyond tedious, it is infuriating. There are few use cases for xslt
that would not be better served with a procedural technique, e.g. Python and a
parser.

also the whole "using xml to define a transformation on some other xml" thing
is _so_ overly meta as to induce a massive brain hemorrhage out of my nose and
all over my desk.

~~~
wmf
Fortunately you can now use XPath for simple queries and XQuery beyond that.
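A sketch of the "simple queries" end of that spectrum, using the XPath subset built into Python's stdlib ElementTree (the document is invented; full XPath and XQuery need an external engine such as lxml or an XQuery processor):

```python
import xml.etree.ElementTree as ET

DOC = """
<library>
  <book year="2003"><title>XQuery</title></book>
  <book year="1999"><title>XSLT</title></book>
</library>
"""

root = ET.fromstring(DOC)

# Descendant search: all titles anywhere in the document.
titles = [t.text for t in root.findall(".//title")]

# Attribute predicate: titles of books from a given year.
recent = [t.text for t in root.findall(".//book[@year='2003']/title")]

print(titles, recent)
```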

~~~
JonnieCache
Forgot about them. Both XPath and XQuery are excellent technologies, and
probably the best thing about XML in my experience with it. I highly recommend
everyone concerned with XML check them out if they haven't already. I never
seem to see much mention of them in XML discussions.

<http://www.w3schools.com/xquery/xquery_intro.asp>

~~~
elblanco
> I never seem to see much mention of them around XML discussions.

Which is a shame since I've been banging my head against a particular set of
problems for a while with XML, and XQuery nicely resolves that, and I think
it's easier to use than SQL for the most part.

------
mojuba
Glad to hear SOAP is basically done:
[http://blogs.msdn.com/b/interoperability/archive/2010/11/10/...](http://blogs.msdn.com/b/interoperability/archive/2010/11/10/ws-i-completes-web-services-interoperability-standards-work.aspx)

That's laws of natural selection at work.

------
gruseom
As Tim Lister once said, if everybody's getting it wrong, there's something
wrong with _it_.

~~~
dkarl
That's a cop-out. You really have to define "everybody."

I know a guy who deployed a Java application on servers with 64MB of memory,
and he did it back before the JIT compiler was any good. It was performant and
got the job done. He's not unique: lots of performant Java applications were
built on hardware that was tiny compared to today's hardware. But for some
reasonable meaning of "everybody," everybody writes horrible bloated Java code
that requires costly hardware to run.

I've used simple, practical XML web services -- in fact, we have several
running at work, and when adding or changing functionality, dealing with the
XML aspect is a rounding error compared to implementing the application logic.
But for some reasonable meaning of "every," everybody writing enterprise XML
web services creates overengineered, overcomplex, finicky interfaces that
require ongoing error-prone tweaking of DOM or SAX code.

Sometimes when everybody's getting it wrong, that just means "it" has proved
irresistible to stupid people and PHBs. It doesn't mean a sensible, tasteful
engineer won't be able to use it correctly. Ditching a technology because
stupid people love to misuse it may be a good fashion choice, and it may be
a good way to influence hiring if you don't have more direct influence, but
there's no engineering justification for it.

And don't forget that for some reasonable meaning of "everybody," everybody
who has tried Lisp programming has become horribly lost and failed to
accomplish anything with it. (This may be less true since Lisp is rarely
taught in colleges nowadays, but it was true at some point in time.)

------
unwind
Dupe, and not very old at that: <http://news.ycombinator.com/item?id=1916489>.

~~~
msbmsb
I looked for a prior submission before posting - search and browse. Somehow I
missed it.

~~~
RiderOfGiraffes
<fx: thoughtful frown>

<http://searchyc.com/submissions/xml?sort=by_date>

First result.

~~~
msbmsb
Got it.

------
uriel
It is the author of this article who, despite claiming to have taught a course
on XML, seems to misunderstand XML.

I think one of the persons who best understood XML was Erik Naggum, or at
least few have explained it so eloquently:

<http://harmful.cat-v.org/software/xml/s-exp_vs_XML>

~~~
jallmann
That Naggum email was beautiful, informative, funny and wildly digressive.
Mind blown.

------
drivebyacct2
Which is the part that I was supposed to have misunderstood exactly?

