Sometimes I feel I am the only person in the world who likes XML. It just followed the trajectory of every popular format: it ended up used in places it never should have been.
It is moderately readable and writable, and the tooling is great. Whenever I have to write it Emacs verifies the doctype for me and handles the structural part of it.
And, as the document shows, xslt makes it easy as hell to scan the contents of a file.
OPML is a good example in my opinion. I use it maybe once a year and it has never failed me.
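For what it's worth, here's a toy sketch of that kind of scan, using Python's stdlib ElementTree rather than XSLT, with made-up feed entries:

```python
import xml.etree.ElementTree as ET

# A tiny, invented OPML subscription list.
opml = """<opml version="2.0">
  <body>
    <outline text="HN" type="rss" xmlUrl="https://news.ycombinator.com/rss"/>
    <outline text="LWN" type="rss" xmlUrl="https://lwn.net/headlines/rss"/>
  </body>
</opml>"""

root = ET.fromstring(opml)
# ElementTree supports a limited XPath subset, enough for this kind of scan.
urls = [o.get("xmlUrl") for o in root.findall(".//outline[@type='rss']")]
print(urls)  # ['https://news.ycombinator.com/rss', 'https://lwn.net/headlines/rss']
```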
JSON can be a valid choice, as can XML, but I feel that the decision about which to use is too often based on fashion rather than choosing the best tool for the job. I wish this was different but there seems to be something structural in web development that favours the new over the proven, regardless of the circumstances.
I’ve never liked XML, per se, but I used it extensively, for decades, and even got fairly good with it.
These days, I mostly use JSON, but XML is pretty much an ironclad data definition and transfer protocol. You can define and transfer just about any type of data with it, albeit in a rather “prolix” manner.
> but XML is pretty much an ironclad data definition and transfer protocol.
I suspect that may be why you might not have liked it. XML is not a good serialization of data for network protocols. XML is a good serialization of documents. Ok, that's the received wisdom that I'm echoing, but it's also my experience and my opinion.
When it comes to serialization of data for network protocols there are, and have been, many other better-suited schemes. XML got used as a serialization protocol for the web because it's what existed at the time that was... close to HTML and textual, but it's got the disadvantage of being verbose.
Yes and no. You are correct about it being a document protocol, but it has long been set up as a big document protocol.
Most XML parsers are structured to parse and deliver XML in element-delimited packets, in asynchronous fashion. I don't know of many JSON parsers that can do the same. They do exist (I use one in the backend of one of my projects[0]), but they aren't as common. Packet-based/async XML parsing is built into OS SDKs, while JSON handling tends to be "The Whole Nine Yards": read the entire document, then parse it.
With Big Data/ML, I'm surprised that this is still a thing.
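To make the element-delimited, chunk-at-a-time style concrete, here's a minimal sketch using Python's stdlib pull parser (the sample chunks are invented; in practice they'd arrive from a socket):

```python
import xml.etree.ElementTree as ET

# Chunks as they might arrive off the wire -- no single chunk is a complete document.
chunks = [b"<feed><item>a</item>", b"<item>b</item>", b"</feed>"]

parser = ET.XMLPullParser(events=("end",))
items = []
for chunk in chunks:
    parser.feed(chunk)                       # feed whatever bytes are available
    for _event, elem in parser.read_events():
        if elem.tag == "item":
            items.append(elem.text)          # handle each element as it completes

print(items)  # ['a', 'b']
```

Each `<item>` is delivered as soon as its closing tag arrives, without waiting for the end of the document.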
> but it has long been set up as a big document protocol.
[Meaning stream parsing.] Yes, but typically one does not need that for small messages. I see that streaming decoders for Protocol Buffers are a thing now, but historically one did not bother with streaming for small messages; instead one streams lots of small messages.
> I don't know of many JSON parsers that can do the same.
libjq has one. With jsonlines or similar, if each text is small, there's no need for streamed decoding. Typically a DB query will produce a sequence of lots of small JSON texts, not one very large one.
If you're updating a DOM from XML then stream decoding makes sense, but in many cases streaming isn't ergonomic, just necessary when dealing with large documents (e.g., when there isn't enough memory to hold them without thrashing).
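The "lots of small texts" pattern is trivial to sketch in Python; each line of a jsonlines stream is a complete JSON value, so no streaming decoder is needed (the records here are invented):

```python
import io
import json

# Stand-in for a DB query result or log tail: one small JSON text per line.
stream = io.StringIO('{"id": 1}\n{"id": 2}\n{"id": 3}\n')

ids = []
for line in stream:
    record = json.loads(line)   # each line decodes independently, as it arrives
    ids.append(record["id"])

print(ids)  # [1, 2, 3]
```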
XML is such a versatile format, and I wish it were used much more. It doesn't have exactly the cleanest syntax, but it would be so much better than JSON in some of the cases I've seen, especially when you are transferring document-type data. Why use JSON to represent rich text, when XML is infinitely better?
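A toy illustration of the difference (my own example; the JSON node-tree convention shown is just one of many ad-hoc ones you'll meet in the wild): XML has mixed content, text interleaved with inline markup, built in, while JSON has to invent a scheme for it.

```python
import xml.etree.ElementTree as ET

# Mixed content is native to XML: text before, inside, and after the <em>.
p = ET.fromstring('<p>Say it with <em>feeling</em>, please.</p>')
print((p.text, p[0].text, p[0].tail))  # ('Say it with ', 'feeling', ', please.')

# JSON has no mixed content, so rich text needs an invented node-tree scheme:
rich = {
    "tag": "p",
    "children": [
        "Say it with ",
        {"tag": "em", "children": ["feeling"]},
        ", please.",
    ],
}
```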
XML was/is great. The issue was people abusing the shit out of CDATA and comments to do metaprogramming inside the XML, making it an absolute nightmare.
I believe that was an additional reason why comments were excluded from the JSON spec. I can't find the exact quote, but Crockford's comment about excluding them to keep people from smuggling in parser directives is pretty bang on, considering JSON's primary use as an interchange format (and its staying power is evidence it was a good decision).
It's not just you. I think web development would be in a much nicer place today if we had spent the last 20 years improving XML and XSLT rather than abandoning them for JSON and client-side JS.
People are starting to realize that most sites really boil down to parsing server state and rendering DOM. We never needed to do all this nonsense with serializing all state to JSON and shipping the entire rendering pipeline to the browser; that was just a heavy-handed solution for a very specific scaling issue at Facebook.
Just as a thought exercise, if XML had been the de facto interchange format, how much would that have added to the historical bandwidth transfer of the Internet? Even 1 KB added to every AJAX call would add up pretty significantly pretty fast, I'd imagine...
Obviously JSON isn't well optimized either but I wonder how much, if any, progress might have been slowed by XML syntax clogging the pipes even more.
Resource requirements expand until they hit a user-noticeable limit. Even ultra-compressed, every-bit-counts encodings[0] would be ignored and abused until they're bloated to a user-noticeable limit. Or the extra bandwidth would be used for more video ads.
> how much, if any, progress might have been slowed by XML syntax clogging the pipes even more.
Depends what you mean by "progress", and if you think Web development has been improving or devolving over time.
XML is definitely more verbose than JSON, though I'd be very surprised if an average content-heavy site would be smaller with something like JSON + react. I'd be surprised if server components tipped the scales either given that the server state would still be shipped as HTML and/or a virtual dom representation.
I liked XML 1.0. I gave up after getting tired of the standards community not prioritizing users - the thickets of interdependent specs, dearth of good documentation, and critical lack of work on quality implementations of the standards (e.g. no decent editors except $$$ oXygen, libxml2 and Xalan never implementing anything newer than 1990s-era XSLT 1.0, often missing or conflicting examples of anything non-trivial, etc.). I really wish there'd been an effort to focus on the basics so there wasn't such a gap between the vision of the standards committees and the lived experience of most users.
I love love love XML, but when I encounter an effort to use it to carry presentation like HTML alongside executable code -- such as Apache Jelly -- I regret the choices that brought me to that place in my life.
I wouldn't call OPML a good example of XML for reasons I detail elsewhere in the thread. But if you need a subscription list of feeds for import or export it's alright.