I'll probably listen to this at some point and genuinely like hearing about the origins of JSON from Crockford himself.
Apart from that, I'd be surprised if this adds anything to what I've already heard and read ad nauseam, including from Crockford himself.
JSON is just a nice, simple serialization format that compresses well and interoperates acceptably with most programming languages. Most importantly with the language it was designed to be consumed by: JavaScript.
If one needs better performance or stronger guarantees about serialized data types, I guess one should use Protobuf, ASN.1 (was that even spelled correctly?), or any of the countless other formats I have no clue about.
If you need better schema restrictions and interconnected entities + circular references: use a database, I guess.
At least everything XML has done in this space has more or less failed (e.g. SOAP). Well, failed might be too strong a word.
But all the work on defining data types and schema seems almost independent of the underlying syntax.
Since I've now spent too much time on this uninformed first comment, I will now listen to what Douglas Crockford has to say. But I think I've already read this story in another form a couple of years ago.
JSON is really a good example of "worse is better". E.g. an OpenAPI spec with automated client code and TypeScript definitions is not much different from SOAP on the surface level.
1: text with markup. Does not need all the XML machinery, but tags make a lot of sense here.
1b: Arbitrarily nested stuff
2: extensibility by third parties without coordination. Namespaces as a central concept help out a lot here, and this use case can even justify some of the more insane features of XML (downloading remote schemas, etc.)
If you don't need either of those things, you won't get much value out of XML. If you do need them, XML can be neat. Certainly easier than trying to allow people to add arbitrary data (with validation) to your protobuf. It's just that these use cases are somewhat rare
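The namespace point can be sketched with Python's stdlib parser. The vocabulary and URNs below are made up for illustration; the idea is that a third party can attach its own elements without coordinating with the owner of the base vocabulary:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a shipping vendor adds its own element under its own
# namespace, alongside a base vocabulary it does not control.
doc = """
<order xmlns="urn:example:base" xmlns:ship="urn:example:shipping">
  <item>widget</item>
  <ship:carrier>ACME</ship:carrier>
</order>
"""

ns = {"base": "urn:example:base", "ship": "urn:example:shipping"}
root = ET.fromstring(doc)
print(root.find("base:item", ns).text)     # widget
print(root.find("ship:carrier", ns).text)  # ACME
```

Consumers that only know the base vocabulary can simply ignore the `ship:*` elements, which is exactly the "extension without coordination" property.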
XML tends to be more feature-rich than people need, but what's really lacking in JSON is any support for using custom types to disambiguate polymorphism.
I can't seem to find it, but there's a serialization that's basically Python, so
Stuff(a=10, b={"foo": bar})
It makes it easy to have a list with objects of different types in it.
> what's really lacking in JSON is any support for using custom types to disambiguate polymorphism
Unless I misunderstood you, you can use tagged unions[1] for this so not sure JSON needs any change there. Slightly more verbose but shouldn't be an issue in practice.
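A minimal sketch of the tagged-union idea: each object carries a discriminator field (here a hypothetical `"type"` key) that tells the consumer which shape to expect.

```python
import json

# A list mixing two object shapes; the "type" tag disambiguates them.
payload = '[{"type": "circle", "radius": 2.0}, {"type": "rect", "w": 3, "h": 4}]'

def area(obj: dict) -> float:
    if obj["type"] == "circle":
        return 3.14159 * obj["radius"] ** 2
    if obj["type"] == "rect":
        return obj["w"] * obj["h"]
    raise ValueError(f"unknown type: {obj['type']}")

print([area(o) for o in json.loads(payload)])  # [12.56636, 12]
```

This is the same convention most serialization libraries (serde, Jackson, etc.) use for polymorphic JSON, just spelled out by hand.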
I was recently downloading and parsing the US Patent application database, which is emitted in XML.
In 2023, Python's lxml can't handle multiple documents per file, and no one has any suggestions on how to handle this other than "break on the new document tag". Other libraries in other languages didn't look promising either.
Then if you try to read it, you need to disable all the DTD handling, because it immediately tries to open other local files on disk... which it can't find. Something is broken in how it parses DTDs, because I had to implement a custom DTD handler in my Jupyter notebook to read it.
I could blame the dominant XML processing library of Python, I could blame the lack of attention XML gets, but really? This is just another contribution to a long line of XML not working in some way in every language as soon as you're not using that language's specific XML flavor.
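For what it's worth, one workaround for the multiple-documents-per-file problem is to strip the per-document XML declarations and wrap everything in a synthetic root. A sketch with the stdlib parser (the element names are made up, not the actual patent schema):

```python
import re
import xml.etree.ElementTree as ET

# A file containing several concatenated XML documents, each with its own
# declaration -- the shape that trips up parsers expecting a single root.
raw = (
    '<?xml version="1.0"?><application><id>1</id></application>\n'
    '<?xml version="1.0"?><application><id>2</id></application>\n'
)

# Drop the declarations, then parse everything under one synthetic root.
body = re.sub(r"<\?xml[^?]*\?>", "", raw)
root = ET.fromstring(f"<wrapper>{body}</wrapper>")
print(len(root))  # 2
```

This loads the whole file into memory, so for multi-gigabyte dumps you'd stream and split on the document boundary instead, which is essentially the "break on the new document tag" advice from above.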
I haven't had to work with XML all that often. For the few programs I've made that need to speak XML, I've always used the XML module in Python's stdlib. Is that bad practice? What does lxml provide as a benefit?
Yeah I had SVG in mind when I wrote that post but I thought of different use cases.
Of course every document format can also be called a serialization format.
I enjoy JSX as well (and even SOAP!). That's why I weakened the quoted sentence in my comment.
When I hear "JSON vs XML" I can't help but mainly think of data payloads in web applications, especially non-static client/server communication.
It's worth listening to, or at least reading the transcript, just for the Dave Winer burn. (And so much more!)
>Adam: That was it. That was the creation of JSON, which everyone is using today. But back then, everyone rejected it.
>In some ways it was a marketing problem. On one side, you had Doug, trying to convince customers that they can build interactive applications on the web using JavaScript and this simple thing called JSON. But on the other side, you had XML that had these big companies behind it, IBM, Microsoft, and big consultants. And later they even had some tech influencers like Dave Winer.
>Douglas: He’s someone who should have known better. He had a website called scripting.com. His style of scripting came from a clever program that he had written for the Macintosh called Frontier, in which he had a scripting language and an outliner and a word processor and a database, all in one program. And the idea was that you could do virtually anything in Frontier with a little bit of scripting. And he was also one of the big promoters of SOAP, the Simple Object Annoying Protocol.
>I don’t remember what the A was, but it might have been atrocious or abominable, I don’t know.
>But SOAP was a big deal at the time. They were right in wanting simplicity. They didn’t accomplish it, but they put simplicity in the name as sort of an aspirational thing. And so, when I started showing how JSON works, he was really threatened by that. And on his website, which was well-read at the time, he complained that, “this isn’t even XML. We should find who did this and string them up now”, which was a really ugly thing to say.
>Fortunately, nobody listens to Dave Winer, so I’m still here.
Dave's done some brilliant influential stuff, which Doug credits and I've written about before, but being annoying is Dave's brand, so Doug's "SOAP" joke is dead on. It's just as fair as referring to Marc Canter's "People Aggregator" as "People Aggravator".
I did listen to the podcast later on and was pleasantly surprised by the breadth of topics, although I have to admit I fell asleep halfway through (not because of the contents).
Thanks a lot for the money quotes! Casual conversation in English is still sometimes hard for me to follow without concentrating a lot.
> a clever program that he had written for the Macintosh called Frontier, in which he had a scripting language and an outliner and a word processor and a database, all in one program
Fascinating concept. I found a bit more info in an article about UserLand Software, which Dave Winer founded after leaving Symantec.
> In January 1992 UserLand released version 1.0 of Frontier, a scripting environment for the Macintosh which included an object database and a scripting language named UserTalk. At the time of its original release, Frontier was the only system-level scripting environment for the Macintosh, but Apple was working on its own scripting language, AppleScript, and started bundling it with the MacOS 7 system software. As a consequence, most Macintosh scripting work came to be done in the less powerful, but free, scripting language provided by Apple.
> UserLand responded to Applescript by re-positioning Frontier as a Web development environment, distributing the software free of charge with the "Aretha" release of May 1995. In late 1996, Frontier 4.1 had become "an integrated development environment that lends itself to the creation and maintenance of Web sites and management of Web pages sans much busywork," and by the time Frontier 4.2 was released in January 1997, the software was firmly established in the realms of website management and CGI scripting, allowing users to "taste the power of large-scale database publishing with free software."
> Frontier's NewsPage suite came to play a pivotal role in the emergence of blogging through its adoption by Jorn Barger, Chris Gulker, and others in the 1997–98 period.
> UserLand launched a Windows version of Frontier 5.0 in January 1998 and began charging for licenses again with the 5.1 release of June 1998.
> Frontier subsequently became the kernel for two of UserLand's products, Manila and Radio UserLand, as well as Dave Winer's OPML Editor, all of which support the UserTalk scripting language.
Exactly, they could have added an extra wire type to signify messages without costing any extra bytes. Nothing is stopping Protobuf from being self-describing except incompetence.
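For context on the "extra wire type" claim: a protobuf field key is a varint encoding `(field_number << 3) | wire_type`, and only wire types 0-5 are assigned, so the low three bits do have spare values. A sketch of the key encoding (the spare-value interpretation is the commenter's proposal, not part of the spec):

```python
def encode_varint(n: int) -> bytes:
    # Standard protobuf varint: 7 payload bits per byte, high bit = "more follows".
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def field_key(field_number: int, wire_type: int) -> bytes:
    # Key = (field_number << 3) | wire_type. Wire types 0-5 are defined,
    # leaving 6 and 7 unused -- the "free" slot the comment refers to.
    return encode_varint((field_number << 3) | wire_type)

print(field_key(1, 0).hex())     # 08 -- field 1, varint
print(field_key(2, 2).hex())     # 12 -- field 2, length-delimited
print(encode_varint(300).hex())  # ac02
```

The example key bytes match real protobuf encodings (e.g. `0x08` is field 1 as a varint); whether reserving wire type 6 or 7 for self-description would have been workable is the point under debate.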