"HyperText is a way to link and access information of various kinds as a web of nodes in which the user can browse at will. Potentially, HyperText provides a single user-interface to many large classes of stored information such as reports, notes, data-bases, computer documentation and on-line systems help. We propose the implementation of a simple scheme to incorporate several different servers of machine-stored information already available at CERN, including an analysis of the requirements for information access needs by experiments… A program which provides access to the hypertext world we call a browser" — T. Berners-Lee, R. Cailliau, 12 November 1990, CERN
Web apps are somewhat backwards in my opinion. We completely lost the idea of a markup language and decided “wait, we need an API”. So we started using JSON instead of the actual document (HTML or XML) to represent the endpoints. And then we patted ourselves on the back claiming “accessibility”.
XHTML wouldn't have fixed anything. To make XHTML work on the web, browsers would have needed to make so many compromises we'd just have ended up with something even worse and less specified than HTML is now. HTML5 won because it codified all the compromises browser vendors had to make to deal with broken, real-world markup.
Sure, there are HATEOAS implementations that use HTML as a data format but the semantics of HTML are so generic and insufficient that all you're really gaining is being able to use an HTML parser instead of something easier to implement and more widely supported, like JSON.
Let me repeat that point: HTML semantics are insufficient and too generic. That is why microformats were a thing and embedded metadata was standardised in HTML5. So you can bolt on your custom semantics in a standardised fashion if you really want people to parse your markup.
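To make the "bolt on your custom semantics" point concrete, here's a minimal sketch of what HTML5 microdata looks like in practice: custom semantics expressed through itemscope/itemprop attributes, extracted with Python's standard-library HTML parser. The product snippet and vocabulary URL are just illustrative placeholders.

```python
from html.parser import HTMLParser

# Hypothetical product markup using HTML5 microdata (itemscope/itemprop).
DOC = """
<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="name">Teensy 4.1</span>
  <span itemprop="price">31.50</span>
</div>
"""

class MicrodataParser(HTMLParser):
    """Collects itemprop name/text pairs from a single itemscope."""
    def __init__(self):
        super().__init__()
        self.props = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            self._current = attrs["itemprop"]

    def handle_data(self, data):
        if self._current and data.strip():
            self.props[self._current] = data.strip()
            self._current = None

parser = MicrodataParser()
parser.feed(DOC)
print(parser.props)  # {'name': 'Teensy 4.1', 'price': '31.50'}
```

The point being: the semantics here come entirely from the bolted-on attributes and the external vocabulary, not from the HTML elements themselves.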
HTML semantics are only sufficient to describe the structure of a document. And even that they don't do very well right now (just ask anyone who takes accessibility seriously and knows the WHATWG and W3C HTML specs).
Not just people. What's the point of being able to embed MathML in your XHTML if there's only a single browser that understands it? And what's the point in using XHTML when you need to pretend it's just HTML tagsoup for 90% of your visitors' browsers?
> rather than in a schema or DTD (which used to resolve in band, even)
Except that only happened with a very small number of tools. Browsers never cared about DOCTYPE declarations other than as a signal (or at best a bunch of ENTITY definitions). And schemas provide no use other than validation.
XML is more verbose than JSON and XML Schema is more established than most of the JSON equivalents but none of that is relevant when talking about HTML. The only thing you gain from embedding your metadata in your HTML is colocation.
Truth in DOM was a dream that web developers chased for more than a decade. The reason it doesn't work is simply that rendering information is a lossy process. You either end up duplicating the same information multiple times in your markup (once as data, once for presentation) or adding layers of indirection to render or reverse the render at runtime. It's a fool's errand.
The changes don't have to be breaking, if the HTML is well designed. Adding or removing an intermediate tag is OK as long as the data tags have identifiers to avoid having to write complex, fragile XPath queries.
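A quick sketch of that point, using Python's standard-library ElementTree on two versions of a hypothetical page (the second wraps the price in an extra <div>): a positional path breaks when the nesting changes, while a lookup keyed on a stable identifier survives.

```python
import xml.etree.ElementTree as ET

# Hypothetical before/after markup: v2 adds an intermediate wrapper <div>.
V1 = "<html><body><span id='price'>31.50</span></body></html>"
V2 = "<html><body><div class='card'><span id='price'>31.50</span></div></body></html>"

def by_position(doc):
    # Fragile: hard-codes the exact nesting of the document.
    el = ET.fromstring(doc).find("./body/span")
    return el.text if el is not None else None

def by_id(doc):
    # Robust: keeps working as long as the identifier is preserved.
    el = ET.fromstring(doc).find(".//span[@id='price']")
    return el.text if el is not None else None

print(by_position(V1), by_position(V2))  # 31.50 None
print(by_id(V1), by_id(V2))              # 31.50 31.50
```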
Well-established methods for authorization that are easy to use from code, the CLI, and so on.
Basic Auth is very established and easy to use.
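"Easy" here is fairly literal: per RFC 7617, Basic Auth is just a base64-encoded "user:password" pair in a request header. A minimal Python sketch (the credentials and URL are placeholders; the request is built but not sent):

```python
import base64
import urllib.request

# Basic Auth per RFC 7617: base64("user:password") in an Authorization header.
# Credentials and URL are hypothetical placeholders.
user, password = "alice", "s3cret"
token = base64.b64encode(f"{user}:{password}".encode()).decode()

req = urllib.request.Request(
    "https://api.example.com/data",
    headers={"Authorization": f"Basic {token}"},
)
print(req.get_header("Authorization"))  # Basic YWxpY2U6czNjcmV0
```

The same header is one flag away on the command line (`curl -u user:password`), which is much of why it remains so convenient for scripting.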
(I say 'XHTML' because it's just HTML that happens to be XML, rather than actually being served as XHTML. But having it in XML obviously makes things a lot easier.)
For that reason alone, an API response format that stays constant no matter how the site looks appeals to me.
It's just that nobody uses HTML like this and hence IDs and classes in particular have turned into being almost exclusively used for visual / UI aspects of a website (using CSS).
There's nothing in principle though that keeps you from mandating that specific IDs and classes have a meaning beyond what's represented visually and that those signifiers therefore have to both stay the same and must not be used for entities other than those they were meant for.
So everything after HTML 2.0 (RFC 1866)? The STYLE and FONT elements were added in HTML 3.2, but when tables were added to HTML 2 in RFC 1942 they already included presentational attributes. Heck, RFC 1866 already included the IMG element with an ALIGN attribute, as well as typographic elements like HR, BR, B, I and TT (for monospace text).
It sounds like you're being nostalgic about a time that never was, especially for XHTML (which the Semantic Web crowd loves to misremember as being 100% about semantics and not just a hamfisted attempt to make HTML compatible with the W3C's other XML formats).
That is an incredibly uninformed view of early 2000s web content authoring. The reason people didn't like XHTML was that it provided no tangible benefit.
In fact, my experience was quite the opposite: technical people loved XHTML because it made them feel more legitimate, they just had to sprinkle a few slashes over their markup. Validators were the hot new thing and being able to say your website was XHTML 1.0 Strict Compliant was the ultimate nerd badge of pride.
But these same people didn't use the XHTML mime type because they wanted their website to work in as many browsers as possible.
> And now most web pages don't even parse in an XML parser.
Again with the nostalgia for a past that never was: most web browsers never supported XHTML, they supported tag soup HTML with an XHTML DOCTYPE but an HTML mime type. Why? Because they supported tag soup HTML and only used the DOCTYPE as a signal for whether the page tried to be somewhat standards compliant.
I don't remember HTML 3.2 being such a problem (I wasn't around for HTML 2.0) because either people didn't do such complex things with their sites, or they didn't redesign much.
I did enjoy how simple publishing documents online was back then, and I'm nostalgic for that. Though definitely not for dial-up internet!
The shift that made HTML the mess it is today isn't technology but target audience and sponsorship. The early web was mostly hobbyists and academia, people who didn't care much about layout and were fine with just having a way to make something bold. The equivalent of people writing their blog posts in markdown these days.
I'm saying you're being nostalgic because you claim that there was a point when XHTML was about semantics and not presentation. That time never was. It's true that today it's easier to use HTML for presentation than back in the day, mostly thanks to CSS and the DOM, but what held HTML back initially wasn't technical.
That said, there were numerous progressions and many of them overlapped. Flash and Java applets were infinitely worse than the interactive blobs of web technologies we have today. Table layouts were followed by a second semantic renaissance led by the CSS Zen Garden (which for the first time really popularised the idea of separating markup and presentation).
The common sense of the user is that either the information on a website is free or it is not free. But "web APIs" have tried to blur this clear distinction. The information is free only if taken in small amounts. Consume too much, too fast and it is not free.
It is the Google model. Allow users to search public information on the web that Google collected for free from websites and has stored on its computers. But users may only query this public information in small amounts, and not "too fast". Otherwise users get blocked.
Now some websites might claim they need the ability to rate limit because if too many users started accessing their website at Googlebot speeds, the website's performance would degrade. But can Google make that claim? Is their infrastructure really that brittle? We are continually bombarded with PR that suggests Google is state of the art.
Is this really the era of "big data"? Users are restricted to very small data. The solution to any website's performance being degraded by "too many" requests is to provide bulk data. For example, pjrc.org, where users can buy Teensy microcontrollers, makes the entire site available as a tarball for users to download. The SEC traditionally provided bulk access to filings. There are countless other examples.
Is this really the age of Artificial Intelligence, Machine Learning, etc.? Teenagers and companies around the world build robots and the press gets excited. Yet websites block requests out of fear they are coming from "(ro)bots"? Is there something wrong with automation? (The majority of requests on the web are indeed from bots; using software to make requests, as Google and myriad other companies do, is far more efficient than manual typing, clicking, swiping and tapping.)
Summary: The "Web API" is nothing more than another senseless urging for users to "sign up" to receive free information. Not every "Web API" user is an app developer submitting to an app store, nor are they necessarily running a "competing" website. Every time a website collects unnecessary "sign up" credentials, it is one more unnecessary risk for the user that those credentials will be leaked.
As for rate-limiting usage of a provider’s services, that’s what the terms are for. They’re providing the infrastructure and directly covering the costs. It is an agreement between you and them to abide by their rules for what fair access is. If you disagree, no one is forcing you to pay and/or use that service. You may use another service or even start your own if you feel you can provide a better service. What you can’t do, however, is expect that these services provide some minimum threshold of computation power, especially at their direct cost. Not unless it’s in the agreement you signed with them.