I came here expecting to find a mention of Dave Winer. In geekland, I believe he belongs in the pantheon of underappreciated shepherds. In 2000, Dave began crafting OPML [1] to provide a means of sharing lists. He composed his weblog using an outliner, marking the first instance someone demonstrated to me that structure could support both thinking and writing.
Admittedly, Dave used to be rather cantankerous, which turned many people off. However, when I was 17, I ran into him in a pizza shop near Harvard, said hello, and we ended up spending the day together.
In 2003, he was working on RSS 2.0, and I contributed a Comments namespace to enable the syndication of comments. This occurred in a more idealistic and less spam-dominated web era. I can still remember the feeling I got when I hit Send to share the proposal with Dave and the community. [2]
Their human-readable and easily understood nature is a byproduct of a time when those qualities mattered. I recall rushing to upload an RSS module to CPAN. It was not simply a conglomeration of other libraries but rather something entirely crafted in my mind.
OPML and RSS are both awesome. They are two standards that have played instrumental roles in maintaining the free flow of information on the web. They were not created by a committee or the IEEE. They were created by someone who was passionate as hell about a free and open web. That's rare now.
On a related note, it’s one of my biggest irritations how it’s becoming more and more difficult by the day to find the RSS feed underlying any given podcast.
To me, a podcast was always just an RSS feed of type audio. If you don’t provide that feed, you’re not a podcast. You’re something else.
These walled gardens are rolling in and claiming territory over what was a beautiful bastion of freedom. I hate it so much.
What podcast client do you use? Every podcast in the iTunes directory has a public RSS feed. If you're having trouble figuring out the feed from that directory, you can use a tool to show you the link from the podcast's directory page. For example, paste Smartless' page https://podcasts.apple.com/us/podcast/smartless/id1521578868
At this point, just about the only podcasts that don't have public RSS cost money, now that Spotify has abandoned its exclusives strategy. If you were subscribed to a public feed that moved to Spotify exclusive, it's probably started repopulating in the last year.
Does anyone know if it is possible to get an RSS feed for a podcast on Spotify? If so, how do you do it in Spotify (or some other way) to see a podcast's RSS feed?
If it’s hosted by Spotify, the easiest way is to plug in its Apple Podcast URL to a tool like the one I linked. Try Joe Rogan’s for an example of this (his RSS is hosted by Spotify’s Anchor and every podcast index has that link.)
If the podcast is just distributed to Spotify, as most are, then the feed will be hosted somewhere else.
The output of yy084 is input to SQLite. For example,
echo k12 educational software|1.sh|sqlite3 1.db
echo select url from t1 where t1.query glob \'\*k12?educational?software\*\'|sqlite3 1.db
https://k12techtalkpodcast.com/feed.xml
du -hc /usr/bin/yy025 /usr/bin/yy084 /usr/bin/nc
48K /usr/bin/yy025
128K /usr/bin/yy084
84K /usr/bin/nc
260K total
readelf -d /usr/bin/yy025 /usr/bin/yy084 /usr/bin/nc|grep .
File: /usr/bin/yy025
There is no dynamic section in this file.
File: /usr/bin/yy084
There is no dynamic section in this file.
File: /usr/bin/nc
There is no dynamic section in this file.
Of course this could also be done using curl and jq or other contraptions.
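A rough curl and jq equivalent, using Apple's public iTunes lookup endpoint and the SmartLess ID from the directory URL above (this assumes the lookup response still includes a feedUrl field for podcast entries):
curl -s 'https://itunes.apple.com/lookup?id=1521578868' | jq -r '.results[0].feedUrl'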
du -hc /usr/bin/curl /usr/bin/jq /usr/lib/libcurl.so.4.8.0 /usr/lib/libjq.so.1.0.4
260K /usr/bin/curl
32K /usr/bin/jq
740K /usr/lib/libcurl.so.4.8.0
372K /usr/lib/libjq.so.1.0.4
1.4M total
(Opinion: There should be more open repositories of these feeds and other metadata not controlled by commercial entities like Apple.)
<sarcasm>I'm sure that everyone who listens to podcasts can figure out how to do this.</sarcasm>
The obfuscation of RSS from podcasts (or complete disconnection from it) is a problem. I'm in the "if it doesn't have a feed it isn't a podcast" camp personally. I have no problem with there being login-only feeds for paid subscribers, or even feeds that are only teasers to sign up, but RSS is a core feature of podcasts. If there's no feed, then it's a subscription channel, not a podcast.
Something that might be useful is if RSS feeds were mirrored, so that there would be multiple URLs from which a feed could be retrieved. This would allow aggregators to fetch the feed from multiple alternative sources.
"The data presented on this website is derived from a collection of 155785 podcasts, with a total of 17648609 episodes, sourced from the iTunes and fyyd podcast directories."
Those are public resources. Anyone can access them for free.
The second "S" in RSS stands for simple. Why make a "simple" standard. Perhaps because it creates flexibility for everyone who wants to publish a podcast, everyone who wants to host a podcast, and everyone who wants to search and retrieve a podcast feed. NB. RSS is focused on feeds, which are nothing more than marked up text files. It does not address listening to a podcast. Listening requires an audio player. A podacst listener may choose any audio player they prefer. RSS feeds just provide a URL for the audio file. Listening is outside the scape of RSS.
Because RSS simplifies publishing, hosting, searching and retrieving feeds, there are endless options for how to do these things, e.g., how to search and retrieve a podcast feed. This means that the way one person searches and retrieves a podcast feed may differ from the way another person does it. There are hundreds of computer languages that could be used to search and retrieve, dozens if not hundreds of existing programs that could be used, and countless different approaches to accomplish this relatively simple task.
Hence, the way that I search for a podcast feed may differ from the way someone else does it, and vice versa. One could use relatively tiny, simple programs written in minutes by potentially anyone, even a naive end user like me, or one could use relatively large, complex programs written by multi-billion dollar companies that provide advertising services or by their "non-profit" business partners, e.g., Mozilla. These programs are so large and complex that almost no one outside of the companies wants to modify them, even when they dislike what the programs do. In case of any doubt, in the preceding sentence, "programs" means "modern" web browsers.
Thus a benefit of keeping a standard "simple" is that it allows for "There is more than one way to do it." Gigantic, complex programs and the involvement of multi-billion dollar companies are not a requirement. Ironically, Apple, a 350+ billion dollar corporation with a closed-source web browser running on non-free operating systems, has probably the most complete directory for locating RSS feeds. Go figure.
Joe Rogan ceased his podcast the moment he signed with Spotify. The Joe Rogan Experience lives in a walled garden and I stopped paying attention to it as a result. Did the move get Spotify more subscriptions? Maybe. It turns out he recently renegotiated his contract to include other outlets, which is telling.
I don't use podcasts, but I always found it strange that people speak of podcasting as this open thing (because of RSS) while the audio files are usually hosted inside walled gardens (?). The RSS file is nothing but a link to the audio file, right? Or am I missing something?
There's metadata as well, title, episode description. Everything is just a standard RSS item except for the podcast specific stuff. You can read it with a standard RSS reader.
Regardless of who is hosting the audio, if you can get the feed, the audio files are publicly available without an account.
The podcaster can put those files on any server. You can host yourself (e.g., with WordPress) or pay someone else to do it for you. The apps you listen with pretty much don't host the audio, though there are technical exceptions.
There’s nothing walled garden about the file storage IME. You could use an IPFS link if you wanted to…though without a proxy I doubt most clients would be able to download it.
"On a related note, it's one of my biggest irritations how it's becoming more and more difficult by the day to find the RSS feed underlying any given podcast."
It is unclear if this statement is intended to mean it's becoming more difficult specifically for the author, or more difficult in general, for everyone.
Please offer some examples of (free) podcasts to support the statement showing how it is "more difficult" to find the .rss/.xml feed.
In return I will demonstrate how to "easily" find the feed for the examples given.
It's a pretty unreasonable request. Unless parent, after noticing the trend, began to keep track of each instance in order to write a blog post or something, they're not going to recall any unless there's a particularly stark and irksome example that they still can't access.
Personally I agree. This is getting harder and harder to do. Often there's a way to finally find the rss if you dig enough, but I remember breaking out a web inspector in the last year to do it.
This is effectively blocking 99% of people from pulling these things out of walled services, which is the important question, not whether it is technically feasible to retrieve.
Even ignoring that anecdote, if someone has to find a "convert iTunes link to RSS" website, it relies on them even knowing what RSS is, knowing that iTunes podcasts have an underlying RSS feed powering them, etc.
RSS used to be something that only the marginally more literate than usual understood how to use, and was ubiquitous. But if this trend continues, it'll be more akin to plan files.
When it talks about XSL stylesheets, which in theory you can use to have all your documents be XML and transformed into HTML, it’s worth noting that this is a part of browsers that’s been neglected for over a decade, with the minimum of maintenance to keep it still roughly working, and it shows. The failure modes are bad, and a lot of it is very difficult to debug. It’s like opening a portal to how you’d develop pages fifteen or more years ago.
The worst problem I know of is that loading an XML file with an XSL stylesheet into a new tab in Firefox just hangs. Has done for years, and I still haven’t searched for a bug report about it (surely someone’s already reported it). Try opening https://kmaasrud.com/blogroll.xml in a new tab (Ctrl+click or whatever), and observe the correct document title, but how the actual document area remains blank and the status bar says it’s still “Transferring data from kmaasrud.com…”. A quick Ctrl+R and it loads just fine the second time.
XSLT is also so much more than just a way to render html. especially version 2. the browsers afaik are still XSLT 1 which is kinda limited.
back in the day i built so many nice solutions for both web and internal services using XSLT as a key component. i also found it interesting that it was pretty easy for less experienced devs and particularly non-devs to pick up and make changes to. declarative programming has been much neglected sadly.
interesting!
way way back in the early ’00’s (like pre-9/11 iirc) i built a toolkit on js/xml/xslt in IE 5.5 to transform data into UI controls. it wasn’t quite React but it did form the basis for most of the UI of the startup i was at during that time, which was a web-based radiology product.
unfortunately it was much harder to convince an employer to open-source support libraries and tools that weren’t the primary IP of the company, so it’s been lost to posterity (not that anyone would want to use it today!)
still, good to know there’s at least some support for XSL in modern browsers.
XSLT was one of the things that brought me to React indirectly over a decade ago. I wanted to run something like apply-templates (components) and that got me looking at XHP. That was the inspiration for JSX which became popular via react. Ironically I don't even use jsx anymore now that javascript has caught up, and a lot of what react does was really appealing from the declarative side. Weirdly so many of the "experts" are pushing very non-declarative approaches to it now. I'd love if xslt could achieve its stated goals, but like the rest of xml it's too academic and ignores obvious practical requirements.
What I like best about XSLT is that it reminds me that someone thought this was a great idea. They went so far as to declare it as “the way” and everyone should use it everywhere. There were conferences, books, blogs, everything. Anyone who criticized it was harshly rebuked and criticized. It isn’t just XSLT. The highway of IT progress is littered with these. XML, XSLT, CORBA, JEE… It’s helpful to keep in mind that nothing has changed, the current ones just haven’t blown up yet.
We need a word for these snake-swallowing-its-tail indirect abstractions we keep building. I have problem X, so I solve with abstraction a. Then I have problem Y, so I build b on a. Then to solve Z I build c on b on a. Then we discover a problem that is actually just X again, but rather than use a to solve we build d on c on b on a to solve it.
I remember installing enterprise java beans to support my soap wsdl 16 years ago. Meanwhile shopify built a basic Rails app using http verbs and killed the product I was using.
Ever since I played around with xhtml/xslt ~15-20 years ago when I was a lot newer to programming, I've always had the sense that it was a great way to do things, and that most sites should be using it. The last few days I've been playing with it again, and... I feel even more like that's true.
Even in its neglected state, it's got tons of features. You know how when Hacker News is having server issues, it tends to still work when logged out? Well you can just always serve the logged out page so it's cached for everyone, and then in the xslt template you can do an include of the user's data (that's right, you can directly include and query other xml documents in xhtml/xslt), e.g. <xsl:variable name="myinfo" select="document('/app/users/myinfo')"/> or something, and set that URL to private cache or no-cache, and then in the parts where you need the page to be different for the logged in user, you can e.g. compare '$myinfo/user/@id' to '@authorId' of a comment to decide whether to show an edit/delete button, etc. You basically get graceful degradation by default if it can't fetch that myinfo document. XPath is like jq that's been built into the browser this whole time.
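A rough sketch of that pattern (the /app/users/myinfo URL, its user element, and the comment markup here are all hypothetical):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- fetch the per-user document; the cache headers on that URL control freshness -->
  <xsl:variable name="myinfo" select="document('/app/users/myinfo')"/>

  <xsl:template match="comment">
    <div class="comment">
      <xsl:value-of select="body"/>
      <!-- show the edit link only when the logged-in user authored this comment -->
      <xsl:if test="$myinfo/user/@id = @authorId">
        <a href="edit?id={@id}">edit</a>
      </xsl:if>
    </div>
  </xsl:template>
</xsl:stylesheet>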
The same thing can be used for things like subreddit side bars that are common across many but not all pages. You can even do an xsl:copy-of and serve it as xhtml. No need to send the same data over and over with SSR. No need to do client side routing or have any frontend tooling. It's all built right into the browser. The code is concise (it's verbose in the sense that it's xml, but it's still declarative) and easy to read. You can have a clean separation between backend sending the data in an xml model that makes sense for the domain, and then frontend having templates to present it.
The downside is of course that it all runs before the html for your page is produced, so you can't use javascript inside of xsl (unless you run a separate xsl processor in javascript), and it's about the same as SSR in terms of how dynamic it is (i.e. not interactive after page load, though you can of course have javascript in the output html). That and there's no info out there about how to use it because no one uses it, so you have to be a little creative. But it's been quite fun to see how much I can milk out of client side static template rendering using a technology that browsers haven't even bothered to update past 1.0 from 1999.
The first time I encountered XSLT I thought it was the most ridiculous thing to ever have been invented. A programming language written in XML?
but damn, the things you can do with that thing in a tiny bit of code. It was doing expression filtering years before anyone else, including the likes of Haskell.
yes, the learning curve can be a problem initially but, once you grok it, it opens up a whole world of interesting possibilities. and i do think, once you have created your initial transforms, it is then much easier for non-tech folks to "see" how things work and make tweaks/changes without having to mess with "code".
Ad-hoc and framework JavaScript rendering JSON into the DOM is the modern way, don't you know? There's a reason for this: JSON is easier to produce than XML in many server-side systems.
Maybe what's needed is an extension to XSLT/XPath to consume JSON and transform it to XML.
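For what it's worth, XSLT 3.0 did add something along these lines: json-to-xml() turns a JSON string into an XML representation you can template over. Browsers don't support 3.0, so this sketch assumes a standalone processor such as Saxon, and data.json is a placeholder filename:
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fn="http://www.w3.org/2005/xpath-functions">

  <!-- invoked as the initial template, so no source document is needed -->
  <xsl:template name="xsl:initial-template">
    <xsl:apply-templates select="json-to-xml(unparsed-text('data.json'))/fn:map"/>
  </xsl:template>

  <!-- turn each top-level JSON key into an element named after the key -->
  <xsl:template match="fn:map/*">
    <xsl:element name="{@key}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>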
yes. back in the day i would tend to wrap my database api in xml first in a standardised way and then i could just use xslt to return json or other formats from the api depending on the mime types requested by client. this fits in nicely with fielding's REST principles. all those technologies (HTTP/REST, XML, XPath, XML Schema, XSLT) actually work very nicely together and allow you to build nice systems that are very flexible, easy to integrate with and easy to change, even though they can be hard to make fast. maybe "move fast, break things" was a bad idea? =)
yes, there's definitely a bandwidth overhead with xml compared to json. that was an important factor in the shift i suppose. but then we have things like graphql, which has pretty much erased all those gains. you can also do tricky things like using binary on the wire and transforming at the endpoints.
it would be nice to see if fundamental xml tech can be improved in a new hardware and software environment. afaik, there hasn't been much if any innovation in that space in recent times. would love to know if there has.
I wasn't mature enough to grasp xslt at the time (the match construct confused me, and trees everywhere wasn't natural yet), but while you talk about corba and jee, I've had a few moments in recent years where I thought all the json/rest/microservices are hand-knitted remote beans..
well yes. i agree re corba, j2ee etc. but i think there were a lot of very nice technologies around xml that enabled automatic discovery, interop, validation etc. and we have ditched them all because there was also a lot of unnecessarily complex baggage and tooling that came with the ecosystem around them.
i even thought SOAP was quite nice if you used the message oriented flavour instead of RPC flavour. REST was imo better and won out, but then nobody really does REST today and we seem to keep on re-inventing the wheel. am watching the htmx discourse with interest.
I was at IBM during peak SOAP. It was crazy how much business pushed for abstraction for abstractions sake in this domain. I'm really hoping htmx helps people get their heads on straight with REST. All these scores of devs have been writing `POST /item {"method":"replace","id":12}` or `POST /item {"method":"delete","id":12}` for decades now.
yeah. RPC masquerading as REST is what people do now.
if your web service had a truly RESTful design then all you would need to do is point something like postman (or a browser) at it and it would be able to automagically figure out the whole set of available resources and interactions instead of having to use all this crazy swagger/OpenAPI stuff. there's a whole cottage industry around this api stuff rn.
i guess this is what happens when you put "programmers" in charge of a hypermedia system. everything becomes an RPC and nothing is standardised.
My day job combines assembly and XSL (GPU dev work can be weird). I have a soft spot for XSL; but, at this point, the combination of Python 3's performance story (concurrency, threading, multiprocessing), ElementTree, and itertools... I just don't see it anymore. Especially since any new hire has Python experience due to the current ML summer. I've been porting all my XSL to Python, and I couldn't be happier.
I still find it hard to write transformations nicely in normal programming languages. I have actually used it to transform structured formats and am always quite happy (using [0]). Also generated spreadsheets and docs (actually via markdown and pandoc) just 2 days ago. xsltproc is for me the awk/sed of structured data.
yeah, making XML/XSLT fast is definitely a hard problem. still in pretty wide use in enterprise systems where correctness and interop are bigger priorities. i did quite a lot of work with shipping api's recently and you would be surprised how many of them are still XML and even SOAP based.
i'm intrigued by the combination of Assembly and XSL - any pointers to what you use those for?
> Try opening https://kmaasrud.com/blogroll.xml in a new tab (Ctrl+click or whatever), and observe the correct document title, but how the actual document area remains blank and the status bar says it’s still “Transferring data from kmaasrud.com…”.
It opened fine in Safari (macOS), and it opened fine in Chrome. Firefox did not apply the XSL file.
I tried to "view source" on both Safari and Chrome, and neither gave me the option. I could Inspect Element, and I got the HTML DOM, but not the original XML source. On Firefox, it showed me the XML file.
This surprised me. I was under the impression that the browsers had effectively abandoned the XML/XSL concept. Apparently Firefox has.
It's also a shame the browsers stopped at XSL v1 as well, v2 is much better.
But, the browsers basically said "we're not in the XML ecosystem, we're in the HTML ecosystem".
The idea of downloading XSL, and then having a blog post that was little more than just content, with a wee bit of meta data, with all the chrome rendered locally via the template, I find that idea compelling. One less thing to download. More stuff to cache on the local system.
I had grand visions of an XProc pipeline terminated in the browser itself, but when I found out the browsers weren't really playing along anymore, kind of took the wind out of its sails.
It did for me. The "This is a list of blogs and news sources I follow." part of the page as rendered comes from the XSL file (opml.xsl). The actual list is rendered. I did not check whether the XSL was applied correctly, but applied it was.
> On Firefox, ["view source"] showed me the XML file.
Can confirm.
> The idea of downloading XSL, [...], I find that idea compelling.
> [...] but when I found out the browsers weren't really playing along anymore, kind of took the wind out of its sails.
Sad, yes, but for me Firefox did download the XML, the XSL, applied the XSL, and rendered the resulting page. I dunno why it didn't for you or GP, but maybe it's an add-on interfering? Do check, and please report, because I'm curious if the issue is add-ons.
Did you open it in a new tab? In an existing tab, it loads fine.
On reflection, it’s also quite possible this is a platform-specific bug. I’m on Linux. I think I experienced it in Windows a few years back, but the memory is fuzzy.
… hmm, apparently if I start a blank Firefox profile this doesn’t happen either. Though I know I’ve encountered this with at least two completely distinct Firefox profiles over the years. This bears closer investigation.
Anyway, it’s still indicative of the disrepair of this feature set. It’s still used widely enough that I don’t think it’ll be torn out, but honestly on technical robustness grounds it could warrant being removed.
It works fine for me, with a current FF on Linux (Ubuntu), opening in a new tab. I have lots of extensions enabled.
But I do share your frustration: the modern web seems unusable, unless you have a collection of extensions that block ads, annoying consent forms and the likes. But once you have these extensions installed, you basically lose all support.
Sometimes stuff breaks, and it works in a new profile even with the same extensions installed... and then what? You're supposed to throw away your profile with bookmarks, CAs, years of history etc.?
>But once you have these extensions installed, you basically lose all support.
It's gotten to the point where I have multiple browsers as fallback. FF with lots of essential extensions > Brave with just a few > Unextended Chromium. A necessarily ridiculous state of affairs, but there you have it.
> creating a recommendation system that is based on concious curation, not statistical metric
That's good, but that's still a recommendation of authors/channels, not articles. It's still who vs what. For me the concept of "subreddits" is the best we've achieved in terms of social sharing. It was about what over who. You picked a subject, followed a subreddit, and found articles, videos etc... about that subject. Conscious curation on specific subjects.
I still use the old.reddit.com, but few non-tech people do and the primary Reddit interface has pushed people away from the platform and a lot of the smaller subreddits are drying up.
I understand why the changes were made and it brings the question of how can we create social sharing based on subjects that is either distributed (but easy to use by non-tech people) or centralised and profitable?
For those who don't know, Reddit provide feeds at URLs like `/r/foo/.rss` (although they're actually Atom...)
Annoyingly this doesn't work when combining subreddits, e.g. `/r/foo+bar+baz` is HTML containing posts from those three subreddits, but `/r/foo+bar+baz/.rss` only includes posts from one :(
What I would really like to see is “subscribeable OPML feeds”, so GitHub could provide an OPML feed for what shows up in your home page, and any changes in your subscriptions (repos you watch etc) would change the OPML, which would then cause your feed reader to unsubscribe/subscribe to specific RSS feeds.
Unfortunately, this isn’t supported by the majority of RSS clients (tt-rss is the only one I know) which means in practice OPML feeds are merely import/export mechanisms.
So many memories. That was a big part of my first startup in 2005. Grazr was the “next level of rss”. Unfortunately the first level never quite took off.
Sometimes I feel I am the only person in the world who likes XML. It just followed the trajectory of all formats, where it is used in places it shouldn't have been used.
It is moderately readable and writable, and the tooling is great. Whenever I have to write it Emacs verifies the doctype for me and handles the structural part of it.
And, as the document shows, xslt makes it easy as hell to scan the contents of a file.
OPML is a good example in my opinion. I used it maybe once a year and it has never failed me
JSON can be a valid choice, as can XML, but I feel that the decision about which to use is too often based on fashion rather than choosing the best tool for the job. I wish this was different but there seems to be something structural in web development that favours the new over the proven, regardless of the circumstances.
I’ve never liked XML, per se, but I used it extensively, for decades, and even got fairly good with it.
These days, I mostly use JSON, but XML is pretty much an ironclad data definition and transfer protocol. You can define and transfer just about any type of data, using it, albeit, in a rather “prolix” manner.
> but XML is pretty much an ironclad data definition and transfer protocol.
I suspect that may be why you might not have liked it. XML is not a good serialization of data for network protocols. XML is a good serialization of documents. Ok, that's the received wisdom that I'm echoing, but it's also my experience and my opinion.
When it comes to serialization of data for network protocols there are and have been many other better-suited schemes. XML got used as a serialization protocol for the web because it's what existed at the time that was... close to HTML and textual, but it's got the disadvantage of being verbose.
Yes and no. You are correct about it being a document protocol, but it has long been set up as a big document protocol.
Most XML parsers are structured to parse and deliver XML in element-delimited packets, in asynchronous fashion. I don't know of many JSON parsers that can do the same. They do exist (I use one, in the backend of one of my projects[0]), but they aren't as common. XML has packet/async built into OS SDKs, but JSON tends to be "The Whole Nine Yards" handling.
With Big Data/ML, I'm surprised that this is still a thing.
> but it has long been set up as a big document protocol.
[Meaning stream parsing.] Yes, but typically one does not need that for small messages. I see that streaming decoders for Protocol Buffers are a thing now, but historically one does not bother with streaming for small messages, instead one streams lots of small messages.
> I don't know of many JSON parsers that can do the same.
libjq has one. With jsonlines or similar, if each text is small, there's no need for streamed decoding. Typically a DB query will produce a sequence of lots of small JSON texts, not one very large one.
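For the command-line tool, this surfaces as jq's --stream mode, which emits [path, value] events as the input arrives instead of building the whole document first (the URL below is just a placeholder):
curl -sN https://example.com/big.json | jq --stream -c '.'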
If you're updating a DOM from XML then stream decoding makes sense, but in many cases streaming isn't ergonomic, just necessary when dealing with large documents (e.g., if there's not enough memory to hold them in memory without thrashing).
XML is such a versatile format, and I wished it was used so much more. It doesn't have exactly the cleanest syntax, but it would be so much better than JSON in some of the cases I've seen. Especially when you are transferring document-type data. Why use JSON to represent rich text, when XML is infinitely better?
XML was/is great. It was the issue with people abusing the shit out of CDATA and comments to do meta programming inside the XML and making it an absolute nightmare.
I believe it was an additional reason why comments were excluded from the JSON spec. I can't find the exact quote but Crockford's comment about excluding them for parser directive reasoning is pretty bang on considering usage of JSON primarily as an interchange format (and evidence of a good decision based on its staying power).
It's not just you. I think web development would be in a much nicer place today if we spent the last 20 years improving XML and XSLT rather than abandoning it for JSON and client-side JS.
People are starting to realize that most sites really boil down to parsing server state and rendering DOM. We never needed to do all this nonsense with serializing all state to JSON and shipping the entire rendering pipeline to the browser, that was just a heavy handed solution for a very specific scaling issue at Facebook.
Just as a thought exercise, if XML had been the de facto interchange format, how much would that have added to the historical bandwidth transfer of the Internet? Even 1 KB added to every AJAX call would add up pretty significantly pretty fast, I'd imagine...
Obviously JSON isn't well optimized either but I wonder how much, if any, progress might have been slowed by XML syntax clogging the pipes even more.
Resource requirements expand until they hit a user-noticeable limit. Even ultra-compressed every-bit-counts encodings[0] would be ignored and abused until they're bloated to a user-noticeable limit. Or the extra bandwidth would be used for more video ads.
> how much, if any, progress might have been slowed by XML syntax clogging the pipes even more.
Depends what you mean by "progress", and if you think Web development has been improving or devolving over time.
XML is definitely more verbose than JSON, though I'd be very surprised if an average content-heavy site would be smaller with something like JSON + react. I'd be surprised if server components tipped the scales either given that the server state would still be shipped as HTML and/or a virtual dom representation.
I liked XML 1.0. I gave up after getting tired of the standards community not prioritizing users - the thickets of interdependent specs, dearth of good documentation, and critical lack of work on quality implementations of the standards (e.g. no decent editors except $$$ oXygen, libxml2 and Xalan never implementing XSLT after the 90s, often missing or conflicting examples of anything non-trivial, etc.). I really wish there’d been an effort to focus on the basics so there wasn’t such a gap between the vision of the standards committees and the lived experience of most users.
I love love love XML, but when I encounter an effort to use it to carry presentation like HTML alongside executable code -- such as Apache Jelly -- I regret the choices that brought me to that place in my life.
I wouldn't call OPML a good example of XML for reasons I detail elsewhere in the thread. But if you need a subscription list of feeds for import or export it's alright.
OPML is useful because people use it, but it was a bad design decision to store long blocks of HTML inside XML attributes instead of XML elements. Look at the escaped HTML inside these "text" attributes:
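The pattern looks something like this (a made-up illustration, not copied from a real file):
<outline text="&lt;p&gt;Read my &lt;a href=&quot;http://example.com/&quot;&gt;latest post&lt;/a&gt; about feeds.&lt;/p&gt;"/>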
Because the HTML is stored in attributes, it can't be wrapped in a CDATA block. There's also no limit to how long the text can be.
OPML is also a moving target without a standard. Any time there's a desire to represent something new with the format, attributes are added without any public participation from existing implementers. The value of the "type" attribute on outline elements determines whether new arbitrary attributes can be present.
Because of this unorthodox extension process, the OPML format has become a catch-all for unrelated uses: outlines, blog posts, RSS/Atom subscription lists, programming source code, and more.
All of the uses could've been represented as XML formats. We didn't need OPML for storing and transmitting arbitrary data; we already had XML for that.
The original thing OPML was created to do -- represent outlines -- is severely hampered by the fact there's no way to store the expansion state of collapsed nodes of an outline.
> I believe the simple fact that there is a known person behind each recommendation is advantageous.
I think this is key to OPML gaining traction again.
I myself added an OPML list to my website (see bio) which contains hundreds of independent blogs. You can all download it for free. Most of the blogs I found are from here, but I've been doing that for two years straight, so there might be blogs you don't know. They've been roughly selected for quality.
edit: OPML file can be found at the bottom, behind the link 'Or see all shared links', then click the OPML icon
A side question. I personally believe that RSS is a failure, and I don't know why. There are easy comments and responses to this, but I am looking for something deeper - an analysis that would show how it could be changed and adopted on a wider basis.
My context:
When Apple introduced the iPhone and it was a success, almost everyone introduced an "iPhone Killer", the Zune being a prime example. They were announced with much glee and anticipation. They were not products of flimsy development and thought processes, yet they became serious failures for companies like Microsoft. No one, that I remember, was able to point to compelling reasons why the new products would fail, but they inevitably did. Over and over.
How and why did these phones fail? If you have read the book "Flatland", you perhaps came away with a vision of how creatures living in two dimensions would perceive actions that used a third dimension. As I did. Things would magically appear and disappear in a way that would leave the Flatlanders completely perplexed. This imaginary "missing dimension" experience reminds me of how bewildered people were that the Zune and other iPhone killers failed.
What could that dimension be? My guess is the human dimension. I don't mean in trivial ways like user-friendliness; I mean rather as a way for people to find it useful and helpful.
My question then, if anyone wants to try, is what would one have to do to RSS to fix it?
[Note 1] Oddly, the OPML "blog roll" example resonates with me much more than RSS ever has.
[Note 2] I have repeatedly tried to use RSS and failed. Both as a provider and subscriber.
iPhones were expensive, so iPhones were mostly used by wealthy people. That made it attractive for companies to make iPhone apps: not because of market share, but because it was targeted to those willing to pay. Software developers were happy for the busy-work, and once a company had its own iPhone app they might as well mention it in their marketing. The public saw all this focus on iPhones and concluded that they must be more popular than they actually were (i.e. the only thing more desirable than a status symbol, is a status symbol you think everyone except you already has!). That boosted sales, making it more popular but reducing its correlation to wealth. At that point it was firmly established culturally, technologically, economically, etc. which makes it attractive via the usual market share arguments.
When Teslas came out in the model S format, they were the “iPhone of cars”, not just because of wealth (which is a piss-poor argument for why something succeeded btw), but because of a fundamental improvement in the essence of the category
If “status signaling” was really as significant an influence as you say, then every single original Macintosh would’ve been a huge hit and clearly was not. Every single Powerbook and Macbook as well, since those are more visible and portable. If you want to talk “fashion accessory,” then there are plenty of phones that were expensive as well at the time that did not take off either.
The reason why Apple was finally successful, is because Steve Jobs took a calligraphy class for fun instead of an engineering class, and all those things like the Zune were designed by engineers, first and foremost. Holistic thinking is not rational.
It's not so much that OPML is the interesting part here, it's that it's a file. A few weeks back Andrej Karpathy had a twitter thread[1] about blogging software and shared this link on 'File vs App' - https://stephango.com/file-over-app - and that really was great for ecosystem interoperability. I can download the file using whatever tool is appropriate, store it however I want, and then upload it somewhere else using whatever tools is appropriate. I have the OPML export I took of my subscriptions from the day Google Reader shut down and there's still a fighting chance that other services could actually import that file.
It's also worth noting that OPML is only the container format here. Agreeing on a container format is obviously important and we won't get very far for interop if we can't even agree on the container format, but OPML is supposed to be a generic tree of 'outline' format, and conveniently RSS subscriptions (and folders) look like a tree.
I sorta expected that there would be a second standard that says "here's how you use this generic OPML container format to represent RSS feed subscriptions" but oddly that's actually included right in the OPML spec[2]. In fact RSS subscriptions are the only application format defined in OPML - there's a 'type' field defined for the <outline> element, and if type is set to 'rss' then there's also a required xmlUrl of the feed and optional things like the html link for the blog and the version of RSS used. This is the data and part of the spec that makes the actual subscription list exchange work.
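A typical subscription entry, under that part of the spec, looks roughly like this (the URLs are placeholders):
<outline type="rss" text="Example Blog" xmlUrl="https://example.com/feed.xml" htmlUrl="https://example.com/"/>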
But again the only entry for 'type' defined in the OPML spec is 'rss'. If you want to use OPML as a container for something else, like Youtube subscriptions or Twitter followers, you of course can but you gotta find some way to get everyone to agree on how to interpret the 'type' you set for that <outline> element. And as far as I know, no one's done anything like that for any other domain.
So it'd be awesome if more domains defined 'type' fields and set out some specs so I can export my video streaming subscriptions or Amazon wishlist or whatever but without defining more 'type' fields OPML is really not any more interesting than a CSV of URLs.
If someone wanted to extend OPML into another domain, even if they got others to agree on their proposed type value and the new attributes added to support that type, there's nothing to stop a collision with somebody else choosing the same attribute names.
There also is nothing to stop the author of the OPML specification from opposing the new type.
It would be far easier to create a new XML format.
I seem to recall that OPML wasn’t as interoperable as it should have been and there was no guarantee an OPML file from one app could be used in another app. Has this improved?
I don't recall having issues with that on the handful of occasions I used it.
It's not a particularly hard format to parse. Basically it's just a list of stuff. Boggles the mind that you would use a different format for exporting a list of URLs that point to lists of stuff than for those actual lists of stuff. Either way, parsing a list of stuff is not rocket science. But you could just use RSS or Atom for this.
> Boggles the mind that you would use a different format
My understanding is that Winer's main interest has always been outliners, and RSS etc. were just fortunate side excursions, so I find it about as boggling as the phenomenon that people with hammers are likely to be pushing Open Nail Markup Languages.
But semantics… RSS is flat and items are usually ordered by pubdate. OPML is better suited for relatively static collections of things organised in a tree. Using OPML to store blogrolls makes sense (but is not the only viable option of course).
I have an observation on the original use of OPML, i.e., for outliners (the one I utilize frequently): many recent outliners (Logseq, Roam) added custom properties/fields (for DB-like block-level queries), but they don't seem to support them when importing or exporting OPML. The only "field" that is somewhat supported is the "_note" [1].
Tangential: if you want to play around with XSLT, I've made a little tool for that a while ago: https://xsltbin.ale.sh/
(Known bugs: indent="yes" doesn't work on Firefox, and unfortunately I couldn't be bothered with adding a separate formatter, so probably try it in WebKit/Blink browsers)
I would like to share my OPML, but at the same time I would like some of my subscriptions to stay private. It would be cool to have some elisp or shell + xmlstarlet script to export only the things I would feel comfortable sharing.
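A rough shell sketch with xmlstarlet, assuming the private feeds sit under a folder outline named "Private" (the file and folder names are placeholders):
# drop the whole "Private" folder (and everything nested under it) before sharing
xmlstarlet ed -d '//outline[@text="Private"]' subscriptions.opml > shareable.opml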
[1] http://scripting.com/manila/opml/spec.html
[2] http://scripting.com/2003/08.html#When:7:12:20AM