I hope this list is constantly maintained and updated.
Additionally, do you have any recommendations for a good RSS reader that can properly handle this OPML file? I tried using Omea Reader, but it fails to fetch many of the feeds.
If you want a more casual approach than importing 600 feeds into your reader right away, check out the feed options at IndieBlog.page - it gives you daily random posts not only from the 600 URLs of that recent thread but also from a similar one back in April last year and a bunch of other indie sources...
> File exceeds maximum allowed number of subscriptions! Your file contains 659 out of 150 maximum allowed subscriptions for your plan.
Alright.. I'm gonna build my own RSS reader then.
Edit:
A kind of "decentralized" social media platform where we post on our respective personal blogs, and we may follow each other by using RSS subscriptions. If HN is one such platform, here's the chronological timeline: https://gist.github.com/altilunium/c2fcbe1e23aeb1cb9564f2593...
I know I don’t have enough karma, but it seems like the log output should have logged something for that. I also wonder if you’d consider account age as an alternative to karma for anti-spam, as I’m a bit of a hardcore lurker here and I highly doubt that I’m the only one.
I've been playing around with Node-RED[1] for a while and thought I would recreate this using Node-RED (also being a big fan of Node-RED). The flow[2], i.e. code, is online to have a look at (editable but not deployable) and the feed[3] is cached and updated every hour or so.
It's only a small Heroku server so it might well be down or about to crash, I make no promises!
Thanks to the OP for the inspiration, I did take a lot of ideas from the original codebase :)
Looking at how many "https://url.com//feed.xml" there are in the list, I have a feeling that the scraping logic needs some work. Is it just concatenating "https://url.com/" and "/feed.xml"?
I was extracting the URL from the alternate link in the HEAD of the blog's website. The issue is that some people write "//example.com/feed.rss", some "feed.rss", and some "/feed.rss". So I built some pretty stupid URL-resolving logic, when I should have just used ResolveReference from Go's net/url package. The latest list should fix that.
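For anyone curious what proper reference resolution looks like, here is a minimal sketch in Python rather than Go. It uses `urllib.parse.urljoin`, which follows the same relative-reference rules; the helper name `resolve_feed_url` and the example.com URLs are made up for illustration and are not taken from the actual crawler.

```python
from urllib.parse import urljoin


def resolve_feed_url(page_url: str, href: str) -> str:
    """Resolve the href of an alternate link against the page it was found on.

    Hypothetical helper: handles protocol-relative, path-relative,
    root-relative and already-absolute hrefs without string concatenation.
    """
    return urljoin(page_url, href.strip())


if __name__ == "__main__":
    page = "https://example.com/blog/"
    hrefs = [
        "//example.com/feed.rss",            # protocol-relative
        "feed.rss",                          # relative to the current path
        "/feed.rss",                         # relative to the site root
        "https://feeds.example.org/me.xml",  # already absolute
    ]
    for href in hrefs:
        print(f"{href!r:40} -> {resolve_feed_url(page, href)}")
```

Note that none of the results contain the doubled slash seen in "https://url.com//feed.xml", because the base and the reference are merged by rule instead of being glued together.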
It's also not a good idea. On https://andinfinity.eu/ I have a header for that and it clearly indicates that the RSS feed lives at index.html. Standard Hugo thing, I guess.
Thanks for the effort, this is a fantastic resource.
I installed 5 RSS reader apps on my Linux Mint machine to test the OPML. 4 of them failed to import the file: Thunderbird, Liferea, QuiteRSS and another one whose name I forgot.
Gnome RSS did import the file but it became slow and buggy.
I think the file is too long for a typical RSS feed app to process. Also, there might be some formatting issues: a couple of apps threw an "incorrect format" error (Thunderbird being one of them).
Now trying CPod and giving it time... I hope in 20 mins or so it can work with it.
I wonder if web-based RSS readers will work better with the file.
I tried importing at feedbin.com: 524 of 598 imported successfully. The failures were a mixture of not-found or connection errors and a couple of invalid-URL errors.
I just resubmitted the list. I believe there was an invalid XML tag somewhere around entry 500+, which is fixed now, so all of them should be valid. I forgot to XML-escape the URLs.
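To illustrate the kind of escaping problem described here, a small sketch in Python: a feed URL with a query string contains a raw `&`, which is not allowed inside an XML attribute. The helper `outline_line` and the sample URL are hypothetical; the attribute names `text` and `xmlUrl` are the ones OPML feed lists commonly use.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def outline_line(title: str, feed_url: str) -> str:
    """Build one OPML <outline> element with properly escaped attributes.

    quoteattr() escapes &, < and > and adds the surrounding quotes itself.
    """
    return (
        f"<outline type=\"rss\" text={quoteattr(title)} "
        f"xmlUrl={quoteattr(feed_url)}/>"
    )


if __name__ == "__main__":
    line = outline_line("A & B", "https://example.com/feed?a=1&b=2")
    print(line)
    # Round trip: a conforming parser gives back the original values.
    parsed = ET.fromstring(line)
    print(parsed.get("xmlUrl"))
```

Without the escaping step, a strict parser would reject the whole file at the first bare `&`, which would match the "incorrect format" errors some readers reported.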
Ugh, just need 20 more karma. Don't know anything about farming karma for hackernews though. Oh well, maybe I'll have to fork and allow users to specify their min karma to record the blog...
I will regenerate the list with just 2+ karma. I'll let users figure out what they like or not. 100 was just a random number I used without any idea of what is good or bad.
I dunno, 80 seems like a pretty good cut off, let me just pull this ole ladder up after myself.... jk, Thanks! Love to be featured on the list, very cool idea. Feels like web rings are coming back
That is a good idea as well! That would require me to actually run a server (or GitHub Actions) to refresh the RSS feed, maybe something I would do later on.
OPML allows you to export/import the list of RSS/Atom feeds from your RSS reader. So this is a list of 500-600 blogs that people submitted in that HN discussion, which you can import into your RSS reader.
OPML (when used in this context) is basically a list of URLs with a touch of metadata. It is the most common format for importing and exporting lists of feeds to move between feed readers.
So you generally don't subscribe to an OPML file, it just provides a list of feeds to subscribe to.
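To make the "list of URLs with a touch of metadata" point concrete, here is a rough sketch of pulling the feed list out of an OPML document with Python's standard library. The sample document and the function name `feeds_from_opml` are invented for illustration; real exports may nest outlines in folders, which is why the sketch walks every `outline` element instead of only the top level.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the usual OPML feed-list shape.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>HN Personal Blogs</title></head>
  <body>
    <outline title="HN Personal Blogs" text="HN Personal Blogs">
      <outline type="rss" text="Example One"
               xmlUrl="https://one.example/feed.xml"
               htmlUrl="https://one.example/"/>
      <outline type="rss" text="Example Two"
               xmlUrl="https://two.example/atom.xml?a=1&amp;b=2"/>
    </outline>
  </body>
</opml>
"""


def feeds_from_opml(opml_text: str) -> list[tuple[str, str]]:
    """Return (title, feed URL) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # container outlines (folders) have no xmlUrl
            title = outline.get("text") or outline.get("title") or ""
            feeds.append((title, url))
    return feeds


if __name__ == "__main__":
    for title, url in feeds_from_opml(SAMPLE_OPML):
        print(f"{title}: {url}")
```

A reader importing the file does essentially this and then subscribes to each URL individually, so the OPML file itself is never polled again.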
E.g. I am surprised to not see Julie Zhuo's blog in this list--which, I believe, is one of the most influential blogs in the tech product space: https://www.juliezhuo.com/.
I saw some people asking for a good RSS reader.
Personally I use elfeed, and I also use elfeed-org, so I could import the OPML file and add it to my existing RSS feeds.
The fact that FreshRSS can fetch full article contents for truncated RSS feeds is precisely the thing you need in order to make RSS useful today. Not the biggest fan of the UI but this one feature makes the whole thing so fucking good.
Then why not use RSS, or Atom, or CSV, or HTML, for that matter?
That OPML "outline" has no tree structure at all, except for the single top level container item:
<outline title="HN Personal Blogs" text="HN Personal Blogs">
What is the point of using OPML for a flat structure?
Outlines are all well and fine, but a list of links to blogs is hardly an outline.
And isn't JSON the simplest most ubiquitous accessible standard format for outlines and even arrays of links these days, if not just raw-dogging XML?
More thoughts and opinions and history of OPML, RSS, Dave Winer, More, Frontier, Manila, Radio UserLand, XML-RPC, SOAP, Xanadu, spreadsheets, CSV, JSON, XML, etc:
I think it's just that OPML has traditionally been used to import/export lists of RSS feeds, and therefore most readers are capable of importing an OPML file whereas other formats would not be as widely supported.
Maybe I'm thick, but it bewilders me that someone wouldn't simply use RSS or Atom directly, instead of abusing OPML to mis-represent it (which is itself just as terribly designed as RSS, by the same person, Dave Winer -- don't even get me started).
Is this some kind of a masochistic corrupted data ingestion challenge from TikTok that I'm not aware of, akin to eating Tide Pods or washing your clothes in Campbell's Soup?
Does anyone even ever use OPML for outlines any more? Or is it like YAML's backflip, where its original name meant "Yet Another Markup Language" until somebody pointed out to their great surprise that it WASN'T actually a markup language, so they retroactively recursively renamed it "YAML Ain't Markup Language" on opposite day?
So does OPML no longer stand for "Outline Processor Markup Language", but now stand for something else like "OPML Presents Mere Lists"?
What's wrong with exporting and importing RSS as RSS? Has its meaning also changed, flipping places with OPML, so it now means "RSS Serves Structures", so you use RSS for nested trees and OPML for flat lists now?
I suppose you could recursively encode outlines as nested RSS feeds inside the text content of other RSS feeds as <![CDATA[ at each level, or since those can't be nested, exponentially doubling the number of escaped entities at each level.
I Wanna Be <![CDATA[ -- Sung to the tune of “I Wanna Be Sedated”, with apologies to The Ramones:
This is the way. I just swapped as many news, social (reddit/etc) and the like over to newsblur via RSS and I'm VERY happy. This OPML will go a long way to filling out my categories :-) Thanks a bunch for your efforts!
What about subscribing to OPML files hosted at URLs? That way one can follow OPML list "hosts" (like the one in OP's repository) and discover new RSS feeds. I'm quite surprised that this feature does not exist in the RSS readers I tried.
There seem to be many valid blogs that got skipped due to not having the `rel="alternate"` link in the HTML head (mine included). I just added it, as I'm sure others will, so it'd be a good idea to update the list regularly.
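For reference, feed autodiscovery via `rel="alternate"` can be sketched roughly like this in Python. This is not the crawler's actual code: the class `FeedLinkFinder`, the function `find_feeds`, and the blog.example URL are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types conventionally used for feed autodiscovery links.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect hrefs of <link rel="alternate"> tags that point at feeds."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.feeds: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, attr["href"]))


def find_feeds(base_url: str, html: str) -> list[str]:
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


if __name__ == "__main__":
    page = """<html><head>
    <link rel="stylesheet" href="/style.css">
    <link rel="alternate" type="application/rss+xml" href="/index.xml" title="RSS">
    </head><body></body></html>"""
    print(find_feeds("https://blog.example/", page))
```

A blog without such a tag returns an empty list here, which is how it would end up skipped even though it has a perfectly good feed.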
Yeah, the best way is to go back to the original post and submit a top-level comment there. Make sure HN recognizes it as a proper href and that your blog has an alternate link to its RSS/Atom feed. For now, only users with 100+ karma are included.
I left the console.log there as well to show which blogs aren't recognized.
I am also open to PRs. It was just a few hours' project to get it going, but if people find it useful, it would be nice to get some tags from the blogs, so it would be possible to extract the list only for specific technologies/topics.
When do you plan to re-run your crawler? HN users are updating that original HN discussion every minute as we speak! It'd be nice to get an updated list sometime today. Possible?
I am fixing some issues and trying to get more blogs from the post, so I'll do it a few times a day.
And I hope at some point to actually build something larger from that. I love RSS and this is the best way to discover new content. I have actually already found a few interesting blog posts, links, and bits of information.