I'm a very happy daily RSS user, but we ought to be upfront about its ginormous deficiencies:

- every day I deal with broken links and broken images because my feed reader doesn't know from which URL it should resolve relative links

- there's no pagination mechanism, or way for a client to ask for "all posts since last Friday". If you want your feed to always have all posts available, you need to include the entirety of your archive in every feed response.

- if I want to follow any link I get kicked back out to a bloated web page

- lots of posts include iframes, or content that only makes sense with JS enabled

- XML

Despite this, I still love RSS. It's what we've got, and we should continue to make the most of it. But it also sucks, and it's no surprise that mostly only technical people use it. I think feeds are pretty much doomed to keep breaking and disappearing. Many of the replacements or pseudo-replacements that people come up with seem fine, but the problem - as always - is getting people to actually use them, which mostly never happens.




A lot of us are happy to just use RSS for the aggregated listing. I really don't mind going to the original site in a browser to read the article.


Yup, the only information I need in RSS is a post title and a URL, maybe a date. I literally never read content from my reader (indeed, my current custom reader doesn't support it). It's still nice to get blog posts, webcomics, YouTube videos and more all through the same reader.


RSS is my preferred way to stay up to date. The only reason to remove it is to force consumers onto a platform that can push more advertisements. So it's not so much obsolete, but it is less profitable.


What's wrong with XML? It has its flaws, but so do JSON, YAML, and all the other formats.

In fact, for a standard, I like XML despite its verbosity because it supports schemas, essentially a machine-readable specification. Also, the entire web runs on HTML, which is poorly defined XML, so why not?


All formats do indeed suck, but not equally (YAML does suck equally, though). Your point about HTML is the best argument for XML though, since every client needs an 8 million LOC HTML parser anyway.

It was probably a mistake to resurrect this flame war about serialization formats. XML works, it's just a bummer.


A parser for what most people think of as XML can be fairly simple, certainly simpler than a performant or even just error-tolerant parser for HTML. Though Yxml[1] probably takes minimalism too far, what with its lack of Unicode handling or tree construction, a couple of tens of kilobytes seems like the right order of magnitude to me. Heavier than JSON, but not drastically so, unless you decide you also need three types of schemas, two types of XPath, XSLT, etc. (the libxml2[2] approach).

The thing that requires the most bookkeeping is also the thing that nobody uses and not a lot of people even know about: a full textual macro system by the name of <!ENTITY ...> (the “internal DTD subset”), which a conformant XML processor is required to implement. (Not coincidentally, this is the part that the XML profile mandated by XMPP excludes.) There was a proposal called MicroXML[3] to define a subset that excludes this part, but it threw out compatibility by also excluding namespaces, and it doesn't seem to have actually gone anywhere.
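For illustration, the macro system looks like this (a made-up example):

  <?xml version="1.0"?>
  <!DOCTYPE feed [
    <!ENTITY site "https://example.org">
  ]>
  <feed xmlns="http://www.w3.org/2005/Atom">
    <!-- a conformant processor must expand &site; everywhere below -->
    <link href="&site;/posts/1"/>
  </feed>

Entities can also reference other entities, which is why supporting them takes real bookkeeping (and is the mechanism behind the “billion laughs” attack).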

[1]: https://dev.yorhel.nl/yxml

[2]: http://www.xmlsoft.org/

[3]: https://www.w3.org/community/microxml/wiki/Main_Page


> there's no pagination mechanism

Atom has this:

  <link rel="next" href="http://example.org/index.atom?page=2"/>
https://datatracker.ietf.org/doc/html/rfc5005


From that spec, it looks like Archived Feeds using <link rel="prev-archive"> are what most feeds should use, since pagination doesn't let you reliably reconstruct the whole feed and is meant more for dynamic content like search results. I'll definitely look at implementing archives for all of my feeds even though client support is probably low - you can't expect clients to implement this when no feeds have archives.
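For reference, from what I can tell the archived-feed chain in RFC 5005 is just a series of static documents linked backwards in time (URLs here are illustrative):

  <!-- in the subscription document (the live feed): -->
  <link rel="prev-archive" href="https://example.org/archive/2020-12.atom"/>

  <!-- in each archive document: -->
  <link rel="current" href="https://example.org/index.atom"/>
  <link rel="prev-archive" href="https://example.org/archive/2020-11.atom"/>

Since every archive document is immutable, a client can walk the chain back exactly as far as it wants, and everything stays cacheable static files.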


Nice, good to know. Do you know if many common clients support it?


> - every day I deal with broken links and broken images because my feed reader doesn't know from which URL it should resolve relative links

> - lots of posts include iframes, or content that only makes sense with JS enabled

These are the fault of broken RSS feed generators. The RSS feed should contain a textual summary of the article.
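For what it's worth, RSS even has a conventional place for both: a plain-text summary in <description>, and the full HTML (if the publisher wants to provide it) in content:encoded from the content module. A sketch, with made-up content:

  <item>
    <title>Example post</title>
    <link>https://example.org/posts/1</link>
    <description>A plain-text summary of the article.</description>
    <!-- requires xmlns:content="http://purl.org/rss/1.0/modules/content/" on the root -->
    <content:encoded><![CDATA[<p>The full HTML body.</p>]]></content:encoded>
  </item>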

> - if I want to follow any link I get kicked back out to a bloated web page

This is working as intended and not an issue. I don't think any alternatives would make sense.

> - there's no pagination mechanism, or way for a client to ask for "all posts since last Friday". If you want your feed to always have all posts available, you need to include the entirety of your archive in every feed response.

Agree that this sucks.

> - XML

Really?


> These are the fault of broken RSS feed generators. The RSS feed should contain a textual summary of the article.

"Should" indeed. But a standard that isn't followed isn't exactly helpful to me.

> This is working as intended and not an issue. I don't think any alternatives would make sense.

Yeah, I agree that an alternative doesn't make sense for RSS in particular, but it's not hard to imagine an only-slightly-more-featureful syndication system where individual feed items are directly addressable.

It's just a shame that I've got this really nice, personalized feed reading interface, and I get pulled out of it every time I follow a link.


>> - every day I deal with broken links and broken images because my feed reader doesn't know from which URL it should resolve relative links

> These are the fault of broken RSS feed generators. The RSS feed should contain a textual summary of the article.

Footnote relative links are quite common in blog posts, and my RSS reader punts me out to a browser to view them. Given that most blogs I follow provide full-text feeds, it's a bit disappointing I can't easily navigate that last (textual) portion of content.


> every day I deal with broken links and broken images because my feed reader doesn't know from which URL it should resolve relative links

I have never had this problem. Do you have any example feeds you encounter this with? Are you sure it's not just a bug in your reader?

> there's no pagination mechanism, or way for a client to ask for "all posts since last Friday". If you want your feed to always have all posts available, you need to include the entirety of your archive in every feed response.

Agreed that this would be nice to have, but for its primary purpose - to syndicate new content - this does not matter in practice. Having RSS completely static means that hosting an RSS feed is super simple and does not require the increased attack potential of a dynamic service - something that the current federated web protocols completely fail at. It should of course be possible to support retrieval of past posts with a static-hosting-compatible protocol, in theory at least.

> if I want to follow any link I get kicked back out to a bloated web page

If that page has a linked RSS feed and a permalink matching the URL or embedded in the page metadata then in theory your reader could show you the corresponding RSS post. I don't think creating a separate RSS-only web of links makes sense.
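For example, the standard feed-autodiscovery markup in a page's <head> already tells a client which feed the page belongs to (URLs made up):

  <link rel="alternate" type="application/rss+xml"
        title="Example Blog" href="https://example.org/feed.xml">

A reader that followed that link and found an item whose permalink matches the page URL could, in principle, show the feed version instead.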

> lots of posts include iframes

Lots? Really? iframes are more dead than RSS, at least for anything outside of ads.

> or content that only makes sense with JS enabled

Limiting RSS to static mostly-text content provides a lot of consistency. Most articles don't need JS. For those that do, RSS can still link to a webpage, just as podcasts don't embed the audio in the feed itself. There's also nothing stopping an RSS reader from showing link-only posts as an embedded web view - personally, I prefer a link that opens in a new browser tab, though.

> XML

Ugly, yes, but how is this a problem in practice? Just because it's not as hip as JSON?


> I have never had this problem. Do you have any example feeds you encounter this with? Are you sure it's not just a bug in your reader?

I mean, it's hard to say whether it's a bug in my reader, given that the RSS spec makes no mention of how relative links should work. I've variously used Newsbeuter, Newsboat, Newsblur, and The Old Reader and have never not had this problem.

(Edit: an example from this morning: Newsblur didn't know what to do with the footnote links in Dan Luu's blog [https://danluu.com/atom.xml]. Aside: Newsblur seems to be responding with 500s for about every other request today. *sigh* Isn't technology wonderful?)
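(The funny thing is that Atom actually has a fix for this: it inherits xml:base, so a feed can declare what relative links should be resolved against. A minimal sketch, with a made-up entry:

  <entry xml:base="https://example.org/posts/42/">
    <!-- a conforming reader should resolve this to https://example.org/posts/42/diagram.png -->
    <content type="html">&lt;img src="diagram.png"&gt;</content>
  </entry>

In my experience support for it in readers is spotty, which is sort of the whole problem.)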

> Having RSS completely static means that hosting an RSS feed is super simple and does not require the increased attack potential of a dynamic service - something that the current federated web protocols completely fail at.

Pagination would work fine with static hosting. And "updates-since" requests wouldn't preclude static hosts from ignoring them and working exactly how they work now.

I agree with you about federated web protocols, though - I'm pretty bummed ActivityPub requires an inbox; otherwise I think it would pretty much work with static hosting.

> I don't think creating a separate RSS-only web of links makes sense.

You're right, it doesn't really make sense. I think the crux of my frustration there is that we've got a syndication system that's bolted on top of the web, rather than properly integrated. It's probably the best that can be done though, without an entire separate hypertext system.

> Lots? Really? iframes are more dead than RSS, at least for anything outside of ads.

YouTube and Vimeo embeds are mostly where I see them.

>> or content that only makes sense with JS enabled

> Limiting RSS to static mostly-text content provides a lot of consistency.

I'm not complaining that feed items can't execute Javascript; I'm annoyed that sometimes the (textual) content of the post refers directly to other content that's only there after JS adds it. (I'm not blaming authors, this is just a problem of the content being written for a webpage and then being included in a different context. Again, an issue of the syndication system having been bolted onto the web.)

> Ugly, yes, but how is this a problem in practice? Just because it's not as hip as JSON?

No, I don't particularly love JSON either. XML does indeed work okay, but I pretty regularly (a couple of times a year, maybe?) see broken feeds because of mis-escaped content. One of the people I follow has a feed generator that produces bad CDATA sections, too. It's pretty hard to mess up JSON escaping (as much as HTML-in-JSON makes me shudder).
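The CDATA failure mode, for the curious: the sequence ]]> always terminates a CDATA section, so a generator that pastes arbitrary HTML into one without splitting on ]]> emits malformed XML. An illustrative pair:

  <!-- broken: the inner ]]> ends the section early -->
  <description><![CDATA[Use <![CDATA[...]]> for raw text]]></description>

  <!-- valid: split the literal ]]> across two sections -->
  <description><![CDATA[Use <![CDATA[...]]]]><![CDATA[> for raw text]]></description>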


> - every day I deal with broken links and broken images because my feed reader doesn't know from which URL it should resolve relative links

I have a restrictive whitelist configuration for uMatrix and in most cases I don't bother to whitelist images when I read articles. I've found that the vast majority of the time, the images are stock images that only tangentially relate to the subject of the text and add no real value. I guess they're included because editors feel images are essential even if they're unrelated.


I'll add another: chronological ordering within the day isn't very useful. I'm more interested in what other readers (of my type) have found "important/interesting/insightful/etc.", so I care about the likes/upvotes/etc.

I don't want everything every day. I want to read the X most important articles that day, with X being different each day depending upon how much time I have.

RSS cannot really provide that prioritization.


> I don't want everything every day.

Yeah! I'm with you. Although I don't use it, I really like some of the features of https://fraidyc.at/. For the majority of feeds I follow, I don't need to keep track of unread status, and high-frequency feeds would be much more bearable if they were grouped together to avoid taking over an aggregated listing. I feel like a lot of client defaults tend to hew too closely to email clients or something.


> I don't want everything every day

Exists: mynews.com/rss ; mynews.com/world/rss ; mynews.com/economy/financial_commentaries/rss ...

> I'm more interested in what other readers (of my type) have found

Could be done: mynews.com/rss?u=q1w2 (upon cookie authentication)

But does a service that properly clusters users exist? I don't know of any... (I only know of clumsy, extremely badly implemented attempts to automatically corner registered users into some sketchy profile.)


Before the big social media companies slurped up most of the traffic, there used to be a lot of experimentation with this kind of filtering/aggregation. Nowadays, with much better ML libraries, I see no reason why you couldn't train a classifier on your own preferences. Unfortunately, most of the devs interested in RSS-type stuff tend not to know much about machine learning, so there's not much of it going on.


Doesn't Feedly do something like this?


Maybe try another reader? What are you using? None of these are issues for me in my feed reader, Emacs Elfeed. (XML aside, though I can't say I've ever had to view any feed's source.)

https://github.com/skeeto/elfeed


Can you view relative links (for footnotes) in Elfeed? That hasn't been my experience.


No, I don't think so. Fair enough.



