> fetching every post ever just to learn about whether there's been a single new post
Usually you'd perform an HTTP HEAD request and check the Last-Modified header to see whether there's new content, avoiding the need to retransmit the whole thing.
Of course, when the content is updated, you'd need to perform a full refresh. That's wasteful, but there may be a way to use range requests or skip ahead in the XML stream. It may be harder on the server than just serving the 2-10 MB, though.
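For what it's worth, the HEAD-then-GET check described above could look roughly like this on the client side. This is only a sketch: the function names are made up for illustration, and it assumes the server sends a Last-Modified header in the usual HTTP date format.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def head_last_modified(url):
    """Send a HEAD request and return the Last-Modified header, or None."""
    # HEAD returns headers only, so the feed body is never transferred.
    with urlopen(Request(url, method="HEAD"), timeout=10) as resp:
        return resp.headers.get("Last-Modified")


def feed_changed(cached, current):
    """Return True when a full GET is needed.

    `cached` is the Last-Modified value saved from the previous fetch,
    `current` is the value from the latest HEAD response.
    """
    if cached is None or current is None:
        # Without both dates there is nothing to compare, so refetch.
        return True
    return parsedate_to_datetime(current) > parsedate_to_datetime(cached)
```

A reader would call `head_last_modified(feed_url)` on each poll and only issue the GET when `feed_changed(...)` returns True.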
In the last 12 hours I've had 4,998 RSS requests. Only 6% of these were HEAD requests, so I don't think it's common for feed readers to send a HEAD first and only follow up with a GET if there's a change. Of the GET requests, 55% were conditional (attempting validation), while the other 45% requested the feed unconditionally. This is not a place where you can trust feed readers to do something sensible if you offer a large feed.
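A breakdown like the one above can be computed by bucketing each logged request. The sketch below assumes a hypothetical record shape (a dict with `method` and `headers`); it is not the actual log format behind these numbers. A GET counts as conditional if it carries If-None-Match or If-Modified-Since.

```python
from collections import Counter

# Request headers that mark a GET as a validation attempt.
VALIDATORS = {"if-none-match", "if-modified-since"}


def classify(record):
    """Bucket one logged request as HEAD, conditional GET, or unconditional GET."""
    method = record["method"].upper()
    if method == "HEAD":
        return "HEAD"
    if method != "GET":
        return "other"
    # Header names are case-insensitive, so normalise before comparing.
    names = {name.lower() for name in record.get("headers", {})}
    return "conditional GET" if names & VALIDATORS else "unconditional GET"


def breakdown(records):
    """Count requests per bucket."""
    return Counter(classify(r) for r in records)
```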
Most APIs that support this kind of "show me a subset of many items" behavior use pagination or specifying date/ID ranges.
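As a rough illustration of the ID-range approach, a client could pass the highest ID it has already seen and get back only newer items, one page at a time. The `posts` structure, the `after_id` parameter, and the function name are assumptions for this sketch, not part of RSS or any particular API.

```python
def page_after(posts, after_id=0, limit=20):
    """Return up to `limit` posts with id greater than `after_id`.

    Also returns the cursor to send on the next request and a flag
    saying whether more posts remain beyond this page.
    """
    newer = sorted((p for p in posts if p["id"] > after_id), key=lambda p: p["id"])
    page = newer[:limit]
    # If nothing is new, the cursor stays where it was.
    next_after = page[-1]["id"] if page else after_id
    return page, next_after, len(newer) > limit
```

With this shape, a client that is already up to date gets an empty page back instead of the whole archive.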
I agree it isn't perfect, and like most open APIs, some will do the lazy thing.
Thanks for the numbers. Just a heads up: a single misbehaving implementation can cause a huge spike in traffic, so I suggest you de-duplicate before computing stats.
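One way to de-duplicate is to count distinct clients per request kind rather than raw requests, so a single reader polling in a tight loop counts once. The sketch assumes each record has hypothetical `ip`, `user_agent`, and `kind` fields, and that IP plus User-Agent is a good enough proxy for "one implementation", which it may not be behind shared proxies.

```python
from collections import Counter


def client_key(record):
    """Approximate a client identity from IP address and User-Agent."""
    return (record["ip"], record["user_agent"])


def per_client_breakdown(records):
    """Count distinct clients per request kind instead of raw requests."""
    seen = set()
    counts = Counter()
    for record in records:
        key = (client_key(record), record["kind"])
        if key not in seen:
            # First time this client made this kind of request.
            seen.add(key)
            counts[record["kind"]] += 1
    return counts
```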
> Usually you'd perform a HTTP HEAD request to see if there's new content by relying on the last modified property, avoiding the need to retransmit the whole thing.
You can also use ETags: https://web.archive.org/web/20080823141004/http://www.kbcafe...
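A minimal sketch of how ETag validation could work on the server side, assuming the ETag is simply derived from a hash of the feed body. The function names and the hashing choice are illustrative, not how any particular server does it. When the client's If-None-Match matches, the server answers 304 with an empty body instead of resending the feed.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the feed body (assumed scheme: truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, payload) for a GET of the feed."""
    etag = make_etag(body)
    raw = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in raw.split(",")]
    # If-None-Match uses weak comparison, so drop any W/ prefix.
    candidates = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
    if etag in candidates or "*" in candidates:
        # Client's copy is current: send headers only.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

The client stores the ETag from a 200 response and sends it back as If-None-Match on the next poll.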
> Of course, when the content is updated, you'd need to perform a full refresh. While that's wasteful, there may be a way to use range requests, or skip over the xml stream? It may be harder on the server than just serving the 2-10 MB, though.