

There's a Reason RSSCloud Failed to Catch On - TomOfTTB
http://workbench.cadenhead.org/news/3555/theres-reason-rsscloud-failed-catch

======
mcdtracy
Here's my understanding of the potential for a scalability problem.

EXAMPLE: twitter.com has 24,650 twitter followers. If Dave gets 24,650
followers on an RSSCloud architecture then this is what happens when he posts.

1\. Dave creates a 140 char post. His blogging software sends a notice to the
cloud server that he has an updated RSS feed.

2\. The cloud server sends update notices to the 24,650 subscribed "listeners"
to Dave's "RSSCloud-twit-stream".

NOTE: It does not send Dave's new post text, just the alert event.

3\. The 24,650 listeners then do an RSS GET from Dave's blogging server. This
could create a "cattle stampede" (i.e. slashdot effect) and many users may not
get service when Dave's server is overrun. The server would likely be swamped
within a few seconds by this massive interest in Dave's blog from these
real-time subscribers.
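
To make steps 1-3 concrete, here is a minimal Python sketch of the notify-then-fetch flow. It is only a toy model, not the actual RSSCloud protocol: the `Publisher`, `CloudServer`, and `simulate` names are hypothetical, and every listener is assumed to fetch the feed as soon as it is notified.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Stands in for Dave's blogging server; counts the feed GETs it serves."""
    feed_gets: int = 0

    def get_feed(self) -> str:
        self.feed_gets += 1
        return "<rss>...full feed...</rss>"


@dataclass
class CloudServer:
    """Hypothetical cloud server that only holds listener callbacks."""
    listeners: list = field(default_factory=list)

    def subscribe(self, callback) -> None:
        self.listeners.append(callback)

    def ping(self) -> int:
        # Step 2: notify every listener. The notice carries no post text.
        for notify in self.listeners:
            notify()
        return len(self.listeners)


def simulate(subscriber_count: int) -> int:
    """Return how many feed GETs one post causes on the publisher."""
    publisher = Publisher()
    cloud = CloudServer()
    for _ in range(subscriber_count):
        # Step 3: each listener reacts to a notice by fetching the feed.
        cloud.subscribe(publisher.get_feed)
    cloud.ping()  # Step 1: the blogging software tells the cloud about the update.
    return publisher.feed_gets


print(simulate(24_650))  # one GET per listener
```

In this model the load on the blogging server grows linearly with the number of listeners, and all of it arrives right after the ping.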

At small numbers of users the architecture is effective and elegant. At very
large numbers it's missing an essential optimization. Only the "new blog" text
should need to be sent... maybe with the RSSCloud event, for example.

An RSS GET will pull the whole string of recent blog posts for all 24,650
users. That's a _lot_ of excess text that most users already have from being
real-time listeners anyway.

The RSSCloud Blogger's software needs to see a difference between an RSS GET
for the recent blog text and an RSSCloud GET for the latest update text ONLY.
That would reduce the amount of text being sent out, but it requires a change
to the protocols as described, I think.
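
A rough back-of-the-envelope calculation shows the size of the difference. The subscriber count comes from the example above; the feed length and per-item size are assumptions picked only for illustration, not measurements of any real feed.

```python
SUBSCRIBERS = 24_650    # from the example above
ITEMS_IN_FEED = 20      # assumption: number of recent posts kept in the feed
BYTES_PER_ITEM = 400    # assumption: a 140 char post plus its XML markup


def burst_bytes(subscribers: int, items_sent: int, bytes_per_item: int) -> int:
    """Total bytes served if every subscriber downloads `items_sent` items."""
    return subscribers * items_sent * bytes_per_item


full = burst_bytes(SUBSCRIBERS, ITEMS_IN_FEED, BYTES_PER_ITEM)
delta = burst_bytes(SUBSCRIBERS, 1, BYTES_PER_ITEM)

print(f"whole feed per listener:  {full / 1e6:.1f} MB per post")
print(f"newest item per listener: {delta / 1e6:.1f} MB per post")
```

Under these assumptions the saving is simply the feed length: sending one item instead of twenty cuts the burst by a factor of twenty. The number of requests stays the same either way.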

Of course, I could be _way off base_ but I'm really trying to understand the
overall architecture and the "realtime" problem this is intended to resolve
for us all.

NOTE: If you federate the RSSCloud servers you just make the "GET" problem
even worse. More demand on the blogger's RSS feed in a few seconds. It's like
a user driven "slashdot effect". Post a 140 char message and notify the cloud
and _boom_... your server falls over.

I'll await corrections to my understanding.

NOTE: PubSubHubBub has an entirely different approach to the real-time
optimization for bloggers. The Hub Server gets the blogger's new post text and
the Hub Server forwards this delta to subscribed listeners. The Blogger's
server never sees any excess traffic in or out. Of course, the PubSubHubBub
service could require the resources of a Google, Amazon or Yahoo. A
centralized service that could potentially have a "fail whale". Dave's RSS
Cloud has a million "fail fishies".
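
The hub approach as described in this note can be sketched the same way. This is a toy model of the idea only, not the PubSubHubbub wire protocol; the `Publisher` and `Hub` classes are hypothetical, and the hub is assumed to fetch the feed once and push only the unseen posts.

```python
class Publisher:
    """Stands in for the blogger's server; counts the feed GETs it serves."""

    def __init__(self):
        self.posts = []
        self.feed_gets = 0

    def publish(self, text):
        self.posts.append(text)

    def get_feed(self):
        self.feed_gets += 1
        return list(self.posts)


class Hub:
    """Hypothetical hub: one fetch from the publisher, then a push per subscriber."""

    def __init__(self, publisher):
        self._publisher = publisher
        self._seen = 0            # how many posts have already been forwarded
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def ping(self):
        feed = self._publisher.get_feed()   # the only request the blogger sees
        delta = feed[self._seen:]           # posts not yet forwarded
        self._seen = len(feed)
        for deliver in self._subscribers:
            deliver(delta)                  # the fan-out cost lands on the hub
        return delta


publisher = Publisher()
hub = Hub(publisher)
inboxes = [[] for _ in range(1000)]
for inbox in inboxes:
    hub.subscribe(inbox.extend)

publisher.publish("a 140 char post")
hub.ping()
print(publisher.feed_gets, len(inboxes[0]))
```

The blogger's server handles one request per post regardless of audience size; the work of reaching every subscriber moves to the hub, which is where the "fail whale" risk comes from.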

Life as always is rife with tradeoffs. Go figure. YMMV.

~~~
easp
A good deal of your critique is a rehash of concerns people expressed about
RSS in the first place. They generally did not come to pass because either
they weren't real problems in the first place, they were easily addressed, or
general advances in technology moved faster than their onset.

You are concerned about the inefficiency of fetching all the items in a feed
when just one item changes. Is that a real issue? Consider your use case. How
much data is really being requested? If it is a real issue, the server might
want to limit the number of entries returned based on the If-Modified-Since
header. As for the load of all that traffic hitting the server in the space of
a few seconds, nginx can push a lot of requests on modest hardware, and the
load on whatever application logic is involved in generating the feed can be
knocked way down by having it cache all feed requests for a second or two.
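
Both suggestions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular blogging package does it: `entries_since` and `MicroCache` are hypothetical names, entries are plain `(published, text)` tuples with timezone-aware datetimes, and the cache is a single-process, non-thread-safe stand-in for what a front-end proxy cache would do.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def entries_since(entries, if_modified_since):
    """Return only the entries published after the If-Modified-Since value.

    With no header, or one that cannot be parsed, the full list is returned.
    """
    if not if_modified_since:
        return entries
    try:
        cutoff = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return entries
    if cutoff.tzinfo is None:
        # Treat a date without zone information as UTC.
        cutoff = cutoff.replace(tzinfo=timezone.utc)
    return [entry for entry in entries if entry[0] > cutoff]


class MicroCache:
    """Re-render the feed at most once per `ttl_seconds`."""

    def __init__(self, render, ttl_seconds=2.0, clock=time.monotonic):
        self._render = render
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires = float("-inf")

    def get(self):
        now = self._clock()
        if now >= self._expires:
            # Only the first request after expiry pays for rendering.
            self._value = self._render()
            self._expires = now + self._ttl
        return self._value


entries = [
    (datetime(2009, 9, 17, 12, 5, tzinfo=timezone.utc), "new post"),
    (datetime(2009, 9, 17, 11, 0, tzinfo=timezone.utc), "older post"),
]
print(entries_since(entries, "Thu, 17 Sep 2009 12:00:00 GMT"))
```

With a two second cache, a burst of thousands of GETs inside that window costs the application one render; the rest is served from memory.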

------
patio11
I think a more fundamental reason is that RSSCloud solves the problem "I used
to use RSS to read articles, but the 15 minute delay between the article being
posted and me being able to skim it in my RSS reader was unacceptable" and
that _this is a problem real people do not have_.

~~~
wmf
I think it was originally designed to reduce the server load caused by polling
but now it has been dragged out of the attic to join the real-time hype wave.
I agree about real people, though.

------
igrigorik
RSSCloud or not -- I do happen to think that we need a push RSS solution --
what really bugs me is the fact that once again, we're inventing different
standards to do the exact same thing. As if Atom vs RSS wasn't enough, now we
have RSSCloud + PubSubHubbub to worry about. PSHB already has Google behind it
(all of Feedburner's feeds support it), so I really fail to see what WordPress
won by adopting RSSCloud.

Besides, while Dave is a brilliant guy, PSHB already has a lot more people
working on it with open source hub and client implementations (heck, I wrote
one for Ruby!).

~~~
blasdel
Atom : RSS :: PSHB : <cloud>

Dave will never stand by and let a perfect solution replace an old poorly-
specified ill-used mediocre one that he somewhat-falsely claimed to have
invented. Instead he'll repeatedly change the canonical version of the spec
live, without changing the version number or telling anyone (much less
preserving the previous versions). _It's super effective_ , at least for
creating drama.

------
jonknee
The proposal never made sense. Most people don't use desktop feed readers and
even if they did this solution wouldn't be scalable (as Cadenhead mentioned).
Google Reader knows when you update your feed because you already ping Google.
In my experience they are grabbing the feed in a few seconds anyway--it's up
to them to show this to users in real time if they want to but there is
nothing stopping them.

It seems like Dave wants to continue using his outdated desktop software (the
OPML Editor) to view feeds in a manner much better suited for a hosted
product. That might be great for him, but I'd rather not change publishing on
the internet so his decades-old software can keep pace with Google Reader.

~~~
udekaf
The proposal has the potential to support pooling of multiple feeds. For
example, you can get the updates of thousands of feeds in one request. That
saves a lot of bandwidth and processing time. Does that make sense?

~~~
blasdel
The client would still have to be publicly routeable!

------
blasdel
RSSCloud is useless, for increasingly ridiculous reasons:

    
    
      10e2  It's idiotically-designed
              (he thinks traditional SOAP posted to a resourcey URL is REST!)
      10e4  It doesn't help centralized aggregators scale at all
      10e8  It doesn't work with NATed clients
      10e16 It was specced/never-implemented/forgotten by Dave Winer 8 years ago

~~~
EastSmith
You seem to like that word "idiot", don't you? In two days you used it for
Dave Winer and Matt Mullenweg (<http://news.ycombinator.com/item?id=810288>).
Who's next?

~~~
blasdel
a) I'll take that back about Matt -- he's not an idiot -- he just writes code
naively, and accidentally became a prominent BDFL after SixApart repeatedly
threw away their own accidental prominence.

b) Dave isn't an idiot -- he's a poisonous egomaniac asshole that successfully
wheedles his way into anything tangentially related to any of his pet projects,
and proceeds to do whatever he can to be credited for other people's work,
impede progress, and fuck people over. It's _his designs_ that are idiotic.

------
jaymon
The thing that really gets me with all this real-time hype is that people
attribute the rise of Twitter to all the awesome desktop clients built for it
so you can receive your tweets in real time. The problem is they aren't
actually real-time: all those desktop clients use polling, just with a really
short interval.

If you really wanted your desktop RSS reader to be just as awesome and real-
time as Twhirl then just set all the feed refresh intervals to 5 minutes, no
RSSCloud needed.

~~~
bmelton
Having done almost this exact thing, the issue is that 4 out of 5 large
websites will have blocked you by your fifth poll for abusing their service.

I always thought that it was somewhat ironic that nobody had a problem with me
refreshing slashdot.org every 10 seconds trying to get a first post, but that
they had a very large problem when my RSS poll hit them more than once an hour
(which is at least what their recommended interval used to be).

~~~
jaymon
I was going to mention this but I decided it wasn't really necessary to make
my point. I was just trying to show that no matter how hard certain people
push real-time it usually still comes down to polling in a desktop situation.
I think people have gotten it into their minds that 15-60 minutes delayed
isn't real-time while 1-15 minutes is.

But I agree that most places will throttle you if you request their feed too
often, so how are they going to feel when every one of their subscribers
requests their feed within the same minute every time they make an update? I
guess that's the big question.

