

Ask HN: How do sites such as Flipboard scale out their social data aggregators? - iamclovin

Other sites, such as Friendfeed, also aggregate social data.

The naive explanation is that they run "worker processes" per service per user, each of which either streams or polls content, but this seems like an expensive proposition (or is it?).

Are they using something else, such as the Twitter "site-stream" functionality, so that they can multiplex several users as part of one API call?

(Original question posted on Quora: http://www.quora.com/How-do-sites-such-as-Friendfeed-and-Flipboard-scale-out-their-social-data-aggregators)
======
Skywing
I know this can be tricky. I wrote a Twitter-based keyword filtering tool and
subscribed to their garden hose. They say to only use one connection at a
time, so I think it may be violating the terms of service to be running more
than one stream. Twitter offers larger stream bandwidths, too. Perhaps one
stream of all the tweets can be obtained. Even with my one stream connection,
I was missing incoming tweets because my local connection was too slow. So,
this is a good question. I'm interested in hearing the answer, too.

~~~
a3camero
The gardenhose is actually a sample of the full amount of data. If you want more
coverage you need to use other techniques. One way to do it is to search for a
specific word. Constantly...
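
A rough sketch of what "search constantly" could look like, assuming the search API lets you pass the highest ID you have already seen. The `search` callable and the `"id"` field are hypothetical stand-ins here, not a real client library:

```python
def poll_new(search, keyword, since_id=0):
    """Run one search pass; return (unseen results, updated high-water mark).

    `search` is a hypothetical callable taking (keyword, since_id) and
    returning a list of dicts that each carry a numeric "id".
    """
    results = search(keyword, since_id)
    # Filter defensively in case the backend ignores since_id.
    fresh = [r for r in results if r["id"] > since_id]
    newest = max((r["id"] for r in fresh), default=since_id)
    return fresh, newest
```

You would call this in a loop with a sleep between passes, feeding the returned mark back in as `since_id`, and staying within whatever rate limit the service imposes.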

Here's a poor quality video from TC Disrupt last year which tells you a bit
about how to geosearch Twitter to mine information:
<http://www.youtube.com/watch?v=_QCnPHHCTXg>

------
YuriNiyazov
Not all users are created equal. I might be running 1 process per user for a
currently logged-in user, but move all scanning for non-logged-in users into a
low-priority queue.
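
To make that concrete, here is a minimal sketch of such a queue, assuming just two priority levels. The class and constant names are made up for illustration, and this is not how any particular site does it:

```python
import heapq
import itertools

# Hypothetical priority levels: the lower number is served first.
ACTIVE, IDLE = 0, 1


class PollScheduler:
    """Orders (user, service) poll jobs so logged-in users are fetched first."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps FIFO order among jobs of equal priority.
        self._counter = itertools.count()

    def submit(self, user, service, logged_in):
        priority = ACTIVE if logged_in else IDLE
        heapq.heappush(self._heap, (priority, next(self._counter), user, service))

    def next_job(self):
        """Return the next (user, service) pair, or None if nothing is queued."""
        if not self._heap:
            return None
        _, _, user, service = heapq.heappop(self._heap)
        return user, service
```

A small pool of workers can then pull from `next_job()` instead of one process per user per service, so idle users only get scanned when there is spare capacity.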

------
daniel_iversen
I think the principle (there are many commercial and open source tools for
it; Jasper is one, I believe) is called "event stream processing" (picking
meaning out of the firehose). It's big in the financial space, of course, where
they have had these volume problems for a long time. What do you think about that?

Cheers,

Daniel
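
For illustration only, a tiny sketch of the event stream processing idea mentioned above: counting keyword hits over a sliding time window. It assumes messages arrive with non-decreasing timestamps in seconds, and the class name is hypothetical rather than taken from any real tool:

```python
from collections import Counter, deque


class SlidingKeywordCounter:
    """Counts messages mentioning each keyword within the last `window_seconds`."""

    def __init__(self, keywords, window_seconds):
        self.keywords = {k.lower() for k in keywords}
        self.window = window_seconds
        self.events = deque()  # (timestamp, keyword), oldest first
        self.counts = Counter()

    def add(self, timestamp, text):
        # Assumes timestamps never go backwards.
        self._expire(timestamp)
        # Each keyword counts once per message, however often it repeats.
        for word in set(text.lower().split()):
            if word in self.keywords:
                self.events.append((timestamp, word))
                self.counts[word] += 1

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, word = self.events.popleft()
            self.counts[word] -= 1

    def count(self, keyword, now):
        self._expire(now)
        return self.counts[keyword.lower()]
```

Real engines add query languages, joins and persistence on top, but the core is the same: keep only a bounded window of the stream in memory and update aggregates incrementally.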

