
DataSift Architecture: Realtime Datamining at 120,000 Tweets Per Second - aespinoza
http://highscalability.com/blog/2011/11/29/datasift-architecture-realtime-datamining-at-120000-tweets-p.html
======
geuis
I find the architecture to be really interesting and useful to learn from.
However, they are _way_ too expensive. 1000 tweets is not very much data. I'm
building a realtime app now and am easily processing tens of thousands of
relevant tweets a day. While a service like Datasift could alleviate a lot of
heavy lifting on my part, the cost just doesn't make up for it. It feels like
their business model is currently focused on use-cases requiring highly
specific targeting, but not intended for use where services need high volumes
of certain types of data. Shame, that.

~~~
nirvana
Where are you getting your tweet feed? My initial interest in DataSift was
because they let you get a feed from the firehose. Twitter doesn't seem to let
you do this.

All I want is tweets relevant to a particular subject, and in the early days I
don't want to be paying hundreds of dollars for it... I've got keywords and
phrases I can use to find them, if I just had access to an API that would let
me. (Maybe Twitter offers this; I couldn't find it in the past.)

~~~
geuis
I'm using the stream, but with track filtering. So far it's working very well
for my purposes. You are right that different use cases might need the
firehose, and that's where services like Gnip and DataSift really come in
handy. It's too bad that there's no middle ground.
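
For anyone wondering what track filtering amounts to, here is a rough local
sketch in Python. It assumes the semantics as I understand them from Twitter's
docs: commas separate phrases (OR), spaces separate terms within a phrase
(AND), and matching ignores case. The function names and the sample keywords
are made up for illustration; this is not Twitter's implementation and it does
not call their API.

```python
import re


def parse_track(track):
    """Split a track string into phrases; each phrase becomes a set of lowercase terms."""
    phrases = []
    for phrase in track.split(","):
        terms = set(phrase.lower().split())
        if terms:  # skip empty phrases such as a trailing comma
            phrases.append(terms)
    return phrases


def matches_track(text, phrases):
    """Return True if all terms of at least one phrase appear as tokens in text."""
    # Tokenize on word characters, keeping '#' and '@' so hashtags and mentions survive.
    tokens = set(re.findall(r"[\w#@]+", text.lower()))
    return any(terms <= tokens for terms in phrases)


# Hypothetical keywords: match "datasift" alone, or "twitter" AND "firehose" together.
phrases = parse_track("datasift, twitter firehose")

sample = [
    "DataSift is neat",
    "the firehose from Twitter",
    "just twitter",
]
relevant = [t for t in sample if matches_track(t, phrases)]
```

With the real stream the filtering happens on Twitter's side, so a check like
this would only be useful for a second pass over tweets you already received.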

------
alexro
We have recently seen a number of startups (including YC-funded ones, AFAIK)
that look for ways to make Twitter data useful. But I don't remember hearing
of any successes.

While DataSift's realtime capabilities look really impressive, I'm afraid
there aren't many use cases worth paying for data mined that way. Even
DataSift's own list of possible use cases looks bleak.

Either way, DataSift should be fine applying their expertise to other
sources of data, which don't carry the same cost as Twitter's firehose.

------
djb_hackernews
Excellent article.

I worked on a project that I integrated with DataSift.

Lorenzo and everyone else I emailed with were very quick to respond and the
product performed as expected (besides some hiccups that I can easily
understand due to growing pains).

Compared to Gnip (which we also integrated with), DataSift won hands down on
both quality of product and customer relations.

However, I'm still skeptical of the usefulness of Twitter analytics/data
mining.

~~~
kalkat
There is a lot of value in mining data off social media, but only if you can
convert that into simple and easy use cases. We are doing something like that,
and we are very happy about the way it is shaping up.

