
Driving down the cost of Big-Data analytics - DanielRibeiro
http://www.allthingsdistributed.com/2011/08/amazon-emr-on-ec2-spot-instances.html
======
espeed
What open-source tools are people using to build real-time big data analysis
systems?

Jeff Jonas of IBM doesn't recommend batch systems such as Hadoop
(<http://jeffjonas.typepad.com/jeff_jonas/2011/04/the-data-is-the-query.html>,
<http://techcrunch.com/2010/10/27/big-data/>) for real-time
context accumulation systems. I have heard that IBM is making use of
topological embedding algorithms to build distributed graph databases capable
of real-time analysis, but very little of that research is public.

~~~
noelwelsh
The production systems I know of are:

\- Yahoo's S4 (<http://s4.io/>).

\- Twitter's (née BackType's) Storm
(<http://engineering.twitter.com/2011/08/storm-is-coming-more-details-and-plans.html>).

\- Esper (<http://esper.codehaus.org/>).

While Amazon's announcement is great news for Hadoop users, I do think the
future is in these real-time systems. However, there isn't a clear winner yet,
so it is appropriate that AWS has yet to bundle one up as an offering.

~~~
espeed
The non-open-source one I keep hearing about is Vertica
(<http://www.vertica.com/resources/videos/>), which was recently acquired by
HP and was co-founded by Mike Stonebraker
(<http://en.wikipedia.org/wiki/Michael_Stonebraker>), who created Ingres and
Postgres, the predecessor of PostgreSQL.

