

Storm's first birthday - nathanmarz
http://nathanmarz.com/blog/storms-1st-birthday.html

======
newobj
What I'd like to understand at a glance about Storm is durability,
idempotency, and specifically those two things in the face of all your
standard failure modes.

Is Storm something that is going to get 99.9% of my data there which is good
enough for some side-channel processing like ad targeting, or is it something
that's D durable in the strongest sense of the word? If I'm ultimately
delivering data to something that is 5 9's durable, or better yet S3 10-11 9's
durable, is Storm going to be my "low point" of durability?

~~~
mattmcknight
Storm supports transactional topologies that can provide durability. They
depend on a repeatable spout, that is, one from which you can request the
failed input again.
<https://github.com/nathanmarz/storm/wiki/Transactional-topologies>
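
As a rough illustration of what "repeatable" means here, this is a minimal Python sketch, not Storm's API. `ReplayableSource`, `process_with_retry`, and the integer `txid` indexing are hypothetical names made up for the example. The only idea it tries to show is that a failed batch can be requested again and comes back identical.

```python
class ReplayableSource:
    """Hypothetical stand-in for a repeatable spout: batches are keyed by txid."""

    def __init__(self, records, batch_size):
        self._records = list(records)
        self._batch_size = batch_size

    def batch(self, txid):
        # The same txid always maps to the same slice of the input, so a
        # failed batch can be requested again with identical contents.
        start = txid * self._batch_size
        return self._records[start:start + self._batch_size]


def process_with_retry(source, txid, handler, max_attempts=3):
    """Re-request the batch for txid until handler succeeds or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(source.batch(txid))
        except RuntimeError:
            # Assumption: RuntimeError models a processing failure.
            if attempt == max_attempts:
                raise
```

A source that cannot replay (for example, one that pops messages off a queue and forgets them) gives the retry loop nothing to re-request, which is why the guarantee hinges on the spout.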

~~~
nathanmarz
Actually transactional topologies are superseded by Trident, which is much
easier to use:

<https://github.com/nathanmarz/storm/wiki/Trident-tutorial>

Here's an explanation on how Trident achieves fully fault-tolerant, idempotent
semantics:

<https://github.com/nathanmarz/storm/wiki/Trident-state>
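
For readers who want the gist before following the link, here is a minimal Python sketch of the general idea of storing a transaction id next to each value so that a replayed batch is applied only once. It is not Trident's API: `TxidCountState` and its methods are hypothetical, and it assumes batches arrive in txid order and that a given txid always carries the same batch.

```python
class TxidCountState:
    """Hypothetical in-memory state that makes count updates idempotent."""

    def __init__(self):
        # key -> (txid of the last applied batch, current count)
        self._store = {}

    def apply_batch(self, txid, key, increment):
        last_txid, count = self._store.get(key, (None, 0))
        if last_txid == txid:
            # This batch was already applied; a replay must not double count.
            return count
        count += increment
        # Value and txid are written together so they cannot drift apart.
        self._store[key] = (txid, count)
        return count

    def get(self, key):
        return self._store.get(key, (None, 0))[1]
```

Replaying batch 1 after a failure leaves the count unchanged, while batch 2 is applied normally. In a real store the value and txid would need to be written atomically for the same reasoning to hold.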

------
sanswork
Nice to see this. I started trying to implement Storm towards the beginning of
this year, since it seems like something I've wanted (and partially
reimplemented) so many times. I found it so time consuming to get good
resources or find a community for misc support that I gave up and
reimplemented simple parts of it as I needed them.

I'm off to buy the books and bookmark this page for future reference. Great
work Nathan.

~~~
nathanmarz
Sorry to hear that you had trouble finding support for your issues. The
mailing list is very well trafficked so hopefully you have more luck there
next time.

~~~
sanswork
No need to apologize, it was quite early in the development of the project (and
expected). I'm just glad to see the progress you've made in a year and look
forward to implementing Storm to solve a few of our issues in the near future.

------
djb_hackernews
Even if you have no use for storm, I'd strongly suggest you check out the wiki
on github and browse the source code. It is IMO the nicest, cleanest, most
exciting OSS project out there.

<https://github.com/nathanmarz/storm>

~~~
chubot
Can you expand on that? What's nice and clean and exciting about it?

Someone else posted a link to Apache Kafka, developed at LinkedIn.

<http://incubator.apache.org/kafka/design.html>

It seems like they address similar use cases... it would be interesting to see
a comparison.

~~~
djb_hackernews
Oh easy questions:

The code itself is nice and clean, take a look yourself (chosen randomly)
[1][2]

It's exciting because if you've ever done any real time processing of data
streams at scale or even been daunted by the idea of it, and you read the
wiki, it's clear storm is an exciting option. It does for Big Data Streams
what Hadoop did for Big Data.

Kafka is interesting, but they don't really address similar use cases. They
are more complementary, and in fact Storm has a spout implementation for
Kafka[3].

[1] <https://github.com/nathanmarz/storm/blob/master/src/clj/backtype/storm/daemon/supervisor.clj>
[2] <https://github.com/nathanmarz/storm/blob/master/src/jvm/backtype/storm/task/ShellBolt.java>
[3] <https://github.com/nathanmarz/storm-contrib/tree/master/storm-kafka>

~~~
dkersten
_They are more complementary and in fact storm has a spout implementation for
kafka_

I would add to this that a lot of Storm users (myself included) use Storm
together with Kafka - that is, Kafka is used to get data into (and possibly
out of) Storm while Storm does the actual processing. Kafka is more along the
lines of Kestrel and RabbitMQ.

------
someone13
One of my major wishlist items for Storm is an end-to-end tutorial on how to
get it working for a non-JVM language. I'm not a JVM coder, nor do I know much
about it, and the documentation on how to get it working for Python is hard to
follow at best, IMHO.

~~~
mattmcknight
There's about a 9 page chapter in the new Storm book (more of a large pamphlet
really) on that.
<http://www.amazon.com/Getting-Started-Storm-Jonathan-Leibiusky/dp/1449324010>
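
For a rough idea of what the non-JVM side involves: as far as I understand the multilang protocol, a shell component exchanges JSON messages with Storm over stdin/stdout, each followed by a line containing `end`. The Python below sketches only that framing plus a word-splitting step. The helper names (`read_message`, `write_message`, `split_sentence`) are made up for this example, the field names follow my reading of the protocol docs and should be checked against the wiki, and the initial handshake and task-id replies are left out entirely.

```python
import json


def read_message(stream):
    """Read lines until a line containing only 'end', then decode the JSON."""
    lines = []
    for line in stream:
        if line.strip() == "end":
            return json.loads("".join(lines))
        lines.append(line)
    return None  # stream closed before a complete message arrived


def write_message(stream, message):
    """Encode one message as JSON followed by the 'end' delimiter line."""
    stream.write(json.dumps(message) + "\nend\n")
    stream.flush()


def split_sentence(tuple_message):
    """Build one emit command per word, anchored to the input, then an ack."""
    sentence = tuple_message["tuple"][0]
    commands = [
        {"command": "emit", "anchors": [tuple_message["id"]], "tuple": [word]}
        for word in sentence.split()
    ]
    # Assumption: acking after emitting marks the input tuple as processed.
    commands.append({"command": "ack", "id": tuple_message["id"]})
    return commands
```

In practice you would more likely use the `storm.py` helper module that ships with Storm than hand-roll this loop; the sketch is only meant to show that nothing JVM-specific crosses the pipe.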

------
encoderer
Keep up the good work, Nathan. I presented Storm as a tech-talk at Formspring
and no question, you made my job easy by building a platform that makes it
easy to get a topology running without a ton of boilerplate.

------
floppydisk
I've used Storm for a couple of projects and really liked it. It significantly
simplifies building real time data pipelines and makes deploying them KISS
easy. I'm really excited to see some improved metrics coming out as well--if I
had one gripe with Storm 0.7 it was that I wanted some more metrics to monitor
cluster performance. Congrats on a year and keep up the good work, Storm is
awesome!

------
designatedInit
I know this has been repeated ad nauseam, but I'd really like to see blogs put
what their product is somewhere on the page.

~~~
marshray
LMDDGTFY <https://duckduckgo.com/?q=storm>

Hmmm...

~~~
scotty79
<https://github.com/nathanmarz/storm/wiki>

------
dkersten
I've been using Storm for a few weeks and it works great, especially Trident,
which I've been using for the past 2 or so weeks. Nathan has also been very
helpful answering my n00b questions :)

------
paperwork
So Trident is apparently a high-level API for Storm, but I don't see any
mention of it in the Storm book. Is that right?

I am very interested in Storm, but wondering if the book is already out of
date.

~~~
mattmcknight
It is a little bit out of date, in that it doesn't cover anything in 0.8.0, but
most of the basic stuff didn't change; it was more that new features were
added.

------
dmix
Are there plans to integrate Storm with Datomic?

~~~
nathanmarz
Not by the core team, though I think you could make a pretty cool Trident
state implementation for Datomic.

