Is Storm something that will get 99.9% of my data there, which is good enough for some side-channel processing like ad targeting, or is it something that's durable in the strongest sense of the word (the D in ACID)? If I'm ultimately delivering data to something that is 5 9's durable, or better yet S3's 10-11 9's durable, is Storm going to be my "low point" of durability?
"Guarantees no data loss: A realtime system must have strong guarantees about data being successfully processed. A system that drops data has a very limited set of use cases. Storm guarantees that every message will be processed, and this is in direct contrast with other systems like S4."
Here's an explanation of how Trident achieves fully fault-tolerant, idempotent semantics:
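The core trick, as the Trident docs describe it, is to store a transaction id alongside each piece of state, so a replayed batch can be detected and skipped: at-least-once delivery plus idempotent updates gives an exactly-once *effect*. A toy sketch of that idea in Python (illustrative only, not Trident's real API):

```python
class TransactionalCount:
    """Toy version of Trident-style transactional state: each value is
    stored together with the id of the transaction that last updated
    it, so replaying the same batch (same txid) is a no-op.
    Illustrative only -- not Trident's actual API."""

    def __init__(self):
        self.state = {}  # key -> (count, txid of last applied update)

    def apply_batch(self, txid, keys):
        for key in keys:
            count, last_txid = self.state.get(key, (0, None))
            if last_txid == txid:
                continue  # this batch was already applied here: skip
            self.state[key] = (count + 1, txid)

    def count(self, key):
        return self.state.get(key, (0, None))[0]

store = TransactionalCount()
store.apply_batch(1, ["cat", "dog"])
store.apply_batch(1, ["cat", "dog"])  # replayed batch 1: ignored
store.apply_batch(2, ["cat"])
assert store.count("cat") == 2
assert store.count("dog") == 1
```

The replay of batch 1 changes nothing, so duplicates from the delivery layer never double-count.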
I'm off to buy the books and bookmark this page for future reference. Great work Nathan.
Someone else posted a link to Apache Kafka, developed at LinkedIn.
It seems like they address similar use cases... it would be interesting to see a comparison.
The code itself is nice and clean; take a look for yourself (chosen randomly)
It's exciting because if you've ever done any real-time processing of data streams at scale, or even been daunted by the idea of it, then after reading the wiki it's clear Storm is a compelling option. It does for Big Data Streams what Hadoop did for Big Data.
Kafka is interesting, but it doesn't really address the same use cases. The two are more complementary, and in fact Storm has a spout implementation for Kafka.
I would add to this that a lot of Storm users (myself included) use Storm together with Kafka - that is, Kafka is used to get data into (and possibly out of) Storm while Storm does the actual processing. Kafka is more along the lines of Kestrel and RabbitMQ.
I am very interested in Storm, but I'm wondering whether the book is already out of date.