

How FlightCaster Squeezes Predictions from Flight Data - pskomoroch
http://www.datawrangling.com/how-flightcaster-squeezes-predictions-from-flight-data

======
jimbokun
These guys appear to be having a hell of a lot of fun. Their technique of
wrapping a stack from Amazon EC2 through Hadoop all the way up into Clojure
was the kind of thing I wondered about being possible, so it's pretty awesome
to hear it is being done and done well by someone. The idea of iterating with
Clojure in a REPL on a small dataset to develop or refine an algorithm, then
pressing a button and seeing how it does running on some large dataset on EC2,
sounds sublime.

Even if they never release any of the glue code that makes all this happen,
just knowing it is possible is very encouraging.

~~~
pskomoroch
The real-world Lisp/Clojure + Hadoop approach is definitely fun to hear about.
Also interesting to see a YCombinator team with 8+ people including a domain
expert. Mashing up 4+ messy data sources is tough to do. Very unconventional
on many fronts.

~~~
jaf12duke
It's totally fun--we love it. Everyone has a role that they own and we trust
each other to execute. Stay tuned--this is just a small slice of what's
coming...

Jason (@FlightCaster)

------
pskomoroch
An “in the trenches” interview on building a machine learning application with
Rails & Hadoop. During the interview on FlightCaster, Brad describes some of
the challenges of working with flight data, statistical approaches for flight
prediction, false negatives in FlightCaster, Clojure, Hadoop & Amazon EC2,
YCombinator, and more. Was pleasantly surprised at how open Brad was about the
model internals and data crunching pitfalls.

------
caffeine
In the article, it mentions Bradford's Amazon wishlist.

For the curious: <http://www.amazon.com/gp/registry/wishlist/3RB4REDIKE28I>

And those that have been purchased (a better list):
[http://www.amazon.com/gp/registry/wishlist/3RB4REDIKE28I?rev...](http://www.amazon.com/gp/registry/wishlist/3RB4REDIKE28I?reveal=purchased)

~~~
bradfordcross
FYI, the purchased books list is not very representative since I buy so many
books directly without flowing them through the wish list.

My book lists are better than my wish lists, not just because there is stuff
in the book lists that is not in the wish lists, but also because I take time
to maintain the book lists and only include the really good stuff.

I'm planning to update my book lists soon with recommendations for statistics,
AI, machine learning, and other treasures.

Here are my book lists as of now:

[http://www.amazon.com/gp/richpub/listmania/byauthor/A1JKHQFC...](http://www.amazon.com/gp/richpub/listmania/byauthor/A1JKHQFC9WMPN5)

------
hexis
It's great to see someone taking piles of data and pulling some meaning out of
it. I think it's easy enough these days to be able to see the potential of
data-mining a site like Facebook, but I expect a lot of value to come out of
sites like FlightCaster that are getting value in domains that folks don't
normally think of as being data-intensive. Google was more or less data-mining,
but it was a relatively easy set of data to access: public websites. Now,
we're seeing the exploitation of more obscure, but not necessarily less
valuable, data.

~~~
jaf12duke
Thanks, Hexis. The key for us was being able to combine deep domain expertise
with Brad's data-mining capabilities. The model is a nice mix of statistical
induction and domain-based logic. We're adding more data sources to it, so
both the power and the capabilities of the algorithms will only get better.

~Jason (@FlightCaster)

------
cianchette
Great interview. It's amazing that they built all of this during the past
couple of months. Awesome work guys!

~~~
jaf12duke
Thanks!

------
madang
An informative interview providing a better understanding of the amazing work
this team has done. A well-balanced team with great creativity, energy,
dedication, and perseverance, not to mention their awesome talent. Great
teamwork resulting in a quality product. You guys rock!

~~~
jaf12duke
Much appreciated! Thanks for the kind words. We're really excited to push out
a lot more stuff in the next few months...

~Jason (@FlightCaster)

------
lucraft
Yes, but are their predictions correct? Anyone tried them out?

------
physcab
Great interview. I'm more curious now as to each of your personal histories.
Each of you seems like an incredibly gifted domain expert.

