
How FlightCaster (YC S09) was built (RoR + JVM + Clojure + Hadoop) - zemariamm
http://groups.google.com/group/clojure/browse_thread/thread/4e2b193812c59bdb/9203e07d197dbd29?hl=en#9203e07d197dbd29
======
minalecs
This is an awesome post, would love to see more overview and analysis like
this from more of the YC companies. Good stuff.

~~~
dschobel
I sent it to all my functional programming fancying friends who are working at
big anonymous corporations in ten year old codebases of OO code and their
response was equal parts despair and envy.

~~~
bradfordcross
I prefer a functional to an imperative style, but IMO, let's not bash OO too
badly. :-)

I think the kind of message-centric OO that Alan Kay was talking about was
very different from what we see in practice today.

Likewise, if you read "The Art of the Metaobject Protocol" there are some deep
insights about the configurability and power you get when you have pre and
post hooks into everything in the system. MOP systems are very powerful.

It is also nice to have the implicit "this," "self" or "message recipient"
that you don't have to pass around all the time into the same family of
functions (methods.)

The kind of typical "enterprise java" code that you see in the wild is what I
call "class-oriented programming" - using classes as containers for procedural
code. This is at the extreme end of crappiness for the imperative world.

If you go back to the initial spirit of OO, and combine that together with
techniques like constructor injection with good citizenship, lots of immutable
value objects, and polymorphic strategies, you get a style of OO that is more
friendly with FP. It still isn't that delightful in verbose languages like
java, but is a lot better than the norm.

The problem is that not a lot of people deeply grok OO in this way and you
don't run into projects using this style often.

But let's not bash OO, let's learn from the cool stuff and throw out the
garbage.
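
To make that style concrete, here is a minimal sketch in Python (chosen only for brevity) of constructor injection, an immutable value object, and polymorphic strategies working together. Every name in it (`Flight`, `DelayStrategy`, `NoDelay`, `FixedPadding`, `ArrivalEstimator`) is hypothetical and invented for illustration; it is not code from any real project.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Flight:
    """Immutable value object: fields cannot be reassigned after creation."""
    origin: str
    destination: str
    scheduled_minutes: int


class DelayStrategy(Protocol):
    """Polymorphic strategy: any object with this method can be plugged in."""
    def estimate(self, flight: Flight) -> int: ...


class NoDelay:
    """Strategy that assumes the flight runs exactly to schedule."""
    def estimate(self, flight: Flight) -> int:
        return 0


class FixedPadding:
    """Strategy that always adds the same number of minutes."""
    def __init__(self, minutes: int) -> None:
        self._minutes = minutes

    def estimate(self, flight: Flight) -> int:
        return self._minutes


class ArrivalEstimator:
    """Receives its collaborator through the constructor (constructor injection)."""
    def __init__(self, strategy: DelayStrategy) -> None:
        self._strategy = strategy

    def total_minutes(self, flight: Flight) -> int:
        return flight.scheduled_minutes + self._strategy.estimate(flight)


flight = Flight("NYC", "SFO", 360)
on_time = ArrivalEstimator(NoDelay()).total_minutes(flight)
padded = ArrivalEstimator(FixedPadding(45)).total_minutes(flight)
```

Because `ArrivalEstimator` never constructs its own strategy and `Flight` never changes, each piece behaves much like a pure function over its inputs, which is roughly why this flavour of OO sits comfortably next to FP.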

~~~
greentree
"If you go back to the initial spirit of OO, and combine that together with
techniques like constructor injection with good citizenship, lots of immutable
value objects, and polymorphic strategies, you get a style of OO that is more
friendly with FP. It still isn't that delightful in verbose languages like
java, but is a lot better than the norm."

Any references for this style of development? Have you written about your
development/architecture practices anywhere?

------
physcab
I'm curious to know what type of machine learning algorithms are being used
for the prediction. Anyone have any thoughts?

~~~
zemariamm
Me too, but I would also love to know which data sources they are using.

~~~
gojomo
They may consider their methods too proprietary to supply details, so I'll
speculate:

Information on earlier-in-day flights alone would probably be enough (mixed
with historical data) to be fairly accurate about later-in-day flight delays
-- even those that aren't continuations of the same plane -- because of giant
overlaps in delay-causing conditions (weather, airport/mechanical mishaps,
crew issues, etc.).

Weather forecasts might give another advantage, especially in predicting
'seed' delays that then hint at later delays.

If there are any other semi-public feeds related to FAA reporting or air-
traffic control -- even if mostly meant for other pilots or General Aviation
-- those would be incredibly valuable.

 _If_ those regional and national maps of planes-in-flight also contain
sufficient positioning detail to notice when they're spending a little extra
time on runways, or waiting for/at gates, etc. -- another positive early
influencer for predictions.

~~~
physcab
I don't think it would be harmful for them to give a clue about what
algorithms they use. There is so much tuning required to get these algorithms
to perform the way you'd expect that I think they could still keep their IP
locked up even if they gave a general hint.

With that said, I'll speculate as well: Perhaps you'd need some type of ideal
dataset, one that included departure times/arrivals and distances and weather
conditions of flights that came in on time as expected. Then you might start
introducing some noisy data, i.e. flights that made the same trip but came in
late or early with the same weather conditions. Then you'd add in the effect of
weather conditions and see how flights fared. I'd speculate that you could get
away with doing some type of regression analysis, but maybe you'd need to
resort to a more complex algorithm for classifying ("On-time", "Early",
"Late") based on a series of features ("Distance","Weather","Mechanical",
"Time", etc). SVM could pull this off, or perhaps even a naive bayesian
classifier. For research purposes I'd probably check out RVM because it might
need less information to classify. Not sure if it would be realistic to use it
though...this problem is in need of a highly scalable solution.
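
As a rough illustration of the naive Bayes idea speculated about above, here is a minimal from-scratch sketch in Python. It is not FlightCaster's method (which is unknown here). The feature names, the category values, and every training row are invented toy data, and add-one smoothing is an assumption of this sketch.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy records: (distance, weather, time_of_day) -> outcome.
# These rows are invented for illustration; they are not real flight data.
TRAINING = [
    (("short", "clear", "morning"), "On-time"),
    (("short", "clear", "evening"), "On-time"),
    (("long", "clear", "morning"), "On-time"),
    (("long", "storm", "evening"), "Late"),
    (("short", "storm", "evening"), "Late"),
    (("long", "storm", "morning"), "Late"),
    (("long", "clear", "evening"), "Early"),
]


def train(rows):
    """Count class frequencies and per-feature value frequencies per class."""
    class_counts = Counter(label for _, label in rows)
    # value_counts[label][feature_index][value] -> count
    value_counts = defaultdict(lambda: defaultdict(Counter))
    vocab = defaultdict(set)  # distinct values seen for each feature index
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[label][i][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, features):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(features):
            seen = value_counts[label][i][value]
            # Smoothed likelihood of this feature value given the class.
            score += math.log((seen + 1) / (count + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
prediction = predict(model, ("long", "storm", "evening"))
print(prediction)  # Late
```

On seven invented rows this only shows the mechanics. A real system would need far more data, real-valued features, and the tuning mentioned above.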

------
euroclydon
I wonder if they'll be able to transition all that logic toward analyzing a
different set of data if predicting flight delays doesn't prove that
lucrative? I guess if they take this far enough though, the government or an
airline consortium might just buy it.

~~~
bradfordcross
Yes. Some people seem to want us to do traffic too. Seems like we've gotten
ourselves into trouble by attacking the messy problems. :-)

------
brown9-2
Very interesting read.

Let's say I know next to nothing about Clojure, Lisp, functional programming,
etc.

Could anyone suggest some resources for a beginner on this topic who is
interested in learning more about functional programming? Some tutorials
perhaps?

~~~
tim_sw
Programming Clojure is good for Java devs coming to Clojure. For functional
programming, check out Learn You a Haskell (<http://learnyouahaskell.com/>) for
Haskell. The Little Schemer (and its subsequent books) is good too; I
personally started from The Little Schemer a few years back.

------
zandorg
Sounds like a great service - I'm not in the USA and don't travel much, but it
sounds like some great data mining!

------
californiaguy
Wait, so... this tells me that my flight is "probably" delayed?

What if I take that as gospel and show up to the airport late or otherwise
make plans based on that data and then it turns out it wasn't actually
delayed?

Isn't the equilibrium action for me to show up at the airport on time no
matter what?

~~~
physio
Agreed; this is fairly useless on its own. However, the TC blurb says:

 _"In the future, the company plans to offer a list of alternative flights so
you can quickly rebook once you learn of a delay."_

I think the real application of this is going to be purchasing fully-
refundable tickets, then switching an 85%-likely-delayed flight for an
85%-likely-on-time flight, for a 72% chance of making the right call.
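
The 72% figure is just the product of the two probabilities, which only holds if the two predictions are assumed to be independent (an assumption, not something established here). A minimal sketch of that arithmetic, with hypothetical variable names:

```python
# Both 85% figures come from the paragraph above; independence is assumed.
p_delayed_call_correct = 0.85   # the flight you leave really is delayed
p_on_time_call_correct = 0.85   # the flight you switch to really is on time

# Probability that both predictions turn out right at once.
p_both_correct = p_delayed_call_correct * p_on_time_call_correct
print(f"{p_both_correct:.4f}")  # 0.7225
```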

If enough people do this, airlines are going to end up cancelling entire
flights when everyone switches their tickets. This will mean they either A)
offer cheaper flights on the cancelled-then-rebooked "new" flight for anyone
who might need it, B) offer discounts if you DON'T cancel, or C) raise the
prices of refundable tickets so high that you will go to another carrier at
the outset.

Better information for customers inevitably leads to more competition and
lower prices.

Of course airlines are barely surviving as it is, so this kind of thing would
(eventually) kill off some number of stragglers.

On an editorial note, I say _good riddance_. As far as I'm concerned, the
entire airline industry can go out of business for not standing up for their
customers. I don't want to be physically molested or scanned naked, deal with
"freedom baggies", a lack of water, taking off my shoes (note in other public
venues you MUST wear shoes due to health codes), power-tripping morons, people
rifling through my luggage, stuff stolen out of my luggage, late luggage,
damaged luggage, "lost" luggage, showing my ID, showing my ID to 3 different
people, mission creep leading to arrests for NON-safety-related issues,
"behavior detection" specialists looking to harass nervous and/or agitated
individuals, "no fly lists", quasi-police-powers bestowed upon flight
attendants so "interfering with a flight crew", e.g. arguing with a
stewardess, is now a federal crime; a hundred other things, all capped off by
_secret laws_ which heretofore had always been held to be _unconstitutional_ ,
but now we aren't allowed to know what the laws are, pertaining to aviation
security.

Finally, if you want to know if a given flight will be late, the answer "yes"
also works about 85% of the time.

~~~
greyboy
_If enough people do this, airlines are going to end up cancelling entire
flights when everyone switches their tickets._

I have a family member who spent many years working for a large carrier, and
this isn't how the airline industry works (in America, and I assume in many
other nations). Flights are almost never cancelled except due to mechanical
failure, crew fatigue, or inclement weather. The reason is that the same plane
taking you from NYC to SFO is also the plane that 130 people are waiting for
to go from SFO to SEA. So, whether it is 100% full or only has 2 people, it
still makes the
trip. Have you ever been on a plane with only 2 passengers aboard? I have -
and you can pick any seat you want! They will even send an empty plane out
(with crew, of course).

 _C) raise the prices of refundable tickets so high that you will go to
another carrier at the outset._

Refundable tickets are already that high - almost nobody buys them. They are
already 3-5x as high as your average discount ticket and almost solely
purchased by business travellers or foreigners who need that flexibility. The
99% of cow-herded casual travellers buy the bottom-of-the-barrel discount
tickets, usually from Travelocity and the like, with the most restrictions.
Unfortunately, I don't see that changing unless there is a massive uprooting
of the current airline industry.

