
Analyzing Caltrain Delays - jsweojtj
http://www.svds.com/the-trains-project-analyzing-caltrain-delays/
======
mattcaywood
Data scientists blog about Caltrain data, come up with a convoluted hypothesis
about bias in sensors at two stations.

Commenter on blog notices that Caltrain is occasionally single-tracking
between those stations due to a bridge replacement. [1]

"Data science" ends up with a bloody neck from Occam's razor.

[1]
[http://www.caltrain.com/projectsplans/Projects/Caltrain_Capi...](http://www.caltrain.com/projectsplans/Projects/Caltrain_Capital_Program/San_Mateo_Bridges_Replacement_Project.html)

~~~
jsweojtj
There are a couple of reasons that this isn't the explanation.

To pick one, the random data selection in the blog post showed data from
October 2015 -- Feb 2016, and this Caltrain link appears to show the bridge
work starting Feb 26th, 2016.

So no, the just-so story doesn't appear to be just-so.

~~~
mattcaywood
The data is very much consistent with the bridge work hypothesis. The website
indicates that a series of bridges is being replaced, starting 9/28/15 with
Tilton Ave. and proceeding northward to Monte Diablo Ave., then Santa Inez Ave.

------
guard-of-terra
Seriously, need to break that vicious cycle: you get used to things going bad,
people who run them get used to not delivering, breaking due process, posting
unrealistic schedules, and BAM - now you think the problem can't be solved,
only plotted.

I don't remember the last time I failed to see a train at a station at its
scheduled time, and the place where I live doesn't inspire confidence. When
something is late, it gets in the news. Not every week.

~~~
ZanyProgrammer
To be (grudgingly) fair to Caltrain, a lot of things are perhaps indirectly
out of their control: stuff like suicides (until you get gated platforms to
stop people from jumping in front of Baby Bullets) and vehicles stopped on
tracks is hard to predict. Likewise, they are a victim of their own success:
standing-room-only trains make boarding and unboarding take longer than they
have to.

Of course, Caltrain has put off electrification and grade separation for a
long time, and even now it's massively expensive and ruinous (see the CBOSS
fiasco), so it's not like they are totally off the hook.

~~~
archagon
I can't help but feel that if there was a BART line where the Caltrain
currently runs, things would be about a hundred times better than they
currently are, on all fronts. Alas...

~~~
ZanyProgrammer
Electrified Caltrain is infinitely preferable to BART.

~~~
nulltype
Why electrified vs. normal Caltrain? Will that decrease the variance or the
door-to-door time of using Caltrain?

~~~
avuserow
Faster acceleration. Each stop will cost less time, and Caltrain has a
surprising number of stops between SF and SJ (22 on weekdays) for its length
(47 miles). You can see this effect already in the number of express schedules
that Caltrain has and their varying trip times.
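
To put rough numbers on the acceleration argument, here is a back-of-envelope sketch. It assumes constant acceleration and braking, and that the train reaches cruise speed between stops. The cruise speed and the two acceleration rates are illustrative assumptions, not measured Caltrain figures; only the 22-stop count comes from the comment above.

```python
def stop_penalty_s(cruise_mps, accel, decel, dwell_s=0.0):
    """Seconds one stop adds compared with passing through at cruise speed.

    Braking from v at constant rate d takes v/d seconds but covers only
    v^2/(2d) metres, which would take v/(2d) seconds at cruise speed, so the
    loss is v/(2d). The same reasoning gives v/(2a) for accelerating again.
    """
    return cruise_mps / (2 * accel) + cruise_mps / (2 * decel) + dwell_s


def time_saved_s(stops, cruise_mps, old_accel, new_accel, decel):
    """Total seconds saved over `stops` stops by accelerating faster."""
    old = stop_penalty_s(cruise_mps, old_accel, decel)
    new = stop_penalty_s(cruise_mps, new_accel, decel)
    return stops * (old - new)


# All rates below are assumed for illustration only.
CRUISE_MPS = 35.0      # roughly 78 mph (assumption)
DIESEL_ACCEL = 0.25    # m/s^2 (assumption)
ELECTRIC_ACCEL = 1.0   # m/s^2 (assumption)
BRAKING = 1.0          # m/s^2, same for both (assumption)

saved = time_saved_s(22, CRUISE_MPS, DIESEL_ACCEL, ELECTRIC_ACCEL, BRAKING)
print(f"Saved over 22 stops: {saved / 60:.1f} minutes")
```

The saving scales linearly with the number of stops, so under these assumptions a train that stops less often gains proportionally less.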

~~~
nulltype
Yeah, I couldn't find any estimates of how much time, though. It seems like
the express trains would not be affected much at all.

------
ZanyProgrammer
If only Caltrain had a decent API for developers that produced the data
needed for any serious analysis (like train number, speed and lat/long). I
hate having to resort to hacky workarounds like scraping for this info. I
believe the NextBus API for MUNI has actual positional data and vehicle info.
Caltrain, being the mediocre agency it is, has a next-to-useless API (if I
remember the docs on 511.org correctly).

~~~
jsweojtj
An API would go a long way, and realtime information would obviously be
awesome. Even if Caltrain only provided historical data of actual departure
times per train per station, that would be a huge improvement.
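
As a sketch of what that kind of historical data would enable: given scheduled and actual departure times per train per station, the average delay at each station is a few lines of code. The record layout and the sample rows here are hypothetical, invented only to show the shape of the calculation; they are not real Caltrain data or a real Caltrain format.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical rows: (train number, station, scheduled departure, actual departure).
# These values are made up for illustration.
SAMPLE_ROWS = [
    ("101", "San Mateo", "2016-01-04 07:10", "2016-01-04 07:13"),
    ("103", "San Mateo", "2016-01-04 07:40", "2016-01-04 07:41"),
    ("101", "Burlingame", "2016-01-04 07:14", "2016-01-04 07:14"),
]

TIME_FORMAT = "%Y-%m-%d %H:%M"


def mean_delay_by_station(rows):
    """Return {station: mean delay in minutes} for the given departure rows."""
    delays = defaultdict(list)
    for _train, station, scheduled, actual in rows:
        delta = (datetime.strptime(actual, TIME_FORMAT)
                 - datetime.strptime(scheduled, TIME_FORMAT))
        delays[station].append(delta.total_seconds() / 60)
    return {station: mean(values) for station, values in delays.items()}


for station, delay in mean_delay_by_station(SAMPLE_ROWS).items():
    print(f"{station}: {delay:.1f} min average delay")
```

Grouping by train number or by date instead of by station would be the same loop with a different key.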

