MTA Releases Subway Time App (mta.info)
45 points by memset 1545 days ago | 40 comments

This app also has a web version: http://apps.mta.info/traintime/

Which means the MTA has unintentionally developed an API: http://advisory.mtanyct.info/Test/gtfs/scraper.php?line=1...

They have a real API: http://datamine.mta.info/

I doubt it was unintentional. I know some guys at MTA who work on this stuff, and they are all about open data and APIs and robots etc.

The fact this information is finally out there is fantastic. But for God's sake, if this is the case:

“The existence of this sleek digital interface barely hints at the investment that had to be made in terms of hardware and infrastructure to make this enormous public benefit a reality,”

Why is this literally the worst app I've ever used? A weird old pixellated icon, non-retina bitmapped user interface elements, scrolling that doesn't continue and decelerate once you let go... not even any clear indication of which trains are uptown or downtown, no map to more easily pick a station, etc...

If they've been spending millions on the infrastructure to make this possible, couldn't they have spent more than what I would guess to be about $500 producing this app? It's like a third-grader wrote it. A bad third-grader.

I live in NYC and have been following the city's technology initiatives closely. The silver lining here is that the city has come to terms with the fact that they are le suck at making awesomely designed consumer apps, and releasing the data to the public will be beneficial for all. My favorite part of this post is the "Commitment to Open Data".

Someone out there will make something amazing with this data, it's practically inevitable.

Speaking of, if you live in NYC you should come to this Reinvent Payphones Hackathon - same sentiment (full disclosure, my startup is helping to organize): http://reinventpayphones.prototypehackday.com/

The availability of arrival times is just a side effect of the infrastructure upgrades. The real reason for upgrading the infrastructure was to be able to run trains closer together.

Also, iPhone development is more difficult in real life than in your imagination. It should be easy to wrap a nice interface around a protobuf, but it's actually a rather tedious process.

> Why is this literally the worst app I've ever used?

I've used apps that have worse interfaces and don't do anything useful.

Boston's MBTA has had similar services available for years. About 3 years ago they rolled out tracking for the Orange, Red and Blue lines. Less than 2 years ago, they rolled out GPS tracking for buses. Recently, they started adding arrival time boards for the Orange, Red and Blue Lines as well.

* MBTA App Showcase: http://www.mbta.com/rider_tools/apps/

* MassDOT Developer Page: http://www.eot.state.ma.us/default.asp?pgid=content/develope...

Additionally, cellular service has been added to most subway lines and stations for at least the orange line (so you have decent signal while riding and in stations), plus WiFi available on most commuter rail trains.

> “This is what generations of dreamers and futurists have waited for,” said MTA Chairman and CEO Joseph J. Lhota. “The ability to get subway arrival time at street level is here.

The breathlessness of the coverage of this app makes me wonder how much money was spent on it. First of all, the main problem for New Yorkers is the lack of this data for all the other train lines (i.e., A-Z). Second, the MTA has, to its credit, made this data easy to get, as others have posted. And apparently Android maps already use this information... so why build an app except to get some public goodwill? Not that public gestures are bad, but it depends on the cost (puts 'file information request for app dev contract' on things to do sometime).

> "First of all, the main problem for New Yorkers is the lack of this data for all the other train lines (i.e., A-Z)."

Unfortunately this is not going to be fixed any time soon. BMT/IND trains have a distributed command and control system. Multiple signal towers control train movement within their own sectors of the system - and each signal tower has no information about the rest of the system.

This was necessary in the early days, before electronics existed.

So to this day there is no centralized snapshot of the entire system state, unlike the numbered lines, and thus no ability to provide an API.

This is a great step forward. I look forward to the day we can have a big jumbotron with real time locations of trains.

What legacy system could possibly prevent one from simply recording the position of every train every few seconds? This is not a particularly difficult problem for a billion-dollar business to solve.

One built in 1904.

The control systems of the MTA are purely electromechanical. It goes way beyond what one would normally call a "legacy system".

You can't record the position of trains because there is no way to snapshot the whole system. Only the signal tower local to that section of the system is aware of what's going on in its zone. This is a vestige of the design where there was no reasonable way to centralize that amount of signaling over such great distances.

To get a snapshot of the whole system you'd have to simultaneously snapshot the state of every signal tower in the system and find some way to collect this information centrally. This sounds reasonable until you realize that the control system, being 50 years pre-computer, is unaware of individual trains. It doesn't detect trains, only presence of at least one car on a section of track (detected by the third rail making contact with the train).

To ascertain which train is where you'd have to either have dispatchers identify the train by communicating with operators (hard to poll), or infer the identity of a train by monitoring state. Both are quite difficult when all of your data is spread all over the entirety of NYC in isolated distributed control centers.

The IRT trains spent billions centralizing all command and control into a single building in Manhattan, and thus snapshotting state is trivially easy. The BMT/IND segments of the system are substantially bigger, with more stations, than even that. A full modernization to electronic control is necessary, which will cost many billions.
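To make the "infer the identity of a train by monitoring state" idea concrete, here is a toy sketch in Python. It assumes a single track with no passing, exactly one occupied block per train, and no train entering or leaving between two snapshots; none of that comes from the MTA, and `follow_trains` is a hypothetical name. Under those assumptions the order of trains never changes, so labels can be carried from one occupancy snapshot to the next by position alone.

```python
def follow_trains(labels_by_block, occupied_now):
    """Carry train labels from the previous snapshot to the current one.

    labels_by_block: dict mapping block index -> train label (previous snapshot).
    occupied_now: iterable of block indices that are occupied now.
    Assumes trains cannot pass each other, so their order along the track is kept.
    """
    old_blocks = sorted(labels_by_block)
    new_blocks = sorted(occupied_now)
    if len(old_blocks) != len(new_blocks):
        # A train entered, left, or straddles two blocks: order alone is not enough.
        raise ValueError("train count changed; cannot match by order alone")
    return {new: labels_by_block[old] for old, new in zip(old_blocks, new_blocks)}


if __name__ == "__main__":
    previous = {2: "train A", 5: "train B"}
    print(follow_trains(previous, {3, 5}))
```

The hard part described above is exactly what this toy skips: the occupancy data lives in separate towers, and trains do enter, leave, and straddle blocks.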

I don't care about the signal towers or how old the system is. The point is that the technology for identifying the location and speed of a few hundred giant hunks of metal moving around on fixed tracks is trivial and does not need to interact with the old system (or humans) at all. Just tag each train with a few RFID tags, and run a few hundred miles of wiring through the tunnels.

I mean, are you telling me that for 100 million dollars this couldn't be done?

So on the decentralized systems, how does one station know when to delay a currently-stopped train because there's a delayed train farther up the route? Do signal towers up ahead simply transmit to adjacent stations "Delayed train up ahead?"

Pretty much. A signal tower usually controls several stations, so delays up ahead are usually contained within the zone of control of a signal tower.

In cases where delays are more substantial and you need to hold trains further upstream, this is communicated manually between towers.

Much of the control here is not based on semantic data, but rather verbal. The actual control system itself isn't capable of much more than indicating which piece of track is occupied.

I would guess it does not know there is a delayed train on the route, but just that there is a train on the route.

It is fairly easy to hook up a "low resistance between the rails" (aka "wheels on the rails", aka "there is a train here") detector to a 'do not enter' signal a few hundred meters up, and to a 'drive slowly' signal some distance before that (http://en.wikipedia.org/wiki/Automatic_Block_Signal).
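A rough sketch of that block-signal logic in Python, assuming a simplified three-aspect scheme (stop, caution, clear) and trains moving toward higher block indices. The function name and the aspect names are made up for illustration; real subway signaling has more states than this.

```python
def signal_aspects(occupied):
    """Return the aspect shown at the entrance of each block.

    occupied[i] is True when the track circuit for block i detects a train.
    Trains are assumed to travel toward higher block indices.
    """
    aspects = []
    for i, is_occupied in enumerate(occupied):
        if is_occupied:
            aspects.append("stop")      # 'do not enter': a train is in this block
        elif i + 1 < len(occupied) and occupied[i + 1]:
            aspects.append("caution")   # 'drive slowly': the next block is occupied
        else:
            aspects.append("clear")
    return aspects


if __name__ == "__main__":
    print(signal_aspects([False, False, True, False]))
```

Note that nothing here knows which train is in a block or whether it is late, only that the block is occupied.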

I thought the presence of the train on a given section of track was actually detected by the train's wheels completing the circuit between the rails, not by the third rail.

Let's assume that there's a byzantine system in which two parallel and unconnected systems control both BMT/IND trains...at some point, each train communicates arrival and destination data to its respective system. Why can't both systems report to a single controller? At some point, this information is centralized (dear God I hope it is, especially for evaluating performance metrics)...I can't see how it would be hard to provide this digitized data format at real time.

For starters: there's no electronics involved. It's all relays and stuff.

I was looking at this earlier, there's an API too that seems pretty simple to use:


(1) Developer key: http://datamine.mta.info/user/register

(2) The data is in Protocol Buffer format, so get that: http://code.google.com/p/protobuf/downloads/list

(3) The proto definition of the GTFS-realtime feed: https://developers.google.com/transit/gtfs-realtime/gtfs-rea...

(4) Some static data (which has IDs for each station, etc). http://www.mta.info/developers/data/nyct/subway/google_trans...

(5) You can just curl the URL for all the current data (with your developer key):

  curl "http://datamine.mta.info/mta_esi.php?key=<developerkey>" -o /tmp/mtafeed

(6) And you can decode it using protobuffers (protoc is from 2 & gtfs-realtime.proto is from 3)

  cat /tmp/mtafeed | /usr/bin/protoc -I /tmp /tmp/gtfs-realtime.proto --decode=transit_realtime.FeedMessage > /tmp/decodedmtafeed

(7) Now you have a file full of messages like this:

  entity {
    id: "000170"
    trip_update {
      trip {
        trip_id: "078600_4..S44R"
        start_date: "20121228"
        route_id: "4"
        1001 {
          1: "04 1306  WDL/UTI"
          2: 1
          3: 3
  [ some data removed ]
      stop_time_update {
        arrival {
          time: 1356720373
        }
        departure {
          time: 1356720373
        }
        stop_id: "635S"
        1001 {
          1: "2"
  [ some data removed ]

Each "entity" looks like a "trip", i.e. a train. It says where the train is and when it will arrive at various stops. I found more documentation here: https://developers.google.com/transit/gtfs-realtime/referenc... and here: http://httqa.mta.info/developers/pdfs/GTFS-Realtime-NYC-Subw... but I still need to parse it all. It seems to be basically like this:

In the "entity":

- trip_id: "078600_4..S44R"
  * 078600 is a train id or something
  * _4.. means it's a 4 train (this is also available on the route_id field)
  * S means it's heading south
  * 44R describes the stops this train will make. I'm not entirely sure how to parse this.
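A small Python sketch of splitting a trip_id along those lines. The pattern is inferred from this single example rather than taken from the MTA spec, so treat the field names (`origin_code`, `path`) and the "N means north" branch as assumptions. The first field is kept as an opaque string; it might encode the origin time (786.00 minutes after midnight would be 13:06, which lines up with the "1306" in the 1001 block), but that is only a guess here.

```python
import re

# Assumed layout: <6 digits>_<route>..<N|S><path code>, based on one sample only.
TRIP_ID_RE = re.compile(
    r"^(?P<origin>\d{6})_(?P<route>[^.]+)\.\.(?P<direction>[NS])(?P<path>.*)$"
)


def parse_trip_id(trip_id):
    """Split a trip_id such as "078600_4..S44R" into its apparent parts."""
    match = TRIP_ID_RE.match(trip_id)
    if match is None:
        raise ValueError(f"unrecognized trip_id: {trip_id!r}")
    return {
        "origin_code": match.group("origin"),  # meaning not confirmed
        "route": match.group("route"),
        # "S" is south per the example; "N" for north is an assumption.
        "direction": "south" if match.group("direction") == "S" else "north",
        "path": match.group("path"),  # e.g. "44R", not decoded further
    }


if __name__ == "__main__":
    print(parse_trip_id("078600_4..S44R"))
```

Trip ids that don't fit this shape raise a ValueError instead of being guessed at.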

In the stop_time_update:

- stop_id: "635S"
  * you can find the stop codes in stops.txt from (4). This one is "635S,,14 St - Union Sq,,40.734673,-73.989951,,,0,635"
- arrival / time: 1356720373

  $ date -d @1356720373
  Fri Dec 28 13:46:13 EST 2012

So, I guess there was a downtown 4 train, scheduled to stop at Union Square at 1:46:13 PM. (I downloaded this at about 1:30 PM Eastern, so that seems right.)
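Putting the stop lookup and the timestamp together, here is a minimal Python sketch that scans the decoded text output for stop_id / arrival pairs and prints them with station names. It is a naive line scanner, not a real protobuf parser, and it makes several assumptions: arrival appears before stop_id inside each stop_time_update (as in the sample), stops.txt has the stop id in column 0 and the name in column 2 (as in the row quoted above), and a fixed UTC-5 offset stands in for Eastern time. All function names are hypothetical.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-5 (EST), which matches the December example.
# A real tool would use a time zone database to handle daylight saving time.
EST = timezone(timedelta(hours=-5), "EST")


def stop_arrivals(decoded_text):
    """Return (stop_id, arrival_epoch) pairs from protoc --decode text output."""
    results = []
    in_update, section, arrival = False, None, None
    for raw in decoded_text.splitlines():
        line = raw.strip()
        if line.startswith("stop_time_update {"):
            in_update, section, arrival = True, None, None
        elif line.startswith(("entity {", "vehicle {")):
            in_update = False  # ignore stop_id fields outside stop_time_update
        elif line.startswith("arrival {"):
            section = "arrival"
        elif line.startswith("departure {"):
            section = "departure"
        elif line.startswith("time:") and section == "arrival":
            arrival = int(line.split(":", 1)[1])
        elif line.startswith("stop_id:") and in_update:
            stop_id = line.split(":", 1)[1].strip().strip('"')
            results.append((stop_id, arrival))
            in_update = False
    return results


def load_stop_names(stops_csv_text):
    """Map stop_id (column 0) to stop name (column 2) from stops.txt content."""
    names = {}
    for row in csv.reader(io.StringIO(stops_csv_text)):
        if len(row) >= 3:
            names[row[0]] = row[2]
    return names


def describe_arrival(stop_id, epoch_seconds, stop_names):
    """Format one arrival as '<station> at <local time>'."""
    name = stop_names.get(stop_id, stop_id)
    if epoch_seconds is None:
        return f"{name} (no arrival time)"
    when = datetime.fromtimestamp(epoch_seconds, tz=EST)
    return f"{name} at {when:%a %b %d %H:%M:%S %Z %Y}"


if __name__ == "__main__":
    sample = """
      stop_time_update {
        arrival {
          time: 1356720373
        }
        departure {
          time: 1356720373
        }
        stop_id: "635S"
    """
    names = load_stop_names("635S,,14 St - Union Sq,,40.734673,-73.989951,,,0,635\n")
    for stop_id, epoch in stop_arrivals(sample):
        print(describe_arrival(stop_id, epoch, names))
```

For anything beyond a quick look, decoding the feed with generated protobuf classes would be sturdier than scanning text.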

I love that Chicago's CTA believes in a much more open development model: http://www.transitchicago.com/apps/ http://www.transitchicago.com/developers/default.aspx

More data:

https://www.metrochicagodata.org/ https://data.cityofchicago.org/

It's all a part of Chicago's OpenGov work, and it's incredible how much data and work is out there. Here's a list of apps that people made in a city-sponsored competition:


And here's a link to one of the meetups that try to do something cool with all that open data from the City:


"The MTA is challenging software developers to use MTA data to create new apps that improve the transit experience of its 8.5 million daily riders." http://mtaappquest.com/

"New York City challenged software developers to create apps that use city data to make NYC better." http://2011.nycbigapps.com/

Maybe I'm missing something here, but Chicago has had this for the CTA for years. What's the big deal? Is it because New York has such an expansive subway system that this is such a big accomplishment?

Well, Sofia mass transit has had that for at least 4 and a half years. Funnily, for all modes of transportation, except the subway.

Won't work in subway. No coverage.

This app isn't meant to be used in the subway. It's meant to be used before you go down into the station. All the lines supported in the app have signs in their stations with the same live schedule information.

Not 100% true - there is cell coverage in a few stations if you have AT&T, and WiFi at a couple as well (mostly along the L line).

This reminds me of an app I helped work on for the Denver area RTD: https://itunes.apple.com/us/app/denver-rail/id523272854?mt=8

Why.. why.. is that app pitch black? That's not the Android style, that's lazy.

A lot of subway apps are black. I assume it's because most of subway signage is black. Here's a black app that's well designed: http://letsembark.com.

This is iOS only... But agreed this app is ugly.

You can get the exact same times using Google Maps (at least for iOS).

Except that there's never any reception in subway stops, no?

So for the trains on which they have real-time data, there are already LED-boards that show arrival time:


If you're on the platform, you'll be able to see the time even if your phone isn't working. So the mobile app will be useful if you're en route to the station and need to know how many minutes you have left.

I was always amazed that the arrival boards are not available at every subway stop. In 1912 I could understand, but in 2012 this seems ridiculous.

So the next train comes in an hour, and the one after that comes in a day?!

I'll stop complaining the next time the L is a few minutes late. Yeesh.

Man, I've heard horror stories of the L, but 25.5 hours is ridiculous.

You're thinking about the G train.

The L has its own horror stories too (mostly because it lacks a third track).
