Preview in black and white: http://i.imgur.com/OhemZK2.jpg
Bonus! Preview in Uber blue: http://i.imgur.com/gvFDtWN.jpg
Source AI files: https://www.dropbox.com/s/zw1oqdwhaqhqb6u/NYC%20Taxi%20Data....
The process of making it was quite simple. I zeroed a 2d array of integers, then took all the pickup/dropoff points and incremented the nearest cell. The pixel values are based on the logarithm of the counts, since otherwise everything outside midtown would be pretty much black.
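For anyone who wants to try this, the steps above can be sketched roughly as follows. This is Python rather than the C++ mentioned in the thread, and the bounding box, grid size, and function names (`rasterize`, `to_pixels`) are illustrative assumptions, not the original code. It bins each point into the cell that contains it and uses log(1 + count) so that empty cells stay black.

```python
import math

def rasterize(points, lon_min, lon_max, lat_min, lat_max, width, height):
    """Count (lon, lat) points per cell of a width x height grid.

    Row 0 is the northern edge, so the grid can be written out as an image.
    """
    grid = [[0] * width for _ in range(height)]  # zeroed 2D array of integers
    for lon, lat in points:
        if not (lon_min <= lon < lon_max and lat_min <= lat < lat_max):
            continue  # ignore points outside the bounding box
        col = int((lon - lon_min) / (lon_max - lon_min) * width)
        row = int((lat_max - lat) / (lat_max - lat_min) * height)
        # Clamp to guard against floating-point edge cases at the borders.
        col = min(col, width - 1)
        row = min(row, height - 1)
        grid[row][col] += 1
    return grid

def to_pixels(grid):
    """Map counts to 0..255 using log(1 + count), scaled to the busiest cell."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0] * len(row) for row in grid]
    scale = 255 / math.log1p(peak)
    return [[round(math.log1p(count) * scale) for count in row] for row in grid]
```

With a linear scale, a cell holding 1 trip next to a cell holding 100 would sit at 1% of full brightness; with the log scale it sits at roughly 15%, which is why the outer boroughs remain visible.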
There are some artifacts, like the thin vertical line down the East River. I think that was caused by how the data was rounded, i.e. by the number of unique longitude values that map to a given image column.
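A small sketch of that rounding hypothesis, assuming the longitudes are rounded to evenly spaced values (the counts of 100 values and 30 or 150 columns are made up for illustration). When the number of distinct values is not a multiple of the image width, some columns collect more distinct longitudes than their neighbours, which would show up as faint vertical stripes.

```python
from collections import Counter

def unique_values_per_column(n_values, width):
    """Count how many of n_values evenly spaced longitudes land in each column.

    Value k (0-based) is assigned to column floor(k * width / n_values),
    using integer arithmetic to avoid floating-point noise.
    """
    counts = Counter(k * width // n_values for k in range(n_values))
    return [counts[col] for col in range(width)]

# Hypothetical case: 100 distinct rounded longitudes spread over 30 columns.
uneven = unique_values_per_column(100, 30)

# Hypothetical case: more columns than distinct values, so some stay empty.
sparse = unique_values_per_column(100, 150)
```

In the first case the columns alternate between 3 and 4 distinct values; in the second, every third column receives none at all, which would render as a dark line.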
I wrote this myself with a few hundred lines of C++, though I'm sure there's GIS software out there that will do all this for you with a few clicks.
Also did you overlay it onto a map? How did you get the angled effect if it's just a grid?
I assume it is because there is a fixed fare to/from JFK so drivers have little incentive to start/stop the meter at the exact pickup/dropoff location.
> Also did you overlay it onto a map?
No. If taxis did not pick up or drop off people on some street, that street does not appear. For example, there is an area downtown where the streets have had security barriers since 9/11, so there are no taxis.
> How did you get the angled effect if it's just a grid?
None of NYC's grids are exactly north/south/east/west aligned.
There is not much distortion at this scale (the world appears flat), which makes plotting directly on a grid quite a clever hack!
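A rough check of why the flat approximation holds at city scale, assuming a spherical Earth and a latitude range of about 40.5 to 40.9 degrees for the city (both are assumptions, not figures from the thread). On a sphere, a degree of longitude is shorter than a degree of latitude by a factor of cos(latitude), so the question is how much that factor changes from the southern edge of the image to the northern one.

```python
import math

def lon_scale(lat_deg):
    """Length of one degree of longitude relative to one degree of latitude."""
    return math.cos(math.radians(lat_deg))

# Assumed latitude range for the city; adjust to the actual bounding box.
south, north = 40.5, 40.9

# Relative change in east-west scale between the two edges of the image.
variation = 1 - lon_scale(north) / lon_scale(south)
```

Under these assumptions the variation comes out below 1%, so a single constant stretch is enough. Note that if longitude and latitude are plotted with equal pixels per degree, the map is squeezed north-south, so multiplying the horizontal extent by `lon_scale` at the city's latitude restores the street angles.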
I made this map a while ago for Instagram photos using R/ggplot2: http://i.imgur.com/IvGox1f.png
Although the country boundaries aren't strictly necessary. Since there appears to be a demand for generating these maps, I'll work on a tutorial for this NYC data set.
- 2013 data as FOILed by Chris Whong http://chriswhong.com/open-data/foil_nyc_taxi/
- 2008 to 2013 data as FOILed by me, on BigQuery https://bigquery.cloud.google.com/table/alien-climber-851:ny...
...note that after Whong's request, the TLC redacted the medallion numbers, making it virtually impossible to analyze trips by cabbie.
Is that data set available for people who do not use Google Accounts as well? Maybe you could upload it to https://archive.org.
Whoever owns the storage (NYC TLC in this case) pays a minimal fee of 2 cents per GB per month. This includes multiple factors of replication/durability.
Whoever is doing the querying (this can be you) pays 5 dollars per TB queried. The first 1 TB per month is free.
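To make the arithmetic concrete, here is a minimal sketch using only the prices quoted above (2 cents per GB per month for storage, 5 dollars per TB queried, first 1 TB per month free). The function names and the sample sizes are made up for illustration, and current pricing may differ.

```python
def monthly_storage_cost(gb_stored, price_per_gb=0.02):
    """Storage cost in dollars per month, paid by whoever owns the table."""
    return gb_stored * price_per_gb

def monthly_query_cost(tb_scanned, price_per_tb=5.0, free_tb=1.0):
    """Query cost in dollars per month, paid by whoever runs the queries."""
    billable_tb = max(0.0, tb_scanned - free_tb)  # first free_tb costs nothing
    return billable_tb * price_per_tb

# Hypothetical numbers: a 100 GB table, and 3 TB scanned in one month.
storage = monthly_storage_cost(100)
queries = monthly_query_cost(3)
```

Anyone scanning less than 1 TB in a month pays nothing for queries under this scheme.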
Also, I would love to see an analysis of whether traffic is actually getting worse compared to last year. This claim was made by Mayor de Blasio as a reason to cap Uber rides.
Downloading one of the CSVs to check it out. Each one is about 2GB.
EDIT: Per the BigQuery table schema, medallion is no longer a field.
BigQuery tables for the data: https://www.reddit.com/r/bigquery/comments/3fo9ao/nyc_taxi_t...
From medallions to food cart and restaurant permits, regulation is keeping competition out while rewarding those who merely sit on permits and rent out their use. It is nearly identical to how badly patents are managed and rewarded.
(1) small children, where you need a car seat (or two)
(2) people with disabilities - service animal, wheelchair, etc.
Analyzing the data would hopefully show whether people had to wait for 2 hours.
Also, as a substitute for race, it might be possible to see if certain areas are under-served or not served at all. Perhaps drivers are avoiding picking up in Harlem.
Which is clearly violated thousands of times per day. Sure, you can call 311 in NYC, but very few people do.
I've once heard a quip that "You have no constitutional right to eat at a restaurant, but you do have one to a speedy trial -- but which one feels more secure?"
The city has "leverage" to stamp out discrimination against black people wanting a cab ride, but "no leverage" to stamp out discrimination against black people on Uber -- yet which one will more reliably secure a ride?
I could report him, but I'm a busy adult with kids of my own to mind, and I don't have time to try to parent someone else's problem.
Uber hailed this as a victory - but the way I see it, when your victory means "maintenance of the status quo while conceding your data to a third party" it probably wasn't actually a victory.
This kind of data was previously an issue with Uber, too.