
Using NYC Taxi Data to Identify Muslim Taxi Drivers - tshtf
http://theiii.org/index.php/997/using-nyc-taxi-data-to-identify-muslim-taxi-drivers/
======
rmxt
If you dig through to the source reddit posting [1], you can see that the post
only really puts 6 different individual taxi IDs out there for visualization.
To me, only the first 4 seem like a good visual fit for the prayer times, and
the overall trip heatmap [2] suggests that they may just be part of a larger
pattern of eating/taking a break at sunrise, noon, and sunset for all cab
drivers. While the blog posting raises the question of whether or not the
data release contains personal data as a result of these findings, much more
invasive findings have already been published (and posted to HN) with this NYC
Taxi data, like corroborating individual trips by high-profile people. [3]
Notwithstanding the privacy issues and questionable methods used to obfuscate
the data, I personally think that the release is a great step in the right
direction for open data.

[1]
[https://www.reddit.com/r/dataisbeautiful/comments/2t201h/ide...](https://www.reddit.com/r/dataisbeautiful/comments/2t201h/identifying_muslim_cabbies_from_trip_data_and/)
[2] [https://i.imgur.com/lyK0qTI.png](https://i.imgur.com/lyK0qTI.png)
[3]
[http://research.neustar.biz/2014/09/15/riding-with-the-stars...](http://research.neustar.biz/2014/09/15/riding-with-the-stars-passenger-privacy-in-the-nyc-taxicab-dataset/)

EDIT: For better or worse, deducing ethnicity, country of origin, and/or
religion is probably much easier based on this data set [4]. People have come
up with analyses like this [5]. The data analysis is great, but my fingers are
crossed that tabloid newspapers and their ilk don't pick this up and run off
xenophobic, fear-mongering articles.

[4]
[https://data.cityofnewyork.us/Transportation/Medallion-Drive...](https://data.cityofnewyork.us/Transportation/Medallion-Drivers-Active/jb3k-j3gp)
[5]
[http://vizual-statistix.tumblr.com/image/107987401281](http://vizual-statistix.tumblr.com/image/107987401281)

------
flexie
In Copenhagen, it was revealed a few years ago that racist customers who
didn't want a Muslim taxi driver told the dispatch center when ordering a cab
that they were 'bringing a large dog', and the operators there knew and
respected this as the secret code that the customer wanted an ethnic Danish
driver. This is obviously illegal, but I don't know whether they stopped the
practice.

~~~
danso
OT: Sorry, that made me laugh...I can't even imagine how that would even work
in New York...you'd be waiting forever to get a ride; the Redditor mentions a
previous name analysis of drivers (TLC releases names/licenses of taxi drivers
as another dataset) indicating that maybe half of all NYC taxi drivers are
Muslim.

After living in New York, I have a Pollyannish belief that if you were
inclined to be unconsciously racist...you'd give up that inclination in day-
to-day life...or else, how could you cope with the sensory/panic overload?
Every day you're exposed to hundreds, sometimes thousands of people of every
ethnicity and religion, just being normal.

~~~
rmxt
Despite the diversity of NYC as a whole, my anecdotal experiences suggest that
people can still harbor those sorts of unconscious thoughts while living their
entire lives in a "diverse" city. I say "diverse" because there are lots of
enclaves in NYC, sometimes with stark divides across streets. (See the Dot Map
[1]). Outside of gentrifying/gentrified areas like Long Island City or
Prospect Heights, or white flight areas like Morris Park, neighborhoods look
rather homogeneous. (Exercise for the reader: see if you can find out where
the Landmark/Historic Districts are in Brooklyn.)

[1]
[http://demographics.coopercenter.org/DotMap/](http://demographics.coopercenter.org/DotMap/)

~~~
rkuykendall-com
Unfortunately, last time it was posted, it was noted that that map has drawing
bias. The dots are drawn in order of race, not randomly, and so the earliest
color dots are underrepresented and the later dots overrepresented. I'm afraid
I can't remember which order they are drawn in. Just something to keep in
mind.

~~~
dandelany
Interesting, I wasn't aware of that, thanks for bringing it up. From the
code[0], it looks like it's drawn in the order [White, Black, Asian, Hispanic,
Other]. You're definitely right, that would skew things.

[0]
[https://github.com/unorthodox123/RacialDotMap/blob/master/do...](https://github.com/unorthodox123/RacialDotMap/blob/master/dotfile.py)
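
To make the draw-order point concrete, here is a toy sketch. It is not the map's code. It assumes that a later dot simply overwrites an earlier one on the same pixel, uses a 1-D strip of pixels instead of a real map, and the group labels A through E and all sizes are made-up placeholders. Five equally sized groups are painted once group by group and once in shuffled order:

```python
import random
from collections import Counter


def visible_share(dots, groups):
    """Paint dots in list order; a later dot overwrites an earlier one
    on the same pixel. Return each group's share of painted pixels."""
    canvas = {}
    for pixel, group in dots:
        canvas[pixel] = group  # last writer wins (assumed rendering model)
    counts = Counter(canvas.values())
    total = sum(counts.values())
    return {g: counts[g] / total for g in groups}


rng = random.Random(0)            # fixed seed so the run is repeatable
groups = ["A", "B", "C", "D", "E"]  # placeholder labels, equal sizes
n_pixels = 1000                   # toy canvas: far fewer pixels than dots

# Dots are generated group by group, so list order == draw order by group.
dots = [(rng.randrange(n_pixels), g) for g in groups for _ in range(2000)]

ordered = visible_share(dots, groups)

shuffled_dots = dots[:]
rng.shuffle(shuffled_dots)        # same dots, random draw order
shuffled = visible_share(shuffled_dots, groups)

print("drawn by group:", ordered)
print("drawn shuffled:", shuffled)
```

In the group-by-group run the last-drawn group ends up owning most of the visible pixels even though every group has the same number of dots, while the shuffled run comes out close to even. The effect only matters where dots are dense enough to overlap, which is exactly the case in a city-scale view.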

------
danso
Yeah, the OP has serious blinders here. Like the student who threw alarmist BS
about how the data could be used to track visits to strip clubs...this
seems to be an example of cherry-picking the data. So, I'm not good at
discerning deviation from thousands of small dots...what's the numerical
measure of the correlation here? What is the significance of those blue lines
when they show a trend over empty black space?

And what does the analysis/trendlines look like on drivers who can be
discerned as _not_ being Muslim?

I'm all for personal data protection, and the TLC erred in not anonymizing the
data. But this weak analysis and alarmist bullshit by a PhD serves little but
to give governments more excuse to hide public data.

edit: Linking to someone more well versed than me, regarding the cherry-
picking of data in the "Riding with the Stars" article
[http://blogs.law.harvard.edu/infolaw/2014/11/21/the-antidote...](http://blogs.law.harvard.edu/infolaw/2014/11/21/the-antidote-for-anecdata-a-little-science-can-separate-data-privacy-facts-from-folklore/)

~~~
bnegreve
> What is the significance of those blue lines when they show a trend over
> empty black space?

The blue lines are not trends, but prayer times plotted on top of the graph.

------
rhino369
Observant Muslims _

