
Uber Doesn’t Want to Give NYC (or Anyone) More Data - artsandsci
https://www.bloomberg.com/news/articles/2017-01-05/uber-doesn-t-want-to-give-nyc-or-anyone-more-data
======
ThePhysicist
I work as a data scientist and often talk to unions and people in the public
administration about data, and I agree that data which companies like Uber,
Airbnb or Fedex generate can be very interesting for things like optimizing
public infrastructure or salary negotiations. I can perfectly understand why
companies don't want to publish this data (as it's a valuable asset and
contains a lot of potentially sensitive insights about the company), but you
have to weigh this against the benefit to the public that the data could
generate.

I think it's not unlikely that we will see new legislation soon which will
oblige companies to make some of their data available to the public or at
least public organizations so that they can be used for public good. In fact
this wouldn't be unprecedented, and nothing that would be (IMHO) very
detrimental to these companies.

~~~
chrishacken
I don't disagree that this data would be beneficial to the public, but I don't
think government should have the right to tell a private organization that
they need to hand out their data to whoever wants it.

Overregulation kills business; maybe not companies like Uber, but smaller
companies that are still trying to get off the ground.

This is a slippery slope.

~~~
kafkaesq
_I don't disagree that this data would be beneficial to the public, but I
don't think government should have the right to tell a private organization
that they need to hand out their data to whoever wants it._

As a "private organization", Uber of course doesn't have to do anything.

But if it wants to operate a business on NYC's streets on a (massive) scale --
well, that's a _privilege_ , not a right. And if they want access to that
privilege, they're gonna have to play by the rules as set forth by the city's
voters and their elected representatives, who have a vested interest in having
reliable access to that data for, well, a whole bunch of pretty obvious
reasons.

It's called "rule of law", a concept which Uber has demonstrated considerable
difficulty in understanding thus far. Once they do, we can perhaps have a
conversation about whether certain regulations are really useful or necessary
or not. But it needs to be based on the pragmatic merits (or lack thereof) of
those regulations. Arguments on the basis of "they're a private organization;
you have no _right_ to tell them what to do" just don't hold a lot of water
in these contexts.

 _This is a slippery slope._

No, it's just life in the big city. And it's about time Uber got used to it.

~~~
bhups
_Arguments on the basis of "they're a private organization; you have no right
to tell them what to do" just don't hold a lot of water in these contexts._

Forcing an organization to do something is a big deal because, unless you are
specific about when and what kind of data you can force a private company to
disclose, it can be abused. This is why such arguments are important.

 _But if it wants to operate a business on NYC's streets on a (massive) scale
-- well, that's a privilege, not a right._

It actually is not a privilege, it _is_ a right. Conducting business is a
First Amendment right. In fact, this is a dangerous argument: if conducting
business is a privilege, couldn't the government decide which businesses it
likes and it doesn't like? The use of New York's streets is indeed a
privilege, but that is already paid for by road taxes - anything more than
that is just double dipping.

 _It's called "rule of law", a concept which Uber has demonstrated
considerable difficulty in understanding thus far._

Let's leave the attacks out of this - replace Uber with any other company and
you have the same set of issues to discuss. Instead, let's just focus on the
merits of the idea of government coercion of a private company to release
data.

If this data was truly beneficial to the public, couldn't the government buy
it from the company through a voluntary transaction, paid for by taxpayers?
The company can decide whether or not they wish to sell that information. The
only reason a company would refuse to sell anonymized information is if they
think the data is important to their competitive success.

~~~
grigjd3
It actually isn't a right legally. You can absolutely have your driver's
license taken away and the government has the right to regulate usage in many
ways. If you're confused about this, try driving the wrong direction for a
lane long enough and see what happens to you.

~~~
bhups
Conducting business (i.e. running an enterprise) is absolutely a right.
Obviously if your business involves breaking the law, the business owners
shall be reprimanded, but that's separate from OP's assertion that operating
business on NYC's streets is a privilege and not a right.

~~~
grigjd3
Using government provided resources, like roads, for your business is not a
right. You seem confused, so you should try setting up a stand on fifth avenue
without a permit and see how quickly the police remove you.

------
samfisher83
I thought this was a key takeaway from the article:

>Taxis already share all the data the commission is requesting from Uber.

If Taxis are required to give this data why should Uber be different?

~~~
TAForObvReasons
... because Uber is a special snowflake, haven't you heard? /s

Uber is trying to differentiate itself from the standard taxi companies in
order to avoid the same level of regulations and issues. To do that, you can't
concede on any front no matter how trivial or reasonable it sounds.

~~~
warcher
Yeah, and it's virtually inevitable that city hall catches up with them
eventually _somewhere_. They have to win _all the city hall fights
everywhere_. Lyft does too -- if Lyft ever capitulates in exchange for
licensure, Uber has to as well, lest they be out in the cold.

Sooner or later somebody is gonna get one of them, and then all the other
cities are going to say "Hey, if Los Angeles gets your ride data, then why
can't Topeka?"

------
CodeSheikh
With a yellow cab, you hail it at point A and get dropped off at point B.
Addresses are recorded, but not your identity.

With Uber, you request a car at point A and get dropped off at point B.
Addresses are recorded, and so is your identity.

Yellow cabs share their data: all journeys, along with their start and end
coordinates. There are no identity issues, as their platform could not record
identity.

Suppose Uber shared its data the same way: all journeys, along with their
start and end coordinates. There is no need to share the identity of the
person associated with each journey.

Even with the yellow cab data set, if I have the address of my friend's place
I can figure out his journey patterns from how frequently his apartment
address (minus the home number) shows up, either by crunching the data around
his address or around the nearest corner intersection where he could hail a
cab.

I really don't see what the big deal is here, unless Uber really does not want
to expose its $ amounts for a variety of reasons that I don't want to get into.
Uber can strip off identity data, and it can even strip off building numbers
by normalizing pinpointed addresses to the nearest intersection. Sure, not
every place has a nearby intersection. In that case, provide macro
geo-information for the neighborhood. Trust me, though: every address in New
York City has a nearby intersection.
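A rough sketch of that normalization step in Python, for illustration only. The intersection names and coordinates below are approximate, hypothetical values (a real list would come from a street map dataset), and `snap_to_intersection` is a made-up helper name, not anything Uber or the TLC is known to use:

```python
import math

# Hypothetical, approximate intersection coordinates (lat, lon) for illustration.
INTERSECTIONS = {
    "W 34th St & 7th Ave": (40.7506, -73.9905),
    "W 42nd St & Broadway": (40.7560, -73.9866),
    "E 14th St & Union Sq W": (40.7359, -73.9911),
}

def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6_371_000 * math.asin(math.sqrt(h))

def snap_to_intersection(point):
    """Replace an exact pickup/dropoff point with the name of the closest intersection."""
    return min(INTERSECTIONS, key=lambda name: haversine_m(point, INTERSECTIONS[name]))

# A point a couple of hundred meters west of 7th Ave snaps to the 34th St corner.
print(snap_to_intersection((40.7505, -73.9934)))
```

The published record would then carry only the intersection name, not the building-level coordinates.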

Edit 1: spellings

------
caseysoftware
Does FedEx/UPS have to share package pickup/dropoff data with local
governments?

After all, it is a direct competitor to the US Postal Service.

~~~
knz
Great point but there are some legitimate reasons that the city may want ride
data especially if Uber wants to replace multiple forms of transportation
(taxis, personal vehicles, freight delivery, ride services for the
elderly/disabled etc).

There are also other municipal needs like collecting utility information from
radio systems or pavement management surveys that cities often drive around to
collect - Uber could potentially be utilized for data gathering if it was
determined that they had sufficient coverage etc.

Every GIS nerd would love a data set like the Uber tracks for analysis and to
see what else they could be used for (Uber for mass license plate collection
is like something from Black Mirror).

~~~
lightbyte
This data could be extremely helpful to planning out public transportation.
You would be able to determine the most common areas people go to, where they
come from, when they typically go there, etc. All would help in planning bus
routes/subway lines/etc.

~~~
scirocco
Thus improving public transport and making people less dependent on Uber :)

~~~
Fricken
They're not in competition with one another. Together they are both in
competition with privately owned cars. A transportation system effective
enough to compel people to give their cars up is good for both of them. Uber
alone is too expensive, and public transit alone can't provide service
everywhere all the time cost effectively.

------
jdavis703
I don't want the government knowing this much information about me. But I see
how it would benefit transit planners and regulators to have this data. Can
they just blur it out to coordinates with a 100-foot radius or something
similar?
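One simple way to do that kind of blurring is to round every coordinate onto a fixed grid. The sketch below is an assumption-laden illustration, not a vetted anonymization scheme: it uses square cells about 100 feet on a side rather than a true radius, a rough meters-per-degree constant, and a fixed reference latitude near NYC so that all points share one grid. The names `cell_size_deg` and `coarsen` are hypothetical.

```python
import math

FEET_PER_METER = 3.28084
METERS_PER_DEG_LAT = 111_320.0  # rough average value, assumed for illustration
REF_LAT = 40.7                  # roughly NYC; fixed so every point uses the same grid

def cell_size_deg(cell_feet=100.0):
    """Size of one grid cell in degrees of latitude and longitude."""
    cell_m = cell_feet / FEET_PER_METER
    dlat = cell_m / METERS_PER_DEG_LAT
    dlon = cell_m / (METERS_PER_DEG_LAT * math.cos(math.radians(REF_LAT)))
    return dlat, dlon

def coarsen(lat, lon, cell_feet=100.0):
    """Snap a coordinate to the nearest grid point, discarding finer detail."""
    dlat, dlon = cell_size_deg(cell_feet)
    return round(lat / dlat) * dlat, round(lon / dlon) * dlon

print(coarsen(40.7505, -73.9934))
```

Every pickup inside the same cell comes out as the same point. Coarsening alone does not guarantee anonymity, since a cell that contains a single home still identifies it.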

~~~
maverick_iceman
The problem is given enough data it is possible to unblur the data.

------
minimaxir
Relevant additional context: the Uber data was released to the public after
FiveThirtyEight filed a FOIA request to the NYC TLC after they got the data:
[https://github.com/fivethirtyeight/uber-tlc-foil-response](https://github.com/fivethirtyeight/uber-tlc-foil-response)

Additionally, the data is anonymized; the original taxi dataset mentioned in
the article had poorly-hashed Taxi ID numbers which is how privacy was
compromised. Subsequent TLC datasets lacked that field completely.

------
fullshark
It's really premature for Uber to claim to be public transportation so the
entire premise of the argument that they should do it is faulty imo.

~~~
tomcam
Where do they claim that?

~~~
fullshark
The sub-headline:

> Ride-hailing companies aspire to be something akin to public transportation,
> but that doesn’t extend to sharing data with governments.

So this is supposedly noteworthy for that reason.

~~~
snrplfth
I think what tomcam was wondering was where Uber specifically claimed to be
public transportation.

~~~
fullshark
Yeah I worded that poorly, I didn't mean to state that Uber was claiming that,
merely that the article seemed built on a faulty premise. Uber could very well
be considered public transportation at some point in the future and they may
claim it to get tax breaks / infrastructure spending etc, but that's way way
way out in the future if at all.

~~~
snrplfth
Oh okay, that makes sense.

------
brndnmtthws
Good. I'd prefer that my private data remain private, and private companies
like Uber have a responsibility to protect that data.

~~~
aetherson
You know, it's possible to simultaneously be critical of the government
request for data, have very real concerns that of course they won't
successfully anonymize location data, and _also_ have a realistic view that
Uber wants to "sell" your data, not "protect" it.

~~~
skybrian
I don't know about this: "of course they won't successfully anonymize location
data".

Yes, it's well known that they screwed this up. But that doesn't mean that we
should therefore distrust them forever. Although it's tricky, figuring out the
right level of granularity for revealing taxi ride data to the public seems
doable, and we shouldn't let cynicism prevent all attempts at a compromise.

~~~
smallnamespace
The fact that they screwed it up in the past means they demonstrated a lack of
ability to get it right. Unless we have strong evidence that they've fixed
that deficiency, it seems unwise to repeat the same mistake.

Better safe than sorry seems like the right heuristic here given that once
private data is released in the wild, it will remain public forever [1].

[1]
[https://www.reddit.com/r/bigquery/comments/28ialf/173_millio...](https://www.reddit.com/r/bigquery/comments/28ialf/173_million_2013_nyc_taxi_rides_shared_on_bigquery/)

~~~
anigbrowl
Yeah, and companies in the private sector screw up in similar fashion on a
regular basis. What strong evidence do you want that they've fixed the
deficiency? Government is about as open organizationally as any entity - you
can serve FOIA requests asking what government has done to mitigate some past
screwup and get back reams of documentation, which is more than you can say
for private entities. I seem to remember Uber having its own history of
privacy violations.

~~~
smallnamespace
Easy - propose a process for releasing data, and get it publicly vetted by
privacy/security researchers. I'm pretty sure anyone who understands basic
crypto would know that MD5 hashing low-entropy text fields is problematic. The
fact that they missed this means they just handed this procedure off to a
random coder who didn't know any better. That doesn't sound at all like a
sane process for releasing gigabytes of highly personal data.
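To make the MD5 point concrete, here is a minimal Python sketch of why hashing a low-entropy field hides nothing. It assumes a hypothetical ID format of digit, letter, digit, digit (e.g. `7B42`), which gives only 26,000 possible values; the format and the helper names are illustrative assumptions, not the actual TLC schema.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits

def md5_hex(text):
    """Unsalted MD5 of an ASCII string, as a hex digest."""
    return hashlib.md5(text.encode("ascii")).hexdigest()

def build_lookup():
    """Hash every possible ID once and map each hash back to its ID."""
    table = {}
    for d1, letter, d2, d3 in product(digits, ascii_uppercase, digits, digits):
        ident = f"{d1}{letter}{d2}{d3}"
        table[md5_hex(ident)] = ident
    return table

LOOKUP = build_lookup()  # 10 * 26 * 10 * 10 = 26,000 entries, built in well under a second

def recover(hashed):
    """Return the original ID for a hash, or None if it is not in the table."""
    return LOOKUP.get(hashed)

print(recover(md5_hex("7B42")))
```

Because the whole input space can be enumerated, the "anonymized" column is reversible with a dictionary lookup. A keyed hash with a secret key, or random surrogate IDs, would not have this problem, and dropping the field avoids it entirely.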

Private companies have their own issues, obviously, but just because A is bad
doesn't mean we should lower our standards for B.

Also, private companies have your data because consumers interact with them
directly, so at least there is some implicit consent involved. But I'm less
convinced that they should be forced to hand that data over to public agencies
that have only a very tangential relationship with the user. If I take an NYC
cab ride, why should the city government get to see that, especially if I
don't live there?

~~~
anigbrowl
As for your cab ride, because the government of NYC is who people turn to when
they have problems with their cab ride and don't get satisfaction from the cab
company. Honestly, talking to some people it's like they were living in some
free market garden of Eden one day and then government came along and ruined
everything. People institute governments to 'secure their rights'; it's your
choice to take an oppositional attitude rather than a participative one
towards government.

 _You_ are the one saying you want certitude about the security of information
held by government; why not propose _your_ preferred standard for how data
should be released, maybe get some peer security experts to refine it or agree
on a suitable candidate, and then promote its adoption by government with an
economic argument?

It seems not to have crossed your mind that government failure is often the
result of past policy decisions imposed by representatives or the voters
themselves on how things should be done, and that they're often mandatory for
the people who work in government. They may _know_ a policy or procedure or
person's performance is flawed, but lack the legal or budgetary authority to
do anything about it. _Sometimes_ inefficient policies are in place as a
political payoff to a corporation, union, or individual - corruption is a
problem, and legislatures are essentially political marketplaces, and subject
to certain failures of markets. Other times inefficiencies are just unintended
consequences of well-intended legislation that was poorly crafted, or outlived
its usefulness, or conflicting imperatives that lead to legal race conditions.

Try thinking of government as the operating system (or platform if you prefer)
of society. It's buggy, bloated, nominally open but actually with a bunch of
closed-source stuff in it, some people mine it for exploits or sneakily
implement their own, and so on. You have this huge codebase written in
multiple languages running on all sorts of legacy institutional hardware, all
strung together in a giant embedded system that is supposed to operate 24-7,
often under difficult conditions. Oh, and there's bitter disagreement between
two factions of developers with radically different ideas about, well,
everything.

Refactoring this isn't an easy undertaking. I suggest to you that the problems
of government are similar to the problems of a large software project, and
involve many of the same sort of trust problems that operating systems do.
Consider that there is a relatively small number of successful operating
systems/platforms, none of them were built overnight, and they all suffer from
various faults and have interoperability issues - some by design, some by
oversight. Your black-box approach to government is of limited utility because
it's not like you can easily swap it out for a better one.

~~~
smallnamespace
I'm not sure how you somehow found a broad antigovernment oppositional
attitude in these posts here. All of what you say about governance can be
true, and I largely agree with it, but I still don't want NYC to have my data
and I do think the onus is on _them_ to show they will be responsible
custodians because they screwed it up in the past. Once bitten, twice shy.

I don't know if you've tried to do free work for governments and get them to
go along with your initiatives before, but I bet you that it quickly becomes a
politicized process and is much more painful than, say, submitting a PR. I
don't have the time, energy, or resources to do such a thing, but I do have
the ability to vote against policies that I feel are misguided. If you have
examples of normal citizens getting municipalities to adopt their initiatives
and the process going smoothly, I'm all ears.

Also, the point here is that Uber is _not_ a cab ride; if I take a cab ride I
can understand why NYC gets the data directly since cabs are a de facto
government-established monopoly, but I don't see why that extends to my
business dealings with a private corporation.

~~~
anigbrowl
_I still don't want NYC to have my data and I do think the onus is on them to
show they will be responsible custodians because they screwed it up in the
past_

As I pointed out in the first place, with a public entity like NYC you can
file FOIA requests to find out what they actually did to mitigate the problem
and ensure it doesn't recur. Is it enough? Opinions will vary, but reliably
ensuring better security practices in the future will require some sort of
rule, if only to identify the standard their IT operations need to comply
with.

Meanwhile, the city is still expected to respond to complaints from residents
about the cab service and to carry out its existing oversight functions which
are almost certainly mandated by law. Uber's right to operate a transport
service is subject to the same laws as every other transit service, so there's
no legal basis for them to have a veto over the city contingent on the city
meeting some (undefined) standard of reliable data custody. That would be
giving a private entity (Uber) authority over the data security policy of NYC,
which is an absurdity.

Certainly their opinion matters, as does yours as a voter, or even (very
indirectly) as someone who chooses to drop some of your tourist $ in NYC or
not. But having an opinion which you attempt to leverage at election time
(occasionally alone or more commonly through donating to some lobbying group
to do it for you) is very different from a company unilaterally declining to
comply with a rule and then _claiming_ to be doing so in order to uphold your
interests rather than their own.

 _Also, the point here is that Uber is not a cab ride_

Of course it is. You summon a vehicle to transport you in comfort from A to B
and you pay on arrival. Summoning them via an app rather than via a telephone
call or a wave down is a mere operational detail. You're still hiring a car,
for the time being cars are driven by people, and the law in New York is that
commercial drivers can't work more than 60 hours a week (because of the
increased risks of accidents due to driver fatigue, which is backed up by a
lot of data) and have to be able to provide work logs on request.

The _existence_ of a difference tells you nothing about its degree. From the
point of view of the customer and driver, there is virtually no functional
difference between Uber or any pre-existing taxi company. The significantly
different business, dispatch and billing practices don't alter the fact that
it's a ride-for-hire service.

------
oli5679
The area where I'm really interested in obligatory publishing of data is
financial data from stock exchanges. How great would it be if Google or
someone from the next batch of HN was free to make a superior, cheaper
version of Bloomberg/Reuters? Would save so much time and money within the
financial world.

------
LordKano
They shouldn't want to. They only want the data to see if more money can be wrung
from the service or its customers.

~~~
anigbrowl
FTA: _Taxis already share all the data the commission is requesting from
Uber._

Same rules for everyone.

~~~
LordKano
In that case, I'd rather see the rules eased for everyone.

It's absolutely absurd that it costs over $400,000 for a cab medallion in NYC.

