Oakland releases months’ worth of license plate reader data (oaklandnet.com)
74 points by oxguy3 on Nov 28, 2015 | 48 comments

Amsterdam did something similarly stupid. They released a large batch of license plates which had been 'encrypted' using MD5, as well as timestamps and locations. It didn't take more than a day before someone had built the 'dude where's my car' app that listed where a license plate had been recognized and when.
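To make concrete why an unsalted MD5 of a license plate hides very little, here is a minimal sketch. The plate format (two letters followed by two digits) is a toy assumption chosen to keep the example fast, and is not the Amsterdam format; `build_lookup` and `md5_hex` are hypothetical helper names. Real plate spaces are larger but still small enough to enumerate in the same way.

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits


def md5_hex(plate: str) -> str:
    """Unsalted MD5 of a plate string, as hex."""
    return hashlib.md5(plate.encode("ascii")).hexdigest()


def build_lookup(letters: int = 2, numbers: int = 2) -> dict[str, str]:
    """Map digest -> plate for every plate in the toy format (letters, then digits)."""
    table = {}
    for ls in product(ascii_uppercase, repeat=letters):
        for ds in product(digits, repeat=numbers):
            plate = "".join(ls) + "".join(ds)
            table[md5_hex(plate)] = plate
    return table


lookup = build_lookup()        # 26**2 * 10**2 = 67,600 entries
published = md5_hex("QZ42")    # what a "hashed" release would contain
recovered = lookup[published]  # a plain dictionary lookup reverses it
```

The hash is deterministic and the input space is tiny, so anyone can precompute the whole table once and reuse it for every record.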

Rumor has it this led to more than one divorce and the municipality has since changed the way they store the license plates and will likely not take part in the distribution of such sensitive data again. (At least, not voluntarily.)

The incident did wonders for people realizing that they too valued privacy.

I found this while perusing their open data portal for other information, and was completely shocked that they would make this type of potentially very sensitive information public.

Could you please give an example of how this data could be misused now that it's public? I'm already aware of how it could be misused when it was private. TIA.

I'll post the name and address of that lover you've been seeing and when you've been seeing them. Remember that business trip?

Just kidding, because you really have nothing to hide. But not everybody is like you. Plenty of people go about their business not realizing that they're being followed all the time. They think they can get away with whatever it is they are doing because they are making (flawed) assumptions. And not all of those are bad. What about someone important undergoing medical treatment? Would that influence the stock of the company they are leading? What about a person visiting an AA meeting? What about someone having an abortion?

All these things affect your private life in a very immediate and drastic manner and in some cases they'll affect the rest of the world as well.

Databases like these shouldn't even exist.

It's gotta be a gold mine for stalkers, that's for sure. I don't necessarily want Joe Public to be able to do a SQL query and get back my day-to-day comings and goings with intersections and timestamps and everything.
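As a rough illustration of how little effort that query takes, here is a sketch using an in-memory SQLite table. The table name, column names, plates, and coordinates are all made up for the example; the real dataset's schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reads (plate TEXT, seen_at TEXT, lat REAL, lon REAL)")

# Synthetic rows, not taken from any real dataset.
rows = [
    ("ABC123", "2015-03-02 08:05", 37.80, -122.27),
    ("XYZ789", "2015-03-02 08:10", 37.81, -122.26),
    ("ABC123", "2015-03-02 18:40", 37.77, -122.22),
    ("ABC123", "2015-03-03 08:03", 37.80, -122.27),
]
conn.executemany("INSERT INTO reads VALUES (?, ?, ?, ?)", rows)


def sightings(plate):
    """Return every read of one plate, in time order."""
    cur = conn.execute(
        "SELECT seen_at, lat, lon FROM reads WHERE plate = ? ORDER BY seen_at",
        (plate,),
    )
    return cur.fetchall()


timeline = sightings("ABC123")
```

One `WHERE` clause and an `ORDER BY` are enough to turn a pile of reads into a per-vehicle timeline.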

I'm normally in favor of public data being public, but in this specific case I think the potential for abuse (and the ease and extremity of that abuse) might outweigh the benefits of being this free and open about this particular data set.

They could have captured basically all the potential benefit of this dataset while more-or-less curtailing the potential for abuse with very little effort: Just give salted hashes of the license plate numbers rather than sharing them in the clear.
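A minimal sketch of what that could look like. Note one substitution: a salt that is published alongside the data would allow the same enumeration that breaks plain MD5, so this sketch assumes a secret key and uses HMAC-SHA256 instead of a literal "salted hash". `SECRET_KEY` and `pseudonym` are hypothetical names, and truncating to 16 hex characters is an arbitrary choice.

```python
import hashlib
import hmac
import secrets

# Assumption: generated once by the publisher and never released.
SECRET_KEY = secrets.token_bytes(32)


def pseudonym(plate: str, key: bytes = SECRET_KEY) -> str:
    """Keyed hash of a plate; stable for a given key, not enumerable without it."""
    digest = hmac.new(key, plate.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


a1 = pseudonym("ABC123")
a2 = pseudonym("ABC123")
b = pseudonym("XYZ789")
```

The same plate always maps to the same pseudonym, which keeps the data useful for traffic analysis. It also means every read of one vehicle remains linked, so location patterns can still give the owner away.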

Even if the hashes are salted, if you see XXXX ends up parking outside address 123 Hickory nightly, it's really easy to figure out who it is. It's incredibly hard to release this kind of data responsibly and safely.

Safely or not, what is the reasoning behind releasing this data to the public at all? What can we build with this data that is useful without violating anyone's privacy?

Why use a salted hash at all? Why not just randomly generated unique tokens?
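For comparison, a sketch of the random-token approach. The record layout `(plate, seen_at, place)` and the function name `tokenize` are assumptions made for illustration.

```python
import secrets


def tokenize(reads):
    """Replace each plate with a random token.

    The plate -> token mapping lives only inside this call and is
    discarded afterwards, so tokens are not a function of the plate.
    """
    mapping = {}
    released = []
    for plate, seen_at, place in reads:
        if plate not in mapping:
            mapping[plate] = secrets.token_hex(8)
        released.append((mapping[plate], seen_at, place))
    return released


reads = [
    ("ABC123", "08:05", "A"),
    ("XYZ789", "08:10", "B"),
    ("ABC123", "18:40", "C"),
]
released = tokenize(reads)
```

Since a token is not derived from the plate, there is nothing to brute-force. The cost is that the whole dataset has to be tokenized in one pass, or the mapping has to be stored somewhere safe, for the tokens to stay consistent.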

That said, even with randomly generated unique tokens, I suspect knowing where someone was at two or three specific times would be enough to deanonymize them.

You don't even have to know where. Given enough samples all you need to know is that they weren't in certain places (for instance, at home). Eventually the constraints will line up in such a way that only one car matches the pattern, and then you know all the other places and times too. A sort of reverse sudoku.
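The elimination idea can be sketched in a few lines. Everything here is synthetic: the tokens, the day and place labels, and the helper name `matching_tokens` are made up. The sketch also assumes the readers capture every car, which real readers do not, so in practice an absence from the data is weaker evidence than a presence.

```python
def matching_tokens(reads, seen=(), not_seen=()):
    """Return the tokens consistent with every known presence and absence.

    reads:    iterable of (token, day, place)
    seen:     (day, place) pairs where the target is known to have been
    not_seen: (day, place) pairs where the target is known NOT to have been
    """
    by_token = {}
    for token, day, place in reads:
        by_token.setdefault(token, set()).add((day, place))
    return {
        token
        for token, visits in by_token.items()
        if all(s in visits for s in seen)
        and not any(n in visits for n in not_seen)
    }


reads = [
    ("t1", "mon", "office"), ("t1", "tue", "office"), ("t1", "wed", "home"),
    ("t2", "mon", "office"), ("t2", "tue", "home"), ("t2", "wed", "office"),
    ("t3", "mon", "office"), ("t3", "tue", "office"), ("t3", "wed", "office"),
]

step1 = matching_tokens(reads, seen=[("mon", "office")])
step2 = matching_tokens(reads, seen=[("mon", "office"), ("tue", "office")])
step3 = matching_tokens(
    reads,
    seen=[("mon", "office"), ("tue", "office")],
    not_seen=[("wed", "office")],
)
```

Each extra constraint shrinks the candidate set, and once a single token is left, every other read carrying that token belongs to the same vehicle.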

I think that this data should always be public and then maybe tags will become a thing of the past.

I'm not sure why the down-votes, but what I mean is that nowadays, with tag readers everywhere, I think we give up too much privacy by having a tag on the exterior of the vehicle. Steve Jobs never had a tag on his car as far as I know, but this was in CA and he had to switch cars every 6 months... I don't know if this is legal in any other states.

You could follow bank truck routes.

You could find where someone lives based on an instagram photo of them standing behind their car.

You see a car in a driveway and check it against the app you built with this data to find out when they're not home to interrupt your burglary.

You could plan hijackings of valuable shipments.

or kidnappings of wealthy people.

Lots of crap you can do when you basically know where someone or some vehicle is at all times.

I'm sure there's much worse, that's all I can think up from comic books and TV, but real pro criminals must have a million uses.

You don't even have to be a criminal to come up with nefarious uses for this info.

Health insurance companies could raise your rates if they see you parked near McDonald's. Retail chains could use the data about your traveling habits to enhance their habit tracking of you. etc etc etc

In the United States, health insurance companies couldn't raise your rates for the situation you describe -- the ACA mandates community rating, with rates allowed to vary only by age, tobacco use, family size, and geographic area.

It would be straightforward to demonstrate that your rates had been increased compared with others in a similar bracket within your area. This would have significant legal repercussions for the insurer that they would be incentivized to avoid.

>You could find where someone lives based on an instagram photo of them standing behind their car.

If you have someone's license plate you can get their registered address through online services (I've used docusearch). We're all driving around like jerks with our addresses printed on the back of our cars. It's really silly.

Raising insurance rates after you've been tracked parking near a chemo ward, general stalking, more accurate location based advertisements...

I'm having a harder time thinking of how an entity with sufficient funds wouldn't be able to misuse this.

My take on bulk collection of data is that it should be used on the people that could abuse it first, like politicians, CEOs of health insurance companies, etc.

I always hear US people point out privacy issues with scenarios of insurance companies. Is this a real world problem? Do insurance companies actually browse available data and define different rates for every citizen based on whether they attend gym, go to chemo, go to nightclubs, or have bad marks at school?

When it comes time to pay up, you better believe it.

If you're a declared "non-smoker", and die, and there's a picture of you, taken after your policy date, smoking a cigar, weed, or cigarette... good luck to your family's collection efforts.

Many health insurers are collecting lots of trivia under the new "wellness" programs that are popping up. You can be sure that they are trying to correlate gym attendance vs. various claim events.

The gist is that you wouldn't know the real 'why' of such a rate increase / quote. The real question is why do you think a corporation with the resources of a large insurance company wouldn't secretly use such information? Especially when they are famous for doing anything in their power to not pay out or cover high risk patients...

When they are allowed to, yes.

No. All the data that they use is self-reported (when you sign up for health insurance, they ask you if you smoke, etc).

You may be interested in these links. In essence, it takes very few locations to uniquely identify an individual.

I believe Ars was able to accurately track an Oakland city councilman using similar data. [3]

[1] http://www.technologyreview.com/view/512946/how-access-to-lo...

[2] http://www.wired.com/2013/03/anonymous-phone-location-data/

[3] http://arstechnica.com/tech-policy/2015/03/we-know-where-you...

It's showing my home address when entering my license plate...ugh. smh.

Really? You might want to read up on the DPPA (Public Law 103-322). I'm not a lawyer, but I'm pretty sure tracking a plate back to a home address is a big no-no under Federal law.


> ...is a big no-no...

Is that a euphemism for "illegal"? If it is, let's all agree to call a cigar a cigar and stop unwittingly assisting the spin machines of criminals and/or authoritarians. :)

The law was enacted to regulate what states could and couldn't do with license plate data in terms of making it available and traceable back to an individual. Under the law, no government agency can establish a mechanism to directly tie a plate back to personally identifiable information and make it freely available on the Internet. It's not illegal for a private citizen to track the info, I believe it's illegal for a government agency to make it broadly available. But then again I'm not a lawyer so I may be misreading the law, thus the reason for "no-no" as to avoid rendering anything that might remotely look like I know what I'm talking about.

> ...I'm not a lawyer so I may be misreading the law, thus the reason for "no-no" as to avoid rendering anything that might remotely look like I know what I'm talking about.

It's better to use a construction similar to the following:

"I'm not a lawyer -so I might be wrong-, but according to 123.e.(g).1 of $LAW, it looks like $THING is illegal.".

Authoritarians and other folks in power love to downplay the seriousness of their actions with soft talk and babytalk. [0] It's better to be explicit about your uncertainty and potential lack of qualifications to interpret source material, rather than to simultaneously imply your lack of expertise and reduce the perceived seriousness of a prohibited action. :)

[0] Please note that this is not a criticism of you. I try hard to say only what I intend to say and leave nothing to interpretation.

I'm missing where these records are present. I basically see plate numbers, timestamps, and coordinates; there are no other columns in these datasets.

A car is likely to be parked right outside your home.

Oh I thought that these were all going to be street level data from toll booth type things.

Just for example, this will certainly make life a lot more dangerous for people trying to get away from or remain hidden from an abusive spouse.

Ever had a stalker?

Or any kind of personal business you wanted to keep private? E.g. ever parked in front of Planned Parenthood?

@CyberDildonics, yes. The license plates column is not anonymized.

Are the license plates the actual numbers verbatim?


Anecdotal, and over ten years ago, but a local county was publishing property maps, valuations, and the names of the last owners to the web. Great way to find people who had unlisted numbers like I did and get a general idea of their income.

Now the question I have: are there other accessible databases from which you can tie these back to their owners?

Wow, before clicking I thought, "these better be anonymized" then before the page loaded I thought, "even if they are, with just a few points of data (like, I know someone went to work these 3 days but not these two days, which plates match that pattern?) it would be possible to locate some people and therefore know all other movements they made that month."

But then it's not even anonymous, so I could just look my neighbor up by their license plate. What about jealous partners? What about thieves that instantly know everyone's work patterns? What about your boss seeing if you actually did stay home sick that one time?

These sorts of "emergent" privacy leaks are interesting. Information that is already "out there" but being collected and distributed for the first time is like a system shock when it happens.

Funny. I posted something similar about Chicago a few hours ago.

Copy/paste: Chicago Tribune pretty much did this after a FOIA request for red light data was published. It's not searchable by owner name, but it's still pretty terrible.




I've sent many FOIA requests for parking ticket and towing data - the license data gets redacted every single time. I'm really curious about how the Tribune got that data. FOIA specifically forbids license plate information from being released.

Ars Technica already got 4.6M scans from 3.5 years of collection, and all it took was FOIA requests: http://arstechnica.com/tech-policy/2015/03/we-know-where-you...

I have already been visualizing the Oakland ALPR data. Thanks very much for the links to the Chicago data. The Guardian has some interesting visualizations for Chicago.

Are there any legal ways of protecting yourself from this sort of data collection?

Well there is the Steve Jobs solution: don't have license plates.

There's a searchable map of the data at oaklandlpr.com

I assume this is accidental?

