Maker of US border's license-plate scanning tech ransacked by hacker (theregister.co.uk)
235 points by prostoalex 24 days ago | 99 comments

Are they worth it as a target? With an RPI and openalpr I can easily build my own license plate reader in one evening. This technology is so far out of the bag already.

I’m critical of people like me (millennial, working in tech) and I think I have good reason. I hear so many bizarre technological “solutions” to what are ultimately policy issues. If we spent half that time instead lobbying our representatives we would be in a much better place as a society. Can you name your state rep? How about you write them a little today rather than succumb to cynicism or spitballing tech.

Reading the license plate is trivial; it's the processing and the advanced queries that make ALPR solutions real.

Sure you could create a bunch of pi cams able to handle a few reads per second, but then say you want to know what plates have traveled with your target plate between multiple camera sets to see if someone is being followed, or you have cell phone pings and want to search all cameras in the radius of that tower for a plate.
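
A rough sketch of the first of those queries (plates that keep showing up near a target plate), assuming sightings are plain `(plate, camera_id, unix_seconds)` tuples. The function name, the 60-second window, and the two-camera minimum are illustrative assumptions, and the nested scan is exactly the part that stops being cheap at scale:

```python
from collections import defaultdict

def co_travelers(sightings, target, window_s=60, min_cameras=2):
    """Plates seen within window_s seconds of `target` at several distinct cameras.

    sightings: iterable of (plate, camera_id, unix_seconds) tuples (assumed layout).
    Returns {plate: number_of_shared_cameras}.
    """
    sightings = list(sightings)
    target_hits = [(cam, ts) for plate, cam, ts in sightings if plate == target]

    cams_by_plate = defaultdict(set)
    for plate, cam, ts in sightings:
        if plate == target:
            continue
        # Naive scan over every target hit; a real system would index by camera and time.
        for t_cam, t_ts in target_hits:
            if cam == t_cam and abs(ts - t_ts) <= window_s:
                cams_by_plate[plate].add(cam)
                break

    return {p: len(c) for p, c in cams_by_plate.items() if len(c) >= min_cameras}
```

A plate that passes one camera right behind the target is noise; one that does it at two or more cameras is worth a look.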

The reading isn't hard, it's fully solved; the economical and real-time searching of plates and evidence compilation is the much harder problem.

It’s true. There is also a lot of data cleaning and grouping to do. Building sighting histories gets computationally intensive with scale.

> Building sighting histories gets computationally intensive with scale.

Can you talk more about why this is difficult?

My very naive assumption is it would be pretty easy to partition/index the data by plate number and location/time and use something like Storm or Spark (?) to run various types of queries.

Sure. I have a SaaS, EasyALPR.com that uses ALPR for parking enforcement that I've been working on for a few years.

I'm near a major release actually, replacing my previous products with something called Parking Enforcer, which I believe is the best mobile app / vehicle grouping tool in market. I focus on business parks with 300-2000 parking spaces to patrol. It has been in beta for about 9 months.

I have been working on algorithms related to ALPR data set matching a lot, primarily in Python.

I'm not familiar with storm / spark. However, one issue is that license plate reads are not 100% accurate. So you are looking for a fuzzy version of the plate.
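
A minimal sketch of what "a fuzzy version of the plate" could mean in code: plain Levenshtein edit distance after folding a few characters that OCR tends to confuse. The confusable pairs, the helper names, and the one-edit tolerance are assumptions for illustration, not the actual EasyALPR logic:

```python
# Hypothetical table of look-alike characters, folded onto one representative each.
CONFUSABLE = str.maketrans("OIBSZ", "01852")

def normalize(plate):
    """Uppercase, drop spaces and hyphens, and fold look-alike characters."""
    return plate.upper().replace(" ", "").replace("-", "").translate(CONFUSABLE)

def edit_distance(a, b):
    """Classic Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

def plates_match(a, b, max_edits=1):
    """True when two reads are within max_edits of each other after normalizing."""
    return edit_distance(normalize(a), normalize(b)) <= max_edits
```

The folding step cuts both ways: it forgives an `O` read as `0`, but it also makes two genuinely different plates that differ only by `B` versus `8` look identical.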

It's possible for collections of plates to sort of fuzz-out as incorrect members become more distant from previous ones, causing new matches to join groups they should not. You can write stuff to handle this but the original point was ~"this problem is in analysis not ALPR" which I agree with.

As far as computation, a new plate may or may not have a group to join. If it has been seen before, you need to look for the group "most likely" to be the same car. This can mean iterating over a large data set to look for the highest probable group.
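
One way to picture the "most likely group" search, as a sketch only: score the new read against each group's most frequent plate string and take the best score above a cutoff. Comparing against that canonical string, instead of the latest member, is one possible guard against the drift described above. The `Group` class, the use of `difflib.SequenceMatcher` as the similarity score, and the 0.8 cutoff are all assumptions:

```python
from collections import Counter
from difflib import SequenceMatcher

class Group:
    """All reads believed to be one vehicle (hypothetical structure)."""
    def __init__(self, first_read):
        self.reads = Counter([first_read])

    def canonical(self):
        # Most frequent read so far; ties go to the one seen first.
        return self.reads.most_common(1)[0][0]

def similarity(a, b):
    """Similarity in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()

def assign(read, groups, threshold=0.8):
    """Attach `read` to the best-scoring group, or start a new group."""
    best, best_score = None, threshold
    for g in groups:  # linear scan: this loop is the expensive part at scale
        score = similarity(read, g.canonical())
        if score >= best_score:
            best, best_score = g, score
    if best is None:
        best = Group(read)
        groups.append(best)
    else:
        best.reads[read] += 1
    return best
```

With this anchoring, `ABC128` can join a group whose canonical plate is `ABC123`, but a later `ABD128` is scored against `ABC123`, not against `ABC128`, so it does not chain its way in.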

There are some tricks to cutting down on the data set under consideration. For example it is much easier to only look at possible sightings from this past week in this area than every sighting from everywhere. (which when assembling a national database may be necessary)
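
A sketch of that trick, assuming each sighting carries a latitude, longitude, and Unix timestamp: bucket sightings by a coarse grid cell and day, then only pull the buckets near the new read. The cell size, the 7-day lookback, and the function names are made up for illustration:

```python
import math
from collections import defaultdict

CELL_DEG = 0.01   # hypothetical grid size, roughly 1 km of latitude
DAY_S = 86_400

def bucket(lat, lon, ts):
    """Map a sighting to a coarse (lat cell, lon cell, day) key."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG), int(ts // DAY_S))

def build_index(sightings):
    """sightings: iterable of (plate, lat, lon, unix_seconds) tuples (assumed layout)."""
    index = defaultdict(list)
    for s in sightings:
        index[bucket(s[1], s[2], s[3])].append(s)
    return index

def candidates(index, lat, lon, ts, days=7):
    """Sightings in the 3x3 block of cells around (lat, lon) over the last `days` days."""
    cx, cy, day = bucket(lat, lon, ts)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for d in range(day - days, day + 1):
                found.extend(index.get((cx + dx, cy + dy, d), []))
    return found
```

The fuzzy comparison then only runs over what `candidates` returns, not over every sighting ever recorded.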

But in my experience, even in the tens of thousands of plates, doing grouping requires task queues and tricks for quick identification and notification of matches. A watchlist (blacklist) might be short, but the grouping task over months of data can be long.

Some plates are VERY similar but not the same car! Sometimes the ALPR camera forwards photos of fences or vehicle grills that are not license plates at all, and those must be ignored but not at a threshold where good data is thrown out. When bad reads do make it through to the user, they have to be easy to dispose of.
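
A sketch of that threshold trade-off, assuming the ALPR engine hands back a text string plus a confidence between 0 and 1. The two cutoffs, the plate pattern, and the `triage` name are hypothetical; the point is that a middle band sent to a person avoids choosing between dropping good data and showing junk:

```python
import re

# Assumed shape of a plausible plate: 2 to 8 letters or digits.
PLATE_RE = re.compile(r"^[A-Z0-9]{2,8}$")

def triage(text, confidence, accept_at=0.85, drop_below=0.40):
    """Sort a raw read into 'accept', 'review', or 'drop'."""
    cleaned = text.upper().replace(" ", "").replace("-", "")
    # Fence slats and grills tend to produce strings that do not look like plates.
    if not PLATE_RE.match(cleaned) or confidence < drop_below:
        return "drop"
    if confidence >= accept_at:
        return "accept"
    return "review"  # ambiguous: leave it for a human to confirm or discard
```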

Data cleaning things can require additional capabilities of analysis, like breaking apart improperly grouped vehicles, yet ensuring they do not rejoin "bad" groups. Lots of details to handle, and to an end user, it is painfully obvious when "matches" are not correct. They expect magic.
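
A small sketch of the "break apart, and do not let them rejoin" requirement: when a user splits sightings out of a group, remember the (sighting, group) pairs so later matching can skip them. The `Grouper` class and its method names are invented for illustration:

```python
class Grouper:
    """Tracks groups of sighting ids plus user-made 'never rejoin' decisions."""

    def __init__(self):
        self.groups = {}      # group_id -> set of sighting ids
        self.banned = set()   # (sighting_id, group_id) pairs a user has rejected
        self._next_id = 1

    def new_group(self, sighting_ids=()):
        gid = self._next_id
        self._next_id += 1
        self.groups[gid] = set(sighting_ids)
        return gid

    def can_join(self, sighting_id, gid):
        """Matching code should check this before adding a sighting to a group."""
        return (sighting_id, gid) not in self.banned

    def split(self, gid, sighting_ids):
        """Move sightings out of `gid` into a fresh group and ban them from returning."""
        moved = set(sighting_ids) & self.groups[gid]
        self.groups[gid] -= moved
        for sid in moved:
            self.banned.add((sid, gid))
        return self.new_group(moved)
```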

One other note is that I found building an efficient and useful model architecture for these purposes to be challenging. There are more details in how the raw data comes in from ALPR scans that have to be handled before you even reach an individual "Sighting."

Question, from a technical perspective, how much of your fuzzy/unclear plate problem do you think will be solved by advances in cameras? In the context of a car with 4 to 6 cameras mounted on it roving a parking lot checking plates. Already the video output from the hdmi port on a GoPro hero7 black is considerably better than what a $4000 camera could do just 3.5 years ago, and it costs $349... Or various setups I have seen with modified 22 megapixel Sony mirrorless cameras used for GIS orthophotos from drone platforms.

Hard to say. FWIW, much of the time the exact plate gets read. But it also matters what has been trained. Some states have weird vanity plates. So it is not purely an optics problem.

If I could I would have all my customers scan plates with the iPhone XS. But many work with 7 or even less.

Also, introducing a lot more collected data to dedupe is a thing as well. One of the first algos I worked on was just recognizing, in short-term new data, that we already had a car, and not treating the same car like it was another car.
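
A minimal sketch of that short-term dedupe, assuming reads are `(plate, unix_seconds)` pairs and using exact string equality for brevity (a real version would need the fuzzy comparison discussed earlier). The 120-second window is an arbitrary assumption:

```python
def dedupe(reads, window_s=120):
    """Collapse repeated reads of the same plate that arrive close together.

    reads: iterable of (plate, unix_seconds). Returns the reads kept, in time order.
    """
    last_seen = {}
    kept = []
    for plate, ts in sorted(reads, key=lambda r: r[1]):
        prev = last_seen.get(plate)
        if prev is None or ts - prev > window_s:
            kept.append((plate, ts))
        # Update even when suppressed, so a car that stays in view remains one sighting.
        last_seen[plate] = ts
    return kept
```

More cameras per vehicle means more near-simultaneous reads of the same car, so this step gets more important as hardware improves, not less.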

Are there ALPR companies with a workflow where the truly unrecognizable plates go in a review queue for image recognition by a human offshore somewhere, in a low cost location? I'm thinking of the standard call centre salary for people with an average level of education in second/third-tier cities in India, in Pakistan, in Bangladesh, etc.

Not that I’m aware of. But it is certainly possible. There is just a lot of data.

There are some tools that focus on this “fuzzy deduplication” problem:

Senzing https://senzing.com/

Tamr https://www.tamr.com/

Thank you I will look at these.

> I hear so many bizarre technological “solutions” to what are ultimately policy issues.

Tech is empowering, much moreso than playing politics.

Consider something like the Kafkaesque nightmare that is applying for (and keeping) food stamps. It doesn't need to be complicated (nor should it be!), but you can either try to convince elected officials to make the poor a priority and fix the process or roll up your sleeves and write a script to complete the forms in triplicate, generate mailing labels with delivery confirmation, remind users of deadlines and pull phone records to prove the social worker never called like they said they did in the denial letter.

Or you could petition, harass, bribe and cajole your way into enacting change, and have it all overturned with a change in administration.

In some part these technical solutions exist to fix people problems. Look at the internet itself-- where problems exist (a country's politicians/dictator makes the nation unroutable), you don't wait for a coup, you route around it.

But also in some part these solutions are just modern rent-seeking, so...

> Or you could petition, harass, bribe and cajole your way into enacting change, and have it all overturned with a change in administration.

This is the reason why technological solutions are popular among the Silicon Valley crowd. No matter what, political solutions are plagued by human emotion and self-interest, and thus they become sticky, "corrupt", and slow. Technological solutions are subversive of that structure at the least, and a force multiplier in others.

The computer will generally do what you tell it to do. You can spend hours of effort on something and get a deterministic result that will do the thing. You can spend your whole life in politics and get nothing out of it because the entrenched power structures won't let it happen.

The computer will do what people tell it to do, so it’s just as plagued by human emotion and the self-interest of those who program it.

It’s just a different form of the same corrupt politics.

The biggest problem with technology is this delusion that it somehow isn’t a reflection of human failings.

If we at least admitted that then we’d be able to reason about it responsibly.

More technology to process forms just enables the bureaucracy to create more forms, in the same way that computer programs have gotten more fat off of the increased power and capability of modern devices. It makes the process shitty for everybody else who don't want to pay for, can't pay for, or aren't covered by the nifty tech. Exhibit A being TurboTax.

Technology is a band-aid, not a solution, and technology can also be used to let progress towards solving the problem decay, or to make the problem actively worse.

I think "covering up gangrene" works well with the band-aid analogy

> Consider something like the Kafkaesque nightmare that is applying for (and keeping) food stamps.

I have long thought that this particular process was intentionally difficult as a means of helping to weed out people who aren't really in need.

>you don't wait for a coup, you route around it.

No, we’re still waiting for a coup in China, North Korea, and on and on for the citizens to access the internet.

> Are they worth it as a target? ... This technology is so far out of the bag already.

The point is that if you have access to their systems you can affect what is reported. For example you could add a bogus plate to the data stream, or remove one, or perform a substitution. I think we can all imagine cases where doing such a thing might be useful to someone.

Others have also pointed out that it’s possible that this company has some particularly interesting recognition technology, but I agree with you that this is really a second order issue.

I was thinking that if you can see the logic behind their license plate recognition algorithm you can engineer modifications to license plates that would make them unrecognizable by the police. For example, adding a weird black square on a few of the numbers that look fine to the naked eye but for whatever reason screw their scanners up.

excellent point!

Isn't the article stating that the hackers are posting the actual license plate scans of people crossing the border? That seems like a big deal to me.

Politics is about convincing as many people as possible to vote your way, and to make a big difference, you have to do it at scale. Changing people's minds is hard. Much like sales, it also has downsides like making a pest of yourself.

It's important, but I am a bit frustrated by shallow encouragement that makes politics sound like it's easy. It's like saying "get a job" rather than providing actually helpful resources and training to find a job.

I think the reason why this is significant is not so much that the proprietary files were leaked, rather the amount and breadth of personal information exposed along with the insinuation that this company's network (who maintains large amounts of personal data of unwitting participants) is not secure.

I bet a little digging and we will discover that this network was hopelessly undefended and their software is horribly riddled with holes and poor security practices.

It was obviously done to impact the company as an entity rather than to target individuals or the government. It's quite obvious, to me, that the attackers wanted to dump a large liability in someone's lap and perhaps for good reason. If this company can't secure the data it collects about innocent civilians then it shouldn't be allowed to collect it.

Perceptics is a pretty big player in the license plate camera market. This breach is extremely embarrassing for them and will doubtless put their future work with government in jeopardy as they become a political target. I wonder if this hack was perpetrated by China as retribution for the Huawei blockade? The Chinese perhaps have an arsenal of such breaches ready for release when the right politician needs a nudge.

>> Can you name your state rep? How about you write them a little today rather than succumb to cynicism or spitballing tech.

Yes I can name (all) of my state's representatives. No I'm not going to write them. Your letter and sending information get filed under an equivalent of "dissenters". That's kind of the entire point here.

What makes you think the object of the attack was to duplicate their technology?

I would be more interested in the security implications. Hackers now have access to the source code; what kind of attack surfaces does this yield against customers who purchased the compromised software?

How does lobbying work? Is it backed by some kind of power other than the eager generosity to provide free policy consultation?

if you have the ML that parses the plates, you could make adversarial plates that it couldn't find

Related to this, I'd love to see some in-depth wire brushing done on how the US got blanketed with 4 way cameras at every intersection and how this data is used/kept, and by whom.

I first noticed it in major centers years back, but now it seems even small towns have cameras at every intersection.

From an IT perspective it's a pretty interesting project, but from a tin-foil-hat perspective it's astonishing when you imagine the ability to link all these cameras together in real-time.

We have them all over the place in Michigan. I specifically asked my friend who's a civil engineer with the county road commission when some were being installed just down the road from my house why they were being added. I was personally concerned at the prospect of red light cameras, which are currently banned in the state.

The explanation was that they were updating the stoplight controls. One of the inductive sensors in the road had failed, and it was cheaper to have a guy in a bucket truck stick a couple cameras on the pole than to rip up the road. The cameras are used to see the volume of vehicles in each lane and dynamically adjust light timings. And since I drive through the intersection multiple times a day, I have noticed an improvement. They skip the left turn sequence if no one is waiting, and rarely have a big backup when volume is high in one direction for the commute. Also, the left turn timer used to be very short (like 2 cars making it through on green, one on yellow, and the 4th car often took control and went on red), which was nice when there was only one car and you wanted to go straight, but annoying when you were one of 6 or 7 cars in line and an extra five seconds would let everyone make the turn but instead you had to wait through multiple light cycles. Now it seems to often hold the turn cycle long enough to let the whole line empty out.
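
As a toy illustration of the behavior described (skip a phase nobody is waiting for, hold a green longer when the line is long), assuming the cameras yield a count of waiting vehicles per phase. The timing constants and the `plan_cycle` name are invented; real signal controllers are far more involved:

```python
def plan_cycle(queues, base_s=10, per_car_s=2, max_s=45):
    """Turn per-phase vehicle counts into a list of (phase, green_seconds).

    queues: dict mapping phase name -> number of vehicles waiting (assumed input).
    """
    plan = []
    for phase, waiting in queues.items():
        if waiting == 0:
            continue  # nobody waiting: skip this phase entirely
        plan.append((phase, min(max_s, base_s + per_car_s * waiting)))
    return plan
```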

But I totally agree that the idea of a soft update to either issue red light tickets or track license plate activity is extremely concerning. Might end up with a stray paintball from my backyard accidentally hitting the lens if they make that a policy change.

Camera control of lights is a blessing for many motorcyclists, there are just some lights that will not trip. Plus like the guy told you, it is far cheaper. Plus I know in Atlanta they use the cameras to adjust timing and such and did show improvements.

If there is no retention, or just a "press here to save the last five minutes" button in case they witness an accident, that would be good.

My 2016 car came with three cameras. My phone has two.

Cameras are already everywhere. As they become even more dirt cheap, that will only hyper-increase, even if governments somehow completely stay out of it.

Sometimes you enter into a new technological era, and you have to accept that things have changed.

Sometimes I ask people about the cameras and the white boxes (those ubiquitous white boxes on poles, often solar-powered, along highways pointing perpendicular to moving traffic). I ask what they think - who put them up, what data they collect, where that data goes. I am routinely met with blank stares and "I don't know what you're talking about". It baffles me.

Good job vetting your vendors Uncle Sam! Somehow, we'll end up paying (taxpayers) to clean this up - just like we paid to deploy it. Sure would be nice if we had any voice/vote in these things...

You think people voting on vendors would somehow result in less data leaks?

It would result in us not approving the project/vendor at all.

edit: We (citizens) don't like being spied on, and it just rubs salt in the wound that we have to pay for it.

Given that this is literally only useful for tracking people, not approving the project would be ideal.

Tracking people is like 90% of what every organization does. It's why we have names, phone numbers, addresses, id numbers, license plates, etc. "Tracking people = bad" is an untenable assumption.

Well go ahead and dox yourself right here for us.

Perhaps I'm not being clear. "Tracking people involuntarily" might be bad. "Tracking people unnecessarily" might be bad. But they will be bad because they are involuntary and unnecessary, not because they are tracking people.

Ok, I argue then that all tracking of people is unnecessary.

Are you going to actually argue it, or are you just going to tease? It means nothing to me that you would disagree in some vague manner.

My username is associated with my real life identity and would be trivial for people to look me up.

I am already "doxxed" and it doesn't bother me.

I don't see it anywhere. Where does it list your personal info?

It's easily googleable, man. In 5 minutes you could find my LinkedIn info, and then know exactly who I am and where I work.

If you think 0% of the citizenry supports scanning licence plates that cross the border, I have bad news for you.

The overwhelming majority of people are probably ok with recording plates at the border. If it's confined to the border it's not really any different than the customs agent jotting down your license plate. It's also not really "tracking", it's just a record created at that single point. It's the whole network that makes it "tracking"

It's the record keeping everywhere else, and the tracking that it enables, that's not ok.

Jotting it down on paper is not the same as aggregating it and uploading it to an open FTP server.

I honestly support license plate readers on most roads, if they were used for average speed measurements and in conjunction with automated ticketing of people speeding.

If you are a citizen and have not been disenfranchised, and I grant you that many fall outside of these buckets, you absolutely have a say. That say is that you can elect whomever you want to decide these things for you. It is the central mechanism in a representative democracy.

Re representative democracies, why are opinions on disparate issues like the economy, abortion, climate change, etc. all packaged into one party? At least 99% of people won't find a party that agrees with them on every single issue, so it feels like there should be a better system.

Because the US has first past the post. Sure you could vote for someone who matches every single one of your ideologies, but the chance that they win is zero.

Is there a country that does it better? I can't think of one. Voting on people seems what's being done everywhere, and the source of this problem.

Ergo: need a better system

But first past the post already works...

...for the people who would need to vote to change it.

They ain't gunna change something that will work against them after they change it...

Yea, you nailed it.

I'm not sure you are attacking representative democracy, but instead are attacking the US two party system. I'm def not going to argue against you on that. Just wanted to say that in a representative democracy, one does have a say in policy. I do very much agree that the US tries to give us all as little of a say as possible, as the "adults" (read: billionaires) decide things for us

Unfortunately, in the US, we've allowed our representatives to enact a hard cap on the number of representatives. So the representative you elect decides these things for an increasingly larger number of citizens as population increases. So while you have a say, the weight of that say on choosing a representative is ever decreasing.

Gerrymander strategy dictates that you want as few urban districts as possible to maintain rural power. Raising the caps would weaken the benefit of packing opposition voters into fewer districts.

Yes, removing the limit put on in the early 20th century would have broad-ranging consequences; most of them would be to the detriment of the existing representatives.

...and then the person we voted for doesn't win the election and the person who does win the election doesn't listen to the citizens they're supposed to represent.

This is a flawed viewpoint. The idea that any single company can stand up to a nation state is absurd. The level of resources that Russia or China brings to bear for a single hack is far beyond what any company or even groups of companies could defend against.

This is the unfortunate weakness of the Western style democracies in the face of totalitarian states. We have a much more obvious divide between private and public entities. In China and Russia, the lines are blurred and often they get much better support from the government to defend and hack the opposition. Even to the point where China will hack US companies and just give the IP to Chinese companies.

> Even to the point where China will hack US companies and just give the IP to Chinese companies.

You can be sure that the USA is giving the "acquired" IP to their US companies as well.

Where do I sign up to collect stolen IP?

Not to the level that China is doing it, that's for sure. Just by the very nature of how closely companies in China operate with the government, you'd be fooling yourself if you think that's equivalent.

I try my best!

> The files also include .mp3 files, presumably from someone's desktop or laptop PC. Among the songs: Superstition, by Stevie Wonder, and Wannabe by Spice Girls, and a variety of AC/DC and Cat Stevens songs.

Quite an odd detail to add to the article, why was this seen as relevant?

It humanizes the victim, from "a computer hard drive was hacked" to "this is a person with quirks who you could know."

Also allows me to do some musical profiling. This identifies them as probably a GenXer or so. Because you would've had to be between about 15 and 25 to like the Spice Girls in 1997, let's say up to 30 y.o. to keep their MP3 ironically and/or nostalgically. And I'm claiming there's no way in hell you would've "discovered" them later, so that's a 15-year age range, birth years bounded 1967-1982, current ages 37-52.

Stevie Wonder and Cat Stevens tend to push it more toward the older end of that range.

AC/DC correlates with that entire cohort, so doesn't provide any additional or contradictory information. Unless we knew whether it was Bon Scott era or Brian Johnson, heh heh.

Expected age 46 ± 5.

And it lets everybody know: hey, again it's a Windows desktop system in a critical environment, and surely everything else is built on Windows-based products.

Reminds me of when they reported on the contents of the hard drive of Bin Laden.

He was a Spice Girls fan?

It contained strategic Al-Qa‘ida documents, but news outlets loved writing about the unlicensed antivirus software, Disney movies, anime, and porn that were also on the hard drive.


I think he watched anime.

Were logs of license plate scans leaked? If so, is there anywhere we can find the dump to see if our plates are in it?

This is dystopia, but this is not just any dystopia, this is dystopia with 'justification', this dystopia is 'legal' and for many that word somehow makes everything ok, but for the rest of us trivializes everything of value.

There is so much cognitive dissonance and denial in the tech community about their role, not just in building but also in defending and whitewashing narratives, that it becomes difficult to see movies and read about surveillance dystopia and be expected to feel creeped out, and then return to current reality where it's sort of normalized and ok.

so it's probably some off the shelf ANPR tech that sends info into a database?

same as my local supermarket's car park?

Looks like Russians (disclaimer: am Russian, work in infosec).

Some previous hacks that were attributed to Russians, like Shadow Broker leak, actually were executed by somebody else, I think. This one is more suspicious, in my opinion.

Wouldn't the first thing a good hacker would do be to make sure he doesn't get caught? A good start would be to make it look like someone else did it, especially an entity that can't be checked or wouldn't cooperate to catch the actual hacker, like the Russians or Chinese.

Yes, that's standard practice. Even hobbyists like me do it.

Also, any good hacker would be sure to leave behind multiple access paths. But maybe a professional hacker would refrain from dumping stuff, because that alerts the target. So the dump implies that they're not very professional.

This dump was made public on purpose. So were other dumps in cases like that.

OK, what's the advantage for the hacker? Except fame, I mean.

Isn't it better to stay quiet, to help ensure long-term access?

It all depends on the goal right? In this case it seems the goal was to harm the company because they didn't pay ransom. Not to perform long term espionage.

... Based on?

"Boris Bullet-Dodger" is a Russian joke name, «Борис-хрен-попадёшь» (and it is rare enough to be picked up by somebody who is not Russian). Repacking everything in rars in 2019 is also a Russian thing. If you are interested why I think Shadow Broker was not of Russian origin (as opposed to Guccifer 2.0, who was), I can also provide my insights.

The choice of the venue to leak the information and some other minor details lead me to strongly suspect one of our state-sponsored APTs.

I'd certainly be interested to hear your thoughts on the Shadow Brokers.

Their texts do not look like a Russian wrote them. It is broken English, right, but it is broken in wrong places (I am no stranger to writing in bad Russian English myself).

Russian English is known for skipping or confusing articles, longer sentences, more commas than it’s necessary, wrong tenses, and some troubles with pronouns. There is almost nothing of that. What _is_ there instead is way too many abbreviations than a Russian would use, things like POTUS/SCOTUS etc. We don’t use those. Also, the word “caucus” — quite an americanism, if a non-American person would have used that, they would speak much better English.

I am 90% sure that the perpetrator here is either an American, or lived in the US for a long time, and wanted to disguise his writing by intentionally distorting it.

Why do they repack in RAR?

I have always, when encountering RARs, rolled my eyes and then tried to remember the names of tools to do extraction.

RAR was invented by a Russian (Eugene Roshal) and still popular in Russia (compresses better than zip and faster than 7z).

Interestingly, 7z is also of Russian origin (Igor Pavlov), and a lot of data compression research comes from there too --- possibly a legacy started by Markov.

That, and the fact that our computers were really shitty. :) It also helped in development of very clever analytical and numerical integration techniques, e.g. Sobol sequences.

That's really interesting. When working under constraints people get creative. I think it just happened that most computers I've worked on don't handle RAR by default, so I never knew the background.

I believe this is a holdover from piracy in the early 2000s, when file sizes needed to remain small (due to unreliable downloads or the need to fit on a particular piece of physical media) and RAR was a convenient format for generating compressed archives in multiple parts.

I recall it being the common go-to for file-sharing (piracy) uploads on Usenet. I'm not sure if it was because of repair-ability when you were missing some of the RARs, file-size limitation, resume-ability, low bandwidth, etc... Probably a combo of them all I guess; wasn't aware it was still used nor that it might be more common in RU.

Yeah that's the one I thought of. A tough-guy character in a tough-guy movie made in 2000 seems a bit less obscure than some Russian pun. Also what actual Russian would choose such a name?
