Show HN: Build a DIY license plate reader with machine learning (github.com)
190 points by calebkaiser 10 days ago | 60 comments

Hey HN,

I’m Caleb, and I’m a maintainer of Cortex, an open source model deployment platform. Not long ago, we published this DIY license plate reader project, and I wanted to share it here for anyone who is interested in computer vision or production machine learning in general.

The project is a web service that accepts images and, using three trained models, returns the extracted license plate text, assuming there is a license plate in the image. Two of the models are pre-trained models from keras-ocr, and the third is a fine-tuned YOLOv3. All models are freely available.
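A rough sketch of how such a detect-then-read pipeline could be wired together is below. It is not the project's code: `detect_plates` and `read_text` are hypothetical callables standing in for the fine-tuned YOLOv3 detector and the keras-ocr models (the two keras-ocr models are collapsed into one callable here), and the image is modeled as a plain list of pixel rows so the example has no dependencies.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# An image is modeled as a list of pixel rows so the sketch needs no imaging library.
Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class PlateReading:
    box: Box
    detection_confidence: float  # how sure the detector is that this region is a plate
    text: str                    # what the OCR stage read inside the box


def crop(image: Image, box: Box) -> Image:
    """Cut the region inside `box` out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def read_plates(
    image: Image,
    detect_plates: Callable[[Image], Sequence[Tuple[Box, float]]],  # hypothetical detector
    read_text: Callable[[Image], str],                               # hypothetical OCR
    min_confidence: float = 0.5,
) -> List[PlateReading]:
    """Stage 1: find plate boxes. Stage 2: run OCR on each confident crop."""
    readings = []
    for box, confidence in detect_plates(image):
        if confidence < min_confidence:
            continue  # not confident this region is a plate at all
        readings.append(PlateReading(box, confidence, read_text(crop(image, box))))
    return readings
```

Note that the confidence stored here belongs to the detection stage only; it says nothing about whether the text was read correctly.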

You can see a video of the project in action here: https://www.youtube.com/watch?v=gsYEZtecXlA

And read a write up by Robert Lucian, the maintainer who spearheaded this entire project, about how he built a camera system to interface with the web service using a Raspberry Pi and 5G: https://towardsdatascience.com/i-built-a-diy-license-plate-r...

> using a Raspberry Pi and 5G

For anyone else wondering, he used a Raspberry Pi and a mobile internet connection. Any number of G's will do.

Even 2G?

You joke, but I literally worked with a Raspberry Pi 'hat' that talks in 2G.

I wonder if it would be more efficient with OpenCV?

That's very possible. The Python interface for writing prediction APIs makes it pretty easy to switch between models, so it shouldn't be hard to test.
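As an illustration of what "easy to switch between models" can look like, here is a minimal sketch of a predictor class that is configured once and then called per request. The class shape (`__init__(config)` plus `predict(payload)`), the registry, and the stub backends are all assumptions made for this example; the actual Cortex interface is not shown in this thread.

```python
from typing import Any, Callable, Dict

# Hypothetical registry of interchangeable backends; each maps an image payload to text.
MODEL_REGISTRY: Dict[str, Callable[[Any], str]] = {
    "deep_learning": lambda image: "stub result from a neural pipeline",
    "opencv": lambda image: "stub result from a classical OpenCV pipeline",
}


class Predictor:
    """Illustrative predictor: configured once, then called per request."""

    def __init__(self, config: Dict[str, str]) -> None:
        # The backend is picked by name, so swapping models is a config change.
        self.model = MODEL_REGISTRY[config["model"]]

    def predict(self, payload: Dict[str, Any]) -> Dict[str, str]:
        return {"plate": self.model(payload["image"])}
```

With this shape, comparing two approaches means building `Predictor({"model": "opencv"})` and `Predictor({"model": "deep_learning"})` and sending both the same payloads.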

This looks very neat. Do you think something like a Jetson Nano would be able to handle inference locally?

Thanks! The short answer is that I don't know, as it's only been run on EC2 instances, but given its current compute needs, probably not.

The longer answer is that this project has a ton of room for optimization, some of which is mentioned in the repo, and with lower latency requirements + optimizations, I don't see why it wouldn't work on less powerful hardware (I've never personally worked with the Jetson Nano, so I don't want to speak with any false confidence on it specifically).

I guess object detection is the bottleneck of the pipeline; a highly optimized YOLOv4-tiny [0] could reach 39 FPS on a Jetson Nano. Also, camera decoding with Nvidia DeepStream is a lot faster than with OpenCV.

[0] https://github.com/pjreddie/darknet/issues/2201

Take care to understand your local law regarding automatic license plate recognition. Use of such software and/or data collected with it may be under regulation. IANAL, but see your state statutes here if you live in the US: https://www.ncsl.org/research/telecommunications-and-informa...

I find it interesting that some states are trying to regulate something like this.

What if you had a child read off every license plate they see and use voice recognition to record the license plates?

Is it illegal in real time, but allowed if you replay your dashcam footage later?

What if you let the child buy drugs? Same thing. Don't let your child do illegal things and better - don't help it. If you hear your child is reciting the number plates, make sure all recording hardware in your car is off before you lecture the kid that what it does is absolutely forbidden.

This made me laugh

Good luck having someone do that 24 hours a day, or during rush hour. I assume it's more about bulk collection.

The vast majority of those laws seem to be targeted at law enforcement. I imagine any system designed to automatically collect license plates would be included, whether it's sync or async.

The site makes it sound like it's only illegal for law enforcement to do this.

The photos I saw seemed to be in Paris, and I'm pretty sure it's illegal to run a license plate reader in France.

Am I missing something, or does this image show a confidence score of 99.23% next to an incorrect license plate? The plates clearly have a 5, but it's predicting an S.


Is that confidence for the value or the category? It comes after LP in the image, which makes me think it's the confidence that it is a license plate, not necessarily the confidence in the reading of said license plate.

You might be right. The gif on GitHub, at times, shows no plate prediction but still includes a confidence score. I would suspect that if it can't even show a prediction, its confidence score wouldn't show either (since it would be 0%).

> I would suspect that if it can't even show a prediction, its confidence score wouldn't show either (since it would be 0%).

Recognizing that something is a license plate and being able to read the plate are separate things. People can identify license plates even if they can't read them, and people who don't know what a license plate is may still be able to read the numbers on them.

The ML has to do two things; the first is reject anything that isn't a License Plate, the second is OCR License Plates.

Typical model output is not actually a probability, but a mostly arbitrary number between 0 and 1. Models usually have to be calibrated to match probability.

It is really confident in its answer, but that doesn't tell you much about how likely the answer is to be correct.

For example, if I built a model that always returns the same plate number, it would be 100% confident, but wrong 1947791 times out of every 1947792 plates it sees.

That said, I would have thought confidence would be lower for plates with a 5 or an S in them.
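One common way to check whether a model's scores behave like probabilities is the expected calibration error: group predictions by stated confidence and compare each group's average confidence with how often it was actually right. The sketch below is a plain-Python version on made-up data, not anything measured from this project.

```python
from typing import List, Tuple


def expected_calibration_error(
    predictions: List[Tuple[float, bool]], n_bins: int = 10
) -> float:
    """Average |confidence - accuracy| over equal-width confidence bins, weighted by bin size."""
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for confidence, correct in predictions:
        index = min(int(confidence * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[index].append((confidence, correct))

    total = len(predictions)
    error = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        error += (len(members) / total) * abs(mean_confidence - accuracy)
    return error


# Made-up data: a model that always says 0.99 but is right only half the time.
overconfident = [(0.99, True), (0.99, False)] * 50
print(round(expected_calibration_error(overconfident), 2))
```

For this made-up model the gap between stated confidence and accuracy is about 0.49; a well-calibrated model would score close to 0.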

For neural networks, unless the model has been calibrated, the confidence score says little about accuracy. See figure 4 in [1] and you will understand what I mean.

[1] https://arxiv.org/pdf/1912.03263.pdf

ANPR tip: correlate successive images to improve confidence, and use legal plate format information (where available, e.g. in the UK) to rule out unlikely characters.
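A minimal sketch of both ideas, assuming the current-style UK format (two letters, two digits, three letters) and readings that already have spaces stripped. The look-alike table is illustrative only; a real system would derive it from observed OCR errors.

```python
import re
from collections import Counter
from typing import List, Optional

# Current-style UK plates (issued since 2001): two letters, two digits, three letters.
UK_FORMAT = re.compile(r"^[A-Z]{2}[0-9]{2}[A-Z]{3}$")
DIGIT_POSITIONS = {2, 3}

# Illustrative look-alike pairs; not an exhaustive or measured confusion table.
TO_DIGIT = {"S": "5", "O": "0", "I": "1", "B": "8", "Z": "2"}
TO_LETTER = {digit: letter for letter, digit in TO_DIGIT.items()}


def coerce_to_format(reading: str) -> Optional[str]:
    """Swap look-alike characters so each position has the legal type, or return None."""
    if len(reading) != 7:
        return None
    fixed = []
    for position, char in enumerate(reading.upper()):
        table = TO_DIGIT if position in DIGIT_POSITIONS else TO_LETTER
        fixed.append(table.get(char, char))
    candidate = "".join(fixed)
    return candidate if UK_FORMAT.match(candidate) else None


def vote(readings: List[str]) -> Optional[str]:
    """Return the most common format-valid reading across successive frames."""
    valid = [plate for plate in map(coerce_to_format, readings) if plate]
    return Counter(valid).most_common(1)[0][0] if valid else None
```

For example, a reading of `AB1SCDE` has an S where the format requires a digit, so it is coerced to `AB15CDE` before voting.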

Also, humans aren’t perfect at reading plates from highway footage still images - which is why ANPR systems are never 100% accurate.

The 5 prediction probably had a confidence score of 98.52% or something, and the model chose the one it believed to be most likely.

I don't know anything about neural networks, but the fact that you can have two incompatible predictions each with more than 50% confidence tells me that confidence is not what my intuition would predict (with 87% confidence)

99.23% is not 100%; it still leaves roughly a 1-in-130 chance of being wrong. But the wrong examples don't seem like a good demonstration of that...

Nope, you are correct. Wow, both examples from that image are wrong.

"Do-it-yourself" seems like a not quite appropriate title for an interface that calls a webservice with pre-built vision ML.

This is CIY - cobble it yourself.

That's a fantastic description. Never seen it before. I'm a data scientist. Some of my projects are DIY and some of them just consist of writing code to link data with existing models and I might start using "CIY" to describe them.

I was thinking the same thing. If it were all local to a Pi then yes, but relying on outside services isn't really DIY. Maybe half DIY, if there is such a thing.

On the other hand, I would tend to agree that this project is a "DIY Alexa" (edit to add: in the minds of most readers of Instructables website): https://www.instructables.com/id/DIY-Amazon-Alexa/

This is another example of what the GP comment is complaining about. It's not a DIY Alexa. Alexa is the virtual assistant API. This is a DIY Amazon Echo that still relies on the Alexa service.

As an engineer and a technologist, I agree that you and GP are precisely correct.

If you showed it to and asked a string of randoms off the street, I think way more than half would agree with the "DIY Alexa" label as reasonably communicative description and more useful than a longer but more precise title.

Yeah, that's a good point. It's easy to get caught up in our little tech bubble :)

I've been watching dashcam footage lately and thinking about this. It would be very helpful for hit and runs if the dashcam had some on-board recognition that saved the last x license plates just like it saves the last x seconds of videos.

Often the video footage is not good enough to get a plate because the capture settings are set lower so that more footage can be stored.
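The "last x plates" idea maps naturally onto a fixed-size ring buffer. A small sketch using the standard library's `collections.deque`, with a hypothetical `RecentPlates` class and plate strings assumed to come from some on-board recognizer:

```python
from collections import deque
from typing import Deque, List, Tuple


class RecentPlates:
    """Keep only the most recent `capacity` plate sightings, like a dashcam loop."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._sightings: Deque[Tuple[float, str]] = deque(maxlen=capacity)

    def record(self, timestamp: float, plate: str) -> None:
        self._sightings.append((timestamp, plate))

    def snapshot(self) -> List[Tuple[float, str]]:
        """Copy of the buffer, oldest first, e.g. to save when an impact is detected."""
        return list(self._sightings)
```

Because only short text strings are stored, the buffer stays tiny regardless of the video capture settings.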

I wonder how much more revenue police departments are making with ANPR[1].

They can, for example, have a camera in all the cruisers, and automatically alert when a nearby car has an expired registration, inspection, etc. Or, if they have the tie-ins, registered to an owner with an open warrant or unpaid traffic citations.

[1] https://en.wikipedia.org/wiki/Automatic_number-plate_recogni...

I’ve pondered the implications of running an ‘installation art’ project where a bunch of these are deployed around a city, with aggregated data visible over a web UI (assuming that this doesn’t violate any laws.) The aim would be to raise public awareness of pervasive surveillance, and perhaps catalyze changes to the law.

Regardless of the content, I think the idea of an installation art project whose goal is to make the art project itself illegal is pretty cool.

How does this implementation perform compared to classic approaches such as OpenALPR [1]?

At the very least running local inference becomes much more expensive, and possibly provides worse results.

[1] - https://github.com/openalpr/openalpr

This reminds me of an article [1] I read a few years back which utilised OpenALPR.

[1]: https://www.freecodecamp.org/news/how-i-replicated-an-86-mil...

The actual computation here shouldn't be that great, right? (Especially given the RPi has a GPU on board.) It seems like an inconsiderate design to stream images to a cloud service rather than process them locally and stream only the plates.

This is great! I've been meaning to build something like this for a long time. I'm always excited when I notice I've passed the same car on commutes on different days, or see a car I recognize from home in a different part of town. Automating that process and having my phone tell me e.g. "You just passed a car that you once drove past 1,000 miles away from here" is such a tempting side project. Bit big-brothery, though.
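A sketch of the bookkeeping such a side project might need: a log of where each plate was seen, plus the haversine formula to measure how far away earlier sightings were. `SightingLog` is a hypothetical in-memory structure; plates and GPS coordinates are assumed to arrive from elsewhere.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


class SightingLog:
    """Hypothetical in-memory log of where each plate has been seen."""

    def __init__(self) -> None:
        self._seen: Dict[str, List[Tuple[float, float]]] = defaultdict(list)

    def record(self, plate: str, lat: float, lon: float) -> Tuple[int, float]:
        """Store a sighting; return (prior sightings, km to the farthest prior one)."""
        previous = self._seen[plate]
        farthest = max(
            (haversine_km(lat, lon, p_lat, p_lon) for p_lat, p_lon in previous),
            default=0.0,
        )
        count = len(previous)
        previous.append((lat, lon))
        return count, farthest
```

A notification like the one described would fire when the returned distance exceeds some threshold, for instance about 1,609 km for 1,000 miles.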

I want this tied to a crude speed estimation algorithm, from a stationary camera.

Even with ALPR retention restrictions, I could trigger a video save and send the police a video of the idiots doing 50mph through the residential neighborhood.

For speed estimation, I wonder if the simplest thing would be two plate readers a fixed distance apart. Say they're attached to light poles N meters distant.

When a reader triggers and takes a picture, it pings your server for a timestamp, does the processing, and records car X at time T. You then calculate the speed as the distance between the two readers divided by the time between the two timestamps for X.
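The arithmetic is small enough to write down directly. A sketch, assuming the distance is measured in meters along the road and both timestamps come from the same server clock, in seconds:

```python
MPS_TO_MPH = 2.23694  # meters per second to miles per hour


def average_speed_mph(distance_m: float, t_first: float, t_second: float) -> float:
    """Average speed between two readers from server timestamps in seconds."""
    elapsed = t_second - t_first
    if elapsed <= 0:
        raise ValueError("second sighting must come after the first")
    return distance_m / elapsed * MPS_TO_MPH
```

With readers 100 m apart, a car doing about 50 mph covers the gap in roughly 4.5 seconds, so a half-second timestamp error shifts the estimate by around 10%. Wider spacing makes the estimate less sensitive to timing jitter.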

Provided that speeding cars are a problem in your neighborhood, I bet you could find the people responsible for the light poles (which likely have power associated to them) to let you install the devices, especially if you're providing the devices and servers.

I don't think you'd necessarily be able to ticket the drivers though, as you're not law enforcement. Maybe you'd be able to work with local law enforcement though.

I was thinking of using the bounding boxes across frames.
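One way the bounding-box idea could work from a single camera is the pinhole-camera relation: a plate of known physical width appears narrower the farther away it is. The sketch below assumes a standard 12-inch-wide US plate seen roughly head-on, a bounding box that tightly matches the plate, and a focal length in pixels obtained from a separate camera calibration. It estimates only the speed component along the camera's line of sight.

```python
US_PLATE_WIDTH_M = 0.3048  # standard US plate is 12 inches wide


def distance_from_plate_width(pixel_width: float, focal_length_px: float,
                              plate_width_m: float = US_PLATE_WIDTH_M) -> float:
    """Pinhole-camera estimate: distance = focal length * real width / apparent width."""
    return focal_length_px * plate_width_m / pixel_width


def radial_speed_mps(width_then_px: float, width_now_px: float,
                     seconds_between: float, focal_length_px: float) -> float:
    """Speed toward the camera (positive = approaching) from two plate bounding boxes."""
    d_then = distance_from_plate_width(width_then_px, focal_length_px)
    d_now = distance_from_plate_width(width_now_px, focal_length_px)
    return (d_then - d_now) / seconds_between
```

Small errors in box width translate into large distance errors when the plate is far away, so averaging over many frames would likely be needed in practice.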

Maybe I could work with a neighbor, but then things need to be tightly time-synced.

Zero chance I could get light pole access. A city endorsed surveillance system? No way.

Yeah, I am interested in adding LPR to my car's computer at some point, but it'd be locally processed and stored for personal reference.

I'd maybe like to know if it detects the plates of people I know well nearby (sometimes I wonder when I see a car I feel I recognize), and I'd want to be able to mark bad drivers I see, and be warned if they come near my car again.

My thought is that if I am driving around a driver bad enough that I think about the risk of accident, I might want to know if they're close to me again in the future.

This is great. One of my side project ideas is to build an ANPR-based app and 'social network', starting with basic tracking, i.e. "you've passed this car 4 times before". I'm new to the ML world, so I'm still learning, but do you think this is feasible to run in real time using CoreML on iOS?

I was thinking the other day about a DIY solar panel for apartments that could double as decoration, for example built into plant vases, so it would look good and not take up extra space. That would be interesting.

"You, too, can now actively participate in the surveillance state"

Yes, I get the technical appeal. I wish we all wondered less if we could, and more if we should.

I'm much less concerned about this being used by a private citizen to keep license logs along with dash cams, integrate ALPR output into dash cam footage, etc, than I am about the surveillance State as a whole.

I've wondered plenty about it and I think Yes, we should.


It _may_ be illegal where you live, but you have to remember that people live in other places than you, with different laws.

It's probably not illegal where they live judging by their tone. Often people who don't like something will tell you it's illegal to gain your compliance relying on your ignorance. They usually cuss and make a big show of it.

I'd be quite surprised if it was illegal anywhere. Dashcams are perfectly legal and even mandatory in some situations. Getting data from that footage is legal.

I guess the uploading of images done here might be murky, but again, there are millions of dashcam videos on YouTube and other video sites.

Dashcams are illegal in a few countries.
