BTW, most of the insecam cams, plus some more, are available in my c4 list, so there's no need to scrape them. It might be a bit out of date now, though. Incidentally, I had the same idea but never followed through.
Now we just need someone to design a Samaritan-esque UI for it ;-).
The HN title "Counting Foot Traffic Over IP Webcams" caught my eye because, whilst counting people in a static image is doable with YOLO, counting foot traffic (i.e. how many times people enter a scene, or how many times people leave a scene) requires more calculation: you need to know whether someone in frame t is the same person who was in frame t-1.
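To make the frame t versus frame t-1 point concrete, here is a minimal sketch of one common way to do the matching: greedy box overlap (IoU) between consecutive frames. It is not what the article does. The box format `(x1, y1, x2, y2)`, the function names and the `min_iou` threshold are all assumptions for illustration, and a real tracker would also have to cope with missed detections and occlusion.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_entries(frames, min_iou=0.3):
    """Count how many times a person enters the scene.

    frames: list of per-frame detections, each a list of person boxes.
    A box that overlaps no unclaimed box from the previous frame is
    treated as a new entry (hypothetical rule, chosen for simplicity).
    """
    entries = 0
    previous = []
    for boxes in frames:
        unmatched_prev = list(previous)
        for box in boxes:
            best = max(unmatched_prev, key=lambda p: iou(p, box), default=None)
            if best is not None and iou(best, box) >= min_iou:
                unmatched_prev.remove(best)  # same person as in frame t-1
            else:
                entries += 1  # nobody nearby in frame t-1: count an entry
        previous = boxes
    return entries


# One person walks right for three frames; a second appears in frame 3.
frames = [
    [(0, 0, 10, 20)],
    [(2, 0, 12, 20)],
    [(4, 0, 14, 20), (100, 0, 110, 20)],
    [],
]
print(count_entries(frames))  # prints 2
```

Summing the per-frame counts for the same input gives 4, which is the difference between counting presence and counting entries.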
But when I clicked through, the title of the article was 'Counting People with Machine Learning', and it's just counting presence, not foot traffic :(
If I had a store, though, I would count someone walking past one way and then the other as two potential visits.
I would imagine that "people in room at time t" and "people in room at time t+1" is quite a good proxy for "number of people present all day"; it is certainly an upper bound.
It's not as simple as that. The problem is that YOLO can process ~100 FPS, where a frame is something like 250x250 pixels. If you want to process high resolution, you have to sweep a 250x250 "window" across the high-res image and run YOLO on each. Then you have to figure out which objects were counted more than once due to the windowing. Even if the windows don't overlap (which they should), YOLO might detect the left half of a person in one window and the right half in the next.
Once you've done all that, a top-end GPU can handle about 4 FPS (assuming 1080p input).
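For anyone who hasn't done the windowing before, a rough sketch of the bookkeeping follows. The detector call itself is left out. The 50-pixel overlap, the `(x1, y1, x2, y2, score)` box format, the function names and the 0.5 IoU cutoff are assumptions, not anything from the setup described above. The de-duplication here is plain non-maximum suppression, so it only removes repeats of the same person seen in overlapping windows; it does not merge a left half and a right half detected in separate windows.

```python
def iou(a, b):
    """Intersection-over-union; uses the first four fields (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def tiles(width, height, size=250, overlap=50):
    """Yield (x, y) top-left corners of size x size windows covering the image."""
    step = size - overlap
    xs = list(range(0, max(width - size, 0) + 1, step))
    ys = list(range(0, max(height - size, 0) + 1, step))
    # Add a final window flush with the right and bottom edges if needed.
    if xs[-1] + size < width:
        xs.append(width - size)
    if ys[-1] + size < height:
        ys.append(height - size)
    for y in ys:
        for x in xs:
            yield x, y


def to_global(box, tile_x, tile_y):
    """Shift a window-local box (x1, y1, x2, y2, score) into image coordinates."""
    x1, y1, x2, y2, score = box
    return (x1 + tile_x, y1 + tile_y, x2 + tile_x, y2 + tile_y, score)


def nms(boxes, max_iou=0.5):
    """Keep the highest-scoring box out of each group of heavy overlaps."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < max_iou for k in kept):
            kept.append(box)
    return kept


# The same person seen in two overlapping windows, plus one other person.
detections = [
    to_global((210, 50, 240, 120, 0.9), 0, 0),
    to_global((10, 50, 40, 120, 0.8), 200, 0),
    to_global((5, 5, 30, 90, 0.7), 1000, 400),
]
print(len(nms(detections)))  # prints 2
```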
Then the problem is that YOLO will occasionally miss blindingly obvious objects. That combined with the fact that you've only got 4 FPS means that detecting the direction of a person is hard - they tend to move across the camera's field of view before you've got enough data to be confident what just happened. A person walking from left to right looks the same as someone walking off the left of the frame and then someone different walking onto the right of the next frame.
At some point it's easier, cheaper and more accurate to install an IR laser beam and count the breaks. You'll save about half a kilowatt too.
Another interesting point is that YOLO is pre-trained on hundreds of object classes. This feels like a waste. I wanted to retrain it with all but the people class removed from the training set. My learned colleague suggested that was a stupid idea because YOLO learns general info about how to separate objects from backgrounds from all the object classes. Not showing it surfboards makes it worse at detecting people. Crazy.
YOLO's a bad way to do it. I tried a bunch of facial recognition libraries and had poor results overall at larger frame sizes and frame rates. I also don't have a CUDA/OpenCL card for my laptops. So it's CPU for me.... Alas.
OpenCV's FaceRecognizer class is one of the fastest I found, and it's what I used in my program.
First, it runs an LBP cascade that finds "any face", including things that look like walls. Thankfully it errs toward false positives and almost never gives false negatives. Then I take each region of interest and run a Haar cascade for eyes on it. If there's at least one eye in the region of interest, I pass it to the classifier.
From there, the classifier checks whether the image region matches a face it already knows. If it's not there, it adds it. If it is, it adds this as another sample to further confirm the face.
Sure do. Doing that increases the quality of the classifier for that face-hash. That also helps if they show up a bit later with slightly different lighting.
I also implemented a "no more than 50 samples per matched face" to keep the size of the face-hash-db down.
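The enroll-or-add-a-sample logic with the 50-sample cap can be sketched roughly as below. This is not the code from the program described above: it swaps OpenCV's recognizer for a plain nearest-neighbour lookup on made-up feature vectors so that it runs with only the standard library. The `FaceDB` class, the `threshold` value and the choice to drop new samples once the cap is reached (rather than evict old ones) are all assumptions.

```python
import math

MAX_SAMPLES = 50  # the "no more than 50 samples per matched face" cap


class FaceDB:
    """Toy face store: face id -> list of feature vectors (hypothetical)."""

    def __init__(self, threshold=0.6, max_samples=MAX_SAMPLES):
        self.threshold = threshold      # max distance to count as a match
        self.max_samples = max_samples
        self.samples = {}
        self.next_id = 0

    def _distance_to(self, face_id, vec):
        # Distance from vec to the closest stored sample of this face.
        return min(math.dist(vec, s) for s in self.samples[face_id])

    def observe(self, vec):
        """Return the id for this face, enrolling it if it is unknown."""
        best = min(self.samples,
                   key=lambda fid: self._distance_to(fid, vec),
                   default=None)
        if best is None or self._distance_to(best, vec) > self.threshold:
            # Not in the database yet: enroll as a new face.
            best = self.next_id
            self.next_id += 1
            self.samples[best] = []
        if len(self.samples[best]) < self.max_samples:
            # Known face (or just enrolled): keep this as another sample.
            self.samples[best].append(vec)
        return best


db = FaceDB()
first = db.observe((0.0, 0.0))
again = db.observe((0.1, 0.0))   # close to the first face: same id
other = db.observe((5.0, 5.0))   # far away: enrolled as a new face
print(first, again, other)       # prints 0 0 1
```

The number of distinct ids in `db.samples` is then the count of unique faces seen, which is what distinguishes this approach from counting detections per frame.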
And my old code's currently on GitLab: gitlab.com/crankylinuxuser . It's pretty crappy as it was a weekend hack. I need to separate the engine from the GUI, and make the GUI web accessible. There are a few more pieces to do that, but I was looking at selling it for various purposes.
Really cool. I actually wanted to train and use YOLO on my data from Python for a research project I am doing, but I didn't know how to run it from Python and I only know a very limited amount of C. So far I am using RCNN from the TensorFlow Object Detection API. Would you mind if I copy your C to Python code and maybe shoot you an email if I need help using your code?
This is clever! I admire the hacker spirit in this... seems pretty interesting, and anonymized person data / trends would definitely help restaurants / other businesses who need to staff part time folks to fill out shifts.
Thanks for the response! It might be a bit of a stretch, but I remember reading a comment on here about someone who used satellite imagery to count the number of cars in store parking lots and used that data to inform investment decisions. Perhaps something like this could be applied to that use case.
There are several startups working on this, like Prism Skylabs: https://prism.com/ . I had this idea as well and pitched it to small business owners, but wasn't able to drive enough interest to want to pursue the idea in earnest.
Just a guess, but you were probably trying to sell them their own data (total people in/out, peak busy times, largest/smallest crowd in location, etc.). That's not what business owners want to know the most. In my experience with retail-type stores, the owners and managers are more concerned with how their competitors are doing in comparison. If you were to sign up many related customers into a network and share quasi-anonymous data with store owners like, "a similar sporting goods store to yours located ~0.5 miles away just set a new weekly attendance record in the sporting goods category..." or, "a sports bar located within ~0.4 miles of your location is now at 80% capacity...", that is information owners will pay for. Rank the store against their competitors to show them what percentage of customers they are getting out of the total segment. The only difficulty is signing up enough related stores to provide meaningful data.
Just an FYI, Panasonic is offering a commercial product that offers analytics derived from 360 degree camera images. I saw a demonstration of this system a couple of years ago at the National Restaurant Show.
This is cool technology and all, but I wonder if the people working on it ever stopped to think about the moral and ethical implications of what they were doing.
Although I was initially creeped out by Insecam, I was fascinated with the idea that I could peer into so many different corners of the world just by clicking on a couple of links.
If it sounds creepy, it probably is. Go with your gut on this kind of thing. Then again, maybe it's better to have the tech out in the open.
Pretty sure that this guy, an undergraduate student in Virginia, has nothing to do with Brigham Young University projects from the 1970s-80s. So, byu.io != byu.edu
- c4: http://git.io/c4
- info: https://github.com/turbo/c4
- Samaritan reference: http://personofinterest.wikia.com/wiki/Samaritan