
Show HN: Counting Foot Traffic Over IP Webcams - brianyu8
https://www.byu.io/2017/12/07/counting-people-with-ml
======
minxomat
BTW, most of the insecam cams (so no need to scrape them), plus some more are
available in my c4 list. Though it might be a bit out of date now.
Incidentally, I had the same idea but never followed through.

Now we just need someone to design a Samaritan-esque UI for it ;-).

\- c4: [http://git.io/c4](http://git.io/c4)

\- info: [https://github.com/turbo/c4](https://github.com/turbo/c4)

\- Samaritan reference:
[http://personofinterest.wikia.com/wiki/Samaritan](http://personofinterest.wikia.com/wiki/Samaritan)

------
rahimnathwani
The HN title 'Counting Foot Traffic Over IP Webcams' caught my eye because,
whilst counting people in a static image is doable with YOLO, counting foot
traffic (i.e. how many times people enter a scene, or how many times people
leave a scene) requires more calculation, as you need to know whether someone
in frame t is the same person who was in frame t-1.

But when I clicked through, the title of the article was 'Counting People with
Machine Learning', and it's just counting presence, not foot traffic :(
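
To make the "same person in frame t and frame t-1" point concrete, here is a minimal sketch of greedy nearest-centroid matching between consecutive frames. It is not what the article does; the class name `CentroidTracker`, the `max_dist` threshold, and the assumption that a detector already hands you one (x, y) centroid per person are all hypothetical. A missed detection is counted as a person leaving and re-entering, so the counts are only as good as the detector.

```python
from math import hypot


class CentroidTracker:
    """Greedy nearest-centroid matcher (illustrative sketch, not the article's code)."""

    def __init__(self, max_dist=50.0):
        self.max_dist = max_dist  # assumed: max pixels a person moves between frames
        self.tracks = {}          # track_id -> (x, y) centroid from the previous frame
        self.next_id = 0
        self.entered = 0          # tracks created so far
        self.left = 0             # tracks dropped so far

    def update(self, centroids):
        # Every (previous track, new detection) pair, closest first.
        pairs = sorted(
            (hypot(cx - tx, cy - ty), tid, i)
            for tid, (tx, ty) in self.tracks.items()
            for i, (cx, cy) in enumerate(centroids)
        )
        used_tracks, used_dets = set(), set()
        new_tracks = {}
        for dist, tid, i in pairs:
            if dist > self.max_dist:
                break  # remaining pairs are even farther apart
            if tid in used_tracks or i in used_dets:
                continue
            used_tracks.add(tid)
            used_dets.add(i)
            new_tracks[tid] = centroids[i]
        # Detections with no nearby previous track count as new arrivals.
        for i, c in enumerate(centroids):
            if i not in used_dets:
                new_tracks[self.next_id] = c
                self.next_id += 1
                self.entered += 1
        # Previous tracks with no nearby detection count as departures.
        self.left += len(self.tracks) - len(used_tracks)
        self.tracks = new_tracks
        return new_tracks


tracker = CentroidTracker(max_dist=50)
tracker.update([(10, 10)])               # one person appears
tracker.update([(20, 10), (200, 200)])   # same person moves, a second appears
tracker.update([(210, 200)])             # first person is gone
print(tracker.entered, tracker.left)
```

Greedy matching is the simplest choice here; a real tracker would likely use an optimal assignment and tolerate a few missed frames before dropping a track.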

~~~
PeachPlum
I understand and agree with your nuance.

If I had a store, though, I would count someone walking past one way and then
the other as two potential visits.

I would imagine that "people in room at time t" and "people in room at time
t+1" is quite a good proxy for "number of people present all day"; it is
certainly an upper bound.

~~~
abainbridge
It's not as simple as that. The problem is that YOLO can process ~100 FPS,
where a frame is something like 250x250 pixels. If you want to process high
resolution, you have to sweep a 250x250 "window" across the high res image and
run YOLO on each. Then you have to figure out which objects were multiply
counted due to the windowing. Even if the windows don't overlap (which they
should) YOLO might detect the left half of a person in one window and the
right half in the next.
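
The windowing and de-duplication steps could be sketched roughly as below. This is an illustration only: the 50-pixel overlap, the 0.5 IoU threshold, and the function names are assumptions, and the per-window detector is left out entirely (its output is assumed to be `(x1, y1, x2, y2, score)` boxes in window coordinates). Greedy non-maximum suppression removes boxes seen twice in the overlap region; it does not by itself fix a person split into two half-detections, since those barely overlap.

```python
def tile_origins(width, height, tile=250, overlap=50):
    """Top-left corners of overlapping tile x tile windows covering the image."""
    step = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    # Add a final window flush with the right/bottom edge if needed.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y) for y in ys for x in xs]


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0


def merge_detections(per_tile, iou_thresh=0.5):
    """per_tile: list of ((ox, oy), [(x1, y1, x2, y2, score), ...]).

    Shifts every box into full-image coordinates, then keeps the
    highest-scoring box out of each group of heavily overlapping boxes.
    """
    boxes = []
    for (ox, oy), dets in per_tile:
        for x1, y1, x2, y2, score in dets:
            boxes.append((x1 + ox, y1 + oy, x2 + ox, y2 + oy, score))
    boxes.sort(key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(iou(b[:4], k[:4]) < iou_thresh for k in kept):
            kept.append(b)
    return kept


windows = tile_origins(1920, 1080)
print(len(windows), "windows per 1080p frame")
```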

Once you've done all that, a top end GPU can handle about 4 FPS (assuming
1080P input).

Then the problem is that YOLO will occasionally miss blindingly obvious
objects. That combined with the fact that you've only got 4 FPS means that
detecting the direction of a person is hard - they tend to move across the
camera's field of view before you've got enough data to be confident what just
happened. A person walking from left to right looks the same as someone
walking off the left of the frame and then someone different walking onto the
right of the next frame.

At some point it's easier, cheaper and more accurate to install an IR laser
beam and count the breaks. You'll save about half a kilowatt too.

Another interesting point is that YOLO is pre-trained on hundreds of object
classes. This feels like a waste. I wanted to retrain it with all but the
people class removed from the training set. My learned colleague suggested
that was a stupid idea because YOLO learns general info about how to separate
objects from backgrounds from all the object classes. Not showing it
surfboards makes it worse at detecting people. Crazy.

~~~
crankylinuxuser
YOLO's a bad way to do it. I tried a bunch of facial recog libraries, had
overall bad success for larger frame sizes and framerates. I also don't have a
CUDA/OpenCL card for my laptops. So it's CPU for me.... Alas.

OpenCV's FaceRecognizer class is one of the fastest I found. And it's what I
used in my program.

Primarily, it does an LBP cascade finding "any face", including ones that look
like walls. Thankfully it has false positives, but almost never false
negatives. Then, I use each region of interest's area and do a Haar cascade
for eyes. If there's at least 1 eye in the region of interest, I pass it to the
classifier.

From there, the image zone is run through the classifier. If the face isn't
there, it adds it. If it is, then it adds this as another sample to
further prove the face.

I can get 15 FPS@1280x720 on a Thinkpad T61
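
The two-stage gating described above (a permissive face pass, then an eye check on each candidate region) might look something like this sketch. It is not the commenter's code: `gate_faces`, `detect_faces`, and `detect_eyes` are hypothetical names, and the detectors are passed in as plain callables so the logic can be read without OpenCV. In an OpenCV setup they would presumably wrap `cv2.CascadeClassifier.detectMultiScale` for the LBP and Haar cascades.

```python
def gate_faces(frame, detect_faces, detect_eyes, min_eyes=1):
    """Keep only face candidates that contain at least `min_eyes` eyes.

    frame:        image as a list of pixel rows (assumption for this sketch)
    detect_faces: callable(frame) -> list of (x, y, w, h); may over-detect
    detect_eyes:  callable(region) -> list of (x, y, w, h) within the region
    """
    confirmed = []
    for (x, y, w, h) in detect_faces(frame):
        # Crop the candidate region so the eye detector only sees that area.
        region = [row[x:x + w] for row in frame[y:y + h]]
        if len(detect_eyes(region)) >= min_eyes:
            confirmed.append((x, y, w, h))
    return confirmed


# Stub detectors standing in for the real cascades: a pixel value of 1 is an "eye".
def fake_faces(frame):
    return [(0, 0, 5, 5), (5, 5, 5, 5)]


def fake_eyes(region):
    return [(cx, cy, 1, 1)
            for cy, row in enumerate(region)
            for cx, value in enumerate(row) if value == 1]


frame = [[0] * 10 for _ in range(10)]
frame[2][3] = 1  # one "eye" inside the first candidate only
print(gate_faces(frame, fake_faces, fake_eyes))
```

Only regions that pass the eye check would then be handed to the recognizer, which is what keeps the wall-shaped false positives out of it.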

~~~
rahimnathwani
If the same face is in two consecutive frames, do you pass it into the
classifier twice?

~~~
crankylinuxuser
Sure do. Doing that increases the quality of the classifier for that
face-hash. That also helps if they show up a bit later with slightly different
lighting.

I also implemented a "no more than 50 samples per matched face" to keep the
size of the face-hash-db down.
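
One way such a per-face sample cap could be kept is sketched below. The comment does not say whether the oldest samples are dropped or new ones are refused once the limit is hit; this sketch assumes the oldest are dropped, and `FaceGallery` is a hypothetical name, not something from the linked code.

```python
from collections import defaultdict, deque


class FaceGallery:
    """Stores at most `max_samples` samples per face ID (illustrative sketch)."""

    def __init__(self, max_samples=50):
        # deque(maxlen=...) silently discards the oldest item when full.
        self.samples = defaultdict(lambda: deque(maxlen=max_samples))

    def add(self, face_id, sample):
        """Add a sample for a known or new face; returns the stored sample count."""
        self.samples[face_id].append(sample)
        return len(self.samples[face_id])


gallery = FaceGallery(max_samples=50)
for frame_no in range(120):
    count = gallery.add("face-a", frame_no)  # same face seen in 120 frames
print(count)
```

Dropping the oldest samples favors recent lighting conditions; refusing new samples instead would freeze the model on the first 50 sightings.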

[https://hackaday.com/2015/03/04/face-recognition-for-your-
ne...](https://hackaday.com/2015/03/04/face-recognition-for-your-next-con/)

and my old code's currently on GitLab at gitlab.com/crankylinuxuser. It's
pretty crappy as it was a weekend hack. I need to separate the engine from the
GUI, and make the GUI web accessible. There's a few more pieces to do that,
but I was looking at selling it for various purposes.

~~~
rahimnathwani
Awesome. Thanks for the clarification re the 50 sample limit. And thanks for
sharing the code.

------
arif_sohaib
Really cool. I actually wanted to train and use YOLO on my data from Python
for a research project I am doing but didn't know how to run it from Python,
and I only know a very limited amount of C. So far I am using RCNN from the
TensorFlow Object Detection API. Would you mind if I copy your C to Python
code and maybe shoot you an email if I need help using your code?

~~~
hertzdog
Maybe this can help :)
[https://github.com/thtrieu/darkflow/](https://github.com/thtrieu/darkflow/)

------
randall
This is clever! I admire the hacker spirit in this... seems pretty
interesting, and anonymized person data / trends would definitely help
restaurants / other businesses who need to staff part time folks to fill out
shifts.

~~~
brianyu8
Thanks for the response! It might be a bit of a stretch, but I remember
reading a comment on here about someone who used satellite imagery to count
the number of cars in store parking lots and used that data to inform
investment decisions. Perhaps something like this could be applied to that use
case.

~~~
greglindahl
There are numerous NewSpace companies with that as their business plan.

------
joezydeco
Just an FYI, Panasonic is offering a commercial product that offers analytics
derived from 360 degree camera images. I saw a demonstration of this system a
couple of years ago at the National Restaurant Show.

[http://www.security.us.panasonic.com/feature/ultra-360-camer...](http://www.security.us.panasonic.com/feature/ultra-360-cameras)

------
094459
Great write up and really loved the approach you took and the learnings you
left with. Keep it up!

------
jorjordandan
DIY NSA

------
nerdponx
This is cool technology and all, but I wonder if the people working on it ever
stopped to think about the moral and ethical implications of what they were
doing.

 _Although I was initially creeped out by Insecam, I was fascinated with the
idea that I could peer into so many different corners of the world just by
clicking on a couple of links._

If it sounds creepy, it probably is. Go with your gut on this kind of thing.
Then again, maybe it's better to have the tech out in the open.

------
crvd
Thanks BYU! The surveillance state thanks you!

It's very patriotic, just like all that top secret work you used to do with
the Eyring Research Institute!

~~~
jalessio
Pretty sure that this guy, an undergraduate student in Virginia, has nothing
to do with Brigham Young University projects from the 1970s-80s. So, byu.io
!= byu.edu

~~~
brianyu8
Haha very true. Although I did receive an offer to buy my domain name from a
Brigham Young University IT guy. True story! :)

~~~
crvd
Ok, but I still feel good calling it BYU so I'm gonna leave that.

~~~
crvd
Calling out _

