
Show HN: Gesture controlled Smart Mirror - shinao
https://github.com/Shinao/SmartMirror
======
amendez
How did you implement the hand detection? Since you're using the HSV color
space, are you just doing color segmentation for a guessed range of Hue,
Saturation, and Value for the color of skin? I have a few ideas on how to
really improve the detection, since I've worked on skin segmentation before.

~~~
shinao
Oh nice, yes that's exactly what I'm doing: I take the middle of my hand,
sample the HSV value there, and widen its range to get the whole hand. I'm
listening :)
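
In code it's basically this - the sample point and margins below are
illustrative, not my exact values:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
_, frame = cap.read()
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Sample the HSV value at (roughly) the middle of the hand and widen it
# into a range broad enough to cover the whole hand.
cx, cy = 320, 240                      # assumed hand center
h, s, v = hsv[cy, cx].astype(int)
lower = np.array([max(h - 10, 0), max(s - 60, 0), max(v - 60, 0)],
                 dtype=np.uint8)
upper = np.array([min(h + 10, 179), min(s + 60, 255), min(v + 60, 255)],
                 dtype=np.uint8)

mask = cv2.inRange(hsv, lower, upper)  # binary hand segmentation
```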

~~~
amendez
Nice! You can implement a real-time skin segmentation algorithm by using your
HSV-segmented skin as training data and a random forest for classification;
it classifies in real time with OpenCV!

Here is a video of my result from only training with one frame of a segmented
HSV hand to some random videos:
[https://vimeo.com/114489035](https://vimeo.com/114489035)
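
Roughly, the training step looks like this; I'm sketching it with
scikit-learn's RandomForestClassifier for brevity (OpenCV's cv2.ml.RTrees
would work too), and the threshold values and filename are made up:

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Label one frame with the HSV threshold you already have, then fit a
# random forest on the raw HSV value of every pixel.
frame = cv2.imread("hand_frame.png")            # hypothetical training frame
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
lower = np.array([0, 40, 60], dtype=np.uint8)   # guessed skin range
upper = np.array([25, 255, 255], dtype=np.uint8)
skin = cv2.inRange(hsv, lower, upper) // 255    # 1 = skin, 0 = background

clf = RandomForestClassifier(n_estimators=10, max_depth=12, n_jobs=-1)
clf.fit(hsv.reshape(-1, 3), skin.reshape(-1))

def segment(bgr):
    # Per-pixel classification of a new frame; shallow trees keep it
    # fast enough for real time at modest resolutions.
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    pred = clf.predict(hsv.reshape(-1, 3))
    return (pred.reshape(bgr.shape[:2]) * 255).astype(np.uint8)
```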

Also, you can use a Kinect or an Occipital Structure Sensor to do
segmentation from a certain depth: more hardware, but less vision
computation.

~~~
amendez
And using a depth sensor may be more robust for hand tracking and gesture
recognition.

~~~
shinao
Thanks for the info, learnt something. What kind of project did you do for
skin segmentation?

~~~
mendeza
I was working on skin segmentation for a research project, to segment skin in
real time for Google Glass! I never did the training with a huge database of
ground truths; I eventually want to do that training properly to build a
robust skin segmentation algorithm. I can link the GitHub page (OpenCV
implementation only), but my code is really messy, as I built it as a POC in
two days.

The Google Glass app was to use skin segmentation for gesture recognition.

~~~
shinao
Oh nice! Maybe it could be used for translating sign language in real time;
that would be amazing. Kinda hard to get the depth info, though.

Thanks for the answers, and if you ever finish the algorithm you know where
to post it :)

~~~
mendeza
I'll let you know. We should collaborate and make the mirror interactive like
Jarvis! :D Message me at interactivetech1@gmail.com

------
shinao
I posted on Reddit yesterday but didn't get much feedback; maybe I'll have
better luck here. How would you guys have done it? Do you have any ideas for
new widgets I could integrate?

~~~
nkg
I acknowledge the technological feat, but could you tell us about your
motivation? What is the place of a smart mirror?

~~~
shinao
Obviously having a cool new gadget was my big motivation, but as for useful
things: knowing the outside temperature, the time obviously (I removed my
clock), playing a game if I'm waiting for something for a few minutes, etc.
But basically you're right: if I have my phone, it takes about the same
amount of time to do these things on it.

~~~
nkg
I think there is something to be done around the concept of a smart mirror,
because we have been introduced to many prototypes in the last 5 years. It's
like people want to have one, people want to build one, but no one is sure
what it is for. Maybe it should be conceptualized more like a
decoration/fashion item...

------
inertial
Thanks for sharing your cool hack. I remember two similar projects a while
back that caught HN's attention:

[https://news.ycombinator.com/item?id=10204018](https://news.ycombinator.com/item?id=10204018)

[https://news.ycombinator.com/item?id=10801430](https://news.ycombinator.com/item?id=10801430)

P.S. I'd be tempted to call it Jarvis. It's fun to swipe away icons in the
morning :)

~~~
shinao
I got my inspiration from these kinds of projects; I saw a Jarvis one a few
weeks ago (voice activated), but I wanted to make mine with a Minority Report
style in mind :)

~~~
internaut
Well done on your project.

Any news on Google's Soli chip?

~~~
shinao
Thanks! Didn't even know that was in development. It's Google though; if the
wait is anything like the one for Project Ara, it's gonna take a while.

------
Feneric
Just wondering why you rolled your own gesture recognition system rather than
using something off-the-shelf like the Leap Motion.

Also, your comments about the Python environment are maybe a little unfair...
You have an unusual (and some might argue overly-complex) use case where
you're combining it not just with the third-party OpenCV package and wrapper,
but also Tornado _and_ Node.js. It would have probably been easier going with
either the Python approach or the JavaScript approach rather than combining
them.

All in all, pretty neat though.

~~~
shinao
I considered the Leap Motion but didn't know if it would work behind my
mirror and in a vertical orientation (maybe if you have one you can tell me).
Also, I always like to learn new stuff, and OpenCV seemed like something I
could use again in the future.

I tried running everything on Node, but the OpenCV wrapper isn't there yet.
You're completely right about Python, though; while learning it, I wasn't
interested in carrying it through to the front end as well.
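
For what it's worth, the Tornado glue is pretty thin - roughly this shape
(endpoint name and port are made up, not my exact code):

```python
import json
import tornado.ioloop
import tornado.web
import tornado.websocket

clients = set()

class GestureSocket(tornado.websocket.WebSocketHandler):
    # The JS front end connects to ws://localhost:8888/gestures.
    def open(self):
        clients.add(self)

    def on_close(self):
        clients.discard(self)

def broadcast(gesture):
    # Called (via the IOLoop) whenever the OpenCV loop recognizes a gesture.
    for client in clients:
        client.write_message(json.dumps({"gesture": gesture}))

app = tornado.web.Application([(r"/gestures", GestureSocket)])
app.listen(8888)
tornado.ioloop.IOLoop.current().start()
```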

Thanks!

~~~
hlfshell
We've tried that here - one-way mirrors, straight glass, plexiglass,
projection film - Leap Motion will NOT work through those materials.

It CAN work vertically, with caveats.

------
johnmurch
I am shocked at how many different DIY versions of a smart mirror have been
posted here on HN. I was hoping someone would Kickstart something like this,
but [https://www.kickstarter.com/projects/338193274/mirro-the-
wor...](https://www.kickstarter.com/projects/338193274/mirro-the-worlds-most-
personal-device) didn't really get off the ground. Maybe you can Kickstart
yours and I can be the first to buy :)

~~~
shinao
It's probably waiting on the mass production of e-ink displays, or screens as
thin as a tablet, to become more attractive and/or less power-hungry. I'm not
attracted by the business side of things, but you go for it! :)

------
grogenaut
Just curious, but what do people see as the major use of smart mirrors? I'm
personally only in front of a mirror for a few minutes a day, but I know
others spend a lot more. Also, depending on the mirror, re-installation would
be quite a pita, so these would likely age out as badly as in-dash systems
do, especially with early rapid iteration.

------
tlrobinson
This looks awesome. Latency in touch/gesture systems is brutal though.

------
kctess5
Hi there. I would like to turn your attention to this excellent paper on hand
gesture recognition [1]. I ran across it in my foray into hand tracking and
gesture recognition with the Kinect. I implemented the Fourier descriptors
described in that paper (there's even source code in the paper!) and agree
that they are quite effective.

Since you already have a decent-looking hand segmentation method, you could
simply trace the outline of the segmentation (OpenCV will give you contours
if you ask politely) and generate descriptor vectors from that. Fourier
descriptors are rotationally invariant, and you can get scale invariance
easily by scaling your hand images to a constant size after segmentation.
You can use the descriptors with a variety of ML algorithms, but k-nearest
neighbors is probably the easiest (and it's what I implemented); an SVM is
probably also a good method. Using ML has the absolutely gigantic upside
that you don't have to write code to recognize individual gestures, and it
can learn many gestures (I did 5+ no problem).
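
To make that concrete, here is roughly the shape of it in Python/OpenCV - a
sketch from memory rather than my actual code, and it assumes the OpenCV 4
findContours signature:

```python
import cv2
import numpy as np

def fourier_descriptor(mask, n_coeffs=16):
    # Trace the outline of a binary hand segmentation and build a
    # translation/scale/rotation-invariant descriptor from its FFT.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    boundary = max(contours, key=cv2.contourArea).squeeze()
    z = boundary[:, 0] + 1j * boundary[:, 1]   # boundary as complex points
    coeffs = np.fft.fft(z)[1:n_coeffs + 1]     # drop DC term (translation)
    return np.abs(coeffs) / np.abs(coeffs[0])  # magnitudes (rotation),
                                               # normalized (scale)

def classify(mask, train_masks, train_labels, k=3):
    # k-nearest neighbors over a small library of labeled gesture masks.
    query = fourier_descriptor(mask)
    feats = np.array([fourier_descriptor(m) for m in train_masks])
    dists = np.linalg.norm(feats - query, axis=1)
    nearest = np.asarray(train_labels)[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]
```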

I am actually working on open sourcing my code (didn't release it immediately
because messy code + busy life during the semester), and porting it to Python
so that I can use Numpy and Scipy. The original version I wrote in C++, which
includes hand tracking (via OpenNI) + static gesture recognition on the Kinect
360 sensor, as well as a CLI for interfacing with the code and building a
gesture library. To start tracking, you have to wave vigorously at the camera.
In the Python version for the Kinect One sensor, I'm cleaning things up and
also implementing my own hand detection and tracking algorithm (based on an
unscented Kalman filter) so that I can kill the dependency on OpenNI, which
will help calm the vigor of the wave gesture required to initialize OpenNI
tracking. I expect good results when the Python version is finished, due to
the high quality depth based hand segmentations I am seeing with the new
Kinect. I might be able to hook you up with some source code if you don't mind
seeing the rough draft and having a tricky install process.
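
If you're curious about the UKF part, a toy version of the tracking step with
the filterpy library looks something like this (the motion model and noise
values are invented for illustration):

```python
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

dt = 1 / 30.0  # depth stream frame period

def fx(x, dt):
    # Constant-velocity motion model; state is [px, py, vx, vy].
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]], dtype=float)
    return F @ x

def hx(x):
    # We only observe the segmented hand's centroid.
    return x[:2]

points = MerweScaledSigmaPoints(n=4, alpha=0.1, beta=2.0, kappa=0.0)
ukf = UnscentedKalmanFilter(dim_x=4, dim_z=2, dt=dt,
                            fx=fx, hx=hx, points=points)
ukf.P *= 50.0            # initial uncertainty
ukf.R = np.eye(2) * 5.0  # measurement noise (pixels)
ukf.Q = np.eye(4) * 0.1  # process noise

# Each frame: predict, then update with the detected hand centroid.
for cx, cy in [(100.0, 120.0), (103.0, 118.0), (107.0, 115.0)]:
    ukf.predict()
    ukf.update(np.array([cx, cy]))
    print("filtered position:", ukf.x[:2])
```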

While the older Kinect works with the Fourier method, it is not as good as
the Kinect One sensor, because it uses a depth reconstruction algorithm that
results in jagged edges and merged fingers. The new sensor uses time of
flight, which gives very nice accuracy and normal-looking edges. Granted, you
could get around this problem via skin segmentation on the RGB or IR images
that both Kinects also provide.

A camera-only system like yours can work well with enough effort, but with a
depth image you will almost certainly get better segmentation and detection
results with simpler code. Then you can focus on higher-level problems like
static and dynamic gesture recognition, and design a better UX around those
things. This does come at the price of a more expensive and larger sensor,
though there are a few depth camera options available these days.

Hope this helps! Happy to discuss more if you'd like.

[1]
[http://www.bu.edu/vip/files/pubs/reports/CCSSM13-04buece.pdf](http://www.bu.edu/vip/files/pubs/reports/CCSSM13-04buece.pdf)

Edit: re your comment on Python, I have gone the C++ route (as well as just
about every other mainstream language) and can honestly say that Python is a
much easier development environment than what you will find in C++. Numpy and
Scipy are where it's at. Even after completing two major software projects in
C++, I have almost an order of magnitude greater productivity in Python,
largely because of these libraries and the incredibly readable syntax. They
are also blazing fast - you will be hard-pressed to write faster C++ than the
C code that powers Numpy. In my experience, the algorithms you use and your
system architecture will be a much larger factor than your language of
choice. I choose what gives me the highest development efficiency so I can
focus on nailing the architecture, the algorithms, and the necessary tooling.
If you have never worked with C++, I assure you, the darkness that is C++
compilation and linking hell is not for the faint of heart.
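
As a quick illustration of the Numpy point (numbers will obviously vary by
machine):

```python
import time
import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Sum of squared differences: pure-Python loop vs. vectorized Numpy.
t0 = time.perf_counter()
total = 0.0
for i in range(len(a)):
    total += (a[i] - b[i]) ** 2
t_loop = time.perf_counter() - t0

t0 = time.perf_counter()
total_vec = float(np.sum((a - b) ** 2))
t_vec = time.perf_counter() - t0

print(f"loop: {t_loop:.3f}s, numpy: {t_vec:.4f}s")  # typically ~100x faster
```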

~~~
shinao
Amazing resource, thank you! I should have searched deeper before starting my
project; I haven't finished reading yet, but there are clearly superior
methods to what I currently have.

My grudge against Python was probably linked to the libraries that are
incompatible with version 3. In retrospect, I should have used 2.7 and
accepted missing some OpenCV features, but I wanted to learn Python at the
same time and it seemed a waste of time to start with an older version. I've
had a lot more experience in C++ and I don't mind a few dependency problems,
but that's probably just because I'm used to them by now.

I would love to see a demo of what you've done (we never think to record our
projects, but hey, you never know); if not, a link to the code would be
interesting whenever you aren't busy :)

