Show HN: A tiny JS library for real-time localization of eye pupils (tehnokv.com)
153 points by tehnokv on July 2, 2019 | 61 comments



For an explanation, see the author's post:

Using decision trees for real-time localization of eye pupils

https://tehnokv.com/posts/puploc-with-trees/

It's actually quite fascinating. I'm too paranoid to enable my webcam for any random site, so I'll be trying it out locally.

For face detection and determining the locations of the eyes, the demo uses picojs, described in another post:

pico.js: a face-detection library in 200 lines of JavaScript

https://tehnokv.com/posts/picojs-intro/


I'm all for being paranoid, but what exactly is the threat model here? Someone is going to have a few seconds of video of you?


Spearphishing family members using a pic and/or some audio of you to generate a fake "help me" video? (I mean, that probably won't happen; just trying to think of something an attacker could do with it.)


Imagine being paranoid to the point of actually entertaining a hypothetical like that, so much that it affects your browsing habits.


There is actually already a pretty prevalent (lower-tech) version of this scam: grandparents are called by a con artist who pretends to be a (previously unidentified) grandchild in distress in a foreign country. [1] My grandmother lost a few thousand dollars to it because she thought they were me.

1 - https://www.ag.state.mn.us/Consumer/Publications/GrandParent...


You're using "paranoid" as if it is a bad thing.


In the age of deep fakes, I don't think it's paranoia at all. Also, having secure browsing habits is healthy for when someone is trying to harm you.


I’m much more concerned that they might be collecting data and video that is not well secured, and then someone else could come along and use that data for nefarious purposes.

The collector of the data doesn't need to be nefarious; they only need to be sufficiently careless.


Anything like paranoia would be a (clinical) issue only so long as it cripples your life, say if you refuse to browse the Internet at all despite this having a detrimental effect on your well-being, lifestyle, etc. Being cautious with your browsing habits is orders of magnitude away from that.


You could always load the page and then turn off your wifi before enabling the webcam, since it's allegedly happening client side.


Yeah, you've got a point there; I can't think of a good reason why turning on the webcam on a random site would be a privacy/security risk for a random visitor.

I suppose I used the word "paranoid" to imply that it was not entirely logical/reasonable to think so.

Thinking of it, I acquired this attitude of being overly careful once Facebook, face recognition, machine learning, and pervasive privacy invasions became the norm on the web. Who knows, maybe there's a way to combine random webcam snippets with other tracking technology to gather even more data about me.


The next tracking technology that completely fails on me. Tracking mouse movements is already totally helpless when it comes to blind users, but tracking my eye pupils is even more useless. I just hope nobody will ever implement a CAPTCHA based on this. It is already hard enough (and sometimes impossible) to prove that I am human.


I was thinking of using something like this to track my pupils and then switch window focus, i.e., allowing me to switch from terminal to terminal without doing Alt-Tab.

Yes, it's probably the height of laziness to avoid pressing Alt-Tab, but it's interesting to me.

Edit: It's possible that I am mistaken and what I would need for that is gaze tracking rather than pupil tracking.


The interface in VR for Elite: Dangerous works like this: it does gaze tracking of sorts (tracking the headset) and pops up context menus (weapon systems, radio/communication, navigation, etc.). It works really well, IMNHO. OTOH, the interface is similar in regular 2D, but there it feels clunky and gimmicky.


That sounds interesting. I will take a look at it. Honestly, I haven't done any study or reading on this topic, so I have essentially zero idea of how it might work, especially when using a webcam.


This sounds extremely interesting. I guess you could infer the direction you're looking in even with pupil tracking instead of gaze tracking, considering you'll move both eyes to look at a specific spot.
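
As a sketch of that idea: normalize each pupil's position against its eye region and average the two eyes. The `leftEye`/`rightEye` shapes below are made up for illustration, not taken from any particular library.

    // Rough gaze estimate from pupil positions alone. Each eye is assumed
    // to be { pupil: {x, y}, box: {x, y, w, h} }, filled in from whatever
    // detector you use.
    function pupilOffset(eye) {
      // Normalize the pupil position to [-1, 1] within the eye's bounding box.
      const cx = eye.box.x + eye.box.w / 2;
      const cy = eye.box.y + eye.box.h / 2;
      return {
        x: (eye.pupil.x - cx) / (eye.box.w / 2),
        y: (eye.pupil.y - cy) / (eye.box.h / 2),
      };
    }

    function gazeDirection(leftEye, rightEye) {
      // Averaging both eyes suppresses per-eye localization noise.
      const l = pupilOffset(leftEye), r = pupilOffset(rightEye);
      return { x: (l.x + r.x) / 2, y: (l.y + r.y) / 2 }; // negative x ~ looking left
    }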


Can't wait to integrate this with ad serving tech to make sure users are actually watching the ads. Imagine adding little "eye catchers" to ads, and if the pupils don't respond, they are likely looking at a different screen or looking away from their computer which means you can safely pause the ad until you have their attention again! The possibilities are endless - integrate with mobile games and reward users whose pupils follow the eye catchers with more gems and more ads!


Microsoft had a patent application on a very similar idea:

> Television viewing tends to be a passive experience for a viewer[...] To increase interactive viewing and encourage a user to watch one or more particular items of video content, awards and achievements may be tied to those items of video content. [...] Producers, distributors, and advertisers of the video content may set viewing goals and award a viewer who has reached the goals.[...]

> Additionally, the viewing behavior may include an action performable by the viewer and detectable by one or more sensors, such as a depth camera.[...]

> If the viewing goal has been met, an award may be granted to the viewer. An award may be a virtual award, such as an addition to a viewer score or an update to an avatar associated with the viewer.

https://patents.google.com/patent/US20130125161


You say this sarcastically but there will be thousands of people out there thinking exactly this in earnest. Ugh.


There was a feature advertised for Android phones a few years ago (IIRC) that I always felt would be used for this kind of purpose.

Basically, if the phone was playing video and was no longer pointed at your face, it would pause the video. I could be wrong on the specifics here, but that seems close to the feature that was advertised.


Obligatory reference to Black Mirror season 1 episode 2, "Fifteen Million Merits".


Disable the camera. The day that is no longer possible is the day I go offline.


I'm always curious about what it takes to confuse things like this (if for no other reason than to know how to make face-recognition camouflage to hide from the autonomous weapons that will roam our future post-apocalyptic hellscape).

This demo seems to do pretty well -- e.g. it handles covering one eye or arbitrary parts of your face, but has some trouble at oblique angles (like your face in profile or near-profile).

But here's an odd thing I noticed:

- Take a pair of over-the-ear headphones ("cans").

- Rotate them 90 degrees so that one earpiece is on your forehead (or above it) and the other is at the back of your head.

This seems to completely stump the algorithm, even when your eyes are plainly visible and looking straight into the camera. With my headphones turned sideways it never once identified my pupils (or my face) accurately.

Possibly a hat would do something similar (I don't have one handy), but a largish dark circle on your forehead seems to confuse it.

Very neat for a 200 LoC browser-based demo.


This seems to have a lot of trouble when you're wearing glasses (large-style glasses). I was surprised at how bad it seemed for 2019 tech, then took my glasses off and was blown away by how good it is.


It seemed to have a little trouble with light reflecting off my glasses (sometimes it found my eyes but placed my pupils more or less on my eyebrows), but generally it did OK with them.

My glasses are closer to the "invisible frames" side of the spectrum than the Harry Caray / Elvis Costello style, so maybe that matters.

I guess maybe I should just review the code, but I was trying to figure out whether it uses other facial features to "anchor" the eyes or just the (fairly recognizable) features of human eyes themselves. Your glasses experience makes me guess it is pretty eye-dependent; it does have more trouble with one eye covered, for example, whereas if it were anchoring off some combination of ear/nose/mouth/chin, it should be easy enough to ID the eye from half the face, let alone with just an eyepatch-sized cover.
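
For what it's worth, pipelines like this typically run face detection first and then seed the pupil localizer at the usual eye positions inside the face box, so there is some anchoring from the face as a whole even though the refinement itself is driven by eye features. A sketch of that seeding step; the fractions are ballpark values I'm assuming, not necessarily the demo's actual constants:

    // Seed pupil search regions from a face detection (row, col, size).
    // The offsets/scales below are assumed ballpark values.
    function eyeSeeds(face) {
      const { row, col, size } = face;
      return {
        left:  { r: row - 0.075 * size, c: col - 0.175 * size, s: 0.35 * size },
        right: { r: row - 0.075 * size, c: col + 0.175 * size, s: 0.35 * size },
      };
    }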


>This seems to completely stump the algorithm, even when your eyes are plainly visible

That's a false negative. This uses quite simple algorithms, decision trees, which have more false outcomes (both positive and negative) on this pupil-detection task than state-of-the-art models, which would be deep convolutional networks. The decision trees are much less computationally expensive, though: running a modern deep CNN at 30 fps needs a dedicated GPU, while the decision trees will run just fine on a CPU in a browser!
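
For the curious, the trees in question are of the pixel-intensity-comparison kind: each internal node compares two pixels at offsets relative to the current estimate, so evaluating a tree is a handful of memory reads with essentially no arithmetic. A sketch of the idea (field names are illustrative, not the library's actual format):

    // Walk one pixel-comparison tree; the leaf stores a small correction
    // to the current pupil estimate. `image.at(row, col)` is assumed to
    // return a grayscale intensity.
    function evalTree(root, image, row, col, scale) {
      let node = root;
      while (node.left) { // descend until we reach a leaf
        const p1 = image.at(Math.round(row + node.r1 * scale),
                            Math.round(col + node.c1 * scale));
        const p2 = image.at(Math.round(row + node.r2 * scale),
                            Math.round(col + node.c2 * scale));
        node = p1 <= p2 ? node.left : node.right;
      }
      return node.output; // e.g. { dr, dc }: shift for the pupil estimate
    }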


Nifty. I use PoseNet [1] on TensorFlow.js to locate eyes for 3D UIs on a laptop. lploc seems much faster, but requires a more restricted head pose. Maybe I'll try using both.

I've also been looking for usable gaze tracking, to play with combined gaze (fast but low resolution) and head (high resolution but slow) pointing. Perhaps lploc can help with that.

So thank you for your work.

[1] PoseNet webcam demo: https://storage.googleapis.com/tfjs-models/demos/posenet/cam...
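
In case it's useful to anyone, the PoseNet part looks roughly like this (using the real @tensorflow-models/posenet API, though the config shape varies a bit between versions):

    import * as posenet from '@tensorflow-models/posenet';

    // Load the model once, then query eye keypoints per frame.
    const netPromise = posenet.load();

    async function eyePositions(video) {
      const net = await netPromise;
      const pose = await net.estimateSinglePose(video, { flipHorizontal: true });
      // keypoints include 'leftEye' and 'rightEye' parts, each with
      // { position: {x, y}, score }.
      return pose.keypoints.filter(
        k => k.part === 'leftEye' || k.part === 'rightEye'
      );
    }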


The author means real-time localization of pupils. This is not gaze tracking, though.


Gaze tracking on browsers with webcams does exist though!

https://webgazer.cs.brown.edu/


I finally got webgazer to work once, a year or so ago. As I recall, it required non-default settings, shaving my beard, adding lights pointed at my face, hanging a backdrop, and holding eyes and bushy eyebrows just so. And maybe something else I'm failing to remember. The joys of small training sets.

No doubt it "just works" for some people. For others, not so much.


I'm sure you know this, but just in case:

If you add this CSS to your site, you can increase the readability without sacrificing the aesthetic.

body { max-width: 65em; margin: 10px auto; }


It works pretty well, but it has the same issue I see with a lot of these object-tracking algos: it jitters all over even when I'm holding as still as I can. Why are these systems so sensitive and unstable? Could the algorithm be trained to prefer similar outputs for similar inputs, to reduce the flicker at the cost of some accuracy?
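
You can also knock most of the jitter down on the application side without retraining, e.g. with a cheap exponential smoother over the raw detections (a one-euro filter does this better by adapting to movement speed, but this is the minimal version):

    // Exponential moving average over pupil coordinates.
    // Smaller alpha = steadier but laggier.
    function makeSmoother(alpha = 0.25) {
      let prev = null;
      return (x, y) => {
        prev = prev === null
          ? { x, y }
          : { x: alpha * x + (1 - alpha) * prev.x,
              y: alpha * y + (1 - alpha) * prev.y };
        return prev;
      };
    }

    const smoothPupil = makeSmoother();
    // each frame: const { x, y } = smoothPupil(rawX, rawY);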


What am I missing? "Click the button below and allow the page to access your webcam."

...and then?


If it works, you will see a live feed of your webcam with your face encircled and your pupils highlighted as red dots... it's pretty impressive!

I'm guessing your browser is blocking the webcam, or maybe it's not configured correctly to work with your browser. Did you get a popup requesting permission from your browser? Sometimes it's hidden at the edge of the URL bar behind an icon.


Not very good if you change head orientation or rotation, but otherwise pretty cool!


Doesn't work on my mobile Android Chrome. I'll try again at work tomorrow. Looks pretty cool, though.

I would love an interaction method like this... I can already think of some super fun UIs to build around it.


Worked pretty well for me on my low-end 2 year old Samsung!


Works on Chrome, but doesn't appear to work on Firefox. I can see my webcam feed, but the red indicators don't show up.

Edit: Never mind, it requires access to canvas data, which for me is blocked by default.


Works on Firefox 67.0.4 (64-bit) for me.


Sorry if I don't trust some random web page to use my camera just because they promise they're not logging it to their server.


All client-side code is available to read, and all network requests can be viewed in the developer tools of most browsers (including options to go offline, so you don't need to disconnect from the internet altogether but can still see requests being attempted).


What if you open incognito mode, open the page, cut off your network, then close when you're done with the demo?


I misread it and thought you were being sarcastic with "[...], then close your eyes when you're done with the demo".


I overuse sarcasm in reality to the point that people don't believe me when I'm sincere, so no surprises I didn't write this out clearly enough :D

Seriously though, I think if it's all JS-based you should be good using incognito and offline to keep the code from uploading anything.


Not sure why you'd care. You're just going to look at the screen and move your eyeballs around a bit, then close the tab. If they want to waste bandwidth/storage to capture video of me looking around with my eyeballs, that's OK by me.


I didn't do this myself, but couldn't you load the page and then disconnect from wifi before running/testing?


So, we use eye tracking in experiments to look at how people acquire information. For example, you can parameterize a drift-diffusion model using input from gaze. But the equipment is not cheap. I see this as a means of acquiring that sort of data on the cheap, for example on MTurk.


This is the kind of tech that managers might like to install on their employees' computers. If you're smart enough to build interesting things, maybe spend that talent on something less creepy.


Doing this in JavaScript is creepy; however, the technology itself has uses other than nefarious ones, too. For example, in a bandwidth-constrained situation you could decide which piece of a video stream to optimize for higher quality based on the general area you are (or were) looking at: if you look at something long enough, it should stop being fuzzy.
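
A sketch of that idea: tile the frame and request higher quality only near the gaze point. The tile grid and the `requestQuality` hook are hypothetical; a real player would tie this into DASH/HLS track selection.

    const COLS = 4, ROWS = 3;

    // Hypothetical hook into the player's per-tile quality selection.
    function requestQuality(tile, level) { /* wire into your streaming stack */ }

    function tileUnderGaze(gaze, viewport) {
      const col = Math.min(COLS - 1, Math.floor(gaze.x / viewport.width * COLS));
      const row = Math.min(ROWS - 1, Math.floor(gaze.y / viewport.height * ROWS));
      return row * COLS + col;
    }

    function updateTileBitrates(gaze, viewport) {
      const focus = tileUnderGaze(gaze, viewport);
      for (let t = 0; t < COLS * ROWS; t++) {
        requestQuality(t, t === focus ? 'high' : 'low');
      }
    }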


Video streaming is too slow to adapt, but I see a way for advertisers to check whether you watched an ad even if you didn't interact. Especially if the site already uses the webcam.


That was the thought that immediately came to mind when I saw this article: as if unskippable ads weren't bad enough, with this they'll be able to make ads that you can't not watch, and the whole thing just disturbs me greatly. No doubt it'll get spun into an ostensibly "user-friendly" "automatically pauses the video if you look away" "feature"... but the motives are clear enough.



If I close one eye on this, it still finds the center of the closed eye, where the pupil should be.

Presumably you could close both eyes but I can't test that for obvious reasons.

So this specific implementation might not work for that.


That is a pretty cool idea, but could it really be fast enough to dynamically update the video stream? Very cool idea, though.


What about helping disabled people use computers? Or environments that aren't conducive to using your hands like a factory? ...If the only use you can think of for software like this is "creepy," maybe it is not the technology that is creepy.


> If the only use you can think of for software like this is "creepy," maybe it is not the technology that is creepy.

Really?

I couldn't find what I was actually looking for, where Weizenbaum describes how vision or reasoning experiments might be made with benign or even cute objects, but for rather less benign ends. I found this instead, which I think is even better put.

> Other people say, and I think this is a widely used rationalization, that fundamentally the tools we work on are "mere" tools; this means that whether they get used for good or evil depends on the person who ultimately buys them and so on.

> There's nothing bad about working in computer vision, for example. Computer vision may very well some day be used to heal people who would otherwise die. Of course, it could also be used to guide missiles, cruise missiles for example, to their destination, and all that. You see, the technology itself is neutral and value-free and it just depends how one uses it. And besides -- consistent with that -- we can't know, we scientists cannot know how it is going to be used. So therefore we have no responsibility.

> Well, that is false. It is true that a computer, for example, can be used for good or evil. It is true that a helicopter can be used as a gunship and it can also be used to rescue people from a mountain pass. And if the question arises of how a specific device is going to be used, in what I call an abstract ideal society, then one might very well say one cannot know.

> But we live in a concrete society, [and] with concrete social and historical circumstances and political realities in this society, it is perfectly obvious that when something like a computer is invented, then it is going to be adopted for military purposes. It follows from the concrete realities in which we live; it does not follow from pure logic. But we're not living in an abstract society, we're living in the society in which we in fact live.

> If you look at the enormous fruits of human genius that mankind has developed in the last 50 years, atomic energy and rocketry and flying to the moon and coherent light, and it goes on and on and on -- and then it turns out that every one of these triumphs is used primarily in military terms. So it is not reasonable for a scientist or technologist to insist that he or she does not know -- or cannot know -- how it is going to be used.

-- Joseph Weizenbaum, http://tech.mit.edu/V105/N16/weisen.16n.html


So... When you think about computers and lasers today, is it a military application that comes to your mind? Even rocketry is iffy.

I read the GP completely agreeing with your point, but this comment is such a clear reminder that we have no hope of knowing what change will come from a technology that I had to change my mind.


There is no single application that comes to my mind, so I can answer neither yes nor no, and I can't deduce what answer you seem to be supposing.

> I read the GP completely agreeing with your point, but this comment is such a clear reminder that we have no hope of knowing what change will come from a technology that I had to change my mind.

Okay, but why? Because a military application does or doesn't come to my mind when I "think about computers and lasers"? I'm not following.


Because this is an excerpt from a smart person just a bit out of touch with his times (it would have been the majority opinion among smart people a few years earlier), claiming that computers are inherently linked to the military, to the point that new CS undergrads would certainly work on military projects after graduation.


There is already sophisticated eye-tracking equipment more suitable for military purposes. This might be useful for spying, but not more useful than key logging or screen capture.


Go find me one "tech manager" who wants to install this on their employees' laptops.



