WebGazer.js: Eye Tracking on the Browser (brown.edu)
190 points by trymas on May 25, 2016 | 65 comments

The demo you came here for


Click around the screen where you're currently looking to calibrate

That is amazing. I've never used an eye-tracking device before so the experience of hitting the circles with my eyes is very surreal.

I think this would be really useful to UX researchers! The folks I know in that space often set up screen-sharing sessions with users, and watch them interact with the UI in real time. Adding gaze tracking to this process seems like a natural progression, allowing the researchers to see where the user expects a button or UI feature to be, even if the user can't articulate that in real time.

I think this is impressive work. My lab does eye tracking with really expensive equipment locally. It would be interesting to conduct visual/face processing studies through the web and include eye tracking data with this library. Granted the accuracy is low compared to specialized equipment and predictions are highly variable, other contributors could enhance its accuracy. I suspect that the variability in shape, size, and structure of human faces contributed greatly to the high variance/low resolution predictions issue. Maybe categorizing the user's face shape a priori would enhance predictions. That is, the library would first estimate the "type of face" and use those data to inform the eye tracking. This might not make sense--just a thought.

Just curious, but does anyone who pays attention to this space/technology know if eye tracking (with commodity-ish hardware) has gotten to the point of "replacing the mouse"?

E.g. at the OS level, the pointer basically follows your gaze.

Eyes don't point at things we look at as accurate as a mouse does (they only aim within the fovea, about 1 degree angular, corresponding to about 1cm on the screen) and eye trackers are often far worse.

This summer I built a system that uses slight head movements for refinement, allowing hands free mousing as fast as a trackpad. However, eye trackers good enough for the system to be pleasant cost $10000+. I have ideas that would allow it to work with a $100 eye tracker, but even then it has the downside that the computer vision for your mouse will peg one core.
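The "about 1 degree, about 1cm" figure above can be checked with a little trigonometry. This is a back-of-the-envelope sketch; the 60 cm viewing distance and the 96 pixels-per-inch density are assumptions for illustration, not numbers from the comment.

```python
import math


def visual_angle_to_cm(angle_deg, viewing_distance_cm):
    """Width on a flat screen subtended by a visual angle centred on the line of sight."""
    half_angle = math.radians(angle_deg) / 2
    return 2 * viewing_distance_cm * math.tan(half_angle)


def cm_to_pixels(size_cm, pixels_per_inch):
    """Convert a physical size on the screen to pixels (1 inch = 2.54 cm)."""
    return size_cm / 2.54 * pixels_per_inch


# Assumed setup: eyes about 60 cm from a 96 PPI monitor.
width_cm = visual_angle_to_cm(1.0, 60.0)
width_px = cm_to_pixels(width_cm, 96)
print(f"{width_cm:.2f} cm, about {width_px:.0f} px")
```

Under those assumptions one degree covers roughly 1 cm, or about 40 pixels, which is far coarser than a mouse pointer.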

As someone who hates mice and hopes one day to have this sort of tech, I'm trying to understand what you mean by "Eyes don't point at things we look at as accurate as a mouse does" - how is it that I can choose to "look at" an individual pixel on my screen, or an individual serif on a letter, and then choose another one far less than 1cm away, and my eye focus shifts to it? That shift is not detectable by a camera?

(Edit update: by "detectable" I mean theoretically detectable by a really good camera and software. In other words, it seems you are arguing the tech is impossible even theoretically due to some aspect of biology. Am I following you correctly there? Thanks)

I'm also not an expert, but I suspect a confounding problem is the fact that the eyes never stop moving, even when it feels to us like we're looking at a single point: https://en.wikipedia.org/wiki/Eye_movement#Saccades

Not an expert, but you actually focusing on a pixel is your brain doing image processing, not a function of your eyeball.

Your field of vision (and focus) is much bigger than 1 degree, so any part of the image that you are "looking at" is something the brain is doing to the image as post processing, not something your eyeball is doing.

Your eye is constantly moving around, even when you think you are staring at the exact same point.

I researched this over a summer once by downloading the latest scientific papers on gaze tracking. The results were pretty disappointing to me. Then I figured I was doing this research all wrong, because I already knew from high school biology that your eye makes micro movements all the time to keep the retina stimulated and to keep a larger area in focus at the same time. So I opened Wikipedia and looked up the smallest micro movements that the eyes make. Based on their angle and the average distance between your eye and the screen, it's easy to see that you can never replace the mouse with gaze for a pixel perfect pointing device.

However! If you think outside the box, you might combine a fairly accurate gaze tracker with a different GUI design to get this to work. That vision (no pun intended) is more of a long-term one. An easy short-term use for eye gaze would be automatically setting GUI window focus based on where you look. That alone might save you a keyboard-mouse switch. As long as you have no more than four windows on your screen, you can make it work with the current tech already.
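The gaze-driven window focus idea can be sketched roughly as follows. The `Window` type, the `focus_target` function, the four-window layout, and the 100 px `margin` are all hypothetical; the margin is there so that a noisy gaze estimate near a window border does not flip the focus back and forth.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Window:
    name: str
    left: int
    top: int
    width: int
    height: int


def focus_target(windows: Sequence[Window], gaze_x: float, gaze_y: float,
                 current: Optional[str] = None, margin: int = 100) -> Optional[str]:
    """Pick the window to focus, ignoring gaze points near a window's edge."""
    for w in windows:
        inside_x = w.left + margin <= gaze_x < w.left + w.width - margin
        inside_y = w.top + margin <= gaze_y < w.top + w.height - margin
        if inside_x and inside_y:
            return w.name
    # Gaze is in the uncertain border band (or off screen): keep the current focus.
    return current


# Hypothetical layout: four tiled windows on a 1920x1080 screen.
WINDOWS = [
    Window("editor", 0, 0, 960, 540),
    Window("terminal", 960, 0, 960, 540),
    Window("docs", 0, 540, 960, 540),
    Window("browser", 960, 540, 960, 540),
]

print(focus_target(WINDOWS, 200, 200))                     # editor
print(focus_target(WINDOWS, 970, 200, current="editor"))   # editor (too close to a border)
print(focus_target(WINDOWS, 1500, 800, current="editor"))  # browser
```

With only four large targets, an error of 100 px or so still leaves a big unambiguous region in the middle of each window.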

"you can never replace the mouse with gaze for a pixel perfect pointing device"

And how about getting to the kind of precision a finger has on a touchscreen? Do you think this is possible? Because touch works pretty well, if the UI is good (big) enough ...

I researched it some years ago and the precision was just too low. Furthermore I believe that it has to work REALLY well. In particular it is interesting to know if there would be some kind of health risk?

E.g. if the tracking is off (i.e. the predicted x,y coord differs from the target you had in mind with your gaze) I guess one would try to auto-correct the error by a slightly different gaze, i.e. looking slightly beneath the target. I am not sure if this is a problem or not?

We do not have this problem with the mouse, as we control it via relative speed and do not set specific x,y coords.

Maybe one could solve this by also tracking the head? E.g. make an educated guess via gaze tracking and let the user refine it with her ... nose. ;)

I was wondering if this sort of cursor control would be possible with, say, a Google Glass. The software is already able to take input in the form of winking for snapping pictures. So, couldn't you have a cursor overlaid on the Glass's screen, line it up with where you want it on your computer screen, and then wink to place it there?

Kind of related, there's a VR headset coming with built-in eye-tracking: http://www.getfove.com/

Even if it could just focus the particular terminal window I'm looking at then it'd be really useful.

BUT BUT BUT!!! I spend so much time moving the cursor out of the way so I can read the letter under the pointer.

You can avoid that easily using more conservative MAGIC (actually a reference to an HCI paper) that only moves the mouse pointer when you want it to.
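The paper in question is presumably Zhai, Morimoto and Ihde's "Manual And Gaze Input Cascaded (MAGIC) Pointing". Roughly, the conservative variant leaves the pointer alone while you read and only warps it toward the gaze point once the mouse starts moving. A minimal sketch of that idea follows; the class name, the `warp_threshold` value, and the event methods are made up for illustration and do not claim to match the paper's details.

```python
import math


class ConservativeMagicPointer:
    """Warp the pointer to the gaze point only when the mouse starts moving."""

    def __init__(self, x, y, warp_threshold=120):
        self.x, self.y = x, y
        self.warp_threshold = warp_threshold  # assumed value, in pixels
        self.mouse_was_idle = True

    def on_mouse_idle(self):
        """Call when the mouse has been still for a while."""
        self.mouse_was_idle = True

    def on_mouse_move(self, dx, dy, gaze_x, gaze_y):
        """Apply a relative mouse movement, warping first if a new movement begins."""
        if self.mouse_was_idle:
            distance = math.hypot(gaze_x - self.x, gaze_y - self.y)
            # Only jump if the gaze is far away; small offsets are left to the hand.
            if distance > self.warp_threshold:
                self.x, self.y = gaze_x, gaze_y
            self.mouse_was_idle = False
        self.x += dx
        self.y += dy
        return self.x, self.y
```

While the mouse is untouched the pointer never moves, so it cannot drift over the text being read; the gaze estimate only has to be good enough to land near the target, and the hand does the fine positioning.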

I think this could be useful with digital signage. My client displays content as HTML running in a browser, which makes it easy to design and deploy. He says one of the challenges is getting seamless user interactivity with his browser-based digital signage.

This, coupled with a web camera recognizing hand gestures (e.g. http://hadi.io/gest.js/), could help capture user interaction inside his apps without too much hassle, but right now it's a janky solution compared to a real native Kinect setup (I don't think it's even possible to effectively connect to a Kinect in the browser).

Someday soon USB devices will be accessible, thanks to the Chromebook. Not sure if that will help accessing the Kinect.


Well, after I figured out that I could use clicks to calibrate the tracker it was somewhat accurate (ca. 100-200 px); however, the variance was quite large.

I think it's a really nice idea, but it's not yet precise enough for tasks such as user studies.

100px gives you around 70 zones in your average desktop window. More, when you consider shared hits between zones. That's enough to do basic heat maps of a web site (such as for ad impression tracking), coarse navigation, or integration into things like editors/IDEs.
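To make the zone arithmetic concrete: a window of about 1000x700 px (an assumed size, not one given in the comment) splits into 10 x 7 = 70 zones of 100 px each. A minimal sketch of binning gaze points into such zones for a basic heat map, with hypothetical function names:

```python
from collections import Counter


def zone_grid(width_px, height_px, zone_px=100):
    """Number of zone columns and rows needed to cover the window."""
    cols = -(-width_px // zone_px)   # ceiling division
    rows = -(-height_px // zone_px)
    return cols, rows


def heat_map(gaze_points, width_px, height_px, zone_px=100):
    """Count gaze points per (column, row) zone, skipping points outside the window."""
    counts = Counter()
    for x, y in gaze_points:
        if 0 <= x < width_px and 0 <= y < height_px:
            counts[(int(x // zone_px), int(y // zone_px))] += 1
    return counts


cols, rows = zone_grid(1000, 700)
print(cols * rows)  # 70

points = [(50, 50), (60, 70), (950, 650), (-5, 10)]
print(heat_map(points, 1000, 700))
```

The off-window point is dropped, and the two nearby points land in the same top-left zone.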

I was thinking about ad impressions as well; but God forbid websites are ever able to use my camera for eye tracking by bypassing the webcam permissions (or through the removal of those permissions in the future? I expect anything when it comes to content monetisation).

Or simply refusing access to the content without access to the camera (thinking of Forbes' ill-conceived content blocking).

I will laugh my ass off if it ever comes to this. Hopefully I never have to come back to this comment and weep.

Stop giving them ideas!

Sticker over camera lens takes care of that - in this age of malware it is good to use it anyhow. Good luck tracking my eyes then!

Other than that, I see huge potential for this in UX studies.

Need something like that for microphones in TV/laptops/tablets.

It sadly did not work for me either. I tried "calibrating" it by moving my mouse around the screen, but when I moved the mouse out of the window it was just as bad as before.

You need to click.

Open the game and follow the mouse with your eyes while you click each 8-directional edge of the game's viewport and then click the center.

Now you can let go of the mouse (don't need to move it out of the game) and the game should follow your gaze.

This demo is impressive. It's accurate enough for me to isolate the orange ball from the cluster.

On a website, the calibration could be done in a modal overlay. "Look at this dot while clicking it wherever it appears". After the Nth click, the modal goes away and the user lands on the website ready to gaze.

Nice work, well done.. one of the best ones we've seen!

I am one of the founders at xLabs and we too have been working on self-calibrating, real-time, interactive webcam eye tracking for some time, and have a number of demos and commercial products.

We'd love your feedback.. check out http://xlabsgaze.com/ (core tech) and our first commercial product https://eyesdecide.com (Design effectiveness and usability testing SaaS).

Or just check out one of our videos here https://www.youtube.com/watch?v=kSnAfJWAhtE

Keep up the good work.

This could be useful to predict where the user will click next, so the website can preload data, with the result that user waiting time is reduced.

I assume it's meant for in-house user testing, not on the live web.

100px accuracy. I thought eye tracking could do better than that in general? At 100px accuracy, WebGazer is more useful to advertisers than it is to consumers, as I can't see any consumer-friendly applications that would be useful with that precision. If that's the case, adoption won't grow. Are there other comparable eye tracking libraries out now?

> as I can't see any consumer-friendly applications that would be useful with that precision

How about:

- switching focus between windows

- application selection a'la mission-control

- coarse navigation through a document via a sidebar

- real-time inventory management in a HTML5 game

- Selective zoom of embedded images

Even more fun: Atom is run inside a web browser. Imagine what you could do with eye tracking within an IDE.

These would probably work:

- switching focus between windows

- application selection a'la mission-control

Whereas there is no way these would work nicely or without jitter at 100px precision, or without an external sensor (touch/mouse/etc.):

- coarse navigation through a document via a sidebar

- real-time inventory management in a HTML5 game

- Selective zoom of embedded images

>Imagine what you could do with eye tracking within an IDE.

You provided really good cases for the rest, but for this I can't quite picture a need (other than maybe scrolling, which is arguably faster using vim movements already). Do you have any ideas?

Maybe it could be used for modal keybindings if your gaze changes modes e.g. focus between the panes of an IDE. hjkl moves the cursor when looking at the code, but it traverses the file hierarchy when looking at the sidebar, and enumerates tabs when looking at widgets.

Kind of like how keys are reused in Vim yet have similar meanings across each mode. You don't have to remember so many unique keybindings.

Some editors let you switch between panes with keybindings, but once you have more than a few or some in weird shapes, you end up needing tmux's `C-b q` pane selector. Seems like gaze could replace most of that.
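The gaze-dependent keybinding idea above can be sketched as a lookup keyed on the pane being looked at. The pane names, action names, and `resolve_key` function are hypothetical and not taken from any real editor.

```python
# Hypothetical keymaps: the same keys mean different things per gazed-at pane.
KEYMAPS = {
    "code": {"h": "cursor_left", "j": "cursor_down", "k": "cursor_up", "l": "cursor_right"},
    "sidebar": {"h": "collapse_folder", "j": "next_file", "k": "previous_file", "l": "expand_folder"},
    "tabs": {"h": "previous_tab", "l": "next_tab"},
}


def resolve_key(key, gazed_pane, fallback_pane="code"):
    """Return the action for a key given the pane the user is looking at.

    If the gaze tracker reports no known pane, fall back to the code pane.
    Returns None when the key has no binding in that pane.
    """
    keymap = KEYMAPS.get(gazed_pane, KEYMAPS[fallback_pane])
    return keymap.get(key)


print(resolve_key("j", "code"))     # cursor_down
print(resolve_key("j", "sidebar"))  # next_file
print(resolve_key("l", None))       # cursor_right (fallback pane)
```

Panes are large targets, so this kind of switching only needs the coarse accuracy discussed elsewhere in the thread.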

There are a number of activities in an IDE which require keystrokes or mouse movement, or changing contexts. The capability to browse through code or documentation without having to change contexts would be valuable to me; and I can't be the only one able to finish typing a thought while my eyes move on to read other code.

Minor use cases, perhaps, but something that operates intuitively has the potential to be pretty awesome.

As a user, I find this creepy as fk. I would love to know how to disable this sort of "feature" on my phone.

If you have an Apple device you don't need to disable it: webcam video streams (aka getUserMedia()) aren't supported on Safari. On other browsers you always need to give permission before the camera is switched on.

Oh, great to know, thanks!

It didn't work for me until I did the following:

- Zoom in so that my gaze is actually looking at distinct parts of the physical screen

- Look at the cursor while clicking a few times to train it

- Take off my glasses

- Move way closer to the camera and hold relatively still

It was a lot of hassle to get it to work but after it did work it was a lot of fun!

> Take off my glasses

I was wondering if that was the problem for me, but I'm so near-sighted that if I were close enough to actually see the cursor, there's no way the camera could see my eyes to track them. :-(

In the demo, I had to click on the screen a couple of times to get the dot to start moving around. Then, it seemed to want to follow my mouse quite a bit as I moved and clicked.

I was able to greatly move the red dot when I sat closer to the webcam, really opened my eyes, and moved left to right, but the action was flipped for me: looking left moved the dot right, and vice versa. (I'm using an Apple Cinema Display… I've had issues before where it automatically flips the screen around in some cases.)

Although not very accurate, it's certainly an interesting experiment.

Did you calibrate it at all?

How do you calibrate it?

The instructions were on the page. Get it looking at your face properly, and then look directly at the mouse cursor and click it, on spots around the page. Make sure you don't move your head, just your eyes.

Maybe the site could detect if people have read the instructions.

It didn't work for me at all; the red dot never appeared on my screen, though the face-outline in the preview image did fit my face quite well (when it wasn't detecting my chin as my mouth).

Hi, one of the authors here. It was probably not very clear on our part, but it's meant to train the eye tracking model as you naturally use the website. So it takes advantage of your interactions over time, like if you're using Gmail.

So if you're doing the demo on the blank page, click in a couple of places around the screen (while looking there as you would normally), and you should start seeing the prediction.
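To illustrate the general idea of learning from clicks, here is a deliberately tiny toy model. It is not WebGazer's actual model: it assumes a single made-up eye feature per axis (say, a normalised horizontal pupil offset) and fits a least-squares line from that feature to the screen coordinate of each click. The class and method names are hypothetical.

```python
class ClickCalibratedAxis:
    """Least-squares line from one eye feature to one screen coordinate."""

    def __init__(self):
        self.samples = []  # (eye_feature, screen_coord) pairs, one per click

    def add_click(self, eye_feature, screen_coord):
        """Record where the user clicked and what the eye looked like at that moment."""
        self.samples.append((eye_feature, screen_coord))

    def predict(self, eye_feature):
        """Predict a screen coordinate, or None until at least two clicks are recorded."""
        n = len(self.samples)
        if n < 2:
            return None
        mean_f = sum(f for f, _ in self.samples) / n
        mean_s = sum(s for _, s in self.samples) / n
        var_f = sum((f - mean_f) ** 2 for f, _ in self.samples)
        if var_f == 0:
            return mean_s  # all clicks had the same feature value; no slope to fit
        cov = sum((f - mean_f) * (s - mean_s) for f, s in self.samples)
        slope = cov / var_f
        return mean_s + slope * (eye_feature - mean_f)


x_axis = ClickCalibratedAxis()
print(x_axis.predict(0.5))   # None: no clicks yet, so no prediction
x_axis.add_click(0.0, 0)     # looked far left, clicked at x = 0
x_axis.add_click(1.0, 1000)  # looked far right, clicked at x = 1000
print(x_axis.predict(0.5))   # 500.0
```

This also shows why no prediction appears before the first few clicks: with no samples there is nothing to fit.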

Oh, thanks, it worked now!

I'm on a thunderbolt display (with integrated camera) sitting back about 3 feet. After 5 or so clicks against the corner regions, tracking became surprisingly accurate. I'm impressed!

Very interesting project. Was excited to try this but the quality of the tracking is not there yet. The face detection was off most of the time unless I positioned my camera just right. Then the eye tracking was no good. But they are only using a webcam and doing this right (see http://www.tobii.com/xperience/) requires multiple IR emitters and an IR camera.

It didn't work that well with the camera on my HP 8470p but great idea and I'm sure it took a lot of time and effort. I appreciate the effort.

Interesting. Just curious, but is this library scalable to a facial recognition system? E.g. an authentication system.

Cool idea and opens up a lot of interesting applications. Unfortunately, the demos don't work reliably (on my machine the tracking was completely off the wall).

cool trick but we all know who will end up using this the most:

advertisers. another way for them to "gauge" their reach and brand appeal. or something.

literally i feel like every new thing that comes to the browser, a marketer at a company thinks, 'how can we exploit this to put more junk into the face of other people?'

i am not against advertising, i just think most of them have no limits in terms of what they'll exploit at the expense of their own users.

Well, advertisers couldn't use this without asking permission for a user's webcam, which would be a non-starter. I choose the less cynical path and say this could be tremendously useful for remote user testing.

Remote usability testing would be fantastic. This could be useful for local testing as well. Having a library or tool that could generically sit in front of a site to show where users look when taking action X sounds beneficial to me.

I can also imagine on a lighter note this being used for games where you look to where you want to move a character (maybe blink twice for confirmation).

How many people are going to be hesitant to allow people to use their webcams?

I can't get it to work, but even if I could I would be very hesitant to use it.

I wonder if anyone has experimented with using something like this to make an incredibly inexpensive assistive technology interface

Could this be useful to monitor user sentiment and correlate it with further data from applications/media?

This is interesting. I don't see many practical implications though.

pretty cool!
