
Eye Contact in Video Conferencing (2018) - walterbell
https://itotd.com/articles/5279/eye-to-eye-video/
======
sexyflanders
Latency is the issue for me more so than eye contact.

On video chat sometimes I’ll say something like “mmhmm” or “yes” in agreement,
and accidentally throw the speaking party off because my comment was delivered
after a normal pause in speech due to latency.

This interruption of normal conversation flow is more problematic to me than
eye contact. We both know we’re speaking into machines after all.
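
A rough way to see why the "mmhmm" lands late: the pause has to reach the listener, the listener reacts, and the acknowledgement has to travel back. The sketch below is only back-of-the-envelope arithmetic; the function names and all of the numbers (pause length, reaction time, latencies) are illustrative assumptions, not measurements.

```python
def backchannel_arrival_ms(one_way_latency_ms: float, reaction_ms: float) -> float:
    """Time after the speaker pauses at which they hear the listener's "mmhmm".

    The pause travels to the listener (one trip), the listener reacts,
    and the acknowledgement travels back (a second trip).
    """
    return 2 * one_way_latency_ms + reaction_ms


def lands_on_resumed_speech(pause_ms: float,
                            one_way_latency_ms: float,
                            reaction_ms: float) -> bool:
    """True if the acknowledgement arrives after the speaker has resumed talking."""
    return backchannel_arrival_ms(one_way_latency_ms, reaction_ms) > pause_ms


if __name__ == "__main__":
    # Assumed values: a 500 ms pause and a 200 ms reaction time.
    for latency in (0, 50, 150, 300):
        collides = lands_on_resumed_speech(500, latency, 200)
        print(f"one-way latency {latency} ms -> interrupts speaker: {collides}")
```

Under these assumed numbers, the acknowledgement fits inside the pause in person and at low latency, and only starts colliding with resumed speech once the round trip plus reaction time exceeds the pause.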

~~~
dheera
For some reason video calls have just never worked well for me. Every single
time it's either latency, massive amounts of echo making it hard to speak, or
"can you hear me" every minute or two. For some reason, it's 2019 and we as a
human race still haven't figured out how to get a stable 480p video stream
over gigabits of bandwidth.

~~~
shereadsthenews
Video conferencing worked perfectly when it was still done over circuit-
switched digital telco networks. It’s only the Internet that has botched
things up.

~~~
aidenn0
If by "botched things up" you mean "Made affordable." Nobody other than Vint
Cerf will argue that circuit-switched networks are not superior from a QoS
point of view, but they can be orders of magnitude more expensive to build and
maintain.

~~~
closeparen
They should be feasible to implement within a building, between buildings on
the same street, and probably between geographically diverse sites of
multi-billion-dollar companies that spend 30%+ of their labor hours on site-spanning
meetings.

------
erikpukinskis
This will definitely never happen with “monitor”-style technology, because
AR/VR with face tracking will do it. That’s probably five years out.

As a side note, I believe eye contact and 3D projected body language tracking
are the limiting technology for virtual offices to leapfrog physical offices.

It’s my expectation that when those two technologies are cheap ($600 all-in-
one device) it will massively disrupt both commercial real estate and any
residential markets that are commute dominated.

I believe particularly in the Bay Area this will be the source of the next
housing market collapse. Particularly since Facebook is heavily invested in
telepresence now, and has a clear path to a device like I described within ~5
years.

Google and Apple, in all likelihood, will put out a similar device in that
timeframe as well.

~~~
aantix
Do you have any links where we can explore such prototypes?

~~~
erikpukinskis
You can look up some of the foveated rendering prototypes. The non-static ones
have eye tracking.

I have seen a few headsets with downward facing cameras but not many. There
are hard problems around both technologies so they are mostly research
prototypes at this point I think, not consumer alphas.

~~~
erikpukinskis
Also I think some Leap Motion prototypes might be facing downward and tracking
the body. I seem to remember one that had the Leap sensor sticking out on
little arms.

As I said I believe Facebook, Google, and Apple all have this tech on the far
side of their roadmaps. So I don’t see all three failing.

But even if they do, the Leap Motion roadmap points there, and... while Leap
is obviously mismanaged, its new parent company is UltraHaptics. Hopefully they
can get Leap into a headset....

Regardless, the Leap founder (Ed: David Holz) knows exactly what’s necessary
for telepresence in VR, and I don’t think any amount of mismanagement will
stop him from executing it EVENTUALLY.

But he’ll take 10 years, and Facebook might do it in 3. Apple could roll it
out next year for all we know. They’ve prototyped all the tech already in
ARKit.

David if you’re in this thread and you want to build a leap headset email me.
We can just disassemble Oculus Go’s and add a Leap and... well you wrote the
6DOF code already right? Solder a coprocessor on the board that can do it
without any battery drain. Maybe we have to design one, IDK. Sacrifice a
kitten. Hire a stats PhD and make it “stochastic”. Put a neural net processor
between the cameras and the computer. Who knows.

Then ship it with a Nintendo 64 emulator ONLY. That’s the entire OS: no games,
you have to sideload them. And mark it up to $999. Fund the company on
preorders.

~~~
erikpukinskis
Strip off the Go’s plastic and CNC a custom enclosure, faceplate carved to the
buyer’s design. Use a portion of every sale to fund artists creating faceplate
designs for customers to choose from.

Needs IPD adjust, off ear audio

------
mwcampbell
> Eye contact is extremely important for meaningful communication

Some of us, specifically blind people, have lived our lives without ever being
able to make eye contact, yet we still have meaningful communication, both
with each other and with sighted people. I, for one, don't ever think about my
inability to make eye contact, except when a sighted person brings it up
(edit: and to be clear, that only happens in a discussion like this one, not
when trying to communicate with me personally).

~~~
darklajid
I'm genuinely curious: Do you use video chat a lot? If you do, is that a
personal preference or something of a standard workflow at your job / you
agreeing to someone else's preferences? My sample size of blind people I know
and feel close enough with to discuss limitations/day to day issues is tiny
unfortunately, so the question of whether video chats are a thing never came
up (We did watch(?) movies before though).

The reason I'm asking is that I do agree with the article, but I'd phrase it
slightly differently: The technology right now makes video chats ~different~ and
awkward due to the lack of eye contact, IFF you're used to eye contact in face
to face conversations in the first place, because video chats are supposed to
allow f2f over remote distances.

~~~
mwcampbell
Thanks for asking. No, I don't use video chat much at all. Really, I only do
it occasionally with my family when I can't be there for somebody's birthday
(I'm in Washington state but the rest of my family are in Kansas). And of
course, I only use video for their benefit, not mine.

I currently work in an on-site office environment, and only a small number of
us are blind or low-vision. When I'm initiating a real-time conversation with
someone, I'll often do it by IM. We may do it face-to-face if the other person
is right there, but to me that's just like an audio call with 3D positioning,
perfect fidelity, and no latency. To be fair, that is indeed still better than
current VoIP, but not enough so that I would insist on being in the same
building as my coworkers. Indeed, with an online audio group chat, I always
know when someone joins or leaves the group; that's not always the case in
person.

At my previous job, by contrast, we were an all-distributed team made up
almost entirely of blind or low-vision people. As you would expect, we never
used video. But we did use audio a lot, both one-on-one and group chats. (Our
favorite voice chat system ended up being TeamTalk [1].) I never met most of
my coworkers in person, and I never felt that this was a problem. It only just
now occurred to me that the few sighted people on our team might have been
uncomfortable with the way the rest of us worked. As for the rest of us, as
far as I know, none of us ever wanted to be together in an office building. At
least not for the sake of communication; having a shared pool of test hardware
would have been nice from time to time.

[1]: [http://bearware.dk/](http://bearware.dk/)

------
Arathorn
One option is to capture the video with depth as well as colour data and then
just rotate it to point in the right direction(s) to maintain eye contact.
This is what we experimented with on
[https://matrix.org/blog/2018/02/05/3-d-video-calling-with-
ma...](https://matrix.org/blog/2018/02/05/3-d-video-calling-with-matrix-
webrtc-and-webvr-at-fosdem-2018)
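
For anyone wondering what "rotate it" means in practice, here is a minimal sketch of the geometry, assuming a simple pinhole camera: lift each pixel to 3D using its depth, rotate it about a pivot (say, the centre of the face), and project it back. This is not the code from the linked demo; the function names, the focal length and the principal point are all made up for illustration, and a real system would also have to fill the holes that appear where the rotated view exposes surfaces the camera never saw.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]


def unproject(u: float, v: float, depth: float,
              focal: float, cx: float, cy: float) -> Point3:
    """Lift pixel (u, v) with a depth value to a 3D point in camera space."""
    return ((u - cx) * depth / focal, (v - cy) * depth / focal, depth)


def rotate_about_x(p: Point3, pivot: Point3, angle_rad: float) -> Point3:
    """Rotate p around a horizontal axis through pivot (tilts the face up or down).

    Which sign of angle_rad tilts "up" depends on the axis convention in use.
    """
    x, y, z = p[0] - pivot[0], p[1] - pivot[1], p[2] - pivot[2]
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (x + pivot[0], y * c - z * s + pivot[1], y * s + z * c + pivot[2])


def project(p: Point3, focal: float, cx: float, cy: float) -> Tuple[float, float]:
    """Project a 3D camera-space point back to pixel coordinates."""
    x, y, z = p
    return (focal * x / z + cx, focal * y / z + cy)


def reproject_pixel(u: float, v: float, depth: float, pivot: Point3,
                    angle_rad: float, focal: float, cx: float, cy: float
                    ) -> Tuple[float, float]:
    """Where pixel (u, v) ends up after the scene is tilted by angle_rad."""
    p = unproject(u, v, depth, focal, cx, cy)
    return project(rotate_about_x(p, pivot, angle_rad), focal, cx, cy)
```

Applying `reproject_pixel` to every pixel of a colour-plus-depth frame gives the tilted view; the colour of each source pixel is simply carried to its new location.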

------
dave_sid
I really don’t think looking someone in the eye is as important as it sounds.
Yes facial expressions communicate a vast amount of information but I can
still see someone’s face even if they are not looking me in the eye. Tbh I
actually don’t like looking people in the eye for the whole of a conversation;
it would feel weird. For what it’s worth, I don’t feel like someone is being
dishonest if they don’t look me in the eye. That kind of myth can go in the
bin along with the idea that a firm handshake shows confidence. That kind of
thing makes me cringe.

------
killjoywashere
One of the simplest fixes would be to allow the user to put their
thumbnail at top center, where the camera usually is. Even better, maybe
position it dynamically above the head of the speaker on the other
end. In my case, for example, the geometry of my desk leaves the front edge of
my microscope visible in teleconferences, so I am no doubt slightly left of
center in all cases. In any case, I know that I for one spend a non-trivial
amount of time looking at my own camera feed, seeing if the video is ok: am I
dropping frames because of a visible ceiling fan? Is the sun coming through a
window behind me, blowing me out?
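
The reason moving things toward the camera helps can be put in numbers: the apparent gaze offset is roughly the angle between the line to whatever you are looking at and the line to the camera. A small sketch, with a hypothetical function name and assumed distances (a 60 cm viewing distance, a target 15 cm vs. 2 cm from the lens):

```python
import math


def gaze_offset_deg(target_to_camera_cm: float, viewing_distance_cm: float) -> float:
    """Angle between looking at an on-screen target and looking into the camera.

    target_to_camera_cm: distance on the screen plane from the camera lens
                         to the point the user is actually looking at.
    viewing_distance_cm: distance from the user's eyes to the screen.
    """
    return math.degrees(math.atan2(target_to_camera_cm, viewing_distance_cm))


if __name__ == "__main__":
    # Assumed numbers for illustration only.
    print(round(gaze_offset_deg(15, 60), 1))  # target in the middle of the screen
    print(round(gaze_offset_deg(2, 60), 1))   # target right under the camera
```

The closer the thing you look at sits to the lens, the smaller the angle, which is the whole point of parking the thumbnail (or the other person's face) at top center.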

------
couchand
_... I think most of us would say that’s a big improvement over audio-only
communication._

Never assume that just because you have a particular personal preference that
most people share it.

~~~
7952
I consider a well written email to be a big improvement over audio only
communication.

------
foreigner
Eye contact would be brilliant but at the moment I would settle for full-
duplex audio. It really impairs the ability to have a natural conversation in
e.g. Google Meet when you have to wait for the other party to be totally
silent before you start speaking.

------
kimburgess
Link to one of the more recent and interesting projects mentioned:
[http://sites.skoltech.ru/compvision/projects/gaze-
correction](http://sites.skoltech.ru/compvision/projects/gaze-correction).

This would be a crazy difficult (and fun) domain to work in. Humans are very
well tuned to perceiving facial details - the margin for error would have to
be tiny. I'm still uncertain if images processed this way could escape the
uncanny valley, but this appears to be a much more technically feasible
approach for consumer use than transparent displays, multi-cam setups, or
camera arrays embedded in displays.

~~~
nsb1
If I were a betting man, this is where I'd put my money. It seems attainable
via software or hardware assisted processing once the model has been computed.
All the other solutions seem too expensive or hardware intensive to me.

------
hashkb
I'm a full time remote consultant. I've noticed anecdotally that over the last
year 80% of my contacts have started muting their camera all the time. Some
cite "bandwidth" or other silly excuses; most just say nothing.

I'm as insecure about my appearance as anyone else- what's the deal? I keep my
camera on until it's obvious that nobody else is going to show their face.
Then I have to be the awkward one to turn off.

~~~
ianbicking
I've been doing video meetings for years, and I've seen video muting go up and
down based on team expectations / social mores, general engagement, and the
kind of meetings (e.g., one where lots of people are just trying to get some
situational awareness, but aren't really participating).

I personally get annoyed when people video mute by default, and when I was a
manager it was something I would bring up in 1:1s. Lurking on meetings is
fine, but if you have a real role in the meeting and you aren't multitasking a
meal or something, I feel strongly you should have your video on and be
engaged, and most of the time when people are video muting it's because they
are hiding their distraction and lack of attention, or else it's reached a
critical mass where people are muting just to fit in.

Now that I'm an IC again generally the best I can do to encourage people to
unmute is just name them: "hey Joe, are you there?" even though I know full
well they are connected. It's admittedly a bit passive aggressive, but it's
also a way of gently telling the person they aren't being present, and it's
unlikely I'm the only person who notices.

I also mute myself for the wrong reasons, so I'm not trying to make this an
indictment of individuals. But it's so easy for people to drift away from
their work, especially when you are remote, and we need to look out for each
other and not just accept dysfunctional behavior.

~~~
couchand
Wow I hope we never work together. A mature team trusts one another and
assumes the best. If I got that feedback from a manager in a one-on-one I
would put in notice that day. What a horrible attitude to bring to your work
interactions.

~~~
ycombobreaker
If you had a private conversation with your manager, and they asked you to "be
more present" at telemeetings via unmuting your video stream, your response
would be to QUIT? Maybe before that point you should talk through the topic
with your manager. Surely the best result for everyone is to have a shared
understanding, rather than making what seems like a minor communication
"difference of opinion" a dealbreaker.

~~~
darpa_escapee
This is butts-in-seats mentality, but for remote workers:

> _most of the time when people are video muting it's because they are hiding
> their distraction and lack of attention_

The OP _assumes_ that those who have their camera off are wasting time,
because they can't see them to prove that they're not.

~~~
ycombobreaker
I get that, but... butts-in-seats as an expectation for the work day (which is
typically where the policy/work culture is called out), is quite different
from a butts-in-seats expectation for meetings where one has a "real role".

~~~
couchand
Can you clarify your position here? What do you see as the main issue with
butts-in-literal-seats, and what about this hypothetical remote meeting is
"quite different"?

~~~
ycombobreaker
GP cited "butts in seats mentality" in a way that I interpret as pejorative,
referring to office cultures where physical presence during working hours is
more important than output.

A meeting is specifically about communication. A participant at the meeting
that is inattentive is either wasting others' time, or should probably
politely step out, or at least should not have been asked to attend. Being
present _is_ a priority for a good meeting, hence the pejorative "butts in
seats" comment doesn't make sense to me.

~~~
couchand
The output of an interactive meeting is the interactivity, not the appearance
of interactivity.

~~~
ycombobreaker
I don't disagree with this; I just think dropping the video feed can impact
the interactivity of a meeting.

------
andy_ppp
Eye contact is something I constantly need to remind myself that people feel
is really important, but I cannot say I’m naturally aware of its necessity.

~~~
dagw
Yea, I find it really tricky to both think about what I'm trying to say and
focus on maintaining eye contact.

------
jpalomaki
Combining a Tobii eye-tracker with a software (machine learning) based
correction system might be an interesting project to try.

You could collect training data by setting up a webcam behind a teleprompter,
a regular webcam, and a Tobii on one machine. Just record normal usage and then
train a model using the eye-tracker data and regular webcam data as input and
the data captured through the teleprompter as the target.
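
One practical wrinkle with that setup is that the three devices will not sample at the same instants, so the recordings have to be lined up by timestamp before they can be used as (input, target) pairs. A minimal sketch of that alignment step follows; the `Sample` type, the function names and the 20 ms tolerance are all assumptions made for illustration, and nothing here talks to a real camera or to the Tobii SDK.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence, Tuple


@dataclass
class Sample:
    t: float       # capture time in seconds
    payload: Any   # a frame, or a gaze point from the eye-tracker


def nearest(samples: Sequence[Sample], t: float, max_gap: float) -> Optional[Sample]:
    """Return the sample closest in time to t, or None if none is within max_gap.

    samples must be sorted by time. Rebuilding the time list on every call is
    wasteful but keeps the sketch short.
    """
    if not samples:
        return None
    times = [s.t for s in samples]
    i = bisect_left(times, t)
    candidates = samples[max(i - 1, 0): i + 1]
    best = min(candidates, key=lambda s: abs(s.t - t))
    return best if abs(best.t - t) <= max_gap else None


def build_pairs(webcam: Sequence[Sample], gaze: Sequence[Sample],
                teleprompter: Sequence[Sample], max_gap: float = 0.02
                ) -> List[Tuple[Tuple[Any, Any], Any]]:
    """Pair each teleprompter frame (target) with the closest webcam frame and
    gaze point (inputs); drop targets that have no close-enough inputs."""
    pairs = []
    for target in teleprompter:
        frame = nearest(webcam, target.t, max_gap)
        point = nearest(gaze, target.t, max_gap)
        if frame is not None and point is not None:
            pairs.append(((frame.payload, point.payload), target.payload))
    return pairs
```

Each resulting pair is `((webcam_frame, gaze_point), teleprompter_frame)`, which is the input/target split described above.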

With Tobii the system would also know who you are looking at on the screen.
With custom conferencing software, you could deliver the software modified
eye-contact stream only to that person.
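
Working out who you are looking at is essentially a hit test of the gaze point against the layout of the video tiles. A small sketch, assuming the eye-tracker already reports the gaze point in the same screen pixel coordinates as the tiles; the names and the layout are hypothetical and no Tobii API is used:

```python
from typing import Dict, Optional, Tuple

Rect = Tuple[float, float, float, float]  # left, top, width, height in pixels


def participant_under_gaze(gaze_xy: Tuple[float, float],
                           tiles: Dict[str, Rect]) -> Optional[str]:
    """Return the name of the participant whose tile contains the gaze point,
    or None if the user is looking somewhere else."""
    gx, gy = gaze_xy
    for name, (left, top, width, height) in tiles.items():
        if left <= gx < left + width and top <= gy < top + height:
            return name
    return None


if __name__ == "__main__":
    # Assumed two-tile layout on a 1280-pixel-wide window.
    layout = {"alice": (0, 0, 640, 360), "bob": (640, 0, 640, 360)}
    print(participant_under_gaze((700, 100), layout))
```

The conferencing software would then send the corrected stream only to the participant returned here and the unmodified stream to everyone else.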

[https://gaming.tobii.com/product/tobii-eye-
tracker-4c/](https://gaming.tobii.com/product/tobii-eye-tracker-4c/)

------
everdrive
Does anyone wish we'd just attempt to have fewer video conferences instead of
trying to tweak this terrible technology?

------
scotty79
I am really amazed that video call apps don't have an option that
adjusts the look of the eyes of the person you are talking to so they seem to be
directed at you.

There's no magic in how people look when they look at you. All the magic of
feeling when someone is looking at you happens in the recipient's brain.

~~~
zarmin
eyes are the w̶i̶n̶d̶o̶w̶s̶ linux to the soul

------
jstsch
This will be solved when in-display cameras become ubiquitous. Probably the
entire display will be the camera, without a lens obviously. Using some nifty
algorithms you can construct a focused image from that. It'll just be a layer
inside the display.

