
Looking to Listen: Audio-Visual Speech Separation - chriskanan
https://research.googleblog.com/2018/04/looking-to-listen-audio-visual-speech.html
======
tropo
This would be a huge improvement for hearing aids, particularly for people who
can't hear in stereo. It might need an eye tracker for aiming.

~~~
yegle
Exactly! The cocktail party effect requires two functional ears, and I'm sure
this can be helpful to people with only one functional ear.

> The cocktail party effect works best as a binaural effect, which requires
> hearing with both ears. People with only one functioning ear seem much more
> distracted by interfering noise than people with two typical ears.

[0]:
[https://en.wikipedia.org/wiki/Cocktail_party_effect](https://en.wikipedia.org/wiki/Cocktail_party_effect)

------
Scaevolus
This method has improvements (better quality than audio-only separation,
speaker assignment, and better noise handling), but you can do pretty well
with just mixed audio:
[https://www.youtube.com/watch?v=vW51cG1Ox98](https://www.youtube.com/watch?v=vW51cG1Ox98)

------
anotheryou
I wonder if we will have visually assisted speech-to-text. Humans do it:
[https://youtu.be/G-lN8vWm3m0?t=74](https://youtu.be/G-lN8vWm3m0?t=74)

The McGurk illusion is so strong that I'm sure visual cues have a major role
in error correction or voice recognition assistance for humans.

~~~
microcolonel
With that particular example, I noticed the issue immediately and heard "bah"
the whole time. I wonder how much people's response to that illusion varies.

------
ayush_merci
I wonder how a blind person experiences the cocktail party effect. If a
blind person can do it, maybe this separation can be done without visual
input?

------
jacksmith21006
This is pretty cool. It will be interesting to see it used on older audio
sources to clean them up.

