
Guessing the pressed keyboard keys by analyzing the audio from the microphone - Osiris30
https://github.com/ggerganov/kbd-audio
======
scarlac
Attempted to train it by typing for ~2 minutes. Basically typed everything on
the GitHub page and tried getting predictions. Results were disappointing. I
didn't see any accurate predictions. Even with the default p/q program I see
mostly random results.

I tested on a 15" MacBook Pro 2018 (latest version of the keyboard, which is
softer to type on / less noisy)

~~~
carbocation
Sounds like the author has the same experience: if the keyboard is not
mechanical, then this library doesn’t work:

[https://github.com/ggerganov/kbd-audio/issues/3](https://github.com/ggerganov/kbd-audio/issues/3)

~~~
ggerganov
Correct. I just made a short video to demonstrate what a working setup
looks/sounds like:

[https://www.youtube.com/watch?v=2OjzI9m7W10](https://www.youtube.com/watch?v=2OjzI9m7W10)

------
switz
I've always thought Twitch streamers were opening up an attack vector through
this exact method.

Cool to see someone follow through with it. Any streamers out there should
figure out a way of avoiding keypress bleed or muting their mics when typing
sensitive info (e.g. passwords).

~~~
nmstoker
Yes, countermeasures, such as playing other sounds around those frequencies
and/or filtering the microphone at those frequencies would be interesting to
explore. Filtering the mic seems the less annoying option from a user perspective!

~~~
drankula3
Most streamers, myself included, use a filter called a noise gate, which
requires the mic volume to reach a certain threshold before being
broadcast. This filters out the majority of background noise on the stream.
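
For anyone unfamiliar with the idea, here is a rough sketch of what a noise gate does. It assumes samples are floats in [-1, 1]; the function name `noise_gate` and the `hold` parameter are made up for illustration and do not come from any particular streaming tool.

```python
def noise_gate(samples, threshold, hold=0):
    """Silence samples that are quieter than `threshold`.

    `hold` (hypothetical parameter) keeps the gate open for that many
    samples after the last loud one, so word endings are not clipped.
    """
    out = []
    remaining = 0  # samples left before the gate closes again
    for s in samples:
        if abs(s) >= threshold:
            remaining = hold      # loud sample: (re)open the gate
            out.append(s)
        elif remaining > 0:
            remaining -= 1        # quiet sample, but gate is still held open
            out.append(s)
        else:
            out.append(0.0)       # gate closed: mute
    return out


gated = noise_gate([0.01, 0.5, 0.02, 0.01, -0.6], threshold=0.1)
```

Note that a gate only mutes audio below the threshold, so loud key clicks, or clicks that happen while you are talking, would still get through.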

------
daenz
Now determine the keypresses by filming the vibrations of a potato chip bag[0]

[0] [http://news.mit.edu/2014/algorithm-recovers-speech-from-vibr...](http://news.mit.edu/2014/algorithm-recovers-speech-from-vibrations-0804)

~~~
hashmap
They manage to do it (just sound, not keypresses) with a 60fps DSLR camera at
the end by examining the rows of the video - does this mean that for any
videos already in existence, sound can be decoded from the images?

~~~
aaaaaaaaaab
Not for handheld videos.

~~~
pbhjpbhj
Wouldn't you be able to filter out the camera motion, as long as the movement
due to sound was fixed relative to something in frame?

~~~
aaaaaaaaaab
I would say the motion blur and the rolling shutter effect would overpower the
subtle vibrations due to audio.

~~~
kibibu
Rolling shutter is exactly why this works.

It's not a complicated paper, give it a go.

------
pbhjpbhj
If someone hunts-and-pecks how close can you get with gaze tracking?

------
0db532a0
Could you maybe do away with training beforehand on a single source and
instead use multiple, very sensitive microphones and triangulate the locations
of the keys being pressed? The estimations might not be accurate, but you
could put the results through a smartphone typo correction algorithm.
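
A minimal sketch of that idea, assuming the microphone positions and a 2D keyboard layout are known and that arrival-time differences between microphones can be measured. The coordinates, key positions, and function names below are all hypothetical; instead of solving the geometry directly, it simply picks the key whose predicted time differences best match the measurement.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def arrival_times(source, mics):
    """Time for sound to travel from `source` to each microphone."""
    return [math.dist(source, m) / SPEED_OF_SOUND for m in mics]


def time_differences(source, mics):
    """Arrival times relative to the first microphone."""
    t = arrival_times(source, mics)
    return [ti - t[0] for ti in t[1:]]


def locate_key(measured, mics, key_positions):
    """Return the key whose predicted time differences fit `measured` best."""
    best_key, best_err = None, float("inf")
    for key, pos in key_positions.items():
        predicted = time_differences(pos, mics)
        err = sum((p - m) ** 2 for p, m in zip(predicted, measured))
        if err < best_err:
            best_key, best_err = key, err
    return best_key


# Hypothetical setup: three mics and three keys, coordinates in meters.
mics = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.3)]
keys = {"a": (0.05, 0.10), "s": (0.07, 0.10), "l": (0.22, 0.10)}

# Simulate a noiseless press of "s" and recover it.
guess = locate_key(time_differences(keys["s"], mics), mics, keys)
```

Neighboring keys are only about 2 cm apart, which corresponds to timing differences of tens of microseconds, so real measurements would be noisy. That is where the typo-correction step would come in.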

------
syntaxing
This is super cool. Very similar to timing attacks. I wonder if there is a way
to tune the model. Essentially, tailor the probability of each key to a person,
maybe using a sentence or so of their typing.

~~~
sdenton4
Yeah, this is exactly the sort of thing where augmenting the base model with a
language model will have some good returns...

~~~
ggerganov
In keytap2 I'm trying to make use of the statistical distribution of n-grams
in the language. The idea is to first group the unknown keys into clusters
based on how similar they sound. The prediction then is performed by breaking
the obtained substitution cypher (assuming each cluster corresponds to a
letter).
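
To illustrate the substitution-cipher step, here is a simplified sketch. It is not keytap2's code: it assumes the clustering has already produced one cluster id per keystroke, and it only matches single-letter frequency ranks, which would at best give a starting guess to refine with n-gram statistics. All names are hypothetical.

```python
from collections import Counter

# English letters from roughly most to least frequent. The exact order
# varies by corpus; this is one commonly quoted ordering.
ENGLISH_BY_FREQUENCY = "etaoinshrdlcumwfgypbvkjxqz"


def initial_guess(cluster_ids):
    """Map each cluster id to a letter by matching frequency ranks."""
    ranked = [cid for cid, _ in Counter(cluster_ids).most_common()]
    return {cid: ENGLISH_BY_FREQUENCY[i] for i, cid in enumerate(ranked[:26])}


def decode(cluster_ids, mapping):
    """Apply a cluster-to-letter mapping; unknown clusters become '?'."""
    return "".join(mapping.get(cid, "?") for cid in cluster_ids)


# Toy input: cluster 7 is most common, then 2, then 5.
clusters = [7, 2, 7, 7, 5, 2]
mapping = initial_guess(clusters)
text = decode(clusters, mapping)
```

With a realistic amount of typing, the next step would be to swap pairs of letters in the mapping and keep the swaps that make the decoded text score better against n-gram frequencies.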

------
sentrysapper
Can you combine this with, or does it already use, the relative position of
the microphone? Seems like a good way to map where keys are (lower decibel
levels would indicate keys further from the mic).
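
As a back-of-the-envelope sketch of the decibel idea: under the idealized assumption of a point source in a free field (no reflections, every key equally loud), sound pressure level drops by about 6 dB per doubling of distance. The function name is hypothetical, and a real keyboard on a desk will not follow this cleanly.

```python
def distance_ratio(level_near_db, level_far_db):
    """How many times further away the quieter key is, assuming the
    free-field inverse-distance law: L1 - L2 = 20 * log10(d2 / d1)."""
    return 10 ** ((level_near_db - level_far_db) / 20)


# A key measured 6 dB quieter is roughly twice as far from the mic.
ratio = distance_ratio(60.0, 54.0)
```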

------
ppod
There is more research on this than what you have cited here. Unfortunately,
the literature goes under the catchy keyword "acoustic keyboard emanations";
try that in Google Scholar.

------
rapnie
Primary use case is more tracking and surveillance, I presume?

~~~
IIAOPSW
Primary use case is hacking. A keylogger without any software or trace on the
target computer.

~~~
spockz
This is another reason why you want 2FA.

------
danschumann
Someone has seen the movie Sneakers! The blind guy did this in that movie.

~~~
hnmonkey
In what scene? I don't think he actually did... I'm pretty sure when they're
trying to crack the guy's password they're using video to record and zoom in
to watch him and he's blocking it.

------
frakr
definitely keep buying mechanical keyboards....

