Timing Analysis of Keystrokes and Timing Attacks on SSH (2001) [pdf] (berkeley.edu)
36 points by feross 4 months ago | 17 comments

Anyone interested in this will likely enjoy Don't Skype & Type![1], where researchers decoded keystrokes from the background audio of Skype conversations. The best part is that the source code is available[2]. I wonder how many people are applying this to Twitch streamers or YouTube videos (especially any Talks at Google videos) today?

1. http://spritz.math.unipd.it/projects/dst/
2. https://github.com/SPRITZ-Research-Group/Skype-Type

I've been working on a similar tool: https://github.com/ggerganov/kbd-audio (see the 'keytap' tool). Not sure how well it works yet, as I have only tested it with my own setup. There is a live page that anyone can experiment with.

This is incredible work. Very well done! Have you posted this to Show HN yet?

Thanks! I haven't posted it yet, as I want to test how reliable the approach is. I know it does not work with non-mechanical keyboards at all (most likely because the key sounds are quiet), and I have tested it only with my mechanical keyboard.

Then I must apologize as I've already posted it to Twitter where it's getting a lot of really positive reactions: https://mobile.twitter.com/feross/status/1068038193868460032

It worked pretty well on my keyboard! Let me know if you'd prefer me to delete the tweet.

Wow! Thanks so much!

Please, don't delete the tweet. Hopefully I can collect feedback. Thanks again!

What kind of keyboard do you have?

That work is great. A related paper covers earlier work in which traffic analysis of Skype calls let the researchers extract individual phonemes and then reconstruct speech from them. It's one of my favorite papers. It's titled "Phonotactic Reconstruction of Encrypted VoIP Conversations" and you can find it here: http://www.cs.unc.edu/~fabian/papers/foniks-oak11.pdf

Voice is in a way the easy case, because we know the antidote. Constant Bitrate (CBR) mode of an audio codec consumes the same amount of bandwidth regardless of what is transmitted, which is inefficient but secure. As I understand it Signal's voice chat is Opus in CBR mode.
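A minimal sketch of why CBR helps, assuming a toy codec whose frame sizes are made up for illustration (these are not Opus's real numbers). With variable bitrate, the length of each encrypted packet tracks the sound being encoded; padding every frame to one fixed size leaves an observer with nothing to tell frames apart.

```python
# Toy model: hypothetical encoded frame sizes (bytes) for three kinds of sound.
# The numbers are illustrative assumptions, not measurements of any real codec.
VBR_FRAME_SIZE = {"silence": 10, "vowel": 48, "fricative": 31}
CBR_FRAME_SIZE = 64  # assumed fixed size, at least as large as any VBR frame


def observed_lengths_vbr(sounds):
    """Packet lengths an eavesdropper sees when frames keep their natural size."""
    return [VBR_FRAME_SIZE[s] for s in sounds]


def observed_lengths_cbr(sounds):
    """Packet lengths when every frame is padded to the same fixed size."""
    return [CBR_FRAME_SIZE for _ in sounds]


def guess_sounds(lengths):
    """Invert the length -> sound mapping wherever a length is recognised."""
    by_size = {size: sound for sound, size in VBR_FRAME_SIZE.items()}
    return [by_size.get(n, "?") for n in lengths]


utterance = ["silence", "vowel", "fricative", "vowel"]
print(guess_sounds(observed_lengths_vbr(utterance)))  # recovers the sequence
print(guess_sounds(observed_lengths_cbr(utterance)))  # every entry is "?"
```

The cost is visible in the constants: every frame pays for the largest one, which is the inefficiency mentioned above.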

Other scenarios are trickier and may need custom work. For example, Encrypted SNI currently requires a host to pick a maximum name length. The encrypted name may be any of the names configured on the host, and it is padded to that maximum so that an adversary can't guess which name was sent from its length.
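The padding idea can be sketched roughly as follows. This is not the Encrypted SNI wire format; `pad_name`, the zero-byte padding, and the host names are hypothetical and only show that every padded name ends up the same length.

```python
def pad_name(name: str, max_len: int) -> bytes:
    """Pad an encoded host name with zero bytes up to max_len (illustrative only)."""
    raw = name.encode("ascii")
    if len(raw) > max_len:
        raise ValueError("name longer than the configured maximum")
    return raw + b"\x00" * (max_len - len(raw))


# Hypothetical names served by one host; the maximum is chosen to fit the longest.
names = ["a.example", "shop.example", "very-long-name.example"]
max_len = max(len(n) for n in names)
padded = [pad_name(n, max_len) for n in names]

# All padded names have the same length, so length alone reveals nothing.
print({len(p) for p in padded})
```

The padding would be applied before encryption, so the ciphertext length is the same whichever name is inside.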

Because we don't have a general solution, TLS 1.3 defines a zero-overhead optional padding mechanism: you can add extra bytes of padding to any encrypted TLS record, but neither TLS itself nor the HTTPS binding defines a "good" way to use this padding to shield users from size-based analysis of content, because no general solution is known.

I haven't been able to recreate this work from the repos. I emailed them too, about a year ago, with no response.

Have you had any success? If so, would you be willing to share?

It sounded too good to be true, then I found the caveat: a neural net must be trained on a specific keyboard. Still pretty cool, though.

Man, this paper is a classic. I love traffic analysis attacks like this. I did something similar myself six years back, albeit with a somewhat contrived example: figuring out what someone is looking at on Google Maps from request and response sizes. Example video here: https://www.youtube.com/watch?v=skQNwd9Jij4 and https://ioactive.com/ssl-traffic-analysis-on-google-maps/.

I am speculating that nice traffic analysis attacks can be done on mosh (which is a great tool, btw) too, similar to the one in the paper in this thread. It's been on my "todo/research" list, but I haven't been able to sit down for a few days and mess around with it. And I'm sure that QUIC (HTTP/3) will open up some interesting avenues of attack here too.

It should be fairly straightforward to defeat such analysis using ssh compression. Keystrokes and control signals are batched before transmission.

Certainly I've had 'ssh -C' in finger-memory even on LANs for well over a decade.

Sure. But that's why my point was about mosh (https://mosh.org/). It uses TCP+SSH only for the authentication part, then sets up an encrypted UDP tunnel between the server side and the mosh-client, which then just send AES-256-GCM packets back and forth over UDP. To the best of my knowledge it doesn't batch anything.

And compression definitely doesn't always help: some of the attacks on TLS were only possible because compression happened before encryption. That is why we ended up with HPACK in HTTP/2, to prevent exactly this type of attack.
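A rough sketch of the compress-then-encrypt leak, using Python's `zlib`. The secret and the guesses are made up, and real attacks of this kind refine a guess one byte at a time; here a full match is compared with a miss to keep the effect obvious. Encryption is left out because it hides the content but not the length.

```python
import zlib

# Hypothetical secret that shares a compression context with attacker-controlled text.
SECRET = b"sessionid=7f3a9c2e1b5d4086a1c2e3f4"


def observed_length(attacker_text: bytes) -> int:
    """Length of the compressed message, which is what an eavesdropper can measure."""
    return len(zlib.compress(SECRET + b"\n" + attacker_text))


# A guess that repeats the secret compresses better than one that does not.
right = observed_length(b"sessionid=7f3a9c2e1b5d4086a1c2e3f4")
wrong = observed_length(b"sessionid=q8w2z5k0m3v6y1p4t7r9u0o2")
print(right, wrong)
```

The correct guess yields the shorter output because the compressor replaces the repeated secret with a back-reference.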

Mosh does include random chaff and (some) timing variation and batching in an effort to weakly frustrate these kinds of keystroke information leakages -- my understanding is that we are at least as strong as SSH in this area, but would love to see any analysis either way. (We have a "frame rate" with a minimum interval in both directions, and a SEND_MINDELAY collection interval. The current values are chosen for performance and minimizing tiny packets, but could be increased or randomized.)
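To illustrate what a collection interval does to keystroke timing, here is a toy model. It is not Mosh's actual scheduler: `batch_send_times`, the keystroke times, and the 50 ms interval are all assumptions made for the example.

```python
def batch_send_times(key_times, collect_interval):
    """Group keystrokes into packets.

    The first keystroke after an idle period opens a collection window;
    everything typed before the window closes goes out in one packet at
    window_start + collect_interval.
    Returns a list of (send_time, number_of_keystrokes).
    """
    packets = []
    window_end = None
    count = 0
    for t in sorted(key_times):
        if window_end is not None and t <= window_end:
            count += 1  # still inside the open window
            continue
        if window_end is not None:
            packets.append((window_end, count))  # flush the previous window
        window_end = t + collect_interval
        count = 1
    if window_end is not None:
        packets.append((window_end, count))
    return packets


# Keystroke times in milliseconds (made up); the interval is a hypothetical value.
keys = [0, 30, 45, 200, 210, 500]
print(batch_send_times(keys, collect_interval=50))
# → [(50, 3), (250, 2), (550, 1)]
```

Six keystrokes become three packets, so the gaps inside each window are no longer visible on the wire. The start of each burst still is, which is why a longer or randomized interval would hide more.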

If necessary (or maybe in some optional supersecure mode), Mosh can afford to do much more timing variation, or even a "line-at-a-time" mode, since the client can be more aggressive about showing the predictive local echo (with the ability to correct it later) while waiting to send batches of keystrokes and for the server's reply. Or we could just do a CBR mode.

(BTW Mosh uses AES-128-OCB, not AES-256-GCM.)

Oh wow, thanks Keith, first of all for mosh! I've been using it daily for several years now. It's been great! Second of all for clarifying and correcting me regarding the algorithm-usage. I don't know where I got it from and I must have just misremembered; it's been a while since I spent a bit of time looking at the source code (and walking away very impressed).

It's really amazing how you can use the intervals between packets to infer passwords like that. I thought I'd seen an attack like this that used TCP timestamps; I didn't know you could do this from the packet arrival times themselves!
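A simplified sketch of the idea, assuming each key pair has a roughly Gaussian typing latency. The model parameters below are invented for illustration, not taken from the paper, and a real attack would combine many gaps across a whole password rather than score one gap on its own.

```python
import math

# Hypothetical per-key-pair latency models: (mean, standard deviation) in ms.
# The numbers are invented for illustration.
LATENCY_MODEL = {
    ("a", "s"): (95.0, 20.0),
    ("a", "p"): (140.0, 25.0),
    ("a", "1"): (210.0, 30.0),
}


def inter_arrival(arrival_times):
    """Gaps between consecutive packet arrival times."""
    return [b - a for a, b in zip(arrival_times, arrival_times[1:])]


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def rank_pairs(latency):
    """Key pairs ordered from most to least likely for one observed gap."""
    return sorted(
        LATENCY_MODEL,
        key=lambda pair: gaussian_pdf(latency, *LATENCY_MODEL[pair]),
        reverse=True,
    )


gaps = inter_arrival([1000.0, 1205.0])  # two packets seen 205 ms apart
print(rank_pairs(gaps[0]))
```

Nothing here needs TCP timestamps: the observer's own clock on packet arrival is enough, as long as each keystroke goes out in its own packet.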

I wrote a little program to transmit data via packet intervals. I need to play with adding error correction to it now - https://www.anfractuosity.com/projects/timeshifter/
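For anyone curious what a timing channel looks like in the abstract, here is a minimal sketch. It is not how timeshifter is implemented; the delay values, the threshold, and the fixed jitter pattern are assumptions chosen to keep the example deterministic.

```python
def encode_bits(bits, short=0.05, long=0.15):
    """Map each bit to a delay (seconds) before the next packet: 0 -> short, 1 -> long."""
    return [long if b else short for b in bits]


def decode_delays(delays, threshold=0.10):
    """Recover bits by thresholding the observed inter-packet gaps."""
    return [1 if d > threshold else 0 for d in delays]


message = [1, 0, 1, 1, 0, 0, 1, 0]
delays = encode_bits(message)

# Simulate mild network jitter with a fixed, made-up perturbation pattern.
jitter = [0.01, -0.02, 0.00, 0.02, -0.01, 0.01, -0.02, 0.02]
received = [d + j for d, j in zip(delays, jitter)]

print(decode_delays(received) == message)  # → True
```

Once jitter gets close to the gap between the short and long delays, bits start to flip, which is where error correction would come in.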
