
Phonotactic Reconstruction of Encrypted VoIP Conversations [pdf] - js2
http://www.cs.unc.edu/~fabian/papers/foniks-oak11.pdf
======
js2
_Abstract — In this work, we unveil new privacy threats against Voice-over-IP
(VoIP) communications. Although prior work has shown that the interaction of
variable bit-rate codecs and length-preserving stream ciphers leaks
information, we show that the threat is more serious than previously thought.
In particular, we derive approximate transcripts of encrypted VoIP
conversations by segmenting an observed packet stream into subsequences
representing individual phonemes and classifying those subsequences by the
phonemes they encode. Drawing on insights from the computational linguistics
and speech recognition communities, we apply novel techniques for unmasking
parts of the conversation. We believe our ability to do so underscores the
importance of designing secure (yet efficient) ways to protect the
confidentiality of VoIP conversations._

via
[https://news.ycombinator.com/item?id=11994286](https://news.ycombinator.com/item?id=11994286)

This is pretty fascinating... the gist is that they were able to recover an
approximate transcript of the conversation from an encrypted VoIP stream, using
only the lengths of the encrypted packets and knowledge of the underlying voice
encoding:

 _The approach we pursue in this paper leverages the correlation between
voiced sounds and the size of encrypted packets observed over the wire.
Specifically, we show that one can segment a sequence of packet sizes into
subsequences corresponding to individual phonemes and then classify these
subsequences by the specific phonemes they represent. We then show that one
can segment such a phonetic transcript on word boundaries to recover
subsequences of phonemes corresponding to individual words and map those
subsequences to words, thereby providing a hypothesized transcript of the
conversation._
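
To make the leak concrete, here is a toy sketch of the first two steps the quote describes (segmenting packet sizes, then classifying the segments). It is not the paper's method: the frame sizes in `FRAME_SIZE`, the three sound classes, the XOR "cipher", and the exact-match segmentation are all made-up assumptions for illustration. The paper works with real codec output and statistical models, not a lookup table. The only point is that a length-preserving cipher hides the bytes but not the frame sizes chosen by a variable bit-rate codec.

```python
import os

# Hypothetical frame sizes in bytes that a variable bit-rate codec might
# pick for each sound class. These numbers are invented for this sketch.
FRAME_SIZE = {"silence": 10, "fricative": 20, "vowel": 38}
SIZE_TO_CLASS = {size: name for name, size in FRAME_SIZE.items()}


def encode(sound_classes):
    """Toy VBR encoder: one frame per input, sized by its sound class."""
    return [os.urandom(FRAME_SIZE[name]) for name in sound_classes]


def encrypt(frame):
    """Toy length-preserving cipher: XOR with a random keystream."""
    keystream = os.urandom(len(frame))
    return bytes(a ^ b for a, b in zip(frame, keystream))


def segment(lengths):
    """Group consecutive equal packet lengths into (length, count) runs."""
    runs = []
    for n in lengths:
        if runs and runs[-1][0] == n:
            runs[-1] = (n, runs[-1][1] + 1)
        else:
            runs.append((n, 1))
    return runs


def classify(runs):
    """Map each run back to a sound class using only its packet length."""
    return [SIZE_TO_CLASS[size] for size, _ in runs]


# What was "said", frame by frame.
spoken = ["silence"] * 2 + ["fricative"] * 3 + ["vowel"] * 4 + ["silence"]

# What goes over the wire: ciphertext whose length equals the frame length.
packets = [encrypt(frame) for frame in encode(spoken)]

# What an eavesdropper sees and infers without any key material.
lengths = [len(p) for p in packets]
runs = segment(lengths)
recovered = classify(runs)
print(lengths)
print(recovered)
```

In this toy setup the eavesdropper recovers the sequence of sound classes exactly, because each class maps to a unique size. Padding every packet to a fixed length would make all entries in `lengths` identical and remove the signal, at the cost of the bandwidth savings that variable bit-rate encoding was meant to provide.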

