
Hijacking YouTube to transmit your data - adamnemecek
https://banmeihack.wordpress.com/2016/07/26/hijacking-youtube-to-transmit-your-data/
======
nnq
> This is a fundamental hole in security with no logical workaround.

There is no hole in anything. You're not violating anyone's privacy or
stealing anything from anyone. Even the bandwidth is given to you for free.
It's how things are supposed to work.

You're just exercising your right to privacy by using such a thing.

One can tolerate someone re-inventing/re-discovering steno and making it sound
like it's smth new... but not someone having _no f idea whatsoever_ of what
"security" means and what his "right to privacy" is... ffs!

~~~
codeulike
I think he's talking about situations where a network blocks certain things,
and how steganography allows you to bypass that.

edit: stegano, not steno

~~~
seanhunter
I think you guys mean 'steganography' rather than 'stenography'.

[https://en.wikipedia.org/wiki/Steganography](https://en.wikipedia.org/wiki/Steganography)

~~~
codeulike
haha oops

------
chriswarbo
[https://en.wikipedia.org/wiki/Steganography](https://en.wikipedia.org/wiki/Steganography)

~~~
Animats
Right. The article author seems to have re-invented steganography.

The hard problem is finding a way to encode data in video in a way that will
survive recompression, resizing, or other video processing. The watermarking
people have struggled with this for years. There are various spread-spectrum
like schemes with good noise immunity that can do this.
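
For a feel of what a spread-spectrum scheme looks like, here is a minimal sketch. It is not any particular watermarking product's method. The zero-mean host signal, the chip count per bit, and the `strength` value are all assumptions chosen for illustration. Each bit is smeared over many samples using a shared pseudo-random +/-1 sequence and recovered by correlation, which is why moderate noise does not erase it.

```python
import random

def chips(seed, n):
    """Pseudo-random +/-1 chip sequence shared by sender and receiver."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(samples, bits, seed, strength=2.0):
    """Spread each bit over len(samples)//len(bits) samples of the host signal."""
    span = len(samples) // len(bits)
    code = chips(seed, span * len(bits))
    out = list(samples)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(i * span, (i + 1) * span):
            out[j] += sign * strength * code[j]
    return out

def extract(samples, nbits, seed):
    """Correlate each span with the chip sequence; the sign gives the bit."""
    span = len(samples) // nbits
    code = chips(seed, span * nbits)
    bits = []
    for i in range(nbits):
        corr = sum(samples[j] * code[j] for j in range(i * span, (i + 1) * span))
        bits.append(1 if corr > 0 else 0)
    return bits

# Assumed toy setup: zero-mean host "signal", 2000 chips per bit.
rng = random.Random(1)
message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
host = [rng.gauss(0, 10) for _ in range(len(message) * 2000)]
marked = embed(host, message, seed=42)
# Crude stand-in for processing damage: additive noise plus rounding.
noisy = [round(s + rng.gauss(0, 5)) for s in marked]
recovered = extract(noisy, len(message), seed=42)
print(recovered == message)
```

The price is rate: 2000 samples carry a single bit here, which is the usual trade between robustness and capacity.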

YouTube has an ongoing battle between the copyright-infringement
identification system and versions of audio and video modified to evade it.

~~~
visarga
Can't they simply use the least significant bit from each color channel to
carry data? I think a single bit flip would change an 8-bit channel value by at
most 1/255 of its range, undetectable to the eye. Of course, use compression,
encryption and redundancy too.

~~~
Animats
That won't survive video compression.
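
A rough sketch of both halves of this exchange: least-significant-bit embedding, and why a lossy step wipes it out. The `crude_quantize` function is only an assumed stand-in for a real codec (real encoders quantize DCT coefficients, not raw pixel values), and the pixel values are made up.

```python
def embed_lsb(pixels, bits):
    """Overwrite the least significant bit of each 8-bit channel value."""
    if len(bits) > len(pixels):
        raise ValueError("payload does not fit in the cover data")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return out

def extract_lsb(pixels, nbits):
    """Read back the least significant bit of the first nbits values."""
    return [p & 1 for p in pixels[:nbits]]

def crude_quantize(pixels, step=8):
    """Assumed stand-in for lossy compression: snap values to multiples of step."""
    return [min(255, round(p / step) * step) for p in pixels]

cover = [10, 200, 33, 47, 128, 90, 61, 77]   # made-up channel values
payload = [1, 0, 1, 1, 0, 1, 0, 0]

stego = embed_lsb(cover, payload)
print(extract_lsb(stego, len(payload)) == payload)

damaged = crude_quantize(stego)
print(extract_lsb(damaged, len(payload)) == payload)
```

Every value changes by at most 1 when embedding, but after quantization all values are multiples of 8, so every low bit reads as 0 and the payload is gone.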

------
jwatte
Google "steganography."

You can significantly increase the bit rate. For example, overlay a QR code
over each of four consecutive frames. You can do this without frame dropping.
You only need to add about 6dB of the code for this to be recoverable.
Similarly, if you know how the codec works, you can exploit that. (Your
proposed method is actually pessimal for a modern B frame codec!)

Then there is hiding modem transmissions in techno music sound tracks ;-)

------
0xmohit
One could achieve a similar thing using PDF files by utilizing a feature
called "File Attachments" [0], [1].

There is nothing _insecure_ about it either.

You wouldn't even need commercial software for it, and could use TeX and friends
[2] to achieve the same.

[0] [https://blogs.adobe.com/insidepdf/2010/11/pdf-file-attachmen...](https://blogs.adobe.com/insidepdf/2010/11/pdf-file-attachments.html)

[1]
[https://wwwimages2.adobe.com/content/dam/Adobe/en/devnet/pdf...](https://wwwimages2.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/adobe_supplement_iso32000.pdf#G9.1519247)

[2] [http://tex.stackexchange.com/questions/208012/attaching-file...](http://tex.stackexchange.com/questions/208012/attaching-files-using-plain-tex-pdftex)

------
knorker
Sigh.

YouTube deliberately has the "upload a video" feature. It's not a mistake.
It's not a security hole.

Also see this misguided and confused soul:

[http://seclists.org/fulldisclosure/2014/Mar/123](http://seclists.org/fulldisclosure/2014/Mar/123)

------
voltagex_
I wonder how difficult it would be to find the optimal "storage" method for
data within YouTube videos.

On the video side, you're dealing with at least VP9 and H264 which I'm
assuming "destroy" your data somewhat in the encoding process. The audio side
is Opus and AAC, with similar challenges.

~~~
niftich
H.264, VP9, etc., are all macroblock-based DCT codecs with I-frames that
contain a full still image (like a JPEG), and other kinds of frames that
contain instructions in terms of motion vectors of how those macroblocks move
around. They also use colorspace transforms and color subsampling, so they
intentionally sacrifice some color accuracy.

But writing a data stream into a 2D still image in a way that can be decoded
later is a solved problem, ie. 2D barcodes like DataMatrix, QR Code, and
Microsoft Tag (which has up to 8 colors to further increase data density).
These formats have built-in error correction that compensates for some missing
blocks. However, we can tune the format to be closer to the video codec's
internal structure, to make them play nicer together.

For example, we can set each barcode block to be within 50% to 100% of pixel
size of the video's macroblock, to make it more likely that the video codec
can reuse the macroblock with motion vectors in a P/B-frame, instead of having
to put more bits to it, or have it accidentally mangle it.
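
A minimal sketch of the block-aligned idea, assuming a 16-pixel macroblock, a grayscale frame held as a list of rows, and two colors only. `BLOCK`, `render_frame`, and `read_frame` are illustrative names, not part of any barcode standard. Each bit becomes one solid block, and the reader averages the whole block, so per-pixel noise mostly cancels out.

```python
import random

BLOCK = 16  # assumed macroblock size in pixels

def render_frame(bits, cols):
    """Draw bits as solid BLOCK x BLOCK squares (255 for 1, 0 for 0)."""
    rows = -(-len(bits) // cols)  # ceiling division
    padded = list(bits) + [0] * (rows * cols - len(bits))
    frame = []
    for r in range(rows):
        line = []
        for c in range(cols):
            line.extend([255 if padded[r * cols + c] else 0] * BLOCK)
        frame.extend([list(line) for _ in range(BLOCK)])
    return frame

def read_frame(frame, nbits, cols):
    """Average each block and threshold at mid-gray."""
    bits = []
    for i in range(nbits):
        r, c = divmod(i, cols)
        total = sum(frame[y][x]
                    for y in range(r * BLOCK, (r + 1) * BLOCK)
                    for x in range(c * BLOCK, (c + 1) * BLOCK))
        bits.append(1 if total / (BLOCK * BLOCK) > 127.5 else 0)
    return bits

def add_noise(frame, sigma, seed=0):
    """Assumed stand-in for codec damage: clipped Gaussian noise per pixel."""
    rng = random.Random(seed)
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in frame]

data = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0,
        1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1]
frame = render_frame(data, cols=6)
print(read_frame(add_noise(frame, sigma=40), len(data), cols=6) == data)
```

Real codec artifacts are not independent per-pixel noise, so this only shows the averaging effect, not how an actual H.264 or VP9 encode would treat the frame.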

Realistically, we can also increase our color palette, as we're not going to
be scanning these barcodes in bad light conditions -- all we need to do is get
the color mostly right. But the more we increase the palette, the less video
codec can reuse blocks; so this is something we'll want to experiment with.

The biggest problem for the barcode approach comes from the addition of the
3rd, temporal dimension. We can have each frame form its own independently
scannable barcode, but doing so, we'll want to build in some temporal
redundancy, ie. have a chunk of data, or error correction for said chunk, be
present in more than one frame -- to protect against occasional frame drops,
very inconvenient frame drops (like when you lose an I-frame and the video is
grey- or green-blocky for several more frames), and offer some extra
protection against "normal" decoding errors.
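
The simplest form of that temporal redundancy is one XOR parity frame per group, sketched below. This is an assumed scheme for illustration; it repairs exactly one lost frame per group and is far weaker than a proper erasure code. The chunk contents and group size are made up.

```python
from functools import reduce

def xor_chunks(chunks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def add_parity(chunks):
    """Return the data chunks followed by one XOR parity chunk."""
    return list(chunks) + [xor_chunks(chunks)]

def recover(group):
    """Rebuild at most one missing chunk (marked None); return the data chunks."""
    missing = [i for i, chunk in enumerate(group) if chunk is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can repair only one lost frame per group")
    group = list(group)
    if missing:
        present = [chunk for chunk in group if chunk is not None]
        group[missing[0]] = xor_chunks(present)
    return group[:-1]  # drop the parity chunk

frames = [b"abcd", b"efgh", b"ijkl"]   # one payload chunk per video frame
sent = add_parity(frames)
sent[1] = None                          # simulate a dropped frame
print(recover(sent) == frames)
```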

By the way, there are existing implementations of this concept:

[1] [http://thruglassxfer.com/](http://thruglassxfer.com/)

[2] Demo of above:
[https://www.youtube.com/watch?v=2_8GlFdlb0Y](https://www.youtube.com/watch?v=2_8GlFdlb0Y)

[3] Same idea, some hackable code: [https://github.com/Neohapsis/QRCode-Video-Data-Exfiltration](https://github.com/Neohapsis/QRCode-Video-Data-Exfiltration)

~~~
swiley
2D barcodes solve a slightly different problem and end up wasting a lot of
space on two things:

1 you don't need a header for every frame but these barcodes do.

2 (for QR this is the worst one) there is a lot of space wasted to help detect
and correct perspective distortion.

------
awesomepantsm
This makes no sense. If you have the ability to run software to decode the
youtube video, then let's be honest, what is actually stopping you from just
using Tor browser, or a proxy site to get to your content? Or just a USB stick
with whatever data that you downloaded elsewhere?

------
brian-armstrong
You can generate .wav files with all sorts of modulation methods centered at
the frequency of your choice with transmission rates measured in kbps with
[https://github.com/quiet/quiet](https://github.com/quiet/quiet) which
provides a wav file encoder. You could then just add this wav on top of your
video.

And if you're really feeling adventurous libquiet provides floating point
output that can be put into any channel like video if you're willing to plumb
it in there.

</plug>

~~~
nitrogen
How well does the modulation scheme used survive MP3/AAC/Vorbis/Opus encoding?

------
0x0
I was kind of expecting to see a live stream with white noise "modem"
audio/video.. :)

~~~
carey
I thought it might at least be something like the SSTV messages in the Portal
ARG, which sounds a lot like modem noise.

------
melle
Another example of steganography can be found in songs by Aphex Twin, e.g.
Windowlicker
([https://en.wikipedia.org/wiki/Windowlicker](https://en.wikipedia.org/wiki/Windowlicker))

~~~
bcook
He also put his own face into one song.

[http://www.bastwood.com/?page_id=10](http://www.bastwood.com/?page_id=10)

------
rasz_pl
>for a 30 frames/second video, a 15 bit/s transfer rate is obtained.

~~~
aji
yeah, somehow I feel this is less than optimal

~~~
eximius
clearly. It is trivially improvable by simply adding more sections to the
video. Or not caring about the original video.
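
To put numbers on it, here is a back-of-the-envelope sketch. It assumes the quoted 15 bit/s comes from one bit per pair of frames at 30 frames/second, and it treats "more sections" as independent regions that each carry one bit per frame pair. The `regions` parameter is hypothetical.

```python
def bits_per_second(fps, frames_per_bit=2, regions=1):
    """Raw channel rate: each region carries one bit per frames_per_bit frames."""
    return fps / frames_per_bit * regions

def transfer_seconds(payload_bytes, rate_bps):
    """Time to send a payload at the given raw bit rate (no error correction)."""
    return payload_bytes * 8 / rate_bps

base = bits_per_second(30)               # the quoted case
tiled = bits_per_second(30, regions=64)  # hypothetical 8x8 grid of sections
print(base, transfer_seconds(1_000_000, base) / 86400)   # rate, days per MB
print(tiled, transfer_seconds(1_000_000, tiled) / 3600)  # rate, hours per MB
```

At the quoted rate a single megabyte takes about six days of video, which is why the replies look for more sections or denser encodings.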

------
flashman
> Replace every even frame with a copy of the subsequent odd frame

I think this is supposed to be _previous_ odd frame, given that 1 2 3 4 5 6 7
8 9 10 becomes 1 1 3 3 5 5 7 7 9 9.
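
A tiny sketch of the "previous odd frame" reading, using frame numbers in place of real frames:

```python
def duplicate_odd_frames(frames):
    """Replace each even-numbered frame (1-based) with the odd frame before it."""
    return [frames[i - (i % 2)] for i in range(len(frames))]

print(duplicate_odd_frames(list(range(1, 11))))
# [1, 1, 3, 3, 5, 5, 7, 7, 9, 9]
```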

------
ryanmarsh
Couldn't this be done with two live streams for TX/RX? effectively creating a
VPN? As the author said, there are plenty of modulation techniques possible.
Surely some much faster ones could be used.

The biggest downside I could think of would be the lag: data > render frames >
encode frames > network > decode stream > render frames > scrape data

~~~
hoffcoder
I think the author has not thought of frame reordering in error scenarios. If
UDP is used, frames could even get dropped in the middle, and in case of TCP,
frames could arrive out of order. That would wreak havoc in the odd-odd
numbering sequence of the encoding.

~~~
visarga
Tornado codes to the rescue, then.

[https://en.wikipedia.org/wiki/Tornado_code](https://en.wikipedia.org/wiki/Tornado_code)

------
visarga
Maybe this can be used to distribute tracker IPs / seed information for p2p
networks, eliminating the need for a root server.

~~~
megablast
They can encode the magnet number inside trailers for the actual films the
number represents.

------
masukomi
> Once they identify videos that might contain encrypted data, they can then
> begin to work on decrypting that data. The amount of video data on the
> internet is massive and it is growing at an exponential rate (the zeta-bytes
> of data they would have to sift through, I cant even imagine the headache).

um. they already do that. They scan all the uploaded videos for copyrighted
audio, fingerprinting and comparing the uploaded audio of a bajillion videos
against 1/2 a bajillion songs.

------
joebergeron
While this is little more than simple steganography, I'd be curious to see
what kinds of encrypted data size / video size ratios are achievable, perhaps
using some more nuanced techniques or approaches.

It's definitely an interesting idea, but it's really nothing new. I remember a
few years ago reading about people hiding compressed .zip archives inside of
jpegs or something like that.
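
The trick I have in mind works because image viewers generally stop at the image's end marker while ZIP readers locate the archive from the end of the file. A minimal sketch with Python's standard `zipfile` module follows. The "image" bytes are a placeholder rather than a valid JPEG, and the helper names are made up.

```python
import io
import zipfile

def build_zip(files):
    """Pack a {name: bytes} mapping into an in-memory ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

def hide_in_image(image_bytes, files):
    """Append a ZIP archive after the image data."""
    return image_bytes + build_zip(files)

def read_hidden(polyglot):
    """Open the combined bytes as a ZIP and return its contents."""
    with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

# Placeholder bytes with JPEG start/end markers; NOT a real image.
fake_jpeg = b"\xff\xd8\xff\xe0" + b"\x00" * 32 + b"\xff\xd9"
combined = hide_in_image(fake_jpeg, {"secret.txt": b"hello from inside"})
print(combined.startswith(b"\xff\xd8"), read_hidden(combined))
```

Note this is concealment by file layout only. Anything that re-encodes the image would discard the appended archive, and nothing here is encrypted.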

------
roddux
I wonder how much data you could reliably transmit without the video/audio
quality of the base video notably changing.

Does YouTube cut out audio frequencies that are beyond the hearing range?

------
libeclipse
Hmm. The word encryption is pretty loosely used.

~~~
brian-armstrong
Well, if you have a mechanism for sending data you can always encrypt on top

~~~
cellularmitosis
Sure, but that doesn't change the fact that the author is confusing the terms
"encrypt" and "encode"

------
palakchokshi
he probably encoded HODOR HODOR HODOR multiple times in that video at the end.

