
How to defeat naive image steganography - jstanley
http://incoherency.co.uk/blog/stories/image-steganography.html
======
leni536
There is a less naive steganography tool in the Debian repos named steghide.
According to the manual:

    Steghide uses a graph-theoretic approach to steganography. You do not
    need to know anything about graph theory to use steghide and you can
    safely skip the rest of this paragraph if you are not interested in
    the technical details. The embedding algorithm roughly works as
    follows: At first, the secret data is compressed and encrypted. Then a
    sequence of positions of pixels in the cover file is created based on
    a pseudo-random number generator initialized with the passphrase (the
    secret data will be embedded in the pixels at these positions). Of
    these positions those that do not need to be changed (because they
    already contain the correct value by chance) are sorted out. Then a
    graph-theoretic matching algorithm finds pairs of positions such that
    exchanging their values has the effect of embedding the corresponding
    part of the secret data. If the algorithm cannot find any more such
    pairs all exchanges are actually performed. The pixels at the
    remaining positions (the positions that are not part of such a pair)
    are also modified to contain the embedded data (but this is done by
    overwriting them, not by exchanging them with other pixels). The
    fact that (most of) the embedding is done by exchanging pixel values
    implies that the first-order statistics (i.e. the number of times a
    color occurs in the picture) is not changed. For audio files the
    algorithm is the same, except that audio samples are used instead of
    pixels.

I wonder how hard it would be to detect (not decode) content hidden by
steghide.
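
To make the manual's last point concrete, here is a minimal Python sketch. It is not steghide's actual algorithm (there is no graph matching or PRNG here), and the function names are made up for illustration. It only shows that exchanging two pixel values leaves the histogram untouched, while overwriting LSBs changes it.

```python
from collections import Counter

def embed_by_overwrite(pixels, pos, bit):
    """Force the LSB of pixels[pos] to `bit` (plain LSB replacement)."""
    out = list(pixels)
    out[pos] = (out[pos] & ~1) | bit
    return out

def embed_by_exchange(pixels, pos_a, pos_b):
    """Swap two pixel values; the multiset of values stays the same."""
    out = list(pixels)
    out[pos_a], out[pos_b] = out[pos_b], out[pos_a]
    return out

# Hypothetical greyscale cover values.
cover = [10, 13, 10, 12, 13, 10]

# Position 0 holds 10 (LSB 0) and position 1 holds 13 (LSB 1).
# Suppose the message needs LSB 1 at position 0 and LSB 0 at position 1.
swapped = embed_by_exchange(cover, 0, 1)
overwritten = embed_by_overwrite(embed_by_overwrite(cover, 0, 1), 1, 0)

print([p & 1 for p in swapped[:2]])            # LSBs now carried at positions 0 and 1
print(Counter(swapped) == Counter(cover))      # is the histogram preserved by swapping?
print(Counter(overwritten) == Counter(cover))  # is it preserved by overwriting?
```

Both variants end up with the same LSBs at the two positions; only the overwrite variant introduces values (and counts) that were not in the cover.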

~~~
Arkanum
There is a series of lectures on stego at Oxford [1]. The lecturer uses simple
machine learning (average perceptrons, I think) and achieves 99% accuracy with
little effort, although that was for LSBR (LSB replacement) [2]. Later he
discusses problems with the graph approach [3]. Simply put, whilst it preserves
1st-order histograms, it will probably mess with 2nd- or 3rd-order histograms.

[1] [https://www.cs.ox.ac.uk/teaching/courses/advsec/](https://www.cs.ox.ac.uk/teaching/courses/advsec/)
[2] [https://www.cs.ox.ac.uk/teaching/materials15-16/advsec/advse...](https://www.cs.ox.ac.uk/teaching/materials15-16/advsec/advsec-notes-ch01234.pdf), p48
[3] Same document as [2], p58
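
For a feel of why LSB replacement is easy to spot, here is a rough sketch of one classic detector, the chi-square "pairs of values" test. This is not necessarily the method used in the lectures. The cover data below is synthetic and deliberately skewed (real images are less extreme), and all names are illustrative. The idea is that replacing LSBs with random-looking bits tends to equalize the counts of values 2k and 2k+1.

```python
import random

def pov_chi_square(pixels):
    """Chi-square style statistic over pairs of values (2k, 2k+1).

    Small values mean the two counts in each pair are nearly equal,
    which is what full LSB replacement with random bits tends to produce.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    stat = 0.0
    for k in range(128):
        even, odd = hist[2 * k], hist[2 * k + 1]
        expected = (even + odd) / 2
        if expected > 0:
            stat += (even - expected) ** 2 / expected
    return stat

rng = random.Random(0)

# Synthetic "cover": even values are made more common than odd ones (assumption).
cover = [2 * rng.randrange(128) + int(rng.random() < 0.3) for _ in range(20000)]

# Full-capacity LSB replacement with random message bits.
stego = [(p & ~1) | rng.getrandbits(1) for p in cover]

print("cover statistic:", pov_chi_square(cover))
print("stego statistic:", pov_chi_square(stego))
```

The stego statistic comes out far smaller than the cover statistic, which is the telltale sign this test looks for.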

~~~
leni536
Your second link is behind authentication. Is it freely available somewhere?

------
dzdt
I always expected steganography to be used to encode a compressed and
encrypted datastream, not uncompressed images. Such a compressed and encrypted
datastream should be indistinguishable from noise, so it should not show up in
this kind of analysis.

My pet idea for doing steganography was to use nonstandard or suboptimal
encoding sequences. For compressed data formats, including images, music,
video, and general files, there is almost always a choice of various encodings
which will decompress to the same result. Systematically varying a compression
choice gives an invisible way to encode data.
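
A toy sketch of that idea in Python, using a made-up run-length format rather than any real codec (the token layout and function names are assumptions for illustration). A run of length 2 or more can be written as one token or split into two, and both decode to the same bytes, so the choice can carry one hidden bit per run.

```python
def runs(data):
    """Split bytes into maximal (count, value) runs."""
    out = []
    for b in data:
        if out and out[-1][1] == b:
            out[-1][0] += 1
        else:
            out.append([1, b])
    return [(c, v) for c, v in out]

def encode(data, bits):
    """RLE-encode data, hiding one bit per run of length >= 2.

    Bit 0: emit the run as one token.
    Bit 1: split it into (count - 1, value) followed by (1, value).
    """
    bits = list(bits)
    tokens = []
    for count, value in runs(data):
        if count >= 2 and bits and bits.pop(0) == 1:
            tokens += [(count - 1, value), (1, value)]
        else:
            tokens.append((count, value))
    if bits:
        raise ValueError("not enough runs to carry all bits")
    return tokens

def decode(tokens):
    """Return (data, hidden_bits). Runs past the message length read as 0."""
    data, bits, i = bytearray(), [], 0
    while i < len(tokens):
        count, value = tokens[i]
        if i + 1 < len(tokens) and tokens[i + 1] == (1, value):
            # Same value twice in a row only happens when a run was split.
            data += bytes([value]) * (count + 1)
            bits.append(1)
            i += 2
        else:
            data += bytes([value]) * count
            if count >= 2:
                bits.append(0)
            i += 1
    return bytes(data), bits

payload = b"aaabccccdd"
tokens = encode(payload, [1, 0, 1])
print(tokens)
print(decode(tokens))
```

A decoder that knows nothing about the trick still recovers `payload` exactly. The receiver has to know the message length separately, since unused runs decode as 0 bits.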

~~~
Arkanum
Actually, there are properties of images that mean the noise introduced
through stego is VERY easy to detect, especially for LSBR.
[https://www.cs.ox.ac.uk/teaching/materials15-16/advsec/advse...](https://www.cs.ox.ac.uk/teaching/materials15-16/advsec/advsec-notes-ch01234.pdf)

~~~
progressive_dad
That page requires an Oxford University username and password.

I'd be interested to read this.

------
StavrosK
Wouldn't this be completely defeated by just simply encrypting the data
beforehand?

~~~
woliveirajr
Yes.

And, in general, people don't hide the image of some text inside another
image. They hide the text. And that isn't so linear (blocks of 0s or 1s).

~~~
UweSchmidt
But we agree that the bar for something non-naive is extremely high if you
want to defeat the well-funded adversaries and whatever they come up with in
the next few decades (future algorithms will be applied to today's images).
I'm sure they have nice libraries of the artifacts most cameras and
compression algorithms _should_ leave in any image and if it isn't there,
well, let's look for some steganography?

~~~
jakobegger
I'm pretty sure you are overestimating the capabilities of government
agencies. At least I hope so.

~~~
williamscales
Ha! In general I assume that the NSA is ahead in many areas that remain
classified.

------
minimaxir
This is why the steganography tool PixelJihad uses a schedule to scatter
pixels randomly around the image, instead of changing LSBs sequentially (the
same schedule is derived again to retrieve the pixels):
[https://sekao.net/pixeljihad/about.html](https://sekao.net/pixeljihad/about.html)

Definitely not naive, but an interesting hack.
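
Roughly what such a schedule could look like, as a Python sketch. This is not PixelJihad's actual code. The SHA-256 seeding, the function names, and the flat list of greyscale values are all assumptions, and Python's `random` module is not a cryptographically secure generator.

```python
import hashlib
import random

def schedule(passphrase, n_pixels, n_bits):
    """Derive a repeatable list of distinct pixel indices from a passphrase."""
    seed = int.from_bytes(hashlib.sha256(passphrase.encode()).digest(), "big")
    return random.Random(seed).sample(range(n_pixels), n_bits)

def embed(pixels, bits, passphrase):
    """Write each message bit into the LSB of the next scheduled pixel."""
    out = list(pixels)
    for pos, bit in zip(schedule(passphrase, len(out), len(bits)), bits):
        out[pos] = (out[pos] & ~1) | bit
    return out

def extract(pixels, n_bits, passphrase):
    """Re-derive the same schedule and read the LSBs back in order."""
    return [pixels[pos] & 1 for pos in schedule(passphrase, len(pixels), n_bits)]

cover = list(range(100, 164))          # 64 hypothetical greyscale values
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(cover, message, "correct horse")

print(extract(stego, len(message), "correct horse") == message)
```

Both sides need the passphrase and the message length. Scattering only changes where the LSBs are modified, not how, so it mainly hides the visual pattern in the bit plane.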

------
wrong_variable
I have been thinking a lot about steganography. If we can make the technology
a little more commonplace (I couldn't find a good book on the topic), it would
effectively make the arguments made by the FBI and GCHQ ineffective. But
something tells me large-scale usage of steganography is not simple to
implement.

~~~
renox
There are two big problems with steganography: 1) as it is inefficient, you
need a "normal"/casual way to send a big amount of data. It used to be rare,
but nowadays I doubt that this is a big issue. 2) you need to coordinate how
you're going to use steganography. That is the real big issue, same as key
distribution in cryptography.

~~~
rjsw
When it was reported that OBL had a big porn stash in the house at Abbottabad
I presumed that it was (at least partly) to be used with steganography.

------
ourcat
Nice.

Here are some messages hidden in audio 'noise', made by stretching out all
the black and white squares of a QR code and generating audio from them.

It can also be decoded back to the QR code and back to the message.

Neat.

[http://qrdio.com](http://qrdio.com)

------
jstanley
Just realised I obviously cocked up the bit planes for the Lena+Walrus
example. There's some grey in there as well as black and white!

EDIT: Fixed. It was because I changed it down to greyscale too late in the
process.

~~~
arnarbi
Knowing nothing about steganography, how far would I get from this detection
by making the last bit-plane the xor of the second last one and my data?

~~~
jstanley
I've not tried it, but it seems like it could work. The only potential problem
is that you'll have around the amount of noise expected in the (n-1)th bit
instead of the nth bit.
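
A quick, untested-on-real-images sketch of the xor idea in Python, with illustrative function names and made-up pixel values. The LSB is set to the second-lowest bit xor the message bit, and the message is recovered by xoring the two lowest bits again.

```python
def embed_xor(pixels, bits):
    """Set each LSB to (second-lowest bit of the pixel) XOR (message bit)."""
    out = list(pixels)
    for i, bit in enumerate(bits):
        second = (out[i] >> 1) & 1
        out[i] = (out[i] & ~1) | (second ^ bit)
    return out

def extract_xor(pixels, n_bits):
    """Recover message bits as (second-lowest bit) XOR (LSB)."""
    return [((p >> 1) & 1) ^ (p & 1) for p in pixels[:n_bits]]

cover = [200, 201, 202, 203]   # hypothetical greyscale values
message = [1, 1, 0, 0]
stego = embed_xor(cover, message)

print(stego)
print(extract_xor(stego, len(message)) == message)
```

Wherever the message bit is 0 the LSB becomes a copy of the second-lowest bit, and wherever it is 1 the LSB becomes its inverse, so the last bit plane takes on the structure of the plane above it.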

------
Arkanum
I've posted it elsewhere in the comments but Dr. Andrew Ker at Oxford Uni runs
a course on steganography [1]. It's worth a read as he covers several
different methods for stego and then methods for detecting said methods.

[1] [https://www.cs.ox.ac.uk/teaching/materials15-16/advsec/advse...](https://www.cs.ox.ac.uk/teaching/materials15-16/advsec/advsec-notes-ch01234.pdf)

~~~
jstanley
"This service is accessed via the University of Oxford Single Sign-On system."

Are you able to exfiltrate this information?

