
Show HN: I made a steganography program - VenomSwitch
https://github.com/JHurst97/SteganograhyProject.git
======
nullc
As far as I know the state of the art in image stego is perturbed quantization
with wet paper codes.

The essential idea with wet paper codes is that you can use error correcting
codes so that even the recipient need not know which bits contain the data.

A simple analogy is that if you wanted to communicate only a single bit you
could decide that the xor of all the data in the image decides the bit. Then
the encoder can flip the single least distinguishable bit to encode their
message.
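
A minimal sketch of the xor analogy, for illustration only. The `costs` list is a hypothetical per-bit measure of how noticeable a flip would be; the names `parity` and `embed_bit` are made up here and are not from any existing tool.

```python
from functools import reduce
from operator import xor

def parity(bits):
    """XOR of all bits: this single value is the hidden message bit."""
    return reduce(xor, bits, 0)

def embed_bit(bits, costs, message_bit):
    """Return a copy of bits whose parity equals message_bit.

    costs[i] is an assumed measure of how detectable flipping bits[i] is.
    At most one bit is flipped, and it is the cheapest one.
    """
    out = list(bits)
    if parity(out) != message_bit:
        cheapest = min(range(len(out)), key=lambda i: costs[i])
        out[cheapest] ^= 1
    return out

cover = [1, 0, 1, 1, 0, 0, 1, 0]
costs = [9.0, 3.5, 7.2, 0.4, 8.1, 5.0, 6.3, 2.2]
stego = embed_bit(cover, costs, 1)
```

The receiver only computes `parity(stego)`; it never needs to know which bit the sender chose to flip, which is the property wet paper codes generalize to longer messages.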

The idea behind perturbed quantization is that when you quantize the data
(e.g. while jpeg compressing it) some values will be maximally close to tied
between rounding up and rounding down. If you embed your data in those values
you'll minimize the embedding noise.
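
A rough sketch of how an encoder might find those near-tie values, assuming it still has the unquantized data. The function names, the `tolerance` parameter, and the uniform `step` are illustrative assumptions, not how PQ is specified in the literature.

```python
import math

def quantize_with_payload_slots(values, step, tolerance=0.1):
    """Quantize values and report which ones sit near a rounding tie.

    A value is 'changeable' when its fractional part (in units of step)
    is within `tolerance` of 0.5, so rounding it the other way adds
    almost no extra error.
    """
    quantized, changeable = [], []
    for i, v in enumerate(values):
        scaled = v / step
        frac = scaled - math.floor(scaled)
        quantized.append(math.floor(scaled + 0.5))
        if abs(frac - 0.5) <= tolerance:
            changeable.append(i)
    return quantized, changeable

def other_rounding(value, step):
    """The rounding choice the plain quantizer did not take."""
    scaled = value / step
    down = math.floor(scaled)
    return down if math.floor(scaled + 0.5) == down + 1 else down + 1

values = [10.2, 12.1, 7.9, 20.05, 3.0]
quantized, changeable = quantize_with_payload_slots(values, step=8)
```

Only the positions in `changeable` would be candidates for carrying payload; a wet paper code is what lets the receiver decode without knowing that list.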

Unfortunately most of the work on approaches based on these ideas is academic
and has no public implementation, but there is one that I'm aware of
[https://sourceforge.net/projects/pqstego/](https://sourceforge.net/projects/pqstego/)

There is a lot that could be done both with respect to security (e.g. using
better models to predict the image to hide the stego noise better, better
error correcting codes, and list decoding) and basic engineering, e.g.
handling content length encoding better.

~~~
jkhdigital
Image steganography techniques have indeed reached a level of sophistication
that requires significant expertise to implement correctly. A good parallel is
compression--most CS undergrads can figure out how to implement Huffman or
even LZW compression, but state-of-the-art ANS (asymmetric numeral systems) is
pretty dense. I believe a special type of wet paper code called a syndrome-
trellis code is at the forefront of image steganography:
[http://dde.binghamton.edu/filler/pdf/Fill10tifs-
stc.pdf](http://dde.binghamton.edu/filler/pdf/Fill10tifs-stc.pdf). Good luck
parsing that without a solid foundation in information theory.

Steganography is the current focus of my PhD research, and from my perspective
the greatest security challenges are in systems engineering. Embedding secret
messages into your vacation photos is one thing, but who are you sending these
messages to? And why? How will they know when and where to go looking for the
secret message?

Encryption entered its golden age with the advent of public-key cryptography
and Diffie-Hellman key exchange, allowing two parties to share the secrets
required for secure communication knowing only public information about one
another. However, steganographic systems are still dependent upon lots of out-
of-band secret sharing, since attempting to share the necessary secrets over
public channels tends to reveal the existence of the system and opens up DoS
attacks (e.g. Tor and the identity of relays). IMHO, figuring out better
solutions to the system security of particular steganographic applications is
a much more pressing research problem than improving statistical embedding
profiles for image steganography.

~~~
nullc
You can use public key encryption whose output is indistinguishable from
uniform. E.g. using ECDH w/ elligator and ed25519 for the key agreement.

Then the recipient of the stego need only publish the public key. If the
encoding software were part of standard Linux distributions (even if just
specialized security-conscious distros), then the sender could obtain it
without identifying themselves.

I don't think you can tackle the problems in isolation: There isn't much point
in making uniform encryption if the embedding gives it away to anyone who's
looking. :)

> but who are you sending these messages to?

One scheme I've been fond of is where the recipient announces their public key
and that they'll accept messages in images posted to some moderately high
traffic public picture board, like r/naturepics. They then simply download and
process all images submitted there.

The sender has an easy and obvious cover story for their posting and can
produce as many images as they need just by visiting a park with a camera.

~~~
betterunix2
"the recipient announces their public key"

A typical threat model is that there is a "warden" who can selectively block
messages, so this would not really work since the warden would block the
announcement (in theory the warden is not going to block all messages,
although a formal security definition should only focus on whether or not the
warden is more likely to block stegomessages than innocent messages, which is
trivial if the warden blocks everything). Thus out of band communication is
needed to distribute secrets; likewise, the stegosoftware itself has to be
distributed out of band.

"indistinguishable from uniform"

Actually you need to be indistinguishable from the "background noise" (the
thermal noise from the camera sensor) of the image, which is not uniform. In
fact, the noise is correlated with the camera itself and can be used to
identify which camera took a particular photo.

~~~
nullc
> Thus out of band communication is needed to distribute secrets; likewise,
> the stegosoftware itself has to be distributed out of band.

This is why public key cryptography is so important. Because you can use a
broadcast channel -- like a Linux distro to communicate the tools and required
knowledge in a way that doesn't give a strong signal that you're using it.

> "indistinguishable from uniform" Actually you need to be indistinguishable
> from the "background noise" (the thermal noise from the camera sensor) of
> the image

This isn't the case for code based encoding. The encoded values are a
pseudorandom linear combination of a huge number of bits from the image and
end up being extremely uniform. Consider my silly xor example: the xor of all
the bits in the image is going to be close to uniform, even if the bits in the
image are fairly biased.
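
A quick way to see this numerically, under the simplifying assumption that the bits are independent (real image bits are correlated, so treat this as an illustration rather than a proof). The function name is made up for this sketch.

```python
def xor_one_probability(p, n):
    """P(XOR of n independent bits == 1) when each bit is 1 with probability p.

    This is the standard closed form (often called the piling-up lemma):
    the bias (1 - 2p) is raised to the n-th power, so it shrinks fast.
    """
    return 0.5 * (1.0 - (1.0 - 2.0 * p) ** n)

# Each bit is 1 only 30% of the time, which is a heavy bias.
for n in (1, 2, 8, 64):
    print(n, xor_one_probability(0.3, n))
```

With `p = 0.3` the remaining bias is `0.4 ** n / 2`, so it falls off geometrically as more bits are combined.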

But also more generally: If you have a way to map encrypted messages onto the
space of uniform strings, then you can also remap from them onto any other
required distribution in a fairly straightforward way.

Having a good error metric is important though, and it's one of the areas that
seems a bit under researched.

~~~
betterunix2
"you can use a broadcast channel"

That is not part of the threat model, and in any case is basically an out-of-
band channel.

"like a Linux distro to communicate the tools and required knowledge in a way
that doesn't give a strong signal that you're using it."

Again, the threat model is that the adversary controls the network and can
block or modify messages at will. So unless all Linux distros included the
software, the adversary would simply block those that include the software.
The threat model of steganography basically implies that some out of band
distribution is needed.

"encoded values are a pseudorandom linear combination of a huge number of bits
from the image and end up being extremely uniform"

That has nothing to do with what I was talking about. The "time domain" of the
image is where the noise lies and where the bias exists in the bits. Even if
it can be encoded in some way that looks uniform, that does not mean that an
arbitrary change to the encoded image will have the correct bias when it is
decoded.

"you can also remap from them onto any other required distribution in a fairly
straightforward way."

In the forward direction, but the reverse direction is not so straightforward.
This is a classic problem with doing ElGamal encryption on elliptic curves --
some curves do not have an easily invertible map from bit strings to the EC
group. I do not know how hard this is to do with images, but when I last
looked image stegosystems relied on heuristics and needed relatively short
messages to remain secure.

~~~
nullc
> That is not part of the threat model, and in any case is basically an out-
> of-band channel.

It may not be the threat model _you_ are interested in but it's part of the
usage model of just about anyone capable of actually using a computer with
arbitrary software, even in extremely restricted parts of the world. (users
who can't run arbitrary software can't really be helped by steg stuff they
can't use!)

Simplified threat models can be extremely helpful in formalizing security
arguments, but taking them too far and confusing them for actual usage can be
devastating for the users' security too.

> That has nothing to do with what I was talking about. The "time domain" of
> the image is where the noise lies and where the bias exists in the bits.
> Even if it can be encoded in some way that looks uniform, that does not mean
> that an arbitrary change to the encoded image will have the correct bias
> when it is decoded.

In fact it has everything to do with what you were talking about. One of the
advantages of using wet paper codes is that you can _arbitrarily_ preserve the
properties of the underlying data while encoding uniformly random information.
Even better: only the encoder needs to know about the properties that are
being preserved, so it can use information which is unavailable to the
attacker or the receiver such as an unquantized or higher resolution original.
This is what the PQstego program I linked above does.

The encoder has significant freedom to preserve whatever properties it needs
to preserve while encoding uniform data.

> but the reverse direction is not so straightforward. This is a classic
> problem with doing ElGamal encryption on elliptic curves

You would _very much_ not want to use ElGamal for this (or essentially any
application that doesn't need its additive homomorphism): doing so has
terrible performance, terrible security properties, and terrible
communications efficiency.

If you look back at my message I specifically brought up "elligator" for ECDH
-- a bidirectional mapping between curve25519 points and bits. Similar
approaches exist for many other curves.

------
speps
Decrypted text from image:

> Not going to lie I'm surprised anyone can figure out how to decrypt this in
> the current state of the program, well done! P.S. star my repo plz I'm
> trying to look employable! :)

Project wasn't building correctly to start with, UI is a bit clunky (missing
window frames!?).

------
VenomSwitch
This is different to the other stego programs out there in that it allows the
user to choose how many LSB (Least significant bits) they want to use. This is
important because if you use 4LSB an image can store more data but it is more
obvious in the output, with 1LSB it is less obvious but you can store 1/4 as
much data. My program also allows you to input text OR image into a cover
image. If you are unfamiliar with steganography - I have tried to explain it
simply on the github readme. ANY feedback on the program or the repo would be
greatly appreciated!

-J
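
For readers unfamiliar with the capacity trade-off, here is a small sketch of k-LSB embedding on a flat list of 8-bit samples. It is written in Python for brevity and is not taken from the VenomSwitch repo; `embed_lsb` and `extract_lsb` are hypothetical names.

```python
def embed_lsb(samples, payload_bits, k):
    """Write payload_bits into the k lowest bits of each 8-bit sample."""
    if not 1 <= k <= 8:
        raise ValueError("k must be between 1 and 8")
    if len(payload_bits) > k * len(samples):
        raise ValueError("payload too large for this cover")
    # Pad with zeros so the payload fills whole k-bit chunks.
    padded = list(payload_bits) + [0] * (-len(payload_bits) % k)
    out = list(samples)
    mask = (1 << k) - 1
    for i in range(0, len(padded), k):
        chunk = 0
        for bit in padded[i:i + k]:
            chunk = (chunk << 1) | bit
        out[i // k] = (out[i // k] & ~mask) | chunk
    return out

def extract_lsb(samples, n_bits, k):
    """Read n_bits back out of the k lowest bits of each sample."""
    mask = (1 << k) - 1
    bits = []
    for s in samples:
        chunk = s & mask
        bits.extend((chunk >> shift) & 1 for shift in range(k - 1, -1, -1))
        if len(bits) >= n_bits:
            break
    return bits[:n_bits]

cover = [200, 13, 255, 0, 77, 128]
message = [1, 0, 1, 1, 0, 1, 0]
stego = embed_lsb(cover, message, k=4)
```

Capacity is `k * len(samples)` bits, while the worst-case change per sample is `2**k - 1`, which is why 4 LSB stores four times as much as 1 LSB but is far more visible.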

~~~
lun4r
I made one years ago ([https://www.planet-source-
code.com/vb/scripts/ShowCode.asp?t...](https://www.planet-source-
code.com/vb/scripts/ShowCode.asp?txtCodeId=28747&lngWId=1)). Had a quick look
at your code and I could be mistaken, but it looks like you modify the pixels
from top left to bottom right. You can make it harder to detect by selecting
the pixels pseudorandomly, seeding your prng with the encryption key.
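
One way to sketch that idea, in Python rather than the project's language: instead of seeding a general-purpose PRNG, this orders sample indices by an HMAC of each index under the key, so sender and receiver derive the same permutation. The function names are hypothetical and this is not a vetted design.

```python
import hashlib
import hmac

def keyed_order(key: bytes, n: int):
    """Permutation of range(n) determined entirely by the key."""
    def tag(i):
        return hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return sorted(range(n), key=tag)

def embed_scattered(samples, bits, key):
    """Put one payload bit in the LSB of each sample, in keyed order."""
    if len(bits) > len(samples):
        raise ValueError("payload too large for this cover")
    out = list(samples)
    for pos, bit in zip(keyed_order(key, len(out)), bits):
        out[pos] = (out[pos] & ~1) | bit
    return out

def extract_scattered(samples, n_bits, key):
    """Read the payload back by visiting samples in the same keyed order."""
    return [samples[pos] & 1 for pos in keyed_order(key, len(samples))[:n_bits]]

cover = list(range(100, 132))
message = [1, 1, 0, 1, 0, 0, 1, 0]
stego = embed_scattered(cover, message, b"shared secret")
```

Without the key an observer does not know which samples carry payload or in what order, although scattering alone does nothing about the statistical footprint of the changed bits.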

~~~
auxym
If the hidden data is encrypted anyways, then wouldn't it be indistinguishable
from truly random data? What would then be the point of distributing it
randomly or in order?

~~~
betterunix2
Actually, the thermal noise of a camera is biased so a uniform random
distribution of low-order bits will be easy to detect. You need to preserve as
much of the thermal noise as possible and try to choose pixels from high
frequency regions of the image (where the thermal noise is harder to separate
from the image itself).

~~~
berdon
So maybe the best approach would be to use a filter/transformation to exclude
hard-lined regions (edges) and only embed data within variations/gradations?

~~~
betterunix2
Other way around: edges are a good place to try to hide stegotext because of
the high frequency components (the edge effect, which is also why JPEG
artifacts are worse along sharp edges).
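
A toy sketch of picking "busy" pixels, assuming a grayscale image stored as a list of rows. The score ignores each pixel's lowest bit so that LSB embedding does not change which pixels get selected, letting a receiver recompute the same set. The scoring rule and function names are illustrative assumptions, far simpler than the distortion measures used in research systems.

```python
def texture_scores(img):
    """Sum of absolute differences to the 4 neighbours, ignoring the LSB."""
    h, w = len(img), len(img[0])
    top = [[p >> 1 for p in row] for row in img]  # drop the bit we may modify
    scores = {}
    for y in range(h):
        for x in range(w):
            s = 0
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    s += abs(top[y][x] - top[ny][nx])
            scores[(y, x)] = s
    return scores

def busiest_pixels(img, count):
    """The `count` positions with the highest score, ties broken by position."""
    scores = texture_scores(img)
    return sorted(scores, key=lambda pos: (-scores[pos], pos))[:count]

# Flat dark region on the left, flat bright region on the right: one sharp edge.
image = [[50, 50, 50, 200, 200, 200] for _ in range(4)]
chosen = busiest_pixels(image, 8)
```

In this example every chosen position lies in the two columns touching the edge, and flipping the LSB of those pixels leaves the selection unchanged.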

------
peterburkimsher
Do you have an algorithm that will work with recompressed lossy images?

I experimented with steganography before, but simply sending a picture via
Facebook and clicking Save To Camera Roll on an iPhone causes serious data
loss pretty quickly, so I couldn't get meaningful data out.

In the end I found that PDF attachments worked for my use case, but I'd still
rather have an image-based steganography program if possible.

~~~
jonatron
If you want watermarks to survive jpeg compression, you have to do DCT block
manipulation, not just individual pixel values.
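
To make "DCT block manipulation" concrete, here is a pure-Python sketch that hides one bit in a single coefficient of an 8x8 block by forcing the parity of its quantized value. The coefficient position `(2, 1)` and `step=24` are arbitrary assumptions; this does not simulate real JPEG recompression, it only shows that the bit lives in a frequency coefficient and tolerates small pixel-level changes.

```python
import math

N = 8  # JPEG-style block size

# Orthonormal DCT-II basis: BASIS[k][n] = a(k) * cos((2n + 1) * k * pi / (2N))
BASIS = [
    [
        math.sqrt((1.0 if k == 0 else 2.0) / N)
        * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        for n in range(N)
    ]
    for k in range(N)
]

def dct2(block):
    """Forward 2-D DCT of an 8x8 block given as a list of rows."""
    return [
        [
            sum(BASIS[u][y] * BASIS[v][x] * block[y][x]
                for y in range(N) for x in range(N))
            for v in range(N)
        ]
        for u in range(N)
    ]

def idct2(coeffs):
    """Inverse 2-D DCT, returning float pixel values."""
    return [
        [
            sum(BASIS[u][y] * BASIS[v][x] * coeffs[u][v]
                for u in range(N) for v in range(N))
            for x in range(N)
        ]
        for y in range(N)
    ]

def embed_bit(block, bit, pos=(2, 1), step=24):
    """Move one coefficient to the nearest multiple of step with parity == bit."""
    u, v = pos
    coeffs = dct2(block)
    ratio = coeffs[u][v] / step
    q = round(ratio)
    if q % 2 != bit:
        q += 1 if ratio > q else -1
    coeffs[u][v] = q * step
    # Back to 8-bit pixels: round and clamp.
    return [[min(255, max(0, round(p))) for p in row] for row in idct2(coeffs)]

def extract_bit(block, pos=(2, 1), step=24):
    """Recover the bit from the parity of the quantized coefficient."""
    u, v = pos
    return round(dct2(block)[u][v] / step) % 2
```

A larger `step` survives more distortion but changes the block more visibly; a real watermark would spread the payload over many blocks and coefficients and add error correction.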

~~~
im3w1l
Does this survive cropping by not-multiple-of-8?

~~~
jonatron
When you decode, you can start from offsets (eg 4,4) rather than 0,0.

------
thomasqbrady
As a developer of 20 years who has avoided .Net, some build instructions
and/or some releases would be nice, but looks like a fun project!

~~~
speps
Open it in a recent VS (2017 worked), fix the compile error and run it.

~~~
thomasqbrady
We don't all use (or even have) Visual Studio.

~~~
thomasqbrady
Also, that assumes a platform, no? This wouldn't build on Mac/Linux/Unix
machines, without a Mono setup, would it?

