
Defeating Image Obfuscation with Deep Learning - EvgeniyZh
https://arxiv.org/abs/1609.00408
======
Hondor
They're not really "unblurring" the images. They had to have a database of
tens to hundreds of candidate faces to match against. If you know that the
blurred person is one of 400 people you've already got photos of, there are
probably plenty of conventional ways to identify them other than from their
blurred face. This doesn't sound like as impressive a result as the title
implies.

------
xori
Is there a library that anyone knows of that does the "encrypted" jpeg[0]
operations that are mentioned in the article?

[0]
[http://www.telecom.ulg.ac.be/publi/publications/mvd/sps-2004...](http://www.telecom.ulg.ac.be/publi/publications/mvd/sps-2004/index.html)

~~~
richard_mcp
One of the authors here. I couldn't find any library that carries out the P3
operations, but it was pretty easy to modify existing software[0] to achieve
this.

[0] [http://www.ijg.org/](http://www.ijg.org/)
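For anyone curious what the P3 split looks like, here is a rough sketch of the core coefficient operation as I read the linked paper -- not the authors' code. P3 divides each JPEG DCT coefficient at a threshold into a clipped "public" part and a "secret" remainder; the threshold value here is arbitrary:

```python
# Sketch of a P3-style DCT coefficient split (my reading of the paper):
# each coefficient c becomes (public, secret) with public + secret == c
# and |public| <= T, so the public JPEG carries little facial detail.

T = 10  # illustrative threshold; the real system tunes this

def split_coeff(c, t=T):
    if abs(c) <= t:
        return c, 0
    public = t if c > 0 else -t
    return public, c - public

def merge_coeff(public, secret):
    """Recombining the two parts restores the original coefficient."""
    return public + secret

coeffs = [3, -42, 15, 0, -7, 100]
parts = [split_coeff(c) for c in coeffs]
assert all(merge_coeff(p, s) == c for (p, s), c in zip(parts, coeffs))
print(parts)
```

In the real system the public part is re-encoded as an ordinary JPEG and the secret part stored separately, which is why modifying libjpeg-style software is a natural route.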

------
thaw13579
If the obfuscation is a blur, wouldn't deconvolution work as well?

~~~
highd
Deconvolution is pretty hard, since it's basically inverting a very poorly-
conditioned matrix -- and that's assuming you even know the kernel. The
solution is generally to use some kind of prior on the recovered image. While
a normal compressive sensing technique would use, say, some sort of wavelet
sparsity as a prior, this paper's approach effectively imposes a far stricter
prior: that the face is one of 530 individuals. If you can model face
transformations arbitrarily well (pretrained deep model), then that prior
makes face recovery much easier. Of course they're only saying which face, not
recovering the actual image, but close enough.

That's one of the things I'm most interested in about deep learning - priors
can be far better than simple linear transformations + norms. This means
potentially recovering useful information from far noisier/smaller data
sources than current compressive sensing techniques are capable of.
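The ill-conditioning highd describes can be shown numerically with the smallest possible example -- a 2-pixel "image" and a symmetric 2x2 blur matrix (entirely a toy construction of mine, not anything from the paper):

```python
# A heavier blur (smaller e) makes the blur matrix A = [[0.5+e, 0.5-e],
# [0.5-e, 0.5+e]] nearly singular (det = 2*e), so inverting it amplifies
# even tiny measurement noise.

def deblur(y, e):
    """Exact inverse of the 2x2 blur applied to observation y."""
    a, b = 0.5 + e, 0.5 - e
    det = a * a - b * b  # = 2*e, shrinking as the blur widens
    return [(a * y[0] - b * y[1]) / det,
            (-b * y[0] + a * y[1]) / det]

x = [1.0, 0.0]                    # true image
for e in (0.25, 0.01):
    a, b = 0.5 + e, 0.5 - e
    y = [a * x[0] + b * x[1], b * x[0] + a * x[1]]
    noisy = [y[0] + 0.001, y[1]]  # tiny sensor noise
    print(e, deblur(noisy, e))    # at e=0.01 the 0.001 noise grows ~25x
```

A strong prior (wavelet sparsity, or "it's one of 530 faces") is what keeps this amplified noise from dominating the reconstruction.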

------
excalibur
So... colored boxes over faces > fancier methods?

~~~
ggggtez
The US government has known this forever. Ever see those black boxes over
censored documents? It's obviously better than blurring, the only information
you get is the size of the box.

~~~
dragonwriter
> Ever see those black boxes over censored documents? It's obviously better
> than blurring, the only information you get is the size of the box.

Actually, quite a few times in the past with that method, you've gotten the
complete text, because someone used a format that stores the text as text
(like PDF) and then dropped a solid black box over it -- which _looks_ like
the text is obliterated, but preserves the entire content in the file.

~~~
Namrog84
One time in college, a teacher did something like this, with homework
solutions being "hidden"/"censored". I was easily able to get around it and
had all the solutions :/

~~~
taneq
One of our lecturers did something similar, but with class grades. They
started from a complete list of grades in Excel, and for each student, deleted
all grades but that student's grades and then saved a copy.

Unfortunately, they did this without disabling the Excel feature that saved
undo data in the file, so anyone in the class was able to hit Ctrl+Z a couple
of times and see the entire list of grades.

------
mamon
Correct me if I'm wrong, but isn't pixelation a first step in most image
recognition algorithms, used to reduce the computational complexity of the
problem? It would be weird to treat it as an obfuscation method.

~~~
foota
I think it's generally a blur, as opposed to pixelation.

~~~
nercht12
It says "mosaicing (also known as pixelation)".
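Which is exactly mamon's point: mosaicing is plain block averaging, the same downsampling many recognition pipelines apply anyway. A one-function sketch (my own toy illustration, 1-D for brevity):

```python
# "Mosaicing" replaces each block of pixels by its mean, i.e. it produces
# a low-resolution thumbnail rather than destroying the information.

def mosaic(row, block=4):
    """One 'big pixel' (the block mean) per block of the input scanline."""
    return [sum(row[i:i + block]) / block
            for i in range(0, len(row), block)]

row = [10, 12, 11, 13, 200, 210, 205, 201]  # an 8-pixel scanline
print(mosaic(row))  # -> [11.5, 204.0]: block averages survive intact
```

A classifier trained on such thumbnails sees exactly the signal the "obfuscation" preserves.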

------
dredmorbius
Focusing on the specifics of image deobfuscation is missing a vastly larger
point.

A large part of the implications of this research is about the amount of
information needed to reliably identify a given human being: about 33 bits of
nonredundant data -- little enough that an image with 64 bits of extractable
data might serve to identify any person with considerable redundancy.
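The "33 bits" figure is just the binary logarithm of the world population (the population value below is approximate):

```python
# Uniquely indexing everyone alive takes log2(world population) bits.
import math

population = 7.5e9
bits_needed = math.log2(population)
print(round(bits_needed, 1))  # -> 32.8, i.e. ~33 bits pin down one person
```

Any combination of attributes (gait, clothing, location, writing style) that jointly carries more than ~33 nonredundant bits is, in principle, uniquely identifying.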

And where that might be obtained from, how _computers_ rather than _humans_
could suss out identifying information, often where humans cannot, and
finally, the questions of who might be motivated to use such information, on
whom, and to what ends, which gets to the rather deeper question of the
morality of technology, and how to address issues in which technology is in
all likelihood inevitable, if not particularly desirable.

There's a good discussion of many of these points in a post by Yonatan Zunger
at G+, which I recommend:
[https://plus.google.com/+YonatanZunger/posts/HxrBbAskg19](https://plus.google.com/+YonatanZunger/posts/HxrBbAskg19)

Yonatan and others raise a number of points.

 _Humans are very good at recognising faces, particularly intact ones. But
computers can key off of other factors._ Blurring out _faces_ leaves other
identifiable elements, including body morphology, clothing, skin or hair
patterns, gait, vocal tones, writing patterns (even after multiple passes
through Google Translate), and more.

Reconstructing blurred or pixelated faces is the least of our problems.

 _On the question of who benefits_ -- _cui bono_ -- I find it useful to
consider that Francis Bacon's dictum is false. Knowledge _isn't_ power. But
knowledge _is_ a power _multiplier_. The party with intrinsically greater
power, and information resources, and access (overtly or covertly) to corpora
of identifying data and observations to be matched, has the distinct
advantage. The more so if it can both acquire data and act with impunity.

This is amplified for any party which can either acquire data, or act, with
impunity. National security organisations, international criminal syndicates,
entities such as Anonymous, and non-state terrorist organisations all fall
into this general category, though with differing resource and vulnerability
profiles themselves.

 _This is balanced, somewhat, against the size of the surveilled population._
Mass efforts, such as those Edward Snowden revealed the NSA to be engaged in,
sweep through tens of millions to billions of targets. A group targeting an
oligarchical power structure has a much smaller and more tractable problem --
the six members, say, of a Communist Party Standing Committee, a few hundred
national legislators, or the tens of thousands of members of the 0.01% elite
of the US or Europe.

The nobility has the advantage of resources and power. The _mobility_, as
Neal Stephenson termed the Mob, has the advantage of loose structures and high
resilience, though it can also be difficult to organise effectively.

A particularly explosive situation emerges where a faction of the nobility
discover a means to energise the mobility to their ends: demagoguery.

 _The technologies being demonstrated are only going to become more
effective._ So long as there's a technological civilisation, these
capabilities will increase. Rebottling the genie is almost certainly
impossible.

I'm left wondering both how this might be mitigated (if at all), and, perhaps
more plausibly, how it will change the dynamics of surveillance,
sousveillance, mass protest, international espionage and crime, and more.

Another thought occurs: reverse-engineering these algorithms might make it
possible to construct obfuscated images which a machine reads as identifying
some specific individual, but which no human would be able to either identify
_or challenge_. That raises new concerns over fabricated evidence or
narratives, whether as a framing mechanism against an enemy, or as part of a
self-promotion campaign ("I was at Our Group's Great Defining Event and This
Machine-Identifiable Image is Proof").

Interesting times / so it goes.

