Defeating Image Obfuscation with Deep Learning (arxiv.org)
78 points by EvgeniyZh on Sept 13, 2016 | 24 comments

They're not really "unblurring" the images. They had to have a database of tens to hundreds of candidate faces to match against. If you know that the blurred person is one of a few hundred people you've already got photos of, there are probably plenty of conventional ways you can identify them other than from their blurred face. This doesn't sound like as impressive a result as the title implies.

Does anyone know of a library that performs the "encrypted" JPEG[0] operations mentioned in the article?

[0] http://www.telecom.ulg.ac.be/publi/publications/mvd/sps-2004...

One of the authors here. I couldn't find any library that carries out the P3 operations, but it was pretty easy to modify existing software[0] to achieve this.

[0] http://www.ijg.org/

If the obfuscation is a blur, wouldn't deconvolution work as well?

Deconvolution is pretty hard since it's basically inverting a very poorly-conditioned matrix - that's also assuming you know the kernel. The solution is generally to use some kind of prior on the recovered image. While a normal compressive sensing technique would use, say, some sort of wavelet sparsity as a prior, this paper's approach is effectively imposing a far stricter prior - that the face is one of 530 individuals. If you can model face transformations arbitrarily well (pretrained deep model) then that prior is going to make the face recovery much easier. Of course they're only saying which face, not recovering the actual image, but close enough.
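A small numpy sketch of that conditioning problem (kernel width and noise level invented for illustration): a Gaussian blur multiplies each Fourier coefficient by a factor that is tiny but nonzero at high frequencies, so exact deconvolution works on noiseless data but blows up the moment any noise is present.

```python
import numpy as np

rng = np.random.default_rng(1)
signal = rng.random(64)

# Gaussian blur = pointwise multiplication in the Fourier domain.
freqs = np.fft.fftfreq(64)
kernel_ft = np.exp(-(freqs * 30) ** 2)        # tiny but nonzero everywhere
blurred_ft = np.fft.fft(signal) * kernel_ft

# Noiseless case: dividing the spectrum back out recovers the signal.
recovered = np.fft.ifft(blurred_ft / kernel_ft).real

# Add a minuscule amount of noise and the same division explodes,
# because high-frequency coefficients get divided by numbers near 1e-98.
noisy_ft = blurred_ft + np.fft.fft(rng.normal(0, 1e-9, 64))
garbage = np.fft.ifft(noisy_ft / kernel_ft).real
```

This is exactly why a strong prior (wavelet sparsity, or "it's one of 530 faces") is needed: the data alone can't pin down the high frequencies once noise is involved.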

That's one of the things I'm most interested in about deep learning - priors can be far better than simple linear transformations + norms. This means potentially recovering useful information from far noisier/smaller data sources than current compressive sensing techniques are capable of.

So... colored boxes over faces > fancier methods?

Well, you have to ask yourself - how much information remains in the image after I apply this method?

Is the state of the obfuscated image still related somehow to the state of the non-obfuscated version? If yes, your obfuscation method sucks.

You need to change the image in such a way that the state of the obfuscated part after you apply the method is not related in any way to its state before. Otherwise clever algorithms will always find a way to "leak" info out of your poor choice of method.

The best thing is to just drop a black (#000000) blob over the relevant part, completely replacing the part and deleting the previous bits entirely. There's no information leak other than size then.

Blurring algos, pixelation, all that fancy stuff from the movies looks good on screen but is naive in practice.

Of course, if you use a "smart" file format that has undo history and such, then you might still be leaking info.
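The black-blob approach is a one-liner on a raw pixel buffer. A minimal sketch (image contents and box coordinates are invented): overwriting destroys the pixels rather than attenuating them, so only the box geometry survives.

```python
import numpy as np

# Fake "photo" and an illustrative face bounding box.
img = np.random.default_rng(3).integers(0, 256, (48, 48, 3), dtype=np.uint8)
y0, y1, x0, x1 = 10, 30, 8, 40

# Overwrite in place: the original bytes are gone, not just weakened.
img[y0:y1, x0:x1] = 0
```

The caveat above still applies: this only holds if the result is saved to a flat format with no layers, undo history, or embedded original.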

One of the authors here. There was some great work from the Max-Planck Institute that came out recently which looked into this[0]. They examined images that used black or white boxes to cover faces and found that neural networks could pick up on cues like body posture and the surrounding environment to identify individuals.

[0] https://arxiv.org/abs/1607.08438

Yup. In 2004 Interpol busted a pedophile by untwirling his (lousy) face obfuscation: http://boingboing.net/2007/10/08/untwirling-photo-of.html

It's not even very difficult: just apply the same twirl transformation in the opposite direction.
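A sketch of why that works: a twirl rotates each pixel about a centre by an angle that depends only on its radius, and rotation preserves radius, so the same twirl with negated strength undoes it exactly, up to interpolation error. The fall-off function and parameters here are illustrative, not those of the actual case.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def twirl(img, strength, radius=None):
    """Rotate each pixel about the image centre by an angle that
    decays with distance from the centre (a 'twirl')."""
    h, w = img.shape
    if radius is None:
        radius = min(h, w) / 2
    cy, cx = (h - 1) / 2, (w - 1) / 2
    y, x = np.mgrid[0:h, 0:w]
    dy, dx = y - cy, x - cx
    r = np.hypot(dy, dx)
    theta = strength * np.exp(-r / radius)        # twist angle per pixel
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Inverse mapping: which source pixel lands at each output pixel?
    src_y = cy + dy * cos_t - dx * sin_t
    src_x = cx + dy * sin_t + dx * cos_t
    return map_coordinates(img, [src_y, src_x], order=1, mode='reflect')

# Smooth test image: a Gaussian blob off-centre.
y, x = np.mgrid[0:64, 0:64]
img = np.exp(-((x - 40.0) ** 2 + (y - 25.0) ** 2) / 120.0)

twirled = twirl(img, strength=4.0)
untwirled = twirl(twirled, strength=-4.0)   # same transform, opposite sign
```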

Yes. For simple text-censoring examples, de-pixelating is quite easy (since you know a lot about what could be under it); it was only a matter of time before somebody applied better tools to harder cases.

(you can of course cover something up with a random pixelated pattern if that looks less jarring, but it then is a purely cosmetic choice: the important bit is to remove all information connected to the obscured content)

The US government has known this forever. Ever see those black boxes over censored documents? It's obviously better than blurring, the only information you get is the size of the box.

> Ever see those black boxes over censored documents? It's obviously better than blurring, the only information you get is the size of the box.

Actually, quite a few times in the past with that method, you've gotten the complete text, because someone used a format that includes the text as text (like PDF) and then dropped a solid-black box over it, which makes the text look obliterated but preserves the entire content in the file.

One time in college, a teacher did something like this, with homework solutions "hidden"/"censored". I was easily able to get around it and see all the solutions :/

One of our lecturers did something similar, but with class grades. They started from a complete list of grades in Excel, and for each student, deleted all grades but that student's grades and then saved a copy.

Unfortunately, they did this without disabling the Excel feature that saved undo data in the file, so anyone in the class was able to hit Ctrl+Z a couple of times and see the entire list of grades.

Which, depending on the font, can be enough to reverse-engineer the text, particularly if there is a smallish list of options.

Of course, if you block out a whole line, then things get much, much harder, basically impossible. There are many possible lines!
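A toy sketch of that width-matching attack, with a completely made-up character-width table: measure the redaction box, then keep only the candidate words whose rendered width matches.

```python
# Hypothetical per-character advance widths (in points) for a
# proportional font -- real attacks would use the document's font metrics.
WIDTHS = {c: 5 + (ord(c) - ord('a')) % 4 for c in "abcdefghijklmnopqrstuvwxyz"}

def text_width(word):
    return sum(WIDTHS[c] for c in word)

candidates = ["alice", "bob", "carol", "dave", "mallory"]
box_width = 33                       # measured width of the black box
matches = [w for w in candidates if text_width(w) == box_width]
```

With a whole line blocked out, the number of candidate strings of that width grows combinatorially, which is why that case is effectively hopeless.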

sounds like it. though, i think for aesthetic purposes, i'd prefer a random or "stock" obfuscated faces rather than a colored box.

Correct me if I'm wrong, but isn't pixelation a first step in most image-recognition algorithms, used to reduce the computational complexity of the problem? It would be weird to treat it as an obfuscation method.

It depends. Downsampling and image pyramids are common. You might call it "filtering", which is a more descriptive way of putting it. You ideally filter out the data you don't want (noise, specks of dirt) and keep the data you do want (letter shapes). Extreme blurring, used in obfuscation, is supposed to completely remove the part of the image you're supposed to hide, but in fact the filter is not perfect. It only lowers the signal level, so you need a more powerful filter to extract the signal afterwards, and you end up with more quantization noise. ("Deconvolution" really just means coming up with the opposite filter.)
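To make the "filter is not perfect" point concrete (sizes arbitrary): mosaicing replaces each block with its mean, which still correlates measurably with the original pixels -- the signal is attenuated, not deleted.

```python
import numpy as np

rng = np.random.default_rng(0)
face = rng.random((32, 32))

# 8x8 mosaic: average each 4x4 block, then blow it back up to full size.
blocks = face.reshape(8, 4, 8, 4).mean(axis=(1, 3))
mosaic = np.kron(blocks, np.ones((4, 4)))

# Correlation between the original and its mosaic stays well above zero.
corr = np.corrcoef(face.ravel(), mosaic.ravel())[0, 1]
```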

Deep convolutional networks (which are used here) are basically custom image kernels, non-linear activation functions, and simple downsampling algorithms chained together. So you don't really need to explicitly blur, because the network will train itself to use a blur filter if it's necessary.
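A bare-bones illustration of those three ingredients, with a hand-written kernel standing in for the learned ones:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D cross-correlation (what 'convolution' layers compute)."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.random.default_rng(2).random((9, 9))
kernel = np.array([[1., 0.], [0., -1.]])          # a trained net learns these

feat = np.maximum(conv2d(img, kernel), 0)          # ReLU non-linearity -> (8, 8)
pooled = feat.reshape(4, 2, 4, 2).max(axis=(1, 3)) # 2x2 max pool       -> (4, 4)
```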

Resizing to a fixed size is a common first step in deep learning approaches, but that fixed size is generally large enough to keep a lot of image data. They don't usually pixellate down to 8x8 or whatever it is that the mosaic filter gives you.

I think it's generally a blur, as opposed to pixelation.

It says "mosaicing (also known as pixelation)".

Focusing on the specifics of image deobfuscation is missing a vastly larger point.

A large part of the implications of this research concerns the amount of information needed to reliably identify a given human being: about 33 bits of nonredundant data suffice to single out one person among everyone alive -- which means an image with 64 bits of extractable data might serve to identify any person with considerable redundancy.
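The arithmetic behind the 33-bit figure, taking roughly 8 billion people alive (the exact population doesn't change the conclusion):

```python
import math

world_population = 8e9                       # rough figure
bits_to_identify = math.log2(world_population)   # just under 33 bits
```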

And then there are the questions of where that data might be obtained, how computers could suss out identifying information where humans cannot, and finally who might be motivated to use such information, on whom, and to what ends -- which gets to the rather deeper question of the morality of technology, and how to address issues in which the technology is in all likelihood inevitable, if not particularly desirable.

There's a good discussion of many of these points in a post by Yonatan Zunger at G+, which I recommend: https://plus.google.com/+YonatanZunger/posts/HxrBbAskg19

Yonatan and others raise a number of points.

Humans are very good at recognising faces, particularly intact ones. But computers can key off of other factors. Blurring out faces leaves other identifiable elements, including body morphology, clothing, skin or hair patterns, gait, vocal tones, writing patterns (even after multiple passes through Google Translate), and more.

Reconstructing blurred or pixelated faces is the least of our problems.

In the question of who benefits -- cui bono -- I find it useful to consider that Francis Bacon's dictum is false. Knowledge isn't power. But knowledge is a power multiplier. The party with intrinsically greater power, information resources, and access (overt or covert) to corpora of identifying data and observations to be matched has the distinct advantage. The more so if it can both acquire data and act with impunity.

This is amplified for any party which can either acquire data, or act, with impunity. National security organisations, international criminal syndicates, entities such as Anonymous, and non-state terrorist organisations all fall into this general category, though with differing resource and vulnerability profiles themselves.

This is balanced, somewhat, against the size of the surveilled population. Mass efforts, such as those Edward Snowden revealed the NSA to be engaged in, sweep through tens of millions to billions of targets. A group targeting an oligarchical power structure has a much smaller and more tractable problem -- the six members, say, of a Communist Party Standing Committee, a few hundred national legislators, or the tens of thousands of members of the 0.01% elite of the US or Europe.

The nobility has the advantage of resources and power. The mobility, as Neal Stephenson termed the Mob, has the advantage of loose structures and high resilience, though it can also be difficult to organise effectively.

A particularly explosive situation emerges where a faction of the nobility discover a means to energise the mobility to their ends: demagoguery.

The technologies being demonstrated are only going to become more effective. So long as there's a technological civilisation, these capabilities will increase. Rebottling the genie is almost certainly impossible.

I'm left wondering how this might be mitigated (if at all) and, perhaps more realistically, how it will change the dynamics of surveillance, sousveillance, mass protest, international espionage and crime, and more.

Another thought occurs: reverse-engineering these algorithms might make it possible to construct obfuscated images which a machine reads as identifying some specific individual, but which no human would be able to either identify or challenge -- raising new concerns over fabricated evidence or narratives, whether as a framing mechanism against an enemy, or as part of a self-promotion campaign ("I was at Our Group's Great Defining Event and This Machine-Identifiable Image is Proof").

Interesting times / so it goes.


Please don't create new accounts to violate the guidelines with. Not only do we ban these, we ban the main account as well if it continues.

