That's one of the things I'm most interested in about deep learning - priors can be far better than simple linear transformations + norms. This means potentially recovering useful information from far noisier/smaller data sources than current compressive sensing techniques are capable of.
Is the state of the obfuscated image still related somehow to the state of the non-obfuscated version? If yes, your obfuscation method sucks.
You need to change the image in such a way that the state of the obfuscated part after you apply the method is not related in any way to its state before. Otherwise clever algorithms will always find a way to "leak" info out of your poor choice of method.
The best thing is to just drop a black (#000000) blob over the relevant part, completely replacing it and deleting the previous bits 100%. The only information leak then is the size of the redacted region.
Blurring algos, pixelation, all that fancy stuff from the movies looks good on screen but is naive in practice.
Of course, if you use a "smart" file format that has undo history and such, then you might still be leaking info.
(you can of course cover something up with a random pixelated pattern if that looks less jarring, but it then is a purely cosmetic choice: the important bit is to remove all information connected to the obscured content)
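In code terms, safe redaction means overwriting the pixel values themselves, not layering or filtering them. A minimal sketch in plain Python (the nested-list "image" and the `redact` helper are illustrative, not any particular library's API):

```python
def redact(pixels, x0, y0, x1, y1, fill=(0, 0, 0)):
    """Destructively overwrite a rectangular region with a solid colour.

    After this call the original pixel values in the region are gone;
    the only remaining leak is the rectangle's size and position.
    """
    for y in range(y0, y1):
        for x in range(x0, x1):
            pixels[y][x] = fill
    return pixels

# Toy 4x4 all-white "image" as rows of RGB tuples.
img = [[(255, 255, 255)] * 4 for _ in range(4)]
redact(img, 1, 1, 3, 3)
print(img[1][1], img[0][0])  # (0, 0, 0) (255, 255, 255)
```

The other half of the advice matters just as much: save the result in a flat raster format (e.g. PNG) with metadata and edit history stripped, so the deleted bits can't be recovered from the container.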
Actually, that method has failed quite a few times in the past: someone uses a format that stores the text as text (like PDF) and then drops a solid black box over it, which makes the text look obliterated but preserves the entire content in the file.
Unfortunately, they did this without disabling the Excel feature that saved undo data in the file, so anyone in the class was able to hit Ctrl+Z a couple of times and see the entire list of grades.
Of course, if you block out a whole line, then things get much, much harder, basically impossible. There are many possible lines!
Deep convolutional networks (which are used here) are basically custom image kernels, non-linear activation functions, and simple downsampling operations chained together. So you don't really need to explicitly blur: the network will train itself to use a blur filter if it's necessary.
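A toy illustration of that composition with NumPy (the function names and the fixed 3x3 box-blur kernel are illustrative; a trained network would learn its kernel weights rather than being handed a blur):

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D cross-correlation: slide the kernel over the image."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return out

def relu(a):
    """Non-linear activation."""
    return np.maximum(a, 0)

def pool2(a):
    """2x2 max-pooling: the simple downsampling step."""
    h, w = a.shape[0] // 2 * 2, a.shape[1] // 2 * 2
    return a[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

blur = np.ones((3, 3)) / 9.0          # a kernel a network *could* learn
img = np.random.rand(8, 8)
feature_map = pool2(relu(conv2d(img, blur)))  # conv -> nonlinearity -> downsample
print(feature_map.shape)  # (3, 3)
```

One layer of a real network is exactly this chain, repeated with many learned kernels and stacked deep.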
A large part of the implications of this research concerns the amount of information needed to reliably identify a given human being: about 33 bits of nonredundant data. An image with 64 bits of extractable data might serve to identify any person with considerable redundancy.
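The 33-bit figure is just the base-2 log of the population; a quick check (the ~8 billion headcount is an assumption, a rough current figure):

```python
import math

world_population = 8_000_000_000  # rough current figure (assumption)
bits = math.log2(world_population)
print(f"{bits:.1f}")  # ~32.9: about 33 bits suffice to uniquely index everyone alive
```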
And where that might be obtained, how computers could suss out identifying information where humans cannot, and finally the questions of who might be motivated to use such information, on whom, and to what ends -- which gets to the rather deeper question of the morality of technology, and how to address issues in which the technology is in all likelihood inevitable, if not particularly desirable.
There's a good discussion of many of these points in a post by Yonatan Zunger at G+, which I recommend: https://plus.google.com/+YonatanZunger/posts/HxrBbAskg19
Yonatan and others raise a number of points.
Humans are very good at recognising faces, particularly intact ones. But computers can key off of other factors. Blurring out faces leaves other identifiable elements, including body morphology, clothing, skin or hair patterns, gait, vocal tones, writing patterns (even after multiple passes through Google Translate), and more.
Reconstructing blurred or pixelated faces is the least of our problems.
On the question of who benefits -- cui bono -- I find it useful to consider that Francis Bacon's dictum is false. Knowledge isn't power. But knowledge is a power multiplier. The party with intrinsically greater power, information resources, and access (overt or covert) to corpora of identifying data and observations to be matched has the distinct advantage. The more so if it can both acquire data and act with impunity.
This is amplified for any party which can either acquire data, or act, with impunity. National security organisations, international criminal syndicates, entities such as Anonymous, and non-state terrorist organisations all fall into this general category, though with differing resource and vulnerability profiles themselves.
This is balanced, somewhat, against the size of the surveilled population. Mass efforts, such as those Edward Snowden revealed the NSA to be engaged in, sweep through tens of millions to billions of targets. A group targeting an oligarchical power structure has a much smaller and more tractable problem -- the six members, say, of a Communist Party Standing Committee, a few hundred national legislators, or the tens of thousands of members of the 0.01% elite of the US or Europe.
The nobility has the advantage of resources and power. The mobility, as Neal Stephenson termed the Mob, has the advantage of loose structures and high resilience, though it can also be difficult to organise effectively.
A particularly explosive situation emerges where a faction of the nobility discover a means to energise the mobility to their ends: demagoguery.
The technologies being demonstrated are only going to become more effective. So long as there's a technological civilisation, these capabilities will increase. Rebottling the genie is almost certainly impossible.
I'm left wondering how this might be mitigated, if at all, and, perhaps more plausibly, how it will change the dynamics of surveillance, sousveillance, mass protest, international espionage and crime, and more.
Another thought occurs: reverse-engineering these algorithms might make it possible to construct obfuscated images which a machine reads as identifying some specific individual, but which no human could either identify or challenge. That raises new concerns over fabricated evidence or narratives, whether as a framing mechanism against an enemy or as part of a self-promotion campaign ("I was at Our Group's Great Defining Event and This Machine-Identifiable Image is Proof").
Interesting times / so it goes.