Everyone in the ML field understands that ML-assisted upscaling doesn't produce output that accurately represents the original full-resolution image. It produces output that humans perceive to be realistic-looking, or free of traditional upscaling artifacts.
While this is obvious to anyone familiar with the technology, it's difficult to explain to casual observers. The image output looks real. It feels like it could be real. It's free of traditional scaling artifacts that would trigger suspicion. Without additional explanation, it's easy to see why casual observers would assume the hallucinated upscaled version is an accurate representation of the original image.
Historically, police sketches and blurry surveillance images have been obviously low-quality enough that people inherently know they're approximations. The problem with these ML-hallucinated upscaled images is that they look and feel real enough to bypass people's suspicions. We can try to present them as "Here's what the suspect might look like", but when they look like a full-resolution photograph, people will simply assume that's exactly what the suspect looks like.
One obvious problem is creating only one output face when there are many possible faces that match the low-res version. A tool that turns a low-res face into, say, 16 maximally-different upscaled versions doesn't suffer from the same level of false certainty.
You arbitrarily picked 16 possible photorealistic faces out of a total solution space of what? Millions?
Wouldn't the balance of probability be on someone in the general population of humans more closely resembling one of your 16 candidate images than any of your candidates resembling the Ground Truth image?
Doesn't the problem get both better and worse as you scale your N up from 16? That is, it would be better because one of your candidates is more likely to match the Ground Truth, but it would be worse because you've also widened your net for catching false positives?
I saw the blurry picture of Obama and my brain thought "I think that's Obama". When it was 'enhanced' it became a (white) person that I would never recognize as Obama. In fact I did not recognize them at all. The image may have been a more probable person overall, but clearly not a more probable famous person.
Which makes me wonder, can we train a model on only famous people, weighted by their relative famousness?
>You arbitrarily picked 16 possible photorealistic faces out of a total solution space of what? Millions?
Wouldn't the solution be to find 4 different axes and pick faces that represent the different endpoints? With a little psychology to help us identify which features are most informative to the public, we should be able to create a collection of photos that is more likely to result in someone identifying the suspect than either the single-photo option or the full photo-collection option. We would still need to test whether it is better than the single lower-quality photo option.
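A rough sketch of that idea, assuming you already have a pool of candidate latent codes whose decoded faces all match the low-res input when downscaled (the `latents` array, `decode`, and the choice of PCA are my own illustrative assumptions, not anything from the paper):

```python
import numpy as np
from sklearn.decomposition import PCA

def pick_representative_faces(latents, n_axes=4):
    """Project the candidate pool onto its main axes of variation and
    keep the faces at both extremes of each axis (up to 2 * n_axes)."""
    pca = PCA(n_components=n_axes)
    coords = pca.fit_transform(latents)
    picks = set()
    for axis in range(n_axes):
        picks.add(int(np.argmin(coords[:, axis])))  # one end of the axis
        picks.add(int(np.argmax(coords[:, axis])))  # the other end
    return sorted(picks)

# faces_to_show = [decode(latents[i]) for i in pick_representative_faces(latents)]
```

Whether those axes line up with features the public can actually use to recognize someone is exactly the part that would need the psychology testing.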
Exactly, and then it is up to your defense attorney to somehow argue that the AI just made up a face that happens to look like yours, and then explain that to 12 laymen who likely know nothing about how ML or computers work.
The company that develops this tech will want to market themselves as some sort of oracle to the people and LEO. Junk forensic science sticks in courts forever, we should be careful before we assent to more.
Considering that many models are actually trained with L1, L2, or "texture" losses, or a combination thereof, I am not sure I would say "humans perceive to be realistic looking". Then there are of course some GANs doing magic in lieu of coming up with an actual metric that evaluates "human-perceived difference".
A more correct view could be “It produces output that was trained on minimizing differences in the pixel/feature-space”.
My point being: minimizing actual human-perceived differences is seldom (if ever) done directly; instead, training relies on proxies or complex loss-function constructs that make sure no scientist has to actually deal with human observers and their preferences ;-)
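For the curious, the kind of proxy I mean usually looks something like this PyTorch-style sketch; the weights and the frozen-VGG "texture" term are typical choices in the super-resolution literature, not anything specific to this paper:

```python
import torch
import torch.nn.functional as F
import torchvision

# Frozen VGG features as a stand-in for "perceptual"/"texture" similarity.
vgg = torchvision.models.vgg16(
    weights=torchvision.models.VGG16_Weights.DEFAULT
).features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def upscaling_loss(pred, target, w_pix=1.0, w_feat=0.1):
    pixel_l1 = F.l1_loss(pred, target)            # L1 in pixel space
    feat_l2 = F.mse_loss(vgg(pred), vgg(target))  # L2 in VGG feature space
    return w_pix * pixel_l1 + w_feat * feat_l2

# A GAN setup adds a discriminator term on top; none of these terms directly
# measure what a human observer would judge as "realistic".
```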
I would say the technology has some issues. For example, when you don't include enough black people in your training dataset, probably by simply not thinking about it, it can make the algorithm a bit racist. Example: https://twitter.com/Chicken3gg/status/1274314622447820801
It seems like a stretch to call the algorithm "racist". It's the humans with the bias here. It only seems racist because you are capable of recognizing the image on the left and you know he identifies as a black man. The point of the algorithm is that the photo on the right closely resembles, when downsampled, the photo on the left, and the photo on the right is a generative artifact that has characteristics of human faces. It only seems "racist" if you cherry-pick one example and don't look at all the other faces the algorithm generated.
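The constraint the algorithm actually enforces is roughly this downscale-consistency check (a sketch; `generated`, `lowres_input`, the tolerance, and the use of MSE are placeholders for illustration):

```python
import numpy as np
from skimage.transform import resize

def downscale_consistent(generated, lowres_input, tol=1e-2):
    """True if the generated hi-res face reproduces the low-res input
    when downsampled back to its resolution (images as floats in [0, 1])."""
    downsampled = resize(generated, lowres_input.shape[:2], anti_aliasing=True)
    return float(np.mean((downsampled - lowres_input) ** 2)) < tol
```

Many visually different faces can pass that check for the same low-res input, which is the whole problem.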
It takes one bad application of such technology to ruin a person's life.
We can't have a mindless algorithm take or assist decisions that can affect a person's life without extreme caution.
Not because people who make these decisions are never wrong but because they are held accountable.
These algorithms give false impression of realness and truth while being totally unaccountable.
Just look how Yann LeCun (who wasn't even accused of anything) immediately took to throwing the blame on the data and not the algorithm. His claim is just silly demagoguery, because with ML there's no real distinction between data and algorithm. But the point is that ML is dangerous because it makes it way too easy to pass the buck and do harm without accountability. LeCun just gave a great demonstration of that...
I think it's interesting in that it points out the range of faces that could have produced the pixelated version. You think it's Obama but on a certain objective metric it could just as easily have been anonymous guy. Why couldn't it be used to exonerate someone who supposedly appears in a blurry surveillance video?
Agreed; while the algorithm is novel, it is vulnerable to introducing false assumptions about the original image. For example, if you were to use this as a method for determining suspects from low-quality security camera footage, you may generate suspect images that are biased or completely false.
Sounds like it's time for anti-racist movements to start pressuring the ML community to start introducing and using datasets which aren't biased towards white people.
Though, things like black people being more poorly recognized by facial recognition can be a blessing in disguise, depending on the circumstances.
Let's say the training dataset has the same proportion of black people as their proportion of the population.
That, by itself, would bias the ML, since there are fewer images of them. But it's not an indication of bias on the part of the people programming it.
But this would imply you need huge datasets for every minority, no matter how small a proportion of the population.
It might instead be necessary to teach the model about race as a concept, so it can categorize images, and then process them correctly. But of course that leads to a different can of worms since you are explicitly making a "race aware" ML.
It's not clear to me why this matters for a research paper. And the algorithm isn't racist - perhaps the training set is.
If I want to work on a project like this, and the only appropriate training dataset I have is photographs of white males, am I not allowed to work on generating faces until I've fleshed out the dataset?
It's not like they are offering some service to the general public where they can de-pixelate faces. It's a research paper.
A final note: the only professor with her name on the paper (Cynthia Rudin) is very active in researching the intersection of machine learning and social justice. I'm not so sure she would put her name on a paper that can be flippantly described, in three-sentence comments on internet forums, as having "some issues" wrt race.
It would be deeply wrong to use such an algorithm to add detail to blurry CCTV images in order to incriminate someone. I don't think every police force, security firm, security agency etc will agree with me on this and that concerns me. Police in my country are already using live facial recognition in some areas.
It would be deeply useless. It produces plausible wrong answers. Those are the worst kind of wrong answers, because they are red herrings with statistical near certainty, and yet convincing enough to generate bias.
I think the police would, if they are doing their jobs (a big "if") be completely uninterested in an algorithm that is the direct equivalent of an unreliable eyewitness confabulating a plausible face for a photofit.
With a large enough database of potential faces (see VKontakte), you might be able to use this tool to upscale blurry images and match them to a short list of candidates. Other intelligence could then lead you to the actual person in the blurred image. Scary implications.
It might be less true for pixelated videos. I know some IR camera companies use very small movements of the camera to compute the picture at a higher resolution. And I think the first picture of a black hole also used a kind of this technique.
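That multi-frame trick genuinely adds information, because the small sub-pixel shifts between frames sample the scene at different positions. A naive shift-and-add sketch of the idea (the `frames` list, the 2x factor, and the nearest-pixel accumulation are my own simplifications):

```python
import numpy as np
from skimage.registration import phase_cross_correlation

def shift_and_add_sr(frames, factor=2):
    """Naive multi-frame super-resolution: register each low-res frame
    against the first one, then accumulate onto a finer grid."""
    ref = frames[0]
    h, w = ref.shape
    acc = np.zeros((h * factor, w * factor))
    count = np.zeros_like(acc)
    for frame in frames:
        # Estimated sub-pixel shift of this frame relative to the reference.
        shift, _, _ = phase_cross_correlation(ref, frame, upsample_factor=10)
        dy, dx = shift * factor
        ys = (np.arange(h) * factor + int(round(dy))) % (h * factor)
        xs = (np.arange(w) * factor + int(round(dx))) % (w * factor)
        acc[np.ix_(ys, xs)] += frame
        count[np.ix_(ys, xs)] += 1
    return acc / np.maximum(count, 1)
```

Note this only works because there are multiple independent observations; a single pixelated frame doesn't give you that.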
Upsampling used to be pitched in papers as a way to add plausible information to images so they don't look so degraded.
Recently, however, there's been a lot of work pitching upsampling for "deblurring" faces, which seems like a great way for LEO to run the programme until they hit on a face that most likely looks like a person of interest.
Hell, you can even make it explicitly so that the net takes two inputs, the blurred image and a suspect's image, and generates a plausible upsample that is similar to the suspect.
That sounds like bad-faith science that no judge would accept, but perfectly legitimate technologies like DNA testing have historically been abused this way in the courts.
> For starters, Rudin said, “We kind of proved that you can’t do facial recognition from blurry images because there are so many possibilities. So zoom and enhance, beyond a certain threshold level, cannot possibly exist.”
Instead of blurring faces in photos of protests, Google Street View, etc., blur them and then upscale them, so they don't have the jarring blur effect but are still anonymized.
Right. As long as the number of pixels remains the same, it should be possible to remove Gaussian blur almost completely. The information isn't lost; it is still there, just smeared out over a larger area.
Also, I don't understand why the first Mona Lisa result is a picture that, when pixelated again, wouldn't produce the original pixelated picture. It is as if they create a face inspired by the original, but not one that could ever be the original.
Not really. The images aren't pixelated as an effect - it's just a representation of the actual number of pixels taken from the ground truth.
You can just as well take these pixels and apply any kind of blur filter - you still wouldn't retain any more information. If you go from say 1k x 1k pixels down to 100 x 100 pixels, you end up with 1% of the original information, no matter what you do.
I think they may have been thinking of just blurring rather than actually lowering the amount of pixels. A Gaussian blur doesn't really remove as much information as you might expect. Although it's very sensitive to noise.
It's information theory. No matter what blur technique you use, there's only so much information in the pixels. You can't produce more information by magic, or by technology so advanced that it is indistinguishable from magic (h/t Arthur Clarke).
Well, I'm suggesting that, for example, pixelation where you change a 10x10 block of pixels into a single 10x10 block of one color... is very lossy. A blur with a known algorithm (Gaussian, for example) can be somewhat reversed. It's not as lossy.
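Roughly what I mean, as a sketch (assuming a Gaussian blur with known sigma and negligible noise; real noise makes this much harder):

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage import data, restoration

# Blur a test image with a known Gaussian, then try to undo it.
image = data.camera() / 255.0
blurred = gaussian_filter(image, sigma=3)

# Build the matching Gaussian point-spread function.
psf = np.zeros((25, 25))
psf[12, 12] = 1.0
psf = gaussian_filter(psf, sigma=3)
psf /= psf.sum()

# Wiener deconvolution recovers much of the detail; block pixelation
# into flat 10x10 cells could not be undone this way.
restored = restoration.wiener(blurred, psf, balance=0.01)
```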
You seem to have misunderstood what the article is about.
It's not about turning a shitty image into a nice, clean one. It's about turning a low-res image into a hi-res version, i.e. you start with 100 x 100 pixels (it doesn't matter how you obtained them - it could be a section of a much bigger image, for example) and try to extrapolate a 1k x 1k pixel version from it.
The pixelation you see in the examples is just a representation of what little information you have to work with. It's NOT in any way, shape or form related to where you got these pixels from in the first place.
Just to give you a different context here: imagine this upscaling being used to "enhance" a single face in an image like this [1] - there's no way "Gaussian blur" or whatever filter you'd like gets you more information out of that.
In the example image I created, I applied a Gaussian blur to the "pixelated" (i.e. enlarged) version of the marked image section.
As you can see, enlarging the same section using super-sampling (i.e. similar to what you proposed) doesn't change the information content, and one version can basically be transformed into the other.
No, I understood. The article gives several different examples of "blurry", and they aren't all the same. Fixing motion blur is one example of something where you aren't just completely guessing and "inpainting".
Can such a model be used to compare a real picture of a suspect with a low res picture and produce an objective measure of similarity (i.e. not only raw pixel similarity but a similarity measure that actually takes into account how actual human faces downscale, possibly resilient to minor rotations?)
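Something along these lines could be a starting point; this is only a sketch of the idea (downscale the suspect photo the same way the low-res image was produced, brute-force a few small rotations, and score the match in the low-res space), with every name and parameter made up for illustration:

```python
import numpy as np
from skimage.transform import resize, rotate

def lowres_similarity(suspect_img, lowres_img, angles=(-5, 0, 5)):
    """Score how plausibly suspect_img could be the source of lowres_img
    (grayscale floats in [0, 1]); higher is better."""
    h, w = lowres_img.shape[:2]
    best = -np.inf
    for angle in angles:
        candidate = rotate(suspect_img, angle, mode="edge")
        candidate = resize(candidate, (h, w), anti_aliasing=True)
        # Normalised cross-correlation in the low-res space.
        a = (candidate - candidate.mean()) / (candidate.std() + 1e-8)
        b = (lowres_img - lowres_img.mean()) / (lowres_img.std() + 1e-8)
        best = max(best, float((a * b).mean()))
    return best
```

Whether the resulting score means anything forensically is another question; per the quote above, there are simply too many faces that map to the same low-res pixels.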
The increasing politicization of the machine learning field makes me grow tired. It makes me want to work on things without political implications. Something like electrical engineering. Does anyone else feel this way?