Deepfake detector can spot a real or fake video based on blood flow in pixels (zdnet.com)
171 points by sizzle on Nov 18, 2022 | 92 comments



Won't this be used in the next deepfake as an adversarial network in order to produce more realistic results? It's an endless cat-and-mouse game.


> It's an endless cat-and-mouse game.

This is often stated, but I think it's obviously wrong here. This isn't a traditional interactive game such as malware & anti-malware.

You have existing sensors which operate under the constraint of [ real world -> theoretical pixel space -> optics & aberrations & sensor noise -> compression ]. And a single adversary which attempts to fake this chain.

The detection of fakes isn't even an adversary in this game; it's merely detecting deviations introduced by the faking process.

At some point, probably soon, the faking process will reach a point where any deviation will be drowned out by the noise aspect of optics & sensors & compression.


One method of generating things via neural networks is called a generative adversarial network. It works by having two models. One that generates content and one that detects fake content. You train them both in parallel. As the fake detector gets better, so does the generative model at generating fakes. It’s literally a cat-and-mouse game. If someone came up with a scheme to reliably detect your fakes, you could add it to your discriminator model and retrain the generator to improve the fake generation.
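To make that concrete, here's a minimal sketch of that training loop (PyTorch assumed; the tiny networks and toy data are illustrative stand-ins, not the actual detector or any particular deepfake model):

    # Toy GAN loop: D learns to separate real from generated samples,
    # while G learns to fool the current D.
    import torch
    import torch.nn as nn

    G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))   # generator
    D = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))    # discriminator
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCEWithLogitsLoss()

    def real_batch(n=64):
        return torch.randn(n, 8) + 3.0                    # stand-in for real data

    for step in range(1000):
        # 1) Train the discriminator to tell real from generated.
        real = real_batch()
        fake = G(torch.randn(64, 16)).detach()            # detach: don't update G here
        loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # 2) Train the generator to fool the current discriminator.
        fake = G(torch.randn(64, 16))
        loss_g = bce(D(fake), torch.ones(64, 1))          # label fakes as "real"
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

If a third-party fake detector were differentiable and available, it could slot in as (or alongside) D in step 2, which is the retraining scenario described above.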


My understanding is that it's not quite that simple. GANs have stability problems (and as a result are somewhat out of favor at the moment), and if the fake detection mechanism isn't itself a differentiable function, no training can happen.


The fake detection mechanism (aka discriminator) is usually just another neural network and I bet that's the case here as well. So it must be differentiable and thus, if anyone ever gets a hold of it, it could be easily used to train a generator that will eventually fool the discriminator.


So, basically a NOT operator on a neural network. Does this even require differentiation?


Not really a NOT operator. It's more like tuning the generator until the fake detector can no longer detect its output.


It depends on how you set it up. You can use a very non-expressive valuation that assigns a score, but then you need to use reinforcement learning techniques to transform that score into a model update. Alternatively, if your valuation is a differentiable metric with respect to your target, you have a way of going directly from your output to a model update.

The second way usually requires dramatically fewer update steps than the first. Thus, having your adversarial target be differentiable definitely helps, though it is possible to train even without that.
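For illustration, a rough sketch of both update paths (PyTorch assumed; the toy model, score function, and targets are made up):

    import torch

    model = torch.nn.Linear(4, 4)                         # toy "generator"
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    z = torch.randn(8, 4)                                 # toy input noise
    target = torch.zeros(8, 4)                            # toy target

    # Path 1: a black-box scalar score -> reinforcement-learning-style update.
    # Gradients flow only through the log-probability of a sampled output,
    # never through the score itself.
    mean = model(z)
    dist = torch.distributions.Normal(mean, 1.0)
    sample = dist.sample()                                # sampling blocks gradients
    score = -((sample - target) ** 2).mean(dim=1)         # black-box reward, no grad
    loss_rl = -(dist.log_prob(sample).mean(dim=1) * score).mean()
    opt.zero_grad(); loss_rl.backward(); opt.step()

    # Path 2: a differentiable metric -> the gradient flows straight from the
    # output to the weights, which is why it needs far fewer updates.
    output = model(z)
    loss_direct = ((output - target) ** 2).mean()
    opt.zero_grad(); loss_direct.backward(); opt.step()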


A NN is trained to make something as indistinguishable from reality as possible, so that it can't tell the difference. The inverse of that will just claim reality is fake too.

What you need is a deep fake and a real video of the same thing, then train on the difference. Clearly this is impossible, which is what makes the problem hard.


You actually don't need that. You only need a set of real videos and a generator for fake ones. Then train the discriminator to tell these two classes apart and make use of its differentiability to update the generator in tandem with the discriminator.


Would you run into class balance issues if you do not have a growing pool of initial videos?


Not if you have a balanced dataset. If not, you're already getting into trouble with GANs, which is why they were superseded by diffusion models on image generation tasks.


While Gradient Descent needs differentiable functions, there are evolutionary algorithms that do not need this and can also train neural networks.
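As a rough sketch of that idea (numpy; the tiny linear "network" and fitness score are toy stand-ins for a model and a non-differentiable detector score):

    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 1))                           # weights of a tiny linear "network"
    x = rng.normal(size=(100, 4))
    y = x @ np.array([[1.0], [-2.0], [0.5], [3.0]])       # toy target function

    def fitness(weights):
        # Black-box score; could be any non-differentiable detector output.
        return -np.mean((x @ weights - y) ** 2)

    sigma, lr, pop = 0.1, 0.05, 50
    for _ in range(200):
        noise = rng.normal(size=(pop,) + w.shape)
        scores = np.array([fitness(w + sigma * n) for n in noise])
        scores = (scores - scores.mean()) / (scores.std() + 1e-8)
        # Move the weights toward perturbations that scored well (evolution-strategies style).
        w += lr / (pop * sigma) * np.tensordot(scores, noise, axes=1)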


It is accurate that GANs have stability problems, but they are absolutely being used today for solving problems similar to this (an output needs to be "improved").

Stable Diffusion produces faces, especially eyes, that are malformed. You'll often see "restore faces" in online services, which feeds the end result into GFPGAN to restore the faces.


Diffusion models are not GANs.


Indeed. That is why you use a GAN to fix the generated face: https://www.reddit.com/r/StableDiffusion/comments/x33rs4/how...


There's an excellent sci-fi exploration of the concept and how humans interact with it in Neal Stephenson's "Jipi and the Paranoid Chip"; well worth the read.



This link gave me a fake "this phone has been hacked" alert. I recommend avoiding.


Hot take: What if we just accepted that any video might be fake, just like any photograph might be fake?

And that we accept that the only thing that's assured to be real is face-to-face, and live with that reality?


It's already the legal standard in many places that a court doesn't just accept evidence like a photo or a video. Typically the photographer testifies under penalty of perjury that they were there and can be questioned about the circumstances in which they took the picture/video and any post-processing that might distort what we see.

So we can say that either it's real or we can identify the specific people who might be lying.


Unfortunately, in media and the public discourse / memosphere, that's a standard notoriously difficult to establish or enforce.

People, and crowds, will respond to first impressions.


What if we put out enough deepfakes that they get desensitized to it and just stop believing stuff on screens altogether?


Past experience suggests that that likely won't be too effective. "Big lie" propaganda is profoundly effective, even amongst those who are painfully aware of its existence and methods as individuals, and at the population level, there seem to be virtually no defences.

Rather, there'll need to be the emergence of credible and trustworthy channels which verify messaging and content before it's widely disseminated.

That largely means re-implementing the sort of media gatekeepers we've seen in the past. Though the challenge of bad-faith actors emerging in such roles at considerable scale points to further challenges, even with that model.

Under a regime in which free speech is considered a fundamental right, any sort of preemptive management by government mandate becomes intensely difficult.


>"Big lie" propaganda is profoundly effective,

I mean, people still believe we put men on the moon, and people still believe the Earth is round! /s


Some even subscribe to the belief that water is wet.


For now. It might become more important.


Don't think of a white elephant.

Don't breathe consciously.

Don't lose the game.


But if you lack the technology to detect fakes, how can you ever prove perjury? The photographer would then risk nothing by lying.


If the person is a professional, lying is a huge risk: it can mean the end of a career even without a conviction.


Earlier this year, a man who was repeatedly caught lying as a journalist was forced to resign.

Unfortunately, he was the British Prime Minister by that point.


Corroborating evidence


My startup was founded on technology to authenticate and provide provenance for digital media content.

The overwhelming response from the market, investors, etc was “We like what you’re doing and this is all well and good but the ship has sailed”.

I think the real hot take here (and they danced around this one) is essentially the partisanship. Almost everywhere this really matters, a good chunk of people are more likely to believe as real whatever aligns with their pre-conceived notions, political leanings, whatever.

Essentially, real is whatever my tribe says is real.


Your startup is probably too early. As in, at some point it might be a necessity. And it's been studied that we believe something first, and then apply reasoning to it. We skew towards our preferred bias by belief.


Thanks, I appreciate that but the experience was eye opening. Speaking of which…

Get on Facebook and watch people swap around things that are so blatantly and obviously fake. In the words of one potential investor: “no one cares”. Or, to repeat a phrase you may have heard: “fake news”.

“Falsehood flies, and the Truth comes limping after it” - Jonathan Swift, 1710


On things like Facebook, our “guard” is usually down. We just assume at some point that if someone feeds it to us, it's vetted already.

Usually a lot of context is missing, for instance with a scary headline. In that sense, context tools could also be great to have. Something that exposes connections between pieces of information.

I don’t believe no one cares tbh, but there are those who are resistant to change in bias.


Investors are dumb. See the list of people who gave FTX truckloads of money. Sometimes what's needed is a couple of good sentences that create the appropriate spark in their imagination. If you can sell truth as a service, you have potentially a total addressable market of half the planet's population.


Or, conversely, to judge ideas based on their merit and not based on who says them, maybe in some deep wonderful future ...


>It's an endless cat-and-mouse game

Yes, this is the way with anything software based that can earn people money.

See: video game hacks, SEO manipulation, etc


But especially so when machine learning is involved since a model can train off its adversary.


Not really special in the case of ML.

Before deepfakes, if you wanted to claim a video was doctored in court, you'd find an expert on video editing and have them testify.

But the same knowledge that allowed them to identify a doctored video (like 50hz/60hz hum) could be used in an adversarial manner to create a very convincing video.

At most deepfakes democratize that "knowledge" in the form of a model, so it still works both ways.


> But the same knowledge that allowed them to identify a doctored video (like 50hz/60hz hum) could be used in an adversarial manner to create a very convincing video.

I don't think it's automatically true that being good at spotting fakes means that you're good at generating fakes. For example, I suspect that I, like most humans, would be pretty good at spotting humanoid robots at the current level of technology. However, that does not suggest that I would be particularly good at creating humanoid robots that would evade detection.


That's why I specified an example that requires a domain expert, you could have a very poorly done fake video that anyone can tell is fake too.


> That's why I specified an example that requires a domain expert, you could have a very poorly done fake video that anyone can tell is fake too.

But my example also satisfies that criterion. I (and most humans) am a domain expert at identifying humans, but that doesn't mean I would be good at faking a human.


> I (and most humans) am a domain expert at identifying humans, but that doesn't mean I would be good at faking a human.

I don't think you're thinking about this enough. If you make one effort to create something that looks like a human, you'll do a terrible job unless you are already a skilled artist.

But that doesn't support your claims here. We have a scenario where you've prepared a fake human and attempted to pass it off as real. The only way this could actually happen is if you looked at your own handiwork and it passed quality tests. And since you're good at identifying humans, that necessarily means that your fake human is a high-quality imitation. If it were terrible, you'd look at it and realize it couldn't be passed off as real.


I really didn't think I needed to specify domain expert at faking humans.

I said video editor, not video watcher.


This is probably intended for encoding webcam chat between Intel devices. They can hash the video "frames" to detect interception.


Just like computer security


Pet peeve of mine: articles using stats like 96% accuracy.

If the test set had 4% deep fakes, and 96% legitimate videos, a model which always predicts legitimate video would score 96% accuracy, even if it were useless.

Stats like precision, recall, F1 scores etc. are important.
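A toy illustration of the imbalance point, with made-up numbers:

    # 1000 videos, 4% deepfakes; a "detector" that always says legitimate.
    real, fake = 960, 40
    labels = [0] * real + [1] * fake                      # 0 = legitimate, 1 = deepfake
    preds = [0] * (real + fake)                           # always predict legitimate

    accuracy = sum(p == t for p, t in zip(preds, labels)) / len(labels)
    true_pos = sum(p == 1 and t == 1 for p, t in zip(preds, labels))
    recall = true_pos / fake                              # recall on the deepfake class
    print(f"accuracy={accuracy:.0%}, recall on fakes={recall:.0%}")   # 96%, 0%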


It's the reality of popular terminology.

We only have one word in English that gets commonly used and means "is correct this much of the time". Nuance is lost, but it ensures more people understand the general gist of the article.

One could now get meta and discuss the "accuracy of language", but I feel that might just make you more upset :)


I read that the 96% here means guessing both classes correctly, with a weighting coefficient based on how many of each are in the sample.


Simulating blood flow is a technique currently used in high-end VFX animation for movies. https://www.fxguide.com/fxfeatured/maleficent/


This is not new. It is news because it's from Intel.

I looked into that a year or two ago and there were papers on this.

Anyone who is familiar with Eulerian Video Magnification and with neural networks likely thought of that.

Does this work on encoded videos? I doubt it. Intel can probably add a feature to the video encoder and sell it as an authentication service for webcam communication on the Intel platform.


> Does this work in encoded videos?

For older encodings such as MPEG2, as used in DVDs, it does work. For newer codecs with higher compression I can't extract a heartbeat signal anymore when using the tool that I've written for that (*).

(*) https://github.com/erdewit/heartwave
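For the curious, the basic green-channel approach looks roughly like this (a minimal remote-photoplethysmography sketch, not necessarily what the tool above does; it assumes frames already cropped to the face):

    import numpy as np

    def estimate_bpm(face_frames, fps):
        # face_frames: numpy array (n_frames, height, width, 3), RGB,
        # already cropped to the face region (face tracking not shown).
        green = face_frames[:, :, :, 1].mean(axis=(1, 2)) # per-frame green-channel mean
        green = green - green.mean()                      # remove the DC component
        spectrum = np.abs(np.fft.rfft(green))
        freqs = np.fft.rfftfreq(len(green), d=1.0 / fps)
        band = (freqs > 0.7) & (freqs < 4.0)              # roughly 42-240 bpm
        peak = freqs[band][np.argmax(spectrum[band])]
        return peak * 60.0                                # Hz -> beats per minute

Heavy compression tends to smear exactly these subtle chrominance changes, which fits the observation above about newer codecs.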


I can hardly believe it could work on media uploaded to YT and similar platforms, and assuming it does, it would be easily defeated either by over-compressing the videos so that subtle chromatic changes are eliminated or by applying smoothing filters before reuploading. Should the technology catch on, it's just a matter of time before filters appear that scramble those subtle differences, masking them for example as a grain effect, to make the detector useless.


I've always imagined deep fakes would just be encoded to 360p to make them believable.


I hope that the existence of this tech doesn't end up being used as a cover to justify censoring real videos.

At the end of the day, in the current tech environment, the most reliable way to know what is real from what is fake is using your own critical thinking skills.

It's just unfortunate that most people have really terrible critical thinking skills.


It frightens me to learn pixels have blood.


They didn't until the early 2000s, but it turns out the easiest way to increase the resolution is to feed in less information and just give the pixels souls, so they can use free will to fill in the gaps. In a few years they'll have them in CPUs too, they're just running into thermal issues with the blood and the fact that every ensouled CPU inevitably either kills itself at first opportunity or goes insane. Hopefully it'll get hammered out for 5nm.


I'd insert that one quote from Billy Madison here, but your comment is not even worth the copy+paste effort.



By definition, a joke is supposed to be funny. What you wrote was just complete garbage that I would have sooner believed to have been written by a GPT-3 text generator than a human being.


I laughed. So it must be with the reader that the problem exists, not the writer.



Just stop. I am not religious, but will pray for any co-workers or family members who have to deal with you and your sense of humor on a daily basis.


Alright I'll bite- what here specifically offended you? I'm building out my people-model, but clearly there are still some edge cases.


You keep doubling down on your current "sense of humor" rather than trying something, literally anything, different. Here is a word of advice - if you are trying to be funny with a joke, but it doesn't have a punchline, it is likely not funny.


I found it funny.

And if you'd replied with that one quote from Billy Madison, rather than making a grumpy reference to it, I'd have found that funny too.

There's plenty of humour in absurdity, even without punchlines; and ravi-delia displayed a similar style of humour to the "Look Around You" TV series.


I was just setting up autotune for a link to the wikipedia for punchline (that they actually talked about punchlines in the comment was a truly blessed coincidence) but instead I would like to thank you for introducing me to the Look Around You series. It looks fantastically funny.


I didn't include it specifically because I don't come to HN for funny comments that make others laugh, that is what reddit is for. Different site, different purpose.


Can this technology eventually detect their heartbeat, or is it just looking at slower changes over time? If the latter it sounds much simpler to defeat, if the former that would have many repercussions.

Live heart rate by video analysis would make things like televised court proceedings, congressional hearings, and news interviews much more invasive. Elevated heart rate is a sign of stress, and it wouldn't be long before people were jumping to conclusions over whether someone was lying or hiding their true feelings/intentions.


This has actually been done before, a while ago: https://people.csail.mit.edu/mrub/vidmag/


Not sure if it's quite the same, but Google Fit has a feature that gets your respiratory rate from the selfie camera. They also have one where you put your finger on the camera flash and it uses that to see your bloodflow.

https://www.lifewire.com/measure-respiratory-and-heart-rates...


Very cool, thank you. I'm honestly surprised this dark magic hasn't been (ab)used yet, unless it has some strong limitations.


This was quite big news ~10 years ago... I think there are some patents involved, which makes the use of this technique difficult.


Correct, although those conclusions will be no more useful than existing ones based on facial expressions or lawyers' theatrics. The legal process can be just as stressful for an innocent defendant as a guilty/liable one, arguably more so.


Love the cat and mouse game!

So then this will be the next target of better deepfake models, right?

We saw that fake pharma tweet that (supposedly, but not really) sent the stock crashing – how long before a fake video of a CEO making an announcement at a fake Davos-like conference stage interview?

As a techie, is this going to make things like digital signatures more important? More realistically though, most of the audience that would do impulsive things won't care to verify.
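For what it's worth, a sketch of what signing at the source could look like (Python's cryptography package; the filename is hypothetical, and the hard part, key distribution and trust, isn't shown):

    import hashlib
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    private_key = Ed25519PrivateKey.generate()            # publisher's signing key
    public_key = private_key.public_key()                 # distributed to viewers somehow

    video_bytes = open("announcement.mp4", "rb").read()   # hypothetical file
    digest = hashlib.sha256(video_bytes).digest()
    signature = private_key.sign(digest)

    # A viewer with the publisher's public key can check integrity; this raises
    # cryptography.exceptions.InvalidSignature if the file was altered.
    public_key.verify(signature, digest)

Of course, this only proves who published the file, not that its contents are true.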


HN story a month from now: new deepfake software can evade Intel's detector.


If the filter doesn't notice blood flow on a non-deepfaked subject, run.


Or we find out that certain population groups have different blood flow patterns, which the system incorrectly identifies as proof of fakery. Or perhaps for some, it's simply not detectable even though they are real live people.


Yeah, neither deepfakes nor deepfake detectors will end epistemology. We'll need to use a multiplicity of tools, with strengths and weaknesses known and unknown, and come to a conclusion based on the preponderance of evidence knowing full well we will sometimes get it wrong.


Or we find out some people have dark skin and the blood flow isn't visible to the camera in these situations.


That's a great observation for making deep fakes more realistic.

I often think about subtleties that throw us off a little; too bad that disclosing this subtlety reduces the ability to discern.


I'm sure that one day, most of the things we'll see or hear on the web will be filtered by this kind of software.


Not likely IMO, the arms race will continue.

Plus, are you sure you’re eager to sign up for even more censorship-by-opaque-algorithm?


I'm not eager, but I do think it's inevitable.

That said, there are some images I wish I'd never seen.

If I could be sure it was only being used for good (by my definition of the word), that would make me eager to install a magical perfect opaque filter algorithm.

But it won't be perfect, and it won't only be used for what I consider to be good.


Well this headline is equally comforting and terrifying.


What about text?


enhance blood flow in 3... 2.. 1.


...for now


For now…


For now.



