Algorithm allows video editors to modify talking-head videos as if editing text (stanford.edu)
558 points by jonbaer on June 6, 2019 | 202 comments

You know, I thought that deepfake videos would be politically weaponized when I first heard about them. However, after doing more thinking on this, we have had photoshop for 30 years already! We see photoshopped images all the time and while some people can be fooled, many others remain skeptical of an image then try to verify it hasn't been altered. I don't think photoshopping has really been a big problem yet, which makes me think that deepfakes won't be one either because it is fundamentally the same kind of deception but in video form.

All of these things make it easier to mass-produce bullshit at low cost.

I'm pretty sure I know people who have been convinced by meme quotes. A headshot of a politician they don't like, with text overlaid, which they never said. People are outraged! And never bother to inspect the source.

Anything that makes it easier to lie about what someone said or did, or makes it harder to disprove... They're all politically weaponized, already.

Look at the "drunk pelosi" video.

Isn't the drunk pelosi video literally just slowed down audio? If anything, it proves that we don't really need these insanely advanced machine learning techniques to make bullshit. Something as basic as slowing down a video 20% will do just fine in misleading people.

The thing is deepfakes don't really make this easier. They are a lot of work to produce and ultimately aren't a whole lot more effective than a good old fashioned photoshop or crappy meme.

They're only a lot of work as long as the tooling is difficult to use, not accessible and produces output that doesn't look authentic.

30 years ago, that might have been the case with doctoring images, now practically everyone has a personal computer that they can install and use PhotoShop or some similar tool on.

The research demonstrations I've seen are sufficiently terrifying. I believe we'll have something like http://www.xtranormal.com/ for major political figures within two years, producing deepfakes that are sufficiently realistic that I have several relatives who will be tricked by them. Do you not know people who will be fooled?

People's eyes and ears may be fooled by a video but as this capability becomes widespread, which it certainly will, I'm not so sure that many people will be deceived in the long-run.

Technology is evolving but not in a vacuum, society's reactions also evolve in response. Today, many people interpret video to be "evidence" but those same people can interpret a photo to be a "claim" or perhaps some form of lower-confidence indication. Before photo manipulation was commonly known, I think photos were in a similar place as video - more trusted. Based on history, it's reasonable to expect video may follow a similar trajectory as photos to becoming less trusted in situations where it matters.

So, what happens when media types which were previously more trusted as evidence become less trusted? The same things that happened with print, audio and photos. Viewers will evaluate external cues such as the reputation of the publisher and corroborating evidence. The leading indicators that we should suspect deception will likely be similar. For example, how divergent the behavior depicted is from expectation, how contentious the surrounding context is and the existence of parties with an interest in creating such a deception.

This effect already happens with manipulation of intent through tricky video editing, for example deleting the rest of a reply to a question or even swapping in an alternate question. In the last decade I'd say the typical person is far more aware this is possible.

So, in the near-term there may be some successful deception but in the long-term I expect the potential value of creating such deceptions will diminish and we'll arrive at a new "normal" much like we have now. The biggest long-term impact may be false claims of "doctored video!" from those who were actually caught on video doing something they didn't want seen by others. But as we already see now, those pre-disposed to believe whatever is shown is false will search for indications it's doctored. Those pre-disposed to believe whatever is shown is true will search for indications it's just more confirmation of what they already suspected. Either way, the existing reputation of the person shown, the distribution source and the pre-existing knowledge of viewers will likely be more determinative than the media itself.

What you're saying is that video will cease to be a useful tool for exposing flawed-yet-entrenched viewpoints for what they are. If you have any idea of the role expository media has played in civil rights and anti-war efforts, this should terrify.

Being aware that something might be fake and actually not getting influenced by it are not the same though. Particularly when it's reinforcing an existing belief, but it already starts when being a little gullible provides more entertainment value than being sceptical.

It's hard to predict actual effects. I don't think anyone could have foreseen that the primary use of still image editing for manipulation would be not the perfect crime of an elaborate fake but a barrage of provocatively simple memes that don't even pretend to care about believability. The act of sharing is the message.

Viewers' pre-existing knowledge has to be formed somehow. Expect echo chamber effects to get even stronger.

I'm not sure that's quite true, as video is more likely to be cited or redistributed by journalists as a primary source, for example, where memes would not be. The age of credible video is at an end.

They are not a lot of work to produce unless you're a GPU. A sophisticated audience might not be impressed by video alone but seek correlation from witnesses, the date and time of the alleged event etc., but a well-timed lie is often enough to swing an election or trigger a political crisis.

Yes, but this could be radically different in 3 years, which isn’t enough time for one election cycle let alone the time it takes for society to iterate towards a solution. This tech is moving much faster than society generally acclimates to things.

Politicians say outrageous things all of the time. The ones that don’t, usually have trouble raising money much less winning elections.

Do these transformations leave any discrepancy or signature in the video or audio that would be detectable by a machine? (So, tiny, tiny discrepancies might work.) Someone could make a browser plugin to alert the user when video/audio has a good chance of being fake.

If software can identify it as fake, another can be improved till that isn't the case anymore. This is actually being used, search Generative Adversarial Network for more info and background.

This may actually make brand names in media valuable again.

> I'm pretty sure I know people who have been convinced by meme quotes. A headshot of a politician they don't like, with text overlaid, which they never said. People are outraged! And never bother to inspect the source.

I really wonder which type of meme has the most influence on average, straightforward and outright lies like the one you've noted, or the more subtle, subversive social commentary style. I'm a big fan of the latter, I think they're very interesting and underappreciated.

For example, this one - nothing more than a simple screenshot of Twitter, but to me this seems very persuasive: http://magaimg.net/img/80rb.jpg

Pointing out hypocritical grandstanding: https://i.redd.it/74n4exoy2g131.jpg

Media bias: https://i.redd.it/gtrq4xgemtx21.jpg

If Biden runs will be interesting to see how much airplay this meme will get: https://i.imgur.com/tqzGS6E.mp4

Laughing at the silliness of popular narratives: https://i.redd.it/a9wrex3ijc231.jpg

Just for a laugh: https://i.redd.it/i5bvml5m92131.jpg

Laughing at logical inconsistency: https://i.redd.it/rp9ydf0s88y21.jpg

An interesting way of looking at Brexit: https://i.redd.it/cs6o72p3dp031.png

Historical hypocrisy on border control: https://i.redd.it/983sm9wjley21.png

All of these are from t_d so obviously one-sided, I'm sure a similarly impressive collection from the other perspective could easily be assembled, and it's not that uncommon to encounter otherwise intelligent people who have obviously had their beliefs shaped by those memes.

I remember thinking that computer networks would connect people and make the world a better place.

I'm now just about ready to unplug the whole thing and launch it at the sun.

To your point, there are people who are persuaded by assertions.

I personally find it gets even more entrenched when people believe they have seen the evidence with their own eyes. When they see a doctored photo, clip out of context, etc.

I'm reminded of the gun / water context photo:


It is pretty hard to be optimistic about this whole mess sometimes, but then on the other hand, just as negativity and hate spreads so quickly, might it be possible for positivity and love to do the same, some day? I think so.

Oh ya I love that picture, was trying to find it not that long ago with no luck. It does a brilliant job communicating how powerful propaganda can be.

Downvotes on hope for the spread of positivity and love? Meme magic is powerful indeed.

Tacking a disclaimer on the end isn't sufficient to justify posting so much flamebait.

You do realize the topic of discussion in this thread, right?

Just as you realize I was suggesting an oversupply of examples.

Do you think you have been affected by memes, and if so to what degree?

Yeah the bigger problem is taking video footage of a politician, and then going frame by frame to find the most unflattering possible depiction of them (usually right after a cough or a sneeze) so you can use it to "support" your trash click bait headline. No deep fakes needed - you can make anyone look like a raving lunatic if you take the frame right before a sneeze.

The problem with deep fakes is a hostile nation taking over your phone/facetime calls and sounding exactly like your own parents. How can you tell if they can do it in real time? A bad actor could get you to do some really bad things.

That is an interesting point, but this already happens on voice calls. My grandma got a call a while ago from someone claiming to be my brother. Said he needed money because he was in a South American jail or something. Luckily she's still pretty sharp, so she hung up and called my brother (he was not in South America) so the ruse was up. She was pretty shaken up though. A video call would be more convincing but only an incremental, not fundamental, difference.

My mother received a similar call, telling her that I'd had an accident.

If you're thinking such evil people should be in jail, don't worry, they are! It's they who are in South American prisons using burners or stolen phones.

That's not a prisoner wardialing on a smuggled phone, it's a well-known scam targeting seniors: https://www.aarp.org/money/scams-fraud/info-2018/grandparent...

There's nothing in that link that contradicts what I wrote. My source was an official police alert. I lost the reference, but here is another one from a newspaper (in Spanish):


Edit: that was from Argentina, another one in Chile:


Could be, but I think "official police alerts" are actually one of the least reliable sources of information.

"Could be" on sources, scare quotes, stay classy.

Maybe PGP or some other form of cryptographic signing will become (more) mainstream as a result of this. Or at the very least a secret word that families can share amongst each other to verify identities.

Cryptographic signing just moves the problem from the authenticity of the document to the authenticity of the key.

That can be very useful when you only have to establish trust once, but that's not really the problem described here. The secret word is probably more useful because it is so simple; however, it still has to be established beforehand.
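For what it's worth, a shared family secret can be checked without ever saying it out loud on a possibly faked call. Here is a minimal sketch of a challenge-response check, assuming the secret was agreed on in person beforehand. The function names and the example secret are made up for illustration, and this only covers the "prove you know the word" step, not how the two sides would exchange the values during a call.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Return a fresh random nonce so an old response cannot be replayed."""
    return secrets.token_bytes(16)


def respond(shared_secret: bytes, challenge: bytes) -> bytes:
    """Prove knowledge of the secret by keying an HMAC over the challenge."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).digest()


def verify(shared_secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Recompute the expected response and compare in constant time."""
    expected = respond(shared_secret, challenge)
    return hmac.compare_digest(expected, response)


# Hypothetical secret, agreed on face to face.
family_secret = b"purple giraffe 1987"

challenge = make_challenge()
genuine = respond(family_secret, challenge)    # the real relative
impostor = respond(b"wrong guess", challenge)  # someone who only has a cloned voice

print(verify(family_secret, challenge, genuine))   # True
print(verify(family_secret, challenge, impostor))  # False
```

Because the challenge is random each time, recording one successful answer does not help an attacker on the next call. As noted above, none of this works unless the secret was set up ahead of time.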

Photoshopping is done by hand and generally has mistakes. Good photoshops are still believed.

Videos are being altered by machines. They’re being optimized for natural looking results. It’s harder to notice small mistakes when frames are going by at 24FPS vs poring over a static image for 30 seconds until you finally notice the one region with mismatched shadows or odd clipping.

I've literally never been able to identify a photoshopped image (except for immediately obvious work), without someone first pointing it out. I have a feeling that the percentage of the population that can spot edited images is in the single digits.

As someone who usually spots the Photoshop, I have some trouble assessing the danger of such techniques; thanks for bringing your input. I feel that someone who sees “through” a doctored piece of information will not even realize the power it may have had to others who might not participate in the conversation but end up part of the bubble nonetheless. By the time a fake has been debunked it's too late—but what's the alternative, censorship?

Not censorship, context. Censorship is disrespectful of the reader; it assumes the authority knows better. Context is respectful; it assumes you will make the best decision (for you) when you have all the relevant information.

So don’t delete the fake video. Put a big red exclamation mark next to it that says “this video has been substantially manipulated. Contents may not be genuine.”

Also: while viewers and producers both deserve the same respect, producers can forfeit theirs by consistently failing to respect their viewers. A consistent pattern of intentional deception should earn a shadow ban.

On the bright side, video gives a lot more opportunities to make small mistakes for conspiracy theorists, er, fact-checkers to find!

"One person's freedom fighter is another's terrorist."

Yes, but you can pore over each frame of a video too, if you’re taking the time to study it the way you suggest someone would study a photo.

Mostly people seem to be sceptical of things they already don’t believe. If someone repeats something you believe, how much research are you going to do? So photoshopped images that reinforce your beliefs slip by, and the ones that challenge you, you catch. Or worse, the ones that challenge you get labeled as photoshopped regardless of their provenance.

You don't think that photoshopping has really been a problem and you think people remain skeptical?

I guess if you think that epidemics regarding images of male/female body image, body dysmorphia, self-harm, anxiety, and using celebrities to sell products aren't connected to it, but I've found exactly the opposite.

I love photography and I am utterly unable to talk to non-photographers or convince them about what happens in the production of most images they see in most forms of commercial media.

It goes something like this:

"Hey ACowAdonis, how much of that photo do you think was retouched?"

looks at photo

"All of it".

"All of it? What do you mean?"

"I mean all of it."

"But that's Reese Witherspoon! (or insert popular celebrity here)"

"Yep, and you can see how her eyes have been adjusted, her skin's been adjusted, they've changed the shape of her arm, taken a few pounds off the mid section, increased the boob size, changed the colour of her hair...and I'm pretty sure that's not her hand".

"Nah, you crazy..."

"You want crazy...pretty much every photo in every fashion magazine and every media item involving that celebrity has been adjusted to a similar extent"

"Nah mate, you're having me on. You're nuts."

The best way to fool someone isn't to do an indistinguishable Photoshop job. It's to do a passable-enough fake of something the person wanted to believe anyway.

I see this technology as no different.

Although in most cases doctoring an image isn't even necessary, you just need to put it in a misleading context.

Here's how it will work, someone will make a deepfake of a political opponent and then publish it on a forum where like minded people gather using a dummy account.

Other dummy accounts will take the deepfake and start making a narrative around it, sending chain emails to their real world contacts.

Real world contacts will start passing around deepfake chain mail they were sent.

Some of these emails will take the deepfake as true, some will talk of it as being a funny parody, but "funny because it's true" anyway.

Major news organizations can now address the issue as news because people are passing it around. Maybe it will be something like: "Well Bob, I think the X have a real image problem on their hands, whether the video is true or not." "You don't mean to say you think it's true!?" "I didn't say that, Bob. I'm frankly not qualified to judge and I haven't done any research. What I'm worried about here is that there is a perception that it is true, or, if it is not exactly true in this particular instance, that it might be true, and that is what I mean by a real image problem."

The problem is that just like "fake news", deep fakes will provide alibis for people caught doing real crimes.

And their fanbases will listen because it's easier than accepting their idol could be a bad actor.

The REAL issue here will not be the fake videos themselves. They will cause many messes, but the real issue is an acceleration of what we see today: a loss of trust in information and in particular the established media. More societal rift, easy to dismiss any negative news about your favourite politician/rapper/.. as fake video; more difficult court cases even where there's video evidence, ..... Terrifying.

>a loss of trust in information and in particular the established media.

Yeah, the trust is not coming back, regardless of fake videos.

This scepticism is itself a problem. There's a whole branch of philosophy that claims that the objective truth is impossible to know. With deepfakes that's even more true, and might drive a lot of people into despair and apathy.

We'll still have provenance and trusted organizations, which is what we mainly use to verify important and easily faked things like written words, photos, and videos that might be taken from a different context to what their descriptions claim. There are other techniques people have developed to verify things too, like the group recitals that transmitted the Old Testament orally for multiple generations. You couldn't just edit it and repeat a fake version because the change would have conflicted with the consensus.

Society survived before videos and photos, when all information was easily edited. I think we'll be fine. Maybe we're just in a brief decade or two where we became complacent at believing all videos were real without taking any of the care that we used to take with grainy films of alien autopsies or spoken testimonies of people who claimed to have seen bigfoot.

> We'll still have provenance and trusted organizations

Oh boy are you in for a rude awakening. It doesn't matter what smart people believe if enough stupid people are convinced of something else. You're using examples from times of very low information distribution to form expectations about the opposite condition, which is already problematic, and ignoring all nonsensical superstition that used to be the norm.

When I was growing up I remember a minor local mania over a supposed miracle at a religious shrine which became a summer sensation. People were chartering tour buses to say prayers and hoping to witness a miracle themselves. Right now in the US we have a community of people who have been enthusiastically chanting at political rallies about locking their opponents up for the last 3 years without any apparent care for evidence or factual basis. Obviously political rallies are known for their hyperbole but at some point you have to feed the beast.

Oh, I've got no hope for the majority of people. They're a lost cause - they still believe in religions! They don't need videos to convince them because even rumors will do. I'm thinking of at least casually critical people or courts, those who have some interest in what's true.

Exactly, I have shown my friends media of their favorite politician contradicting himself and their response is fake news

I don't think that fakes (photoshop, deepfakes, whatever) need to be absolutely believable to be effective. The long-game purpose is to erode trust in institutions, media, politicians, etc. Fakes accomplish this goal by being just believable and just frequent enough that more and more people start deciding to believe whatever it is they want to believe because "who knows what the real truth is!".

Audio-video is much more stimulating and convincing because it mimics real life more than still images.

Yes, I realize that and photoshopping got a bunch of press for being used on modelling pictures to "enhance" the models before putting them in magazines. These deepfakes will probably be used to do some other similar things as well. My point was that deepfakes aren't anything new and we already have the tools to analyze them. Those tools just aren't computer programs, they are people posting videos on youtube going pixel-by-pixel to show how a certain photo was doctored. After all, a video is just a series of photographs.

I think writing computer programs designed to spot these deepfake videos would be very helpful as the volume of doctored videos increases, but this isn't some disruptive technology (at least for people trying to deceive others).

You're kidding yourself. A large majority of Americans still believe Sarah Palin said she could see Russia from her house, and that was SNL satire. It will be very difficult to undo the damage done by a convincing and well-timed deep fake. Especially a fake that people want to believe.

I feel like we just need some comedians to make a bunch of entertaining, realistic but labelled-fake content with deepfakes. Production cost of making a show where each actor is deepfaked into a world leader or dead historical figure or whatnot is not a significant hurdle.

There is The Fakening channel on YouTube that does this - clearly labeled fake: https://m.youtube.com/channel/UC5D-8hVVwLB0DNrcSBqoVxg

Like this Jordan Peele / Obama PSA?


That was a good start!

I agree, it was fake news articles on Facebook that spread false information but it took desperately ignorant people to believe it for the consequent chaos to ensue. So while deepfake videos are scary, I think what we have to really worry about is the deep ignorance of the voters.

Isn't the deep ignorance of the voters the entire point of a deep fake? It's knowingly a fake, so by creating it you are already trying to pull one over on someone. When the president of the US can say "I never said that" even with video/audio evidence of him saying it while also screaming about deep fakes, the slippery slope is being greased. It's not hard to believe that viewers that only get information from a single source will fall for it. Even if they hear arguments it is fake, they will not research it on their own because their single source is never wrong.

From my observation I believe that a lot of educated and less educated people don’t really care if something is fake as long as it confirms their opinions. They don’t want to let go of their “facts” although they know they are wrong.

We’ve come to learn that images and audio can’t be trusted. It was inevitable that video would fall as a trusted source.

Next up: the person you’re sitting across from at dinner.

Some people already experience this: https://en.wikipedia.org/wiki/Capgras_delusion

Totally agreed.

Writers have been able to write nonsense for a long time... and photo manipulation we've gotten quite used to. All we do is add video to the category of things that might be lies, and so need independent verification.

Skepticism is good and healthy, and verification in the age of Google isn't that hard.

You can trust that if the NY Times or CBS publishes a video, they verified its authenticity, or else will be publishing a big retraction within a few days that will also make the news because it's so rare.

Whether your uncle sends you a random photo or a video of a politician that seems too exaggerated or weird or unbelievable... you assume it might be manipulated... as you already do now. Making Nancy Pelosi seem drunk didn't take a deepfake, just slowing it down.

It's not any kind of big change. Just applying the same skepticism we already automatically apply to so many other things.

> You can trust that if the NY Times or CBS publishes a video, they verified its authenticity, or else will be publishing a big retraction within a few days that will also make the news because it's so rare.

This may be, or become false, due to political motivation to seriously damage the "other side of the aisle".

In fact, often times you don’t even need to lie to skew the “truth”. Cherry picking facts or even just highlighting certain facts over others, plus an optional bit of extrapolation or subtle misinterpretation, is often enough to fit whatever narrative you want to push.

> and verification in the age of Google isn't that hard.

It’s hard because publications often parrot each other. You walk away confident of your “verified” truth due to echo chamber effect, which might be worse than not verifying at all.

> You can trust that if the NY Times or CBS publishes a video...

I can’t. Again, you don’t need to make factual mistakes to push an agenda.

I remember there was some oil companies-backed anti-Tesla propaganda image a while ago showing the "environmental disaster a lithium mine creates," which went viral for a bit. It was, I think, a tar sands mine.

There's no way deepfake videos won't make the propaganda situation worse, at least for a while.

There are also many useful tools for the skeptic with regard to edited images: http://fotoforensics.com/

It will be interesting to see if such tools are developed for video as well.

People know about photo manipulations and get suspicious because we've had Photoshop for thirty years (and analog photomanipulation even longer) and see them all the time. This wasn't always true. When photography was new, manipulations that wouldn't fool anyone today were taken as proof by many people. See for example https://en.wikipedia.org/wiki/Cottingley_Fairies

Photoshop has been used for years to successfully fool millions of men and women that consume magazines showing people with smooth skin and sexy bodies.

This is exactly what I wanted to post. People are becoming insecure about their bodies because of fake images of famous people. It creates high expectations that can never be met in real life and seriously ruins lives.

I think the problem is that if we can’t trust video then there is really nothing visual left we can trust. Until now you could at least trust video recordings to some degree. Not sure if that’s a good thing or not.

Not being able to believe realistic looking video is in itself a problem. People are simply not going to believe news anymore.

God forbid media outlets are forced to rely on reputation.

Doesn't a lot of the "fake news" use photoshop?

We’ve had photo manipulation without the need of a darkroom or skilled optical retouchers for 30 years, but weaponized photo manipulation has been a thing since Stalin’s censors airbrushed out Trotsky a century ago.

Yes, this is true. However, good quality photo editing became relatively cheap and convincing in more recent years. So now you don't need to be a nation-state to be able to convincingly pull something like that off. You can just be some guy in a basement with $1000 worth of computing gear and experience using photoshop.

Video is way worse, because it's connected to audio. Which means "did you hear that? He just said he wants to go to war...".

I find it worrying that only now people start to be sceptical about visual information. There is a huge difference between the real world and the framed and curated view a photographer or documentarist gives you. If you go to school to learn about this stuff it's mostly about how to convey your view through these tools and the ethical implications of it.

I think people vastly underestimate how much editing and framing change the perceived truth of what happened. It is more subtle than manipulating the contents of video, but I think it can be in many ways more effective as most of this stuff bypasses your cognition and is not straight up lying.

It feels the same as in written news changing the quote vs. changing text around the quote.

I think we would be better off looking at video like it was a picture drawn or text written by someone. It's an artistic rendition of the events.

I completely agree. The hardest part of returning from military operations overseas was that Americans inevitably had their views shaped by one of two prevailing video viewpoints. Trying to discuss what happens over there in any meaningful way with someone who hasn't been inevitably devolves into an emotional reaction (usually roughly pro or against) instead of a conversation.

Every news is to some degree manipulated.

Together with a friend I was one of the geocaching and confluencing pioneers in Germany.

Some large papers and TV stations reported on this "phenomenon" and wanted to make an article/documentary about us. Every one of them came with a story in their head which we had to fill with our pictures and quotes. No one was interested in "reality". For a news clip we had to shoot situations several times, I remember leaving a house 5 times until the shot was done. Up until then I thought news would be unstaged.

the moment they essentially ask you to cooperate in staging a scene, why would you cooperate?

I was young and needed the money ;-)

This [1] image is a perfect embodiment of everything you're saying. But I'd add that I think there's a dissonance in our belief in other people's naivete and the actual numbers. In particular trust in media has fallen off a cliff. Only 31% of people trust mass media "to report the news fully, accurately and fairly" at least a fair amount. [2] If you restrict the question to those 18-49 years old, it's 26%. People may not be able to precisely point to why they feel this way, but it's clear that people have become deeply skeptical of media in general. The reason this is relevant is because media, in turn, relies heavily on what you're talking about - images framed to convey a narrative.

I think what we really take issue with is something related but different. And that is people voluntarily believing things that confirm their biases, while ignoring or even denying things that challenge them. Pew did a nice piece on that here [3]. These [4] are just their poll questions, which are quite interesting. For instance, "spending on social security, medicare, and medicaid make up the largest portion of the US federal budget" - 41% of Americans incorrectly labeled that as an opinion. And of those that labeled it an opinion, 82% further incorrectly labeled it as false. Going the other direction, "increasing the federal minimum wage to $15 an hour is essential for the health of the US economy" - 26% of Americans incorrectly classified this as a factual statement, and of those 83% claimed it was accurate. The same is of course true on both sides of the aisle, with the questions affirming as such.

People have a tendency of believing what they want to be true, while challenging (often quite aggressively) everything that goes against that, or even simply denying it. This can create the perception of a naive people being misled by malicious actors, but I think reality is that people tend to pick the views that they want to be true often for entirely subjective reasons that cannot be clearly qualified, and then work to find evidence to support that.

[1] - https://www.washingtonian.com/wp-content/uploads/2017/01/gar...

[2] - https://news.gallup.com/poll/195542/americans-trust-mass-med...

[3] - https://www.journalism.org/2018/06/18/distinguishing-between...

[4] - https://www.journalism.org/2018/06/18/distinguishing-between...

The danger is not in false positives, but false negatives. The very existence of this kind of thing erodes trust and sows paranoia.

A simple morph cut in a John Pilger interview of Assange made a sizeable portion of nutjobs believe Assange has been long dead. Don't think this kind of behaviour can't eventually extend to the mainstream.

I agree. This technology will give everyone plausible deniability.

It's the slow erosion of video evidence being trustworthy.

Cryptographic signing of video footage is a useful blockchain application, but it will presumably be subject to the same flaws as domain security certificates.

If you can sign a video then you can sign a doctored video. If it's only your camera that can sign the video, not you directly, don't be fooled that it will be possible to protect the private keys in the camera from extraction.

>If you can sign a video then you can sign a doctored video.

I don't understand your point, you wouldn't sign a doctored video unless you wanted to do so. It's entirely possible to apply the principles of PGP to video.

>If it's only your camera that can sign the video, not you directly, don't be fooled that it will be possible to protect the private keys in the camera from extraction.

I doubt this would be the solution on which the world settles.

If you can sign anything then what purpose does a signature serve?

You only sign what you want to sign.

When you publish a video of you speaking at a public event, you sign it. When someone else publishes a doctored video of you, you do not sign it.

This alone doesn't protect you in the case that someone speaks and then later intentionally doctors and signs the video in order to change what they said. In this case trusted third parties (e.g. news organizations) could sign videos as well. A set of signatures taken together can provide trust.
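To make the "set of signatures" idea concrete, here is a rough sketch. It swaps PGP for a Lamport one-time signature purely so that it runs on the Python standard library alone; the function names (`keygen`, `sign`, `verify`, `endorsed_by`) and the speaker/newsroom scenario are made up for illustration, and a real system would presumably use an established scheme such as PGP or Ed25519 instead.

```python
import hashlib
import secrets

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def _bits(message: bytes):
    """Yield the 256 bits of SHA-256(message), most significant bit first."""
    for byte in _h(message):
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1

def keygen():
    """Return (private, public): 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public

def sign(private, message: bytes):
    """Reveal one secret per digest bit. Each key may sign only ONE message."""
    return [pair[bit] for pair, bit in zip(private, _bits(message))]

def verify(public, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    if len(signature) != 256:
        return False
    return all(_h(sig) == pair[bit]
               for sig, pair, bit in zip(signature, public, _bits(message)))

# Hypothetical scenario: the speaker and a news organization sign the original.
original = b"raw bytes of the original video"
doctored = b"raw bytes of a doctored video"

speaker_priv, speaker_pub = keygen()
news_priv, news_pub = keygen()
signatures = {
    "speaker": (speaker_pub, sign(speaker_priv, original)),
    "newsroom": (news_pub, sign(news_priv, original)),
}

def endorsed_by(video: bytes):
    """Names of signers whose signature verifies for these exact bytes."""
    return sorted(name for name, (pub, sig) in signatures.items()
                  if verify(pub, video, sig))

print(endorsed_by(original))   # both signers
print(endorsed_by(doctored))   # nobody
```

The signatures say nothing about whether the footage is true, only about who vouches for these exact bytes, which is the point made above.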

That's not how this works though. If I want to post a video claiming someone did something, I sign it, but what does this prove? Nothing.

Well yeah, because you aren't the subject of the video and you have no reputation.

You wouldn't think a letter from your mom is from your mom unless your mom signed it.

It's not supposed to validate the truthfulness of the content, why would you think that was the purpose?

This series of words masquerading as a sentence makes zero sense.

What would signing video prove? Who signs it? Who controls the signing keys? How would blockchain, in any form, help here, in any capacity whatsoever?

Signing video would prove it existed in that form at a particular time. Derivative instances could be linked back to the original. Every modification is like a transaction. You would always be able to follow a piece of media back to its original published source. Just imagine any given media stream as an edit decision list containing clips, each of which points back to an asset, each asset being a piece of video with a date of publication.

The asset isn't a single entity in one place; it can be distributed via IPFS or whatever. The earliest known version of a thing is the canonical one for practical purposes. In this view blockchain isn't producing coins for hoarding, but tags for people to locate things on the public graph.
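The "every modification is like a transaction" picture can be sketched as a hash-linked list of records. This is only an illustration: the `Record` fields and the `publish` and `trace_to_origin` helpers are hypothetical names, an in-memory dict stands in for whatever ledger or IPFS-style store would actually hold the entries, and the timestamps are plain integers.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass(frozen=True)
class Record:
    content_hash: str          # SHA-256 of the media bytes
    parent_id: Optional[str]   # id of the record this was derived from, if any
    published_at: int          # publication time (arbitrary integer here)

    @property
    def record_id(self) -> str:
        payload = f"{self.content_hash}|{self.parent_id}|{self.published_at}"
        return hashlib.sha256(payload.encode()).hexdigest()

ledger: Dict[str, Record] = {}   # stand-in for a public, append-only store

def publish(media: bytes, published_at: int,
            parent: Optional[Record] = None) -> Record:
    """Register a piece of media, optionally as a derivative of `parent`."""
    record = Record(hashlib.sha256(media).hexdigest(),
                    parent.record_id if parent else None,
                    published_at)
    ledger[record.record_id] = record
    return record

def trace_to_origin(record: Record) -> List[Record]:
    """Follow parent links back to the earliest published source."""
    chain = [record]
    while chain[-1].parent_id is not None:
        chain.append(ledger[chain[-1].parent_id])
    return chain

original = publish(b"full interview", published_at=100)
clip = publish(b"30 second clip", published_at=200, parent=original)
remix = publish(b"clip with captions", published_at=300, parent=clip)

chain = trace_to_origin(remix)
print([r.published_at for r in chain])   # [300, 200, 100]
```

Nothing here stops someone from publishing a derivative without declaring its parent; the sketch only shows how declared links could be followed.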

Why would you need a blockchain to cryptographically sign video footage?

Recording a hash of a video in the Bitcoin blockchain creates public evidence that the video existed at a specific point in time. If that point is very soon after the events portrayed in the video then it can increase confidence in the video's authenticity.
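For reference, the thing that would get recorded is just a digest of the file. A minimal sketch, assuming SHA-256 and a hypothetical helper named `video_digest`; how the digest actually gets embedded in a Bitcoin transaction is left out entirely.

```python
import hashlib
import os
import tempfile

def video_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo with a small temporary file standing in for real footage.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"pretend this is video data")
    demo_path = tmp.name

digest = video_digest(demo_path)
os.remove(demo_path)
print(digest)  # 64 hex characters
```

Publishing the digest only shows that a file with exactly these bytes existed by that time; any re-encode produces a different digest.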

Proves nothing about authenticity. Requires an internet connection which means in many cases it's completely impractical.

I think it would be more like prior art, to distinguish originals from copies rather than to authenticate the content itself. Given two pieces of similar-but-different video the one with the earliest timestamp is presumably the original.

The "Ethical concerns" section in the article feels like a punt. The author quoting "this technology is really about better storytelling" is aspirational -- the technology's story will be written by those who use it, and you can bet people will use this maliciously.

See this? This hand grenade I'm leaving in the schoolyard? This is about better games of hot potato.

And that's loads more self-aware than other researchers I've seen completely blindsided by obvious ethical questions at the end of their paper talks.

"Fixed film dialog without reshoots" and "better story telling" seemed rather compelling to me.

Perhaps the malicious use cases are more obvious due to how trustworthy a video can appear.

Forget deepfakes and such for a moment...

Think of the impact of this on dubbing movies between languages. This seems like an incredible tool.

Of course, we can’t just forget about deepfakes and such, but this particular usecase kind of excites me.

I've watched many dubbed movies and tv shows and the slightly-off lip movement never bothered me, you stop noticing it after a bit. It wouldn't be that big of an improvement.

Well, it bothered me. I never watch dubs - if possible. I watch subs instead.

I prefer the subs tho mainly because the voice acting in different languages is usually not great - movies’ original languages usually sound much more natural.

Well, it is if you are deaf.

What would we call such a category of movies? Vubs?

I suspect in the not too distant future we'll need a way to produce provably true videos. I'm thinking something like the subject, a politician giving a press conference for example, carries something that emits a signal that the cameras encode into the video in a way that any alterations could be detected, something like a cryptographic signature. I don't really know enough about cryptography to be sure how / if it would work.

I've been trying to think of a way to popularize digital signatures so when a video is purporting to be a CSPAN clip, you could check that the authorship is really CSPAN, but this always relies on trusting your sources, and who's to say CSPAN won't get hacked? Maybe they already were. [1]

I like your idea, it reminds me of something I stumbled upon reading about the 60Hz hum of AC electricity. [2]

"...this side effect has resulted in its use as a forensic tool. When a recording is made that captures audio near an AC appliance or socket, the hum is also incidentally recorded. The peaks of the hum repeat every AC cycle (every 20 ms for 50 Hz AC, or every 16.67 ms for 60 Hz AC). Any edit of the audio that is not a multiplication of the time between the peaks will distort the regularity, introducing a phase shift. A continuous wavelet transform analysis will show discontinuities that may tell if the audio has been cut."

[1] https://www.nytimes.com/2017/01/12/business/media/cspan-russ...

[2] https://en.wikipedia.org/wiki/Utility_frequency#Audible_nois...
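A toy version of the hum check quoted above, as a sketch only. It uses a synthetic, noise-free 50 Hz tone, an assumed 1000 Hz sample rate, and a per-window single-bin DFT rather than the continuous wavelet transform the quote mentions; `hum`, `window_phases` and `find_cuts` are names invented for the example.

```python
import cmath
import math

SAMPLE_RATE = 1000      # samples per second (assumed)
HUM_HZ = 50             # mains frequency (assumed)
WINDOW = 100            # samples per window = exactly 5 hum cycles

def hum(n_samples: int):
    """Synthetic mains hum with no noise and no speech on top."""
    return [math.sin(2 * math.pi * HUM_HZ * i / SAMPLE_RATE)
            for i in range(n_samples)]

def window_phases(signal):
    """Phase of the hum in each consecutive window (single-bin DFT)."""
    phases = []
    for begin in range(0, len(signal) - WINDOW + 1, WINDOW):
        acc = sum(signal[begin + k]
                  * cmath.exp(-2j * math.pi * HUM_HZ * k / SAMPLE_RATE)
                  for k in range(WINDOW))
        phases.append(cmath.phase(acc))
    return phases

def find_cuts(signal, threshold=0.5):
    """Indices of windows whose phase jumps relative to the previous window."""
    phases = window_phases(signal)
    cuts = []
    for i in range(1, len(phases)):
        # Wrap the phase difference into (-pi, pi] before comparing.
        diff = cmath.phase(cmath.exp(1j * (phases[i] - phases[i - 1])))
        if abs(diff) > threshold:
            cuts.append(i)
    return cuts

clean = hum(2000)
# Remove 7 ms (7 samples) at t = 1 s: not a multiple of the 20 ms period.
edited = clean[:1000] + clean[1007:]

print(find_cuts(clean))    # []
print(find_cuts(edited))   # [10]
```

A cut of exactly 20 ms (one full cycle) leaves the phase untouched and goes undetected here, which matches the caveat in the quote.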

More likely that trusted sources become more important. A politician can share their own stream of a speech.

Right... What about when someone posts a video of them saying a racial slur backstage.

The politician says it was doctored.

The poster says it is unedited.

How do you verify who is telling the truth.

Multiple adversarial journalists recording it, like we currently have, would protect against that.

They all wear those stupid flag pins, so why not have them flashing a digital signature in IR? This is almost so bizarre it's genius...

Or perhaps a return to analog, non-digital, older media such as paper film and video rolls will serve as "probably true".

I believe you already know this, hence the "probably" part, but such analog media can easily be produced from (modified) digital media, thus no credibility.

news agencies could just use an encrypted video stream, since only they have the private key you'd know it came from them and could be trusted as from them.

I had this idea that devices which record content like images or video should have an unforgeable key internal to their hardware, like we have with PGP / GPG. Content that comes from the device would be signed, and allow users to validate whether it originated unmodified from the hardware source.

Granted, derived content will fail validation, but it will motivate tracking down the original, until validation can be performed. Maybe you can take pictures of fake imagery printed onto large high-def paper, but at least you eliminate one stage in the process...

Honestly, we should not trust digital content these days.

> unforgeable key internal to their hardware, like we have with PGP / GPG

PGP involves a private key, and if you have the private key you can "forge" any message. If you put the key in hardware, it can be read by an adversary with access to a powerful microscope.

I mean how is this any different than the "secure enclave" on iphones or other forms of hardware security modules. Yes a sufficiently advanced adversary with an electron microscope could possibly extract the key but it still greatly raises the bar for 99% of other would be abusers.

Because the sort of actors who would try to forge videos are precisely the sort of actors who would have such advanced technology. The secure enclave on an iphone protects against say someone trying to convert a stolen phone into a stolen identity, not against nation-states.

what if the camera is connected to the internet, generates its own random key every 10 seconds, signs the new public key with the previous key, and a quorum of receiver citizens selected by sortition send their public keys, and the camera uses threshold cryptography to send each receiver their share of the secret frames for ~10 seconds.

The adversary would have to extract the key within 10 seconds without damaging the security envelope of the device, which can't be powered down. If such a camera is powered down, a replacement camera would need to be manufactured, sent, and installed (again by citizens selected through sortition) at the place where the malfunctioning / perturbed camera was. If the cameras cover each other (say cameras along both sides of a street such that a camera sees 2 or more other cameras) the perturber can be tracked, both where he came from, as well as where he went to...

But each device could have its own unique signing key, signed by a manufacturer master key that is not distributed. Then you could revoke just that one device if the need arises (or even a manufacturer). Isn't this a fairly solved problem in the TLS/CA space?

How would you know when to revoke a key?

You seem to know more than you are admitting via your responses here. Perhaps you can share your thoughts, rather than provide responses which defeat the discussion?

It's a genuine question, I don't have an argument for impossibility or anything, I just don't see how revocation could practically be done. In scenario A, Joe Schmo buys a camera, happens upon a powerful politician in a compromising situation, and uploads the video to YouTube with a digital signature. In scenario B, Joe Schmo buys a camera, actually works for the NSA, uses an electron microscope to extract the private key, and stamps a digital signature on a forged video, uploading that instead. How would any third party be able to differentiate between those? I could imagine potential avenues for solutions (e.g. maybe you could use quantum entanglement to make some sort of tamper-proof chamber around the key in hardware?) but then we're pretty far afield of straightforward PGP/TLS. Not to mention the problem that an adversary like the NSA could just get inside the ASIC fab and copy keys from the machine that prints them.

make a distinction between cameras for fun, and cameras for undercover investigative journalism by journalists or citizen journalists.

the first type of camera is the one we have today, the second type would be more expensive, need to stay connected to the group consensus protocol, and need to stay powered, so journalists will be lugging extra batteries, and the camera would have 2 battery ports for switchover...

Your logic here could be applied to any form of cryptographic signing. How do we know when to revoke an SSL key? Someone could be misusing them without us knowing. But sometimes we do know.

When the key signs some embarrassing video it gets revoked as compromised. Simple as that

> Content that comes from the device would be signed

Wonder if that is possible. One can always convert a picture to a 2-D array of RGB values, so signature can't be in the video (or image) file container. So it has to be a watermark of a kind. If the algorithm is known, it's interesting if the signature can be unforgeable. If algorithm isn't known, then other issues (like with DeCSS) can appear.

A signature doesn't have to be protected from malicious removal. If a bad actor removes it (or modifies it), then the video can no longer be validated and is considered compromised/unknown (which may be the same thing depending on how important validity is). DeCSS was removal of encryption, not signature, which is a whole different set of concerns and goals.

Alternatively, we need to see how well these algorithms can fake videos from different angles in a consistent manner. If we have enough people recording, it may just be enough to prove something is not a fake.

In that case, video editing software and equipment should also hold a private key, in order to trace the origin of any manipulation.

The whole chain of video production should be signed in order to trace filming and editing together, with all intermediate signatures from each stage of the production process contained in a final signature.

That may be a business opportunity or at least (maybe preferably) an interesting open source project.

You would need to sign a video with at least four keys: a public key from you, a private key from editor software, a public key from editor software, a public key from editor software generating the previous version of the video. And you would generate signatures for all public-private and private-private pairs and a seal value from a signed hash of those signatures, which is signed by a combination of two public keys in the scheme.

Could it be something even more low-tech? Kind of like those faint yellow dots that all printers must print. Maybe a distinct pattern of pixel coloration that a person wouldn’t even be aware of unless they were looking for it.

The algorithm which validates would need to be resistant to attacks, e.g., positioning of these pixels.

There are steganography techniques that may apply here, I wonder. Like adding a digital fingerprint in some way directly into the media itself.
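The simplest form of that idea is least-significant-bit embedding. A bare-bones sketch, assuming 8-bit samples in a flat list; the `embed` and `extract` helpers and the `cam-42` payload are invented for the example, and real watermarking schemes are considerably more involved.

```python
def embed(pixels, payload: bytes):
    """Hide payload bits in the least significant bit of each pixel value."""
    bits = [(byte >> shift) & 1
            for byte in payload
            for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this image")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit   # overwrite only the lowest bit
    return marked

def extract(pixels, n_bytes: int) -> bytes:
    """Read n_bytes back out of the least significant bits."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for p in pixels[8 * b: 8 * b + 8]:
            value = (value << 1) | (p & 1)
        out.append(value)
    return bytes(out)

image = [(37 * i) % 256 for i in range(64)]   # fake 8-bit grayscale samples
marked = embed(image, b"cam-42")

print(extract(marked, 6))                                 # b'cam-42'
print(max(abs(a - b) for a, b in zip(image, marked)))     # never more than 1
```

A mark like this is invisible to the eye but would not survive re-encoding, scaling, or cropping, so it does nothing for the robustness concern raised above.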

you might be interested in this discussion: https://news.ycombinator.com/item?id=16947652

Everyone seems to think of nefarious uses, but I can't wait for this tech to appear in video calls, combined with translation. This could enable two people without a common language to have a conversation while appearing to each other as native speakers of their respective languages.

Wow, that would be something else. I assume whoever gets that to market first would make a killing.

I imagine the most above-board use for something like this would be in scripted tv and movies. Basically an enhanced form of ADR (https://en.wikipedia.org/wiki/Dubbing_(filmmaking)#ADR/post-...). Of course, I anticipate plenty of nefarious uses.

A place where I don't think it will be used much is actual facing-the-camera-talking-head content. Something we have learned from YouTubers is that audiences don't care if there are discontinuous cuts during a monologue. YouTubers don't try to pretend they did it all in one take, and will happily edit their video as if editing text. The cuts are obvious in both the audio and video. And still it works.

Seems really cool but I wonder how well it will handle a case where you want to swap a phrase for a phrase, but have the new phrase have a "human specific" emphasis or variant to it.

Example: "That was a short trip" vs "That was a reaaaaaalllly long trip".

Language is so much more than words. When you deliver the variant message, your whole facial expression might change. So much would get lost if that doesn't carry over. Your facial expression and tone in that context also completely changes the meaning from you enjoyed the long trip to not enjoying it, but how can a machine know which one to pick.

It's strange to me that people are so concerned about these deep fakes when the National Enquirer has been around for so long. It's been easy to lie to people en masse for a while now. I don't think this changes the number of people who are open to these suggestions; I think people in general are smarter than a lot of people give them credit for.

On the flip side, it may become harder to convince people of the truth when there is such a convenient way to reject unwelcome video evidence. This could amplify echo chambers.

Science Fiction author Greg Egan wrote a novel called Distress[0] where the main character is a science journalist who makes documentaries. He uses software exactly like this. The book was published in 1995. It's a very good book and I highly recommended it and basically any other book written by Egan. (My personal favorite is probably "Diaspora" followed closely by "Permutation City".)

[0]: https://www.goodreads.com/book/show/19328253-distress

This tech allows the state or corporations to quietly adjust the historical record of their representatives words and statements to fit their ambitions at any given point.

How is that any different than the bullshit they do right now.

It’s not at all, that’s how you can be pretty sure it’s being done basically as soon as the tech gets invented.

All human evidence rests upon the shaky foundation of "because I believe it's true", at the bottom of which rests the shaky foundation of your personal experiences. Don't believe me? Just ask a schizophrenic how hard it is to disbelieve your own experience.

Reminds me of the H. P. Lovecraft short story "The Call of Cthulhu", Page 1, Paragraph 1:

> The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents. We live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far. The sciences, each straining in its own direction, have hitherto harmed us little; but some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.

Interested to see if counter-measures begin to be deployed in order to make a deep fake more difficult and buy time. Incorporating dynamic backgrounds and body gestures like touching one's face while talking.

Just imagine people using this tool to spread false claims and propaganda. Can we also determine if the video was actually edited or forged?

Their mouth movement wasn't particularly natural in the parts where the speech was edited.

It's good enough that it will fool some people on Facebook/Twitter, but it's pretty far from being able to stand up to any scrutiny.

Isn't that all it takes to spread propaganda? We're living in an age where if it feels right, it is, even if it's blatantly false.

I agree that their mouth movement wasn’t particularly natural, but I disagree that it isn’t enough to fool most of us if we’re unaware. We’re not fixing our eyes on a person’s mouth while they talk.

Think of the scenario where someone edits a few seconds of a 30-minute interview. They make the interviewee go from saying they hate drugs to saying they love drugs. Even if you weren’t expecting that claim from that person, would you go back and recheck their mouth movement, to be certain if it was edited or not? Unlikely. Even if you would, I’d wager most wouldn’t, including most of us that could detect it.

For propaganda, this will be great. The thing is, you don't even need convincing videos for propaganda to work.

For truly scary things, like falsifying evidence, I think it will be a while before this will get past expert analysis, or even a group of people on Reddit trying to prove it wrong.

In the long term, video will simply be treated like photos are now. With disbelief.

It can also be used to 'ratfuck', if a video is released with a politician saying a gaffe, they can release a bunch of similar but fake videos and then claim that 'all of them' are fake. The confusion would sow doubt- the technique could be used offensively as well as defensively.

The tech is only going to get better, faster, and easier to use. And that will happen incredibly quickly. Being able to spot artefacts in the output is not a viable defence.

> Just imagine people using this tool to spread false claims and propaganda.

That was done in The Expanse novels, I wonder if that is where people get the ideas to create things like this from.

> Can we also determine if the video was actually edited or forged?

Detecting fake videos is actively being researched and there are working methods available, governments are very much aware of the danger.

There will still be some lag between initial release and govt response. C.f. The War of the Worlds broadcast

Yikes, yeah. The War of the Worlds broadcast panic was an accident, too. Imagine an engineered panic on that scale. Especially if it was released through a trusted news source, by either hackery or social engineering.

Ironically, the current belief is that The War of the Worlds panic was greatly exaggerated: https://www.snopes.com/fact-check/war-of-the-worlds/

GANs can be used to detect this sort of thing if trained well enough. Any manipulation tool is going to leave some kind of signature. Eventually you'll end up in an algorithmic arms race, and you'll be able to fool some people, some of the time. (Pretty much the same as things have been since the Spanish-American War.)

Smaller economies are going to get screwed by this, though, as they won't have the resources to fight yet another technological battle.

How do we know it's not already in the wild? Ethical release of this would focus also on some way for easier forensic telltales that are hard to avoid.

Also, rule34 implications.

It'll be used to debunk true claims, too.

My thoughts exactly.

Running M-x dissociated-press on a Trump speech makes it still sound like a Trump speech.


At least they can’t synthesise the voice automatically. But pair this with Lyrebird.ai and you can basically just stop trusting all video right now.

Adobe was showing off some amazing tech for editing voices several years ago, called Adobe Voco


You can go to the official website of whoever is supposed to have said it, or a trusted news organization who recorded it and see if they publish the same video. Internet commenters will be quick to point out problems like that. In case it was recorded secretly or it's a crime occurring or whatever, then you should never have trusted it anyway because you don't know the context. Taking things out of context is routinely used by the news to fool people already. Stop uncritically believing whatever appears on your screen if you're concerned about what's true.

Where is the audio coming from? If that was computer manufactured that was pretty good because it sounded very natural

The video mentions that the audio was recorded separately, and shows a few other options like text-to-speech (which obviously doesn't match the voice) and some smarter voice matching audio generation (VoCo) which could pass for the original voice sent over heavily compressed, low-bandwidth video conferencing or something like that. I'm guessing that if this is used for actual disinformation, finding a voice actor/audio engineer to try and match the speech would be most effective.

Most of the examples in the video they had the subject record the audio separately from the generated video. In one or two of them they cite some audio-generating thing.

I actually love this ongoing cat and mouse game. I don't follow the events in this field keenly so I don't know if it exists, but the challenge is to find an antidote to this concoction, created by mad scientists just for the sake of science, that will be weaponized anytime now.

The algorithm demonstrated in this video, which moves the cat's mouth in real time to the narration, was written in JavaScript with HTML5 audio.


In regards to all the talk of not being able to verify your video source; perhaps we'll go back to film for things that need to be probably legitimate. Though perhaps that has the same issues.

Is there any open source software like this available anywhere ?

Looking forward to the autogenerated youtube videos about various topics. Perhaps they would be interesting to watch rather than a few images and a robotic voice.

This is gonna revolutionize the animation industry.

lead author's page with links to other researchers' pages, the paper itself, and nearly 200MB of supplementary materials: https://www.ohadf.com/projects/text-based-editing/

“Unfortunately, technologies like this will always attract bad actors,” --- uhmmm, what is the "good actors"?? I want to take video, and delete what you said and make you say something else. uhmmm, can't actually see the "good actor" point of that.

Making edits to scenes in movies during post production, cutting out mistakes in broadcasts... I'm sure video production professionals could easily come up with dozens of use cases.

It could be used to bleep movies for TV so that the actor's lips match the dubbed audio. As it is, you can easily tell what curses people are saying.

Well, now that scene in movies where the newscaster narrates the aftermath of an event can be done realistically without having to hire actors?

Read the article?

Was I the only one who thought of Jim Carrey in God almighty?

Bruce almighty?

Reality TV is probably going to have a field day with this

YouTube and other video sites need a framework for verifying that videos haven't been digitally produced, using steganography or keys injected into every frame. This software could be embedded in video recording devices (keys could be updated OTA so hard hacks don't matter). If videos haven't been "reality" verified then people can just enjoy them as fake/fiction works. Video editing/compression software would need to be aware of the location of the key bits and maintain them within each frame.

S*. It’s scary

people choose to believe things. The kind of people who will be duped want to be duped as it serves their own ideology. You can’t change that.

Now we need cameras that watermark (using steganography) original videos so they can be authenticated and a blockchain solution for registering originals. Video sharing sites will need to process all their uploads to check for modifications and serve the original file (not re-encoded) as well, for third party checking.


What triggered you to downvote me? Maybe the fact that I used the dirty word 'blockchain'? I just used it because it seems like an obvious application to certify and long-term preserve the originals.
