YouTube now requires creators to label their realistic-looking videos made using AI (blog.google)
882 points by marban 57 days ago | 485 comments



I think it's smart to start trying things here. This has infinite flaws, but from a business and learnings standpoint it's a step in the right direction. Over time we're going to both learn and decide what is and isn't important to designate as "AI" - Google's approach here at least breaks this into rules for which "AI" things are important to label:

• Makes a real person appear to say or do something they didn't say or do

• Alters footage of a real event or place

• Generates a realistic-looking scene that didn't actually occur

At the very least this will test each of these hypotheses, which we'll learn from and iterate on. I am curious to see the legal arguments that will inevitably kick up from each of these - is color correction altering footage of a real event or place? They explicitly say it isn't in the wider description, but what about beauty filters? If I have 16 video angles, and use photogrammetry / gaussian splatting / AI to generate a 17th, is that a realistic-looking scene that didn't actually occur? Do I need to have actually captured the photons themselves if I can be 99% sure my predictions of them are accurate?

So many flaws, but all early steps have flaws. At least it is a step.


One black hat thing I'm curious about though is whether or not this tag can be weaponized. If I upload a real event and tag it as AI, will it reduce user trust that the real event ever happened?


The AI tags are fundamentally useless. The premise is that it would prevent someone from misleading you into thinking that something happened when it didn't, but someone who wants to do that would just not tag it then.

Which is where the real abuse comes in: You post footage of a real event and they say it was AI, and ban you for it etc., because what actually happened is politically inconvenient.

And the only way to prevent that would be a reliable way to detect AI-generated content which, if it existed, would obviate any need to tag anything because then it could be automated.


I think you have it a bit backwards. If you want to publish pixels on a screen, there should be no assumption that they represent real events.

If you want to publish proof of an event, you should have some pixels on a screen along with some cryptographic signature from a device sensor that would necessitate at least a big corporation like Nikon / Sony / etc. being "in on it" to fake.

Also, since no one likes RAW footage, it should probably just be that you post your edited version, which may have "AI" upscaling / de-noising / motion-blur fixing etc., AND you can post a link to your cryptographically signed, verifiable RAW footage.

Of course there are still ways around that - your footage could just be a camera pointed at an 8k screen or something - but at least you create some serious hurdles and have a reasonable argument that the video is the result of photons bouncing off real objects and hitting your camera sensor.
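
To make that concrete, here is a minimal sketch of what sensor-level signing could look like, assuming an Ed25519 key held in a secure element on the sensor module (the key generation, frame bytes, and manufacturer key distribution below are placeholders for illustration, not any real camera API):

  import hashlib
  from cryptography.exceptions import InvalidSignature
  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

  # Placeholder: in a real camera this key would be provisioned into a secure
  # element at the factory and its public half published by the manufacturer.
  device_key = Ed25519PrivateKey.generate()
  device_pub = device_key.public_key()

  raw_frame = b"...sensor readout bytes..."    # stand-in for the RAW data
  digest = hashlib.sha256(raw_frame).digest()  # hash the captured frame
  signature = device_key.sign(digest)          # sensor signs the hash at capture time

  # Later, anyone holding the manufacturer-published public key can check it:
  try:
      device_pub.verify(signature, digest)
      print("frame was signed by this sensor")
  except InvalidSignature:
      print("signature does not match this frame")

Verification only tells you the bytes came out of a sensor holding that key; it says nothing about what was actually in front of the lens.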


> If you want to publish proof of an event, you should have some pixels on a screen along with some cryptographic signature from a device sensor that would necessitate at least a big corporation like Nikon / Sony / etc. being "in on it" to fake.

At which point nobody could verify anything that happened with any existing camera, including all past events as of today and all future events captured with any existing camera.

Then someone will publish a way to extract the key from some new camera model, both allowing anyone to forge anything by extracting a key and using it to sign whatever they want, and calling into question everything actually taken with that camera model/manufacturer.

Meanwhile cheap cameras will continue to be made that don't even support RAW, and people will capture real events with them because they were in hand when the events unexpectedly happened. Which is the most important use case because footage taken by a staff photographer at a large media company with a professional camera can already be authenticated by a big corporation, specifically the large media company.


also the three letter agencies (not just from the US) will have access to private keys of at least some manufacturers, allowing them to authenticate fake events and sow chaos by strategically leaking keys for cameras that recorded something they really don't like.


For all the folks that bash the United States for "reasons" this one gave me a chuckle. Our handling of privacy and data and such is absolute ass, but at least we *can* hide our data from big government with little repercussion in most cases (translation: you aren't actively being investigated for a crime that a judge isn't aware of)

Of course that says nothing about the issues of corruption of judges in the court system, but that is a "relatively" new issue that DOES absolutely need to be addressed.

(Shoot one could argue that the way certain folks are behaving right now is in itself unconstitutional and those folks should be booted)

Countries all over the world (EVEN IN EUROPE WITH THE GDPR) are a lot less "gracious" with anonymous communication. The UK, as an example, has been trying to outlaw private encryption for a while now, but there are worse examples from certain other countries. You can find them by examining their political systems; most (all? I did quite a bit of research, but also was not interested in spending a ton of time on this topic) are "conservative leaning".

Note that I'm not talking just about existing policy, but countries that are continually trying to enact new policy.

Just like the US has "guarantees" on free speech, the right to vote, etc., the world needs guaranteed access to freedom of speech, religion, the right to vote, healthcare, food, water, shelter, electricity, and medical care. I don't know of a single country in the world, including the US, that does anywhere close to a good job of that.

I'm actually hoping that Ukraine is given both the motive and opportunity to push the boundaries in that regard. If you've been following some of the policy stuff, it is a step in the right direction. I 100% know they won't even come close to getting the job done, but they are definitely moving in the right direction. I definitely do not support this war, but with all of the death and destruction, at least there is a tiny little pinprick of light...

...Even if a single country in the world got everything right, we still need to find a way to unite everyone.

Our time in this universe is limited and our time on earth more so. We should have been working together 60 years ago for a viable off-planet colony and related stuff. If the world ended tomorrow, humanity would cease to exist. You need over 100,000 people to sustain the human race in the event a catastrophic event wipes almost everyone out. Even if we had 1,000 people in space, our species would be doomed.

I am really super surprised that basic survival needs are NOT on the table when we are all arguing about religion, abortion, guns, etc. Like really?


> We should have been working together 60 years ago for a viable off-planet colony and related stuff. If the world ended tomorrow, humanity would cease to exist. You need over 100,000 people to sustain the human race in the event a catastrophic event wipes almost everyone out.

We are hundreds of years away from the kind of technology you would need for a viable fully self-sustainable off-world colony that houses 100k or more humans. We couldn't even build something close to one in Antarctica.

This kind of colony would need to span half of Mars to actually have access to all the resources it needs to build all of the high-tech gear they would require to just not die of asphyxiation. And they would need top-tier universities to actually have people capable of designing and building those high-tech systems, and media companies, and gigantic farms to make not just food but bioplastics and on and on.

Starting 60 years earlier on a project that would take a millennium is ultimately irrelevant.

Not to mention, nothing we could possibly do on Earth would make it even a tenth as hard to live here as on Mars. Nuclear wars, the worst bio-engineered weapons, super volcanoes - it's much, much easier to create tech that would allow us to survive and thrive after all of these than it is to create tech for humans to survive on a frozen, irradiated, dusty planet with next to no atmosphere. And Mars is still the most hospitable other celestial body in the solar system.


> Nuclear wars, the worst bio-engineered weapons, super volcanoes - it's much, much easier to create tech that would allow us to survive and thrive after all of these than it is to create tech for humans to survive on a frozen, irradiated, dusty planet with next to no atmosphere.

This is the best argument I've heard for why we should do it. Once you can survive on Mars you've created the technology to survive whatever happens on Earth.


> I am really super surprised that basic survival needs are NOT on the table when we are all arguing about religion, abortion, guns, etc. Like really?

Most people in the world struggle to feed themselves and their families. This is the basic survival need. Do you think they fucking care what happens to humanity in 100k years? Stop drinking that transhumanism kool-aid, give your windows a good cleaning and look at what's happening in the real world, every day.


The transhumanist/effective altruism types really do a great service in making me chuckle. I wonder where that attitude comes from, lack of community?


Narcissism


> but at least we can hide our data from big government with little repercussion

They come and ask. You say no? They find cocaine in your home.

You aren't in jail because you refused to hand out data. You are in jail because you were dealing drugs.


I think at minimum YouTube could tag existing footage uploaded before 2015 as very unlikely to be AI generated.


The first (acknowledged) deepfake video is from 1997


Hence, "unlikely" instead of "guaranteed real."


I think doing this right goes the other direction. What we're going to end up with is a focus on provenance.

We already understand that with text. We know that to verify words, we have to trace them back to the source, and then we evaluate the credibility of the source.

There have been periods where recording technology ran ahead of faking technology, so we tended to just trust photos, audio, and video (even though they could always be used to paint misleading pictures). But that era is over. New technological tricks may push back the tide a little here and there, but mostly we're going to end up relying on, "Who says this is real, and why should we believe them?"


> If you want to publish proof of an event, you should have some pixels on a screen along with some cryptographic signature from a device sensor that would necessitate at least a big corporation like Nikon / Sony / etc. being "in on it" to fake.

That idea doesn't work, at all.

Even assuming a perfect technical implementation, all you'd have to do to defeat it is launder your fake image through a camera's image sensor. And there's even a term for doing that: telecine.

With the right jig, a HiDPI display, and typical photo editing (no one shows you raw, full-res images), I don't think such a signature forgery would be detectable by a layman or maybe even an expert.


I worked in device attestation at Android. It’s not robust enough to put our understanding of reality in. Fine for preventing API abuse but that’s it.


> I worked in device attestation at Android. It’s not robust enough to put our understanding of reality in.

I don't follow. Isn't software backward compatibility a big reason why Android device attestation is so hard? For cameras, why can't the camera sensor output a digital signature of the sensor data along with the actual sensor data?


I am not sure how verifying that a photo was unaltered after capture from a camera is very useful though. You could just take a photo of a high-resolution display with an edited photo on it.


That wouldn't look nearly realistic. And it would be significantly harder to achieve for most people anyway.


It's true that 1990s pirated videos where someone snuck a handheld camera into the cinema were often very low quality.

But did you know large portions of The Mandalorian were produced with the actors acting in front of an enormous, high-resolution LED screen [1] instead of building a set, or using greenscreen?

It turns out pointing a camera at a screen can actually be pretty realistic, if you know what you're doing.

And I suspect the PR agencies interested in flooding the internet with images of Politician A kicking a puppy and Politician B rescuing flood victims do, in fact, know what they're doing.

[1] https://techcrunch.com/2020/02/20/how-the-mandalorian-and-il...


That's a freaking massive LED wall... with professional cinematography on top. If you believed my comment was intended to imply that I believed that's somehow impossible, well... you and I have a very different understanding of what it means to "just take a picture of a high-resolution display"...


There's been a slow march to requiring hardware-backed security. I believe all new devices from the last couple of years need a TEE or a dedicated security chip.

At least with Android there are too many OEMs and they screw up too often. Bad actors will specifically seek out these devices, even if they're not very technically skilled. The skilled bad actors will 0-day the devices with the weakest security. For political reasons, even if a batch of a million devices are compromised it's hard to quickly ban them because that means those phones can no longer watch Netflix etc.


But you don't have to ban them for this use case? You just need something opportunistic, not ironclad. An entity like Google could publish those devices' certificates as "we can't verify the integrity of these devices' cameras", and let the public deal with that information (or not) as they wish. Customers who care about proving integrity (e.g., the media) will seek the verifiable devices. Those who don't, won't. I can't tell if I'm missing something here, but this seems much more straightforward than the software attestation problem Android has been dealing with so far.


Wouldn't that prevent most folks from being able to root their devices without making the camera lesser than everyone else's camera?


What does this have to do with root? The camera chip would be the one signing the data flowing through it, not the Android kernel.


If you do a jpeg compression, or crop the file, then does that signature matter anymore?


Cryptography also has answers for some of this sort of thing. For example, you could use STARKs (Scalable Transparent Arguments of Knowledge) to create a proof that there exists a raw image I, and a signature S_I of I corresponding to the public key K (public input), and that H_O (public input) is a hash of an image O, and that O is the output of applying a specified transformation (cropping, JPEG compression) to I.

Then you give me O, I already know K (you tell me which manufacturer key to use, and I decide if I trust it), and the STARK proof. I validate the proof (including the public inputs K and H_O, which I recalculate from O myself), and if it validates I know that you have access to a signed image I that O is derived from in a well-defined way. You never have to disclose I to me. And with the advent of zkVMs, it isn't even necessarily that hard to do as long as you can tolerate the overhead of running the compression / cropping algorithm on a zkVM instead of real hardware, and don't mind the proof size (which is probably in the tens of megabytes at least).
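
As a toy rendering of that relation, here is the check a STARK would prove was satisfied (a byte-range slice stands in for the real cropping / JPEG code, and in the actual scheme this function would run inside a zkVM so the verifier sees only the proof, K, and H_O, never I or S_I):

  import hashlib
  from cryptography.exceptions import InvalidSignature
  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

  def crop(image_bytes: bytes, start: int, end: int) -> bytes:
      # Toy "transformation": a byte-range slice standing in for real cropping.
      return image_bytes[start:end]

  def relation_holds(I: bytes, S_I: bytes, K: Ed25519PublicKey,
                     H_O: bytes, start: int, end: int) -> bool:
      # The statement the proof attests to, without revealing I or S_I:
      # (1) S_I is a valid signature over I under the camera key K, and
      # (2) H_O is the hash of the declared transformation applied to I.
      try:
          K.verify(S_I, I)
      except InvalidSignature:
          return False
      O = crop(I, start, end)
      return hashlib.sha256(O).digest() == H_O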


Not if you do it, only if the chip also gives you a signed JPEG. Cropping and other simple transformations aren't an issue, though, since you could just specify them in unsigned metadata, and people would be able to inspect what they're doing. Either way, just having a signed image from the sensor ought to be adequate for any case where the authenticity is more important than aesthetics. You share both the processed version and the original, as proof that there's no misleading alteration.


> You share both the processed version and the original, as proof that there's no misleading alteration

so you cannot share the original if you intend to black out something from the original that you don't want revealed (e.g., a face or name or something).

The way you specced out how a signed jpeg works means the raw data _must_ remain visible. There's gonna be unintended consequences from such a system.

And it aint even that trustworthy - the signing key could potentially be stolen or coerced out, and fakes made. It's not a rock-solid proof - my benchmark for proof needs to be on par with blockchains'.


> The way you specced out how a signed jpeg works means the raw data _must_ remain visible. There's gonna be unintended consequences from such a system.

You can obviously extend this if you want to add bells and whistles like cropping or whatever. Like signing every NxN sub-block separately, or more fancy stuff if you really care. It should be obvious I'm not going to design in every feature you could possibly dream of in an HN comment...

And regardless, like I said: this whole thing is intended to be opportunistic. You use it when you can. When you can't, well, you explain why, or you don't. Ultimately it's always up to the beholder to decide whether to believe you, with or without proof.

> And it aint even that trustworthy - the signing key could potentially be stolen or coerced out, and fakes made.

I already addressed this: once you determine a particular camera model's signature ain't trustworthy, you publish it for the rest of the world to know.

> It's not a rock-solid proof - my benchmark for proof needs to be on par with blockchains'.

It's rock-solid enough for enough people. I can't guarantee I'll personally satisfy you, but you're going to be sorely disappointed when you realize what benchmarks courts currently use for assessing evidence tampering...


It also occurs to me that the camera chips -- or even separately-sold chips -- could be augmented to perform transformations (like black-out) on already-signed images. You could even make this work with arbitrary transformations - just sign the new image along with a description (e.g., bytecode) of the sequence of transformations applied to it so far. This would let you post-process authentic images while maintaining authenticity.

The possibilities are pretty endless here.
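
A rough sketch of what such a provenance chain could look like, with a hypothetical chip key and made-up operation strings just to show the shape of the records: each step signs the new image hash together with the declared operation and the previous step's signature.

  import hashlib
  import json
  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

  # Hypothetical trusted processing chip's key (placeholder, not a real device API).
  chip_key = Ed25519PrivateKey.generate()

  def sign_step(prev_record, new_image: bytes, op: str) -> dict:
      # Append one transformation to the chain: sign the new image hash,
      # the operation applied, and the previous signature (if any).
      record = {
          "image_sha256": hashlib.sha256(new_image).hexdigest(),
          "operation": op,
          "prev_signature": prev_record["signature"] if prev_record else None,
      }
      payload = json.dumps(record, sort_keys=True).encode()
      record["signature"] = chip_key.sign(payload).hex()
      return record

  # Usage: the original capture, then a redaction, each step signed.
  capture = sign_step(None, b"raw sensor bytes", "capture")
  redacted = sign_step(capture, b"redacted image bytes", "blackout: face region")

A verifier can then walk the chain backwards, checking each signature and deciding whether every declared operation is acceptable.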


ah. I thought it'd be more in the vein of safetynet, but guess not.


> that would necessitate at least a big corporation like Nikon / Sony etc. being "in on it" to fake

Or an APT (AKA advanced persistent teenager) with their parents' camera and more time than they know what to do with.


So you could never edit the video?


AI tags are to cover issues in the other direction: you publish an event as real, but they can prove it wasn't. If you didn't put the tag on it, malice can be inferred from your post (and further legal proceeding/moderation can happen)

It's the same as paid reviews: tags and disclaimers exist to make it easier to handle cases where you intentionally didn't put them.

It's not perfect and can be abused in other ways, but at least it's something.


> The premise is that it would prevent someone from misleading you into thinking that something happened when it didn't, but someone who wants to do that would just not tag it then.

And when they do that, the video is now against Google's policy and can be removed. That's the point of this policy.


That’s what I was thinking. Why don’t we just ask all scam videos to label themselves as scams while we’re at it?

It’s nice honest users will do that but they’re not really the problem are they.


> Why don’t we just ask all scam videos to label themselves as scams while we’re at it?

We do, we ask paid endorsements to be disclaimed.


Not convinced by this. Camera sensors have measurable individual noise; if you record RAW, that won't be fakeable without prior access to the device. You'd have a straightforward case for defamation if your real footage were falsely labeled, and it would be easy to demonstrate in court.


> Camera sensors have measurable individual noise; if you record RAW, that won't be fakeable without prior access to the device.

Which doesn't help you unless non-AI images are all required to be RAW. Moreover, someone who is trying to fabricate something could obviously obtain access to a real camera to emulate.

> You'd have a straightforward case for defamation if your real footage were falsely labeled, and it would be easy to demonstrate in court.

Defamation typically requires you to prove that the person making the claim knew it was false. They'll, of course, claim that they thought it was actually fake. Also, most people don't have the resources to sue YouTube for their screw ups.


> Moreover, someone who is trying to fabricate something could obviously obtain access to a real camera to emulate.

Yes, but not to your camera. Sorry for not phrasing it more clearly: individual cameras have measurable noise signatures distinct from otherwise identical models.

On the lawsuit side, you just need to aver that you are the author of the original footage and are willing to prove it. As long as you are in possession of both the device and the footage, you have two pieces of solid evidence vs. someone else's feels / half-assed AI detection algorithm. There will be no shortage of tech-savvy media lawyers willing to take this case on contingency.
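
For what it's worth, the fingerprinting itself is simple to sketch (this is a crude version; real PRNU forensics uses wavelet denoising and more careful normalisation, so treat it as illustrative only): average the denoising residuals of many frames from the same camera to estimate its pattern noise, then correlate a questioned frame's residual against that estimate.

  import numpy as np
  from scipy.ndimage import gaussian_filter

  def residual(frame: np.ndarray) -> np.ndarray:
      # Noise residual: the frame minus a smoothed (denoised) version of itself.
      f = frame.astype(float)
      return f - gaussian_filter(f, sigma=2)

  def fingerprint(frames: list) -> np.ndarray:
      # Estimate the sensor's pattern-noise fingerprint by averaging residuals
      # over many frames captured with the same camera.
      return np.mean([residual(f) for f in frames], axis=0)

  def correlation(query_frame: np.ndarray, fp: np.ndarray) -> float:
      # Normalised correlation between a questioned frame's residual and the
      # fingerprint; values well above zero suggest the frame came from that sensor.
      r, f = residual(query_frame).ravel(), fp.ravel()
      r, f = r - r.mean(), f - f.mean()
      return float(np.dot(r, f) / (np.linalg.norm(r) * np.linalg.norm(f) + 1e-12))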


> Yes, but not to your camera.

But who is the "you" in this case? There can be footage of you that wasn't taken with your camera. The person falsifying it would just claim they used their own camera. Which they would have access to ahead of time in order to incorporate its fingerprint into the video before publishing it.


Most consumer cameras require digging through menus to enable RAW because dealing with RAW is a truly terrible user experience. The vast majority of image/video sensors out there don't even support RAW recording out of the box.


Anyone with a mid-to-upper-range phone or better-than-entry-level DSLR/bridge camera has access to this, and anyone who uses that camera to make a living (e.g. shooting footage of protests) understands how to use RAW. I have friends who are complete technophobes but have figured this out because they want to be able to sell their footage from time to time.


"Dealing with raw" is one of the major reasons to use an actual camera these days.


Unfortunately video codecs love to crush that fine detail.


DMCA abuse begs to differ.


That's because of safe harbor provisions, which don't exist in this context.


They do have some use. Take, for example, the AI images of the pope wearing luxury brands that someone made about a year ago. They clearly wanted to make it as a joke, not to purposefully misinform people, and as long as everybody is in on the joke then I see no issue with that. But some people who weren't aware of current AI-gen capabilities took it as real, and an AI tag would have avoided the discussion of "has AI art gone too far" while still allowing that person to make their joke.


> The AI tags are fundamentally useless.

To the extent that they allow Google to exclude AI video from training sets they’re obviously useful to Google.


They’re just gathering training data to train their AI-detection models.


I mean they’re building the labeled dataset right now by having creators label it for them.

I would suspect this helps make moderation models better at estimating confidence levels of ai generated content that isn’t labeled as such (ie for deception).

Surprised we aren’t seeing more of this in labeling datasets for this new world (outside of captchas)


agreed! this is another censorship tool.


I fear that we're barrelling fast toward a future when nobody can trust anything at all anymore, label or not.


And this isn't new. A fad in films in the 90's was hyper-realistic masks on the one side, and make-up and prosthetics artists on the other, making people look like other people.

Faking things is not new, and you've always been right to mistrust what you see on the internet. "AI" technology has made it easy, convenient, accessible and affordable to more people though, beforehand you needed image/video editing skills and software, a good voice mod, be a good (voice) actor, etc.


> you've always been right to mistrust what you see on the internet.

But these tools make deception easier and cheaper, meaning it will become much more common. Also, it's not just "on the internet". The trust problem this brings up applies to everything.


This deeply worries me. A post-truth society loses its ability to participate in democracy, becomes a low-trust society, and the population falls into learned helplessness and apathy ("who can even know what's true any more?").

Look at Russian society for a sneak preview if we don't get this right.


It just goes back to trusting the source. If 5 media orgs post different recordings of the same political speech, you can be reasonably sure it actually happened, or at least several orders of magnitude more sure than if it's one blurry video from a no name account.


And then you learn all of those media orgs are owned by the same billionaire.

There will be no way to say something is true besides seeing it with your own eyes.


Then that's a single media org.


This bodes well for autocracies and would-be autocrats. It's the logical extreme of what they've been trying to do on social media over the last decade or so.

https://en.wikipedia.org/wiki/Firehose_of_falsehood


I was immediately thinking that the #AI labels are going to give people a false sense of trust, so that when someone posts a good-enough fake without the #AI label, it can do damage if it goes viral before it gets taken down for the mislabeling. (Kudos for the effort, though, YouTube.)


Behind the scenes, I'm 99% confident that Google has deployed AI detection tools and will monitor for it.

That said, unless all the AI generators agree on a way to add an unalterable marker that something is generated, at one point it may become undetectable. May.


I'm not aware of any AI detection tools that are actually effective enough to be interesting. Perhaps Google has some super-secret method that works, but I rather doubt it. If they did, I think they'd be trumpeting it from the hilltops.


We have to expect people to think for themselves. People are flawed and will be deceived, but trying to centralize critical thinking will have far more disastrous results. It's always been that way.

I'm not saying YouTube shouldn't have AI labels. I'm saying we shouldn't assume they're reliable.


>but trying to centralize critical thinking will have far more disastrous results

No. Having sources of trust is the basis of managing complexity. When you turned the tap water on and bought a piece of meat at the butcher, you didn't verify yourself whether it's healthy, right? You trust that the medicine you buy contains exactly what it says on the label, and you didn't take a chemistry class. That's centralized trust. You rely on it ten thousand times a day implicitly.

There need to be measures to make sure media content is trustworthy, because the smartest person on the earth doesn't have enough resources to critically judge 1% of what they're exposed to every day. It is simply a question of information processing.

It's a mathematical necessity. Information that is collectively processed constantly goes up, individual bandwidth does not; therefore you need more division of labor, efficiency and higher forms of social organisation.


> Having sources of trust is the basis of managing complexity.

This is a false equivalence that I’ve already addressed.

> When you turned the tap water on and bought a piece of meat at the butcher, you didn't verify yourself whether it's healthy, right?

To a degree, yeah, you do check. Especially when you get it from somewhere with prior problems. And if you see something off you check further and adjust accordingly.

Why resort to analogy? Should we blindly trust YouTube to judge what's true or not? I stated that labeling videos is fine, but what's not fine is blindly trusting it.

Additionally, comparing to meat dispenses with all the controversy because food safety is a comparatively objective standard.

Compare, “is this steak safe to eat or not?” To “is this speech safe to hear or not?”


I'm probably paraphrasing Schneier (and getting it wrong), but getting water from the tap and having it polluted or poisonous, has legal and criminal consequences. Similarly getting meat from a butcher and having it tainted.

Right now, getting videos which are completely AI/deepfaked to misrepresent, are not subject to the same consequences, simply because either #1 people can't be bothered, #2 are too busy spreading it via social media, or #3 have no idea how to sue the party on the other side.

And therein lies the danger, as with social media, of the lack of consequences (and hence the popularity of swatting, pretexting etc)


I suspect we're headed into a world of attestation via cryptographically signed videos. If you're the sole witness, then you can reduce the trust in the event, however, if it's a major event, then we can fall back on existing news-gathering machinery to validate and counter your false tagging (e.g. if a BBC camera captured the event, or there is some other corroboration & fact checking).


How does the signature help? It only proves that the video hasn't been altered since [timestamp]. It doesn't prove that it wasn't AI-generated or manipulated.


Signatures are also able to (mostly) signal that a specific device (and/or application on that device) captured the video. It would be possible to check if a video was encoded by a specific instance of an iOS Camera app or AfterEffects on PC.

Everything else - corroboration, interviews, fact checking - will remain as it is today and can't be replaced by technology. So I imagine a journalist would reach out to the person who recorded the video, ask them to show their device's fingerprint, ask about their experience when (event) occurred, and then corroborate all that information from other sources.

When the news org publishes the video, they may sign it with their own key and/or vouch for the original one so viewers of clips on social media will know that Fox News (TM) is putting their name and reputation behind the video, and it hasn't been altered from the version Fox News chose to share, even though the "ModernMilitiaMan97" account that reshared it seems dubious.

Currently, there's no way to detect alterations or fabrications of both the "citizen-journalist" footage and post-broadcast footage.
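
Roughly, and with placeholder keys rather than any real newsroom infrastructure, the vouching step could look like this: the outlet verifies the capture signature, then publishes its own signature over the clip hash plus that capture signature, so viewers only need the outlet's well-known public key.

  import hashlib
  from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

  # Placeholder keys: one for the capturing device, one for the news organisation.
  camera_key = Ed25519PrivateKey.generate()
  newsroom_key = Ed25519PrivateKey.generate()

  video = b"...clip bytes..."
  video_hash = hashlib.sha256(video).digest()
  camera_sig = camera_key.sign(video_hash)  # made at capture time

  # After checking the capture signature and doing its own reporting,
  # the outlet vouches for the clip by signing (hash + capture signature).
  camera_key.public_key().verify(camera_sig, video_hash)  # raises if forged
  endorsement = newsroom_key.sign(video_hash + camera_sig)

  # A viewer who trusts the newsroom's public key can later verify `endorsement`
  # against the clip they received, without knowing anything about the camera.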


If I have a CCTV camera that is in a known location and a TPM that signs its footage, I could probably convince a jury that it’s legit in the face of a deepfake defense.

That’s the bar- it’s not going to be infallible but if you don’t find evidence of tampering with the hardware then it’s probably going to be fine.


This might be worse than nothing. It's exactly the same tech as DRM, which is good enough to stop the average person, but where tons of people have private exploits stashed away to crack it. So the judge and general public trust the system to be basically foolproof, while criminals can forge fake signatures using keys they extracted from the hardware.


Having the tag weaponizes it by itself, because people will now consider any content without the tag real, whether it actually is or not.


If you upload a real event but you're the only source, it'll be doubted anyway; see also, most UFO sightings.


This isn’t strictly some blackhat thing, people will attempt to hand wave inconvenient evidence against them as AI generated and build reasonable doubt.


The labels collected by google will certainly be used to train classifiers to detect AI created content so I think that’s a legit concern.


Absolutely. Mass reporting for content violations to attack someone has been a tactic for decades.


Porn classification / regulation boils down to: "I'll know it when I see it." This implies the existence of some hyper-vigilant seer who can heroically determine what we should keep behind the video store curtain of decency, as if no grey areas exist. It also has the problem of requiring actual unbiased humans to view and accurately assess everything, which of course does not scale.

Perhaps AI classification is the mirror opposite to porn, using the test: "I'll know it when I don't see it", ie, if an average user would mistake AI generated content for reality, it should be clearly labeled as AI. But how do we enforce this? Does such enforcement scale? What about malicious actors?

We could conceivably use good AI to spot the bad AI, an endless AI cat and AI mouse game. Without strong AI regulation and norms a large portion of the internet will devolve into AI responding to AI generated content, seems like a gigantic waste of resources and the internet's potential.


> We could conceivably use good AI to spot the bad AI

I suspect this is Google's actual goal with the tagging system. It's not so much about helping users, rather it's a way to collect labeled data which they can later use to train their own "AI detection" algorithms


I can see a new type of captcha on the horizon: human verification! Select all videos that were made by AI.


The volunteers who trained their subtitle-generating model without knowing that this was the ultimate goal know this.


The great thing about AI, is that it's exactly optimized for discriminating in gray areas with difficult-to-articulate rules like "I'll know it when I see it."


How so? I’ve previously thought that gray areas were an AI weakness, so I’m interested to hear the opposite perspective.


You mean it is exactly optimized to extend our conscious and unconscious biases to gray areas in an inconsistent and arbitrary way?


I think it also gives them a legal / business remedy if people fail to label their content.

If someone for example makes a political video and fails to label it, they can delete the video/terminate the account for a breach of service.


Given the regular stories posted on HN about folks who've had some aspect of their social or other media canceled by some SaaS company, are these companies having many (legal) qualms as it is about canceling people without providing a good reason for it? Would be nice if they did, though...


I'd much prefer Google cancel capriciously with solid TOS backing to it than without, but I'll complain about their double standards about what they choose to censor... Not regardless, but without a doubt, because Google will choose to selectively enforce this rule.


At the very least it wouldn't be bad for PR if Google bans someone for specifically breaking a clear TOS.


Always gonna be grey zones: someone with a 60-minute video where everything is real except 10 seconds of insignificant b-roll footage, for example.


but this also gives room to abuse the uncertainty to censor anyone without recourse - by arguing such and such video is "AI" (true or not), they have a plausibly deniable reason to remove a video.

Power is power - can be used for good or bad. This labelling is a form of power.


This is no different from their existing power. They can already claim that a video contained copyright infringement, and you can only appeal that claim once, or try to sue Google.


I think the real benefit of this is probably that it establishes trust as the default, and acts as a discriminator for good-faith uses of "AI". If most non-malicious uses of ML are transparently disclosed, and that's normalized, then it should be easier to identify and focus on bad-faith uses.


> I am curious to see the legal arguments that will inevitably kick up from each of these

Google policy isn't law; there's no court judging legal arguments, it is enforced at Google’s whim and with effectively no recourse, at least not one which is focused on parsing arguments about the details of the policy.

So there won’t be “legal arguments” over what exactly it applies to.


Can’t they be sued for breach of contract if they aren’t following their own tos? Having a rule like this gives them leeway to remove what they consider harmful.


No, because the ToS says they can do anything they want to. The purpose of a ToS is to set an expectation on which things the company will tolerate users doing. It's (almost always) not legally binding for either party.


As someone who studied video production two decades ago, regarding the criteria you mentioned for AI:

- Makes a real person appear to say or do something they didn't say or do

- Alters footage of a real event or place

- Generates a realistic-looking scene that didn't actually occur

These are things that have been true of edited video since even before AI was a thing. People can lie about reality with videos, and AI is just one of many tools to do so. So, as you said, there are many flaws with this approach, but I agree that requiring labels is at least a step in the right direction.


For videos, that requires pretty significant effort even today to do by hand. Humans don't scale. AI does.

I think anonymity is completely dead now. Not due to social networks but AI will definitely kill it. With enough resources it is possible to engage an AI arms race against whatever detector they put.

It is also possible to remove any watermarking. So the only way to prove if the content is made by humans will be requiring extensive proofs to their complete identity.

If you're from a minority or a fringe group, all of your hopes about spreading awareness anonymously on popular social media will be gone.


> Alters footage of a real event or place

I wonder if this will make all forms of surveillance, video or otherwise, inadmissible in court in the near future. It doesn’t seem like much of a stretch for a lawyer to make an argument for reasonable doubt, with any electronic media now.


> I wonder if this will make all forms of surveillance, video or otherwise, inadmissible in court in the near future.

No, it won't. Just as it does now, video evidence (like any other evidence that isn't testimony) will need to be supported by associated evidence (including, ultimately, testimony) as to its provenance.

> It doesn’t seem like much of a stretch for a lawyer to make an argument for reasonable doubt,

“Beyond a reasonable doubt” is only the standard for criminal convictions, and even then it is based on the totality of evidence tending to support or refute guilt; it's not a standard each individual piece of evidence must clear for admissibility.


Bad evidence is not the same thing as inadmissible evidence. Evidence is admitted, and then the fact finder determines whether to consider it, and how much weight to give it. It is likely that surveillance video will be slightly less credible now, but can still be part of a large, convincing body of evidence.


Video evidence already requires attestation to be admissible evidence. You need a witness to claim that the footage comes from a camera that was placed there, that it was collected from the night of, etc. It's not like the prosecutor gets a tape in the mail and they can present it as evidence.


By that rationale, all witness testimony and written evidence should already be inadmissible.

This website focuses on the technical with too little regard for the social a bit too often. Though in general, videos being easily fakeable is still scary.


There will be a new side gig for ‘experts’ to explain deepfakes to a jury.


Basically, Google decides what’s real and what’s not. Cool.


Don't worry, Google has become incompetent - it is in the "Fading AOL" portion of its life cycle. Do you remember how incontinent AOL became in the final minutes before it became completely irrelevant? Sure, it's taking longer with Google. But it's not a process that they can reverse.

That means the system will be really really awful. So challengers can arise - maybe a challenger that YOU build, and open source!


I think you're living in a bubble if you believe that.


That open source can replace corporate centralization? Since centralized platforms started extracting more profits (including manipulation) things like Fediverse are on the rise. For mindless browsing, centralized is still king for now (Fediverse also works to an extent) but if your site has something better than what's on the centralized corporate platform, people will go there once they learn about it. We're on Hacker News instead of Reddit because?


1. Google lost the battle for LLMs, and cannot win that battle without putting a nail in the coffin of its own search monetization strategy

2. Google search is now no better than DDG, which it also cannot recover from without putting a nail in the coffin of its own search monetization strategy

If you want to randomly accuse me of living in a bubble, that's fine. It won't bother me one bit. Me living in a bubble doesn't change the fact that Google is backed into a corner when it comes to search quality and competing with LLMs.


Eh, I think their control of Google, YouTube and Android is enough to keep them afloat while being, as you said, incompetent.


DuckDuckGo is now more effective than Google, and I've been using it instead of Google for 5 years. Now I mainly ask GPT-4 questions, and eliminate the need for search altogether in most cases. I run ad blockers on YouTube. And they're not getting cash from me via Android.

You may be right that they can survive more readily than AOL did.. but I certainly won't help them with anything more than a kick in the pants! ;)


For at least a dozen years it would seem.


On their own platforms, yes. We need to break up their monopolies so that their choices don't matter as much.


At least on Youtube, Google's AI does, yes.

And? Do they have a track record of crafting a false reality for people?


It's to comply with the EU AI regulatory framework. This step is just additional cost they wouldn't have voluntarily burdened themselves with.


And some of these questions apply to more traditional video editing techniques also.

If someone says "I am not a crook" and you edit out the "not", do you need to label it?

What if it is done for parody?

What if the edit is more subtle, where a 1-hour interview is edited into 10-minute excerpts?

Mislabeled videos for propaganda.

Or simply date or place incorrectly stated.

Dramatic recreations as often done in documentaries.

Etc


I know people on HN love to hate on Google, but at least they're a major platform that's TRYING. Mistakes will be made, but let's at least attempt at moving forward.


But why does it matter if it was generated by AI instead of generated like Forrest Gump?


Because right now AI is an issue the public and policymakers are concerned about, and this is to show that private industry can take adequate steps to control it to stave off government regulation while the attention is high.


agreed. also, I think it will have the secondary effect of users rewarding non-ai content, and discourage ai generated content makers.


Truth in 'advertising' is important.

>> 17th angle - AI Generated vantage point based on existing 16 videos (reference links). <<

Would be • Alters footage of a real event or place


At some point fairly soon we will probably have to label everything as AI generated until proven otherwise


keep holding ourselves back with poorly written legislation designed to garner votes while rival companies take strides in the technology at rapid rates


> keep holding ourselves back

Not everyone works in ML.

> with poorly written legislation

This is a company policy, not legislation.

> designed to garner votes

Represent the will of the people?

> while rival companies take strides

Towards?

> in the technology at rapid rates

YouTube's "Trending" page isn't a research lab.

Even if it was, why would honesty slow it down?


Looks like there is a huge gray area that they need to figure out in practice. From https://support.google.com/youtube/answer/14328491#:

Examples of content creators don’t have to disclose:

  * Someone riding a unicorn through a fantastical world
  * Green screen used to depict someone floating in space
  * Color adjustment or lighting filters
  * Special effects filters, like adding background blur or vintage effects
  * Production assistance, like using generative AI tools to create or improve a video outline, script, thumbnail, title, or infographic
  * Caption creation
  * Video sharpening, upscaling or repair and voice or audio repair
  * Idea generation
Examples of content creators need to disclose:

  * Synthetically generating music (including music generated using Creator Music)
  * Voice cloning someone else’s voice to use it for voiceover
  * Synthetically generating extra footage of a real place, like a video of a surfer in Maui for a promotional travel video
  * Synthetically generating a realistic video of a match between two real professional tennis players
  * Making it appear as if someone gave advice that they did not actually give
  * Digitally altering audio to make it sound as if a popular singer missed a note in their live performance
  * Showing a realistic depiction of a tornado or other weather events moving toward a real city that didn’t actually happen
  * Making it appear as if hospital workers turned away sick or wounded patients
  * Depicting a public figure stealing something they did not steal, or admitting to stealing something when they did not make that admission
  * Making it look like a real person has been arrested or imprisoned


> * Voice cloning someone else’s voice to use it for voiceover

This is interesting because I was considering cloning my own voice as a way to record things without the inevitable hesitations, ums, errs, and stumbling over my words. By this standard I am allowed to do so.

But then I thought: what does "someone else's" even mean when multiple people make a video? If my wife and I make a video together, can we then not use my cloned voice, because to her my voice is someone else's?

I suspect all of these rules will have similar edge cases and a wide penumbra where arbitrary rulings will be autocratically applied.


> ...to her my voice is someone else.

To her, your voice is your voice not someone else's voice.

If you share a Youtube account with your wife, "someone else" means someone other than you or your wife.

The more interesting and troubling point is your use of "synthetic you" to make the real you sound better!


Is this really any different to say using makeup, cosmetic plastic surgery, or even choosing to wear specific clothes?


A lot different? The equivalent would be applying makeup to your mannequin replacement. All the things you mention are decoration. Replacing your voice is more than a surface alteration. I guess if some clever AI decides to take issue with what I say, and uses some enforcement tactic to arm-twist my opinion, I could change my mind.


We already have people editing videos anyway, you never see the first cut. AI-matically removing umms is really just speeding up that process.


I think the suggestion being discussed was AI-cloning your voice, and then using that for text-to-speech. Audio generation, rather than automating the cuts and tweaks to the recorded audio.


Yes exactly. Thanks. Voice cloning was indeed the suggestion made above.

The ethical challenge in my opinion, is that your status as living human narrator on a video is now irrelevant, when you're replaced by voice cloning. Perhaps we'll see a new book by "George Orwell" soon. We don't need the real man, his clone will do.


> The more interesting and troubling point is your use of "synthetic you" to make the real you sound better!

Why?


Did you replace the real you with "AI you" to ask me "why"?

I presume you wouldn't do that? How about replacing your own voice on a video that you make, allowing your viewers to believe it's your natural voice? Are you comfortable with that deception?

Everyone has evolving opinions about this subject. For me, I don't want to converse with stand-in replacements for living people.


How’s that different from photoshopping images?


> * Showing a realistic depiction of a tornado or other weather events moving toward a real city that didn’t actually happen

> * Making it appear as if hospital workers turned away sick or wounded patients

> * Depicting a public figure stealing something they did not steal, or admitting to stealing something when they did not make that admission

Considering they own the platform, why not just ban this type of content? It was possible to create this content before "AI".


There are many cases where such content is perfectly fine. After all, YouTube doesn't claim to be a place devoted to non-fiction only. The first one is an especially common thing in fiction.


Does that mean movies clips will need to be labeled?


Or video game footage? I explicitly remember people confusing Arma footage with real war footage.


The third one could easily be satire. Imagine that a politician is accused of stealing from the public purse, and issues a meme-worthy press statement denying it, and someone generates AI content of that politician claiming not to have stolen a car or something using a similar script.

Valid satire, fair use of the original content: parody is considered transformative. But it should be labeled as AI generated, or it's going to escape onto social media and cause havoc.

It might anyway, obviously. But that isn't a good reason to ban free expression here imho.


For what it's worth, this is already a genre of YouTube video, and I happen to find it absolutely hilarious:

https://youtu.be/3oWFFAVYMec

https://youtu.be/aL1f6w-ziOM


Respectfully disagree. Satire should not be labelled as satire. Onus is on the reader to be awake and thinking critically—not for the entire planet to be made into a safe space for the unthinking.

It was never historically the case that satire was expected to be labelled, or instantly recognized by anyone who stumbled across it. Satire is rude. It's meant to mock people—it is intended to muddle and provoke confused reactions. That's free expression nonetheless!


So when we have perfect deep fakes that are indistinguishable from real videos and people are using it for satire, people shouldn’t be required to inform people of that?

How is one to figure out what is real and what is a satire? Times and technologies change. What was once reasonable won’t always be.


- "How is one to figure out what is real and what is a satire?"

Context, source, tone of speech, and reasonability.

- "Times and technologies change."

And so do people! We adapt to times and technology; we don't need to be insulated from them. The only response needed to a new type of artificial medium, is, that people learn to be marginally more skeptical about that medium.


Nah. Satire was always safe when it's not pretending to have documented evidence of the thing actually happening.

Two recent headlines:

* Biden Urges Americans Not To Let Dangerous Online Rhetoric Humanize Palestinians [1]

* Trump says he would encourage Russia to attack Nato allies who pay too little [2]

Do you really think, if you jumped back a few years, you could have known which was satire and which wasn't?

The fact that we have video evidence of the second is (part) of how we know it's true. Sure, we could also trust the reporters who were there, but that doesn't lend itself to immediate verification by someone who sees the headline on their Facebook feed.

If the first had an accompanying AI video, do you think it would be believed by some people who are willing to believe the worst of Biden? Sure, especially in a timeline where the second headline is true.

1. https://www.theonion.com/biden-urges-americans-not-to-let-da...

2. https://www.theguardian.com/us-news/2024/feb/11/donald-trump...


> Synthetically generating music (including music generated using Creator Music)

What about music made with a synthesizer?


In one of the examples, they refer to something called "Dream Track"

> Dream Track in Shorts is an experimental song creation tool that allows creators to create a unique 30-second soundtrack with the voices of opted-in artists. It brings together the expertise of Google DeepMind and YouTube’s most innovative researchers with the expertise of our music industry partners, to open up new ways for creators on Shorts to create and engage with artists.

> Once a soundtrack is published, anyone can use the AI-generated soundtrack as-is to remix it into their own Shorts. These AI-generated soundtracks will have a text label indicating that they were created with Dream Track. We’re starting with a limited set of creators in the United States and opted-in artists. Based on the feedback from these experiments, we hope to expand this.

So my impression is they're talking about labeling music which is derived from a real source (like a singer or a band) and might conceivably be mistaken for coming from that source.


Even if it is fully AI-generated, this requirement seems off compared to the other ones.

In all of the other cases it can be deceiving, but what is deceiving about synthetic music? There may be some cases where it is relevant, like when imitating the voice of a famous singer, but other than that, music is not "real"; it is work coming from the imagination of its creator. That kind of thing is already dealt with by copyright, and attribution is a common requirement, and one that YouTube already enforces (how it does that is a different matter).


From a Google/Alphabet perspective it could also be valuable to distinguish between „original“ and „ai generated“ music for the purpose of a cleaner database to train their own music generation models?


Alternatively they want to know who to ban when the RIAA inevitably starts suing the shit out of music generators.


If you manually did enough work to have the copyright, it is fine.

But since an AI can't legally hold copyright to its music, Google probably wants to know for that reason.


> If you manually did enough work to have the copyright, it is fine.

Amount of work is not a basis for copyright. (Kind of work is, though the basis for the “kind” distinction used isn't actually a real objective category, so it's ultimately almost entirely arbitrary.)


That could get tricky. A lot of hardware and software MIDI sequencers these days have probabilistic triggering built in, to introduce variation in drum loops, basslines, and so forth. An argument could be made that even if you programmed the sequence and all the sounds yourself, having any randomization or algorithmic elements would make the resulting work ineligible for copyright.


It goes without saying that a piece of software can't be a copyright holder.

But the person who uses that software certainly can own the copyright to the resulting work.


If someone else uses the same AI generator software and makes the same piece of music should Google go after them for it? I don't think that would hold in court.

Hopefully this means that AI generated music gets skipped by Googles DRM checks.


I hope there is some kind of middle ground, legally, here? Like say you use a piano that uses AI to generate artificial piano sounds, but you create and play the melody yourself: can you get copyright or not?


IANAL. I think you'd get copyright on the melody and the recording, but not the sound font that the AI created.


> Synthetically generating music

Yagoddabekidding. That could cover any piece of music created with MIDI sequencing and synthesizers and such.


I think there's a clear difference between synthesizing music and synthetically generating music. One term has been around for decades and the other one is being confused with that.


To someone who is doing one or the other, there is a clear difference. I don't trust the EU or YouTube to be able to tell the difference from the other end, by the end product alone.

If AI writes MIDI input for a synthesizer, rather than producing the actual waveform, where does that land?


>Showing a realistic depiction of a tornado or other weather events moving toward a real city that didn’t actually happen

A bit funny considering a realistic warning and "live" radar map of an impending, major, natural disaster occurring in your city apparently doesn't violate their ad policy on YouTube. Probably the only time an ad gave me a genuine fright.


There’s a whole genre of videos on YouTube simulating the PSAs of large scale disasters. Nuclear war, meteors, etc. My 12 year old is really into them.


  * Digitally altering audio to make it sound as if a popular singer missed a note in their live performance
Does all the autotuning that singers use in live performances count?

/j


Interestingly they only say you have to disclose it if it's a singer missing a note. Seems like it's fair game to fix a note that was off key in real life and not disclose that.


That was only an example, not a rule. Presumably you need to label it if you fix the note too


IMHO, not even a /j.

Under the current guidelines, aren't all music performances that make use of some sort of pitch correction technically "digitally altered"?


What about autotune is AI?


What about those voiceovers on TikTok that are computer generated but sound quite real and are often reading some script? Do they have to disclose that those voices are artificially produced?


So all those trash AI-generated-voice product videos, or ones just reading Reddit text aloud, can stay?


To me, the guidelines are fairly clear: if it’s assisting production of a work of fiction, it’s ok.


They don't bother to mention it, but this is actually to comply with the new EU AI Act.

> Providers will also have to ensure that AI-generated content is identifiable. Besides, AI-generated text published with the purpose to inform the public on matters of public interest must be labelled as artificially generated. This also applies to audio and video content constituting deep fakes

https://digital-strategy.ec.europa.eu/en/policies/regulatory....

Some discussion here: https://news.ycombinator.com/item?id=39746669


Is anyone else worried about how naive this policy is?

The solution here is for important institutions to get onboard with the public key infrastructure, and start signing anything they want to certify as authentic.

The culture needs to shift from assuming video and pictures are real, to assuming they are made the easiest way possible. A signature means the signer wants you to know the content is theirs, nothing else.

It doesn't help to train people to live in a pretend world where fake content always has a warning sticker.
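
To make the signing idea concrete, here is a minimal sketch of what sign-then-verify could look like (purely illustrative, using an Ed25519 key pair via Python's cryptography package; key distribution and revocation, the genuinely hard parts, are glossed over):

    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # The publisher generates a long-lived key pair and publishes the public key
    # somewhere people already trust (their website, DNS, a certificate authority).
    private_key = Ed25519PrivateKey.generate()
    public_key = private_key.public_key()

    # Sign the exact bytes of the file being published.
    video_bytes = open("clip.mp4", "rb").read()
    signature = private_key.sign(video_bytes)

    # Anyone with the public key can later check that the bytes are unchanged
    # and were signed by the holder of the private key; nothing more.
    try:
        public_key.verify(signature, video_bytes)
        print("valid: unmodified and signed by the key holder")
    except InvalidSignature:
        print("invalid: altered, or signed by someone else")

A valid signature only ties the bytes to an identity; it says nothing about whether what they depict is true.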


I see a lot of people confusing authenticity with accuracy. Someone can sign the statement "Obama is white" but that doesn't make it a true statement. The use of PKI as part of showing provenance/chain of trust doesn't make any claims about the accuracy of what is signed. All it does is assert that a given identity signed something.


It's not about what is being signed, it's about who signed it and whether you trust that source. I want credible news outlets to start signing their content with a key I can verify as theirs. In that future all unsigned content is by definition fishy. PKI is the only way to implement trust in a digital realm.


> It's not about what is being signed, it's about who signed it

Yeah, that's what I said about PKI. I also said there is confusion between provenance of a statement and its accuracy. Just because it's signed doesn't mean anything about its accuracy, but those that confuse the two will think that just because it is signed, or because it was signed by a certain "trustworthy" party, that indicates accuracy. PKI does not establish the trustworthiness of the other party, it only gives you confidence in the identity of the party who signed something.

George Santos could sign his resume. We know it was signed by George Santos. And yet nothing in the resume could be considered accurate (or even a falsehood) purely because it is signed. That it was proven to be signed by George Santos via PKI is independent of the fact that George Santos is a known liar.


Time is important; have a trusted source of time. The nice thing about reality is that it is consistent.


Why do you need a whole PKI for that, rather than just, say, a link to the news outlet's website where the content is hosted? People have already been doing that pretty much since the web was created.


PKI has been around for, what, 30 years? Image authentication is just not going to happen at this point, because everyone's got too used to post-processing and it's a massive hassle for something that ultimately doesn't matter because real people use other processes to determine whether things are true or not.


Example: a video shows a group of police beating up a man for a minor crime (say littering). The video is signed by Michael Smith (the random passerby who filmed it on his phone). The video is published to Instagram and shared widely.

How do you expect people to take the authenticity of this video?


The signing happens using hardware inside the phone or camera. It implies that it came from a real device and thus was not AI created or modified. Check out https://c2pa.org/specifications/specifications/1.4/attestati... for more details


And what happens when people just film an AI video playing on a screen with this phone or camera?


One of Neal Stephenson's more recent novels deals with this concept. Fake news becomes so bad that everyone starts signing everything they create.


This is about as realistic as the next generation of congress people ending up 40 years younger.

We literally have politicians talking about pouring acid on hardware, and we expect these same bumbleheads to keep their signing keys safe at the same time? The average person is far too technologically illiterate to do that. Next time you go to grandma's house you'll learn she traded her signing key for chocolate chip cookies.


I imagine it would be something handled pretty automatically for everyone.

If Apple wanted to sign every photo and document on the iPhone, they could probably make the whole user experience simple enough for most grandmas.

Some people will certainly give away their keys, just like bank accounts and social security numbers today, but those people probably aren't terribly concerned with proving the ownership of their online documents.


>I imagine it would be something handled pretty automatically for everyone.

Then your imagination fails you.

If it is automatic/easy, then you have the "easy key" problem: the key is easy to steal or copy. For example, is it based on your Apple account? Then what happens when an account is stolen? If it is based on a device, what happens when the device is stolen?

Who's doing the PKI? Is it going to be like HTTPS, but for individuals? (That has never really worked at this scale, especially with revocation.) And most social media is content taken by randos on the internet.


When your account is stolen someone can create "official" documents in your name and impersonate you. There could be a system for invalidating your key after a certain date to help out with those situations.

For prominent people who actually have to worry about being impersonated they could provide their own keys.

The infrastructure could be managed by multiple groups or a singular one like the government. The point isn't to be a perfect system, it's to generate enough trust that what you're looking at is genuine and not a total fraud.

In a world where AI bots are generating fake information about everyone in the world, that kind of system could certainly be built and be useful.


> The culture needs to shift from assuming video and pictures are real, to assuming they are made the easiest way possible.

That sounds like a dystopia, but I guess we're heading in that direction. I expect that a lot of fringe groups and beliefs (flat-earthers, lizard-people conspiracies, "the war in Ukraine is fake") will become way more mainstream.


India is considering very similar laws as well (though not implemented at this time)[1], so it’s not just the EU.

Also, if every applicable regulation had to be mentioned, it’d be a very long list.

[1] https://epaper.telegraphindia.com/imageview/464914/53928423/...


Considering a law is different from actually having something that is enforced.


Labeling AI-generated content (assuming it works) is beneficial for Google, as they can avoid some dataset contamination.


Excellent point. With more and more AI-generated content it will be key to be able to tell it apart from the human-generated content.


Usually when a big corporation gleefully announces a change like this it's worth checking whether there's any regulations on that topic taking effect in the near future.

On a local level, I recall how various brands started making a big deal of replacing disposable plastic bags with canvas or paper alternatives "for the environment" just coincidentally a few months before disposable plastic bags were banned in the entire country.


Seems like this is sort of a manufactured argument. I mean, should every product everywhere have to cite every regulation it complies with? Your ibuprofen bottle doesn't bother to cite the FDA rules under which it was tested. Your car doesn't list the DOT as the reason it's got ABS brakes.

The EU made a rule. YouTube complied. That changes the user experience. They documented it.


If the contents of my ibuprofen bottle changed due to regulatory changes, then it wouldn’t be weird to have that cited at all.


Certain goods sold in the EU are required to have CE marking to affirm that they satisfy EU regulations.


+1. In France at least, food products must not suggest that mandatory properties like "preservative free" are unique to them. When they advertise this on the package, they must disclose that it's per regulation. Source: https://www.economie.gouv.fr/particuliers/denrees-alimentair...


Doesn't seem that out of place for a blog post on the exact change they made to comply though.

I mean you'd expect a pharmaceutical company to mention which rules they comply with at some point, even if not on the actual product (though in the case of medicine, probably also on the actual product).


Of course they don't mention it; for big tech companies, EU = Evil.


[flagged]


I make good pay from that scam, so screw the EU trying to steal our wealth.

How is this shit any different from the useless and annoying cookie popups?


So you making good pay by enabling a scammer makes it totally okay for the scammer to operate? By extension of that logic, hitmen should no longer be prosecuted provided they make good pay from it.


I love cookie popups, I get to reject all kinds of marketing and tracking cookies.


You'd think they're evil too if they let a bunch of middlemen and parasitic companies dictate how the software you invested untold sums and hours developing and marketing should work.


Why should software be any different from aircraft?


I have a more entertaining take: "typical Google, getting somebody else to give them training data in exchange for free hosting of some sort".


What if a real person reads a script that was created with an LLM? Does that count? Should it?


Blog post specifically mentions that using AI to help writing the script does not require labeling the video.


Sorry, I wasn't entirely clear that I was specifically responding to the GP comment referencing the EU AI act (as opposed to creating a new top-level comment responding to the original blog post and Google's specific policy) which pointed out:

> Besides, AI-generated text published with the purpose to inform the public on matters of public interest must be labelled as artificially generated. This also applies to audio and video content constituting deep fakes

Clearly "AI-generated text" doesn't apply to YouTube videos.

But, it is interesting that if you use an LLM to generate text and present that text to users, you need to inform them it was AI-generated (per the act). But if a real person reads it out, apparently you don't (per the policy)?

This seems like a weird distinction to me. Should the audience be informed if a series of words were LLM-generated or not? If so, why does it matter if they're delivered as text, or if they're read out?


I would take this a step further and make it required that companies create an easy way for users to opt-out of this type of content.


I think many countries have started considering the legal regulation of using AI in any content


Thank you EU!


Most interesting example to me: "Digitally altering audio to make it sound as if a popular singer missed a note in their live performance".

This seems oddly specific to the inverse of what happened with Alicia Keys at the recent Super Bowl. As Robert Komaniecki pointed out on X [1], Alicia Keys hit a "sour note" which was silently edited by the NFL to fix it.

[1] https://twitter.com/Komaniecki_R/status/1757074365102084464


Digitally altering audio to make it sound as if a popular singer hit a lot of notes is still fine though.


Correct, it's the inverse that requires disclosure by Youtube.

Still, I find it interesting. If you can't synthetically alter someone's performance to be "worse", is it OK that the NFL synthetically altered Alicia Keys's performance to be "better"?

For a more consequential example, imagine Biden's marketing team "cleaning up" his speech after he has mumbled or trailed off a word, misleading the US public during an election year. Should that be disclosed?


I don't understand the distinction. If the intent is to protect the user, then what if I make the sound better for rival contestants on American Idol and don't do it for singers of a certain race?

Seems to comply?


This is a great example as a discussion point, thank you for sharing.

I will be coming back to this video in several months time to check whether the "Altered or synthetic content" tag has actually been applied to it or not. If not, I will report it to YouTube.


Yea, it’s a really super example!

However, autotune has existed for decades. Would it have been better if artists had been required to label when they used autotune to correct their singing? I say yes, but reasonable people can disagree!

I wonder if we are going to settle on an AI regime where it’s OK to use AI to deceptively make someone seem “better” but not to deceptively make someone seem “worse.” We are entering a wild decade.


> I say yes but reasonable people can disagree!

A lot of people do! Tone correction [1] is a normal fact of life in the music industry, especially in recordings. Using it well takes both some degree of vocal skill and production skill. You'll often find that it's incredibly obvious when done poorly, but nearly unnoticeable when done well.

[1] AutoTune is a specific brand


Oh no, is that going to mess up my favorite genre called shreds? https://www.youtube.com/watch?v=1nAhQOoJTIA


Only if people start rejecting it because they learn it was modified by AI.

If they don't reject it for that, nothing changes.


>Some examples of content that require disclosure include: [...] Generating realistic scenes: Showing a realistic depiction of fictional major events, like a tornado moving toward a real town.

This sounds like every thumbnail on youtube these days. It's good that this is not limited to AI, but it also means this will be a nightmare to police.


Exactly, and many have done exactly the same kind of video using VFX. What's the difference? These kinds of reactions remind me of the stories of the backlash following the introduction of calculators in schools...


Using VFX for realistic scenes is more involved. VFX requires more expertise to do convincingly and realistically, on the order of thousands of hours of experience. More involved scenes require multiple professionals. The tooling and assets cost more. An inexperienced person, in a hundred hours of effort, can put out ten-ish realistic scenes with leading-edge AI tools, where previously they could do zero.

This is like regulating handguns differently from compound bows. Both are lethal weapons, but the bow requires hours of training to use effectively, and is more difficult to carry discreetly. The combination of ease, convenience, and accessibility necessitates new regulation.

This being said, AI for video is an incredibly promising technology, and I look forward to watching the TV shows and movies generated with AI-powered tooling.


What if new AI tools negate the thousands of hours experience to generate realistic VFX scenes, so now realistic scenes can be made by both non-AI VFX experts and AI-assisted VFX laymen?

Do we make all usages of VFX now require a warning, just in case the VFX was generated by AI?

I think this is different to the bow v gun metaphor as I can tell an arrow from a bullet, but I can foresee a future where no human could tell the difference between AI-assisted and non-AI-assisted VFX / art

I believe this is evidenced by the fact that people can go around accusing any art piece of being AI art and the burden of proving them wrong falls on the artist. Essentially I believe we are rapidly approaching the point of it not mattering if someone uses AI in their art because people won't be able to tell anyway


> Using VFX for realistic scenes is more involved.

This really depends on what you're doing. There are some great Cinema 4D plugins out there. As the plethora of YouTube tutorials clearly demonstrates, multiple professionals and vast experience are not required for some of the things they have listed. Tooling and asset costs are zero on the high seas.

Until Sora is widely available, or the open-source models catch up, it's easier at this moment to use something like Cinema 4D than AI.


What if i use an LLM powered AI to operate VFX software to generate a realistic looking scene? ;)


So if I used Blender, is it banned? It's very tough to draw that line in the sand.


> What's the difference?

The ease and lack of skill required. That brings a whole other set of implications.


I'm sorry, but using a calculator to get around having to learn arithmetic is not even close to being the same thing. Prove to me that you can do basic arithmetic, and then we can move on to using calculators for the more complex stuff, where, if you had to, you could at least come to the same value as the calculator.

People using VFX aren't trying to create images in likeness of another existing person to get people to buy crypto or other scams. Comparing the two is disingenuous at best.


This is very reminiscent of an act where someone calls the police about a 'looming' terrorist attack, for example


You see thumbnails? I haven't seen a thumbnail in years, I use both DeArrow and Sponsorblock, youtube is very watchable.


I’m reminded of how banks require people to fill out forms explaining what they’re doing, where it’s expected that criminals will lie, but this is an easy thing to prosecute later after they’re caught.

Could a similar argument be applied here? It doesn’t seem like there is much in the way of consequences for lying to Google. But I suppose they have other ways of checking for it, and catching someone lying is a signal that makes the account more suspicious.


It’s a compliance checkbox for the most part I think. They can stay on top of new legislation by claiming they are providing tools to deal with misinformation, whereas it’d be easier to say that they are encouraging the proliferation of misinformation by not doing anything about it. It certainly shifts the legal question in the way you described it would seem.


Yeah, I think it's a very similar approach to what you've described. At the scale of YouTube, I don't think you can just start banning content you don't like. Instead you have to have a policy, clearly documented, and then you can start enforcing based on that policy.

The other thing is that they don’t necessarily want to ban all of this content. For example a video demonstrating how AI can be used to create misinformation and showing examples, would be fairly clearly “morally” ok. The policy being that you have to declare it allows for this sort of content to live on the platform, but allows you to filter it out in certain contexts where it may be inappropriate (searches for election coverage?) and allows you to badge it for users (like Covid information tags).


> Altering footage of real events or places: Such as making it appear as if a real building caught fire, or altering a real cityscape to make it appear different than in reality.

What about the picture you see before clicking on the actual video? This article of course is addressing the content of the videos, but I can't help but look at the comically cartoonish, overly dramatic -- clickbait -- picture preview of the video.

For example, there is a video about a tornado that passed close to a content author and the author posts video captured by their phone. In the preview image, you see the author "literally getting sucked into a tornado". Is that "altered and synthetic content"?


I don't think they need to be treated the same.

The thumbnail isn't the content itself necessarily.


Yeah I agree; and it's generally a bit harder to communicate _too_ much misinformation in a thumbnail.


Without enforceability it'll go the same way as it has on Pixiv: the good actors will properly label their AI-utilizing work, while the bad actors will continue to lie to try to maximize their audience until they get caught, then rinse and repeat. Kind of like crypto-scammers.

For context, Pixiv had to deal with a massive wave of AI content being dumped onto the site by wannabe artists basically right as the initial diffusion models became accessible. They responded by making 'AI-generated' a checkbox to go with the options to mark NSFW and adding an option for users to disable AI-generated content from being recommended to them. Then, after an incident of someone using their Patreon style service to pretend to be a popular artist, selling commissions generated by AI to copy the artist's style, they banned AI-generated content from being offered through that service.


I think that the idea is mostly to dictate culture. And I like the idea, not only for preventing fraud. Ever since the first Starship launches, reality has looked more incredible than fiction. Go look up the SN-8 landing video and tell me that does not look generated. I just want to know what is real and what is generated, by AI or not.

I think that this policy is not perfect, but it is a step in the right direction.


Also remains to be seen if labeling your content as containing AI-generated work will help or hurt you in your viewership.

My guess is that youtube is going to downrank this content, and may be trying to crowdsource training data in order to do this automatically.


I think that for now they're just going to use it as a means of figuring out what kind of AI-involved content people are ok with and what kind they react negatively to.

Personally, I've developed a strong aversion to content that is primarily done by AI with very little human effort on top. After how things went with Pixiv I've come to hold the belief that our societies don't help people develop 'cultural maturity'. People want the clout/respect of being a popular artist/creator, without having to go through the journey they all go through which leads to them becoming popular. It's like wanting to use the title of Doctor without putting in the effort to earn a doctorate, the difference just being that we do have a culture of thinking that it's bad to do that.


I think one of the bigger issues will be false positives. You'll do an upload, and youtube will take it down claiming that some element was AI generated. You can appeal, but it'll get automatically rejected. So you have to rework your video and figure out what it thought might be AI generated and re-upload.


Rather than tagging what’s made up, why not tag what’s genuine? There’s gonna be less of it than the endless mountain of generated stuff.

I’m thinking something as simple as a digital signature that certifies e.g. a photo was made with my phone if I want to prove it, or if someone edits my file there should be a way of keeping track of the chain of trust.


This would, I think, be the ideal if it's possible. I'd love videos to have signatures that prove when they were recorded, that they were recorded on such-and-such a phone, that they haven't had any modification, and maybe even optionally the GPS location (for, say, news organisations, to even more reliably prove the validity of their media). And then have a way to have a video format that allows certain modifications (e.g. colour grading) but encodes that some aesthetic changes were made. And, more importantly, a way to denote that a region of video is a clip of another video, and provide a backing signature for the validity of the clip.

That would allow much stronger verifiability of media. But I'm not sure if that would be possible...
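
One way to picture the chain-of-trust part (a rough sketch of my own, not C2PA or any existing standard): every derived version records the hash of what it came from plus a signature from whoever produced it, so a verifier can walk back to the camera-signed original.

    import hashlib
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    def provenance_link(signer_key, prev_hash, content, note):
        """One link in the chain: bind the new content to its parent's hash and sign it."""
        content_hash = hashlib.sha256(content).digest()
        digest = hashlib.sha256(prev_hash + content_hash).digest()
        return {
            "prev_hash": prev_hash.hex(),
            "content_hash": content_hash.hex(),
            "note": note,  # e.g. "original capture", "colour grade", "10s clip"
            "signature": signer_key.sign(digest).hex(),
        }

    camera_key = Ed25519PrivateKey.generate()   # would live in the phone's secure hardware
    editor_key = Ed25519PrivateKey.generate()   # e.g. a news outlet's editing suite

    raw = b"raw sensor data..."
    graded = b"colour-graded render..."

    root = provenance_link(camera_key, b"\x00" * 32, raw, "original capture")
    edit = provenance_link(editor_key, bytes.fromhex(root["content_hash"]), graded, "colour grade")

Verification would mean checking each signature against a key you trust (camera vendor, news outlet) and that each prev_hash matches the parent link; anything that can't be walked back to a trusted capture key gets treated as "made the easiest way possible".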


Yeah, expect this to flip. I am guessing this will go the same way as HTTPS: first we saw a green lock for HTTPS-enabled sites, later we saw "insecure" for HTTP sites.


Does YouTube know that the Google Photos team actively encourages altering your videos and photographs to represent scenes that never happened?

https://blog.google/products/photos/google-photos-features-p... https://blog.google/products/photos/google-photos-magic-edit...


Google of yore would have offered a 'not AI' type of filter in their advanced search.

Present-day Google is too busy selling AI shovels to quell Wall St's grumbling to even consider what AI video will do to the already bad "needle in a haystack" nature of search.


A "not AI" filter is an excellent idea.


Adding before:2020 to the end a query is a fairly good approximation.

Note that this will become less useful over time, however, as Google prunes old results from its index.

