Hacker News

I have found EMO (not open though) [1] to be the best yet.

Look at the rapping example near the end. The lip sync is nearly flawless. The first black-and-white lady singing is also almost perfect. It even gives them the subtle jerk of pausing for breath. Unless you know what to look for and are really hunting for flaws, nothing stands out; they look real.

[1] https://humanaigc.github.io/emote-portrait-alive/




It's awesome and I hate it.

This single piece of technology makes me pessimistic about the future. Until now, video recording was considered very good evidence. Let's say you argue with someone about what person X said. You show them a video and they'll be like "ok, he did say that, but...". You could at least set some facts straight, and then discuss the interpretations.

But that's now gone. You can generate mass amounts of real-looking fakes and at the same time label anything you don't like as fake. There's really no independent evidence anymore; you can only put trust in the medium of your choosing (a YouTube channel, newspaper, TV station) that it reports honestly.

This seems to have only minimal benefits for society, but huge negatives. And there's no stopping it now...


This was an inevitable outcome of the advancement of technology. I would argue that we lost trust in all media a long time ago; it is just now being realized by the masses.

But as usual, we shall adapt and overcome.


> But as usual, we shall adapt and overcome.

I don't believe this is a problem we can "overcome". We will need to learn to live with the "alternative facts" being more prominent than now, but I'm not looking forward to it.


> I don't believe this is a problem we can "overcome".

Can digital signatures not remedy this problem? When you log in to your bank, how do you know you are logging into your bank? In the future, a recording without signatures will be like a bank login without HTTPS is today.


The only thing signatures/HTTPS provide is ascertaining the identity of the other side; they won't help you determine whether the recording is fake or not.

For this to work, you need an already-trusting relationship with the media. Like, ok, I can trust the NYT, so I will trust videos signed by them. But another person distrusts the NYT and trusts only Truth Social. In the past, we could at least agree on basic facts, like January 6th actually happening, but I think generative AI will make laying out the facts much more difficult, or even impossible.


You raise a valid point that digital signatures and HTTPS alone cannot guarantee the authenticity of a recording. However, modern smartphones and other mobile devices have the capability to provide stronger assurances about the originality of recordings through the use of tamper-proof secure hardware.

Many high-end smartphones, such as iPhones and some Android devices, incorporate secure enclaves or trusted execution environments (TEEs). These are isolated, tamper-resistant hardware components that can securely store and process sensitive data. When a recording is made on such a device, the secure hardware can associate the recording with additional metadata, including the specific date, time, GPS coordinates, and user account information. This metadata is cryptographically bound to the recording itself.

Furthermore, the device can digitally sign the recording and its associated metadata using a unique key stored within the secure hardware. This digital signature serves as a testament to the recording's originality. Companies like Apple or Google, who manage the secure hardware and signing keys, can then vouch for the authenticity of the recording.

While this approach doesn't completely eliminate the possibility of fake recordings, it significantly raises the bar for creating convincing forgeries. Modifying the recording or its metadata would invalidate the digital signature, making it evident that tampering has occurred.
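To make the binding concrete, here is a minimal sketch of the scheme described above. All the names are hypothetical, and HMAC with a shared secret stands in for the asymmetric signature a real secure enclave would produce with its private key; the point is only that changing either the recording or its metadata invalidates the signature.

```python
# Hypothetical sketch: bind metadata to a recording and sign the result.
# HMAC is a stand-in for the asymmetric device signature a secure
# enclave would actually produce.
import hashlib
import hmac
import json

DEVICE_KEY = b"key-held-in-secure-hardware"  # illustrative only

def sign_capture(recording: bytes, metadata: dict) -> bytes:
    """Hash the recording, append canonical metadata, sign the whole payload."""
    payload = hashlib.sha256(recording).hexdigest().encode()
    payload += json.dumps(metadata, sort_keys=True).encode()
    return hmac.new(DEVICE_KEY, payload, hashlib.sha256).digest()

def verify_capture(recording: bytes, metadata: dict, signature: bytes) -> bool:
    """Recompute the signature; any change to video or metadata breaks it."""
    return hmac.compare_digest(sign_capture(recording, metadata), signature)

video = b"\x00\x01fake-video-bytes"
meta = {"time": "2024-06-01T12:00:00Z", "gps": [28.0, -81.7], "user": "alice"}
sig = sign_capture(video, meta)

assert verify_capture(video, meta, sig)                 # untouched: verifies
assert not verify_capture(video + b"x", meta, sig)      # edited video: fails
assert not verify_capture(video, {**meta, "gps": [0, 0]}, sig)  # edited metadata: fails
```

In a real deployment the signing key would never leave the secure element, and verification would use the corresponding public key via a vendor-run attestation service.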

Of course, as you mentioned, trust in the entity verifying the signatures (e.g., Apple or Google) is still required. However, this trust is based on their reputation and the security measures they employ, rather than on the content of the recording itself.


There's no such thing as tamper-proof hardware, only tamper-resistant hardware. Also, the whole "sign what came from the sensor" idea is widely known not to work, because you can easily record a playback of doctored footage. Lots of LLM-isms in this comment too.


> you can easily record a playback of doctored footage

You believe this is easy when the device has multiple recording sensors, and multidimensional information (such as spatial information, changes in the focus sensors during recording, etc.) is part of the recording that is digitally signed?


Who's proposing such a device to get widespread adoption? I've heard of sensor data signing [1] but not what you're describing.

[1] https://pro.sony/ue_US/solutions/forgery-detection


The concept of sensor data signing to authenticate videos and images captured on mobile devices is still an emerging technology, not yet widely adopted. However, as AI-generated synthetic media becomes more prevalent and potentially problematic, solutions like this may gain traction.

The key idea is to leverage the array of sensors built into modern smartphones and tablets - accelerometer, gyroscope, GPS, WiFi/cellular signal data, etc. - to cryptographically sign the sensor readings along with the visual data itself at the time of capture. This extra layer of verifiable sensor data would help establish that a recording originated from a real physical device in a particular place and time, as opposed to a purely digital fabrication.
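A rough sketch of that idea, with entirely hypothetical names (this is not Sony's actual scheme): hash each sensor stream and the visual data into one capture manifest, then sign the manifest, so that swapping in doctored footage after the fact breaks the signature.

```python
# Illustrative sketch: hash every sensor stream plus the frames into one
# manifest at capture time, then sign it. HMAC stands in for a real
# per-device asymmetric signature.
import hashlib
import hmac
import json

CAPTURE_KEY = b"per-device-signing-key"  # hypothetical secure-element key

def build_manifest(frames: bytes, sensors: dict) -> dict:
    """Record a digest of the visual data and of every sensor stream."""
    return {
        "frames_sha256": hashlib.sha256(frames).hexdigest(),
        "sensors": {name: hashlib.sha256(data).hexdigest()
                    for name, data in sorted(sensors.items())},
    }

def sign_manifest(manifest: dict) -> str:
    blob = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(CAPTURE_KEY, blob, hashlib.sha256).hexdigest()

frames = b"raw-frame-data"
sensors = {"gyro": b"0.1,0.2,0.1", "gps": b"28.0,-81.7", "focus": b"f/1.8@2.3m"}
manifest = build_manifest(frames, sensors)
sig = sign_manifest(manifest)

# Replaying doctored footage yields a different frame hash, so the old
# signature no longer matches the rebuilt manifest.
doctored = build_manifest(b"doctored-frame-data", sensors)
assert sign_manifest(manifest) == sig
assert sign_manifest(doctored) != sig
```

This doesn't answer the replay objection above on its own; the hope is that faking a mutually consistent set of gyroscope, GPS, and focus readings is much harder than faking the pixels alone.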

Historically, technologies like digital signatures and public key cryptography started out in niche military/government applications before becoming ubiquitous in the computer era. In a similar way, sensor-level authentication of audiovisual media could follow an adoption curve driven by the growing need to combat sophisticated AI forgeries.


I know I’m logging into my bank because I initiated the connection, and refuse to believe anyone in any other context who claims to be my bank. People are routinely defrauded by scammers who claim to be their bank, and banks are routinely scammed by people who claim to be an account holder.


> I know I’m logging into my bank because I initiated the connection, ...

Just because you initiated the connection, how do you know the other end is your bank? Do you trust every internet company that carries your packets to the bank? Trust their employees? Trust their security practices? Do you trust the firmware on all the devices involved?

> People are routinely defrauded by scammers who claim to be their bank,

I have read about this in the news, just like I read about snakes with two heads, etc., yet I have yet to meet someone it has actually happened to. What fraction of the people you know have had this happen?

Could it be that these people believe, like you do, that "I know I'm logging into my bank because I initiated the connection", as opposed to checking the digital signatures on the connection?


Maybe relying on video "evidence" to prove something is actually the bug/vulnerability, and this technology will finally "fix" the bug by calling all video evidence into question. I'd rather the tech be widely publicized and out there, so people know it's a thing and can be convinced to disregard video "evidence", than have it kept secret while the public unknowingly trusts video. Just like people know Photoshop is a thing and (hopefully) don't believe by default the images they see on the Internet.


> by calling into question all video evidence

I think it's a dangerous mindset to doubt everything you can't see with your own eyes. There was this fringe group claiming that the war in Ukraine is fake, that there's no war. With this mindset, such claims will become more mainstream; you could even call them reasonable.


That is very unnerving to think about. If video evidence could be flawlessly faked today, it would lend some legitimacy to those claims. If we do reach this point, the value of a source will shift from "I can prove it" to "you can trust me", and I do not think that will be for the better. I'm not ready to live in a completely subjective world, where truth and falsity have equal weight.


The Polk County Sheriff's Office recently announced a partnership with Florida Polytechnic University to start working on this, dubbed the Sheriff's Artificial Intelligence Laboratory (SAIL).

https://www.polksheriff.org/news-investigations/polk-county-...

The conference video at 1:00 starts off with a generated clip of Elon Musk saying he's going to move to Polk County. The Sheriff highlights your concerns as well as many others.

Conference video (29:56): https://www.youtube.com/watch?v=DHj18pOcXHc


Why do you care so much about what people said? Before video recording was a thing, people didn't have to constantly watch their backs and monitor what they were saying for fear of losing their jobs. What happened at a party stayed at the party.

You may say it's important only for public officials. But why is it important? Because you're giving a huge amount of power to single individuals, and somehow we're taught that's a good thing, or at least that it's inevitable for keeping the peace or keeping crime at bay. What a load of bs. I hope distrust in centralisation increases. It should have been there in the first place.


Politicians' speeches are just one aspect of it. This technology makes it easier both to deny reality and to construct new ones. Beyond the horizon of your own eyes, there will be only subjective facts, fed to you by the media of your choosing. Is the war in Ukraine raging, or is it all fake? It becomes a matter of opinion, not a fact you can establish (short of shipping you and your discussion partner there to see it with your own eyes).


You raise a good point, yet it is not what people say that matters but what it predicts about them. Modern society is built on trust, and the things you want to know but cannot observe can often be predicted from what you can observe, such as the things being said.


Wow! EMO is impressive. Do they plan on open sourcing it?

The page has a link to a GitHub repo [1] right at the top, but the repo is basically empty.

https://github.com/HumanAIGC/EMO


Issue comments in the EMO repo point to the V-Express repo [1], which was released two weeks ago and appears to be a fully functioning open-source release?

[1] https://github.com/tencent-ailab/V-Express


The black and white lady is nightmare fuel for me personally.


What irks you about her? I haven't seen her before at all; maybe that's the reason I'm not seeing anything too strange.


What Ces11 said.


Audrey Hepburn? Was she in a scary movie or play someone scary?


In the synthetic video she looks like some kind of Frankenstein's monster, brought to life with electrodes or hidden motors, similar to the other video.

Both 'move' in ways that are very unnatural.


Glad it isn't just me.


She was a very graceful lady in wholesome movies.

The juxtaposition of the modern facial expressions of an influencer-type singer covering Ed Sheeran on some X Factor-type television show is what makes it creepy. It is somehow doubly fake, and extremely out of character if you are familiar with her.


[flagged]


Ok. So I showed the first two videos to my wife. She noticed the teeth merging and looking different each time, and then the ear. But that was all.

For me, the lip sync and body movement are what excited me most. They are the closest to real compared with any similar tech.


Crikey. Well, I don't know what to think anymore. I guess it got "good enough" for some things. I can still tell. This is going to suck for some people (it feels uncomfortable).


IMO, it sucks for me beyond the level of quality.

For starters, consent is the first problem I have. Yes, there are lots of examples, but none of the individuals consented to having their likeness used to say things they didn't. Now, abstract this lack of consent beyond "examples": the creators of this have no problem with the ethics of not asking for consent, and thus the world at large will not either.

Then we have the problem of how it is going to be abused and what problems will exist because of it.



