The cat really is out of the bag at this point. Google and OpenAI won't release their checkpoints, but it's only a matter of time (probably months) before some random group of people with enough patience and data releases fully working, ready-to-use text-to-video models. You really don't need insane compute if you're patient on the scale of a few months.
The first short-term application for this may be online advertising: cheap, short, low-resolution clips that are only meant to grab attention, can be as exotic as you want, and relate to the product you want to sell.
The advertiser cherry-picks from a pool of generated clips related to their query, so the hit and miss nature of these models won’t be a big problem.
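A back-of-the-envelope sketch of why cherry-picking covers for hit-and-miss output. It assumes each clip is independently usable with some fixed probability; the function name and the numbers (a 10% hit rate, pools of up to 30 clips) are made up for illustration, not measured from any model.

```python
def chance_of_usable_clip(p_good: float, n_clips: int) -> float:
    """Probability that at least one of n_clips is usable.

    Assumes every clip is usable independently with probability p_good
    (an illustrative simplification, not a measured property).
    """
    return 1.0 - (1.0 - p_good) ** n_clips


if __name__ == "__main__":
    # Hypothetical hit rate: 1 clip in 10 is good enough to ship.
    for n in (1, 10, 30):
        print(n, round(chance_of_usable_clip(0.1, n), 3))
```

Under those assumptions, a pool of 30 clips contains at least one usable clip roughly 96% of the time, even though any single clip is a long shot.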
Pornography. The first application will be pornography.
The first mover who can surreptitiously replace cam girls doing stuff like drinking milk out of dog bowls while wearing dog costumes is going to make so much money from unsuspecting rubes who think they're actually paying strangers to degrade themselves.
Definitely. Porn sites tend to be leaders in many ways. I've seen it in UI/UX features. IIRC Pornhub/MindGeek had the thumbnail timeline scrub and timeline viewing hotspots before YouTube and other video sites.
And video compression, by extension. As for porn: a choreographer can express intent in a video, put it through a NN to decompose it into a vector representation, add the girl to the blender, and put it back through.
I think you're unaware of just how much some people make on OnlyFans. It's far more than a few dollars.
Not to mention that something like this will let you generate body types that are very rare, if not outright impossible, and animate them doing things that are also very rare or outright impossible.
Training on video might unleash new capabilities in language models. The volume of data in video format is huge, and video contains different information: common-sense patterns that are rarely described in text. It might empower Gato-like RL agents to quickly understand and solve tasks in new environments. Even training on raw audio might let language models work for under-represented languages or languages with little written text, learn to interpret sentiment from tone, and improve music composition.
So far we have
- multi modality - text, image, audio, video, math, code. At some point in the future brain scans will also become a cheap modality to train on.
- multi task - finetuning on hundreds of tasks at once and getting zero-shot task capabilities
- multi memory - besides the limited buffer, language models can use retrieval/search over an external corpus or knowledge base, as well as in-batch memories. This opens up large-scale context and lets you update recent factual information without retraining; it can also make a small LM perform as well as a huge LM that lacks the memory
- multi environment - a huge training corpus is great, but it is fixed, while environments are dynamic. Agents interacting with games, chat bots, robots, or a REPL can explore endless scenarios interactively.
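For the "multi memory" item, here is a minimal sketch of retrieval over an external corpus. Everything in it is an assumption for illustration: the helper names (`embed`, `retrieve`, `build_prompt`) are hypothetical, the toy corpus is invented, and a bag-of-words cosine similarity stands in for the learned embeddings a real system would use. No language model is called; the sketch only shows how retrieved text would be placed in front of the question.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus entries most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, corpus: list[str], k: int = 2) -> str:
    """Prepend retrieved entries so a model could read them as context."""
    context = "\n".join(retrieve(query, corpus, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Invented external memory; editing this list changes what the model
# would see, with no retraining involved.
corpus = [
    "The library opens at nine on weekdays.",
    "The cafeteria serves lunch from noon until two.",
    "The gym is closed on public holidays.",
]

if __name__ == "__main__":
    print(build_prompt("When does the library open?", corpus, k=1))
```

Updating a fact means editing `corpus`, which is the "without retraining" part of the point above.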
Since this comes from a Chinese university, they will likely drop the code/checkpoint, or at least API access, too (their text->image model already has an API).
We generated images from text. We generate videos from text. What's next? Do we generate bug-free video games or business application source code from text?
What would happen if we could translate videos into holograms that can be projected into the living room? Or even holographic virtual objects that can be felt?
Everything is converging to the ultimate use case: porn and virtual sex.
A solution to population decline or population explosion might be to conjure up holographic sex objects. It will cost the participant nothing, only their sperm or eggs, submitted to a government-affiliated contractor.
To increase population, the freemium user donates their sperm or eggs to a fulfillment center. To decrease population, the user notices nothing: the fertility fulfillment center is tasked with rejecting or accepting a constant stream of sperm and eggs.
Babies born out of this state system would literally own nothing and be happy. Their foster parents have access to all the well-being and health resources provided by the state. Divorce is at an all-time low due to the fact that people marry solely for the purpose, as nobody needs to work for a roof or gas anymore. Every need, every high (state regulated ofc), every material need (permanent tactile hologram objects that look indistinguishable from the real thing but are fixed to a specific room or ultrasound-speaker-lined walls) is provided.
Humanity is chained to this Matrix-like system where humans are harvested, and where the participant only provides seed and in return receives everything from the state. Food, shelter, entertainment, simulation of meaningful work (euro truck simulator 2223): all is provided, and you will own nothing and be happy.
The clip of two girls kissing looks innocent, but it shows the potential for AI to create believable virtual porn. They don't give the prompt for that video for some reason...
Everyone is freaking out about this. Wouldn't it be better if we just accepted it and got on with our lives?
When anyone can generate any video of any subject doing anything at all, we just need to come to grips with that fact. It's the new normal, and there's no going back.
I think we are pretty close to advances in language models, text to speech, and image generation automating the parasocial relationship. When your AI gf looks how you want, (virtually) does whatever you want, and talks to you however and whenever you want - I don't see how OnlyFans and the like will compete.
Comparative advantage and the power of boredom help here. “Hiring” people means you get things you don’t want or didn’t ask for, and that’s a good thing.
Yeah. My gf was a "specialist" in her time at a dungeon while she was paying her way through art college. There will always be 1% of people who can pay for a professional, and 1% of girls who are great at controlling their urination at those guys' faces. But realistically, the other 99% of dudes will definitely settle for an intelligent blow-up doll and an AI girlfriend. This is just hijacking the hardwired behavior of testicles. The people who really get cut out of this process are the 99% of girls who are not AI bots or dominatrixes and just want to meet a dude who isn't jacked into porn 24/7.
You won’t get that from Chinese research, it’s illegal.
Also noticed their “anime” prompt doesn’t look like anime; it looks like animated copyright-free clip art. I can’t read the original Chinese prompt though, so maybe it makes sense there.
Exciting times.