Hi everyone, cofounder of play.ht (the startup behind this podcast) here. Let me know if you have any questions.
To give more context: the podcast was entirely AI generated. The content itself was generated by a GPT-3 model fine-tuned on Steve Jobs' biography, and the voices were cloned from a few hours of both Joe's and Steve's voices, even though it was tough to get good content for Steve Jobs. The podcast artwork was generated by Stable Diffusion (SD).
We will be releasing more episodes soon which will be even more mind-blowing!
Please take the naysayers with a grain of salt; this is a fantastic demo of what's possible. While the voices aren't 100% convincing on good speakers, over the phone they would be indistinguishable. The content of the podcast is also spot on, with Joe's sort of rambling and Steve always acting like he's making a point or telling a story.
Honestly I expected Joe to ramble a lot more, but I understand this demo is meant for people to listen to Jobs, so they probably had to cut out a lot of Joe's ramblings.
Wow, I was sure, from listening to the first few minutes, that the script was written by a human trying to be funny. The part about the NeXT Computer and the three applications was just too funny. Edit: Not to mention the reference to the movie Ghost.
We will never use any cloned voice in any commercial way without consent and compensation. We only wanted to show the community what is possible and what generative AI models can do.
Aren't you using the cloned voices to generate marketing material for your company to make profit? Indirectly profiting without consent or compensation seems like a difficult ethical line. It also seems like something you want to have a very good stance on when a law could be passed that completely shuts down your ability to operate.
I know the reaction here is mixed and tbh that's what makes HN so interesting for me. But FWIW I love this podcast! It's a great demonstration of what AI can do. I am going to share it with my students before the next class so that they see what can be possible.
>...the content itself was generated from a finetuned GPT3 on SteveJobs' biography
Was the actual dialog generated by GPT-3, or just Jobs' responses? If the former, was any of the dialog human edited/spliced together or was the entire script generated as a single output?
This was the prompt:
"
Podcast.AI
Great people, great interviews with our host Joe Rogan.
Episode 1 - Steve Jobs
Summary:
"
Then GPT3 generated the summary, then we added:
"
Transcript:
Joe:
"
That is all.
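For anyone who wants to try the same two-step prompting, here is a rough Python sketch of the flow described above: first complete the header to get a summary, then append the "Transcript:" cue and complete again. The exact whitespace between prompt lines is an assumption, and `complete` and `fake_complete` are hypothetical stand-ins for whatever text-completion call you use, not the actual pipeline behind the episode.

```python
from typing import Callable, Tuple

# Header text taken from the prompt quoted above; the newline layout is an assumption.
HEADER = (
    "Podcast.AI\n"
    "Great people, great interviews with our host Joe Rogan.\n"
    "Episode 1 - Steve Jobs\n"
    "Summary:\n"
)
# Cue appended once the summary has been generated.
TRANSCRIPT_CUE = "Transcript:\nJoe:\n"


def build_transcript_prompt(summary: str) -> str:
    """Second-stage prompt: header, generated summary, then the transcript cue."""
    return HEADER + summary.strip() + "\n" + TRANSCRIPT_CUE


def generate_episode(complete: Callable[[str], str]) -> Tuple[str, str]:
    """Run both stages with any completion function (hypothetical interface)."""
    summary = complete(HEADER)
    transcript = complete(build_transcript_prompt(summary))
    return summary, transcript


def fake_complete(prompt: str) -> str:
    """Placeholder model so the sketch runs offline; swap in a real model call."""
    if prompt.endswith("Joe:\n"):
        return "<model-written transcript>"
    return "<model-written summary>"


if __name__ == "__main__":
    summary, transcript = generate_episode(fake_complete)
    print(build_transcript_prompt(summary))
    print(transcript)
```

Passing the completion function in as an argument keeps the prompt logic separate from any particular model or API, so the same two stages can be reused with a different backend.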
I still have a hard time believing that the opening of the script was auto-generated. Did GPT-3 really generate the part about the movie Ghost? If so, it has a surprising amount of understanding of what it's generating.
We recently fine-tuned another voice with only 1 hour of audio, though. I think eventually (soon) we will only need 15-20 minutes with zero-shot cloning, not even fine-tuning.