I found another article that states they had a real human voice sing the lyrics (then auto-tuned it to Scott's style), but there are no actual sources there, or in any other articles I can find.
A more exaggerated example would be artists that use a vocoder (e.g. Daft Punk or something like that), which would be even easier to synthesise convincingly because the real vocals are also mostly synthesised.
(not UK Vibe Squad)
So, having an aesthetic similar to rhythm and blues.
Then fine-tune it on a particular artist's style. It will then mimic that artist. Depending on how strongly you fine-tune it, you can bias more strongly towards the target artist or towards general music.
You could then generate music in the same general style as an artist but taking any amount of inspiration from the rest of the world of music. I imagine with enough data and the right algorithms it would work very well and sound fantastic.
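To make the "bias between the target artist and general music" idea concrete, here's a toy sketch. This is not how any real system from the article works: a bigram note model stands in for a neural net, the corpora are made up, and a blending weight `alpha` plays the role of fine-tuning strength.

```python
import random
from collections import defaultdict, Counter

def train_bigrams(sequences):
    """Count note-to-note transitions in a corpus of melodies."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts

def mix(general, artist, alpha):
    """Blend two transition tables; alpha=1.0 is pure artist style,
    alpha=0.0 is pure general-corpus style."""
    model = defaultdict(Counter)
    for table, w in ((general, 1 - alpha), (artist, alpha)):
        for a, nexts in table.items():
            total = sum(nexts.values())
            for b, c in nexts.items():
                model[a][b] += w * c / total
    return model

def generate(model, start, length, rng):
    """Sample a melody by walking the blended transition table."""
    seq = [start]
    for _ in range(length - 1):
        nexts = model.get(seq[-1])
        if not nexts or sum(nexts.values()) == 0:
            break
        notes, weights = zip(*nexts.items())
        seq.append(rng.choices(notes, weights=weights)[0])
    return seq

# Made-up toy corpora: a "general" corpus and a target artist's catalogue.
general = train_bigrams([["C", "E", "G", "C"], ["D", "F", "A", "D"]])
artist  = train_bigrams([["C", "Eb", "G", "Bb"], ["C", "Eb", "F", "G"]])

blended = mix(general, artist, alpha=0.7)  # lean toward the artist
print(generate(blended, "C", 8, random.Random(0)))
```

The point is just the dial: turn `alpha` up and the output stays closer to the artist's own transitions; turn it down and the rest of the musical world bleeds in.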
The same applies to visual works which I'm sure the reader is familiar with.
Just remember the algorithms we have today are the starting point not the ending point.
Mimicking a style... easy. Delivering a coherent message or story... harder. Emotional impact... even harder. IMO. But I think these are great goals for an Artistic Turing Test.
Some genres (I'm looking at you, mumble rap) don't even require coherent language.
It does, via the Forer Effect. The goal is to make generic statements and tell the listener that you’re talking about them (or someone just like them). Listeners then subconsciously fill in the blanks and create the most relatable, impactful art imaginable.
That’s why beauty is in the eye of the beholder. You need the audience to impart the meaning ;)
Mumble rap is a totally legitimate art form. I just tend to not be a huge fan of most of the song structures and compositions, personally.
That, to me, means it could probably be generated by an AI. And I'd say the same of any genre for which repetition matters more than lyricism or storytelling. I could see a machine coming up with Lil Pump's Gucci Gang much more easily than Wu Tang's Triumph.
“I ain’t got the surfers ‘cause I know I’m not that hard”
Same tech as GPT-2 applied to musical composition!
What is in reach of current methods is EDM. The function that produces a dance beat is a lot more tractable. It's a pure signal processing problem, not a linguistic one. I think sooner rather than later we will see a startup churning out chart-topping robot beep boop bangers. It seems like this is the Hegelian endpoint of music.
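To show how literally "pure signal processing" can be taken, here is a stdlib-only Python sketch that renders a four-on-the-floor kick pattern from scratch. Every parameter (the 150 Hz pitch sweep, the decay constants, 128 BPM) is an arbitrary illustrative choice, not anything from a real product.

```python
import math

SR = 44100  # sample rate in Hz

def kick_samples(duration=0.25):
    """One synthesized kick: a sine with a fast downward pitch sweep
    and an exponential amplitude decay, using per-sample phase
    accumulation so the sweep stays clean."""
    n = int(SR * duration)
    out = []
    phase = 0.0
    for i in range(n):
        t = i / SR
        freq = 150 * math.exp(-18 * t) + 50   # ~200 Hz sweeping down to 50 Hz
        phase += 2 * math.pi * freq / SR
        out.append(math.exp(-6 * t) * math.sin(phase))
    return out

def four_on_the_floor(bpm=128, bars=1):
    """Render a mono float buffer with a kick on every beat."""
    beat = 60 / bpm                           # seconds per beat
    total = int(SR * beat * 4 * bars)
    buf = [0.0] * total
    tail = kick_samples()
    for b in range(4 * bars):
        start = int(b * beat * SR)
        for i, s in enumerate(tail):
            if start + i < total:
                buf[start + i] += s
    return buf

buf = four_on_the_floor()
print(len(buf), max(abs(s) for s in buf))
```

No corpus, no language model, no lyrics: a few closed-form functions of time produce the backbone of the genre, which is the whole argument for why this is tractable today.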
I loved the parts where they showed some PHP echoing some HTML to give the video that hacker vibe.
If you don’t see the model, or any of the source code, or any other part of the process, then aren’t you just cherry-picking the input?
But why bother with living musicians or their estates at all? Use deepfake technology to generate a physical persona, and AI to generate a voice and style, all unique enough to be legally distinct and copyrightable, and fully owned by a company. Live performances can even be done via hologram.
I can easily imagine a future where most creative media, including movies, is entirely or almost entirely AI generated. Certainly, actors and performers won't be real people.
I’m not sure if that’s right, but I’d certainly rather not let it taint my appreciation of music.
I don’t think it will stop people from creating either.
But you might, unfortunately, be onto something with regard to the market. Maybe less because of what people appreciate, and more because the suits will pursue the lowest common denominator with the highest revenue potential.
Then I think this is a curse. Internet pranksters generate 1,000,000 amazing new Frank Zappa songs and laugh. Your life is meaningless. You are nothing.