Deepfake version of Travis Scott (genius.com)
52 points by saadalem 6 days ago | 45 comments

Since all of his music is R&Bish and auto-tuned, it's not that hard for a computer to generate that sound. (I have to admit I've hit the "get off my lawn" phase, missing the "good ole days" of "real music", i.e. '90s hip hop.)

The irony being that the '90s hip hop sound was largely influenced by MPC60/3000 quantization, amongst other drum machines.


They generated the lyrics and melodies using AI. Everything else was done by a person, including the vocals.

Do you have a link on that? I can’t find it in the HN link.

The Adweek source linked in the article goes into a little more detail, but not much. It specifically mentions that a model was used for the lyrics first, and that the AI then created a melodic and percussive arrangement. It doesn't say much more about how it was done.

I found another article that states they had a real human voice sing the lyrics (then auto-tuned it to Scott's style), but there are no actual sources there, or in any other articles I can find.

https://musebycl.io/digital-data/agency-used-ai-make-travis-...


I did follow through to Adweek, but given that it is Adweek and not NME, I wasn’t sure how literally to take the word “arrangement” there :-) Thanks!

See the Adweek article linked in OP's article; it has more details. The AI part was only the lyrics, melodies, and percussion arrangements, not the production itself.

I didn't get that from the article.

I'm not clear what you mean. How does auto-tune make it easier for the computer to generate?

A computer-generated voice has certain artifacts that make it a bit distinctive. An auto-tuned natural voice has similar artifacts, so if an artist uses auto-tune, there's less noticeable difference between their real (computer-altered) voice and a fake, computer-generated voice.

A more exaggerated example would be artists that use a vocoder (e.g. Daft Punk or something like that), which would be even easier to synthesise convincingly because the real vocals are also mostly synthesised.
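
To make the auto-tune point concrete: hard pitch correction essentially snaps a detected pitch to the nearest equal-tempered semitone, and those abrupt corrections are exactly the kind of artifact a synthetic voice can hide behind. A toy sketch (the pitch-detection step is assumed; f0_hz is a hypothetical detected fundamental in Hz):

    import numpy as np

    def snap_to_semitone(f0_hz, a4=440.0):
        # Hz -> fractional MIDI note number
        midi = 69 + 12 * np.log2(f0_hz / a4)
        # round to the nearest equal-tempered note, then back to Hz
        return a4 * 2 ** ((round(midi) - 69) / 12)

    # e.g. a slightly flat A4 gets pulled up to exactly 440 Hz
    print(snap_to_semitone(432.0))  # -> 440.0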


Or VibeSquaD.

(not UK Vibe Squad)


Less "uncanny valley" affect, because in both cases, you're hearing a synthesized voice.

I can't parse this comment. What is "R&Bish"? Why does it matter that it's easy? What makes '90s hip-hop "real" or implicitly superior to the last two decades?

R&B: https://en.wikipedia.org/wiki/Rhythm_and_blues

ish: https://www.dictionary.com/browse/ish

So, having an aesthetic similar to rhythm and blues.


If we take "created by AI" to mean that machine learning generates a huge amount of material and humans then pick and release the most interesting pieces, then AI can create some pretty amazing stuff.

Imagine a GPT-2-like algorithm trained on all the world's music. It could then generate practically any style, and probably quite creatively.

Then fine-tune it on a particular artist's style, and it will mimic that artist. Depending on how strongly you fine-tune it, you can bias the output more towards the target artist or towards general music.

You could then generate music in the same general style as an artist while taking any amount of inspiration from the rest of the world of music. I imagine that with enough data and the right algorithms it would work very well and sound fantastic.
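
To make that biasing knob concrete, here's a minimal sketch; base_logits and artist_logits are hypothetical next-token scores (over symbolic music tokens, e.g. MIDI-like events) from the pre-trained model and the fine-tuned model respectively:

    import numpy as np

    def sample_next_token(base_logits, artist_logits, alpha=0.7, temperature=1.0):
        # alpha=1.0 mimics the artist exclusively; alpha=0.0 is generic music
        logits = alpha * artist_logits + (1.0 - alpha) * base_logits
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        return np.random.choice(len(probs), p=probs)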

The same applies to visual works, which I'm sure the reader is familiar with.

Just remember: the algorithms we have today are the starting point, not the end point.


> It could then generate practically any style and probably quite creatively.

Mimicking a style... easy. Delivering a coherent message or story... harder. Emotional impact... even harder. IMO. But I think these are great goals for an Artistic Turing Test.


Most popular music doesn't deliver a coherent message or emotional impact.

Some genres (I'm looking at you, mumble rap) don't even require coherent language.


> Most popular music doesn't deliver a coherent message or emotional impact.

It does via the Forer Effect. The goal is to make generic statements and tell the listener that you’re talking about them (or someone just like them). Listeners then subconsciously fill in the blanks and create the most relatable impactful art imaginable.

That’s why beauty is in the eye of the beholder. You need the audience to impart the meaning ;)

https://en.wikipedia.org/wiki/Barnum_effect


You don't need interpretable language for emotional impact or even narrative. Have you ever felt moved by a song in a language you don't know a word of, or a purely instrumental song? The same's true of paintings, movies without dialogue, etc.

Mumble rap is a totally legitimate art form. I just tend to not be a huge fan of most of the song structures and compositions, personally.


It may be art, but being art doesn't make it deep or emotionally impactful. Even the people who like mumble rap often say they just care about the beat, not the vocals.

That, to me, means it could probably be generated by an AI. And I'd say the same of any genre for which repetition matters more than lyricism or storytelling. I could see a machine coming up with Lil Pump's Gucci Gang[0] much more easily than Wu Tang's Triumph[1].

[0]https://www.youtube.com/watch?v=4LfJnj66HVQ

[1]https://youtu.be/cPRKsKwEdUQ?t=51


Seems like this is more of an uncomfortable vendetta backed by preference instead of any examples, logic, or facts.

We can deliver coherent stories. We recently published https://notrealnews.net. We're committed to improving journalistic integrity in the information age.

Style has always been slightly more important than content. Style is why people care about content. Currently art seems to be evolving into raw style devoid of any coherent content.

“I ain’t got the surfers ‘cause I know I’m not that hard”


That's assuming that what we describe as "artistic" is not the result of processes that a neural network can emulate reasonably well. But even if that were the case, I'd argue that 99% of music is rehashing previous work. Hell, that's exactly what I want from most of my favorite bands when they release a new album: I just want more of what I liked before, maybe with a few new things thrown in. It's extremely unlikely (more like never) that I end up enjoying new albums where a band makes very different music than on previous albums.

Check out https://www.openai.com/blog/musenet/

Same tech as GPT-2 applied to musical composition!


You would need general AI for this.

What is within reach of current methods is EDM. The function that produces a dance beat is a lot more tractable: it's a pure signal-processing problem, not a linguistic one. I think sooner rather than later we will see a startup churning out chart-topping robot beep boop bangers. It seems like this is the Hegelian endpoint of music.
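
To illustrate how tractable that is, here's a toy sketch that synthesizes a two-bar four-on-the-floor kick pattern from scratch; pure signal processing, no learning involved, and the filename is made up:

    import numpy as np
    import wave

    SR = 44100

    def kick(dur=0.25):
        # a sine whose pitch sweeps down quickly, with a decaying amplitude
        t = np.arange(int(SR * dur)) / SR
        freq = 150 * np.exp(-t * 20) + 50
        return np.sin(2 * np.pi * np.cumsum(freq) / SR) * np.exp(-t * 8)

    bpm, bars = 128, 2
    beat = int(SR * 60 / bpm)
    track = np.zeros(beat * 4 * bars)
    for i in range(4 * bars):  # kick on every beat
        k = kick()
        track[i * beat : i * beat + len(k)] += k

    with wave.open("robot_banger.wav", "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SR)
        f.writeframes((np.clip(track, -1, 1) * 0.8 * 32767).astype("<i2").tobytes())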


There isn't any resemblance to Travis Scott besides the obvious "It's lit" ad-libs, to be honest.

I loved the parts where they showed some PHP echoing some HTML to give the video that hacker vibe.


The voice was re-recorded and mixed to sound close to Travis's real vocals; no digital voice reconstruction. However, the reconstruction of his cadence/melodic style is cool to hear!

Yeah, if it were like that it would be truly impressive. The project is still cool nonetheless.

Check out AI-made black metal; it's not fantastic, but it's well above garbage.

https://dadabots.bandcamp.com/album/coditany-of-timeness


I feel like it probably took a lot of tweaking and overlaying the more cogent things it generated to make this (?). Very impressive though.

If it takes that much tweaking, how do you know it was made by AI?

If you don't see the model, or any of the source code, or any other part of the process, then aren't you just cherry-picking input?


I guess there's an argument that the creative process is automated but the production is not.

I'm certain that as the famous musicians die off or retire from performing, there will be a huge business of recreating their sound and generating new material.

I can't wait for AI to recreate the unique sound of Her's, a remarkable duo from Liverpool that I only recently came across... only to find out a few days later that they had died in a tragic, unfathomable, unfair car accident, along with their tour manager, after a concert in the US last year.

Using famous actors for promotions after they die is already a business[0].

But why bother with living musicians or their estates at all? Use deepfake technology to generate a physical persona, and AI to generate a voice and style, all unique enough to be legally distinct and copyrightable, and fully owned by a company. Live performances can even be done via hologram.

I can easily imagine a future where most creative media, including movies, is entirely or almost entirely AI generated. Certainly, actors and performers won't be real people.

[0]https://www.vice.com/en_us/article/539a5z/hollywoods-post-bi...


This will be kind of sad, since it will limit the market for new musicians and new styles. We will get so caught up in preserving the past that we will ignore the opportunities of the present.

I like to give humanity the benefit of the doubt in these cases.

I’m not sure if that’s right, but I’d certainly rather not let it taint my appreciation of music.

I don’t think it will stop people from creating either.

But you might, unfortunately, be onto something with regard to the market. Maybe less because of what people appreciate, and more because the suits will pursue the lowest common denominator with the highest revenue potential.


Everything is a remix, everything contemporary has history.

That's the first thing that comes to mind for me as well. More Zappa? More Bradley Nowell? Hell yes!

Then I think this is a curse. Internet pranksters generate 1,000,000 amazing new Frank Zappa songs and laugh. Your life is meaningless. You are nothing.


Besides imitating a person's singing voice, training models to create new music in an artist's style could make for an interesting crowdsourcing use case.

I've posted this on HN before: NHK "raised" a dead singer, who sang a new song for this year's New Year's special using similar tech:

https://www.japantimes.co.jp/culture/2020/01/10/films/nhk-ra...

It's starting!
