I trained an AI to copy my voice and it scared me silly

camtarn · on Jan 22, 2018

I'm honestly surprised that anybody could mistake the Lyrebird generated voices for something real. They've got this weird buzzy noise about them which sticks right out to my ear, and only a little bit of noticeable influence from the author's voice.

Either some people are much worse at perceiving this than I'd expect, or the article is hyping up something which just doesn't deserve that level of hype - yet.

hortonew · on Jan 22, 2018

I think the author had a slightly over-the-top reaction with "it’s time to unplug everything, chuck my phone, don a tinfoil hat, and move to the woods." The buzzing and the computer-created sound of the audio were immediately noticeable. I also consider myself very good at noticing sound differences, but I think that's beside the point in this case.

brlewis · on Jan 22, 2018

It isn't hard for them to hold onto the training audio to use with later versions of the AI.

Assuming, of course, that the buzzing and other flaws weren't added to lull you into a false sense of security.

agarden · on Jan 22, 2018

People with phonagnosia are incapable of recognizing anyone's voice. If it is like face-blindness, then probably there is a spectrum of ability to recognize voices, and so what sounds obviously computerish to you might be indistinguishable from the real thing to someone else.

https://www.thecut.com/2016/09/voice-recognition-apparently-...

camtarn · on Jan 22, 2018

The thing is - I find it almost impossible to recognise people's voices on the phone, unless they have a particularly distinctive voice - usually the accent is the distinguishing factor. But I think I'd get this straight away, even over the phone. Maybe something to do with being a (very amateur) music producer, though.

oxguy3 · on Jan 22, 2018

You can definitely tell which one's the AI, but if you weren't particularly paying attention on, say, a phone call (where you're used to bad audio quality anyway), you might not immediately notice.

markbnj · on Jan 22, 2018

I can't speak for anyone else, but I would recognize that as software-generated instantaneously. It's unmistakable imo, and actually sounds significantly worse than, for example, these samples from Google's Tacotron 2 system: http://www.androidpolice.com/2017/12/28/googles-new-text-spe...

woodson · on Jan 22, 2018

Would you be able to tell the difference on a GSM cell phone call?

markbnj · on Jan 24, 2018

Mmm, I think so, yes. The voice had a distinctly artificial character imo.

dingo_bat · on Jan 22, 2018

The example in the article is really bad. It won't fool anybody. It's distinctly robotic and nasal.

0xdada · on Jan 22, 2018

It's a little off topic because they do a different thing than in the article, but check this out for some (actually) scary good computer generated examples:

https://google.github.io/tacotron/publications/tacotron2/ind...

AstralStorm · on Jan 22, 2018

People can be fooled by an entirely different sounding person and a glib excuse...

falcolas · on Jan 22, 2018

It's a touch robotic, but I've heard real people sound like that (and worse) in Google Hangouts with poor connections. I also imagine it can only get better with more knowledge and tech.

lechiffre10 · on Jan 22, 2018

Overly dramatized headline to get people to click on the article. It's obvious which one's the AI when listening to both samples.

BrandoElFollito · on Jan 23, 2018

The "Adobe Photoshop for voice" is quite amazing: https://youtu.be/I3l4XLZ59iw (you can skip the first ~third to get to the demo)

chj · on Jan 23, 2018

The headline is overhyped, but the result is surprisingly good given that I had very low expectations. It won't fool anyone, but it should be usable for some scenarios.

lozenge · on Jan 22, 2018

Note: it requires 30 sentences, not a minute as stated in the article.

thinkMOAR · on Jan 22, 2018

All I'm getting is "upload failed" :(