I trained an AI to copy my voice and it scared me silly (thenextweb.com)
48 points by NicoJuicy on Jan 22, 2018 | 18 comments



I'm honestly surprised that anybody could mistake the Lyrebird generated voices for something real. It's got this weird buzzy noise about it which sticks right out to my ear, and only a little bit of noticeable influence from the author's voice.

Either some people are much worse at perceiving this than I'd expect, or the article is hyping up something which just doesn't deserve that level of hype - yet.


I think the author had a slightly over-the-top reaction of "it’s time to unplug everything, chuck my phone, don a tinfoil hat, and move to the woods." The buzzing and the computer-generated quality of the audio were immediately noticeable. I also consider myself very good at noticing sound differences, but I think that's beside the point in this case.


It isn't hard for them to hold onto the training audio to use with later versions of the AI.

Assuming, of course, that the buzzing and other flaws weren't added to lull you into a false sense of security.


People with phonagnosia are incapable of recognizing anyone's voice. If it is like face-blindness, then probably there is a spectrum of ability to recognize voices, and so what sounds obviously computerish to you might be indistinguishable from the real thing to someone else.

https://www.thecut.com/2016/09/voice-recognition-apparently-...


The thing is - I find it almost impossible to recognise people's voices on the phone, unless they have a particularly distinctive voice - usually the accent is the distinguishing factor. But I think I'd get this straight away, even over the phone. Maybe something to do with being a (very amateur) music producer, though.


You can definitely tell which one's the AI, but if you weren't particularly paying attention on, say, a phone call (where you're used to bad audio quality anyway), you might not immediately notice.


I can't speak for anyone else but I would recognize that as software-generated instantaneously. It's unmistakable imo, and actually sounds significantly worse than, for example, these samples from Google's Tacotron 2 system: http://www.androidpolice.com/2017/12/28/googles-new-text-spe...


Would you be able to tell the difference on a GSM cell phone call?


Mmm, I think so yes. The voice had a distinctly artificial character imo.


The example in the article is really bad. It won't fool anybody. It's distinctly robotic and nasal.


It's a little off topic because they do a different thing than in the article, but check this out for some (actually) scary good computer generated examples:

https://google.github.io/tacotron/publications/tacotron2/ind...


People can be fooled by an entirely different sounding person and a glib excuse...


It's a touch robotic, but I've heard real people sound like that (and worse) in Google hangouts with poor connections. I also imagine it can only get better with more knowledge and tech.


Overly dramatized headline to get people to click on the article. It's obvious which one's the AI when listening to both samples.


The "Adobe Photoshop for voice" is quite amazing: https://youtu.be/I3l4XLZ59iw (you can skip the first ~third to get to the demo)


The headline is overhyping, but the result is surprisingly good given that I had very low expectations. It won't fool anyone, but should be usable for some scenarios.


Note: it requires 30 sentences, not a minute as stated in the article.


all i'm getting is upload failed :(



