
Show HN: VoiceClonr attempts to reconstruct human voices - voiceclonr
http://www.voiceclonr.com/?hnresubmit2
======
elmalaak
Certainly it's still far from being able to deceive a human into thinking
synthesized speech of any speaker saying anything is real, but it has
definitely and clearly captured a certain quality of each of those voices.
Really cool project and I'm sure it portends even more awesome work in the
area.

------
cristianpascu
I am wondering, what exactly is holding back the technology? Why isn't it there
yet?

~~~
voiceclonr
For one thing, getting very good quality takes a lot of resources: studio
quality recordings lasting many hours, plus voice directors and voice experts
who can sift through wav files and ensure phoneme boundaries are aligned, etc.
Even with all this, the quality may not be predictable, though results have
gotten reasonably good. It is a hard task to do at scale. (The HMM-based HTS
synthesis used in my app is scalable, but the quality is not that great and it
sounds robotic.)
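
To make the "phoneme boundaries are aligned" chore concrete, here is a minimal sketch of the kind of sanity check such a pipeline might run. Everything in it is an assumption for illustration: the `(label, start, end)` segment format, the `find_alignment_problems` name and the 5 ms tolerance are hypothetical and not taken from the app or from HTS.

```python
def find_alignment_problems(segments, duration, tolerance=0.005):
    """Return human-readable problems found in a phoneme labelling.

    segments: list of (label, start_sec, end_sec) tuples in time order
              (hypothetical format, chosen for this sketch).
    duration: length of the recording in seconds.
    tolerance: largest gap or overlap, in seconds, that is still accepted.
    """
    problems = []
    prev_end = 0.0
    for label, start, end in segments:
        # A segment must have positive length.
        if end <= start:
            problems.append(f"{label}: non-positive duration")
        # Each segment should start where the previous one ended.
        if abs(start - prev_end) > tolerance:
            problems.append(f"{label}: gap or overlap of {start - prev_end:+.3f}s")
        prev_end = end
    # The labels should cover the whole recording.
    if abs(duration - prev_end) > tolerance:
        problems.append(
            f"end: labels stop at {prev_end:.3f}s, audio lasts {duration:.3f}s"
        )
    return problems


labels = [("h", 0.00, 0.08), ("eh", 0.08, 0.20), ("l", 0.25, 0.30), ("ow", 0.30, 0.50)]
for problem in find_alignment_problems(labels, duration=0.50):
    print(problem)
```

A check like this only flags suspicious boundaries; deciding where a boundary really belongs still needs a person listening to the audio, which is part of why the process is expensive.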

~~~
cristianpascu
That means that you can't simply reverse engineer voice from, say, a sample
text read by a voice actor? I mean, down to the tiny bits of audio waveform?
I mean, how hard could it be? :)

~~~
Animats
If you can have a speaker read through a specific list of items, a useful
singing model can be constructed. That's how Vocaloid works.

What hasn't been done well yet is extracting a model from existing
uncontrolled voice samples. That's what this is trying to do. Once this works
well, software clones of dead singers will be popular. The RIAA is going to
hate this.

------
chillingeffect
I love that you're shooting for a holy grail!

Aiming for lyrics is a much higher target than everyday text though, due to
grammatical hints and the extra pitch and phrasing demands of lyrics. Your
results might hit people harder on non-lyrical textual bodies.

Keep up the good work, I'd like to make something like this for musical
instruments someday :)

P.S. Have you ever heard of Douglas Hofstadter's Letter Spirit project, which
synthesizes fonts from a subset?
[http://www.cogsci.indiana.edu/farg/mcgrawg/fonts.gif](http://www.cogsci.indiana.edu/farg/mcgrawg/fonts.gif)

------
slantyyz
This is very cool.

Reminds me of this -- before Roger Ebert died, he tried to have his voice
reconstructed by some company using audio from his TV show, etc., but alas, it
was too difficult at the time, so he ended up using one of the Apple TTS
voices instead.

~~~
WCityMike
I believe you're mistaken. Here is its debut, and it's pretty amazing:

[https://www.youtube.com/watch?v=93jREDSWOYY#t=1m23s](https://www.youtube.com/watch?v=93jREDSWOYY#t=1m23s)

Google "Cereproc" and "Roger Ebert" and you'll find that he was quite pleased
with it.

~~~
slantyyz
After some searching, this is the best support I could find:

--snip

In early 2010, Ebert and Chaz announced on the “Oprah Winfrey Show” that
they’d enlisted a Scottish company called CereProc to create a computerized
voice that more closely resembled Ebert’s own by using snippets of his TV
work, DVD commentaries and the like, but that never fully materialized. Alex
stayed with him until the end.

--snip

Alex being the Apple TTS voice I mentioned earlier.

Source:
[http://voices.suntimes.com/arts-entertainment/the-daily-sizz...](http://voices.suntimes.com/arts-entertainment/the-daily-sizzle/voicematch-expert-channels-roger-ebert-for-new-doc-life-itself/)

------
Zikes
I would pay a fair amount for a TTS engine that can accurately mimic GLaDOS's
voice.

~~~
schoen
This would be a triumph.

~~~
kinduff
With a note of huge success. It's going to be hard to overstate my
satisfaction.

------
ocdtrekkie
I'm super excited about this.

One of my biggest peeves right now is that voices cost a ton of money, few are
readily available otherwise, and a lot of the new stuff is cloud-dependent.
(Which is a big turn-off to me.)

What are you looking to do with this?

~~~
voiceclonr
Right now, I'd like to see if there are any tweaks that can improve the
quality (or even experiment with concatenative synthesis). Most likely, it
will stay robotic. So one extension would be to research singing synthesis
(the Vocaloid kind).
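
For readers unfamiliar with the term, concatenative synthesis builds speech by joining short recorded units end to end. The sketch below shows only the joining step, using a linear crossfade over plain lists of samples. It is an illustration under assumptions, not the app's code: `crossfade_concat` and its `overlap` parameter are hypothetical names, and a real system would also have to select units and match pitch at the joins.

```python
def crossfade_concat(units, overlap):
    """Join waveform units with a linear crossfade at each boundary.

    units: list of waveforms, each a list of float samples.
    overlap: number of samples blended at each join (hypothetical parameter).
    """
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        base = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit rises across the join
            out[base + i] = (1 - w) * out[base + i] + w * unit[i]
        # Append the part of the incoming unit that was not blended.
        out.extend(unit[n:])
    return out


joined = crossfade_concat([[1.0] * 4, [0.0] * 4], overlap=2)
print(len(joined))
```

The crossfade hides clicks at the joins, but it does nothing about mismatched pitch or timbre between units, which is one reason naive concatenation can still sound unnatural.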

~~~
ocdtrekkie
What I meant was, are you looking at making it available for others to use in
things (open source, perhaps), or looking to make a product out of it? And if
the latter, something cloud-based, or something I could run on my own machine?

~~~
voiceclonr
Yes, I'm looking into a cloud-based API that anyone can call into.

------
secfirstmd
There have been rumours of government capability to do this for some time. For
example, to use false voice messages for Radar instructions to enemy fighters
etc. Interesting to see it in the commercial space.

~~~
hellbanner
Links or references?

~~~
secfirstmd
It's been a while, but let me try to dig some up.

------
scottydelta
And I was hoping to find Morgan Freeman's voice there already.

~~~
voiceclonr
I thought about it - but I would have been disappointed with the synthetic
voice myself, so didn't even try :)

------
ocdtrekkie
Request: Star Trek computer voice (Majel Barrett Roddenberry)

There are hundreds of episodes containing it, including remastered audio in the
HD versions of TNG.

------
fasouto
Very nice project, congratulations!

One suggestion: make the text-to-speech button bigger and centered (I missed
it the first time).

~~~
voiceclonr
Thanks! Will make the button changes tonight. Which device were you on when you
missed it?

------
acd
Could a deep neural network clone the voice of a person given previous sound
recordings of that person speaking?

~~~
Houshalter
Yes, see here:
[https://youtu.be/-yX1SYeDHbg?t=38m35s](https://youtu.be/-yX1SYeDHbg?t=38m35s)

------
SchizoDuckie
They all just sound robotic to me.

Why even attempt to get things like imitating specific people's voices to work
when your speech isn't even fluid and pronounced clearly to begin with?

~~~
voiceclonr
I don't know if your response is a review of my app or of the idea itself. If
it's my app, it's an iterative process like many things. I didn't know what I
would end up getting, so I made an attempt.

