
Launch HN: Lyrebird (YC S17) – Create a digital copy of your voice - adbrebs
Hi HN!

We are the co-founders of Lyrebird
([https://lyrebird.ai/](https://lyrebird.ai/)) and PhD students in AI at
University of Montreal. We are building speech synthesis technologies to
improve the way we communicate with computers. Right now, our key innovation
is that we can copy someone's voice and make it say anything. The tech is
still at an early stage, but we believe it will eventually make possible a
wide range of new applications, such as:

- reading text messages aloud in the voice of the sender,

- reading audiobooks in the voice of your choice,

- giving a personalized digital voice to people who lost their voice due to a
disease,

- allowing video game makers to have more customized dialogs generated on the
fly, or avatars of their players,

- allowing movie makers to freeze the voice of their actors so that they can
still use it if the actor ages or dies.

Yesterday we launched a beta version of our voice-cloning software: anyone can
record one minute of audio and get a digital voice that sounds like them.

We know that many on HN are concerned about potential misuses surrounding
these technologies and we share your concern. We write further on our ethical
stance on this page:
[https://lyrebird.ai/ethics/](https://lyrebird.ai/ethics/).

Our blog post about the launch, which features the first video combining
generated audio and generated elements of the video:
[https://lyrebird.ai/blog/create-your-voice-avatar](https://lyrebird.ai/blog/create-your-voice-avatar)

There was a thread about us on HN when we launched our website four months ago
([https://news.ycombinator.com/item?id=14182262](https://news.ycombinator.com/item?id=14182262)),
but at that time no one could test our software yet and we did not really
answer the community's questions. So this time we are ready for questions and
would love some feedback!
======
woollysammoth
This looks really great, congrats! Forgive me if I missed something, but I was
wondering if you could clear up some confusion. From the terms: "Subject to
the Biometric Data Agreement, you hereby grant to us a fully paid, royalty-
free, perpetual, irrevocable, worldwide, non-exclusive and fully sublicensable
right (including any moral rights) and license to use, license, distribute,
reproduce, modify, adapt, publicly perform, and publicly display Your Voice,
Digital Voice..."

Just to be clear, the license of the voice/digital voice is revoked upon
deletion of the recordings? I understand it is subject to the biometric
agreement, but the words perpetual and irrevocable still worried me. Thanks!

~~~
sotelo
Yes! This is what our lawyers suggested to protect ourselves.

We delete all the recordings when you click delete, so we can't recreate the
voice anymore. However, this is still necessary in case we share some
generated sentences in social media or so (like we're doing on twitter now).

~~~
JoshTriplett
> However, this is still necessary in case we share some generated sentences
> in social media or so (like we're doing on twitter now).

This is something that you should only do with the permission of the user who
provided the voice. You _don't_ need generalized permission to do that for
every user, and given the nature of the technology, you shouldn't ask for such
permission.

~~~
Whitestrake
From the grandparent comment:

> This is what our lawyers suggested to protect ourselves.

Generally speaking, a lawyer's advice is going to be optimized for maximum
protection in possibly unforeseeable circumstances, not for what might
actually be needed or even reasonable to request of every user.

Generally speaking, companies aren't going to go out of their way to rein in
their lawyer. Most people won't even read that fine print, unfortunately.

------
rayalez
Sounds amazing! Just to add a use case - for many people, creating a decent
voiceover is one of the big sticking points for producing youtube videos or
educational courses. If I could write a script, and have software generate a
decent enough voiceover, it would be amazing.

It's not even necessary to copy anyone's voice, as long as there's a selection
of the most comprehensible and human-sounding ones.

Then, you could even automatically generate a slideshow presentation from a
few illustrations and headlines, and that would make "rendering" articles into
videos very fast and easy. I'm sure a lot of people would pay for such a
service.

\----

By the way, I recently encountered Deep Voice 2, a similar research project
by Baidu:

[http://research.baidu.com/deep-voice-2-multi-speaker-neural-...](http://research.baidu.com/deep-voice-2-multi-speaker-neural-text-speech/)

Results are very impressive.

~~~
pavel_lishin
I made a joke video for work, featuring clips of Sir David Attenborough
narrating a fake nature documentary I cut together from video I took at work.

It would have been an order of magnitude better if I could just generate
arbitrary phrases in his voice.

(Or maybe not; maybe the constraint made the video better.)

~~~
joshschreuder
I think the constraint makes the video funnier. Take, for example, dinoflask's
edits of Blizzard's Jeff Kaplan:

[https://www.youtube.com/watch?v=gXTrrTX7YuY](https://www.youtube.com/watch?v=gXTrrTX7YuY)

------
dantiberian
While it's good that you have an ethics page:
[https://lyrebird.ai/ethics/](https://lyrebird.ai/ethics/), it only has two
ethical guidelines:

* Spread awareness of this technology

* Your digital voice remains yours

I would feel a lot better about this if you also had explicit ethical
boundaries, for example disallowing users impersonating someone else, e.g.
Donald Trump, Barack Obama. "Your digital voice remains yours" sort of sounds
like you won't use/share my digital voice with others, but doesn't directly
address whether bad actors can maliciously impersonate someone who hasn't
registered with Lyrebird.

~~~
jeffmould
To build on that a bit. My bank uses voice recognition as a security measure
and to authenticate me when I call in. Throw a bad actor in the mix and that
becomes a security issue.

~~~
gervase
Reminds me of the classic "My voice is my passport. Verify me." Hopefully that
is not the only second-factor option they provide?

~~~
jeffmould
Ha! Luckily it's opt-in/out and they do allow you to keep a pin/code word as a
backup.

------
slackstation
Their ethics don't seem to be something they take seriously as the video they
use to promote their own site is an impersonation itself.

Seems from right out of the gate, they are breaking their own ethical
guidelines as a cheap promotional tactic. If they care that little about
themselves and a former president of the United States, what do they care
about your likeness.

It also doesn't help that you give them a universal perpetual license to do
whatever they want (including selling your likeness for someone else's use) by
uploading.

This just seems like a slimy team that put up an ethics page as a CYA.

I'm willing to eat my words if they had Barack Obama's consent to use his
digitized voice for this, but it's highly doubtful since there's also the coat
and seal of the President of the United States on the flag in the background
which would be a massive ethical breach of a former President just to promote
a silly little startup.

~~~
adbrebs
> Seems from right out of the gate, they are breaking their own ethical
> guidelines as a cheap promotional tactic. If they care that little about
> themselves and a former president of the United States, what do they care
> about your likeness.

We state in our blogpost that we make an exception for Obama/Trump in order to
raise public awareness. Both of them are regularly used in Machine Learning
benchmarks (for example [0] [1]). Note that we don't allow users to generate
from Trump/Obama's voice.

Once again, we care a lot about these issues and that's why we only allow
users to copy their own voice.

[0]
[http://www.washington.edu/news/2017/07/11/lip-syncing-obama-...](http://www.washington.edu/news/2017/07/11/lip-syncing-obama-new-tools-turn-audio-clips-into-realistic-video/)

[1]
[https://www.youtube.com/watch?v=ohmajJTcpNk](https://www.youtube.com/watch?v=ohmajJTcpNk)

These issues are challenging and suggestions about how you think the
technology should be introduced/regulated are very welcome.

~~~
slackstation
It's still hypocritical and insulting to the reader's intelligence.

You could make Obama say anything. He could say something humorous, something
that he's never said before. You would have just as impressive of a demo if
you had Obama say "I'm a little teapot short and stout..." and then used
overlay text to promote yourself. You chose instead to make a video where he
promotes your startup.

That is both hypocritical and immoral, using not only his personal likeness
but also the seat of the Presidency of the United States.

This fast and loose way that Lyrebird treats their technology only makes me
think that they don't really think about the massive negative potential of the
technology and just want to get scale / profitability as fast as possible.

------
_lex
"I'm using my voice as my password".

Vanguard allows voice authentication
([https://investor.vanguard.com/account-conveniences/voice-ver...](https://investor.vanguard.com/account-conveniences/voice-verification?lang=en)),
and who knows who else will roll something similar out in the future. Yeah,
it's really, really dumb, but it's happening in production now. I wouldn't use
this product if I were you, but honestly you should also not use voice
verification/authentication for anything.

~~~
tunetine
Fidelity began verifying voice for telephone customer service a short while
back. They recorded me during the call then at the end said they were going to
use it to verify for future calls. No way to opt out.

~~~
mysterydip
Did they say something to that effect before the call started, or only told
you at the end? Or did they just use the "this call may be monitored for
quality assurance and training purposes" blanket?

~~~
tunetine
I remember it too vaguely at this point, but something was mentioned in the
beginning while I was waiting. I feel like it was worded along the lines of a
promo, or I wouldn't have told the rep I wasn't interested multiple times.
"Verifying is now easier and more secure with voice verification..."

------
CM30
Seems like a really useful piece of technology. As you said, it's got quite a
few applications in the gaming, film, medical and messaging industries.

That said, am I the only one imagining this getting abused by people in those
fields as well? Seems like a good way to avoid paying voice actors for future
work. Just record the minimum 30 recordings, then use this software to create
all their future dialogue.

This could lead to some interesting lawsuits over who a character's voice
belongs to and whether a company has the right to use someone's voice
recordings to get free work done on future projects. Like how during the
production of Trail of the Pink Panther, Peter Sellers' widow sued the film's
producers and studio over them using clips of him from deleted scenes in
earlier films in the movie.

------
0x4f3759df
The innovation I'm waiting for is

>> reading audiobooks with the voice of your choice, AND the speed of my
choice.

~~~
adbrebs
Yes, definitely! This is also something we are working on.

------
dataisfun
I don't buy the "raising awareness" argument, ethically speaking. To do that,
you could release demo files that show the capability without weaponizing it
through easy access. It'd be great to increase awareness around our
vulnerability to EMP attacks, but we don't need to publish specs and/or sell a
working prototype to make that case.

This is just one of those areas where the negative implications, I believe,
far outweigh the positive ones. Aside from the noble cause of helping the
disabled, most of the use cases center around entertainment. As great as that
may be, the likely application to fraud and the potential for a catastrophic
misuse in matters of war and peace just dwarf any upside.

------
plastroltech
Will this technology be licensed for redistribution or only for online API
use? I ask because in the video game scenario it would be great to have this
in a library I could distribute instead of relying on the API to be available
at all times.

~~~
adbrebs
The first version will only be an online API. I agree with you that we should
eventually think about licensing it for offline/embedded redistribution.

------
tekromancr
Really fun stuff. I noticed that it seems to have problems starting sentences.
Especially if I try to start a sentence with "hi,". Interesting nonetheless.
This passage seems to be rendered fairly well:
[https://lyrebird.ai/g/LYoVuaZm](https://lyrebird.ai/g/LYoVuaZm)

Also, [https://lyrebird.ai/g/D3Fw328D](https://lyrebird.ai/g/D3Fw328D)

~~~
adbrebs
Unfortunately, for certain voices our model has difficulty generating the
very beginning of the sentence. We hope to fix this problem soon.

Some other people shared their voices on twitter if you want to compare:
[https://twitter.com/LyrebirdAi](https://twitter.com/LyrebirdAi)

------
S_A_P
I guess I see a ton of upside here, but I also see that this could easily be
abused and possibly become a tool to completely destroy someone's life. Imagine
getting a phone call from your "partner" saying they cheated on you. I don't
know how it would be usable (API?) and I do still detect a bit of
artificialness to the voice, but as this gets better I worry about the
downsides and potential for harm by copying someone's voice.

~~~
adbrebs
Thank you for raising those concerns. We take those very seriously. You can
read more about our ethical stance in this article:
[https://lyrebird.ai/ethics/](https://lyrebird.ai/ethics/)

To recap:

\- we want to start by raising public awareness about the technology and we
did demos with the voices of Trump/Obama for that,

\- your digital voice is yours, people cannot use it without your
authorization.

------
Abundnce10
I just tried to sign up with a Hotmail email address and I got this error
message: _This email cannot be used to create an account. It might be due to
your email domain name._

I realize Hotmail isn't the sexiest email provider these days but it's one of
the more commonly used. Do you have a list of email domains you allow?

~~~
sotelo
We accept hotmail. It might be because of some special characters. Do you use
+ ?

~~~
Abundnce10
Nope. Just letters and numbers. Same with my password.

I tried with my Gmail address and it worked fine. That address has no numbers
in it. I used the same password. If you aren't prohibiting Hotmail addresses
then it must be the numbers in the email address that are triggering some
validation.

Regardless, I have access now. Looking forward to trying your product!

~~~
bitwize
Don't reveal your powerlevel in HN, dude. Now you've not only reduced the
search space for your Hotmail password, you've clued an attacker in that
that's also your Gmail password!

~~~
Abundnce10
I use Pass [0] to generate unique/random passwords for each site I sign into,
I don't use the same password for all sites. I was just describing what I used
for this instance (only characters and numbers), not what I do every time. I
appreciate your concern though!

[0] [https://www.passwordstore.org/](https://www.passwordstore.org/)

------
songzme
When the demo page was launched it seemed like Lyrebird was going to be an
API. Will there still be an API?

~~~
adbrebs
Yes, definitely. We are starting a private beta at the moment.

~~~
songzme
Awesome! I signed up back then but haven't heard anything since. Is there
anything else I can do to try out the beta?

~~~
adbrebs
Not yet. We are starting with a few developers/companies only and will expand
it progressively.

What would be your use case?

~~~
songzme
My wife built an app that teaches people (foreigners) how to speak English.
Based on the words in their flashcards, we generate dynamic sentences so
during practice their flashcards are rarely the same. For example, if I have
(happy, sad, run, write) in my backpack, then a sample flashcard would show up
as: "When I run, I will be sad".

I see lyrebird api being very helpful in helping my users practice listening
skills and add a level of creative fun! If we had 10-20 different voices, the
flashcards will be read a little differently each time. Right now (since our
flashcards are dynamic), our audio feels very monotone. We would love to help
you beta test your API and work something out.
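
For anyone curious, the dynamic flashcard idea described above can be sketched
roughly like this. It is only a minimal sketch, not the app's actual code: the
templates, the `make_flashcard` helper and the word-kind labels ("verb",
"adjective") are all assumptions made up for illustration.

```python
import random

# Hypothetical sentence templates; each slot names the kind of word it needs.
TEMPLATES = [
    "When I {verb}, I will be {adjective}.",
    "I feel {adjective} after I {verb}.",
    "I {verb} when I am {adjective}.",
]


def make_flashcard(words_by_kind, rng=random):
    """Fill a randomly chosen template with randomly chosen backpack words."""
    template = rng.choice(TEMPLATES)
    return template.format(
        verb=rng.choice(words_by_kind["verb"]),
        adjective=rng.choice(words_by_kind["adjective"]),
    )


# The "backpack" from the comment, grouped by word kind (grouping is assumed).
backpack = {"adjective": ["happy", "sad"], "verb": ["run", "write"]}

rng = random.Random(0)  # seeded only so repeated runs print the same cards
for _ in range(3):
    print(make_flashcard(backpack, rng))
```

Each generated sentence could then be sent to a text-to-speech service with a
different voice per card, which is where the variety would come from.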

------
sova
I assume you guys know about VocalID that got an NSF SBIR grant for giving
mute people a voice (through similar means)
[https://www.vocalid.co/](https://www.vocalid.co/)

------
mipmap04
This is incredible - recorded my voice and I'm blown away with the results.

One thing: I found that I was in such a hurry to record that I probably spoke
faster than normal. It'd be nice if there was a way to tune a few parameters
manually (tempo, pitch, etc).

If I ever lose my voice and have to have a TTS appliance speak for me, I'll be
contacting you all to get my voice profile!

EDIT: For those interested, pretty impressive that it figured out the
appropriate cadence for this:
[https://lyrebird.ai/g/v7MpYaUA](https://lyrebird.ai/g/v7MpYaUA)

~~~
adbrebs
Thank you for the feedback!

> It'd be nice if there was a way to tune a few parameters manually (tempo,
> pitch, etc).

Yes we are currently exploring ways to control the generation: volume, pitch,
tempo, speed but also intonation and emotion.

~~~
mipmap04
Emotion would be a nice one - my wife's first comment was that it sounded too
bored.

------
drusepth
This looks awesome. I commented on the original post about how exciting this
is for worldbuilding (and creating realistic voices for fictional characters,
with all the uses that come there).

Random question: it's said that people think their own voices sound weird when
they hear recordings of themselves played back. Do you have a way to measure
that phenomenon? Have you seen people complaining about the accuracy when in
fact it was just that effect making people sound "weird" (to themselves)?

~~~
gasbag
The reason for the phenomenon is that some large percentage of how you hear
your own voice comes from bone conduction. In addition, the higher harmonics
of your voice are more directional, which is to say "aimed away from your
ears", and tend to be diminished when reflected back to you by the objects
around you.

The end result of this is that your own voice, when recorded and played back
to you, will generally sound less bassy and more harmonically rich than you
expect it to.

------
StavrosK
This is only tangentially on topic, but is there an API or some engine that I
can feed short sentences into and get high-quality generation back?

I have an RC controller radio that supports voice prompts, and I would like to
add some short phrases that are missing, such as "air mode on", "throttle
warning", etc.

Is there anything on par in quality with Google's/Siri's voice? Not the Google
TTS, but the voice they use in Now.

------
Vermeulen
Amazing - I can't wait to integrate this with our VR product. We previously
used Amazon Polly attached to a chatbot:
[https://twitter.com/Alientrap/status/829032930626383873](https://twitter.com/Alientrap/status/829032930626383873)

First uses that come to mind are players adding themselves to a VR world - or
maybe celebrities / public figures.

------
capocannoniere
Congrats on the launch! The tech is amazing

Quick q's (purely out of curiosity):

1) > We are [...] PhD students in AI at University of Montreal

Are you doing the startup on the side/planning on going back to school?

2) I don't recall reading about you guys in articles about YC S17 demo days.
What are reasons why some companies might not participate in demo day or
remain off-the-record? In your case, you seem to have had a working product
long before demo day

~~~
adbrebs
Thank you!

1) The research of the PhD and the startup are quite complementary in the end,
so we hope we can continue doing both.

2) We didn't do demo day because we raised our seed round just before YC and
did not want to raise again.

------
jlgosse
This is probably going to be great, but I just tested out voice generation
with the bare minimum of 30 recordings, and it really fell flat. When I tried
playback with an input, all it could produce was a high-pitched buzzing sound
and then maybe 1/4 of the words I typed in, which sounded nothing like me.

Perhaps you should increase the minimum from 30 recordings to 100?

~~~
sotelo
Hi! Thanks for testing it! For many voices it works well with only 30
recordings. For some, you need a bit more. It seems that quality of the audio
(no background noise, clear and loud voice, lots of intonation) is what
matters the most.

------
webwanderings
I am confused about the functionality. What is it that I will be able to do if
I go through recording 30 sentences?

~~~
adbrebs
You will be able to create a digital voice that sounds like you and generate
any sentence from it.

And thanks, we are going to update the instructions to make them more clear.

~~~
webwanderings
Thanks. Such an explanation on the website would be helpful. BTW, the
Trump/Obama tweets do not add value. Using political objects to define a
technical service, is a mismatch under the context. It also doesn't help in
explaining what this service provides (people wouldn't expect that Trump and
Obama have given you consent to use their voice). Just an opinion.

------
jonahx
When I try to test my digital voice, after clicking "Generate," I get this error
after about 10 seconds:

Something went wrong. Please try again!

I've tried about 5 times.

EDIT: I went back to the page a few minutes later, and the recordings were
all there. So it looks like it works, but is giving a false error message.

~~~
sotelo
Can you refresh and try again? Let me know if it works.

~~~
jonahx
It's working now. Thanks. Very cool.

Small issue: Would be nice if you could delete recordings.

~~~
sotelo
You can!

~~~
jonahx
Sorry, I meant the test recordings, not the originals of my voice.

------
mbonzo
I have a youtube channel (vimgirl) and before recording I have to write
scripts for what I plan to say in the video. The digital voice doesn't seem to
be working right now, but when it does it would cut down my screencast
production time by at least half.

------
sjbase
Cool stuff! Question from your FAQ:

> Q: Will I be able to copy another person's voice?

> A: Yes but only if you have the authorization of the person whose voice is
> being copied.

Perhaps you can unpack that answer a bit? What's the authorization process?

~~~
adbrebs
Sure, good question!

There will be two scenarios:

\- you want to use the voice of someone that has a Lyrebird account: he or she
has to give you their authorization.

\- you want to use the voice of someone who does not have an account. We have
specific contracts for that. Say you want to copy the voice of Morgan Freeman:
the contract will be between him, you and Lyrebird. We will also probably
explore alternative ways for that.

~~~
vageli
Did you get authorization to use the voices of public figures in your
promotional materials? If not, how can users be sure that you will not
arbitrarily use their voice profile for promotional materials or otherwise?

------
mindhash
Hey, how does Lyrebird handle accents? I work in the education space, and due
to the accent of people in my country, the content doesn't work well with a
global audience.

Are you open for beta? I would like to try out your API on education content.

~~~
adbrebs
For now, it works better with an American English accent but it is still able
to adapt to other accents.

Our upcoming versions should be more robust to different accents and we also
plan to extend it to other languages.

------
pashabitz
...make possible a wide range of new applications such as

\- hacking voice-controlled interfaces

\- generating fake news

FTFY

don't @ me saying "sure any technology can be used for good and bad stop being
a Luddite" yeah I know that just messing with you

~~~
sotelo
Yes, this is a tricky subject! We have thought a lot about it and we think we
are doing the right thing for society.

We write more about it here:
[https://lyrebird.ai/ethics/](https://lyrebird.ai/ethics/)

------
SimbaOnSteroids
This is exciting! I've been following you guys since at least May. How do you
plan on getting the voices out of the uncanny valley?

~~~
adbrebs
This is going to be very tricky! No clear answer to that. We are putting a lot
of effort into research but our progress is quite difficult to predict.

------
uoaei
I wonder how it would work using training data from one language in generating
voice in another language.

------
newsbinator
Strangely, the resulting digital voice sounds Irish... but I am not.

------
bernadus_edwin
Pressed the start recording button and nothing happened. iPhone 7, iOS 10.

------
echan00
Awesome idea. It was just a matter of time!

------
frag
voice upload is not working :(

~~~
adbrebs
Thanks for pointing this out, this was reported by a few others. We are
investigating it. For now, just refresh the page and it should work.

~~~
adbrebs
We've fixed the bug!

~~~
sixftmonster
Getting failed upload after clicking validation... Chrome showing this in
console: "VM291:1 POST
[https://lyrebird.ai/my/recordings/](https://lyrebird.ai/my/recordings/) 400
(Bad Request)"

------
tranv94
How about Adobe Voice? This seems to share a lot of the same breakthroughs as
Adobe Voice.

