Ask HN: What's the most natural-sounding text-to-voice synthesizer?
4 points by aloukissas on April 25, 2020 | 3 comments
Looking to convert text to audio (narration) but don't want the narration to sound robotic. What's the best-in-class ML model that does this kind of synthesis?

Currently, no TTS system sounds fully human. If you are looking for a good TTS service, the neural voices from Google, IBM, and Amazon are promising, since they are built on recent deep-learning techniques. They are still pretty monotonous, though: you can shape the speech prosody with SSML tags, but the results are not good.
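For anyone who hasn't used SSML before, here is a rough sketch of what the prosody markup looks like, built with nothing but the Python standard library. The helper names (`prosody`, `build_ssml`) and the sample sentences are made up for illustration; `<speak>`, `<prosody>`, and `<break>` are standard SSML elements, but which attributes a given voice actually honors varies by service, so check the provider's docs before relying on `rate` or `pitch`.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody(text, rate="medium", pitch="+0st"):
    """Wrap plain text in an SSML <prosody> element (hypothetical helper)."""
    # escape() protects characters like & and < that would break the XML.
    return (
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}</prosody>"
    )


def build_ssml(parts, pause_ms=300):
    """Join marked-up parts into one <speak> document with a pause between them."""
    pause = f'<break time="{int(pause_ms)}ms"/>'
    return "<speak>" + pause.join(parts) + "</speak>"


ssml = build_ssml([
    prosody("Looking to convert text to audio?", rate="slow", pitch="+2st"),
    prosody("Neural voices are a good start & worth a try.", rate="fast", pitch="-1st"),
])
print(ssml)
```

The resulting string is what you would send as the SSML input of whichever service you pick, in place of plain text.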

https://cloud.google.com/text-to-speech (select voice type: WaveNet)
https://text-to-speech-demo.ng.bluemix.net

And if you want to train your own model, [Tacotron 2](https://ai.googleblog.com/2017/12/tacotron-2-generating-huma...) is a good starting point. https://github.com/mozilla/TTS

Thanks! This seems like a good place for me to start playing around :)

I like Amazon's neural one. However, it does sound like it's morphing between two people...

