
Cherry-picking one of the AWS voices is a bit fishy, to say the least, and Azure is running away with the quality of their voices anyway.



You were not kidding! Azure is extremely impressive. Try the demo here: https://azure.microsoft.com/en-us/services/cognitive-service...


That's insane. It also has SSML and voice styles (angry, sad, etc.). This is hands down the winner for me.


Holy cow. That's outrageously good. The female voices are way better than Unreal/AWS


Hmm, I've been in the space for a bit, and I think it's safe to say I picked the best voice AWS provides. I could've implemented a multi-speaker feature for AWS, but I just didn't get a chance. I did try IBM, but it sounds worse than AWS?


Azure is Microsoft.


Oh, I meant to say Azure. Not sure why I typed IBM.


If you build a product, no matter what, you have to be honest with yourself, and imho most of the neural voices from Azure sound better than your example. They may miss some of the timbre of your voices, but the timbre comes from the examples you fed it... tbh it's not much better than doing it yourself with something like https://github.com/neonbjb/tortoise-tts


Well, sure, I mean MS is a $2T company with 180K employees. So I wouldn't be too surprised if theirs sounds better than mine. The tortoise-tts repo seems pretty random though. Are you trying to promote something of your own or something? haha


No, I am telling you that your implementation is subpar even compared to open-source ones that need only few-shot training.



