Hmm, I've been in the space for a bit, and I think it's safe to say I picked the best voice AWS provides. I could've implemented a multi-speaker feature for AWS, but I just didn't get a chance. I did try IBM, but it sounds worse than AWS.
If you build a product, you have to be honest with yourself no matter what, and imho most of the neural voices from Azure sound better than your example. They may miss some of the timbre of your voices, but the timbre comes from the examples you fed it... tbh it's not much better than doing it yourself with something like https://github.com/neonbjb/tortoise-tts
Well, sure, I mean MS is a $2T company with 180K employees, so I wouldn't be too surprised if theirs sounds better than mine. The tortoise-tts repo seems pretty random though. Are you trying to promote something of your own? haha