Ask HN: Open-source, local Text-to-Speech (TTS) generators
1 point by dv35z 11 days ago | 4 comments
Hello friends, I just noticed that https://coqui.ai/ is "Shutting down".

I'm building a web app (React / Django) that takes a list of affirmations & goals (in Markdown files), stores them in a database (SQLite), and uses voice synthesis to create audio files of the phrases. These are combined with a relaxed backing track (ffmpeg), grouped into playlists of 10-20 phrases (randomly sampled, or chosen by theme: "mind", "body", "soul"), and then played automatically in the morning & evening (cron). This lets you persistently hear & vocalize your own goals & good vibes over time.
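
To make the playlist and mixing steps concrete, here is a rough Python sketch of what I have in mind. The table layout, function names, and file paths are placeholders I made up for illustration, not the app's actual schema. The ffmpeg command is only built, not run, and the backing-track volume of 0.2 is an arbitrary guess.

```python
import random
import sqlite3


def make_db(phrases):
    """Create an in-memory SQLite database from (text, theme) pairs.

    The `phrase` table and its columns are hypothetical placeholders.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE phrase ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, theme TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO phrase (text, theme) VALUES (?, ?)", phrases)
    conn.commit()
    return conn


def sample_playlist(conn, theme=None, min_len=10, max_len=20, rng=None):
    """Return 10-20 distinct phrases, optionally restricted to one theme.

    If fewer phrases exist than the chosen size, all of them are returned.
    """
    rng = rng or random.Random()
    if theme is None:
        rows = conn.execute("SELECT text FROM phrase").fetchall()
    else:
        rows = conn.execute(
            "SELECT text FROM phrase WHERE theme = ?", (theme,)
        ).fetchall()
    texts = [row[0] for row in rows]
    if not texts:
        return []
    size = min(len(texts), rng.randint(min_len, max_len))
    return rng.sample(texts, size)


def ffmpeg_mix_command(voice_path, backing_path, out_path, backing_volume=0.2):
    """Build (but do not run) an ffmpeg command that lays a quieter
    backing track under the voice and stops when the voice ends."""
    filter_graph = (
        f"[1:a]volume={backing_volume}[bg];"
        "[0:a][bg]amix=inputs=2:duration=first[out]"
    )
    return [
        "ffmpeg", "-y",
        "-i", voice_path,
        "-i", backing_path,
        "-filter_complex", filter_graph,
        "-map", "[out]",
        out_path,
    ]
```

The cron job would then call something like `sample_playlist(conn, theme="mind")` and pass each rendered phrase through the mix command with `subprocess.run`.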

I had been planning to use Coqui TTS as the local text-to-speech engine, but with this shutdown, I'd love to hear from the community: what is a great open-source, local text-to-speech engine?

Generally, I try to learn both the highest-quality commercially available technology (example: ElevenLabs) and the best open-source equivalent. Would love to hear suggestions & perspectives on this. What voice synth tools are you investing your time into learning & building with?






Mozilla's in-browser TTS is not bad: just parse and buffer one sentence at a time and it does all right.
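
The "one sentence at a time" idea looks roughly like this. It's a sketch in Python rather than browser code, and `speak` is a hypothetical callback standing in for whatever engine you use. The splitter is a naive regex that will trip on abbreviations like "Dr." or "e.g.".

```python
import re

# Split after ., ! or ? when followed by whitespace. Deliberately naive.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Return the non-empty sentences in `text`, in order."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def speak_buffered(text, speak):
    """Feed `text` to `speak` one sentence at a time.

    `speak` is a hypothetical callable taking a single string.
    Returns the number of sentences spoken.
    """
    count = 0
    for sentence in split_sentences(text):
        speak(sentence)
        count += 1
    return count
```

For example, `speak_buffered("I am calm. I am strong!", print)` calls `print` once per sentence.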

For the backend, I've experimented with piper, which has a lot of voices and accents, though it's tricky to buffer and sync long texts.

https://github.com/rhasspy/piper
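
If it helps, this is roughly how I call it from Python. As I understand the README, piper reads text on stdin and takes `--model` and `--output_file`. The binary name on your PATH and the voice model filename are assumptions, so adjust them to your install.

```python
import subprocess


def piper_command(model_path, out_path, piper_bin="piper"):
    """Build the argument list for one piper invocation.

    `piper_bin` assumes the executable is on PATH under that name.
    """
    return [piper_bin, "--model", model_path, "--output_file", out_path]


def synthesize(text, model_path, out_path, piper_bin="piper"):
    """Pipe `text` to piper on stdin and write a WAV file to `out_path`.

    Raises subprocess.CalledProcessError if piper exits non-zero.
    """
    subprocess.run(
        piper_command(model_path, out_path, piper_bin),
        input=text.encode("utf-8"),
        check=True,
    )
```

Rendering one short phrase per call, e.g. `synthesize("I am calm.", "voice.onnx", "phrase_001.wav")` with a placeholder model name, sidesteps most of the buffering trouble with long texts.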


eSpeak NG?

I'm scoping this out, and was able to install espeak-ng on Mac using Homebrew, but I don't understand how to get quality voices loaded (the default voice it installed sounds quite robotic). What's the path to, for example, a high-quality British voice?

Sorry, I didn't need high quality, just in-browser, so I have no clue. Good luck!


