Ask HN: Any good text-to-speech solutions for someone who cannot talk normally?
18 points by akulbe 18 days ago | 9 comments
tl;dr - had COVID; had to be intubated; can't talk; need help

I had COVID in April and I had to be intubated. As a result, I now have bilateral vocal cord paralysis, and can barely talk above a whisper.

Since my vocal cords are paralyzed roughly halfway open, I cannot breathe in a normal volume of air either.

When I talk, I can only <breathe> say a few <breathe> words at a time. <gasp>

ASIDE: I'm VERY lucky to be alive, and VERY thankful.

Examples of what I've seen… Microsoft's Edge browser has a "read aloud" mode that has a very human-like voice.

The built-in dictation in Windows itself still seems robotic compared to what's in Edge.

I'd like to have some kind of tool that I can type things into and that will speak for me when I'm on conference calls.


Add-Type -AssemblyName System.Speech

$s = New-Object -TypeName System.Speech.Synthesis.SpeechSynthesizer

# Speak a string through the default voice (blocks until playback ends)
$s.Speak('Hello')


You can wrap it in a function, then just call it whenever you want to talk via text.

I personally like IBM's TTS slightly more than Google's, although it is more expensive. I think Google, and definitely some startups, also let you train the model to sound closer to your own voice. Since you might have many recordings of yourself from before this, that could be cool. Comment if you can't find it and I can look.

It's none of my business how long term it is, but if it is long term, look into stenography, especially the open source Open Steno Project and Plover. It's an old talk, but: https://youtu.be/Wpv-Qb-dB6g

It's what court reporters use to transcribe at the speed of speech, but for you it could mean feeding text into TTS software at the speed of normal talking. I've only tried it briefly, but if you take the time to learn it, you can comfortably get to normal speech speeds with no hand fatigue. It's one of the reasons Open Steno is so nice, because it also has cheap steno keyboards, which normally cost thousands and are locked behind the industry. Comment if you want more links! I can add them when I'm by my computer.

Edge's good-sounding voices appear to be online only, and the locally installed ones are the ancient, poor-sounding ones.

I noticed that the online ones sometimes translate numbers instead of just reading them; the Welsh one translates all numbers (I tried) into Welsh, the German one translates percentages into German but reads prices in English.

Anyway, if this is likely to be long-term or permanent: https://www.youtube.com/watch?v=K3MYFT6VZk8 "Realtime text to speech with Plover", although it sounds awful.

(pet annoyance of mine, that all the best stuff (voices, recognition, image recognition, etc) these days seems to be cloud / mobile only and not desktop).

Google’s Text to Speech API’s WaveNet voices are nearly unmatched in terms of quality. Of course, the API is also fairly expensive.


You can test them out with any arbitrary text halfway down on this page.
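
For anyone who would rather script this than use the demo page, here is a minimal Python sketch of the request body for the Cloud Text-to-Speech REST endpoint (`text:synthesize`). The helper names and the default voice `en-US-Wavenet-D` are assumptions for illustration, not something from the comment above. The sketch only builds the JSON and decodes a response; it makes no network call and leaves authentication out entirely.

```python
import base64
import json

# Public REST endpoint for Google Cloud Text-to-Speech (v1).
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(text, voice_name="en-US-Wavenet-D", language_code="en-US"):
    """Return the JSON-serializable body for a text:synthesize call.

    voice_name and language_code are illustrative defaults (assumptions).
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }


def decode_audio(response_json):
    """Extract raw audio bytes from a synthesize response.

    The API returns the audio as base64 text in the "audioContent" field.
    """
    return base64.b64decode(response_json["audioContent"])


if __name__ == "__main__":
    body = build_synthesize_request("Sorry, I can't talk right now, so I'm typing.")
    print(json.dumps(body, indent=2))
```

To actually hear the result, you would POST that body to `SYNTHESIZE_URL` with your own credentials and write the bytes from `decode_audio(...)` to an `.mp3` file.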

While I was writing this, I had the idea of using the read-aloud feature in Edge, by splitting my screen between VS Code and Edge, and writing an HTML file.

It works when I'm not on a Teams call, but not while the call is active. :(

Check out https://github.com/OptiKey/OptiKey/wiki. There are specialised versions available at http://www.optikey.org/

No first-hand experience, but emacspeak is a long-term contender in this field.

Can't speak for Windows but OSX accessibility has lots of options for you.
