Hacker News
Show HN: MeSpeak.js 2.0 – Text-to-Speech in JavaScript (masswerk.at)
84 points by masswerk on Aug 12, 2019 | 28 comments

I'm a little surprised to have not seen any mention of the Speech Synthesis API (https://caniuse.com/#feat=speech-synthesis).

It's supported almost everywhere that the Web Audio API is.

For me, running window.speechSynthesis.getVoices() in the web console of both Firefox and Chromium returns an empty list. Any idea how to populate it? I'm using the Ubuntu packages of Firefox and Chromium. This demo doesn't work in either of them, too: http://mdn.github.io/web-speech-api/speak-easy-synthesis/

On macOS, the returned list contains the built-in operating system voices (about 50) and some additional ones by Google (roughly 20). The spec [1] says the voices made available are entirely up to the browser. Doesn't really answer your question, but at least some additional insight.

[1] https://w3c.github.io/speech-api/#dom-speechsynthesis-getvoi...

Thanks for confirming that I'm looking at the right list!

It can take a while for the full voice list to load after speech synthesis is initialized on the page - try calling getVoices() again after a small delay.

This worked. First time I got an empty array. Second time I see an array with 47 items.
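For anyone hitting the same empty list: instead of guessing a delay, one option is to wait for the `voiceschanged` event, which the spec defines on `speechSynthesis`. A minimal sketch follows; the helper name `whenVoicesReady` is made up for illustration, and passing the synthesizer in as a parameter is only a convenience of this example. Not every browser is guaranteed to fire the event, so a retry after a short delay, as suggested above, may still be needed as a fallback.

```javascript
// Sketch: call `onReady` once the voice list is non-empty.
// `whenVoicesReady` is a hypothetical helper, not part of any library.
// `synth` defaults to the browser's speechSynthesis object.
function whenVoicesReady(onReady, synth = globalThis.speechSynthesis) {
  const voices = synth.getVoices();
  if (voices.length > 0) {
    // The list was already populated on the first call.
    onReady(voices);
    return;
  }
  // Otherwise wait until the browser reports that the list has changed.
  const handler = () => {
    const loaded = synth.getVoices();
    if (loaded.length > 0) {
      synth.removeEventListener('voiceschanged', handler);
      onReady(loaded);
    }
  };
  synth.addEventListener('voiceschanged', handler);
}
```

In a page this would be used as `whenVoicesReady((voices) => console.log(voices.length))`.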

This project is older (2011) than the Web Speech API.

(However, one of the scenarios for version 2.0 was to implement the same API in addition to the 'native' one, so that it could be used as a fallback solution. While I had actually implemented this already, I don't think it would be that useful, and it increases file size quite a bit.)

Edit: Viable points for this may still be a) reliable performance and interaction, b) known voices (even if they are a bit robotic), and c) use in offline applications. Using an analyser node for animations may be yet another.
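To illustrate the analyser idea in general Web Audio terms: an `AnalyserNode` exposes frequency data that can be reduced to a single level and fed into an animation loop. The sketch below uses only standard Web Audio calls; the helper names `averageLevel` and `animate` are invented for this example, and how the analyser gets connected to the speech output is library-specific and not shown here.

```javascript
// Sketch: reduce an AnalyserNode's frequency data to one level in 0..1.
// `averageLevel` and `animate` are hypothetical helper names.
function averageLevel(analyser) {
  const bins = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(bins); // fills `bins` with values 0..255
  let sum = 0;
  for (const value of bins) sum += value;
  return bins.length ? sum / (bins.length * 255) : 0;
}

// Browser-only: call `draw(level)` on every animation frame.
function animate(analyser, draw) {
  const tick = () => {
    draw(averageLevel(analyser));
    requestAnimationFrame(tick);
  };
  requestAnimationFrame(tick);
}
```

The `draw` callback could, for example, scale a mouth shape or a bar by the level it receives.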

Author here, I'll check this every so often and try to answer any questions…

Very cool project and good job on releasing a new version! Libraries like these are huge work.

One question: compared to the native text-to-speech on macOS, the synthesized speech sounds, for lack of a better word, robotic. Is this an inherent property of the approach you used or a result of trying to squeeze something as complex as this into a JavaScript library?

This is based on eSpeak [1] for Un*x environments, which is in turn based on an application for Acorn/RISC OS. So, yes, it's quite dated. On the other hand, it's lean enough to be run in real time in Emscripten...

However, all the configuration data, including phoneme tables, may be overwritten (but you would have to install eSpeak on your machine first, in order to compile these.)

Another approach would be to actually port this to JS (instead of cross-compiling), which would give full access to the internals. But I simply do not have the resources for this. (Meanwhile, there's the Web Speech Synthesis API. With this being available on most modern clients, it's probably not worth the effort.)

[1] http://espeak.sourceforge.net/

I don't have a question, just wanted to say I found the Stereo Spanning example to be a genuine piece of art. The choice of voice and script were truly excellent, having such a robotic voice read the bot's lines was great. Their reading had me chuckling in a few places I would not have if I'd read it simply as text.


Is this 100% client-side, and would it work offline?

Yes, it's 100% client side, but you have to cache additional files. (A working set consists of the main script, a worker script for the application core and at least one voice definition to be used.)
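One way to cache such a working set for offline use is a service worker that pre-caches the files on install and serves them from the cache afterwards. This is only a sketch of that general technique, not something the library ships: the file names, the `/tts` base path, the cache name, and the helper `workingSet` are all placeholders that would need to be replaced with the real paths of your deployment.

```javascript
// Sketch: build the list of URLs that make up the working set.
// All file names below are hypothetical placeholders.
function workingSet(base, voiceFiles) {
  if (voiceFiles.length === 0) {
    throw new Error('at least one voice definition is needed');
  }
  const root = base.endsWith('/') ? base : base + '/';
  return [
    root + 'main-script.js',  // placeholder for the main script
    root + 'worker-core.js',  // placeholder for the worker script
    ...voiceFiles.map((file) => root + file),
  ];
}

// The part below only runs inside a service worker script.
if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  // Pre-cache the working set when the service worker is installed.
  self.addEventListener('install', (event) => {
    event.waitUntil(
      caches.open('tts-offline-v1').then((cache) =>
        cache.addAll(workingSet('/tts', ['voices/en.json'])))
    );
  });
  // Answer requests from the cache first, falling back to the network.
  self.addEventListener('fetch', (event) => {
    event.respondWith(
      caches.match(event.request).then((hit) => hit || fetch(event.request))
    );
  });
}
```

The page itself would still have to register this script with `navigator.serviceWorker.register(...)`.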

Mind that the core won't run concurrently as a worker on mobile devices, but rather as an instance in the main/UI thread. This is because mobile devices will mute playback triggered by a message from a worker, as there is no immediate user interaction. Therefore, longer utterances are likely to block the UI noticeably while the internal sound file is processed. This is a bit sad, but that's how things are.

Haha, the article says it might not work on mobile. It completely (and immediately) crashed Firefox Android alongside my Live Wallpaper upon pressing the Read button

Sorry! (Maybe an out of memory issue?)

Very nice work, Norbert. I have had an absolute ton of fun playing with your older version and ended up writing a wrapper library around your code to add some similar features (web workers, voices, visualizer), but done in a slightly different way.


Here's a test application with workers still disabled for iOS, but enabled for Android: https://www.masswerk.at/mespeak/android-worker-test.html

Can anyone confirm that this is working on Android? (If so, I'll push this to the release.)

Interesting! Workers fail for me on iOS (even with an ontouch-preplay/unlocker). Can you confirm Android?

If so, I may enable them again for Android based systems.

Actually, I edited my comment because I couldn't remember if it works on iOS. I just double-checked and yeah, my code fails on iOS also, but does work fine (albeit slowly) on Android.

Works fine on Pixel 2XL on Android Q, latest version of mobile Chrome.

It didn't seem to work on my tablet (samsung tab, few years old).

Thanks! (Even if the results aren't what we may have hoped for.)

Feel free to look through my code. I haven't touched it in a few months, but it does seem to work alright on Android. I had to change how the espeak files get loaded quite a bit.

Beware, this code is licensed under the GPL.

Why did I read Meeseeks.js 2.0 initially...

Look at me!
