

Show HN: TTS-API - Text-to-speech API - typeofNaN
http://tts-api.com

======
riobard
For those who want to run their own copy of this, here's how to do it:

1\. Find a Mac-based server (a co-located Mac Mini will be fine)

2\. Run `say -o output.wav $TEXT` to generate the voice

3\. Compress the WAVE file with `lame` or the system's built-in `afconvert` to
get the MP3 file.
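
A minimal sketch of the three steps above, wrapped in Python. It assumes a Mac with `say` on the PATH and `lame` installed separately. The function names, the 22.05 kHz sample rate, and the explicit `--data-format` flag (added so `say` writes 16-bit PCM into the `.wav` container) are illustrative choices, not something taken from tts-api.com.

```python
import os
import subprocess
import tempfile


def build_say_command(text, wav_path, voice=None):
    """Build the argument list for macOS `say` writing a 16-bit PCM WAV file."""
    # --data-format is set explicitly here (assumption: 16-bit little-endian PCM
    # at 22050 Hz is acceptable for speech).
    cmd = ["say", "-o", wav_path, "--data-format=LEI16@22050"]
    if voice:
        cmd += ["-v", voice]  # e.g. "Alex" or "Fred"
    cmd.append(text)
    return cmd


def build_lame_command(wav_path, mp3_path):
    """Build the argument list for encoding a WAV file to MP3 with lame."""
    return ["lame", "--quiet", wav_path, mp3_path]


def synthesize(text, mp3_path, voice=None):
    """Run `say` and then `lame`; only works on a Mac with lame installed."""
    with tempfile.TemporaryDirectory() as tmp:
        wav_path = os.path.join(tmp, "speech.wav")
        subprocess.run(build_say_command(text, wav_path, voice), check=True)
        subprocess.run(build_lame_command(wav_path, mp3_path), check=True)
    return mp3_path
```

Calling `synthesize("Hello world", "hello.mp3", voice="Alex")` would leave the MP3 next to the script. Passing the text as a list element rather than through a shell string avoids quoting problems with user-supplied input.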

The `say` command supports multiple languages and dialects, but you'll have to
install the necessary voice engines in OS X 10.8. The man page for `say` can be
found here <http://pastebin.com/nWbvJAAX>

The complete list of voices/languages supported so far:

* English (Australia): 2 voices

* English (India): 1 voice

* English (Ireland): 1 voice

* English (Scottish): 1 voice

* English (South Africa): 1 voice

* English (UK): 3 voices

* English (US - Female): 7 voices

* English (US - Male): 6 voices

* English (US - Novelty): 14 voices

* Arabic (Saudi Arabia): 1 voice

* Chinese (China): 1 voice

* Chinese (HK): 1 voice

* Chinese (Taiwan): 1 voice

* Czech: 1 voice

* Danish: 1 voice

* Dutch (Belgium): 1 voice

* Dutch (Netherlands): 2 voices

* Finnish: 1 voice

* French (Canada): 2 voices

* French (France): 4 voices

* German (Germany): 3 voices

* Greek: 2 voices

* Hindi: 1 voice

* Hungarian: 1 voice

* Indonesian: 1 voice

* Italian: 3 voices

* Japanese: 1 voice

* Korean: 2 voices

* Norwegian Bokmål: 1 voice

* Polish: 1 voice

* Portuguese (Brazil): 1 voice

* Portuguese (Portugal): 1 voice

* Romanian: 1 voice

* Russian: 1 voice

* Slovak: 1 voice

* Spanish (Mexico): 2 voices

* Spanish (Spain): 2 voices

* Swedish: 2 voices

* Thai: 1 voice

* Turkish: 1 voice

~~~
snoonan
Careful, though. This is expressly against the license agreement for Mac OS X.

------
yoda_sl
Quite cool... But if this is generated by the text to speech engine from OS X,
then I am afraid it is going beyond the license that comes with OS X. I
remember reading through that license and it was clearly stated that using the
OS X TTS was only for local usage on your Mac.

So I am extremely curious to know the license behind this tts-api? Can the OP
provide such info or provide some of the tech behind it?

~~~
mherdeg
In case anyone else is curious, the section you're thinking of is

"F. Voices. Subject to the terms and conditions of this License, you may use
the system voices included in the Apple Software (“System Voices”) (i) while
running the Apple Software and (ii) to create your own original content and
projects for your personal, non-commercial use. No other use of the System
Voices is permitted by this License, including but not limited to the use,
reproduction, display, performance, recording, publishing or redistribution of
any of the System Voices in a profit, non-profit, public sharing or commercial
context."

~~~
yoda_sl
Thank you! Exactly what I was referring to.

------
bencevans
There's an unofficial Google API that does the same job people may be
interested in:
[http://translate.google.com/translate_tts?tl=en&q=Hello+...](http://translate.google.com/translate_tts?tl=en&q=Hello+World)

~~~
duiker101
limited to 150 characters IIRC
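
A rough sketch of how one might work around a per-request length cap: split the text at word boundaries and build one URL per chunk. The endpoint is the unofficial one linked above and may change or disappear; the 150-character figure is only the limit recalled in this thread, and the function names are made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://translate.google.com/translate_tts"
MAX_CHARS = 150  # assumption: the limit recalled above, not a documented value


def chunk_text(text, limit=MAX_CHARS):
    """Greedily pack whitespace-separated words into chunks of at most `limit` chars."""
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than the limit has to be cut mid-word.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def tts_urls(text, lang="en"):
    """Return one request URL per chunk, using the tl/q parameters from the link above."""
    return [f"{BASE_URL}?{urlencode({'tl': lang, 'q': chunk})}"
            for chunk in chunk_text(text)]
```

The resulting audio files would still need to be fetched and concatenated in order, which this sketch leaves out.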

~~~
kajecounterhack
Bing's got an API -- you need to sign up for a key but you get a very large
number of free uses and I don't think there's a char limit.

------
lefthansolo
This is a nice one, however I'm still confounded by the lack of progress since
Bell Labs made an online text to speech converter many years ago.
Particularly, the notion that the interpretation of each sentence is
idempotent is just wrong. Want to see what I mean? A human would not speak
like the following; there should be differences in intonation, "emotion"
(sounding bored, angry, excited, etc. that varies depending on the number of
times "dogs" would be said), speed, and delay. In addition, you have to
breathe at some point, and even the best audiobooks have some level of breath
noise.

[http://tts-api.com/tts.mp3?q=dogs.%20dogs.%20dogs.%20dogs.%2...](http://tts-api.com/tts.mp3?q=dogs.%20dogs.%20dogs.%20dogs.%20dogs.%20dogs.%20dogs.%20dogs.%20dogs.%20dogs.%20dogs.%20dogs).

~~~
username3
It takes a breath for blank lines or new paragraphs.
[http://tts-api.com/tts.mp3?q=High%20Quality%0AWe%20believe%2...](http://tts-api.com/tts.mp3?q=High%20Quality%0AWe%20believe%20that%20we%20provide%20the%20highest%20quality%20free%20TTS%20service%20on%20the%20Internet.%0A%0APlease%20compare%20us%20to%20our%20alternatives%20and%20let%20us%20know%20if%20we%20can%20improve%20the%20conversion%20in%20any%20way.%20Bug%20reports%20are%20greatly%20appreciated)!
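
For anyone building these URLs by hand, here is a small sketch of the encoding used in the links above: spaces become `%20` and newlines become `%0A`, so a blank line between paragraphs ends up as `%0A%0A`. Only the `tts.mp3?q=` form visible in the links is assumed; the helper names are hypothetical and no other parameters of the service are implied.

```python
from urllib.parse import quote

TTS_API_BASE = "http://tts-api.com/tts.mp3"


def tts_api_url(text):
    """Percent-encode the text and place it in the `q` query parameter."""
    # safe="" so that characters like "/" and "?" inside the text are encoded too.
    return f"{TTS_API_BASE}?q={quote(text, safe='')}"


def join_paragraphs(paragraphs):
    """Separate paragraphs with a blank line, which is reported above to add a pause."""
    return "\n\n".join(paragraphs)
```

For example, `tts_api_url(join_paragraphs(["High Quality", "Please compare us"]))` puts `%0A%0A` between the two paragraphs.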

------
lobster_johnson
This is a bit off topic, but a related question: I have been looking for a
"bad" text to speech library that produces Stephen Hawking-style audio,
similar to what's found in old 1970s/80s electronics. Examples:

<http://www.youtube.com/watch?v=gh0fBwiE4cE>

<http://www.youtube.com/watch?v=vvYvCaAN3Jg>

Anyone?

~~~
kellishaver
On OS X, the "Fred" voice is pretty close.

    say -v Fred Hello. My name is Fred.

~~~
lobster_johnson
Thanks, but I need a library I can use in an app (and not just on OS X).

------
snoonan
Excellent and dead easy to use. Great work on making it simple.

I was actually looking for a similar API like this just a few hours ago, but
with some other languages as well. What's the TTS engine driving this?

BTW, one small critique on the page copy... "You expect" could be more
politely expressed and in terms of the user's pov/benefit.

~~~
stcredzero
_> Excellent and dead easy to use. Great work on making it simple._

The acronym should reflect this ease of use for proper pronunciation. How
about Text Intelligently To Speech?

------
franze
just wanted to add: you can now do this all in the browser (100% client side),
too -> <http://lalo.li/> (it's forkable)

~~~
archangel_one
Well, not really; the quality is nothing like as good. It's a nifty trick
being able to do synthesis at all in JavaScript, though.

And, of course, it presumably doesn't have the licensing issues this other
approach would appear to, if it really is using Apple's voices.

------
alexmunroe
Pretty impressive. I've given it a go with a few of the more technical terms
that I come across at work and that other TTS engines have difficulty handling,
and it dictated them flawlessly. Very interested to see where this goes!

------
po84
<http://syntensity.com/static/espeak.html>

If you're looking for a client side solution, here's espeak compiled to JS
using emscripten.

~~~
ninjin
Neat, once again, emscripten proves useful. I do find it important though to
point out the lack of a good open text-to-speech engine.

Here is a speech as rendered by tts-api.com (<http://goo.gl/PoZc4>). Now, for
speak.js [1], to make a comparison, paste in the first few of the top
paragraphs from here [2] and compare the quality between the two.

There really is a gap to fill for a good open-source alternative here. But I
suspect the main barrier is that there is a large amount of data needed to
generate good voices. Still, a worthy target.

[1]: I tried to make a URL for this too, but despite the URL looking as if it
could take arguments it refused to work, at least for me under Firefox and
Chrome.

[2]:
[http://www.nytimes.com/2008/09/25/business/worldbusiness/25i...](http://www.nytimes.com/2008/09/25/business/worldbusiness/25iht-24textbush.16463831.html)

------
tantalor
Does it support IPA or SSML[1]? I ask because AT&T's TTS API[2] does, but it
kind of sucks!

For example,

    <phoneme alphabet="ipa" ph="/ˈkreɪp/"></phoneme>

[1]
[http://en.wikipedia.org/wiki/Speech_Synthesis_Markup_Languag...](http://en.wikipedia.org/wiki/Speech_Synthesis_Markup_Language)

[2] <http://www2.research.att.com/~ttsweb/tts/demo.php>
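
For readers who haven't used SSML, a small sketch of generating a `phoneme` element programmatically. Whether tts-api.com accepts SSML at all is exactly the open question above, so this only shows how the markup is put together; the function name and the bare `speak` wrapper (without the namespace and language attributes a full SSML document would carry) are simplifications for illustration.

```python
import xml.etree.ElementTree as ET


def ssml_with_phoneme(word, ipa):
    """Wrap `word` in an SSML <phoneme> element carrying an IPA pronunciation."""
    # Simplified fragment: a complete SSML document would also declare
    # the SSML namespace and xml:lang on <speak>.
    speak = ET.Element("speak", {"version": "1.0"})
    phoneme = ET.SubElement(speak, "phoneme", {"alphabet": "ipa", "ph": ipa})
    phoneme.text = word  # fallback text for engines that ignore the ph attribute
    return ET.tostring(speak, encoding="unicode")
```

Building the element through an XML library rather than string concatenation means characters such as `<` or `&` in the word are escaped automatically.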

~~~
ChrisKelly
You might be able to do non-English pronunciations by trying phonetic
spellings, which can be tricky. The best I could get for "felicidades" was
this: fell isseedadesh.

------
dholowiski
Where are the voices from, and how are they licensed? Could I use the output
from this for commercial purposes?

------
elbuo8
Node.js module available at: <https://npmjs.org/package/node-tts-api>
<https://github.com/elbuo8/node-tts-api>

Enjoy

------
dave84
Sounds like Alex from OS X.

------
chrisallick
<http://clubsexytime.com/projects/tweader/>

twitter tracker using tts :)

thanks, such an amazing service

------
babebridou
Great, simply great API that just works. Keep up the good work!

What I would love to see now would be the ability to send compressed text to
shorten the url.

------
d0vs
Wow, very impressed!

Try "Hello.", "Hello!" and "Hello?"

------
jeffehobbs
Is this piping to the Mac OS X "say" CLI command? Neat. I'd love to see the
source behind this, if you felt like putting it on Github.

------
sinzone
Hi, would like to have this API on Mashape.com ... I think our community of
developers will like it.

------
KwanEsq
Very nice. Would be good to have the option of other formats, specifically
Vorbis and/or Opus.

------
savrajsingh
Awesome! Will you release a speech-to-text API as well, or know of a good one?
Thanks!

~~~
taf2
I've had some success with the web API provided by AT&T:
[http://developer.att.com/developer/forward.jsp?passedItemId=...](http://developer.att.com/developer/forward.jsp?passedItemId=12500023)
It's in beta, I believe.

------
yati
Great job! Something that was really needed. I would love to see this open
sourced :)

------
morgangiraud
Very smooth and simple as it should be. Are you going to implement other
languages?

------
leoplct
Would be great if you could share your code on Github

------
bussetta
App idea! Now listen to your tweets from timeline!

~~~
chrisallick
<http://clubsexytime.com/projects/tweader/> done

------
shritesh
Thank you :) Really needed something like this!

------
codegeek
Pretty good. A nice-to-have would be to let the audio play in the browser as
well, instead of just having a link.

------
davecap1
What about a speech-to-text API?

~~~
kajecounterhack
Speech to text is a far more computationally difficult problem. Google has an
unofficial one -- you can curl flac voice files to them but even their
transcription is not terrific. (They use it for automatic captions on YouTube
-- use that to judge...)

