

Text-to-speech has a new soul - AudioDrug
http://vocamedia.wordpress.com/2010/06/18/welcome-to-vocatalk-personal-podcast-blog

======
elblanco
May want to look into these papers

<http://speechprosody2010.illinois.edu/papers/100096.pdf>
<http://www.springerlink.com/index/P32774KN22650J42.pdf>
<http://www.softsea.com/review/Pistonsoft-Text-to-Speech-Converter.html>

I've heard that breaths in the mix are not just for the speaker: the listener
can use that moment to mentally index what the speaker has just said. I can't
find the paper at the moment, but there's been some good research showing that
simply adding breathing sounds and pauses at key points in the audio can make
listening to long TTS passages more bearable.
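
As a rough illustration of the pause half of that idea, here is a minimal Python sketch that marks sentence and paragraph boundaries with SSML-style `<break>` tags. The function name `add_pauses` and the pause lengths are assumptions made for the example, not values from the research mentioned above. It inserts silent pauses only, not breathing sounds, and a given TTS engine may expect different markup.

```python
import re
from xml.sax.saxutils import escape


def add_pauses(text, sentence_ms=300, paragraph_ms=800):
    """Return an SSML-style fragment with pauses at sentence and paragraph ends.

    The default durations are illustrative guesses, not researched values.
    """
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    sentence_break = f'<break time="{sentence_ms}ms"/>'
    paragraph_break = f'<break time="{paragraph_ms}ms"/>'

    marked = []
    for paragraph in paragraphs:
        # Escape &, < and > first so the text is safe inside XML markup,
        # then split after sentence-ending punctuation.
        sentences = re.split(r"(?<=[.!?])\s+", escape(paragraph))
        marked.append(f" {sentence_break} ".join(sentences))

    body = f" {paragraph_break} ".join(marked)
    return f"<speak>{body}</speak>"
```

Longer pauses between paragraphs than between sentences give the listener the larger gap where the topic is most likely to shift.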

~~~
AudioDrug
Also, VocaTalk uses all the installed voices on your system and randomly
changes the speaker for each paragraph. That's also a great factor in reducing
monotony. It has other optional audio processing features too, like moving the
voice location, echoes, and dynamic voice pitch modulation.
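
A minimal Python sketch of per-paragraph voice rotation might look like the following. It is not VocaTalk's code: the names `split_paragraphs` and `assign_voices` are hypothetical, the voice list is passed in rather than read from the system, and skipping the previous paragraph's voice is an added assumption rather than documented behavior.

```python
import random


def split_paragraphs(text):
    """Split text on blank lines and drop empty chunks."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def assign_voices(paragraphs, voices, seed=None):
    """Pair each paragraph with a randomly chosen voice.

    Assumption for this sketch: the same voice is never used for two
    consecutive paragraphs when more than one voice is available.
    """
    rng = random.Random(seed)
    assigned = []
    previous = None
    for paragraph in paragraphs:
        # Exclude the previous voice; fall back to the full list if
        # that would leave nothing to choose from (single-voice case).
        candidates = [v for v in voices if v != previous] or list(voices)
        voice = rng.choice(candidates)
        assigned.append((voice, paragraph))
        previous = voice
    return assigned
```

Each `(voice, paragraph)` pair would then be handed to whatever speech engine is in use, one paragraph at a time.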

------
teilo
Not really a leap forward so much as a synthesis of stuff that already exists.
Same technology we have heard in the last several years, with added music.

~~~
andfarm
And significantly worse voice synthesis than I've heard elsewhere. Most of
Apple's Macintalk 3 voices (which are over 15 years old now!) sounded better
than this.

------
AudioDrug
The voices are AT&T's. VocaTalk supports any SAPI 5.1/5.3 compatible voice. It
combines music, binaural waves, and effects like positional audio to improve
the experience. Together, those features make it possible to listen to hours
and hours of TTS.

------
AudioDrug
Listen with stereo earphones or earbuds. You'll notice several interesting
features, like echoes and positional audio. VocaTalk also supports a voice
modulation feature. This is all meant to reduce the monotony of TTS and make
it more fun to listen to for hours.

