

Ask YC: Question(s) for speech technology experts. - opportunity

I would really appreciate it if someone working on speech technologies (speech-to-text or text-to-speech) could provide some insight on this.<p>For the past couple of months, I have been completely fascinated by what speech technologies (specifically speech recognition and speech synthesis) can do and how they can enhance the user experience. I decided to delve deep into speech synthesis technologies. From my research into available solutions, there is a huge difference between the open source solutions and the commercial solutions available for $$$$$.<p>From what I have read about speech recognition, the open source solutions perform extremely poorly compared to their commercial counterparts.<p>Has anyone else here looked at the possibility of improving any of the available open source speech technologies to a level close to the commercial ones? Is it even possible to improve Sphinx or Festival to a level where they can be used commercially without developing everything from scratch?
Is it even worth investigating?<p>Could someone working in this area articulate the challenges involved (technical, monetary, etc.)?<p>Okay, thanks a lot for reading this. Looking forward to your comments.<p>P.S.:<p>I would really like to hear from someone who has worked or is working in this area about their experiences. I am located in the South Bay.
I am also attending Startup School this month.
======
JeffJenkins
I worked at Nuance Communications for about a year doing voice application
development. I'm not familiar with Sphinx, but I did talk to a number of
people at the company about it (this was 2004-2005).

From what I gathered it would be pretty difficult to get speech synthesis up
to their level. A single "voice" will be generated by taking _tens of hours of
audio_ and using algorithms to splice them together based on the text.

The only significant monetary constraint is going to be if you want to have a
real voice talent doing your recording in a studio, but I wouldn't try to
tackle the technical issues without a subject matter expert.

