
Deep Speech: Scaling up end-to-end speech recognition - lelf
http://arxiv.org/abs/1412.5567
======
dthal
I'll kick things off by linking to the comments from the last time this was
posted:
[https://news.ycombinator.com/item?id=8769067](https://news.ycombinator.com/item?id=8769067).

------
natch
>16.0% error on the full test set

Does anyone know what the error rate was with previous approaches?

~~~
woodson
Look at Table 3 in the paper. Also, 16.5% error ;)
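For context, the "error" figures in this literature are typically word error rate (WER): the word-level edit distance between the recognized transcript and the reference, normalized by the reference length. A minimal sketch of the metric (my own illustration, not the paper's scoring tool):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deleting all of ref[:i]
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # inserting all of hyp[:j]
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[len(ref)][len(hyp)] / len(ref)
```

So a 16% WER means roughly one word in six is substituted, dropped, or inserted relative to the reference transcript.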

~~~
natch
Oh thanks, and, wow!

------
infocollector
Is there a GitHub repo that I can compile and try?

~~~
woodson
Providing the source would be a step toward improving transparency and
reproducibility (the text does not give enough detail for even someone
working in the field to reproduce what they did and arrive at the same
results); however, the more crucial thing is the data. Switchboard, Fisher,
and WSJ are available (provided you have a few grand to spend), but they say
they collected 5000 hours of read speech from 9600 speakers. That's a huge
effort!

~~~
xai3luGi
An alternative source of data that you can contribute to:

[http://www.voxforge.org/](http://www.voxforge.org/)

------
egfx
woo!

