
Show HN: Automatic Speech Recognition in TensorFlow - zzw922cn
https://github.com/zzw922cn/Automatic_Speech_Recognition
======
bmc7505
Just skimming the code, this is an impressive piece of work, probably one of
the most comprehensive open source ASR implementations using deep learning
I've seen yet. It certainly looks correct based on the careful documentation
(English comments would also help), but the most important part is providing a
pre-trained model. Validating TF code can be an expensive and painstaking
process, and if you have the weights, then sharing them proves that your net
works. This seems obvious, but very few open source implementations actually
take that step, and I'm not sure why.

~~~
jacquesm
I suspect that the size of the weights file might have something to do with
that. 500 MB+ is not unusual, and if enough people hit your small VPS or Amazon
account, you can end up rate capped, knocked offline, or broke.

It would be nice if a universal 'model+dataset+weights' sharing service would
spring into being.

~~~
carbocation
Seems like this is a perfect place to use BitTorrent: a large, desirable file
for which you own the distribution permissions.

~~~
ambicapter
Yeah, that seems really obvious now that you say it. Not sure if there is such
a thing as a cheap seedbox around.

------
asdffdsaa
I know nothing about ML, but I'm willing to read manuals, write scripts, and
deal with tedious technical stuff; is it feasible to use[1] this without
really understanding how it works?

[1] "Use" being defined as having short spoken phrases[2] trigger my scripts.
[2] I'm willing to accept significant restrictions on the nature of these
phrases, such as intentionally making them sound very different.

------
nshm
Alternatives, for reference:

[https://github.com/mozilla/DeepSpeech](https://github.com/mozilla/DeepSpeech)

[https://github.com/pannous/tensorflow-speech-recognition](https://github.com/pannous/tensorflow-speech-recognition)

[https://github.com/buriburisuri/speech-to-text-wavenet](https://github.com/buriburisuri/speech-to-text-wavenet)

------
d33
Nice! Just curious, is the git repo all one needs in order to run speech to
text on their PC? How accurate is this thing? Is there a demo somewhere?

------
uaspeech
Good job! How much time did it take to train this model? I've heard TensorFlow
is slow compared to other toolkits.

------
NHQ
How does one evaluate an utterance?

