Ask HN: What is the max audio quality to feed to machine-learning transcription?
2 points by kmfrk on June 5, 2017 | 1 comment
Since machine learning doesn't map directly to human cognition, I was wondering how wasteful it is to feed something like 160 kbps speech audio to a transcription service, and how much processing time could be saved by reducing it to something vastly smaller.

I know next to nothing about this field, but I'm interested in the current state of transcription and in archival, so I was wondering what the optimal approach is when ripping videos from different sources and extracting the audio to transcribe it programmatically.



Most speech datasets are WAV files with a 16 kHz sampling rate. A good place to start is Mozilla's DeepSpeech implementation.
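For anyone who wants to try that, here is a rough sketch of getting ripped audio into that shape with ffmpeg from Python. The 16 kHz rate comes from the comment above; mono and 16-bit PCM are my assumptions about what a typical speech model expects, and the helper names and file names are made up for illustration. Check your model's documentation before relying on it.

```python
import subprocess


def ffmpeg_to_16k_mono_wav(src, dst):
    """Build an ffmpeg command that extracts audio as a 16 kHz WAV.

    Mono and 16-bit PCM are assumptions, not something the thread specifies.
    """
    return [
        "ffmpeg",
        "-i", src,            # input video or audio file
        "-vn",                # drop any video stream
        "-ac", "1",           # downmix to mono (assumed)
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM (assumed)
        dst,
    ]


def convert(src, dst):
    """Run the conversion; requires ffmpeg to be installed and on PATH."""
    subprocess.run(ffmpeg_to_16k_mono_wav(src, dst), check=True)


def pcm_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kbps."""
    return sample_rate_hz * bits_per_sample * channels / 1000


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(ffmpeg_to_16k_mono_wav("talk.mp4", "talk.wav")))
    print(pcm_kbps(16000, 16, 1))
```

Note that under these assumptions the uncompressed WAV works out to 256 kbps, which is more than the 160 kbps compressed source. So the conversion is less about shrinking the file and more about matching the sample rate and channel layout the model was trained on.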



