
Ask HN: What is the max audio quality to feed to machine-learning transcription? - kmfrk
Since machine learning doesn't map directly to human cognition, I was wondering how wasteful it is to feed something like 160 kbps speech audio to a transcription service, and how much time and performance could be saved by reducing it to something vastly smaller.

I know little to nothing about this field, but as someone interested in where transcription is and in archival, I was wondering what the optimal solution is when ripping videos from different places and extracting the audio to transcribe it programmatically.
======
braindead_in
Most speech datasets use WAV files at a 16 kHz sampling rate. A good place to
start is Mozilla's DeepSpeech implementation.
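
As a rough sketch of what that conversion could look like when pulling audio out of ripped videos: the snippet below builds an ffmpeg command that writes a 16 kHz WAV. The mono downmix and 16-bit PCM encoding are my assumptions, not something stated above, and the helper names and file names are made up for illustration. It also shows that uncompressed 16 kHz, 16-bit mono PCM works out to 256 kbps, so the savings come from the lower sample rate and single channel, not from a smaller file than a 160 kbps compressed source.

```python
import subprocess


def ffmpeg_to_wav_args(src, dst, sample_rate=16000):
    """Build an ffmpeg command that extracts mono 16-bit PCM WAV audio.

    Hypothetical helper: mono and 16-bit are assumptions for illustration.
    """
    return [
        "ffmpeg",
        "-i", src,                 # input video or audio file
        "-vn",                     # ignore any video stream
        "-ac", "1",                # downmix to a single channel (assumed)
        "-ar", str(sample_rate),   # resample to the target rate
        "-c:a", "pcm_s16le",       # 16-bit little-endian PCM (assumed)
        dst,
    ]


def convert(src, dst, sample_rate=16000):
    """Run the conversion; requires ffmpeg to be installed and on PATH."""
    subprocess.run(ffmpeg_to_wav_args(src, dst, sample_rate), check=True)


def pcm_bitrate_kbps(sample_rate=16000, bits_per_sample=16, channels=1):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate * bits_per_sample * channels / 1000


if __name__ == "__main__":
    # "talk.mp4" and "talk.wav" are placeholder file names.
    print(" ".join(ffmpeg_to_wav_args("talk.mp4", "talk.wav")))
    print(pcm_bitrate_kbps(), "kbps")
```

Calling `convert("talk.mp4", "talk.wav")` would run the actual conversion. Whether a given transcription service benefits from this, or resamples internally anyway, is something to check against that service's documentation.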

