Hi HN,
Based on Whisper [0] and Whisper.cpp [1], I put together a comparison of their transcription performance, using quantitative metrics such as relative speed.
You can find the code in the Colab [2] and a blog post [3] containing a how-to guide and visualizations.
In the future, I'd love to add word error rate (WER) evaluation and visualizations based on ground-truth data.
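For anyone unfamiliar with WER: it is the word-level edit distance (substitutions + deletions + insertions) between a ground-truth transcript and the model output, divided by the number of words in the ground truth. Here is a minimal sketch of how that could be computed. It is not part of the Colab; the function name `word_error_rate` and the lowercase/whitespace normalization are my own assumptions, and a real evaluation would likely also strip punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase + whitespace split is enough normalization here.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic row-by-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many extra words.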
Bonus: Normally you would log these results to Weights & Biases from Python, but you can also log results from a C++ program or any other CLI tool by calling it from Python with `subprocess`.
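Roughly, the `subprocess` trick looks like the sketch below. This is a simplified illustration rather than the exact code from the Colab: the helper names `time_command` and `log_to_wandb` are made up for this post, and "relative speed" is assumed here to mean audio duration divided by wall-clock processing time.

```python
import subprocess
import sys
import time


def time_command(cmd, audio_seconds):
    """Run a CLI command, time it, and return the metrics as a dict."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    elapsed = time.perf_counter() - start
    return {
        "elapsed_seconds": elapsed,
        # Assumed definition: seconds of audio processed per second of compute.
        "relative_speed": audio_seconds / elapsed,
        "stdout": result.stdout,
    }


def log_to_wandb(metrics, project):
    """Send the numeric metrics to Weights & Biases (requires `pip install wandb`)."""
    import wandb  # imported lazily so the timing helper works without wandb

    wandb.init(project=project)
    wandb.log({k: v for k, v in metrics.items() if isinstance(v, (int, float))})
    wandb.finish()


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; swap in your real CLI call.
    metrics = time_command([sys.executable, "-c", "print('hello')"], audio_seconds=11.0)
    print(metrics["elapsed_seconds"], metrics["relative_speed"])
```

For whisper.cpp, the command list would be something like `["./main", "-m", "models/ggml-base.en.bin", "-f", "audio.wav"]`, where the binary, model, and audio paths depend on your local build and are placeholders here. After that, `log_to_wandb(metrics, project="my-project")` pushes the numbers to a run (the project name is a placeholder too).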
I would love to know what you think of this comparison and which features or attributes you would like to see in a more sophisticated version.
Thanks!
[0]: https://news.ycombinator.com/item?id=32927360
[1]: https://news.ycombinator.com/item?id=33877893
[2]: https://colab.research.google.com/drive/1mXZUdIbvdNVOFRJaIhW...
[3]: https://wandb.ai/hans-ramsl/gradient-dissent-transcription/r...