
Audio Datasets for Machine Learning - TakakiTohno
https://lionbridge.ai/datasets/12-best-audio-datasets-for-machine-learning/
======
beatle_sauce
IMHO the speech dataset list is missing other interesting free corpora, e.g.
TED-LIUM, VoxForge, and Common Voice. A more comprehensive (but not
complete) list can be found here:
[https://github.com/kaldi-asr/kaldi/tree/master/egs](https://github.com/kaldi-asr/kaldi/tree/master/egs)
(download links can be found in the scripts).

------
sschmitt
Also see the "Heidelberg Spiking Datasets":
[https://ieee-dataport.org/open-access/heidelberg-spiking-datasets](https://ieee-dataport.org/open-access/heidelberg-spiking-datasets)

------
MintChocoisEw
The Spoken Wikipedia corpus is especially impressive.

