
MFCCs: Engineering Features from Sound - yawz
https://pex.com/blog/machine-learning-mfccs-engineering-features-from-sound/
======
eindiran
I've always found it fascinating that both speaker identification and speech
recognition use MFCCs (which I discovered when talking with someone who had
worked on speaker identification for their PhD):

* In the case of speaker identification, you don't care about what is being said; you care about who is saying it.

* In the case of speech recognition, you don't care about who is speaking; you want to know what is being said.

That both tasks use the same underlying features is very surprising to me. I
imagine it points to something powerful about how well the mel scale maps onto
human psychoacoustics, but I'd be interested to hear other theories about why
it shakes out that way, especially given that the research behind the mel
scale has frequently been criticized.
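For anyone unfamiliar with the mel scale being discussed: it maps frequency in Hz to perceived pitch, compressing high frequencies the way human hearing does. A minimal sketch of the common log formulation (the one most MFCC implementations use, though variants exist; the 10-band split below is just an illustrative choice, not anything from the article):

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Map frequency (Hz) to mels via the common formulation
    mel = 2595 * log10(1 + f/700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m: float) -> float:
    """Inverse mapping: mels back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# MFCC filterbanks are spaced evenly in mels. Converting evenly spaced
# mel band edges back to Hz shows the perceptual warping: narrow bands
# at low frequencies, wide bands at high frequencies.
lo, hi = hz_to_mel(0.0), hz_to_mel(8000.0)
edges_mel = [lo + i * (hi - lo) / 10 for i in range(11)]
edges_hz = [mel_to_hz(m) for m in edges_mel]
```

The `edges_hz` spacing grows toward high frequencies, which is the sense in which the scale encodes psychoacoustics; that warping is shared by both speaker ID and speech recognition front ends, since both care about perceptually salient spectral detail.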

