
About the Shazam algorithm: I don't want to be all cynical and dumpy, because it's not like I remember exactly how it works either (it's proprietary, after all, and even the explanation I was given was not definitive), but one of my Music Information Retrieval professors once described his anecdotal knowledge of it. It was definitely based on features derived from the FFT, but it didn't seem very concerned with note identification, if at all. There are a ton of features that can be computed from FFTs that can't be equated to "pitch". Beware misleading analogies... the frequency domain (and quefrency, etc.) is a difficult space to conceptualize.
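To make the "not pitch" point concrete, here is a minimal sketch of one such feature, the spectral centroid (the magnitude-weighted mean frequency, often read as "brightness"). This is only an illustration and not anything Shazam is known to use. The function names, the 8000 Hz sample rate and the 256-sample frame are all my own assumptions, and a naive DFT stands in for a real FFT so the snippet needs nothing outside the standard library.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes for bins 0..n//2 (a slow stand-in for an FFT)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    return mags


def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency in Hz: a brightness measure, not a pitch estimate."""
    n = len(samples)
    mags = magnitude_spectrum(samples)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    # Bin k corresponds to frequency k * sample_rate / n.
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def tone(partials, sample_rate=8000, n=256):
    """Hypothetical helper: sum of sine partials given as (frequency_hz, amplitude) pairs."""
    return [
        sum(a * math.sin(2 * math.pi * f * t / sample_rate) for f, a in partials)
        for t in range(n)
    ]


# Same 250 Hz fundamental, different harmonic content.
dull = tone([(250, 1.0)])
bright = tone([(250, 1.0), (1000, 1.0)])

print(spectral_centroid(dull, 8000))
print(spectral_centroid(bright, 8000))
```

Both signals would be called the same note, yet the centroid comes out near 250 Hz for the first and near 625 Hz for the second, so the feature is tracking something other than pitch.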

And when you get into machine learning, some of the operations performed by neural networks and the like aren't really linear, human-understandable transformations. It's important to understand feature extraction, but more important in the grand scheme of things is to understand how to dig up useful data and how it can be used.




There is a paper [1] describing the algorithms used by Shazam.

[1] http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf


Awesome, I will give this a read for sure. Thanks!



