How does the music-identifying app Shazam work its magic? (slate.com)
50 points by chuck_taylor on Oct 20, 2009 | 16 comments



Too bad they don't own the rights to their own fingerprinting software...

At this point, it seems Shazam isn't much more than a brand. Which, I'm sad to admit, they've done a very good job building.

(fair disclosure: I work for a company that sells a competitive product)


Lest we forget what Mint.com did with Yodlee.


It seems that the paper does come from its chief scientist. Who owns the rights/patent? Columbia University?


The majority of the patent applications (and issued patents) I found are assigned to Shazam Entertainment Ltd; there is no mention of Columbia.

Issued Patents (Wang, Avery Li-Chun):

1. http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sec...

Here is a search by inventor at AppFT (Patent Applications):

A. http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sec...

4 most relevant applications:

1. Method and System For Content Sampling and Identification

1.1 http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sec...

1.1.1 TIFF image (rename file *.tif): http://aiw1.uspto.gov/.DImg?Docid=us20080154401ki&PageNu...

2. Method and apparatus for automatically creating database for use in automated media recognition system

2.1 http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sec...

3. Method for High-Throughput Identification of Distributed Broadcast Content

3.1 http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sec...

4. Robust and invariant audio pattern matching

4.1 http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sec...


So let's go further down the rabbit hole...

Wang/Li-Chun (US6990453) is assigned to Landmark Digital Services, Inc. Let's see who they are:

"Landmark Digital Services LLC is a wholly owned subsidiary of Broadcast Music, Inc. (BMI), a company long known for visionary technical innovation driven by a genuine passion for music. In 2005, BMI acquired the complete patent portfolio from Britain’s Shazam Entertainment Ltd."

I've always heard mentions that the Shazam technology was originally targeted to root out pirated music on P2P nets. Sounds about right to me now.


i worked at landmark digital up until about a year ago, so i can say with some confidence that you've got it backwards.

the music recognition algorithm was developed by mr wang, specifically for shazam, to do just what it has been doing all along.

the reason landmark got into the picture is because they are a subsidiary of bmi, the music rights organization. they monitor radio stations. since the dawn of time, they have been paying actual human beings to record logs of every song played on most of the big radio stations in the united states, 24/7. they use that information to ensure that the song's rightsholders get paid for the plays. obviously this is a situation ripe for automation. they hired a guy to do a multi-year search for the best technology for the job, and settled on shazam's algorithm.

i am not privy to the details, but i believe shazam sold the code to landmark because they were strapped for cash. there are complicated legal agreements in place of course, so that shazam can still use the code for cellphone music recognition.


It's actually technically not that hard to get something like this working. Monetizing is trickier. I don't see how they will make enough money from it.

Didn't know they were still going after 7 years.


Actually I have to completely disagree. Technically this is very hard to get right, monetizing it is a lot easier.

It is quite hard, believe me. Definitely not a weekend project. When it's as simple as recognizing clean audio, it's indeed not very hard. The problems start with (a) background noise, (b) karaoke and mono audio input, since removing the vocals is very hard without stereo channels, and (c) radio stations and television broadcasters often speeding up songs a bit to fit the time slot.

The monetizing bit is a lot simpler. I can't really talk about that because of NDAs I've signed but as a hint, think about B2B instead of selling it to consumers. What big companies need to track music? How could audio recognition help them?
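
To make the speed-up problem in point (c) concrete, here is a toy sketch. It assumes a fingerprint built from pairs of spectral peaks, hashed as quantized (f1, f2, dt) triples, and it assumes the broadcaster speeds the song up by plain resampling, so times shrink and pitches rise together. The names `pair_hashes` and `speed_up`, the peak values, and the quantization steps are all made up for illustration; this is not Shazam's code or anyone's production scheme.

```python
# Toy illustration only: hypothetical peak-pair hashing, not any vendor's algorithm.

def pair_hashes(peaks, freq_step=10.0, time_step=0.05):
    """Hash each pair of consecutive peaks as a quantized (f1, f2, dt) triple.

    peaks: list of (time_seconds, frequency_hz) tuples, sorted by time.
    """
    hashes = set()
    for (t1, f1), (t2, f2) in zip(peaks, peaks[1:]):
        hashes.add((
            round(f1 / freq_step),         # quantized first frequency
            round(f2 / freq_step),         # quantized second frequency
            round((t2 - t1) / time_step),  # quantized time gap
        ))
    return hashes


def speed_up(peaks, factor):
    """Simulate resampled playback: times divide by factor, pitches multiply by it."""
    return [(t / factor, f * factor) for t, f in peaks]


# Made-up spectral peaks for a short clip.
original = [(0.0, 440.0), (0.5, 660.0), (1.0, 880.0), (1.5, 550.0)]

reference = pair_hashes(original)
clean_query = pair_hashes(original)
fast_query = pair_hashes(speed_up(original, 1.05))  # 5% faster

# Fraction of reference hashes that each query reproduces.
overlap_clean = len(reference & clean_query) / len(reference)
overlap_fast = len(reference & fast_query) / len(reference)

print("clean overlap:", overlap_clean)
print("sped-up overlap:", overlap_fast)
```

With these made-up numbers the clean query reproduces every reference hash, while the 5% speed-up shifts each frequency into a different quantization bin and no hash matches. A real system has to tolerate that somehow, which is part of why this is not a weekend project.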


I haven't tried anything like it, so I can't comment on the difficulty of creating the fingerprinting technology, and frankly, I'm not all that impressed with Shazam. But as I understand it, Pandora uses the same Music Genome data to make its suggestions, and that is REALLY magic.


Pandora actually uses a mix of human and algorithmic input for the Music Genome. They have a whole team of musicians just cataloguing and categorizing music based on various metrics. That's why they're usually very accurate.


I've heard a couple times that Pandora doesn't really rely on the Music Genome any more because it has so much other listener data it can use to determine relevant matches. Haven't actually asked Pandora about it, so take that with a grain of salt.


Yah I believe that. We don't have access to the Music Genome and we can still build a decent rec engine. When you have millions of users you can assemble pretty good features if you log the usage properly.
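
As a rough sketch of what "features from usage logs" can look like: the snippet below counts how often two tracks are played by the same user and recommends the most frequent co-listens. The log format, the track names, and the functions `co_occurrence` and `recommend` are assumptions for illustration only, not a description of Pandora's system or ours.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_occurrence(listening_log):
    """Count, for every pair of tracks, how many users played both.

    listening_log: iterable of (user, track) play events (hypothetical format).
    """
    tracks_by_user = defaultdict(set)
    for user, track in listening_log:
        tracks_by_user[user].add(track)

    counts = defaultdict(Counter)
    for tracks in tracks_by_user.values():
        # Each user contributes one count per unordered pair of tracks.
        for a, b in combinations(sorted(tracks), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, track, k=2):
    """Return up to k tracks most often co-listened with `track`."""
    return [other for other, _ in counts[track].most_common(k)]


# Made-up play events.
log = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "A"), ("u3", "B"),
    ("u4", "A"), ("u4", "C"), ("u4", "D"),
]

counts = co_occurrence(log)
suggestions = recommend(counts, "A")
print(suggestions)
```

This ignores play counts, recency, and popularity bias, all of which a real engine would need to handle, but it shows why good logging alone gets you surprisingly far.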


melodis.com (the maker of midomi) is a lot more impressive than Shazam.

Here's a brief description of its proprietary search technology.

http://sev.prnewswire.com/multimedia-online-internet/2007012...

Here's an article that compares the features of both iPhone apps.

http://www.theiphoneblog.com/2008/11/05/app-app-shazam-midom...


Shazam works much better than Midomi in my incredibly informal, anecdotal experience... I never got any of the extra Midomi features to work when I could have used them, either.


When was the last time you used Midomi?

I think it's pretty well established that Midomi is faster and has better music recognition than Shazam. There are a few video comparisons out there that show this:

http://www.youtube.com/watch?v=P-EK50DWDn0

http://revision3.com/appjudgment/ip_mau_shazamvsmidomi


Yeah, I used it way back when it first got posted on the App Store.

I reinstalled it based on your post and used it in Starbucks earlier, where it definitely worked better than Shazam.

Still can't recognize my humming, but that might be my fault more than Midomi's...



