Creating Shazam in Java (redcode.nl)
295 points by freefrag on May 17, 2013 | 43 comments




Update: I posted a link to some source code that implements the Shazam algorithm:

https://news.ycombinator.com/item?id=5724442

About the patent lawsuit thing:

As I understand it, Shazam sold their patent to Landmark Digital Services, which is part of BMI, the performing rights organization. They kept an exclusive license to make Shazam-like software for phones.

You can imagine BMI wanting it to make money from how a service such as Youtube fingerprints and detects copyright infringement...

And it was this BMI company that was trying to get this blog post explaining the patented algorithm removed from the internet.

One message from the BMI lawyers to Roy in the Netherlands was a particularly broad piece of bullying:

> Mr. Van Rijn,

> The two example patent numbers that I sent you are U.S. patents, but each of these patents has also been filed as patent applications in the Netherlands. Also, as I'm sure you are aware, your blogpost may be viewed internationally. As a result, you may contribute to someone infringing our patents in any part of the world.

> While we trust your good intentions, yes, we would like you to refrain from releasing the code at all and to remove the blogpost explaining the algorithm.

> Thank you for your understanding.

> Best regards,

> Darren P. Briggs

> Vice President & Chief Technical Officer

> Landmark Digital Services, LLC

Roy gave a great talk at Devoxx about this: http://www.redcode.nl/blog/2012/03/devoxx-2011-talk-freely-a...

I think I heard that Shazam recently got the patent back. I speculate BMI found no-one to license their fingerprinting tech for copyright infringement.


" you may contribute to someone infringing our patents in any part of the world."

Oh really? What a dolt.

Patents are, by definition, public (though not before they are accepted).


    Patents are, by definition, public.
That's actually the telling sign of the dysfunctional patent system. Companies want to use patents to prevent everybody else from doing something similar, and in this case, even from just talking about it (which is obviously ridiculous).

Patents used to be a framework for sharing technological progress without giving up ownership, i.e. to make it easier for everybody else to build on others' progress. That's long gone.


[deleted]


In the article he explicitly states that he read the patent and implemented the algorithm based on it, which is not illegal.

And he never did publish the source code beyond the excerpts in that blog post.


This is very cool. A minimal, clear implementation of the algorithm that replicates the effect of Shazam. It's refreshing to see a blog post with actual code samples get voted up instead of all the press releases.


+1


I mirrored this implementation a while ago since the full source isn't available. It was not nearly as successful as the blogger portrays. For example, if I used a high-quality mono WAV file to create a fingerprint, it would have a hard time identifying a track that is an MP3. It seems the maxima actually get shifted and merged by compression. In other words, there's a reason Shazam uses entropy-based anchor points to help it pick hashing values.


I'm wondering if they bound the fingerprint search to human audible frequencies. MP3 compression, as a lossy codec, works by discarding information in the input signal that corresponds to inaudible frequencies. I believe this could be mirrored in the implementation by running the frequency domain peak-pick algorithm only over specific bin ranges.
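A minimal sketch of what restricting the peak-pick to specific bin ranges could look like, assuming the spectrum of one time window is already available as a list of magnitudes. The name `pick_band_peaks` and the band edges are hypothetical, chosen only for illustration; they are not taken from the article or from Shazam.

```python
def pick_band_peaks(magnitudes, band_edges):
    """Return the index of the strongest bin inside each [lo, hi) band.

    magnitudes: spectrum magnitudes for one time window, indexed by bin.
    band_edges: ascending bin indices; consecutive pairs form the bands.
    Bins outside the first and last edge are never inspected.
    """
    peaks = []
    for lo, hi in zip(band_edges, band_edges[1:]):
        band = range(lo, min(hi, len(magnitudes)))
        if len(band) == 0:
            continue  # band lies entirely beyond the spectrum
        peaks.append(max(band, key=lambda i: magnitudes[i]))
    return peaks


# Hypothetical band edges, in bins, for illustration only.
BAND_EDGES = [2, 4, 8, 12]

# Made-up magnitudes: bins 0, 1 and 12 are loud but fall outside every band.
spectrum = [9.0, 8.0, 0.1, 0.7, 0.2, 0.3, 1.5, 0.4, 0.2, 0.1, 0.6, 0.3, 7.0]
print(pick_band_peaks(spectrum, BAND_EDGES))
```

Picking one peak per band, rather than the loudest bins overall, keeps very low and very high bins from dominating the fingerprint. Whether that is enough to survive MP3 compression is a separate question.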


I don't recall if the paper specifies the frequency ranges used, but my implementation was bounded to audible frequencies. I was going to use a hill-climbing search to find optimal frequency ranges, but came to the conclusion that my implementation was too flawed regardless. If I looked at the two graphs side by side (compressed vs. uncompressed), they looked nothing alike. For example, a peak might be in the same region, but it would be shifted.
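One thing that might soften the shifted-peak problem is quantizing each peak bin before hashing, so that neighbouring bins land in the same bucket. The sketch below is only an assumption about how one could try this, with made-up peak values; `bucket` and `fingerprint_key` are hypothetical names.

```python
def bucket(bin_index, width):
    """Round a bin index down to the start of its bucket of `width` bins."""
    return bin_index - (bin_index % width)


def fingerprint_key(peak_bins, width=4):
    """Build a hashable key from quantized peak bins (one peak per band)."""
    return tuple(bucket(b, width) for b in peak_bins)


original = [41, 82, 121]    # peaks from the uncompressed file (made-up values)
compressed = [42, 83, 120]  # same peaks, each shifted by one bin
drifted = [41, 82, 124]     # last peak shifted by three bins

print(fingerprint_key(original) == fingerprint_key(compressed))
print(fingerprint_key(original) == fingerprint_key(drifted))
```

This only reduces the problem: a shift that crosses a bucket boundary still changes the key, however small the shift is, and wider buckets make unrelated windows collide more often. It also does nothing for peaks that get merged.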



For those interested in more about the algorithm, one of the guys who created Shazam released a whitepaper on it. http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf


After using Shazam, I was kind of hoping there was more to it than just a time-windowed, frequency-domain peak-pick algorithm. The algorithm itself is pretty basic from a signal processing perspective, but I think the key insight here was that the results are unique enough to store and compare other samples against at some later point in time.
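For anyone who wants to see how basic it is, here is a rough sketch of a time-windowed, frequency-domain peak-pick using only the standard library. It assumes non-overlapping windows, no window function, and a window size of at least 4; it uses a naive DFT where real code would use an FFT library. All function names are hypothetical.

```python
import cmath
import math


def dft_magnitudes(window):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n/2 - 1."""
    n = len(window)
    mags = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(window))
        mags.append(abs(acc))
    return mags


def dominant_bins(samples, window_size):
    """Split samples into non-overlapping windows and return each window's
    strongest frequency bin (the DC bin 0 is skipped)."""
    peaks = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        mags = dft_magnitudes(samples[start:start + window_size])
        peaks.append(max(range(1, len(mags)), key=lambda k: mags[k]))
    return peaks


def tone(bin_index, n):
    """n samples of a sine that completes bin_index cycles in n samples."""
    return [math.sin(2 * math.pi * bin_index * t / n) for t in range(n)]


# Two back-to-back test tones; a real input would be decoded PCM audio.
signal = tone(5, 64) + tone(12, 64)
print(dominant_bins(signal, 64))
```

The list of per-window peaks is the raw material for a fingerprint; everything after that is about storing and matching it.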


Yeah, the magic (if there is any) is doing the match across a silly amount of songs in a relatively short time. Not groundbreaking exactly, but operationally quite interesting.


I actually remember first using this by dialling 2580 about 10 years ago. At the time it felt truly magical.


Are there other uses for this algorithm/technique, when applied to signals other than audio? I mean apart from identifying a source from a small clip.


This type of analysis is commonly used in tons of things, like communications systems, image processing, radar, etc. I used a similar technique when trying to identify an underutilized wifi channel in the vicinity of my apartment.


IIRC there have been a number of papers on using a similar technique with speech-to-text applications.


Well, I'm sure they must be using a few tricks in their implementation. I've always been interested in knowing how Shazam actually works, and had in mind that they must somehow split a song into intervals and "hash" every interval, then store the hashes in some kind of indexed database for fast retrieval. Seems I was not too far off :)
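A toy version of that "hash every interval, then look it up" idea, assuming each time window has already been reduced to one hashable key. The keys, song names, and function names here are made up for illustration, and an in-memory dict stands in for the indexed database.

```python
from collections import Counter, defaultdict


def build_index(songs):
    """Map each fingerprint key to every (song, window) where it occurs.

    songs: dict of song name -> list of hashable keys, one per time window.
    """
    index = defaultdict(list)
    for name, keys in songs.items():
        for t, key in enumerate(keys):
            index[key].append((name, t))
    return index


def best_match(index, sample_keys):
    """Vote for (song, offset) pairs; a true match piles its votes onto one
    consistent offset. Returns (song, offset, votes) or None."""
    votes = Counter()
    for i, key in enumerate(sample_keys):
        for name, t in index.get(key, ()):
            votes[(name, t - i)] += 1
    if not votes:
        return None
    (name, offset), count = votes.most_common(1)[0]
    return name, offset, count


# Made-up keys; in practice each would come from the peaks of one window.
songs = {
    "song_a": [11, 42, 37, 58, 23, 90, 14],
    "song_b": [42, 11, 58, 37, 77, 23, 65],
}
index = build_index(songs)
print(best_match(index, [37, 58, 23]))
```

Voting on the offset rather than just counting shared keys matters: `song_b` contains all three sample keys too, but at inconsistent positions, so its votes are split across offsets while `song_a` collects all of its votes at a single one.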


Yeah, this is the obvious implementation. As he said in his follow-up post:

>And second, I’d like to know which patents are in play. Because I just couldn’t think that something this easy (music-fingerprint is a hash, and we do a lookup) can be patented.. Maybe in the States, but in Europe?


Can someone please compare it to other fingerprinting approaches http://en.wikipedia.org/wiki/Acoustic_fingerprint ?


So, the patent infringement story ended up with "Good luck."?


It happened 3 years ago and nothing more was posted on the issue after 2010, so I believe they didn't file a suit in the end.

Here are the links to the infringement story for those who missed it: http://www.redcode.nl/blog/2010/07/patent-infringement/ http://www.redcode.nl/blog/2010/11/patent-infrigement-part-2...


Thanks for the link.

The result shows he did the right thing by not complying; they had no legal grounds for their demands.


I find it absurd how they try to threaten people with their legal teams. They had nothing on you but didn't want you to release the code so they threw their legal team at you and made them come up with some bullshit. That's just ridiculous! Good for you for standing up against that crap.


This is interesting

I wonder how the work is split between client/server in (actual) Shazam. (I suppose only the key points are sent to the server, but I may be wrong - Siri for example sends the server a compressed audio file of the recorded sound)


You can phone Shazam up and have it make the identification using what it hears live. There's no client processing at all for that.



The first time I used Shazam, I was so amazed. I had to download the original paper, but still couldn't understand how it worked well enough to code it. Now let's get to work on it.

Great article, thank you


Good article, title should have [2010] in it.


One possible way to solve the legal troubles is to just remove any references to the product name 'Shazam'. You could title the blog post "Algorithm in Java that identifies music similar to other commercial products" (too long.. but use your imagination)


That wouldn't do a thing. Patents cover the code, not the names. (Well, the "embodiment", but since all there is here is code, it is clearly covering the code.) That would only help a trademark infringement, and there isn't one here.


Ah.. good point.


Where does Shazam get the content for its fingerprint database?

I mean, did they buy/rent MP3s?


I haven't had a chance to google for a source so take this as anecdotal but I vaguely remember reading an interview with the people behind it (when Shazam first launched in the UK) and in it they said they were ripping thousands of CDs a day/week (can't remember which) and running each track through their algo. Can't remember if they bought the CDs or had some deal in place with the record labels.


Interesting. So as with many other "innovative" startups - the content is the crucial thing.

As already pointed out here, audio fingerprinting is not a new thing. Although, they might have added some twists in order to be able to patent it.


We had a 'build your own Shazam' as a lab for Berkeley's Intro. Signals & Systems class this semester. Super cool to see it working and quite an interesting application of Signals & Systems



What ever happened to the "patent infringement" issue?


Are there any code changes that you can make to avoid conflicting with the patent?


For the first time I'm surprised that one of the first comments isn't "why was it written in Java, bla, bla, bla". Those were getting really annoying.


I wonder if they would have bothered you if you had named the post: "Creating Google Ears in Java"


Whoever wrote that article could have said which external libraries (s)he used.



