I'm curious to see how well the iTunes Match feature works. Naturally it will have to use audio fingerprinting rather than just trusting user-supplied metadata. The catch is that this technology is probably based on Lala (which Apple bought), and Lala's software was extremely dodgy. I had records where <50% of songs were correctly identified, with the others "matched" to seemingly random tracks from completely unrelated genres.

If Apple has not earnestly dug into and improved on this software, users will be completely mystified and the whole thing will be a big embarrassment for Apple.

Good point, matching is a difficult feature to get right. Google Music does an extraordinary job, but this is the type of problem that Google's engineers are good at solving. Apple's engineers (traditionally) aren't, so it will be interesting to see if they've spent a lot of time on getting this right, or if they only tuned it to match music bought from iTunes.

EDIT: Interesting that MusicMatch Jukebox perfected matching over 10 years ago. Even after Yahoo bought and ruined it, I kept the program around simply because it did a better job of tagging music than any other software.

I actually keep my music matching separate, and do it in MusicBrainz Picard. I went through and fixed tags for somewhere around 10,000 tracks rather painlessly. It will also rename and move your files automatically. Since it also uses user-submitted data, it does a great job of finding some pretty obscure stuff too. I highly recommend it.

Same here. I keep my MusicMatch Jukebox v9. Their synths are so great. Even now, I prefer MusicMatch over iTunes.

I don't know how big Lala was, but Apple has maybe the biggest collection of correctly tagged audio, along with users' listening habits through purchases and other data (soon playlists, etc.). This can certainly help mitigate the problem.

What's really impressive to me is that they managed to negotiate a deal with the majors to do this.

Apple uses Gracenote for identification and tags. It is the industry standard.


There are free and open implementations of the same thing as well. It hashes the track locations and lengths to create a 'fingerprint' for each CD. Then it looks up the hash in the database.
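For anyone curious what that looks like in practice, here is a rough sketch of the classic freedb-style disc ID calculation. This is the open variant, not necessarily what Gracenote runs today. The track offsets and the `lookup` dictionary are made-up illustration data, not a real database.

```python
FRAMES_PER_SECOND = 75  # CD audio is addressed in frames, 75 per second


def digit_sum(n):
    """Sum of the decimal digits of n."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def disc_id(track_offsets, leadout_offset):
    """Classic freedb-style disc ID from track start offsets (in frames).

    track_offsets: start frame of each track, in order
    leadout_offset: frame where the lead-out begins (end of the last track)
    """
    checksum = sum(digit_sum(off // FRAMES_PER_SECOND) for off in track_offsets)
    length = (leadout_offset // FRAMES_PER_SECOND
              - track_offsets[0] // FRAMES_PER_SECOND)
    value = ((checksum % 0xFF) << 24) | (length << 8) | len(track_offsets)
    return "%08x" % value


# Hypothetical three-track disc and a stand-in for the online database.
example_id = disc_id([150, 15000, 30000], 45000)
lookup = {example_id: "Some Artist / Some Album"}
print(example_id, "->", lookup.get(example_id, "unknown disc"))
```

Note that nothing here touches the audio itself. Two different pressings with slightly different track timings get different IDs, and two unrelated discs can in principle collide.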


That wouldn't work for individual songs. This is where algorithms like the ones Shazam and Lala use come into play, since they actually look at the audio itself. Getting this right is pretty important, or you'll end up with the wrong songs.

Pirates could actually try to game the algorithm by ripping albums at bad quality to minimize file size, as long as the result is still good enough for Apple to recognize the songs.
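To illustrate the difference from the CD approach, here is a toy content-based fingerprint. It is only a sketch of the general idea (summarize what the audio sounds like, not how the file is laid out) and is not how Shazam, Lala, or Apple actually do it. The frame size, the `tone` helper, and the "low quality" simulation are all assumptions made for the demo.

```python
import cmath
import math

FRAME = 64  # samples per analysis frame (arbitrary toy value)


def tone(bin_index, amplitude=1.0):
    """One frame of a sine wave that sits exactly on DFT bin `bin_index`."""
    return [amplitude * math.sin(2 * math.pi * bin_index * n / FRAME)
            for n in range(FRAME)]


def fingerprint(samples):
    """Strongest DFT bin of each non-overlapping frame, as a tuple."""
    peaks = []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[start:start + FRAME]
        magnitudes = {}
        for k in range(1, FRAME // 2):
            # Naive DFT for bin k; slow, but fine for a toy example.
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / FRAME)
                      for n, x in enumerate(frame))
            magnitudes[k] = abs(acc)
        peaks.append(max(magnitudes, key=magnitudes.get))
    return tuple(peaks)


# A three-"note" melody, plus a quieter, coarsely quantized copy of it
# standing in for a low-quality rip.
original = tone(5) + tone(9) + tone(3)
low_quality = [round(0.3 * x, 2) for x in original]
different = tone(4) + tone(9) + tone(3)

print(fingerprint(original) == fingerprint(low_quality))
print(fingerprint(original) == fingerprint(different))
```

The degraded copy keeps the same fingerprint because only the dominant frequencies are recorded. That is the property that makes matching work, and also what a deliberately low-quality rip would lean on.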

Yeah, I realize that. I guess I assumed it would be obvious, since CDDB works from track spacing and lengths and there's no such thing with MP3 files. For the same reason, iTunes can't magically tag arbitrary MP3 files.

Yes, the metadata problem is a big issue too. Apple mentioned that the ones they can't match will get uploaded. This probably means the same thing as it did with Lala -- those "unmatched" tracks are going to end up orphaned from the rest of the album since they don't have the same metadata.

Looking at Apple's track record, I don't expect anything other than a service that works well. Not that everything Apple touches turns into gold (MobileMe...), but seeing that this is a major new feature for them, they've probably tested it quite well.

MobileMe was a major new feature for them too. As was iTools before it. Every time Apple rebrands this suite, they make it out to be an amazing, revolutionary service, and it turns out to be kind of half-baked. From what I've seen so far, iCloud doesn't look much different. Apple makes great software and hardware combos, but has yet to prove that it gets the Internet.

The iPhone 4 antenna was touted as a major new feature as well. However, their method of testing (encasing the prototypes in thick plastic to mimic the appearance of an iPhone 3G/S) may have resulted in "antenna-gate". Or at least given bloggers (and commenters, ahem) something to speculate about.

I do hope that they've tested it as well. And I hope that testing extended beyond just Steve Jobs and a few choice employees. Unfortunately, I highly doubt that to be the case...

Nothing labeled -gate since Watergate has amounted to anything other than media froth.

What about angel-gate!??!?!

The only gate near and dear to our little HN hearts!

     (Other than nand-gates)

It probably uses something like robust hashing (technically impossible) along with metadata tags to match songs and artists.

I wonder what you think you mean by "technically impossible" since decently good audio matching software already exists (e.g. Shazam).

I mean it is formally impossible to generate a robust hash that cannot be attacked. In plain English: you'll get false positives. Shazam and the like are simply good enough, rather than correct.

I know the music industry was interested in robust hashes in order to identify songs on file sharing networks. It turns out to be technically impossible. The file sharers could trivially obfuscate files or generate false positives, which ended up being a legal liability for the industry.
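To make the false-positive point concrete, here is a deliberately simple robust hash. It is a made-up toy, not any real product's algorithm: it records only whether the average level rises or falls from one block of samples to the next. The sample lists are invented for the demo.

```python
def robust_hash(samples, blocks=8):
    """One bit per adjacent block pair: 1 if the block mean rises, else 0."""
    size = len(samples) // blocks
    means = [sum(samples[i * size:(i + 1) * size]) / size
             for i in range(blocks)]
    return tuple(1 if later > earlier else 0
                 for earlier, later in zip(means, means[1:]))


original = [0, 0, 1, 1, 3, 3, 2, 2, 5, 5, 4, 4, 6, 6, 1, 1]

# A quieter copy with a small offset still hashes the same (the "robust" part).
degraded = [0.5 * x + 0.01 for x in original]

# A completely different signal, built to follow the same rise/fall pattern,
# also hashes the same (the false-positive part).
unrelated = [10, 10, 20, 20, 90, 90, 15, 15, 16, 16, 2, 2, 7, 7, 0, 0]

print(robust_hash(original))
print(robust_hash(original) == robust_hash(degraded))
print(robust_hash(original) == robust_hash(unrelated))
```

Any hash that tolerates distortion has to map many different inputs to the same value, so someone who knows the scheme can construct a collision on purpose. Real fingerprints make this much harder than the toy does, but the trade-off is the same in kind.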

