As an example, notice that Wikipedia does not require people to send modified articles back to the Wikimedia Foundation, or to allow the Foundation to use the data as it sees fit (this is clause D. 3. d. in your license). They just require people to contribute under a copyleft license, and they can thus incorporate derivative versions published elsewhere if they want. This is nice because it ensures that the Wikipedia content can be useful even if the Wikimedia Foundation disappears.
Anyway, awesome work, congrats!
One possibility is that the database gets built on the backs of the users, and then the database owner slams the door and turns it proprietary, like the CDDB did.
Their stated goals don't lean that way, but I don't see a clause that either permits the database owner to do this or forbids it. Lawyers will be required to figure that out. I suspect there is an implicit ability for the grantor to terminate the license, but that is what lawyers are for.
(I did notice that the termination effects reference clauses that don't exist.)
Good luck though :/
It's probably the biggest thing to happen in this space since the release of the iPhone version of Shazam!
The main reason I use Shazam over SoundHound is simply that SoundHound doesn't have a Windows Phone 7 app.
It looks up the data (as it should), so you will rarely have to enter anything yourself. For horribly mistagged stuff it sometimes helps to add data for one track, but after that you can usually drag and drop the rest of the album.
MB is a bit of a monster. If you're tagging a lot of music you may want to install it locally. My experience with this is documented at http://acooke.org/cute/Installing3.html
Personally, I was not too impressed with MB itself - ended up using LastFM's API instead. But this was for generating playlists, not tagging.
Also, I understand JSON is very easy to use, etc. But those big dumps cry for a binary format. Or at least add zlib/lzma compression so people don't waste bandwidth on uuencoded binary data in JSON.
The code data is compressed using zlib (and then base64'd). It's all on S3; in our experience, big data dumps like this get relatively little traffic after the original hype dies down, so a torrent doesn't make sense. We're pretty sure Amazon can handle the load :)
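For anyone who wants to unpack those code strings, here is a minimal sketch of reversing the encoding described above (base64, then zlib) using only the Python standard library. The function names and the sample payload are made up for illustration, and accepting both the standard and the URL-safe base64 alphabet is an assumption, since the comment doesn't say which one the dump uses.

```python
import base64
import zlib


def decode_code(encoded: str) -> bytes:
    """Undo base64, then zlib, and return the raw bytes."""
    # Assumption: the dump may use either base64 alphabet, so map the
    # URL-safe characters back to the standard ones before decoding.
    normalized = encoded.replace("-", "+").replace("_", "/")
    # Restore any padding that was stripped from the end of the string.
    normalized += "=" * (-len(normalized) % 4)
    compressed = base64.b64decode(normalized)
    return zlib.decompress(compressed)


def encode_code(raw: bytes) -> str:
    """Inverse of decode_code: zlib-compress, then base64-encode."""
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


# Round trip on a hypothetical payload (not real fingerprint data).
sample = b"hypothetical fingerprint payload"
assert decode_code(encode_code(sample)) == sample
```

What the decompressed bytes actually mean (the layout of the fingerprint codes) isn't covered in this thread, so the sketch stops at the raw bytes.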
Distinction: we handle small pieces of audio from anywhere in the song (20s), ours works "over the air" via microphone, we have a huge database via our content partners.
The codegen is very different (instead of relying on echo nest chroma, it does its own onset detection) but the back end side is almost the same.
We are working on something that will benefit both parties (us and you guys) immensely. Is there a way to contact you? There's no info on your HN profile.
I think pHash also has functions for fingerprinting music but might not be as precise since pHash is not strictly focused on music.
Looks like their research has taken them a lot further!
All the fingerprint comprises is a bunch of abstract statistical data about the music; it's not qualitatively different from the output of, say, a visualisation algorithm in a music player. Just much more useful.
But then again, copyright law makes no fucking sense, so who knows.
In any case, even if this was technically a copy, I can't imagine this possibly failing the fair use test.
- game: a baseball game is not a "work" under the Berne convention
- cast credits: they are descriptive, but not derived from the movie.
One of the criteria to assess "fair use" is "was anything creative added in the process of making the derivative", which is by definition not true in the checksum or fingerprint case. I don't find your argument compelling - it's not an open-and-shut case.
But the tech looks incredible. Good work for releasing this!