"We’ve built a very simple application on our labs website where you can listen to music and help us improve the state of the art in tempo estimation."
No. We are giving you data that you use to improve your tempo estimation algorithm. If you wanted to improve the state of the art, you would share the data you collect.
In particular, you note the MIREX competition (http://www.music-ir.org/mirex) earlier in your post. Share your data with the MIREX competition, and make it open for use by others, if you want to improve the state of the art.
Given that there is no stated end date to this experiment, when exactly will you post the data? ("Once it's all in" doesn't say anything.)
Will you post all of the data collected, in its original, unadulterated form, for free, unrestricted download by anyone for any purpose, including competing with last.fm?
Intending to share and actually making a public promise to do so are quite different things.
The exact details are out of my hands -- I'm just a tech guy -- but we're active members of the music information retrieval community and always have been.
No one even knows yet whether it's possible to crowdsource good enough BPM data like this, so even demonstrating that it's feasible would be progress :-)
I sampled enough of the songs to get about 500 points (20-30 minutes), and they really need to increase the variety of samples. Most of the stuff was older pop hits ('50s pop), some country, one rock song (Disturbed - Ten Thousand Fists), and a couple of old rap songs.
I'm used to listening to metal, industrial, hardstyle, house, etc. All my answers are going to be biased towards slow, because even though a song may be at 120 BPM, it still feels slow to someone who normally listens to things closer to the 150-200 BPM range.
A more varied sample of test songs will produce better results.
I haven't done the test, since I'm at work, but not having any concrete evidence never stopped me from having an opinion.
It may be that their algorithm is good enough to find highly-periodic "pulses" that are candidates to be considered as a beat. However, believe it or not, the beat is really a matter of individual perception (although people agree in many/most cases).
To me, this is most obvious in music typical of the "power metal" genre. This music frequently is annotated as "double-time feel", and its drumming alone would tend to indicate a beat twice as fast as what would be indicated if you concentrate on the vocals and other instruments. Thus you've got an ambiguity.
Training it so that it can decide whether people tend to perceive the doubled beat or the slower one might be what they're after.
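To make that concrete, here is a minimal sketch of how one might let listeners vote between the doubled and the slower beat. It assumes the algorithm has already found a periodic pulse, and that each listener's tapping has been reduced to a single BPM value. The function names and the sample numbers are made up for illustration; nothing here is known to be what last.fm actually does.

```python
import math
from collections import Counter

def nearest_level(tapped_bpm, pulse_bpm, levels=(0.5, 1.0, 2.0)):
    """Return the multiple of pulse_bpm closest to tapped_bpm.

    Distance is measured on a log2 scale, so being off by a factor
    of two counts the same in either direction.
    """
    return min(levels, key=lambda m: abs(math.log2(tapped_bpm / (pulse_bpm * m))))

def perceived_tempo(taps, pulse_bpm):
    """Majority vote over which metrical level listeners tapped along to."""
    votes = Counter(nearest_level(t, pulse_bpm) for t in taps)
    level, _ = votes.most_common(1)[0]
    return pulse_bpm * level, dict(votes)

# Hypothetical example: the drums suggest 180 BPM, but most
# listeners tap along with the vocals at roughly half that.
tempo, votes = perceived_tempo([88, 90, 178, 91, 180, 89], pulse_bpm=180)
print(tempo, votes)
```

The vote counts are arguably more useful than the winner: a near-even split would say the song is genuinely ambiguous rather than that one answer is right.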
Also, it's common in progressive rock and progressive metal to have frequent changes in time signature. When the beat is changing, what do you say the beat is? They might also be looking for a way to make this choice.
Of course, the genres of music that you mentioned tend not to have those kinds of variations, so maybe I'm just blowing smoke.
It would also be interesting to see if people can tell apart very close BPM differences. If you play a song that is 110 BPM and then one that is 120 BPM, and someone can reliably say which is faster, then your actual beat measuring becomes a lot less important. You can just build off a bunch of known values and have everything else relative to that.
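A rough sketch of that idea, assuming you have a few reference songs with known BPM and listeners who only answer "faster" or "slower" for the unknown song against each reference. The function name and the data are hypothetical, and this only works if the answers are reliable, which is exactly the open question.

```python
def bracket_tempo(judgments):
    """Bracket an unknown tempo from relative comparisons.

    judgments: list of (reference_bpm, answer) pairs, where answer is
    "faster" or "slower", meaning the unknown song was judged faster
    or slower than that reference.
    Returns (lower, upper); either bound is None if nothing constrains it.
    """
    lower = max((bpm for bpm, a in judgments if a == "faster"), default=None)
    upper = min((bpm for bpm, a in judgments if a == "slower"), default=None)
    return lower, upper

# Hypothetical answers for one unknown song.
answers = [(100, "faster"), (110, "faster"), (130, "slower"), (120, "slower")]
print(bracket_tempo(answers))
```

If the lower bound comes out above the upper bound, the answers contradict each other, which would itself be a sign that listeners can't reliably separate tempos that close.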
Even with the app it seems really hard to find the beat. It's really easy to find the bass hit though, so I wouldn't be surprised if a lot of the tempos are off by 1/2 or 1/4.
Yes, definitely. That's part of what I was getting at with my reply elsewhere in this thread (http://news.ycombinator.com/item?id=2540866). The perceived beat can actually be different depending on who is listening, and what they're focusing on.