Ask HN: I have 89,830 sound clips of rare records. What should I do?
155 points by kumar303 on Dec 3, 2013 | 57 comments
I collect records. I rarely buy anything blindly by name or label; I like to discover new stuff by listening to it first. I usually do this by spending hours in grimy bins with a record player, but I also found that sometimes eBay sellers post sound clips of their records. Specifically, what I noticed is that sellers are more inclined to make the effort of posting a sound clip if the record is rare -- i.e. you can't find a clip elsewhere -- and the record sounds really interesting. It helps them make a sale.

So I made an eBay API spider that gets all the sound clips and record release data. It's been running like a dream since 2007 and I currently have 89,830 sound clips of rare records (probably some dupes in there). I used to watch the feed all the time but I kept finding really cool records and was spending a lot of money! I got busy. I had a good test suite, so anytime there was a Unicode error or some bug I patched it pretty quick.

I steadily accumulated a pretty amazing collection of sound clips seeded by these search terms: soul, funk, reggae, ska, country, breaks, disco, psych, afrobeat, jazz, rocksteady, garage, indie, library (as in library music), new wave, electronic, brazilian, and boogie.

I made a frontend for listening to clips all Ajaxy-like and got it working on mobile phones and major browsers. I was sort of happy with it but ran into some database bottlenecks and got busy again so I never launched it. It was too slow to be usable.

What should I do? I don't really have time or money to finish it out but I'd like people to use it to discover music. Since the clips are short and will link to music for purchase, either MP3 (if it exists) or original vinyl on eBay/Discogs, there shouldn't be any copyright problems. It's fair use to promote music for sale with a clip.

You can comment here or reach me by email: kumar.mcmillan@gmail.com

Get in touch with the Internet Archive, perhaps?


Definitely do this. They will host it so that others can access it freely and forever and they'll also make a torrent for it. Just get in touch with somebody who works with the collections or feel free to PM me.

Specifically, Brewster Kahle! He's awesome.

You can find his email address here: http://brewster.kahle.org/about/

Cool, thanks for the info! Amazon S3 costs are getting high, so it would be nice to get some relief.

What are you spending on S3? I can very nearly guarantee you could get a bare metal setup from just about anywhere for about 1/10th the cost.

He has now responded to this post


Dammit. After posting, I caught up with the feed of clips and just bought six records :( It turns up records like this: http://www.youtube.com/watch?v=zCC0wgVXNpc Wat??

Mate, I think there is a good reason this particular record is rare... :p

What?! That's a damn fine record.

I wasn't feelin' the vocals. Just making fun though - keep rockin' if you enjoy it.

Dammit. After listening, currently on repeat.

Same here

Dammit, I'm so eager to see the whole collection now!

Please find a way to make it accessible

I have a sign-up thing that collects emails in case I ever launch it: http://recordbox.com/ As I said, it has some speed issues.

omg that one is amazing, please post more!

The Internet Archive can provide a home for the files; custom UIs would come from others and could be fun. info@archive.org

-brewster Digital Librarian

Seed it on the Pirate Bay?

+1 to this. The least work for uploader, the least work for the downloader to get them all.

Please do this

You should team up with http://echonest.com/ they have the team and the experience and could probably help you do something cool with it.

Given the seed terms, I'm imagining an infinite dub mix, where you pick and mutate beats and fade clips into each other. Too bad our legal system makes such a thing impossible.

Another vote for this option. These guys are ace.

Hah. That would last for days on end.

I second contacting them. They hope to build a more expansive public database of song metrics but need the community to help contribute http://echoprint.me/data -- your collection would add 0.5x the number of tracks. It's probably not the type of stuff people will be using over-the-air (OTA) detection for, but it's still worthwhile to support their effort. Considering what you have, I would think/hope that they would be happy to ingest the data with little work required from you. On the other hand, they may tell you that, due to the antique nature of your collection, they would rather not "pollute" the DB, since OTA detection often comes with false-positive issues.

Unrelated but since you're into rare/old/obscure recordings, you might appreciate the Cylinder Recording collection at UCSB.


As an aside, my university really needs to advertise this more. I first heard of it in passing in http://hummingadifferenttune.blogspot.com/2009/03/chimes-of-... .

Not long after that, like all good things, it kept coming up in my general electives.

Have you checked to see if they're on http://what.cd already? It's the de facto BitTorrent tracker for music. It is possible that many of the clips you have are already on there.

And if not, you could seed them yourself :D

I like where your head is at, but come on man, first rule.

I'm letting him go on this one. :)

He's only collecting sound clips, not the complete record/album. You are not allowed to upload these to wcd ;)


Asking for invites in public isn't allowed. Read http://www.whatinterviewprep.com/

But how does one get in? O.o

Wow, talk about an inflated sense of self-importance.

Is that site just an elaborate joke?

No, it's not. But that discussion is a bit off topic here.

Give the files to the Internet Archive (www.archive.org), definitely! They'll host it all for free, and make a collection out of it. E-mail them directly or contact Jason Scott (@textfiles on Twitter) who manages a preservation group for them called Archive Team.

Find your favorite DJ and send it to them. DJs are always looking for exotic sounds to add to their mixes.

He is a DJ :P

But is he his own favourite DJ?

Every DJ is their own favourite DJ.

Donate them to http://freesound.org/

Copyright prevents OP from doing so.

Sound clips are priceless for music producers if they are in a good format (320 kbps mp3 or lossless). Well, if they have no license which prohibits their usage. But music producers are not really "careful" about this, haha.

Hi, we are interested: www.musicinfo.io. Our database is huge and it's free to use.

Why not open source it and share it with the world? You can whack some advertising on the site and make some nice income. Obviously, just be careful of copyright.

I'm sending you an email. I know someone at IMMuB (Brazilian Musical Memory Institute) [1] that will be interested in the Brazilian clips you have.

Thanks for posting!

[1] http://www.memoriamusical.com.br/ (in Portuguese)

I collect records too...this sounds incredible, but it would bankrupt me within twelve months...;)

I run BitShuva.com and we build custom internet radio stations (think Pandora clones). If copyright isn't an issue, I'd gladly put a custom radio station up for you. Shoot me an email, contact@bitshuva.com

Send to Wikimedia.

No, it's better for Archive.org. The Wikimedia Foundation mostly deals with Creative Commons licensed content.

I think it could provide an interesting way for producers to discover unique and interesting samples to use in their tracks.

It totally would! Licensing is pretty tricky with this sort of thing though. Oftentimes there's some mogul who owns the rights but they won't budge on a price. Then if you use the sample anyway, they'll sue you.

This is not my home turf, but this is beautiful and exciting. Is your ajaxy frontend in a state worth open sourcing?

Soundcloud, please.

Bung them up online so DJs can have 'em.

Put it all up on hearo.fm!

You should go dancing! No, seriously, archive.org that stuff for posterity.
