A year or two ago somebody posted a list of open/free/accessible data sources for hackers to download and play around with. I thought this was a great resource, so I saved it, but heck if I can find it now.
Does anybody have such a list? Things like zip codes for the US, locations of Starbucks stores, current weather forecasts, a list of major newspapers, a list of publicly traded stocks, etc. I know there are tons of open/free databases waiting for us to mash up, I just can't seem to find a list of them.
EDIT: The goal is a downloadable chunk of data to mash up, reformat, and use. That means a CSV/XML/etc. format, served from a public/anonymous FTP or something similar.
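To make "mash up" concrete, here is a minimal sketch of what I mean, assuming you already have two CSV files in hand. The data below is made up and inlined as strings so the example runs on its own; the column names (`zip`, `city`, `store_id`) and the function names are hypothetical, not from any real dataset.

```python
import csv
import io

# Made-up sample rows standing in for two downloaded CSV files.
ZIPS_CSV = """zip,city,state
98101,Seattle,WA
10001,New York,NY
"""

STORES_CSV = """store_id,zip
1,98101
2,98101
3,10001
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def stores_per_city(zips_text, stores_text):
    """Join stores to cities on zip code and count stores per city."""
    city_by_zip = {row["zip"]: row["city"] for row in load_rows(zips_text)}
    counts = {}
    for row in load_rows(stores_text):
        # Zip codes missing from the lookup table are grouped as "unknown".
        city = city_by_zip.get(row["zip"], "unknown")
        counts[city] = counts.get(city, 0) + 1
    return counts


if __name__ == "__main__":
    print(stores_per_city(ZIPS_CSV, STORES_CSV))
```

With real files you would swap the inline strings for `open(path, newline="")`, but the join itself would look about the same.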
As I continue to research this today, I grow more and more amazed at the lack of public data sets. There are what? A million different websites that keep some kind of list of something? And I'm finding only a dozen or two databases publicly accessible? It's crazy. Everybody either wants to charge or actively wants to prevent you from getting their data -- even when the data is common knowledge, was gathered at public expense, etc.
Anybody from Google on? There have to be some shortcuts here somewhere.
How about weather - I have been looking around for an API to access historical weather data with no luck so far. Anyone familiar with the landscape of weather data options?
We are running the SRTM and NED elevation datasets, and have a web service (Sinatra FTW!) exposing the data. I set up a variety of interpolation algorithms, which are fun to play with. If someone needs access to it and has a cool project, we are happy to provide it. Get ahold of me via email for info.
The SRTM data is < 20 gigs compressed. I downloaded it from some source, but can't find the link now. It was a SLOWWWWWW download, so it took many days. Reminded me of my old 2400-baud modem.
The NED dataset is just shy of 500 gigs. We shipped a drive off to the USGS and they populated it and shipped it back. The two-month wait sucked, but it was the only way to get it without paying a private company a thousand bucks.
I am happy to share either the SRTM or the NED dataset we have, but sharing it is pretty much limited to shipping hard drives around :(
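For anyone wondering what interpolating over gridded elevation data looks like, here is a minimal sketch of bilinear interpolation, one common choice for this kind of data. It is not the code behind the web service mentioned above; the function name, the tiny sample grid, and the fractional row/column indexing are all assumptions for illustration.

```python
import math


def bilinear(grid, row, col):
    """Estimate a value at fractional (row, col) in a 2D grid of samples.

    Assumes grid is a list of equal-length rows with at least 2 rows and
    2 columns, and that row/col are fractional indices into that grid.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if not (0 <= row <= n_rows - 1 and 0 <= col <= n_cols - 1):
        raise ValueError("point lies outside the grid")

    # Top-left corner of the enclosing cell; clamp so the last row/column
    # still has a neighbour below and to the right.
    r0 = min(int(math.floor(row)), n_rows - 2)
    c0 = min(int(math.floor(col)), n_cols - 2)
    fr = row - r0  # fractional distance down the cell, 0..1
    fc = col - c0  # fractional distance across the cell, 0..1

    # Interpolate along the top and bottom edges, then between them.
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


if __name__ == "__main__":
    # Made-up elevations in metres on a 2x2 grid.
    sample = [[100.0, 110.0],
              [120.0, 130.0]]
    print(bilinear(sample, 0.5, 0.5))
```

Real SRTM or NED tiles would also need a step to convert latitude/longitude into grid indices and to handle missing-data values, which this sketch leaves out.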