
Ask HN: List of open/public databases - DanielBMarkham
A year or two ago somebody posted a list of open/free/accessible data sources for hackers to download and play around with. I thought this was a great resource, so I saved it, but heck if I can find it now.

Does anybody have such a list? Things like ZIP codes for the US, locations of Starbucks stores, current weather forecasts, a list of major newspapers, a list of publicly traded stocks, etc. I know there are tons of open/free databases waiting for us to mash up; I just can't seem to find a list of them.

EDIT: The goal is a downloadable chunk of data to mash up, reformat, and use. That means CSV/XML/etc. format and a public/anonymous FTP or something.
======
randomtask
You might find the datasets/opendata sub-reddits interesting.

<http://www.reddit.com/r/datasets> <http://www.reddit.com/r/opendata>

~~~
sidmitra
Delicious is a good place for the same, e.g.:

<http://delicious.com/sidmitra/datasets>

------
robin_reala
Freebase is a big one: <http://www.freebase.com/> .

Also, if you’re in the UK the government datasets might interest you:
<http://data.gov.uk/>

------
IgorPartola
<http://themoviedb.org>, <http://thetvdb.com/> and my own API on top of these:
<http://igorpartola.com/projects/disciddb>

------
brown9-2
<http://flowingdata.com/2009/10/01/30-resources-to-find-the-data-you-need/>

US Govt data on the stimulus:
<http://www.recovery.gov/FAQ/Pages/DownloadCenter.aspx>

------
DanielBMarkham
As I continue to research this today, I grow more and more amazed at the lack
of public data sets. There are what? A million different websites that keep
some kind of list of something? And I'm finding only a dozen or two databases
publicly accessible? It's crazy. Everybody either wants to charge or actively
wants to prevent you from getting their data -- even when the data is common
knowledge, was gathered at public expense, etc.

Anybody from Google on? There have to be some shortcuts here somewhere.

~~~
tszming
Amazon Public Data Sets: <http://aws.amazon.com/publicdatasets/>

It is fast and easy to use if you are already using EC2.

------
zackham
How about weather - I have been looking around for an API to access historical
weather data with no luck so far. Anyone familiar with the landscape of
weather data options?

~~~
joubert
<http://www.ncdc.noaa.gov/oa/mpp/freedata.html>

------
gspyrou
You may try taking a look at Microsoft Codename "Dallas"
<https://www.sqlazureservices.com/Catalog.aspx>

------
helwr
<http://www.datawrangling.com/some-datasets-available-on-the-web>

<http://archive.ics.uci.edu/ml/datasets.html>

<http://data.worldbank.org/developers>

------
joubert
I have links to a few open (gov) data sets at the top of <http://elev.at>

------
cullenking
We are running SRTM and NED elevation datasets, and have a web service
(sinatra FTW!) exposing the data. I set up a variety of interpolation
algorithms, which are fun to play with. If someone needs access to it and has
a cool project, we are happy to provide it. Get ahold of me via email for
info.
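
For anyone wondering what interpolating over a gridded elevation dataset can look like, here is a minimal sketch of bilinear interpolation, one common choice. It is not the code behind the service described above; the function name, the grid layout (a list of rows of elevations in meters), and the sample values are all assumptions made up for illustration.

```python
import math


def bilinear_elevation(grid, row, col):
    """Estimate the elevation at a fractional (row, col) position.

    `grid` is a hypothetical 2D list of elevation samples (at least 2x2).
    """
    # Find the top-left sample of the cell containing the point, clamped
    # so that the lower and right neighbours stay inside the grid.
    r0 = max(0, min(int(math.floor(row)), len(grid) - 2))
    c0 = max(0, min(int(math.floor(col)), len(grid[0]) - 2))

    # Fractional offsets of the point inside that cell.
    fr = row - r0
    fc = col - c0

    # Interpolate along the columns on the two bounding rows,
    # then interpolate between those two results along the rows.
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


# Made-up 2x2 tile of elevations in meters.
tile = [
    [100.0, 110.0],
    [120.0, 140.0],
]
center = bilinear_elevation(tile, 0.5, 0.5)
```

Real SRTM or NED tiles would also need handling for missing samples and for mapping latitude/longitude to grid indices, which this sketch leaves out.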

The SRTM data is < 20 gigs compressed, and I downloaded it from some source,
can't find the link now. It was a SLOWWWWWW download, so took many days.
Reminded me of my old 2400 baud modem.

The NED dataset is just shy of 500 gigs. We shipped a drive off to the USGS
and they populated it and shipped it back. The 2-month wait sucked, but it was
the only way to get it without paying a thousand bucks from a private company.

I am happy to share either the SRTM or the NED dataset we have, but sharing it
is pretty much limited to shipping hard drives around :(

------
iamelgringo
<http://infochimps.org/> <http://www.data.gov/> <http://data.gov.uk/>

------
truiu
The openstreetmap project lets you download its whole dataset:
<http://wiki.openstreetmap.org/wiki/Planet>

They also have an API; see the wiki.

------
lpolovets
www.factual.com has a lot of data, much of which is freely downloadable.
There's also an API for data access.

Examples:

zip codes in the US: <http://www.factual.com/t/MwwkkU/US_Zip_Codes>

stock symbols: <http://www.factual.com/t/jyqdWC/market_symbols>

I work at Factual, so if you have feature requests/comments/complaints please
let me know! I'm leo at factual dot com

------
vili
You can get all of Wikipedia here:
<http://en.wikipedia.org/wiki/Wikipedia_database>

------
what
<http://dataincubator.org/>

Datasets about some Canadian cities:

<http://www.toronto.ca/open/datasets/web-map-services/>

<http://data.edmonton.ca/>

<http://data.vancouver.ca/>

------
lostbit
The lottery in Brazil (MegaSena) allows download of all results ever:
<http://www1.caixa.gov.br/loterias/loterias/megasena/download.asp>

------
Cherian
Stack Overflow (includes sister sites) dump:
<http://blog.stackoverflow.com/2010/07/creative-commons-data-dump-jul-10/>

------
warp
MusicBrainz has PostgreSQL dumps available (partly public domain, partly
cc-by-nc-sa):

<http://musicbrainz.org/doc/MusicBrainz_Database>

------
cjoh
<http://data.sunlightlabs.com> <http://services.sunlightlabs.com>

------
tszming
Open Directory RDF Dump (used by Google Directory, for example)
<http://rdf.dmoz.org/>

------
tomhaney
Try <http://www.programmableweb.com/>

------
binarymax
<http://data.worldbank.org/>

------
kinlane
A list of several data sources and tools to work with the data:
<http://www.kinlane.com/data/>

