Ask HN: List of open/public databases
64 points by DanielBMarkham on July 7, 2010
A year or two ago somebody posted a list of open/free/accessible datasources for hackers to download and play around with. I thought this was a great resource, so I saved it, but heck if I can find it now.

Does anybody have such a list? Things like zip codes for the US, locations of Starbucks stores, current weather forecasts, a list of major newspapers, a list of publicly traded stocks, etc. I know there are tons of open/free databases waiting for us to mash up, I just can't seem to find a list of them.

EDIT: The goal is a downloadable chunk of data to mash up, reformat, and use. That means CSV/XML/etc. format and a public/anonymous FTP or something.
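
A minimal sketch of the kind of mashup meant here, assuming two CSV dumps that share a zip-code column. The sample rows, the column names, and the `stores_per_city` helper are all made up for illustration and do not come from any real dataset:

```python
import csv
import io
import json

# Hypothetical sample data standing in for two downloaded CSV files.
ZIPS_CSV = """zip,city,state
94103,San Francisco,CA
10001,New York,NY
"""

STORES_CSV = """store_id,zip
1,94103
2,94103
3,10001
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def stores_per_city(zips_text, stores_text):
    """Join stores to cities on the zip column and count stores per city."""
    city_by_zip = {
        row["zip"]: (row["city"], row["state"]) for row in load_rows(zips_text)
    }
    counts = {}
    for row in load_rows(stores_text):
        city = city_by_zip.get(row["zip"])
        if city is None:
            continue  # skip stores whose zip code is not in the lookup table
        key = f"{city[0]}, {city[1]}"
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    # Reformat the joined result as JSON.
    print(json.dumps(stores_per_city(ZIPS_CSV, STORES_CSV), indent=2))
```

With real downloads you would read the files from disk instead of the inline strings; the join logic stays the same.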

You might find the datasets/opendata sub-reddits interesting.

http://www.reddit.com/r/datasets http://www.reddit.com/r/opendata

Delicious is a good place for the same. Eg.


Freebase is a big one: http://www.freebase.com/ .

Also, if you’re in the UK the government datasets might interest you: http://data.gov.uk/

As I continue to research this today, I grow more and more amazed at the lack of public data sets. There are what? A million different websites that keep some kind of list of something? And I'm finding only a dozen or two databases publicly accessible? It's crazy. Everybody either wants to charge or actively wants to prevent you from getting their data -- even when the data is common knowledge, was gathered at public expense, etc.

Anybody from Google on? There have to be some shortcuts here somewhere.

Amazon Public Data Sets: http://aws.amazon.com/publicdatasets/

It is fast and easy to use if you are already using EC2.

How about weather - I have been looking around for an API to access historical weather data with no luck so far. Anyone familiar with the landscape of weather data options?

You could take a look at Microsoft Codename "Dallas": https://www.sqlazureservices.com/Catalog.aspx

I have links to a few open (gov) data sets at the top of http://elev.at

We are running SRTM and NED elevation datasets, and have a web service (sinatra FTW!) exposing the data. I set up a variety of interpolation algorithms, which are fun to play with. If someone needs access to it and has a cool project, we are happy to provide it. Get hold of me via email for info.
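
For anyone curious what interpolating gridded elevation data looks like, here is a minimal sketch of bilinear interpolation, one common choice for this kind of raster. It is not the code running behind the service; the `bilinear` function and the tiny sample grid are hypothetical, and the grid is assumed to be a plain list of rows of elevation values:

```python
import math


def bilinear(grid, row, col):
    """Interpolate a 2D grid of elevations at a fractional (row, col) position.

    Assumes `grid` is a rectangular list of lists with at least 2 rows and
    2 columns, and that (row, col) lies inside the grid.
    """
    max_r = len(grid) - 1
    max_c = len(grid[0]) - 1
    if max_r < 1 or max_c < 1:
        raise ValueError("grid must be at least 2x2")
    if not (0 <= row <= max_r and 0 <= col <= max_c):
        raise ValueError("point lies outside the grid")

    # Upper-left corner of the cell containing the point, clamped so that
    # points on the last row or column still have a neighbour to blend with.
    r0 = min(int(math.floor(row)), max_r - 1)
    c0 = min(int(math.floor(col)), max_c - 1)
    r1, c1 = r0 + 1, c0 + 1

    # Fractional offsets of the point within the cell.
    fr = row - r0
    fc = col - c0

    # Blend along columns on the two bounding rows, then blend between rows.
    top = grid[r0][c0] * (1 - fc) + grid[r0][c1] * fc
    bottom = grid[r1][c0] * (1 - fc) + grid[r1][c1] * fc
    return top * (1 - fr) + bottom * fr


if __name__ == "__main__":
    sample = [[100.0, 200.0], [300.0, 400.0]]  # made-up elevations in metres
    print(bilinear(sample, 0.5, 0.5))
```

Nearest-neighbour lookup is cheaper and bicubic is smoother; bilinear sits in between, which is why it makes a reasonable first thing to try.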

The SRTM data is < 20 gigs compressed. I downloaded it from some source, but can't find the link now. It was a SLOWWWWWW download, so it took many days. Reminded me of my old 2400 baud modem.

The NED dataset is just shy of 500 gigs. We shipped a drive off to the USGS and they populated it and shipped it back. The 2 month wait sucked, but it was the only way to get it without paying a thousand bucks to a private company.

I am happy to share either the SRTM or the NED dataset we have, but sharing it is pretty much limited to shipping hard drives around :(

The openstreetmap project lets you download its whole dataset: http://wiki.openstreetmap.org/wiki/Planet

They also have an API; see the wiki.

www.factual.com has a lot of data, much of which is freely downloadable. There's also an API for data access.


zip codes in the US: http://www.factual.com/t/MwwkkU/US_Zip_Codes

stock symbols: http://www.factual.com/t/jyqdWC/market_symbols

I work at Factual, so if you have feature requests/comments/complaints please let me know! I'm leo at factual dot com

You can get all of Wikipedia here: http://en.wikipedia.org/wiki/Wikipedia_database

The lottery in Brazil (MegaSena) allows download of all results ever: http://www1.caixa.gov.br/loterias/loterias/megasena/download...

Stack Overflow (includes sister sites) dump: http://blog.stackoverflow.com/2010/07/creative-commons-data-...

MusicBrainz has PostgreSQL dumps available (partly public domain, partly CC BY-NC-SA):


Open Directory RDF dump (used by Google Directory, for example): http://rdf.dmoz.org/

A list of several data sources and tools to work with the data: http://www.kinlane.com/data/

