Ask HN: Where do you get data?
6 points by roundsquare on Sept 4, 2009 | 7 comments
I am interested in playing around with popularity algorithms, graph theory algorithms, trying out some map coloring algorithms, etc...

All of these require data though, and I'm not sure where to get it. So I was wondering if people on HN have good places to get data.




Usually the governing body that oversees an industry will have tons of data. In the US there is a governing body for almost everything, either a government department or a similar organization. I'm not sure what type of data you need, but here's where I have gotten data to work with:

Population data - US Census (http://www.census.gov/main/www/access.html)

Food data - USDA (http://www.ars.usda.gov/Services/docs.htm?docid=8964)

SAT stats - College Board (http://professionals.collegeboard.com/data-reports-research/...)

I could go on and on... Try a search for "{subject} statistics" and you should get some ideas.


Amazon offers public datasets at http://aws.amazon.com/publicdatasets/

They include census data, as well as a dump of Freebase's data (which mirrors things like Wikipedia). They also have some genome data.


This is a good place to start: http://theinfo.org/get/data


I treat the www as one giant database...

The nice thing about that is that it is all real world data, so quite messy. That gives you a good grip on what it takes to massage data before it becomes usable, and it also gives you some control over the level of abstraction.
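
To make the "massaging" step concrete, here is a minimal sketch, assuming the page you scraped holds a simple two-column HTML table. The sample markup, the city names, and the helper names (TableRows, to_int, parse_population_table) are all made up for illustration; real pages will need their own handling. It uses only Python's standard-library html.parser.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace left over from page layout.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def to_int(text):
    """Turn '1,234' into 1234; return None for blanks and placeholders."""
    digits = text.replace(",", "").strip()
    return int(digits) if digits.isdigit() else None


def parse_population_table(html):
    """Return (name, count) pairs, skipping the header row."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [(name, to_int(count)) for name, count in body]


# Hypothetical scraped markup, not taken from any real site.
SAMPLE = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td> Springfield </td><td>1,234</td></tr>
  <tr><td>Shelbyville</td><td>n/a</td></tr>
</table>
"""

records = parse_population_table(SAMPLE)
```

Keeping missing values as None rather than 0 is a deliberate choice here, so that gaps in the source stay visible to whatever algorithm you feed the records into.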



I asked a similar question here recently and got a number of valuable links; see the thread here:

http://news.ycombinator.com/item?id=764982


Wow... boatloads of data. Thanks all!



