
Ask HN: Is there a database dump of HN? - daenz
I'm interested in running some analysis on posts/comments, and I would prefer not to scrape the site with a bot. Does HN offer a periodic database dump? If not, can it be considered? I'm sure I'm not the only one who would be interested in it.
======
verdverm
BigQuery has a copy you can run queries against. It's one of the shared
datasets they provide.

~~~
daenz
This seems to be the dataset you are referring to
[https://console.cloud.google.com/marketplace/details/y-combi...](https://console.cloud.google.com/marketplace/details/y-combinator/hacker-news?pli=1)

Hasn't been updated in a while, but it's better than nothing.

~~~
harrisreynolds
It is fairly up to date. Last stories were added on 3/3/2020 and today is
3/31/2020.

Not sure how frequently it updates... maybe monthly?

See screenshot here.

[https://www.dropbox.com/s/6h7qk04s0b4mggj/Screenshot%202020-...](https://www.dropbox.com/s/6h7qk04s0b4mggj/Screenshot%202020-03-31%2014.09.14.png?dl=0)
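
If you'd rather check the freshness yourself than rely on a screenshot, something like the following might work. It is only a sketch: the table name `bigquery-public-data.hacker_news.full` and the `type` / `timestamp` columns are my assumptions about the public dataset, and `run_query` needs the `google-cloud-bigquery` package plus Google Cloud credentials.

```python
# Assumption: the public HN table lives at this path and has
# `type` and `timestamp` columns. Verify in the BigQuery console first.
HN_TABLE = "bigquery-public-data.hacker_news.full"


def latest_story_query(table: str = HN_TABLE) -> str:
    """Build a SQL string that asks for the newest story timestamp."""
    return (
        "SELECT MAX(`timestamp`) AS latest_story "
        f"FROM `{table}` "
        "WHERE type = 'story'"
    )


def run_query(sql: str) -> list:
    """Run the SQL on BigQuery and return the rows as plain dicts."""
    # Imported lazily so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default project and credentials
    return [dict(row.items()) for row in client.query(sql).result()]
```

Calling `run_query(latest_story_query())` should return a single row with the most recent story timestamp, which tells you how stale the copy is.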

~~~
daenz
Oh cool, thanks. I misinterpreted the "last updated" date on the BigQuery
dataset page.

------
itronitron
hn.algolia.com has an API for getting JSON records...
[https://hn.algolia.com/api](https://hn.algolia.com/api)
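
For a rough idea of what pulling JSON records from that API looks like, here is a minimal standard-library sketch. The helper names (`build_search_url`, `summarize_hits`, `fetch`) are made up for illustration, and the response fields used (`hits`, `objectID`, `title`) are what I'd expect from the search endpoint, so check them against the API page before relying on this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://hn.algolia.com/api/v1"


def build_search_url(query, tags="story", page=0, by_date=False):
    """Build a search URL; by_date=True uses the date-sorted endpoint."""
    endpoint = "search_by_date" if by_date else "search"
    params = urlencode({"query": query, "tags": tags, "page": page})
    return f"{API_BASE}/{endpoint}?{params}"


def summarize_hits(payload):
    """Reduce a decoded search response to (objectID, title) pairs."""
    return [
        (hit.get("objectID"), hit.get("title"))
        for hit in payload.get("hits", [])
    ]


def fetch(url, timeout=10):
    """Download a URL and decode the JSON body (makes a network request)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Usage would be along the lines of `summarize_hits(fetch(build_search_url("database dump")))`. Results are paginated, so for bulk analysis you would have to loop over `page`, and the BigQuery copy is probably the better fit for that.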

