
Where can I get a data dump of all HN comments? - 1123581321
I'm looking for a data dump of HN comments. It looks like Hacker News API[0] had one and disabled it. I haven't been able to find any other. It doesn't really need to be up to date -- anything in the last ~3 years would be fine.

[0] http://api.ihackernews.com/
======
sheff
Instead of scraping HN directly, have you looked at the HN Search API from
the old HN search engine?
([https://www.hnsearch.com/api](https://www.hnsearch.com/api))

The current search engine looks like it also has an API
([https://hn.algolia.com/api](https://hn.algolia.com/api)). Presumably they
have a full dataset on hand, which you may be able to get if you contact them.

------
acosmism
Scrape it

~~~
ivan_ah
Surely there's a better way than hitting the HN servers O(7169040) times...

Does anyone have an old scrape they can post somewhere or as a torrent? Note,
for my purposes bag-of-words (i.e. word counts) would be enough.
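
To make "bag-of-words" concrete, here is a minimal sketch of the kind of word count I mean. It assumes the comments are already available as a list of raw strings (possibly still carrying HTML entities and `<p>` tags); `bag_of_words` and `sample` are hypothetical names for illustration, not part of any existing dump or API.

```python
import re
from collections import Counter
from html import unescape


def bag_of_words(comments):
    """Return a Counter of lowercase word frequencies across all comments."""
    counts = Counter()
    for text in comments:
        # Decode HTML entities such as &#x27; and replace tags such as <p> with spaces.
        plain = re.sub(r"<[^>]+>", " ", unescape(text))
        # Treat runs of letters, digits, and apostrophes as words.
        counts.update(re.findall(r"[a-z0-9']+", plain.lower()))
    return counts


# Hypothetical sample input standing in for a real comment dump.
sample = [
    "I&#x27;m looking for a data dump of HN comments.",
    "Scrape it",
    "Does anyone have an old scrape?<p>Word counts would be enough.",
]
counts = bag_of_words(sample)
print(counts.most_common(3))
```

The tokenizer here is deliberately crude; word order is discarded, which is all a bag-of-words model needs.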

~~~
tptacek
I think you just suggested that scraping HN was a constant time operation.

