
Ask HN: Can I download my HN data? - conroy
GitHub recently announced the ability to download all your GitHub data[0]. Facebook, Twitter, Instagram, and Google all offer similar services. Is there a way to do this for my comments, votes, and submissions on Hacker News?

[0] https://blog.github.com/2018-12-19-download-your-data/
======
tlrobinson
FYI, it's from 2015, but there's a dump in the BigQuery public datasets:
[https://bigquery.cloud.google.com/dataset/fh-bigquery:hackernews?pli=1](https://bigquery.cloud.google.com/dataset/fh-bigquery:hackernews?pli=1)
(along with a bunch of other interesting things)

------
nicolashahn
I wrote a small Python library to assist in querying the HN search API a
little while back:

[https://github.com/nicolashahn/py-search-hn](https://github.com/nicolashahn/py-search-hn)

I also have an example of how to get all of a single user's comments:

[https://github.com/nicolashahn/py-search-hn/blob/master/get_user_comments.py](https://github.com/nicolashahn/py-search-hn/blob/master/get_user_comments.py)

You could use a similar method for submissions. I'm not sure how getting all
the stories or comments you've voted on would work, however.
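
For anyone who would rather not pull in a library, here is a minimal sketch
that talks to the public HN Search (Algolia) endpoint directly. It is not the
code from py-search-hn. The function names (`build_url`, `fetch_json`,
`iter_items`) are made up for illustration, and it assumes the
`search_by_date` endpoint keeps returning results newest first with a
`created_at_i` timestamp on every hit.

```python
import json
import urllib.parse
import urllib.request

# Public HN Search (Algolia) endpoint; results are sorted newest first.
SEARCH_URL = "https://hn.algolia.com/api/v1/search_by_date"


def build_url(username, kind="comment", before=None, hits_per_page=100):
    """Build a search URL for one user's items of a given kind.

    `kind` is "comment" or "story". `before` is a Unix timestamp; when set,
    only items created strictly earlier than it are requested.
    """
    params = {
        "tags": f"{kind},author_{username}",
        "hitsPerPage": hits_per_page,
    }
    if before is not None:
        params["numericFilters"] = f"created_at_i<{before}"
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def fetch_json(url):
    """Fetch a URL and decode its JSON body (performs a network request)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_items(username, kind="comment", fetch=fetch_json):
    """Yield every hit for a user, walking backwards in time.

    `fetch` is injectable so the paging logic can be exercised offline.
    """
    before = None
    while True:
        hits = fetch(build_url(username, kind, before)).get("hits", [])
        if not hits:
            return
        yield from hits
        # Continue from the oldest item seen so far.
        before = min(hit["created_at_i"] for hit in hits)
```

Something like `list(iter_items("conroy"))` should collect the comments, and
`kind="story"` the submissions. Paging by timestamp instead of page number
sidesteps the search API's cap on how deep you can page, at the cost of
possibly skipping an item that shares its exact timestamp with the last hit of
a page. Votes are not exposed by this endpoint.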

------
hoppelhase
The stuff you can't scrape is the interesting stuff.

~~~
sgillen
What stuff can you not scrape?

~~~
hoppelhase
Sessions, IP addresses, and user agents associated with oneself.

~~~
sigjuice
Do we even know whether HN stores this information?

~~~
hoppelhase
Well, no.

------
beokop
If you’re in the EU you can refer to the GDPR and request all the data they
have on you.

~~~
yorwba
Sure you can, but HN is not GDPR-compliant, so you'll just get pointed at the
public API.

~~~
twtw
How can HN just choose to be noncompliant? Aren't there penalties?

I don't understand the jurisdiction of GDPR very well, but I thought it
applied to all EU users.

~~~
beokop
It does apply to all EU users and they risk fines if they don’t comply.

~~~
mattr47
It does not apply as HN is not in the EU. If an EU citizen does not want their
data collected then they can choose not to participate, as the EU has no
jurisdiction over HN.

~~~
mdekkers
That isn't how the GDPR works. There are many GDPR primers on the web; here is
a random one:
[https://www.recode.net/2018/5/16/17360944/gdpr-us-business-eu-european-union-data-protection-privacy](https://www.recode.net/2018/5/16/17360944/gdpr-us-business-eu-european-union-data-protection-privacy)

~~~
zenexer
The EU claims that's not how it works. Everyone else claims that is how it
works. It's highly unlikely that the EU will actually be able to enforce it
globally.

~~~
mdekkers
_It's highly unlikely that the EU will actually be able to enforce it
globally_

If you want to do business in some way with the EU, or have your business
officers visit the EU, then that is how it works. The EU took a leaf out of
the USA "global jurisdiction" book.

~~~
freehunter
Yeah, if Y Combinator chooses to never do business in the EU (considering one
of their companies is Afrostream, which does business in the EU, this may be up
for debate), they may be able to get away with it. My company only targets US
citizens and EU citizens would never get any value of any kind in any way at
all from my business, so GDPR is not on my radar. But YC might have a harder
time making that claim.

------
jaredsohn
As others say, you can get the comments and submissions via the API.

To start with you can try this for the public information:
[http://hnuser.herokuapp.com/user/conroy/json](http://hnuser.herokuapp.com/user/conroy/json)

I wrote hnuser ([http://hnuser.herokuapp.com](http://hnuser.herokuapp.com)) a
long time ago to show karma over time and haven't looked at it in a while, but
at first glance it seems to still work. It also exists as an npm package if you
want to run it from the command line.

------
activekerrar
Not directly, but you may find their API useful:
[https://github.com/HackerNews/API](https://github.com/HackerNews/API)
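
A rough sketch of how an export could be built on top of that API: the user
endpoint returns a `submitted` list of item IDs (stories and comments alike),
and each ID can then be fetched from the item endpoint. The helper names
(`fetch_json`, `export_user`) and the shape of the returned dictionary are my
own choices for illustration, not part of the API. Votes are not available
this way.

```python
import json
import urllib.request

# Base URL of the official Hacker News API (Firebase).
BASE_URL = "https://hacker-news.firebaseio.com/v0"


def fetch_json(path):
    """Fetch `<BASE_URL>/<path>.json` and decode it (performs a network request)."""
    with urllib.request.urlopen(f"{BASE_URL}/{path}.json", timeout=30) as response:
        return json.load(response)


def export_user(username, fetch=fetch_json, limit=None):
    """Collect a user's public profile and submitted items, grouped by type.

    `fetch` is injectable so the logic can be tested without network access.
    `limit` caps how many of the user's items are fetched.
    """
    user = fetch(f"user/{username}")
    if user is None:  # the API answers `null` for unknown users
        raise ValueError(f"no such user: {username}")

    export = {
        "user": {key: user.get(key) for key in ("id", "created", "karma", "about")},
        "items": {},
    }
    for item_id in user.get("submitted", [])[:limit]:
        item = fetch(f"item/{item_id}")
        if item is None:
            continue  # skip IDs the API no longer returns
        export["items"].setdefault(item.get("type", "unknown"), []).append(item)
    return export
```

Calling `json.dumps(export_user("conroy"), indent=2)` would give a local copy
of the public data. It makes one request per item, so a long-lived account
will take a while; pass `limit` to try it on the most recent few first.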

------
megous
Scrape it? It's in your profile.

