
What if someone leaks info? Is that public now? What if someone shares my private info without permission? Should Google be allowed to train on it? How would their systems know? What about pirated or other illegal content? The lines are not quite as clear to computers.



IANAL, but leaking is a crime; accessing material once it is public is not. There was a lot of this back during the height of Wikileaks, when there were questions about whether reading classified material was a crime. It is, but once it’s leaked, it is no longer classified, so it’s public.

So the crime would be on the leaker, not on Google for training on it.


IANAL either, but I have held a security clearance and this is very dangerous advice.

Unless explicitly declassified, the leaked information remains classified, and those who hold security clearances are legally required to avoid all classified information outside of their need-to-know [0].

Now if you have never held a U.S. security clearance, you're less likely to be prosecuted, but the history and precedent are murky [1]. The average Joe Sixpack checking out Wikileaks is probably safe, but if I were a journalist publishing the next round of Pentagon Papers I would much rather have a small army of lawyers and a friend or two in Congress.

EDIT - Google has federal contracts, so they are probably bound by similar agreements to at least make an effort to avoid any such leaks in their training data for public-facing models.

[0] https://www.csmonitor.com/USA/Foreign-Policy/2010/1207/US-to...

[1] https://www.npr.org/sections/thetwo-way/2017/03/22/521009791...


I have a security clearance and the guidance I received (and was explicitly asked about during my recertification) is that I cannot view Wikileaks at all, or any classified material leaked there.

So I didn’t view Wikileaks or any material as I don’t want to lose my clearance.

But the question was about whether it’s legal or not to view leaked material. Security clearances are a different matter and go a step beyond what’s legal or admissible in court or what someone would be prosecuted for.


But the question is whether Google should train on it, not whether it’s legal. Just because it’s public doesn’t mean you have the right to use it. I think that’s the real question we’ll need to figure out.



