Swiss journalist re-implements ClearViewAI in two weeks with OS software (twitter.com/grssnbchr)
147 points by dukoid on Feb 8, 2020 | 55 comments



The head of Swiss data protection (Eidgenössischer Datenschutzbeauftragter) was on the news program [1] and scolded them, since according to him it is illegal in Switzerland to scrape data off the internet and then run facial recognition on it to identify people.

He said because they are media it may be ok but the average joe and company should not get any ideas of doing the same thing.

He also said it is Facebook/Instagram's responsibility to keep users' data safe and prevent it from being scraped.

I'm not sure how anyone can prevent data scraping and I also don't see how it can be illegal if the data is on the public internet. Can anyone elaborate why it would be considered illegal to scrape data in Switzerland? I can see there being a conflict between the privacy laws and doing identifying recognition on public images but downloading the images themselves I don't see being illegal?

We deal with scrapers taking our data all the time and this is the first time I heard that it may be illegal.

[1] https://www.srf.ch/play/tv/10vor10/video/fokus-heikle-gesich...


> Can anyone elaborate why it would be considered illegal to scrape data in Switzerland?

Speculation but I think it can be as simple as Art. 4, p. 3 FADP [0]: "Personal data may only be processed for the purpose indicated at the time of collection, that is evident from the circumstances, or that is provided for by law."

When users upload data to Facebook, they don't consent to have that data used by third parties for facial recognition.

[0]: https://www.admin.ch/opc/en/classified-compilation/19920153/...


You grant Facebook a nonexclusive sublicenseable transferable licence for all purposes.


Copyright and data protection are separate issues and even so, FADP doesn't let people off that easily. From the same article 4:

> Personal data may only be processed for the purpose indicated at the time of collection, that is evident from the circumstances, or that is provided for by law.

> The collection of personal data and in particular the purpose of its processing must be evident to the data subject.

Note that it adds the qualifier "for the purpose indicated at the time of collection". I don't think there's a realistic argument that Facebook's UI indicates or makes evident, at time of uploading/posting a photo, that the photo can be used by third parties performing facial recognition.

And then, even if you give Facebook consent to do certain things, that consent does not extend to third parties scraping it.


Yes, but (a) one can be in pictures without having any agreement, and (b) an FB license cannot grant you rights that contradict Swiss law (my assumption; I'm neither a lawyer nor Swiss), ...


I believe Facebook slips this knot by making any uploader responsible for securing rights from every person in any picture they upload, no?


You are not supposed to upload pictures showing non-users. Facebook upload wizard clearly tells you that.


How is Facebook supposed to enforce that?

I suppose they could use facial recognition to prevent those uploads...


Licensing and processing are entirely orthogonal rights. The former does not grant the latter under the GDPR framework.


Does the GDPR apply to Switzerland though?


No, only the European Union.


Actually, it’s more nuanced than that. GDPR applies anywhere if you offer a service in the EU or monitor EU subjects’ behavior in the EU.

(PDF link) https://edpb.europa.eu/sites/edpb/files/files/file1/edpb_gui...


It is not subject to GDPR, but Switzerland implements everything in GDPR by themselves. Ukraine, former Yugoslavia, and Turkey do the same.


Parts of former Yugoslavia are actually in the EU (Slovenia, Croatia).


I meant the parts that are not in EU, such as the Republic of Serbia (which is different from the Serbian Republic).


Scraping is generally not illegal in Switzerland. He considers it a Persönlichkeitsverletzung, a violation of a person's privacy rights, which are very strong in Switzerland. According to the TV report, the photos were uploaded without the consent of the people in them. Some were taken while people were walking by in the background. You cannot just photograph or film people here and publish the result without their approval.


But IMHO at a public event that rule doesn't apply, right?


>“He said because they are media it may be ok but the average joe and company should not get any ideas of doing the same thing.”

Or else what?

How would they even know this was done?

All facial recognition tech benefits from plausible deniability. You could just as easily say a face was recognized by an individual who saw the photo online and sent an anonymous tip. They have no leverage.


The value proposition of companies like ClearviewAI is that they're building an automated system, which is precisely why they don't try to pretend to be some kind of private detective firm, and why they are now on the hook for it.

If the government gets serious it also could simply come knocking on the door and demand a tour.


> I also don't see how it can be illegal if the data is on the public internet.

It can be seen as an issue of intent. By posting something, that datum might be there for personal use, and is not intended for further commercial use, especially by companies outside the one hosting the information originally.

The act of data aggregation is one that I personally think should be heavily considered for regulation, especially since it can de-anonymize people. This in my opinion is a privacy invasion, performing extra actions to access additional private data when no access to the more private details was granted, and those details did not exist publicly in an accessible way. Especially at scale and yielding personally targetable results, this isn't something we want organizations or governments to use against people unchecked.


>I'm not sure how anyone can prevent data scraping

Are rules/legislation against scraping really meant to prevent it, or merely to provide a mechanism to punish the perpetrators when discovered?


Author here. The argument the data commissioner makes is that, while the images are public, users can assume they are protected from other purposes, e.g. facial recognition, because of the ToS, which disallow bulk download. Thus, by violating the ToS, we also violated their personal rights. However, be aware that this is an interpretation of the data protection law; we don't know of any court ruling that would confirm such an interpretation yet.


> I'm not sure how anyone can prevent data scraping

Not the first time politicians have proposed something that is technically infeasible. It's all just magic to them anyway.


That's not how laws work, though. Just because something is technically possible doesn't mean it has to be legal.

There's no way to technically prevent murder or assault, or dumping chemicals in a river, but it doesn't make them any less illegal.


Depending on the national law (no idea about Switzerland), data can be protected by copyright laws. E.g., even if the base data is free to use, the combined dataset might be protected as a unique 'creative work'.


GDPR limits the use of personal data to the specific usages the user explicitly agreed to. Any other use of a person's data would be illegal.

If I remember correctly, broad terms like "for anything" are not permitted.


> He said because they are media it may be ok but the average joe and company should not get any ideas of doing the same thing.

That's not the first time I've seen someone claim that a journalist is allowed to perform some kind of information processing that's off limits to regular people. These claims are specious and ridiculous. Journalists do not form a distinct class with special privileges, even if they claim to be one.


“Journalist” isn’t clearly defined, but they do have some special privileges in many countries. https://en.wikipedia.org/wiki/Journalism#Legal_status:

”Journalists in many nations have some privileges that members of the general public do not, including better access to public events, crime scenes and press conferences, and to extended interviews with public officials, celebrities and others in the public eye.”

https://en.wikipedia.org/wiki/Reporter%27s_privilege:

”Reporter's privilege in the United States (also journalist's privilege, newsman's privilege, or press privilege), is a "reporter's protection under constitutional or statutory law, from being compelled to testify about confidential information or sources."”

Also, in some countries (many, I would think, but I don't know the stats) the justice system will weigh the public's right to know against the infractions when deciding whether to prosecute journalists who committed crimes in order to gather information.

For an example, see the pentagon papers (https://en.wikipedia.org/wiki/Pentagon_Papers). All concerned knew the information was top-secret and stolen, but in the end, nobody got convicted for publishing them.


Well, there definitely are distinct classes of journalists that have special privileges. These classes are not defined by law but are implemented and supported by institutions (such as when special access is granted to events, or when killings generate international attention).

There is no such thing as a "licensed journalist" because such licensing has historically been used to limit freedom of the press. However, there is an implicit understanding of what constitutes being a journalist that works similarly, and it shows when judges decide whether or not to force a journalist to reveal their sources.

The core idea is that intent matters. Building such an app to create public discourse by someone with a clear history of journalistic actions is likely to face different regulatory penalties than an individual with no such history who builds such an app for entertainment or profit.

Edit: There is an argument to be made that this "soft licensing" serves to exert control over journalists.


They're nonsensical. There's no legal definition of journalist. You don't get a license from the government.


This is Switzerland, not the United States. In many European countries, at least, journalists do indeed have special privileges, and they do have recognised legal status and licenses, usually represented by private/public institutions that oversee ethical standards in the industry and so on. In Germany, for example, this would be the German Press Council, in Sweden it is the Opinionsnämnd, etc.

For the love of god can we please stop pretending that every country on the planet is the US.


I sure wish those journalists would have any more credibility than regular people though.


No licence needed, true, but governments issue press cards for certain privileges peculiar to professional journalists.


GDPR has exemptions for the purposes of news reporting, not for specific people.


Not claiming this is the result of lobbying, but this article is the wet dream of Facebook lobby groups. Facial recognition won't go anywhere, that ship has sailed, but regulation might restrict which data you're allowed to use for it, and oh boy would Facebook love to monopolize their trove of data and use regulation to fend off any threat to their leading ability to profile individuals.

On the topic of restricting others' access, the author says:

> But I ask myself: How difficult can it be for a billion dollar revenue company?

Pretty difficult when the law prohibits you[1]. Hopefully that forces regulation to cover the weaponization of technology, not access to data. Right now this comes off as: you're not allowed to enrich uranium, unless you own the land.

[1] https://www.eff.org/deeplinks/2019/09/victory-ruling-hiq-v-l...


"Mark Zuckerberg: Then I guess that would be the first time somebody's lied under oath." - The Social Network


While this demonstrates the capabilities of easily accessible ML software, it also shows the power held by large corporations and governments who have bulk access to this kind of data (images from social media, crawling) and how ML makes them able to abuse it. It's really scary to think they know where and when I was photographed or filmed in public (doing whatever) by random people while I don't...


This specifically shows that you don't need to be a large corp to do this.

Everyone can do this at home today and on a budget, with publicly accessible data.


It’s not easy to obtain a database of all the publicly available photos of yourself. The concern is that organizations with modest resources can generate a better picture of individuals’ public footprints than the individuals can.


As an individual, you have to try your luck with collections of images that happen to be publicly accessible, whereas large entities have access to many, many more images (all private collections in the cloud for example, or security camera material) and can search them systematically. That's a big difference IMHO.


At the moment (as someone who works at a large company), it honestly seems easier to scrape->do something sketchy with the data than it would be to overcome the internal privacy protections that are in place at a large company.



Do the internal restrictions apply to everybody at Google (my guess from your past comments) equally, or are there departments with easier access, government contracts etc.?


I can’t speak for all of Google (maybe there is something secret I don’t know about, although I doubt it given how much leaking there has been), but for the sensitive data I have had to deal with for my job, I’m not aware of anyone having a way of bypassing the controls.


These days, your public life is under continuous corporate surveillance.

You may not have noticed, but entire cities are under 24/7 video surveillance. https://ring.com

You may not have noticed, but the ubiquitous mobile phone is morphing into a 24/7 surveillance radar. https://atap.google.com/soli


> But I ask myself: How difficult can it be for a billion dollar revenue company?

Am I the only one who has trouble believing that someone who was involved at a technical level in scraping and running facial recognition on 200k photos would genuinely answer "Not very"?


I’ve implemented a few scrapers for various purposes, and run bulk data analysis over the results.

It was years ago, and it was not very difficult.

I’d guess this would be an easy summer intern project for a bright undergrad.


Would the same bright undergrad be likely to believe that stopping scraping on a service with a billion users is an easy problem?


I got a privacy notice update from my bank today. In the past I was generally OK with my bank sharing data and all that, because it goes toward credit scores and a variety of other things.

I'm not so sure any more. I feel like we've reached a point where a massive, trustable, profile of me is for sale. I don't think I want a bank that is allowed to do it any more.

Does anyone have any recommendations for a bank focused on privacy? (and for that matter credit cards?)

I know I know - probably impossible, but I am pretty sure I just created a market, who's with me?


That banking wet dream used to be the subject of this article: Switzerland! As long as you had enough money.


It seems hard if not impossible to prevent illegal use of profile pictures. If a web browser can access them, collecting them can be automated. Even if they are only accessible through an app on iOS or Android, you could still, on a device with root access, capture the API calls and reverse engineer how to get those pictures. So we might just assume any intelligence service already has that. The only protection against a more widespread development of those products would be legal I think.


There's a simple, practical way to do this: don't make the images public. Limit them to some degree of separation by friend connections.

I think this should satisfy most friend-finding use-cases while eliminating comprehensive large-scale scraping.

You can still reconnect with that guy you saw at a party a few years ago or that teacher from high-school. Chances are they're within 2 or 3 degrees of separation from you.
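
To make the idea concrete, here is a rough sketch of such a visibility check: a breadth-first search over the friend graph that stops after a fixed number of hops. The function name `within_degrees`, the `friends` dictionary, and the toy graph are all made up for illustration; this is not how any real social network implements it.

```python
from collections import deque

def within_degrees(friends, viewer, owner, max_degrees):
    """Return True if viewer is at most max_degrees friend hops from owner.

    friends is a hypothetical adjacency map: person -> set of direct friends.
    """
    if viewer == owner:
        return True
    seen = {owner}
    frontier = deque([(owner, 0)])
    while frontier:
        person, dist = frontier.popleft()
        if dist == max_degrees:
            continue  # do not expand beyond the allowed radius
        for friend in friends.get(person, ()):
            if friend == viewer:
                return True
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, dist + 1))
    return False

# Toy graph (invented): alice - bob - carol - dave
friends = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
}

print(within_degrees(friends, "carol", "alice", 2))  # carol is 2 hops away
print(within_degrees(friends, "dave", "alice", 2))   # dave is 3 hops away
```

With a limit of 2, carol could see alice's photos and dave could not. The check only limits who can see a photo; it does nothing about an account that is already inside the radius.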


>Limit them to some degree of separation by friend connections.

Haha, oh sweet summer child. How do you think Cambridge Analytica got all that data? All you need is 1 friend with lousy security practices.


By the same argument you might also be a similar distance from an intelligence agent.


Related reading recommendation: "The Dead Past" by Isaac Asimov.



