Facebook sues two Chrome extension devs for scraping user data (fb.com)
142 points by alexrustic 49 days ago | 63 comments

Interesting, now Facebook is trying to portray themselves as some sort of legal watchdog over users' data, suing weak, non-customer defendants at random. No doubt they'll be citing this as evidence of their benevolent corporate mission in the next round of regulatory actions against the company.


Users' data is Facebook's main asset. They created the danger posed by these defendants by intentionally collecting user data on such a massive scale. They're not protecting users, they are protecting their business of creating pools of similarly situated users for advertisers to target. Facebook will never protect users from advertisers. They will not sue their own customers.

At least with these defendants one can get a rough idea of what they were trying to do, the methods they were using to collect data. With Facebook, the internal operations of the company, how they solicit, collect and use people's data, is deliberately withheld from public scrutiny. "We're using your data to make our service better." For whom? Their customers, which may include political campaigns. Facebook is free. The user is not the customer. "We cannot reveal what we are doing because that would put us at a competitive disadvantage." Obviously they are doing much more than just storing your data and making it available to your friends list. You are not the customer.

When Facebook customers, i.e., purchasers of Facebook advertising, filed a class action lawsuit against Facebook recently for fraud^1 (more specifically, https://leginfo.legislature.ca.gov/faces/codes_displaySectio... ), Facebook tried to place their amended complaint under seal, so the public could not see it. In part, because it disclosed what goes on behind the scenes at Facebook, the very things that the plaintiffs are suing over. Facebook has much to hide and litigation is one way some of their shady practices may eventually find themselves under the spotlight of public scrutiny. What goes around comes around.

1 https://cases.justia.com/federal/district-courts/california/...

You should have noticed by now that there is widespread, emphatic agreement that Facebook should protect users' data from third parties.

This is why what Cambridge Analytica did was bad, right?

Exactly. A lot of people think this is hypocrisy but this is exactly what you want Facebook doing.

It seems like a week doesn't pass that there isn't news of yet another scraping/dark-adtech firm exposing hundreds of millions of these types of records[0]

If you think Facebook are bad, these companies are an order of magnitude more evil, and you're never going to hold any of them accountable because they don't care for regulations.

edit: The threat model here is really concerning. To build user databases legitimately takes a lot of effort and funding. To do it via extensions and scraping requires finding browser extensions that have a lot of users and loose permissions (usually https://*/\*) and acquiring them for cheap, pushing an update and then just watching the data roll in.
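To make "loose permissions" a bit more concrete, here is a rough sketch (my own illustration, not something taken from the article or the lawsuit) that takes a parsed Chrome extension `manifest.json` and lists any match patterns that grant access to every site. The function name `broad_host_access` and the set of patterns are assumptions chosen for the example; it only looks at a handful of well-known catch-all patterns and is not a real audit tool.

```python
from typing import List

# Catch-all match patterns in Chrome's match-pattern syntax.
# This set is an assumption for the sketch, not an exhaustive list.
BROAD_PATTERNS = {"<all_urls>", "*://*/*", "http://*/*", "https://*/*"}


def broad_host_access(manifest: dict) -> List[str]:
    """Return the catch-all host patterns a parsed manifest.json requests."""
    requested = []
    # Manifest V2 lists host patterns alongside API permissions.
    requested += manifest.get("permissions", [])
    # Manifest V3 moves host patterns into their own key.
    requested += manifest.get("host_permissions", [])
    # Content scripts declare the pages they are injected into.
    for script in manifest.get("content_scripts", []):
        requested += script.get("matches", [])
    return sorted({p for p in requested if p in BROAD_PATTERNS})


if __name__ == "__main__":
    example = {
        "permissions": ["storage", "https://*/*"],
        "content_scripts": [{"matches": ["<all_urls>"], "js": ["inject.js"]}],
    }
    print(broad_host_access(example))
```

An extension that returns a non-empty list here can read every page the user visits, which is what makes a cheap acquisition plus a silent update so effective.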

The only recourse is further lockdown of browser extension capabilities (which also punishes good apps like uBlock Origin), purging extension stores (which also usually traps innocent players) and/or taking legal action as Facebook are doing

[0] https://www.safetydetectives.com/blog/socialarks-leak-report...

I don't understand this, though. It seems to be driven by a misunderstanding of how the data is being collected in the first place. It isn't some security vulnerability with Facebook. And Facebook isn't distributing the data to a third party. It's just sharing the data with users, who are then passing it off to the third party (intentionally or otherwise).

It's like expecting keyboard manufacturers to sue developers of keylogging software.

OK but in isolation there’s nothing wrong with going after malware developers. They might be "weak, non-customer defendants" but I’m certainly not going to feel sorry for them.

Also it’s not clear what fb is supposed to be doing then? Just let people write malware and steal users’ data? They're damned if they do and damned if they don't.

No, I think you are misunderstanding the point of the comment. It is the news releases. It is that Facebook is publishing about these cases on its website. Facebook is always involved in litigation to protect its business. It is that they are now highlighting these cases as if they are acts of "stewardship" or some defence of user privacy.

It is generally good that Facebook is taking these actions.^1 But the point I am making is that it is likely to be used as an argument by Facebook to try to hide the fact that Facebook created the problem in the first instance. And they have historically failed miserably as "stewards". And they are the much larger threat to user privacy than anyone they are suing. Their interests are not aligned with users. Facebook has reasons to keep others from obtaining user data that have nothing to do with user "privacy", a concept Zuckerberg is in fact actively trying to destroy.

1 But I am wondering what Facebook would do if a user was "scraping" her own data and the data her friends have shared with her. In other words, imagine the user writes her own "bot" to automate her Facebook usage and reduce the amount of behavioural data she gives to Facebook, i.e., the data she does not get if she "downloads her data" from Facebook. Clearly this is not "malware". TOS would surely be interpreted by Facebook to not allow any sort of automation. As anyone can see reading the public filings they make with courts and regulators, Facebook lawyers are heavy on the over-the-top rhetoric and arguments that border on the absurd. The user is not the customer so no reason for them to hold back on going after users.

I don't care what FB's motivations are as long as they scare the living daylights out of people who trick users into installing spyware. It's the least they can do, considering what their company embodies.

Who knows what their exact motivations are, we can only make a guess. Yes in any case it's still a company and they are going to do whatever is good for their business, and it turns out that this time it's also good for their users.

This is just a disguised effort by Facebook to stop these devs from eating their lunch. Facebook makes money selling user data. If someone else makes money from it or gives that value away for free, then this is a loss of revenue for them. Just follow the money. Facebook sucks and I wish there was an alternative like Signal for WhatsApp, but for Facebook.

There's a difference between collecting data with consent and spyware designed to secretly collect data.

These extensions didn't just scrape from fb, they did so from every website the user went to.

Monopolies and power blocs being the arbiters of what is good or bad for our society is the new status quo.

It's not surprising that this press release focuses way more on the extensions mining ancillary 3rd-party information available via the extension (stuff outside of FB) than just what they took from FB - which otherwise FB gets unlimited and unscrutinized access to.

Instead of taking the opportunity to look within at how they are part of the problem, they push enforcement you'd normally expect more from the extension publishers, Chrome/FF/etc browser extension stores. Instead the massive data silos themselves want to run the show while still eating their cake.

They realized incentives have changed as the interest groups, and likewise the regulators (whose ear they increasingly have), have become increasingly concerned about blowback from the large pile of data they vacuum up every day, and their own self-proclaimed stewardship of that data.

They realized they can control it domestically for their own interests and fool the public into thinking the only concerns are Russia, China, and in this case malicious fringe nobody political groups with zero real-life power.

These are the new sheriffs in the 'wild-west' with inherently vague privacy rules, everything goes internally and for internally 'verified' 3rd parties. Meanwhile they create a smokescreen via engaging in token rule enforcement against tiny firms or the odd domestic authoritarian group so they can showcase themselves as being pro-privacy (ignoring the mountain of despots still using the services or blatant bipartisan double dealing).

IRL it's most likely not some top-level master plan or conspiracy as many claim. These sorts of moral inconsistencies happen naturally in these firms when they set up teams to control priority narratives and have their loyalists moderate content, with a) primary focus on the fringes and easy hapless targets while b) willfully ignoring the bigger players or 'politically acceptable' groups (and their own business's overarching models).

History is littered with examples coming from the 'morally superior west'. Just look at northern Pakistan on the borders of Afghanistan for these sorts of compromises. Engaging in a decades long phony-war while ignoring geopolitical elephants in the room. These same inconsistencies are rampant in our new politically acceptable reality.

These inconsistencies are often obvious by default until some super egregious cases get 'caught' or more importantly politically unpalatable use-cases get exposed, then all of a sudden they are the good caretakers over your data - the people they long promised they would be!

Phony inconsistency is their calling card and easy to spot for anyone who is paying attention.

I sometimes wish I too was dumb enough to engage in hyper-partisan politics. Like blindly backing a sports team. And not see the long term deterioration of democracy, transparency, and rise of misguided authoritarianism by people who just discovered politics and think flailing about with new forms of top-down censorship will not simply backfire. Or is somehow better than the universal rejection of the (already) tiny fringe negative forces which we saw immediately after the capitol riots from bipartisan sides.

No, FUD convinces them we have to destroy the rights, rights created to protect them, to get there...

While I get why they are doing this, there is a certain amount of irony here. Facebook fighting to prevent the collection of users' data.

It made me laugh a little

Agreed. EXTRA funny since facebook was originally entirely powered by data scraped from harvard student directory.

Everyone trying to be funny here except that you’re all missing the point.

That data was public in the first place. These extensions can scrape anything, including your messages.

If that's true then there should be precedent here with https://en.wikipedia.org/wiki/HiQ_Labs_v._LinkedIn I think, right?


Also in early times they gave quite open access to the data via API, so that everybody ties to them and everybody uploads data. Once they were big they reduced the APIs and monopolized the data they collected. (Not giving broad access via API is a good thing, see discussions around Cambridge Analytica and others)

Break the rules until you're big enough to make them.

Obviously this doesn't prevent you from breaking other rules. Actually, once you make it into the big leagues, you can now afford to break more rules than ever before.

Even more funny since fb also scraped data from email accounts by misleading people into handing over logins and passwords, and spammed people.

Facebook is not suing them for stealing users' data, they are suing them for stealing THEIR data.

> demanding that they delete all Facebook data in their possession

I know "Facebook data" is not quoted, but I gather that is the way they feel about it, they aren't protecting the user and the user's data, they are protecting themselves and their data!

Well, _they_ want your data and they want to sell it for profit or make use of it in a profitable venture I imagine. They don't want anyone else getting it for free.

Like a mob boss sending goon squads to defend their turf

The developer indeed looks pretty shady. Weird screenshot security certificates and knockoff drop-shipped electronics, on top of their plethora of browser extensions and desktop apps. They seem to be based in Portugal.



Sounds like it's perfect for Facebook, land of the MLM and foreign and domestic propaganda campaigns.

Every day the browser becomes less of a User-Agent and more of a Vendor-Agent

Malicious browser extensions are a very serious problem though. The user has almost no way to know if their data is being stolen.

But they do. It's called installing reputable extensions and software. Anything non-reputable is stealing all your data with decent probability.

What is reputable? So many once reputable extensions get sold to malware makers for huge sums of money so you not only have to find someone who will not add malware, but someone who will reject $1M purely on ethics.

It's malicious actors all of the way down /s

Note that this is different from the last time Facebook sued two Chrome extension developers for scraping, three months ago:


So, my understanding is that Facebook happily scrapes users' contact lists from their app. How exactly is that legal, and this not? The most obvious difference is that Facebook likely claims users are informed of Facebook's scraping, but I'm sceptical of the number of people who actually understood that. If these extensions said in their ToS that they did this, would it be okay?

Yep, lying about it is why they're being sued: "They misled users into installing the extensions with a privacy policy that claimed they did not collect any personal information."

"We are seeking a permanent injunction against defendants and demanding that they delete all Facebook data in their possession"

Facebook data. Not user data. That pretty much sums it up.

If it was phrased as “user's data”, then couldn’t the app developer get the case thrown out on the basis of “we haven’t taken facebook's data, therefore facebook has no right to sue us”?

No, I think it was pretty much a Freudian slip. They caught themselves in the next paragraph: "This case is the result of our ongoing international efforts to detect and enforce against those who scrape Facebook users’ data"

Any sincere effort to change the narrative/direction hopefully won't be met w/ criticism. However they should understand that it's a long path to making people believe it.


Reminds me of the greatsuspender issue that's been going on recently (https://github.com/greatsuspender/thegreatsuspender/issues/1...)

I found myself needing a facebook for the first time in about 10 years (I deleted my facebook a long time ago). So I decided to sign up using a fake email and a fake name.

I was extremely shocked to learn that facebook requires you to upload a real picture of yourself in order to use the service[1]!

I tried several times creating new accounts. Each time the account was locked and it required 1.) my phone number and 2.) a real picture of myself. Why would this ever be a requirement for using social media?

1. https://imgur.com/KvsPiFY

>So I decided to sign up using a fake email and a fake name.

Facebook requires you to use your real name when creating an account.

>I was extremely shocked to learn that facebook requires you to upload a real picture of yourself in order to use the service[1]!

Nope, Facebook requires you to upload a real picture because you got caught creating fake accounts. You know who else uses fake names and emails and creates multiple accounts once the previous one runs into a checkpoint once it appears to be fake? Scammers and bots. Use your real name, and then lock down your privacy settings to where only your friends can see info about you. It'd be annoying to your friends trying to remember who "John Smith" actually was all the time anyways.

just feed it some thispersondoesnotexist fake person image...?

I actually did upload something that was basically a deepfake. My account is still locked while they "verify" it. Really bizarre.

Just uninstalled whatsapp, because I got a warning that facebook which owns whatsapp was about to scrape even more data. I'm not sure who forced them to issue that warning. But thanks!

Is this thread being botspammed?

Seems to be a bunch of comments that look like they've been generated with some kind of GPT3 type setup, almost sensical and somewhat related.

If you could link what you're referring to, it would help us in responding. What I'm reading could charitably be considered related and written by a human.



Were the ones I saw, re reading them a few times I could see them as being human written. Maybe just someone in a hurry to post.

You are poisoning the well. One of those accounts is old and the other is active.

Just low quality human comments we always see on FAANG posts.

Of course this thread is being botspammed.

Bad actors hate when you talk about punishments for their actions.

> Four of their extensions — Web for Instagram plus DM, Blue Messenger, Emoji keyboard and Green Messenger were malicious and contained hidden computer code that functioned like spyware.

Headline is missing context. I run a Chrome extension that's a non-spyware web crawler so I appreciate the distinction that they're not just going after any scraping tool.

I just wrote a fb scraper, using a headless browser: https://github.com/niczem/trawler

Still a d*ckmove from FB, especially considering that a browser plugin can not really be used to scale up. Anyway, looking forward to hearing from FB :)

You won't be sued (I'm not a lawyer) as long as you're not breaking any ToS, or misleading users into using it and sending you their data.

That's the reason the actor in the article is being sued, not because they're scraping, but because it was sending it to a 3rd party without consent or knowledge.

> You won't be sued (I'm not a lawyer) as long as you're not breaking any ToS, or misleading users into using it and sending you their data.

You can probably mislead them all you want in most jurisdictions. After all, that's what Facebook is already doing - how many of its billion+ users do you think are fully aware of how much external data Facebook gets access to?

Outright lying on the other hand, that's what's going to get you in trouble:

> a privacy policy that claimed they did not collect any personal information

... that actually seems pretty legit^^

These lawsuits are interesting. It raises questions about who owns what data. FB says your data is yours but then if I share with or tag someone, do they then get a right to that data? I'm not sure they are alleging copyright infringement. It's not clear to me what the law is here or what the law should be.

It's never too late to do the right thing. FB has fucked up in the past, but calling out software that is doing harmful data collection on their users is what they should do here.

Didn’t LinkedIn do the same thing and lose?

oof glad i didn't implement my plan to do this...

whew. glad I sold early. I knew Zuck wouldn't allow it forever. I expect linkedin to follow suit.

Remember kids, when you scrape FANG, always use prepaid cards, tor + vpn. You might as well be up against a state actor.

> whew. glad I sold early.

Sold what early? Stock? Were you involved?

Not with these guys of course.
