
How LinkedIn exfiltrates extension data from the browser - asaasinator
https://prophitt.me/articles/nefarious-linkedin
======
nwsm
Not only that, but the article itself is a retelling of a story that would
have happened two years ago [0], and the GitHub repo he links at the end is a
clone of the repo in [0] (which the author himself is the sole contributor
to).

The whole thing is weirdly deceptive.

[0]
[https://news.ycombinator.com/item?id=18853607](https://news.ycombinator.com/item?id=18853607)

~~~
swrobel
Not only what? Was this supposed to be a reply to something?

------
codezero
Let's put this into context. It looks like LinkedIn is trying to detect
specific plugins which affect the page/UX. The plugins the author links to are
for scraping personal emails for email marketing "leads." This is totally
common, so I'm not making any kind of judgement here, but I think LinkedIn has
a right to protect its service and users. I can imagine that support requests
where the LinkedIn the person is seeing isn't the same as the real LinkedIn
would be pretty confusing. Not all people are smart enough to know what all
their extensions do, especially if they use them for their job and were told
to install and use them.

With all that said, I also think people have a right to use the extensions
they want to scrape or block content on sites they visit, so catch 22 I guess.

It's worth thinking of this from both angles before getting angry at LinkedIn,
or the author of this post :)

~~~
tw04
>I also think people have a right to use the extensions they want to scrape or
block content on sites they visit

Do you also think people have a right to record movies they watch in the
theater?

I think people are free to run whatever extensions they want in their browser.
I think companies are also free to tell people they aren't welcome if they're
going to use that extension.

Why should a company like linkedin be FORCED to serve anyone? It's not a
public service, they aren't a government entity, you have no right to their
service.

~~~
CodeMage
_> I think companies are also free to tell people they aren't welcome if
they're going to use that extension._

Or a particular browser. Or a specific operating system. Or the wrong brand of
device.

You can probably see where I'm going with that, but in case it's not perfectly
clear, I'm trying to say that a company might want to dictate a lot more than
just what extension you're using in a browser. Not everyone is going to agree
that they are right to do that.

Bear in mind that there's more to consider here than the legality of what
they're doing. The openness of the Internet, the interoperability of different
operating systems and devices over a particular somewhat-standardized protocol
-- these aspects aren't regulated by a law, but they're still important. I'm
one of those people who remember the Browser Wars and I don't remember them
fondly.

Of course, it's not just about technical aspects, either. There's the whole
grey area of whether it's okay for LinkedIn to metaphorically rifle through
our proverbial pockets, looking for stuff they don't like and don't want to
admit on their virtual premises.

So no, I wouldn't agree that things are as clear cut as you present them.

~~~
paulryanrogers
There is a balance. A theater probably shouldn't discriminate based on the
shoes you wear, but may not offer a rewards app for FirefoxOS. Or a restaurant
may not accept customers without shirts.

Of course where to draw the lines is important and up for debate

~~~
zcid
Not providing an app for FirefoxOS is not the same as refusing entrance
because your phone runs FirefoxOS. The LinkedIn case seems to be closer to the
latter.

------
baxtr
In addition to that I can’t help but to feel completely energy depleted after
5min of LinkedIn usage. The feed is just horrible. Everybody is doing great,
no real discussions going on, trashy superficial insight videos. I only log in
to read messages, but I am not able to ignore the feed.

~~~
opportune
Yeah, I don't really understand the social network features of linkedin. I
guess I add people as connections sometimes but even that doesn't really seem
important.

The only utility I get out of it is that I can put up my resume and get
recruiters to contact me. Which more than makes up for everything else since
it has led to job offers

~~~
davedx
How many of those offers have actually resulted in you taking a position
though? I know my LinkedIn signal:noise ratio (where signal is an actual
commercial deal of some kind) is extremely terrible

~~~
treis
>How many of those offers have actually resulted in you taking a position
though?

2 out of my last 4. There's a huge amount of spam. Mostly from big Indian
firms looking for cheap labor to staff out a contract. But if you filter that
out, most of the other messages are legit and have a realistic potential of a
job offer.

------
rangersanger
I used to work on one of those extensions. The lengths linkedin goes to
"protect" public data is kind of absurd. If they poured those resources into
building tools for salespeople the site would be far more valuable. Or, they
could reopen their now closed developer program. It has also been established
by lawsuit that those extensions have a right to do what they do.

Those extensions would break the site far less frequently if linkedin would
stop trying to break them. And, our users definitely knew when it was us
breaking linkedin.

I'm not sure it's as much a catch 22 as it is a self inflicted wound.

~~~
murkle
"public data"? I think you mean "my data"

~~~
chatmasta
It's data _you_ made _public_ by publishing it to LinkedIn, where default
privacy settings is "anyone who can log into linkedin can view this content."

It gets complicated when you start thinking of things like blocking other
accounts from viewing your profile, or adjusting visibility settings. Arguably
LinkedIn has a duty to protect the integrity of their privacy controls, which
would entail implementing anti-scraping measures.

~~~
chris_va
> It's data you made public by publishing it to LinkedIn, where default
> privacy settings is "anyone who can log into linkedin can view this
> content."

... under specific terms of use that scraping technically violates.

~~~
rangersanger
There’s actually quite a bit of data that’s viewable to anyone, logged in or
not, that’s public. All we did was speed up what was happening anyway. Sales
people all across the world copy and paste data from LinkedIn to Salesforce
every day. They either do it manually, using the app I worked on, or one of
the literally hundreds of competitors.

~~~
msh
So the argument is that you are abusing their data because other people are
also doing it.... Great morality...

~~~
notfromhere
If you don't want your data scraped for collection, you wouldn't put it on
Linkedin. Really the only argument is, is it more moral for Linkedin to do it
instead of a plugin?

------
grezql
LinkedIn epitomizes everything wrong with today's front-end.
Npm, grunt, gulp, es6, ts, babel, webpack, yeoman, browserify, reactjs, React's
mother and its dog, yarn, bower, jsx.. and so on.

13 MB of JS/css/html:
[https://imgur.com/a/oehQQzJ](https://imgur.com/a/oehQQzJ)

~~~
theshadowmonkey
That's Ember over there. Almost everyone who works on or maintains EmberJS
works for LinkedIn.

~~~
stocktech
Ember's beefy, but you can't blame 13MB on it.

~~~
therein
I can certainly chime in on this. Yes it is Ember but I'd blame that on the
way it is "abused".

At one point it had become so bad that we had to purge the excess whitespace from
the HTML at the traffic layer with middleware. It actually had megabytes of
whitespace.

Not to mention the server side rendering mess.

However when it comes to the subject matter of this thread, I don't think this
is as sketchy as the OP makes it sound to be. This is LinkedIn's anti-scraping
team at work, and nothing nefarious is going on.

~~~
inlined
> This is LinkedIn's anti-scraping team at work, and nothing nefarious is
> going on.

Are you a primary source or do you have a source to cite?

~~~
therein
Primary source. I used to work at LinkedIn.

This would be the application security team's work. They have a pretty
extensive anti-scraping initiative and I know for a fact that these are used
to determine if the account is scraping or not.

Someone on a different comment mentioned the "email-hunter" extension. That's
exactly the kind of extension they are targeting. I remember many requests
sent to support, asking why their account is terminated, and the response was
usually "oh you used email-hunter" etc.

------
AlexandrB
LinkedIn quietly continues to be one of the creepiest products around. I still
remember their dark patterns around obtaining your contact list[1]. It's not
surprising that little has changed under the _new_ Microsoft which has equally
little respect for user privacy and choice.

[1] [https://www.quora.com/Does-LinkedIn-access-your-email-or-con...](https://www.quora.com/Does-LinkedIn-access-your-email-or-contact-list?share=1)

~~~
snarfy
I have a suggested connection on LinkedIn that is my wife's maiden name. It's
super creepy. We have zero shared connections.

~~~
gooftop
In the relationship graph that Linkedin builds, suggested connections could be
based on a lot of things other than shared connections. Just speculating here,
but it could be based on shared email address, shared IP address, shared
physical address, membership in same groups, linkedin messages by you and wife
to other people within 1-2 degree of either of you .. or some combination
thereof for higher confidence.

------
zrobotics
The real question here is _why_ LinkedIn should even need this information.
This represents significant engineering work to develop, so obviously at some
point they decided that knowing which extensions are present had value.
However, I cannot think of a single non-malicious reason to want this
information; the malicious reasons that spring to mind are browser
fingerprinting and ad targeting.

~~~
codezero
I can think of tons of non-malicious reasons.

1\. Support requests because site is broken, but it turns out you are using an
extension that breaks the site.

2\. Extensions are exfiltrating data to the extension owners, against
LinkedIn's TOS, and they are trying to protect their users, or rather, they
don't want competition :)

OK, that was two.

They aren't blindly probing for any and all extensions, only a specific set,
which shows restraint and implies to me they are having a sort of arms race
with extensions that scrape contact info.

~~~
imglorp
Wait, so they try like crazy to scrape YOUR contact list but fight like crazy
to keep THEIRs from getting scraped.

~~~
visarga
Yeah, just like Google bot crawls the web putting enormous load on people's
servers (G-Bot can pull hundreds of thousands of pages per day from a single
server), while being very sensitive to automated searches and banning your ass
(IP) in a heart beat. They sure don't want to be crawled by anyone.

~~~
orf
If your site cannot handle the moderate 3-5 requests per second it would take
for google to make hundreds of thousands (500k) of requests a day, then I hate
to break it to you but you have bigger problems at hand.
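
A quick check of the arithmetic behind those figures, as a small TypeScript sketch. The function names are made up for illustration, and it assumes a perfectly steady request rate, which a real crawler would not have. At 3 to 5 requests per second you get roughly 259k to 432k requests a day, and 500k a day works out to a little under 6 per second.

```typescript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86400

// Requests per day at a steady per-second rate.
function requestsPerDay(perSecond: number): number {
  return perSecond * SECONDS_PER_DAY;
}

// Steady per-second rate needed to reach a daily total.
function requestsPerSecond(perDay: number): number {
  return perDay / SECONDS_PER_DAY;
}

const lowEstimate = requestsPerDay(3);          // 259200
const highEstimate = requestsPerDay(5);         // 432000
const rateFor500k = requestsPerSecond(500000);  // about 5.79
```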

------
et1337
Previous discussion:
[https://news.ycombinator.com/item?id=18853607](https://news.ycombinator.com/item?id=18853607)

LinkedIn doesn't have a great track record, but in this case they might just
be trying to prevent abuse.

~~~
peteretep
No, they sell the same service to people. LinkedIn are trying to protect their
profits, not their users.

------
data_spy
For what it's worth, I read an academic paper that said browser extensions are
a strong signal in identifying a person when combined with geo data.

~~~
nwsm
I’d be interested in reading it if you know the name or author.

Maybe this: [https://www.securitee.org/files/xhound-oakland17.pdf](https://www.securitee.org/files/xhound-oakland17.pdf)

------
westoque
Not only that. I believe LinkedIn also harvests data from your calendar. I
remember seeing LinkedIn notifications about the person I’m meeting without
even asking for it. Kind of creepy and unwarranted unless they asked me to.

~~~
obenn
To be fair, I'm aware that they keep asking me to sync my calendar, but I deny it
every time. I'm quite sure that function would have to be explicitly allowed.

~~~
ce4
what if the other person has allowed LinkedIn access to their calendar?

------
dylz
I find it interesting that the author of this blog post does not openly
disclose that they write/own a service that effectively does mass mailing,
scraping, bruteforcing of email.

~~~
swebs
Second to last paragraph.

------
_bxg1
"Exfiltrates files from your system" is a very alarmist way to say "checks the
list of installed browser extensions". Not that it isn't creepy, but let's
calibrate, shall we?

------
imiric
This is worrying, yet unsurprising. LinkedIn has become a necessary evil for
most professionals, unfortunately. The quality of IT opportunities isn't as
high as on other smaller job boards in my experience, but I still keep my
LinkedIn profile up to date, to get a feel of the market mostly. I look
forward to the day I can disable that social network as well.

In the meantime, we should build and use simpler web browsers, without
extension support for one. I've found surf[0] to be the most usable of all
WebKit wrappers. Without much C experience, I've managed to use my own fork[1]
for a few months now, which wasn't much work thanks to the lean sub-3KLOC
codebase of very readable C code and helpful comments.

I imagine that an experienced group of C programmers could take surf as base
and easily build a secure and user-friendly web browser with most of the
features of the big boys. WebKit is still a concern, but with some work it too
could be abstracted away and made easily replaceable.

For LinkedIn specifically, I use a separate cookie file, and with the surf
process isolation it gives me a degree of sandboxing similar to Chrome. A
modern browser should be built on sandboxing principles for web content, and
expose this functionality for each site by default.

[0]: [https://surf.suckless.org/](https://surf.suckless.org/)

[1]: [https://github.com/imiric/surf](https://github.com/imiric/surf)

------
legitster
LinkedIn offers a marketing feature where users can opt to prefill forms,
etc., using their LinkedIn data. But it's not OAuth-based (a quick and dirty
workaround, perhaps).

Our privacy team took one look at the code and said to stay 100 feet from it.

~~~
nwsm
Are you implying websites that include the “Apply With LinkedIn” widget could
gain access to a user’s LinkedIn info before they click the button?

------
chvid
Linkedin is at war with spammers and scrapers. The mentioned plugin is a tool
for quickly finding email-addresses on a webpage. Who uses that?

~~~
icebraining
Customers of LinkedIn. LI is just mad that they aren't paying.

------
mehrdadn
> How would you feel if you opened a program and the program started to check
> your file system to see what other programs you had installed?

Slightly tangential but does anyone know if this is what Chrome does? It has a
software reporter tool. Also Windows seems to do this too :/ though I'm not
100% sure.

~~~
Piskvorrr
Not happy about other programs deciding to become spyware, either. "But they
also!" is not a valid defense.

~~~
mehrdadn
Yeah I didn't mean it as a defense either. More like we should've been
freaking out quite a while ago.

~~~
Piskvorrr
I've been freaking out for _two decades_ now. First the response was "yeah,
that's tin-foil-hat stuff", sometime around FB it changed to "yeah, that's old
hat, so what?"

------
plotteddancer16
Wow, this is disturbing. Lest we forget that LinkedIn has been a Microsoft
subsidiary for three years now. I wonder if Microsoft is doing this elsewhere
too?

~~~
filoleg
LinkedIn has been doing this (and a few other shady dark patterns) since way
before being acquired by MSFT. And MSFT seems to be letting the recently
acquired companies do their own thing and not interfere, which is what
employees of those startups would most likely prefer. Not saying that this
kind of hands-off approach is the best idea here (because I don't have a strong
opinion on that), but I don't think it is fair to extrapolate the shady
behavior of LinkedIn (that has been documented since a very long time ago) to
the rest of MSFT.

------
darkcha0s
Isn't this just another way to fingerprint the user?

------
jvagner
I haven't had LinkedIn for almost 10 years now, and I haven't had Facebook for
at least 6 months now.

Explaining why I don't have either is a burden I live with in my professional
life, but the degree to which even other technical professionals don't
sympathize with not having accounts on LinkedIn is pretty amazing.

I guess I'm relegated to a bit of sub-culture-ness. I'm self-employed, so I'm
okay with that, but I guess others might find it challenging.

~~~
reaperducer
Lately I've been telling people who question me about this, "I know some
people who work there. They won't say why because of NDA, but they tell me
it's better that I don't have an account. They say they wouldn't have one if
they didn't work there. I trust their judgement."

That's worked so far.

~~~
asdf21
Is any of that true?

------
runeks
Why is it even possible for a website to fetch _chrome-extension://_
resources?

Seems like something that shouldn’t be accessible by a website.

~~~
codezero
I believe it is because those extensions have intentionally exposed themselves
so they can do bidirectional communication with the page.
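
For anyone wondering what such a probe might look like, here is a rough TypeScript sketch. It is not LinkedIn's code. It assumes the extension lists a file under `web_accessible_resources` in its manifest, which is what makes a `chrome-extension://<id>/<path>` URL loadable from a page. The `ExtensionProbe` type, the `load` callback and the function names are all hypothetical.

```typescript
// One extension to look for. The ID and path are supplied by the caller;
// nothing here refers to a real extension.
interface ExtensionProbe {
  name: string;
  extensionId: string;  // Chrome extension ID
  resourcePath: string; // a file the extension exposes to web pages
}

// Resolves to true if the URL could be loaded. Injected so the sketch
// does not depend on a browser being present.
type ResourceLoader = (url: string) => Promise<boolean>;

function buildProbeUrl(probe: ExtensionProbe): string {
  const path = probe.resourcePath.replace(/^\/+/, ""); // drop leading slashes
  return `chrome-extension://${probe.extensionId}/${path}`;
}

// Returns the names of the probes whose resource loaded successfully.
async function detectExtensions(
  probes: ExtensionProbe[],
  load: ResourceLoader,
): Promise<string[]> {
  const results = await Promise.all(
    probes.map(async (probe) => {
      try {
        return (await load(buildProbeUrl(probe))) ? probe.name : null;
      } catch {
        return null; // a failed load is treated as "not installed"
      }
    }),
  );
  return results.filter((name): name is string => name !== null);
}
```

In a browser the loader could be something like `(url) => fetch(url).then((r) => r.ok)`. An extension that exposes no resources to pages would not show up this way.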

------
Fogest
The author makes an email scraping extension... his intentions against
LinkedIn are not pure. His extension scrapes LinkedIn for user info.

------
ballenf
This addresses the two greatest threats to their business model:

\- users associating spam received to their use of LinkedIn

\- undermining the value of LinkedIn paid services

I'm more upset at browser vendors for creating such an obvious
security/privacy hole than at LinkedIn for using it. And now Chrome will use
this as subterfuge for nerfing adblock. This is why we can't have nice things.

~~~
icebraining
How are browsers supposed to prevent the page from detecting that an extension
changed the page's DOM, or that the extension explicitly made URLs accessible
to it?

As the author points out, there are mechanisms for showing extension UIs that
don't rely on DOM manipulation.
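
To make the DOM side of this concrete, here is a minimal TypeScript sketch of how a page could spot nodes an extension injected. It is only an illustration: the `ElementLike` shape, the prefix list and the function names are hypothetical, and a real page would decide for itself what counts as a signature.

```typescript
// Minimal shape of an element, so the matcher runs without a DOM.
interface ElementLike {
  id: string;
  classNames: string[];
}

// Hypothetical signatures: prefixes an extension might use for injected nodes.
const INJECTED_PREFIXES = ["example-ext-", "demo-helper-"];

// True if the element's id or any of its class names starts with a known prefix.
function looksInjected(
  el: ElementLike,
  prefixes: string[] = INJECTED_PREFIXES,
): boolean {
  const names = [el.id].concat(el.classNames);
  return names.some((name) => prefixes.some((p) => name.indexOf(p) === 0));
}

// Filters a batch of newly added elements down to the suspicious ones.
function findInjected(
  added: ElementLike[],
  prefixes: string[] = INJECTED_PREFIXES,
): ElementLike[] {
  return added.filter((el) => looksInjected(el, prefixes));
}
```

In a browser, a `MutationObserver` watching `document.documentElement` with `{ childList: true, subtree: true }` could map each mutation's `addedNodes` into this shape and pass them to `findInjected`. The page owns the DOM, so there is not much a browser can do to hide such changes from it.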

------
x0n
Dude builds possibly nefarious extensions for a living. Dude writes article
about LinkedIn nefarious anti-nefarious extension code. Conspiracy voting
commences. Article at #1

~~~
dang
I took a look and saw no evidence of conspiracy voting. I also don't know what
basis you have for the first sentence there.

------
codedokode
Some websites (for example Aliexpress) scan local ports to check whether you
are running something like RDP, SSH or VNC server. They try to open websocket
connection to those ports and measure how much time it takes to establish (or
to be rejected).
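
A rough TypeScript sketch of the timing idea, not taken from any real site. The thresholds are made-up assumptions (real values would depend on the browser and network), and `Connector`, `classifyTiming` and `probePort` are hypothetical names. The connection is expected to fail; only how long the failure takes is used.

```typescript
type PortGuess = "probably-closed" | "probably-open" | "no-answer";

// Illustrative assumptions, not measured values.
const FAST_REJECT_MS = 50;
const TIMEOUT_MS = 2000;

// Guess the port state from how long the attempt took.
function classifyTiming(elapsedMs: number): PortGuess {
  if (elapsedMs >= TIMEOUT_MS) return "no-answer";
  return elapsedMs < FAST_REJECT_MS ? "probably-closed" : "probably-open";
}

// Attempts a connection and settles when it opens, errors, or times out.
// Injected so the sketch runs without a browser.
type Connector = (url: string, timeoutMs: number) => Promise<void>;

async function probePort(
  port: number,
  connect: Connector,
  now: () => number,
): Promise<PortGuess> {
  const start = now();
  try {
    await connect(`ws://127.0.0.1:${port}`, TIMEOUT_MS);
  } catch {
    // Failure is expected; only the elapsed time matters.
  }
  return classifyTiming(now() - start);
}
```

In a page, the connector would wrap `new WebSocket(url)` and settle on its `onopen` or `onerror` events, with `performance.now()` as the clock.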

------
WellDressed
Sites we interact with may become adversarial towards us at any time they see
fit. I wish my browser and the extensions I use were sacrosanct and outside
the purview of other companies/sites.

------
bichiliad
I wouldn't be surprised if LinkedIn sent this data back, but browser plugin
detection sounds like a common ingredient in browser fingerprinting, which can
be pretty useful for things like A/B testing signed-out users or detecting ad
fraud. I don't place LinkedIn in very high regards, and I'd be surprised if
they ever asked a user if they could fingerprint their browser in an
unambiguous way, but I don't know if I find this particular thing to be
exceptionally evil.

------
chefandy
Here's a permanent public archive of this article in case it goes down for
some reason.

[https://perma.cc/23ZX-JZZB](https://perma.cc/23ZX-JZZB)

(FYI: Perma.cc, an anti-link-rot service run by Harvard Law School, is free
for up to 10 links per month. The project is run by my department, but I'm not
on the Perma team.)

------
hhs
What are the alternatives? LinkedIn is pretty powerful with network effects.
At one point, I thought angel.co would gain traction.

I wonder if there will be a successor to LinkedIn in the near future?

~~~
LargeWu
I just don't use it, and somehow I'm doing just fine.

~~~
eridius
Same. Many many years ago I straight-up deleted my LinkedIn account (in an
attempt to cut down on recruiter cold-calls, which didn't work) and I've never
suffered as a result.

------
Tepix
I'm glad I don't use LinkedIn.

But one day I may have to.

Can the URL be blocked in uBlock Origin so that the uploading of the collected
data will not take place?

------
taf2
Might this be a fingerprinting technique they use to ensure you're not abusing
their system?

------
emrekzd
_How would you feel if you opened a program and the program started to check
your file system to see what other programs you had installed?_

Not true. You definitely figured out how to market your scraper by lying. Nice
job!

------
caiocaiocaio
You've got to admit, though, being the most scummy social networking site is a
pretty big achievement.

~~~
dimaryz
Have you ever heard of FB? :)

