
The Fight to Mine Your Data and Sell It to Your Boss - artsandsci
https://www.bloomberg.com/news/features/2017-11-15/the-brutal-fight-to-mine-your-data-and-sell-it-to-your-boss
======
wfo
The poignant part of the headline here is "sell it to your boss". I think we
have already lost the battle over data mining. It is happening, and will
continue to happen and it seems to me there is really nothing we can do short
of drastically restructuring the entire Internet.

The biggest problem is when you combine intrusive massive surveillance with an
ultra-powerful entity that can be capricious, malicious, and often abuses and
punishes its underlings. This is why I am far more concerned (as a selfish
American, thinking for now only about the concerns of Americans) about the FBI
having unlimited surveillance powers than the NSA: the FBI has a history of
committing crimes, horrifying suppression of speech, dissent, and political
movements, of destroying the lives of innocents, etc. The FBI can knock on
your door with guns and throw you in a cage for decades. It has massive power
over us so it is important we do not give it massive information about us at
the same time.

The same is true with bosses. One technique to deal with this is to try and
firewall the information off away from the powerful. This is what we did to
the FBI before parallel construction (bills that say data will be collected
"only for terrorism" -- a lie, obviously, as it always is, but one that made
the collection palatable). The article focuses on this approach, which is in
my view a lost cause, but a lost cause worth fighting for nonetheless.

The other approach is to take away power from bosses. Right now a boss has
complete and total power over his employees without labor unions or worker
protections (i.e. the current state of affairs in the US). It is this
combination with the information that is so disgusting and dystopian. So we
could, instead, talk about taking away the _power_, rather than taking away
the _information_ -- which would have the added benefit of solving many other
social problems along the way.

~~~
elhudy
An approach we can take on an individual level is to be mindful of the data we
are giving up. If you are employed and in the process of job hunting, then
stop "liking" the article headlined "10 Reasons why you deserve better from
your company".

~~~
Trav5
I call BS on the comments that say doing what you suggest is giving up... To
me, "Liking" something implies expressing your opinion publicly, probably via
Facebook. When I receive a job application, the first thing I do is check to
see what the person has publicly posted. That's not surveillance. Before the
general population starts to worry about surveillance, they need to think
about the public image they are portraying online. I can't hire someone who
posts stupid shit publicly. It could damage my business if my clients look
them up.

~~~
gaius
_That's not surveillance_

Yes it is. No less than if you got their postal address from their CV, parked
outside their house and watched who came and went. After all they volunteered
their details and the street outside their house is public... right?

~~~
Chriky
I mean, it is clearly "less" isn't it?

This continuum fallacy stuff really just makes privacy advocates look like
kooks.

You have to use the language other people use if you want to convince them,
and normal people do not draw an equivalence between someone googling their
name, and someone staking out their house, because there is an ocean between
them.

~~~
gaius
_This continuum fallacy stuff really just makes privacy advocates look like
kooks_

You've clearly not read the article, in which a judge says that planting a
tracking device on a car is materially different than tailing it.

------
giantsloth
I'm going to posit that most of us are spoiled by how much attention we get
from recruiters, which is why the attention on this article seems to be on
whether scraping someone's site is legal or not.

The focus should be on how disturbing it is that a company is using metrics
like "independence from employer brand" to take the power out of the hands of
the worker and put it into the hands of the corporation who already wields so
much power and influence over our society.

Programmers are lucky, a last bastion of decent treatment by corporations.
Companies like HiQ are looking to tip the scale back in favor of the
corporation, run by people like Mark Weidick who want to be a useful and well
kept pet and identify as a "Silicon Valley entrepreneur. Hollywood wanna-be."
(twitter)

I fear secretive preemptive firing and hiring. "Talks" from your manager,
based on encroachments into your private internet browser. This will force
developers to combine their personal identity with their corporate identity
(which far too many developers do already, wearing their respective company's
t-shirt like a big walking free advertisement) and curate their online life to
reflect how grateful they are to the lords of their fiefdom.

~~~
QAPereo
Give it time, and _Snow Crash_ is going to look like a utopian vision.

~~~
bdamm
The day I wipe my ass with dollar bills will be one of the saddest days in my
life. (Note: Employer, NSA, FBI, DHS, this one's for you!)

------
whack
I think the two main questions raised in this case:

1. Is it always ok for someone to build a bot to do something which can be
legally done by hand? Example: Building a LinkedIn scraper that tracks all
public data. Or using a GPS tracker to track a car, instead of manually
following it.

2. Is it anti-competitive practice, and a violation of anti-trust laws, for
LinkedIn to allow the general public, and other companies like Google, access
to its public data, but ban others such as HiQ?

On question 1, I tend to lean towards LinkedIn's position. Just because
something can be legally done by hand, shouldn't automatically mean that we
should allow it to be done at massive scale by automated scripts. I wouldn't
want companies having the right to surveil the movements of every citizen
24/7, just because they have the right to follow someone on foot, and I think
a similar argument can be made against HiQ.

On question 2 though, I agree with HiQ. LinkedIn's attempt to ban HiQ doesn't
seem like an attempt to protect their users, but rather, an anti-competitive
attempt to kill off a potential competitor, and secure the market for
themselves.

~~~
lovich
Haven't the courts already decided that if things can be done by hand then
they can be automated? For example license plate scanners were ruled legal
because officers can view that information in public, but the automation
allows the police to follow everyone's movements with only a few machines set
up. This ship has sailed.

~~~
matt4077
Automated license plate scanning is probably legal for local municipalities
(Neal v. Fairfax County, SC appeal pending).

That doesn't mean it's legal for private entities to do so, nor is it
necessarily legal for such data to be, for example, aggregated nation-wide.

See [https://www.theatlantic.com/politics/archive/2014/02/mass-
su...](https://www.theatlantic.com/politics/archive/2014/02/mass-surveillance-
of-all-car-trips-is-nearly-upon-us/283922/) for Conor Friedersdorf's article
which is credited with bringing down the last attempt at a "Homeland
Security" database of where you were last summer.

~~~
mjevans
The correct solution for this concern is to make it legal to obscure the
IDENTIFYING part of a license plate (but require that the state and 'tabs' are
still exposed) while the vehicle is parked.

Moving the cover would constitute modification of private property and
should require either a valid documented probable cause or a warrant for the
search.

------
CalChris
This approach strikes me as confusing _precision_ for _accuracy_. It is
possible to use this sort of information to formulate an extremely precise
model. That apparent precision becomes believable because ... it's precise and
complex and uses a lot of data. That then becomes a salable product.

However, whether this all is actionable and accurate is another and
unfortunately _later_ question. The promoters and customers of that approach
might want to read this chapter in the CIA's _Psychology of Intelligence
Analysis_ :

Chapter 5: Do You Really Need More Information?

[https://www.cia.gov/library/center-for-the-study-of-
intellig...](https://www.cia.gov/library/center-for-the-study-of-
intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-
analysis/art8.html)

But then they might not have a product to sell or buy.

~~~
alanfalcon
I dunno, I was acutely aware that as I prepared to find a new job by updating
my resume, updating my LinkedIn profile and contact list for the first time in
years, etc., that it would be very obvious to anyone watching what I was
doing. I correctly assumed my employer wasn’t watching (which would have made
things potentially uncomfortable for me), but with tools like this it becomes
so low effort to watch for this stuff that they can afford to do it.

No, not everyone is like me and ignores LinkedIn while actively and happily
employed, but I imagine enough people are and that it’s not so hard to build
models of other user types that work as advertised (as long as they have
access to the public data).

HiQ is a threat to LinkedIn on two fronts: as a direct competitor and as a
reason for some people to opt out of using the service (though it appears
LinkedIn offers similar services, so perhaps the threat is more the Streisand
effect at work). That doesn’t mean HiQ should be locked out of the data of
course, but it makes for an interesting and complicated situation.

~~~
CalChris
I'm not saying that social media doesn't exist and that anything you say can't
and won't be used against you in a court of law. It will be. What I am saying
is that relying on a complex model based on this is just rank silliness.
Still, I guarantee that that rank silliness will be a product, it will be sold
and it will be bought.

For a big market with small bets (ads) this makes sense. For a small market
with big bets (employment) it doesn’t make sense, to me at least.

~~~
21
I think you make the mistake of not thinking in statistics.

It doesn't matter if it's not always right, it only matters if it's better
than what they have at the moment (ie: almost nothing).

Like the quoted bank said in the article, improving retention by 1% can lower
costs by 100 mil per year. Offering a raise to people polishing their LinkedIn
page sounds like low hanging fruit to me.

And of course, employees will try to game the system, which will become
smarter and so on.

~~~
CalChris
A statistician went duck hunting. His first shot was a foot high. His second
shot was a foot low. When asked about it he said, _On average, that's a dead
duck._

As I said, there are markets, large markets like ad auctions, where this
approach makes sense. But for HR, I'd have to see this being a demonstrable
success story before I'd touch it. I think there are better and easier
approaches to retention like treating your employees well rather than looking
at a statistics dashboard and following its sage if soul-less advice.

------
id2531513
What HiQ are doing will be illegal in Britain when the new data protection
regulations are introduced in May 2018.

You will no longer be allowed to collect and store personal data without
consent.

------
atmosphereiv
Well, now it is time to delete my LinkedIn account. Their terms of service
say that they would not allow people to steal my information and that my
information is my own. I have never agreed for my info to be resold by HiQ.
Now that it is illegal for LinkedIn to protect me from HiQ's theft the only
option I have is to drop LinkedIn.

~~~
avh02
I believe it's done only on publicly visible accounts, modify your privacy
settings _if_ you don't want to go all the way to deactivation.

------
tempodox
Now I'm just glad I never signed up for LinkedIn or any of these. Can't trust
any data repository that's not my own.

------
default-kramer
You know, I think LinkedIn actually has a point that blocking certain scrapers
is in the interest of their users' privacy. Of course, they are really just
trying to stifle a competitor (glad the judge called them out on it) but I
still think they should have a right to block scrapers at will.

~~~
id2531513
The judge's argument is fine if HiQ had to contact every profile owner and ask
for their consent to store and use their data. Otherwise, how does an
individual know where their information is going?

------
DannyBee
It's interesting to see EPIC and EFF on different sides of the same issue. I
don't feel like I've ever seen that happen.

~~~
Analemma_
I wonder if it might've been best for the EFF to just stay out of this one
entirely, because it seems like sticking up for either side is defending and
normalizing a shitty proposition. I see their point that LinkedIn is abusing
the CFAA to stop competitive scrapers, but on the other hand, evil as LinkedIn
may be, at least the users on it agreed to its terms. If I sign up for
LinkedIn and give them my data, I agreed to that. If HiQ slurps it up, I'm not
consenting to whatever sinister things they decide to do with it. That feels
wrong and not something the EFF should be sticking up for.

(Disclosure: I used to work for Microsoft, which now owns LinkedIn, although I
like to think I'd have this opinion either way)

~~~
21
Yes, I don't quite get EFF's point.

So they are saying that anybody should be allowed to scrape anything found
online, and use that data for any purpose?

In this case it can be argued they use the data against the physical person
whom the data is about.

So they wouldn't object if banks or credit companies would use the data to
reject non-desirable people.

~~~
mjevans
My armchair impression is that the EFF doesn't like the precedent this might
set. I think you are correct in them wanting individuals to be able to scrape
or otherwise mechanically process anything which they can normally view.

The real chilling effect is from the lack of actual privacy on the data
itself; and the existing asymmetry of power in the employer / employee model as
we know it today.

------
irrational
I am so thankful I don't have a LinkedIn, Facebook, etc. account to mine for
data.

~~~
psychometry
I assure you that Facebook knows all about you whether you have an account or
not.

~~~
tjoff
Highly doubtful Facebook has any idea of even gender and age. They have a broad
idea of location and perhaps a pretty laughable attempt at "interests". Absolutely
guaranteed to be nothing of value though.

~~~
improbable22
Oh they have much more than this.

Messenger asks nicely to upload your whole address book, and since half your
college buddies clicked yes they know when and where you studied, and who you
knew, and your parents' landline thus some idea of their location (and how
wealthy it is). Ditto your colleagues at each place you worked, who all still
have your old corporate email, and some have newer info...

And that's before anyone posts anything on facebook.

~~~
tjoff
And yet they can't show even a semi-related ad if their entire existence
counted on it.

They have the data, yes. But they will ruin society before having any idea on
how to use it.

Most importantly, they cannot correlate that with my IP or my browsing
history. So that information is beyond useless when trying to track me and
only slightly useful for researching the society in which I grew up (which
facebook isn't interested in anyway).

~~~
improbable22
This is also true. I don't see how they're making money off their extensive
knowledge about me.

When I make the mistake of buying something online in a browser logged into
facebook, then I get ads for a week trying to sell me what I just bought...
this does not lead to me buying another one! And all this strategy seems to
need is basic cookies, not a shadow profile.

------
haxel
There's a positive framing of the idea of mining (or collecting) information
about yourself and selling it to your boss: it can help get you raises and
promotions.

After all, when you ask for a raise you're in a wonderful negotiating position
when your value is clearly quantified. When you mine/collect the information
yourself, you get to choose how and when you present it. Thus you are selling
the information to your boss and reaping the rewards yourself.

~~~
speedplane
At first, it will help you get raises. Then over time as everyone does it, you
won't get a raise unless you participate. Over time, you won't get hired at
all unless you participate.

------
V2hLe0ThslzRaV2
The official opinions of EFF and EPIC on the case:

* EFF: [https://www.eff.org/deeplinks/2017/08/judge-cracks-down-link...](https://www.eff.org/deeplinks/2017/08/judge-cracks-down-linkedins-shameful-abuse-computer-break-law)

* EPIC: [https://epic.org/amicus/cfaa/linkedin/](https://epic.org/amicus/cfaa/linkedin/)

------
s73ver_
Why does HiQ have any right to my data in the first place?

~~~
meritt
Because people self-submitted their own information to LinkedIn with the
desire for it to be publicly accessible.

~~~
jongisli
And what is the reason people want their own information to be publicly
accessible, do you think?

My public information on LinkedIn is public - to market myself. I didn't make
it public so that another company could store it in their own databases,
analyse it and sell it to "my boss".

This is a discussion touched upon in the article "What were the public squares
and private rooms of the web? Who got to determine access? Should data be
protected as speech?"

Anyway, as mentioned in another comment, the highly unethical work HiQ is
doing will be hard to capitalize on with the GDPR coming up.

------
deckar01
I think LinkedIn's mistake was letting the lawyers fight over it. They could
have just blocked the bots with a captcha and called it a day.

~~~
speedplane
Captchas are easy to beat these days, even fancy reCaptchas. They make the
process harder and a bit more expensive, but they don't solve the problem.

------
mythrwy
Good.

My Boss is going to find out I volunteer at soup kitchens and help old ladies
across the street. Oh, and I also love his favorite band and agree with him
politically and didn't even have to gratuitously mention this around the
office like a big suck up. A promotion is just around the corner!

Wasn't it just last week we saw articles on the front page almost every day
about how we can't trust information from the Internet because Facebook and
Google are big lying spying monopolies and sell clicks to anyone and don't
responsibly curate like they are supposed to and we should all go back to
reading the newspaper to get the real facts on life? Ya, pretty sure that was
the gist of them. Anyway, hopefully my boss didn't read any of those. We'll
just leave him believing that data mining and scraping and twitter sentiment
analysis have super secret powers to reveal the hidden truth about the world.

------
megamindbrian2
This is stupid. While my boss has a vested interest in my data, I am the only
one that can benefit or change because of it.

~~~
dsp1234
If your employer knows you are looking for other opportunities, they could
stop giving you projects and start transitioning you before you are ready, or
just fire you.

That's a benefit to the employer, that probably is not a benefit to you, since
you would likely want to quit on your own terms.

Another quick example is knowing that your employees are researching how to
unionize, and thus knowing to bring in anti-union resources.

I'm sure there are more.

~~~
mjcl
Or even create a self-fulfilling prophecy. What if someone is listed as "may
leave" when they have no intent to do so? If the employer then starts freezing
the employee out, that could lead to the employee eventually leaving, even if
they didn't originally want to.

