
Think You’re Discreet Online? Think Again - tysone
https://www.nytimes.com/2019/04/21/opinion/computational-inference.html
======
akyu
There's another downstream issue of this that many people seem to be unaware
of.

I've gone down the road of trying to create a super private browsing
experience (www.privacytools.io is a good start).

The issue is that the web experience degrades rapidly and is quite frequently
unusable. You have to block CDNs for a start if you really want to go full
paranoia mode. And there are too many other issues to list here.

But the biggest issue is that Google's CAPTCHAs become unsolvable. I mean
literally they won't let you pass. You will get served a large sequence of
captchas. I've experienced up to 6 for a single verification. And not only do
you get more, but they deliberately add an excruciatingly slow delay to each
image. It can take 10-15 seconds to solve a single image set, and then you
still need to do 5 more.

Also, an interesting side note: it seems they are injecting adversarial noise
into the hard captchas, as I noticed faint distortions in most of the images.

But even with your best effort to solve them you will not get through. I've
repeated this experiment many times. It seems that preventing browser
fingerprinting is the thing that really makes them put up a red flag, but it's
hard to know for sure. And I'd like to emphasize that none of the privacy
modifications I was using make any significant change to normal browser
functionality. You still have JS, you still have cookies (through Firefox
containers).

Anyway, the scary thing here is that it's very easy to extrapolate this
further and get to an internet where you either opt in to being tracked, or
you are effectively throttled/gated on huge parts of the web.

Google CAPTCHAs have gone from asking the question "Are you human?" to "Which
human are you?".

And the beautiful, incredible irony is how this contrasts with Google's very
public stance on net neutrality.

"Internet companies, innovative startups, and millions of internet users
depend on these common-sense protections that prevent blocking or throttling
of internet traffic, segmenting the internet into paid fast lanes and slow
lanes, and other discriminatory practices. Thanks in part to net neutrality,
the open internet has grown to become an unrivaled source of choice,
competition, innovation, free expression, and opportunity. And it should stay
that way."

Do as I say. Not as I do.

~~~
dcolkitt
I could be wrong, but I assume that Google mostly does this because of the
huge amount of malicious garbage bot traffic on the web. Once you go fully
anonymous and private, there's no way to distinguish you from the hordes of
malicious traffic.

But it seems like there should be a technological solution. Google knows what
a "legitimate" user looks like, either because they come from a safe IP, or
they're registered Google users, or they have a normal established browser
history. But using your legitimate identity compromises your privacy, because
then your IP/user-account can be correlated to your personal identity.

So, let's create zero-knowledge-proof identity schemes. Using your
legitimate, but non-private, identity you "register" an ephemeral, private,
anonymous identity. But it's done using a zero-knowledge proof, such that
Google has no idea who is behind a particular anonymous identity, beyond that
it must belong to one of its several million legitimate identities.
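This isn't purely hypothetical: anonymous-credential primitives such as blind signatures do roughly this (Privacy Pass later built a similar scheme on elliptic-curve OPRFs). Below is a toy RSA blind-signature sketch; the primes are tiny and the construction insecure, so treat it purely as an illustration of the unlinkability idea, not any real deployment's protocol.

```python
import hashlib
import secrets
from math import gcd

# Toy issuer keypair. Real keys are 2048+ bits; these primes are illustrative.
p, q = 1000003, 1000033
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (Python 3.8+)

def h(msg: bytes) -> int:
    """Hash a message into the RSA group."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

# 1. Under its known, "legitimate" identity, the client blinds a random token.
token = secrets.token_bytes(16)
m = h(token)
while True:
    r = secrets.randbelow(n - 2) + 2   # blinding factor, coprime to n
    if gcd(r, n) == 1:
        break
blinded = (m * pow(r, e, n)) % n

# 2. The issuer signs the blinded value; it learns nothing about `token`.
blind_sig = pow(blinded, d, n)

# 3. The client unblinds: (token, sig) is now an anonymous credential.
sig = (blind_sig * pow(r, -1, n)) % n

# 4. Presented later from a private session, the credential verifies without
#    being linkable to the registration step.
assert pow(sig, e, n) == h(token)
```

The unblinding works because (m * r^e)^d = m^d * r (mod n), so multiplying by r^-1 leaves a plain signature on a token the issuer never saw.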

~~~
akyu
I would love for something like that to happen, and given all of the activity
in the blockchain space, someone will probably come up with a clever solution.
I just have a hard time imagining a motivation for Google to switch to using
such a thing even if it were invented. The way it works now benefits them.

But even just looking at it from a purely technical perspective: if CAPTCHAs
are intended to be a "completely automated public Turing test to tell
computers and humans apart", then it's quite clear that they no longer
accomplish this. I am a human and I can't pass. The model is wrong; build a
better one.

~~~
sandworm101
>> If CAPTCHAs are intended to be "completely automated public Turing test

That's what Google says they are for, but they aren't used that way by
everyone. Many websites use them not to detect machines but to _slow down the
humans_. They want to make something burdensome, so they add the captcha.
Maybe they want you to subscribe, pay money, to avoid it. Maybe they just want
you to spend a few more seconds looking at some banner ad. Or maybe including
a captcha ticks a useless security box on some compliance form.

------
gatherhunterer
Ironically, I opened this in private mode and was turned away. The New York
Times is trying to have it both ways. I can log in and be tracked by their
advertisers, or I can be discreet and open articles from another source. I
don't mind paying a small subscription, but let me choose to deny your
advertisers information on how I consume your content. If being tracked is
part and parcel of being a subscriber, then these pro-privacy articles look
almost unethical.

~~~
CharlesColeman
> The New York Times is trying to have it both ways.

No, you need to think of the NYT (and any reputable newspaper) as being two
_separate_ entities: the newsroom and the advertising department, with a
firewall between them [1]. You really _don't_ want the newsroom killing a
story because it makes the practices of the advertising department look bad.

[1]
[https://en.wikipedia.org/wiki/Chinese_wall#Journalism](https://en.wikipedia.org/wiki/Chinese_wall#Journalism)

Also, demanding that the people who expose bad privacy practices have perfect
privacy records themselves is to demand that privacy advocates form a circular
firing squad [2], so no one wins but the advertisers and privacy-invaders.
Personally, I want these exposés carried by the publications with the greatest
numbers of readers, and I'm not going to gripe too much about the practices of
those publications as long as the message gets out.

[2]
[https://en.wikipedia.org/wiki/Circular_firing_squad](https://en.wikipedia.org/wiki/Circular_firing_squad)

~~~
closedgrave
> being two separate entities

They aren't really though, they both make money on my privacy with relatively
little consent. If that's the bad thing, I'm not sure I care that one of two
parties says "sorry, I know this is bad" while still doing it. This could
actually make them the worse of the two parties.

> [1]. You really don't want the newsroom killing a story because it makes the
> practices of the advertising department look bad.

This feels like bait; no one is asking for the newsroom to kill the story,
they are asking for the newsroom to kill the practices of the advertising
department.

> Also, demands that the people who expose bad privacy practices have perfect
> privacy records themselves

It's not a demand that they have perfect privacy records. It's kinda just
pointing out that these authors have control over who they publish for, and
that it's quite hollow to warn people about a poisonous medium in a way that
draws more people through that medium.

This isn't to say they are bad people, just that they aren't great for
pointing out bad practices they take part in 10 years after society has
already baked them in.

I do warn that my opinion is colored by a belief that news orgs are
responsible for a large part of the normalization of our lack of privacy and
our current relationship with marketing, which I see as sort of proto
you-gotta-believe-me methods.

[edit for formatting]

~~~
CharlesColeman
>> being two separate entities

> They aren't really though, they both make money on my privacy with
> relatively little consent.

They are and they aren't: they're two separate parts of one business. The
advertising department makes money for the owners, while the newsroom writes
the stories (and is hopefully insulated enough from the interests of the
owners and ad department that it can be honest).

> this feels like bait, no one is asking for the newsroom to kill the story,
> they are asking for the newsroom to kill the practices of the advertising
> department.

The newsroom doesn't really have that authority. The firewall is there to
protect the newsroom _from_ the advertising department, because the ad
department is naturally more powerful (due to newspapers being businesses and
the ad department being the part that actually collects much of the revenue).

> Its kinda just pointing out that these authors have control of who they
> publish for and that its quite hollow to warn people about a poisonous
> medium in a way that draws more people through that medium.

The world is more complicated than that. It's more poisonous to distract from
the warnings just to point out they weren't published in some low-reach niche
publication whose privacy practices satisfy some random internet commenter.

If you actually care about privacy, you should celebrate these articles
because they might reach a wide-enough audience to actually cause real action
to fix the problems.

~~~
bryan_w
>The newsroom doesn't really have that authority.

They could refuse to publish their articles for a corporation with such an
"immoral" advertising department, but they don't, because they aren't genuine
in their beliefs.

~~~
CharlesColeman
> They could refuse to publish their articles for a corporation with such an
> "immoral" advertising department, but they don't, because they aren't
> genuine in their beliefs.

That's an un-nuanced and extremely uncharitable statement. Have you ever heard
the saying "choose your battles?" Don't you think people have to prioritize
the actions they take to support their multifarious beliefs? You can have
genuine beliefs without being destructively unreasonable.

I'm pretty sure the people in the New York Times newsroom value the public
good of having a functioning "fourth estate" [1] over having a tracker-free
nytimes.com website, and thus are unwilling to wage some destructive pyrrhic
war with their employers over something as trifling as the latter.
_Especially_ when they can instead write a widely read series of articles that
bring light to those practices, and perhaps lead to wider change.

[1]
[https://en.wikipedia.org/wiki/Fourth_Estate](https://en.wikipedia.org/wiki/Fourth_Estate)

~~~
closedgrave
But the fourth estate existed before the internet super-powered predatory ad
practices. I get that predatory advertisement was baked in from the beginning
for newspapers, and that a lot of work has been done to mitigate those roots,
but I feel you're pushing a false dichotomy as nuance here. It's not an
all-or-nothing, tear-down-the-system, no-one-can-ever-advertise-or-write-
newspapers-again problem. It's a "this system is becoming more toxic, let's
hope the people who built that system will help us transition away from that
toxicity" problem, combined with a "damn, I wish it wasn't so difficult to get
someone to understand something when their salary depends upon them not
understanding it" problem.

I would suggest

> the New York Times newsroom value public good of having a functioning
> "fourth estate"

AND it being well funded

AND that they are a part of it both individually(authors) and as an
organization (NYT brand)

> over having a tracker-free nytimes.com website

or tracker-free alternatives/competition.

~~~
CharlesColeman
> but I feel you're pushing a false dichotomy as nuance here.

No, not really. The GGP was basically pushing "we could destroy the village in
order to save it" logic. The NYT is one of the few newspapers that may be able
to weather the economic maelstrom that journalism is in the middle of, so it
makes no sense for its journalists to go to war with its management over a
_niche issue_, to satisfy a few strident people in the internet peanut
gallery. I think it's pretty obvious that the turmoil the GGP's idea would
cause would have far more negatives than positives.

Running all these stories definitely has more positives than negatives.

> AND it being well funded

> AND that they are a part of it both individually(authors) and as an
> organization (NYT brand)

Those are things you need for a functioning fourth estate. Some people used to
think that blogs (e.g. sites that lack the things you listed) could replace
newspapers, but they were wrong.

~~~
closedgrave
>> refuse to publish their articles for a corporation with such an "immoral"
advertising department,

>> GGP was basically pushing "we could destroy the village in order to save
it" logic.

Why do I feel any criticism of reporters reads that way to you? I don't think
they want to destroy reporters/the 4th estate at all. It seems like what they
want is for the content creators to try to work for "moral" people and use the
power they have as content creators to do so. This is not hugely simple, and I
agree with you on that. The false dichotomy you're pushing into it is that any
change which is hard is equivalent to destruction, or that any change that is
fought for is someone going to "war" with the 4th estate.

The 4th estate existed before they rebuilt their village on sand. We would
rather have a solid foundation than have them complaining about the sand they
built the village on, while also claiming that attempts at change are just not
feasible. Complaining about a thing and then saying "but it's okay as long as
I get mine" is exactly the thing that makes it seem hollow.

>> Those are things you need for a functioning fourth estate. Some people used
to think that blogs (e.g. sites that lack the things you listed) could replace
newspapers, but they were wrong.

I mean, the fourth estate is not defined as being any individual reporter or
brand. So this is a claim on your part that no reporter should lose or leave
or protest their job, and no paper should ever go defunct, if we have a
functioning fourth estate... I don't even really know how to address this
claim, but it seems like it's probably not what you mean? The point of adding
those two things was to draw attention to the incentives of the individuals.

As for blogs, they also generally have trackers; the reason they couldn't
replace newspapers was not the lack of trackers on them, and people don't go
to the New York Times to see the ads.

It really feels like you're painting a picture of this all-or-nothing
situation with no individuals in it, while this person

> I don't mind paying a small subscription but let me choose to deny your
> advertisers information on how I consume your content.

seems to want to pay reporters instead of being tracked and this other one

> They could refuse to publish their articles for a corporation with such an
> "immoral" advertising department, but they don't, because they aren't
> genuine in their beliefs.

seems to be a "stop claiming you're so good, please" situation as opposed to
the "destroy them all, muhaha" thing you're claiming.

> niche issue, few strident people, peanut gallery, pyrrhic war, low-reach
> niche publication, some random internet commenter.

Come on now, really? We are good enough for them to write a bunch of articles
about our issue, just not good enough to actually try and do anything? And we
can't point out the hollowness of that position?

I guess I'm sorry I responded. I felt your original comment was informative,
but it covered two side channels that weren't really a response to the content
of the comment you were responding to. That is to say, while both things you
stated in your OP were true, neither seems to contradict the idea that the NYT
wants it both ways; they actually seem to explain the method by which they
achieve having it both ways. I hoped we could get to a better shared
understanding of the positions, but now I feel like you're seeing my and the
others' arguments as "burn it to the ground" as opposed to "It's nice that
they started talking about it being bad a little more frequently; it would be
nicer if they chose not to do it, or at least tried to support some
alternatives".

------
malvosenior
What I don't understand is with all of this tracking, ML profiling... Why
don't I get ads that are relevant to me at all? I'm very active all across the
internet, proactively rate ads as "not relevant to me", _would actually be
interested in_ learning about products that I might like. Have disposable
income... Yet nothing, just a sea of hot garbage ads that might as well be
noise. I can't recall ever seeing something relevant enough to click on.

I think the effectiveness of all of this stuff is way overrated. I have money,
I will spend it, but with all the tech in the world the industry still can't
put a relevant ad in front of me.

~~~
tomjen3
I am mostly in the same situation (except Facebook has _maybe_ found me
something I would be interested in buying), for all their information, for all
their smart people, etc, they have 1 product...

My guess is that it is because most ads suck, because a) 90% of all products
are crap, and b) good products don't need that many ads.

If somebody comes up with a better whatever, I will probably hear about it.
Most new products are not revolutionary enough for that, but they are also not
revolutionary enough for me to be interested in them as an ad.

~~~
malvosenior
The thing that gets me is I have a bunch of hobbies that I enjoy but don't
really have time to dive super deep into. They aren't anything weird and there
are many companies servicing their markets. I'm sure I have very little
awareness of a bunch of things I'd like to buy for these.

If I was being shown ads for products around my hobbies I would:

1. Definitely discover companies and products I wasn't aware of.

2. Buy stuff I probably didn't need but would be tempted into owning.

------
blfr
Laymen try to be careful online in completely unproductive ways: they keep
logging in and out of Facebook without even deleting cookies, they disable
data transmission on their phones when at home, etc.

The only way to keep privacy online is to compartmentalize. Use one proxy and
browser profile for discussing politics, another for HN/Reddit, yet another
for LinkedIn. And always block everything you don't need. Start with ads,
since you virtually never need ads. Use various nicknames, and avoid Facebook
if you can swing it.

Beyond that you only need a secure (up to date) OS, Signal, and maybe
occasionally Tor. It may or may not hold up against an NSA-level adversary but
you will easily lose most advertisers and corporate surveillance.

~~~
aristophenes
How does this work when you can be individually identified by analysis of the
things you write in comments or your mouse movements? And that can be
correlated across HN, Reddit, LinkedIn. I suspect that almost all of us are
the laymen you first described, to someone.

~~~
blfr
HN doesn't correlate mouse movement tremors with LinkedIn so it holds up very
well against actual online threats today. It may or may not hold up against
scifi threats of tomorrow.

~~~
acqq
> HN doesn't correlate mouse movement

The "fingerprint" of the writing style, the interests and "likes" and
"dislikes" patterns is often enough, the more one writes the more "uniqueness"
can be determined.
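A crude sketch of why this works, using character trigrams and cosine similarity (the sample sentences are invented, and real stylometry systems use hundreds of richer features, so this is only the shape of the attack):

```python
from collections import Counter
from math import sqrt

def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams after whitespace normalization."""
    t = " ".join(text.lower().split())
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Two samples by the same hypothetical author vs. a different author:
author1_a = "I've gone down the road of trying to create a private setup."
author1_b = "I've gone down the road of blocking trackers everywhere."
author2 = "Quarterly revenue exceeded guidance across all segments."

same = cosine(trigram_profile(author1_a), trigram_profile(author1_b))
diff = cosine(trigram_profile(author1_a), trigram_profile(author2))
assert same > diff  # shared style/vocabulary yields higher similarity
```

With enough text per pseudonym, comparisons like this can link accounts across sites without any cooperation between them.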

------
username223
Laws controlling the use of data, like those for credit reporting agencies,
would help. So would laws on collection and sale of data. Tracker-blockers
help a bit more, but they're an eternal battle.

What we need is new social norms. Most people don't point video cameras at
their neighbors' windows, not out of fear of being caught, but because it's
just _wrong_. Similar norms should apply online, and people and companies who
do it should be shunned.

------
Mirioron
I think we're not really there yet. I think it's entirely possible for you to
avoid most of this type of tracking if you so choose, but it does mean not
using services such as Facebook, Instagram, and WhatsApp, and not giving out
your phone number to people who use these services.

What bothers me much more is that it might be possible to identify somebody
through text analysis.

~~~
thatoneuser
Deliberate obfuscation. Change the way you talk over time. Change your users
and passwords. Even go so far as to say things that aren't the slightest bit
true but can't be differentiated one way or another. Create a matrix of
plausible personas you could possess and never let on to which sum of them you
really are.

~~~
Nextgrid
Wouldn’t it be better to shape the world in such a way that those techniques
aren’t needed? Don’t work for nasty companies, don’t invest in them, don’t do
business with them and lobby the government to outlaw tracking & stalking.

~~~
Mirioron
I don't think that's going to work in the long run. You can't stop
technological progress.

~~~
Nextgrid
You can’t stop progress, but you can stop it being used for certain purposes,
and laws mostly work as long as they’re enforced correctly.

------
telesilla
I've had an idea for a long time: what would happen if you had some silent
tabs running alongside you, randomly opening and closing browser pages,
clicking, etc. - acting like a human but with the purpose of creating noise?
It would also involve submitting articles and social media posts using your
name, to make it harder to be found on search engines with any real clarity.
If anyone knows of a project like this, please let me know. I'm aware it would
break all kinds of ToS, but all is not fair in this privacy game.

~~~
CharlesColeman
[https://adnauseam.io/](https://adnauseam.io/)

~~~
telesilla
Thank you! There is only one feature that keeps me on Chrome (I've moved most
of my browsing to Firefox and Opera), and that's the built-in translation as
I'm on a lot of non-English websites during the day. This will make my daily
Chrome use a lot more enjoyable.

~~~
eikenberry
The built-in translation works on Chromium and might work on some of the
privacy oriented Chromium based browsers as well. I normally use Firefox and
use Chromium when I want this feature.

~~~
telesilla
I tested a few Chromium browsers today with privacy features, and the
translation feature is disabled. I will try raw Chromium - I'm curious how
buggy it is to run it straight off dev.

Epic: translate is disabled
Iridium: 403 on using inline translate
Chromium: 403 on using inline translate

API key errors abound. After wasting way too much time I went back to Opera
and finally found inline translation with this extension:
[https://chrome.google.com/webstore/detail/google-translate/a...](https://chrome.google.com/webstore/detail/google-translate/aapbdbdomjkkjkaonfhkkikfgjllcleb)

Now everything is well. Thanks for your tip however!

------
mrw
Every article/report about online privacy reminds me of how many people I know
ignore these facts, don’t know them, or brush them off as they ‘don’t find it
harmful’/‘have nothing to hide’. Can anyone give me some suggestions/point to
a resource on how to present these facts to the non-tech people I care most
about in a way that will make them actually care?

------
jocoda
From the article:

>Such tools are already being marketed for use in hiring employees, for
detecting shoppers’ moods and predicting criminal behavior.

Surely, it's just a matter of time before we start gaming systems like this?

You already have online reputation management for products and services. So
how long is it before someone offers shaped online profiles for individuals as
a service?

~~~
rolph
It's almost now.

------
o10449366
"But today’s technology works at a far higher level. Consider an example
involving Facebook. In 2017, the newspaper The Australian published an
article, based on a leaked document from Facebook, revealing that the company
had told advertisers that it could predict when younger users, including
teenagers, were feeling “insecure,” “worthless” or otherwise in need of a
“confidence boost.” Facebook was apparently able to draw these inferences by
monitoring photos, posts and other social media data."

Ironically, that's exactly what the NYT has started doing recently! [0] I
think they've realized just how lucrative selling these inferential insights
can be; regardless of how accurate these metrics are, advertisers _eat them
up._
How long until NYT writers start receiving pressure from upper management to
produce content that maximizes emotional output from their readers? It's
possible that proper restraint and separation of departments can be
maintained, but I think history has shown that when profit motives are
involved, greed often usurps ethics.

[0] [https://investors.nytco.com/press/press-releases/press-relea...](https://investors.nytco.com/press/press-releases/press-release-details/2018/The-New-York-Times-Advertising--Marketing-Solutions-Group-Introduces-nytDEMO-A-Cross-Functional-Team-Focused-on-Bringing-Insights-and-Data-Solutions-to-Brands/default.aspx)

~~~
CharlesColeman
> [Facebook] had told advertisers that it could predict when younger users,
> including teenagers, were feeling “insecure,” “worthless” or otherwise in
> need of a “confidence boost.”

> Ironically, that's exactly what the NYT has started doing recently!

_No_, that's not what the NYT is doing at all, and it's disinformation to say
that it is. Facebook was inferring the emotional state of its _users_; the
NYT was inferring the emotional content of its _article content_ [1]. Those
are very, _very_ different things.

[1] From your link: "The result was an artificial intelligence model that
predicts emotional response to any content The Times publishes... Perspective
targeting allows advertisers to target their media against _content predicted
to evoke reader sentiments_ like self-confidence or adventurousness."
(emphasis mine)

~~~
Bartweiss
> _content predicted to evoke reader sentiments like self-confidence or
> adventurousness_

I'm not convinced that is different from Facebook's model - it's just less
powerful.

Advertisers have always tried to place their ads alongside content that favors
their brand identity, sure, but that's about the style of the _content_. That
looks like Burton sponsoring top-tier snowboarders or Chanel advertising in
_Vogue_. Watching Olympic snowboarders might make some people feel
adventurous, but Burton's placement is also aspirational and cultural, a way
of simply forming an association between the brand and high performance.

The NYT model didn't just put content alongside stories about adventure, it
put content alongside stories expected to evoke adventurous sentiments. If a
story about a daring Arctic expedition makes you feel relieved to be comfy at
home, it could still associate a brand with adventure, but it's outside the
sentiment target. That _is_ inferring the emotional state of users, rather
than of content. The main difference is that the NYT was making session-level
judgements, rather than long-term ones. I find that much less objectionable
(even if it's only out of a lack of data), but it's still in the category of
mind-state targeting rather than content alignment.

~~~
Wowfunhappy
Advertising based on the type of content is nearly as old as advertising. Some
TV commercials air during dramas, others during comedies, others during
sporting events. If you want to get more specific, you can elect to have your
ad air during a certain show.

The only difference between this and what the NYT is doing is the former
requires slightly more research on the part of the advertiser, to learn what
the show or article is about.

~~~
Bartweiss
Again, though, my point is that the NYT _didn't_ just sell advertising based
on content type. That wouldn't be new. "Project Feels" was an ML initiative to
study how readers felt after reacting to stories, and to create an
ad-targeting tool based on that. It's very specifically about offering
advertisers the chance to choose stories based on predicted reader
demographics, behaviors, and emotional response, _instead of_ simply targeting
stories by category or topic.

To decide whether to air your commercial alongside a drama or a comedy, all
you need to do is watch the show (and perhaps collect viewer demographics). To
decide which section of the print NYT to advertise in, all you need to do is
read it. But Project Feels was only possible by studying the behaviors and
emotional responses of readers. It didn't try to alter those emotions in
specific ways, so it's not equivalent to Facebook's project, but it's also not
the same as content-based targeting.

~~~
Wowfunhappy
The NYT is replacing a human reader determining a story's emotional value with
an ML program determining a story's emotional value.

You're implying that the NYT's ML software is capable of finding some kind of
secret, subliminal emotional traits that wouldn't be detectable to a human
reader. I don't find this believable at all.

~~~
Bartweiss
I'm not. If the NYT had allowed advertisers to handpick stories to appear
alongside, I wouldn't consider this substantially different.

What I'm interested in is the switch from choosing to appear alongside a
category or keywords (content targeting) to appearing alongside a _type of
story_ , with its more specific impact. The impact of ML isn't better-than-
human parsing, it's almost certainly worse-than-human. It's just a question of
adding story-level targeting which wasn't previously available, along with
access to user studies that go beyond demographics and engagement to self-
reported emotions.

------
smartbit
[https://archive.is/lZq6M](https://archive.is/lZq6M)

------
qrbLPHiKpiux
Does it matter if I use the complete block list for Facebook et al. from
GitHub?
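For context on what such a list does: hosts-style block lists resolve tracker domains to an unroutable address, so the blocking happens at the DNS level. A few illustrative entries of the kind these lists contain:

```
# /etc/hosts-style entries: DNS-level blocking only
0.0.0.0 facebook.com
0.0.0.0 graph.facebook.com
0.0.0.0 connect.facebook.net
```

That helps against third-party requests to those domains, but it can't stop fingerprinting or inference from data you hand over directly, which is the kind of tracking the article is about.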

