
Intelligent Tracking Prevention - thmslee
https://webkit.org/blog/7675/intelligent-tracking-prevention/
======
codedokode
I am surprised how complicated it is; they even use machine learning. It will
look like a bug to developers when third-party cookies suddenly stop
working for no obvious reason.

Why not just block third-party cookies except where enabled by the
user? I think 100% of the sites I visit use third-party cookies only for
tracking.

And of course this is not enough. Using a combination of an IP address and
browser fingerprint allows tracking a user without any cookies.

And websites can still track users if they use a redirect through an analytics
website (when a user visits a site for the first time he is redirected to an
analytics domain, that redirects the user back adding an identifier to URL).
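
A minimal sketch of the redirect trick described above, assuming a hypothetical analytics origin (`analytics.example`), a hypothetical `return_to` parameter, and a hypothetical `uid` parameter for the identifier. It only builds the two URLs involved; no real analytics service is modelled.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def bounce_url(analytics_origin, return_to):
    """URL the site sends a first-time visitor to (hypothetical /bounce path)."""
    return f"{analytics_origin}/bounce?{urlencode({'return_to': return_to})}"


def redirect_back(bounce, visitor_id):
    """What the analytics domain would answer with.

    visitor_id stands for whatever the analytics domain read from its own
    first-party cookie while the browser was on its domain.
    """
    query = dict(parse_qsl(urlsplit(bounce).query))
    target = urlsplit(query["return_to"])
    # Append the identifier to the original URL's query string.
    params = parse_qsl(target.query) + [("uid", visitor_id)]
    return urlunsplit(target._replace(query=urlencode(params)))


if __name__ == "__main__":
    b = bounce_url("https://analytics.example", "https://shop.example/item?id=7")
    print(redirect_back(b, "abc123"))
```

No third-party cookie is sent at any point: the identifier travels in the URL, which is why cookie blocking alone does not stop this.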

~~~
dsr_
"Self-Destructing Cookies" is an extension for Firefox that automatically
deletes cookies 30 seconds after you close the last tab associated with a
site, unless you put it on a whitelist.

It is the most sensible and useful policy I have ever seen. Sadly, it can't be
done as an extension in the new Firefox model.
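
The policy described above can be sketched roughly as follows. This is not the extension's code; the class and method names are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
GRACE_SECONDS = 30  # delay after the last tab for a site is closed


class SelfDestructingJar:
    """Toy cookie store that forgets a site shortly after its last tab closes."""

    def __init__(self, whitelist=()):
        self.whitelist = set(whitelist)
        self.cookies = {}    # site -> {name: value}
        self.open_tabs = {}  # site -> number of open tabs
        self.closed_at = {}  # site -> time the last tab was closed

    def set_cookie(self, site, name, value):
        self.cookies.setdefault(site, {})[name] = value

    def open_tab(self, site):
        self.open_tabs[site] = self.open_tabs.get(site, 0) + 1
        self.closed_at.pop(site, None)  # reopening cancels a pending deletion

    def close_tab(self, site, now):
        self.open_tabs[site] = max(0, self.open_tabs.get(site, 0) - 1)
        if self.open_tabs[site] == 0 and site not in self.whitelist:
            self.closed_at[site] = now

    def sweep(self, now):
        """Delete cookies of sites whose grace period has elapsed."""
        for site, closed in list(self.closed_at.items()):
            if now - closed >= GRACE_SECONDS:
                self.cookies.pop(site, None)
                del self.closed_at[site]
```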

~~~
arviceblot
[https://addons.mozilla.org/en-US/firefox/addon/cookie-
autode...](https://addons.mozilla.org/en-US/firefox/addon/cookie-autodelete/)
is compatible with the new add-on model.

~~~
hansjorg
That one looks great, it even supports the new container tabs with per
container whitelists.

------
dzink
An unintended consequence of this will be the complete dominance in affiliate
and ad networks of major consumer sites like Google, Facebook and Amazon, who
already own products visited daily. It also means a greater focus by all ad
networks on acquiring consumer sites, buying redirects, or hijacking users
somehow to get their ads to show/track. Ads and addictive sites are going to
feed off of this rule.

~~~
scarlac
Although your point is valid, currently, Google uses quite a lot of other
domains for analytics and tracking than google.com. For analytics, some
(unknown to me) things are served from google-analytics.com, and I'm seeing
cookies from googlesyndication.com and doubleclick.net as well.

As with many things in ML and NN, the premise of the tech is that it won't be
a 100% perfect solution, but it'll hopefully be good enough or better than
what you have.

~~~
pilif
_> Google uses quite a lot of other domains for analytics and tracking than
google.com_

... for now. It's just a matter of time before analytics and tracking will
move. Just like maps has moved to google.com in order for search to also get
access to the current location (which is a permission I'd gladly give to maps,
but not to the search engine)

~~~
jstoeffler
They can do that, but it's a delicate choice: if google.com gets flagged as a
"tracker domain", the google.com cookies will be deleted after 30 days of no
interaction with google.com.

Probably not an issue for most of the users, but for those who don't use it
very often, they'd get logged out. Not sure Google wants that.
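
The rule as stated here (flagged domain, 30 days without interaction, cookies purged) can be written as a small predicate. This is only a sketch of the rule as this comment describes it, with hypothetical names; it is not WebKit's implementation.

```python
from datetime import datetime, timedelta

PURGE_AFTER = timedelta(days=30)  # figure taken from the comment above


def should_purge(flagged_as_tracker, last_interaction, now):
    """True if a domain's cookies would be purged under the described rule."""
    if not flagged_as_tracker:
        return False  # domains not classified as trackers are left alone
    return now - last_interaction >= PURGE_AFTER


if __name__ == "__main__":
    now = datetime(2017, 7, 1)
    # A daily user of the domain never reaches the 30-day threshold.
    print(should_purge(True, now - timedelta(days=1), now))
    # An occasional user would lose the cookies, and with them the login.
    print(should_purge(True, now - timedelta(days=45), now))
```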

~~~
fake-name
I think the original point here is that almost everyone interacts with
google.com daily.

As such, if a company with a non-ad business attaches a tracking business to
its existing domains, it can circumvent the deletion policies.

------
arielm
I get why this needs to exist, but it feels more complicated than it should be
and that sort of “magical” decision that’s based on ML is going to confuse
both web developers and web users. It might be worth it, but I don’t think it
is just yet.

On a side note though, it seems Apple is doubling down on making decisions for
its users. This is one example, another is do not disturb when driving. These
sorts of features make a lot of assumptions and use complex logic to decide for
users, even when the users don’t need that. I think that’s concerning.

~~~
nickjosephson
Both of these things can be turned off in settings and DNDWD will ask you if
you'd like to enable it.

~~~
jclardy
Yeah, these aren't forced decisions. The driving mode is opt-in the first
time.

------
anilgulecha
Why is "cookies can't be used in 3rd-party context" not turned on from day 0?

Right now, isn't it a cat-and-mouse game of holding many domains/subdomains
and passing the cookie flag across them? It seems technically possible to
cancel out the protections of ITP. However, deciding that cookies can never be
used in a 3rd-party context may mean I have to make some additional logins to
services from time to time, but I'd have much better tracking protection.

Is my assessment faulty?

~~~
matt4077
> Why is "cookies can't be used in 3rd-party context" not turned on from day
> 0?

It's been the long-standing default of Safari to block 3rd-party cookies. As
is described in the excellent post this links to, there is some functionality
that 3rd-party cookies enable that can be beneficial. They use single-sign-on
as an example.

~~~
codedokode
Single sign-on can be implemented without third-party cookies using only
redirects. That is how OpenID and OAuth work.

But social network widgets (Like buttons, comment forms) would break without
third-party cookies.

~~~
floatboth
If I understand correctly, "Single sign-on" means that when you sign in at one
site, you're _automatically_ signed in at others. Like when you sign in to
Gmail, you get logged in on YouTube without any extra clicks.

OAuth is typically a user-initiated action, like "Sign in with GitHub" or
"Sign in with your domain"
([https://indieweb.org/IndieAuthProtocol](https://indieweb.org/IndieAuthProtocol)).

Though I think SSO _should_ be possible with OAuth — maybe with a hidden
iframe that does the auth process, or something with CORS requests… Or maybe a
custom redirect-based protocol would be better.

~~~
codedokode
OAuth can be modified so that it does not require any user action. When a user
visits a site, he gets redirected to the authorisation domain, which checks
whether the user is logged in and redirects back to the original site,
adding the authorisation result to the URL.
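
A rough sketch of that silent round trip. The `sso`, `user` and `sig` parameter names are hypothetical, and signing the result with a shared HMAC secret is an assumption added here so the site can check that the result really came from the authorisation domain. A real protocol would also need a nonce and an expiry.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlsplit

SHARED_SECRET = b"demo-secret"  # hypothetical key shared by site and auth domain


def sign(user):
    return hmac.new(SHARED_SECRET, user.encode(), hashlib.sha256).hexdigest()


def auth_domain_redirect(return_to, session_user):
    """Redirect target chosen by the authorisation domain.

    session_user is None when the auth domain's own first-party session
    cookie shows that nobody is logged in.
    """
    if session_user is None:
        params = {"sso": "none"}
    else:
        params = {"sso": "ok", "user": session_user, "sig": sign(session_user)}
    sep = "&" if urlsplit(return_to).query else "?"
    return f"{return_to}{sep}{urlencode(params)}"


def site_accepts(callback_url):
    """On the original site: return the user name if the result verifies."""
    q = dict(parse_qsl(urlsplit(callback_url).query))
    user = q.get("user")
    if q.get("sso") != "ok" or not user:
        return None
    if hmac.compare_digest(q.get("sig", ""), sign(user)):
        return user
    return None
```

Both cookies involved are first-party in the context where they are read, so blocking third-party cookies does not interfere with this flow.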

------
_pmf_
The untold story here (that is, however, indirectly acknowledged by Brave) is
that user tracking is required by websites to prevent being completely
defrauded by ad providers / advertisers being defrauded by click bot farms.
Without providing viable alternative incentives, this will just move the ad
driven part of the web into closed networks (Twitter/FB) where everything is
single domain.

Is this really the intended outcome?

(It's nice to see Brave have an effect, though.)

~~~
ForHackernews
> user tracking is required by websites to prevent being completely defrauded
> by ad providers / advertisers being defrauded by click bot farms.

Maybe don't pay for clicks, only pay for sales.

~~~
spiderfarmer
How would you keep track of the sales without cookies? Most people don't buy
on the first visit.

~~~
ForHackernews
Unless I'm misunderstanding, the technology described in the article is
designed to fight against cross-site tracking (i.e. following users all over
the internet to different sites)

I suspect few users would object to a single web property remembering them
across multiple visits.

~~~
spiderfarmer
So you'll have to rely on the webshop to keep track of purchases? I'd rather
have a third party involved that has to keep both the publisher and the
advertiser happy.

------
a_imho
I'm running a fairly old browser here, I have the ability to set when to
accept cookies/third party cookies and which sites to whitelist and to set the
retain policy whatever I wish to. Also, I can add extensions like Self
Destructing Cookies on a whim.

To me it looks more like Overengineered Tracking Prevention to be able to
shovel in 'Machine Learning Classifier' for buzzword compliance.

~~~
Arnt
You can, but are you better than the classifier at actually doing it? Have you
done it for every web site you visited today?

Besides, there is some value in getting widespread blocking. When you're the
only one who blocks, that's a signal. When Apple's new doodah blocks mostly-
the-right-thing for 10% of the users, that's quite different. You get to be
part of a crowd.

~~~
a_imho
I would think so, yes, mainly because I don't have to take advertisers into
consideration. And I can do a lot more for my browsing experience blocking
wise, much much more.

Invalidation is hard. Once you accept you have to tread down the road of
accepting and deleting cookies and local storage, I see no reason to wait 30
days. Why not purge them right on (tab) closing? If you really care about
privacy, that is. That could have been made the default in the next update,
btw.

~~~
ams6110
My browsers are set to purge all cookies upon exit. And I typically exit at
least daily.

~~~
Arnt
Then you log in to HN again, which enables HN to tie your new cookie to your
old cookies and tell all its advertising and tracking partners.

HN has no such partners, of course. But many other sites do, and correlate
cookies in order to tie devices together.

------
spiderfarmer
What prevents a tracking script from setting a third-party cookie first and
then setting a first-party cookie that references the third-party cookie? And
when the script notices the third-party cookie is gone, it sets it again on the
next visit. With enough websites using it, you would only lose a couple of visits.

~~~
mixedbit
Cookies are called third party based on a context in which they are sent. A
cookie itself is not third or first party, it can act as both depending on
this context.
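
To illustrate, a sketch that labels the context of a request rather than the cookie. Comparing the last two host labels is a naive stand-in for a proper registrable-domain lookup (browsers use the Public Suffix List), so treat `naive_site` as an assumption that breaks for suffixes like `co.uk`.

```python
from urllib.parse import urlsplit


def naive_site(host):
    """Last two labels of a host name; NOT a real registrable-domain lookup."""
    return ".".join(host.lower().split(".")[-2:])


def cookie_context(top_level_url, request_url):
    """Context in which cookies for request_url would be sent."""
    top = naive_site(urlsplit(top_level_url).hostname)
    req = naive_site(urlsplit(request_url).hostname)
    return "first-party" if top == req else "third-party"


if __name__ == "__main__":
    # The same google.com cookie, sent in two different contexts.
    print(cookie_context("https://www.google.com/", "https://maps.google.com/x"))
    print(cookie_context("https://news.example/", "https://www.google.com/x"))
```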

~~~
spiderfarmer
Yes, but only one of them is removed after a day.

------
kodablah
I have so many questions that I'm sure will be cleared up over time.

I am very happy to hear this "classification" happens on-device. I am curious
how opaque it is. I assume, like the rest of WebKit, the classifier is open
source? In Safari, both desktop and mobile, is there a way to see which
domains/cookies are blocked/purged or what the time remaining on them is
(assuming I don't revisit the page)?

Also, I assume this can easily be beaten by smart ad networks. "Hey,
publisher, just use our JS snippet which loads your first-party cookie we may
have set previously and then loads a JS URL with that identifier as a query
param in the URL along with other fingerprinting info, and we may set that
first-party cookie after load too."

I would personally love the option to not send any third party cookies not on
the same domain as the frame (or maybe an opt-in) on a per-site basis as I can
only see it breaking social logins and other lazily-dev'd items like comment
board integration (most SSO uses redirects anyways). Surely if a website
wanted to offer its own domain's cookie store, it could (just like you can
with local storage).

------
lwlml
You know, we put a "padlock" on the address-line for a reason. Why can't we
just make it easy for anyone to inspect the persistent state associated with a
website. Websites already make site-maps to tell search engines how they work,
why can't websites tell end-users what data they create and use with your
browser? Treat them like capabilities and let the user decide if your website
gets to treat your localStorage as a stalking ground of your behavior or if
they'll have to work even harder.

This kind of contract can go both ways. The user can set up a permission
set to give to the website. If the website balks then the user can go
somewhere else or negotiate with their own principles how much they want to be
"the product."

~~~
pdimitar
Always wanted something like what you suggested. It would be fair to the
users, hence why it's not implemented (lol).

------
cyborgx7
In theory cookies are great. They put the choice on the user to be remembered
as the same person by either sending the cookie or not. The problem is that
cookies were designed opaquely, leading to them being perceived as
non-consensual tracking rather than as the consensual mechanism, putting the
user in control, that they could have been. These solutions feel like a
makeshift stopgap when I think the actual solution involves creating a way to
put the user back in active control of whether and how a site remembers them.

------
panic
_A machine learning model is used to classify which top privately-controlled
domains have the ability to track the user cross-site, based on the collected
statistics. Out of the various statistics collected, three vectors turned out
to have strong signal for classification based on current tracking practices:
subresource under number of unique domains, sub frame under number of unique
domains, and number of unique domains redirected to._

What's preventing ad networks (or whoever this is meant to protect against)
from gaming this system?
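
For reference, the three statistics in the quote can be pictured as per-domain sets of unique counterpart domains. The sketch below only collects those counts; the class and method names are hypothetical, and the classifier that consumes the counts is not described in the quote, so it is not modelled here.

```python
from collections import defaultdict


class DomainStats:
    """Collects the three per-domain counts named in the quoted passage."""

    def __init__(self):
        self.subresource_under = defaultdict(set)  # domain -> top-level domains
        self.subframe_under = defaultdict(set)     # domain -> top-level domains
        self.redirected_to = defaultdict(set)      # domain -> redirect targets

    def saw_subresource(self, domain, top_domain):
        self.subresource_under[domain].add(top_domain)

    def saw_subframe(self, domain, top_domain):
        self.subframe_under[domain].add(top_domain)

    def saw_redirect(self, from_domain, to_domain):
        self.redirected_to[from_domain].add(to_domain)

    def features(self, domain):
        """(unique sites as subresource, as subframe, unique redirect targets)."""
        return (
            len(self.subresource_under.get(domain, ())),
            len(self.subframe_under.get(domain, ())),
            len(self.redirected_to.get(domain, ())),
        )
```

Because sets are used, seeing the same domain under the same site many times does not inflate its counts; only spreading across distinct sites does.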

~~~
ForHackernews
Presumably the model will be updated to adapt to a changing threat environment
from ad networks.

------
blowski
What's the difference between this and the EFF's Privacy Badger? It's not a
complaint, as I realise the EFF wants browsers to adopt such technology
natively.

~~~
cyborgx7
I haven't been able to find such an explicit explanation of the policies
Privacy Badger uses. It is all very vague machine learning talk.

I would like to see a clearer comparison as well.

~~~
floatboth
Privacy Badger uses simple heuristics like cookie entropy estimation, domain
prevalence (how many other domains embed that domain's stuff):
[https://github.com/EFForg/privacybadger/blob/master/src/js/h...](https://github.com/EFForg/privacybadger/blob/master/src/js/heuristicblocking.js)

and static block lists
[https://www.eff.org/files/cookieblocklist_new.txt](https://www.eff.org/files/cookieblocklist_new.txt)
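
To give a feel for those two heuristics, here is a toy version. It is not Privacy Badger's code: the Shannon-entropy estimate is a crude stand-in for whatever the linked file does, and `threshold` and `min_bits` are hypothetical parameters chosen for the example.

```python
import math
from collections import Counter, defaultdict


def estimated_bits(value):
    """Crude entropy estimate of a cookie value: per-character entropy * length."""
    if not value:
        return 0.0
    n = len(value)
    per_char = -sum(c / n * math.log2(c / n) for c in Counter(value).values())
    return per_char * n


class PrevalenceTracker:
    """Flags a third-party domain once it is seen, with an identifying-looking
    cookie, on at least `threshold` distinct first-party sites."""

    def __init__(self, threshold=3, min_bits=8.0):
        self.threshold = threshold
        self.min_bits = min_bits
        self.seen_on = defaultdict(set)  # third party -> first-party sites

    def observe(self, third_party, first_party, cookie_value):
        # Low-entropy values (e.g. a language code) cannot identify a user.
        if estimated_bits(cookie_value) >= self.min_bits:
            self.seen_on[third_party].add(first_party)

    def is_tracker(self, third_party):
        return len(self.seen_on.get(third_party, ())) >= self.threshold
```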

~~~
blowski
So this new WebKit one is more complex?

~~~
floatboth
Of course. They shoved goddamn _machine learning_ into the WebKit one.

------
thanksgiving
Does this affect Google analytics?

~~~
algesten
This is my main problem. GA loads a script that sets first party cookies.

I really want Safari to have easy integration points to make a good cookie
whitelist interaction using plugins.

Sure, it's painful the first few days having to gradually build up that list,
but like Little Snitch, once you're past that stage, you just set it to say
no to any new domains setting cookies.

Then users could cooperate to build cookie whitelists...

~~~
bigbugbag
> This is my main problem. GA loads a script that sets first party cookies.

Only when scripting is enabled.

------
criticabug
Does anyone know if this breaks/affects affiliate networks? They use third-party
cookies, I think. Any purchases made more than 24h after the click could then
not be attributed to it.

------
superlupo
I've had third-party cookies disabled for more than 5 years and have never had
a real problem. I'm not convinced that such a complicated solution is necessary.

------
dilliwal
I already disable 3rd-party cookies; it's always a good idea, believe me.

Why do I say that? 1. No retargeting ads. 2. No site-to-site tracking. 3. Less
chance of identifying you uniquely.

------
ep103
So is this a personal suggestion by a developer in the WebKit community? I see
nothing that says this has landed anywhere.

