Hacker News
Google Chrome Incognito Mode Can Still Be Detected (bleepingcomputer.com)
440 points by nradov 12 days ago | 192 comments




Can this fight ever be won?

If you've been browsing the internet for more than 5 minutes you already have cookies from some of the major ad networks. Therefore if you do not have cookies from the major ad networks, you're either a brand-new device or an incognito browser. All that is left to do is get in bed with the ad network to ask them if they have good cookies for this session. As it so happens most of the companies trying to bust the incognito mode are already in that crowded bed.

The next loops in this spiral are: 1) an incognito mode that seeds good-looking ad cookies 2) ML models trying to distinguish synthetic/cloned/5-seconds-old cookies from genuine ones 3) matching ML models trying to out-fox the models from step 2, and so on.

The last fit of madness will be a hidden session in Chrome browsing the web on your behalf, building up a bogus profile for the ad networks, and the ad networks trying to figure out if the clicks in your fake session are sufficiently human-like.

And whoever wins in this war, the ad networks will end up collecting even more data than they do now.


I remember around 10 years ago people were calling Stallman paranoid for this. Now he seems more like a prophet:

>I generally do not connect to web sites from my own machine, aside from a few sites I have some special relationship with. I usually fetch web pages from other sites by sending mail to a program (see https://git.savannah.gnu.org/git/womb/hacks.git) that fetches them, much like wget, and then mails them back to me. Then I look at them using a web browser, unless it is easy to see the text in the HTML page directly. I usually try lynx first, then a graphical browser if the page needs it (using konqueror, which won't fetch from other sites in such a situation).

https://stallman.org/stallman-computing.html


I mean, if you want to be super paranoid like that there's no need to use a script to fetch webpages into email format. Use some sort of vnc-over-ssh session to a disposable virtual machine hosted 9500 km away, to browse interactively in a GUI, running firefox+sensible privacy extensions inside xfce4/xorg. With the right automation you could have the vm overwrite itself from an image file every day.

This could be done on the CPU+RAM+disk resources of a $4.50/month vm.

For the purpose of not being tracked by advertising networks (not a nation-state intelligence agency), a stateless vm located on a static ipv4 /32 in bulgaria that saves absolutely no data would be rather hard to track back to its actual user.


Well, that is a bit paranoid.

It's not though? If your goal is to avoid having your every move online tracked by companies hoping to earn money from spying on you, what Stallman does is one of the few strategies which work.

You can take various actions to improve your privacy. At some point it becomes paranoia. If you are fetching web pages on a remote server and emailing stuff to yourself, then I would say it is becoming paranoia.

And I would say what Stallman does probably gives him a very unique fingerprint (wget-ting web pages from a unique server). I doubt he is doing that for privacy.


I don't think paranoid is the right word, since it implies some kind of (usually unfounded) fear. I don't get the impression that Stallman is actually afraid of anything in particular; he's just doing this to thwart what he thinks are bad practices. So I'd call it extreme, but not paranoid.

Alternatively, Google starts crawling the web in incognito mode; punishes websites that hide content.

I understand the point of using a special user-agent to crawl webpages for indexing, but search engines should use a "regular" browser UA string (full JavaScript, etc. to simulate an actual browser) occasionally. From a different IP range too, of course. If the contents of the page are wildly different, penalise the site.
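For illustration, the "fetch twice and compare" idea above could be sketched roughly like this. It assumes the crawler already has the extracted text of both fetches; the helper names and the 0.5 threshold are made up for the example and say nothing about how any real search engine scores pages.

```typescript
// Hypothetical helper: split page text into a set of lowercase words.
function wordSet(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((w) => w.length > 0));
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, from 0 (disjoint) to 1 (identical sets).
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((w) => {
    if (b.has(w)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Flags a page when the crawler view and the regular-browser view differ too much.
// The default threshold is an arbitrary illustration value, not a known cutoff.
function looksCloaked(crawlerText: string, browserText: string, threshold = 0.5): boolean {
  return jaccard(wordSet(crawlerText), wordSet(browserText)) < threshold;
}
```

A page that serves the full article to the crawler UA and a "subscribe to continue reading" stub to the browser UA would score near 0 and get flagged; identical responses score 1.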

They do both or at least google does.

Even if they do both, if the bots always follow what is entered in robots.txt and humans do not, it won’t be long before that’s the primary factor.

And that's totally fine - if they add their articles (for example) into their robots.txt, it would cripple their SEO. It wouldn't happen.

I've seen stuff in the robots.txt get crawled anyways if enough people link to it. In Google's results it will still only show up without any contextual information though.

Google won't crawl it, but they can still include the link in search results, usually with the title guessed from the way it was referred to on another page, and no description.

The Google bot mostly ignores the paths specified in robots.txt

What about sites which display different content depending on origin?

Google is an advertising company and everything they do is to improve the core ad-business. If Google had it their way every website would be using AMP, allowing google to have more control over the web, allowing them to become the only ad-network.

>The last fit of madness will be a hidden session in Chrome browsing the web on your behalf, building up a bogus profile for the ad networks, and the ad networks trying to figure out if the clicks in your fake session are sufficiently human-like.

The first parts of this have already been tested by Mozilla: https://trackthis.link/


> https://trackthis.link/

It opens 100 tabs? With third-party resources and JavaScript blocked, it will do little in terms of shaping my ad profile. But maybe 4GB is not enough to open 100 tabs, so I skipped this.


Sites don't want to bust the experience for brand new devices I think. That would include things like internet cafes where you log into a clean desktop environment, or browsers you use rarely like maybe one in your car. I think if Google can just get a real disk-backed implementation working for Incognito mode, such that there actually is no difference between Incognito and a new regular browser, this feature will be valuable.

If they want to block incognito mode, wouldn't they also want to block someone from an Internet cafe? If the reason you block incognito mode is because you can't track the user, you don't want a user coming from an Internet cafe that you can't track.

Just don't use sites that don't respect your privacy. If it doesn't like you blocking cookies, walk away.

I highly recommend Temporary Containers for Firefox. It can be set up so that every time you open a new tab, that is a new context, new logins, new cookies.

No reason to block sites that set cookies, since in a few minutes they will be deleted anyway.


that's something I miss in chrome/opera

So, don't use the internet? Because that's what you're describing.

Better cut the power cord, too, just to be safe.


I whitelist cookies; only sites that I have a known relationship to (e.g., HN, for login) get to set cookies.

The overwhelming majority of the web still works just fine. It's trivial to pick out what doesn't, as it either tends to:

a. require cookies for some inane task that doesn't need them, and it tells me this

b. breaks horribly. Typically, JS trying to access LocalStorage, but not checking whether the call was successful or not.

The grandparent is also wrong about,

> Therefore if you do not have cookies from the major ad networks, you're either a brand-new device or an incognito browser.

No, or you're whitelisting cookies, whether all of them or just third-party ones. (The latter is significantly easier to do and causes much less breakage. I don't think I've ever seen a third-party cookie cause breakage.)

N.b., I'm not necessarily recommending what I do to others; it takes a lot of work, and I really need better tooling around my flow. But it gives me the context on how cookies/persistence causes or does not cause site breakage.
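To make breakage case (b) concrete, here is a minimal sketch of the check those sites skip. The `StorageLike` interface and the function names are invented for the example; the only assumption about browsers is that storage calls can throw when storage is blocked or full.

```typescript
// Minimal subset of the Web Storage interface used in this sketch.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Returns true if the write succeeded, false if storage is missing, blocked, or full.
function safeSet(storage: StorageLike | undefined, key: string, value: string): boolean {
  try {
    if (!storage) return false;
    storage.setItem(key, value);
    return true;
  } catch {
    return false;
  }
}

// Returns the stored value, or the fallback if the key is absent or the read throws.
function safeGet(storage: StorageLike | undefined, key: string, fallback: string): string {
  try {
    const v = storage ? storage.getItem(key) : null;
    return v === null ? fallback : v;
  } catch {
    return fallback;
  }
}
```

In a page, even reading `window.localStorage` can throw when cookies are blocked, so that lookup would need its own try/catch before the object is handed to these helpers.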


Or you can just freely allow all cookies from any website that wants to set them (sites will be happy and working), but only for current browser session. You have to remember to restart the browser every now and then, though.

Then use whitelist to selectively allow cookies from some "friendly" sites to be stored permanently.


If the intent is no tracking, then this defeats the purpose.

Often the browser stays open for hours and you'll have identifying tracking cookies very quickly.

A great, really underused feature in Firefox is first party cookie isolation: it isolates all cookies set by a site to the same domain, preventing all cross site tracking.

Set privacy.firstparty.isolate to true in about:config.

Some more info: https://www.ghacks.net/2017/11/22/how-to-enable-first-party-...
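Roughly, the idea is that cookies are keyed by the site in the address bar as well as by the cookie's own domain. The toy jar below is only a sketch of that keying scheme under my own naming; it is not how Firefox actually implements it.

```typescript
// Toy cookie jar keyed by (top-level site, cookie domain) instead of cookie domain alone.
class IsolatedCookieJar {
  private jar = new Map<string, Map<string, string>>();

  // Composite key: the same tracker domain gets a separate bucket per first-party site.
  private key(firstParty: string, cookieDomain: string): string {
    return `${firstParty}|${cookieDomain}`;
  }

  set(firstParty: string, cookieDomain: string, name: string, value: string): void {
    const k = this.key(firstParty, cookieDomain);
    const bucket = this.jar.get(k) ?? new Map<string, string>();
    bucket.set(name, value);
    this.jar.set(k, bucket);
  }

  get(firstParty: string, cookieDomain: string, name: string): string | undefined {
    return this.jar.get(this.key(firstParty, cookieDomain))?.get(name);
  }
}
```

With this keying, an ID that a tracker sets while embedded on one site is simply not there when the same tracker is embedded on another site, so it cannot join the two visits through its cookie.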


Or use an extension like Cookie Auto-Delete. It clears all cookies set by a given tab when that tab is closed, and lets you whitelist domains which can set cookies which won't be deleted.
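The core of that behaviour is a small filter. This is a sketch with made-up types and names, not the extension's actual code, and it assumes the whitelist matches a domain and its subdomains.

```typescript
// Hypothetical record of a cookie that was set while a tab was open.
interface CookieRecord {
  domain: string;
  name: string;
}

// True if `domain` equals a whitelisted domain or is a subdomain of one.
function isWhitelisted(domain: string, whitelist: string[]): boolean {
  return whitelist.some((w) => domain === w || domain.endsWith("." + w));
}

// On tab close: everything the tab set is deleted unless its domain is whitelisted.
function cookiesToDelete(tabCookies: CookieRecord[], whitelist: string[]): CookieRecord[] {
  return tabCookies.filter((c) => !isWhitelisted(c.domain, whitelist));
}
```

So with `ycombinator.com` whitelisted, a login cookie for `news.ycombinator.com` survives the tab closing while an ad network's cookie set in the same tab does not.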

Thanks, that's much better than my approach. This is exactly the extension I needed, as it makes cookie management more straightforward and transparent to me.

I switched yesterday and couldn't be happier. :)


I clear all browser data upon exit.

Overkill? Maybe, but it works for me.


Plenty of sites respect user privacy. I should know; I run dozens of them.

Define plenty.

Are among those "plenty" the actual important pages that people want to use? Or some irrelevant pages here and there?

And how do you know there are plenty?

In fact, even if it is so, how can anyone verify that, even just about your sites? We can merely just trust you.


Most of the articles I click through to from HN work just fine with all JS, including first party, blocked by uMatrix.

>the actual important pages that people want to use?

Where is this special list of important pages? Are the sites I want to use not important? Does your comment need to be so simultaneously defeatist and hostile?


>Where is this special list of important pages? Are the sites I want to use not important? Does your comment need to be so simultaneously defeatist and hostile?

Or, as I like to call it, pragmatic.

Visits follow a power law distribution: 80% of people's visits go to 20% of sites, and so on, recursively.

So unless the top 1000 sites people want to use (which may vary by country) are among them, e.g. social media, news, booking, video, shopping, and banking sites, you're just talking about a number of niche websites.

Sites that "still work with JS disabled" are in the minority on those lists.

Essentially you're saying "don't use all those sites with the content/services you want", use all those others that don't have tracking (but which you don't really care for).

E.g. pointing to Diaspora vs Facebook...

The best I've seen people come up with on this front is DDG vs Google.


> Sites that "still work with JS disabled" are in the minority on those lists.

You'd be surprised how many sites are still viewable without JS enabled.


You’d be surprised how many sites are more viewable without JS enabled.

Damn straight! Disabling JS fixes more websites than it breaks.

I feel like this is a reference to ‘Arrested Development.’

Haha, it was indeed my inspiration. But it's also true! Dozens! :)

And if it is not, it should be.

DOZENS.


There are plenty of ways to track that involve zero javascript.

If plenty is "a few" then I would agree (and I am glad you respect privacy). I am still searching for one site that I could give an example for GDPR and they are all blatantly violating it. In most cases they just give you the fake impression that they respect privacy (by setting the banner to "opt-out" (which is a violation on its own) after the third-party tracking scripts are already loaded).

The whole privacy deal on the web just shows moral corrosion. It is putting IT neck to neck with scammers.


> I am still searching for one site that I could give an example for GDPR and they are all blatantly violating it.

I think the reason you can't find one is likely because you are disqualifying all the ones that aren't violating it. There are lots of websites that don't violate GDPR: they don't record any information. Perhaps we can quibble about server logs and whether or not IP addresses are PII, but let's stick to at least the broad strokes here since you are saying all sites blatantly violate GDPR.

I think what you are probably trying to say is that amongst the websites that are trying to harvest your data, (virtually) all of them do so in a way that violates GDPR. This is not surprising to me, because at its heart GDPR is trying to encourage companies not to harvest personal information.

I don't think we will ever get around that. The question is not whether or not many (most? virtually all?) companies will try to get around GDPR (they will). The question is whether or not GDPR will have a positive influence on the use of private data. I can say in the company I do work for, it has completely transformed how we deal with PII. We now actually have gatekeepers that tell marketing what they can and can't have access to. If there are problems, then people actually get chewed out and we put real, emergency resources into fixing them. I mean it is absolutely night and day.

For us there is always this fight between people and departments that would like unrestricted access to information and people who are protecting it. Without GDPR, there was no defense! There was no argument you could make -- "It's wrong!" "By whose definition? The lawyers are fine with it" Now we can legitimately say, "You can't legally do that". Not only that, but I've even had marketing managers being very concerned that we might be handling data incorrectly. I've never, ever seen that before in my 30 year career.

Yeah, there are lots of problems and I don't see them going away ever, but man GDPR is really helping in a lot of areas.


Ok, I wasn't talking about the protecting-the-data part.

>The question is not whether or not many (most? virtually all?) companies will try to get around GDPR (they will)

They aren't getting around it, they are violating it. Given how GDPR is done as a concept, you can't work around it.

The question is, when it will be enforced.

I don't have anything against tracking, targeted ads, etc., but only if GDPR is followed, which means opt-in consents, no "let's stuff everything under legitimate interest" and so on. Under GDPR conditions I am even prepared to turn off ad blockers.

And I won't even start talking about mobile applications.


But, "let's stuff everything under legitimate interest" is totally valid if it is actually legitimate interest. Opt-in consent under GDPR is probably your worst strategy. The lawful basis you want to be under is contract basis: you gather the information you need for the contract. You hold it until the contract is up and then you delete the information. That's the best for everyone.

Legitimate interest is the next best for everyone. You collect the data for contract purposes and you retain it beyond the contract period, or you use it for something other than the contract, but it's for a legitimate reason. You must tell the user that you are using the data for the legitimate reason and what that legitimate reason is!!! It's a very good way to use data. If the user objects, then they can object and you can't use the data (you have 1 month to respond).

After that (and ignoring lawful basis, etc) you have consent. Consent is an awful reason to collect and retain data. You don't need it for the contract. You have no legitimate reason to have the data or to use it. You just want it. So you ask the user if it's OK.

No company should choose consent. It's horrible, even for the business. As I've written before, if the user opts out, there doesn't seem to be a way to opt them back in if they change their mind. So if there is any way for you to turn consent into contract basis, you really, definitely should! If there is some reason that the user would like to consent, then you shouldn't be using consent. You should offer them a service.

It's super frustrating to me that people harp on about consent, because that is really going against the grain for GDPR.


You got the legitimate interest wrong. I won't bother explaining, as I am sick of the downvoting (would love to discuss recitals). Here is a presentation from Tim Walters; check the legitimate interest part (or the whole thing, you might be surprised): https://www.youtube.com/watch?v=-stjktAu-7k

Bottom line, "the grain" of GDPR is user interest. Not "user experience", not business interest.

User's interest.

And it is HARD to decide instead of him; I would rather pop up a consent dialog with opt-in than shovel everything under legitimate interest.

As it is so easy to get it wrong: sure, if you are sending a package to the customer, you need (legitimate interest) the address; a phone number comes in handy (requiring it is fishy); forcing it to protect a login on a social network? I wouldn't do it. For me, as a security-aware person, you would have a hard time proving I am in danger with 15-letter randomly generated passwords generated for each and every site. Unlike for John Doe. So it becomes optional, while forcing it, in my case, violates GDPR. It was just one example.

But anyway, check Tim Walters.


I don't think we are at odds with what you are saying. I'm 45 minutes through Tim Walters' video and there is absolutely nothing new for me so far. I suspect I'll get to the end and there will still be nothing new for me because I'm starting in the same place he is.

As for your example, I totally agree! Forcing you to log in to a social network to send a package is crazy. I order cheese making supplies on the internet because I have no other way to buy them. Not a single supplier of cheese making supplies even offers to make me log into a social network.

You're making the statement that all sites are blatantly disregarding the GDPR and I think it's because you just don't pay attention to the sites that aren't.

I'll give you an example (which is cheese making again). I wanted to check the shipping costs for cheesemaking.com. I don't like the fact that they make me fill out all of their order forms before they tell me the shipping cost, but they do. They have a newsletter which they use to do their marketing, but for now I've not signed up for it. When I didn't complete the process, they sent me an email. They asked if something went wrong and said they will hold my order for 48 hours. After that, they will delete all of my information.

And these guys aren't even in the EU (and neither am I, although I work on contract for a company that is). This kind of behaviour is exactly what I expect and I think it is completely in line with the directives. The only thing they were missing is telling me under which lawful basis they were operating in each case.

Is it contract basis? Keep in mind that as far as I can tell, "contract basis" does not actually require a contract to be in place (i.e. you don't have to have consideration), so I think there is an argument for saying that since I contacted them and started to initiate a purchase, following up on why I didn't finish (for a limited time period) is within the directive.

Even if it weren't, it is almost definitely within legitimate interest. To really qualify for that, they would have to offer to let me object, but since they will delete my data after 48 hours I think they are following the spirit of the directive (because you only have to respond within 1 month).

I don't know. I think the reason you keep getting downvoted is because you seem to be focussed on something that is different than what everyone else is talking about. It's absolutely true that there are a lot of companies who don't give a flying monkey's about GDPR. But it is untrue that there isn't anyone. The rest is details, and as Tim Walters is at pains to explain, the GDPR specifically is not prescriptive because they want you to follow the principles, not a checklist of rules.


I don't think we are talking about the same thing.

I am talking about:

“There might well be a market for personal data, just like there is, tragically, a market for live human organs, but that does not mean that we can or should give that market the blessing of legislation. One cannot monetise and subject a fundamental right to a simple commercial transaction, even if it is the individual concerned by the data who is a party to the transaction." (https://edps.europa.eu/sites/edp/files/publication/17-03-14_...)

Anyway, I was talking about a social network requiring your phone number, not a market requiring you to log in with a social network ID. And you are talking about a business where there is a business transaction. I am talking about a site you surf to.


> Anyway, I was talking about a social network requiring your phone number, not a market requiring you to log in with a social network ID. And you are talking about a business where there is a business transaction. I am talking about a site you surf to.

OK. That was not clear at all to me! Now that I understand that, I understand what you were trying to say a lot better. I still don't think we materially disagree with each other, though. There are lots of sites that are good examples for GDPR. I think it is absolutely true that none of them are trying to harvest and sell your data! I don't see how that could be the case. If you use that as your criteria, I don't think it is possible that you will find an example. Should those sites be banned from the web? I'm not sure, but it wouldn't bother me, that's for sure!


How about passing laws which forbid tracking users like this and levying steep, business-ending fines against the companies who don't step in line?

No problem. All your favourite online services are no longer free though, what a bummer. Will it be a lite, regular or premium Google maps subscription? How about Facebook Messenger?

You don’t need detailed tracking to run ads. Newspapers did it in the time before the internet.

What did occur, is that advertisers could not be sure of performance so placed more value on the prominence and reputation of the media source. National well reputed papers got the best deals, regardless of real world performance.

It will lead to tight localization again, as advertisers can’t rely on Google just selecting the right people to display ads to, but would have to seek specific sites to advertise on.

The world will be fine.


This could have the side effect of killing independent journalism startups. Big media is already too entrenched. These big newspaper websites do already sell ads directly too - DoubleClick can serve both Adsense (usually as a backup) as well as their own inventory.

There are drawbacks to every option. Given the choice between a surveillance state and a propaganda state, I will always choose the latter - at least you get some agency over what media you consume and which statements you believe. Surveillance is forced on you, and invisibly at that.

They actually have ways of tracking ad performance in newspapers.

Remember the "mention this ad and get 10% discount" ads? There you go.


> You don’t need detailed tracking to run ads. Newspapers did it in the time before the internet.

Newspapers did it to some extent, and they were not competitive businesses against methods where tracking does exist.

The reality is that newspapers are still very strong lobbyists, especially in newspapers, as they frequently sink politicians who don't toe their line with unrelated scandal or just plain fantasy. If Google pushes this they are likely to find legislation, particularly in the EU, mandating quite the opposite to what you want.


I stopped using both of these BECAUSE of the ad network, years ago. A lite / regular / premium option, that I could trust to not also sell to the ad network (just because you pay doesn’t mean your info isn’t being sold...), would be a most welcome upgrade. Bonus: my friends and family become protected from this ad network too, which nets even better privacy for me (since analytics can know a lot about you just from the people you associate with).

That sounds like heaven, honestly.

If the lite subscriptions are comparable to the current content it'll be pennies per day, and only on days I actually use the service. That's acceptable.

And then I can buy a 'regular' subscription for google products that comes with actual support? It's like a dream world.


It's not as if Google maps is the only game in town. Really the only thing I use it for over openstreetmaps is business hours and transit directions, but there is no reason the latter couldn't be done. I use osmand on my phone and have offline maps for everywhere I've been. Grabbing some gtfs files wouldn't be difficult.

As for messenger, there will probably always be some "free" messaging service out there. Free in quotes as it'll come with phone, isp, or email subscriptions.


Premium Google, thank you. Facebook I would love to see implode, Twitter will slowly die away, and I will be able to go back to RSS feeds to people's blogs.

I already pay for newsblur.


I use OSM and IRC, so that's fine by me!

You also put your time where your mouth is and compete against one of those free services (Github) :-) I don't even use SourceHut (yet!) and I send you money because what you are doing is awesome (admittedly a tiny amount -- if anyone can convince Boris Johnson to walk away from Brexit, I'll be able to up that ;-) ).

Sounds good. At the minimum these tracker giants should be regulated to offer the option to use their services as a paying customer without tracking.

That'd be a dream coming true! But I'm sure it wouldn't be the end of "free", ad-ridden content.

Ads aren't the problem. Tracking, malware, and other user-hostile activities are the problem.

I dislike both things for different reasons.

In your model I have a choice to opt out of tracking. Where is the downside?

I would absolutely pay for this but I don't even have the opportunity.

Not a bummer, great.

The fight can be won if you block 3rd party JavaScript. Just install uMatrix and enjoy browsing the web in half the time it takes without it. No ads, no trackers, no Facebook widgets, just the web we learn to love. Sanity.

Do you know of any way through the about:config editor to do this? I'd hate to have to install yet another browser plugin.

uMatrix is a must-have add-on and nothing else comes close to matching its functionality. It pairs well with uBlock Origin (same dev). Add-ons like Decentraleyes are also good, as they prevent persistent tracking from shared web libraries, like Google ajax, that everyone seems to use... it's free and widely available for a reason.

Google is also working to update Chrome APIs that will cripple privacy centric add-ons and those that allow ad-blocking, like uMatrix and uBlock origin soon, so enjoy it all while it lasts.


which is why you should switch to firefox asap.

These addons work fine in Firefox.

It’s native in a number of browsers but quite hidden these days. Set your JS security settings to (ah I’m making this up as I don’t have the browser window in front of me) “Prompt” or “Allow (Per Site)” or “Blocked” - https://support.google.com/adsense/answer/12654?hl=en

Then in browsers like Chrome, click the site icon, SSL icon, or little plugin icon in the address bar — some icon usually appears, which when clicked gives access to enable/disable blocked content for this site.


>If you've been browsing the internet for more than 5 minutes you already have cookies from some of the major ad networks

This does not really work without 3rd party cookies, does it?


That arms race is winnable because it's a question of economics, not capability. As long as the cost of unmasking people remains higher than the expected profit from doing so, the fight is won.

As for cookies against ad networks, I browse with all third party cookies blocked. Some cookies from third parties are allowed. So that would not be effective against people with high security settings.

And there is also the possibility of a brand new computer with no cookies. Then that kind of detection will not necessarily work for every one.


The owners of these websites don't care about false positives (like people on new computers). Their interest is in maximising the number of ads shown and nothing else.

>If you've been browsing the internet for more than 5 minutes you already have cookies from some of the major ad networks.

I don't, as I block all third party cookies. No exceptions.


Interesting. Are there any legitimate uses for third party cookies (i.e. functionality that you as the user actually care about) or are they only ever used for tracking?

For me a fair amount of sites break where they have login authentication scattered around different domains.

As someone who also blocks third party cookies - Captchas. But nothing else important IME.

Yes, Google is committed to patching out anything used to detect incognito mode. If they make good on that promise detection will become harder and harder until it becomes practically impossible.

Incognito mode can be made to look exactly as if you opened your browser after creating a new profile and immediately opening the site. As long as they can’t afford to block these users it can be made to work.


You can remove the as long out of the last sentence and it still makes sense.

I feel as though your argument is predicated on browsers that leak tracking data such as cookies, etc.

If the browser vendors would tighten their products up and start considering things like profiling the user's installed fonts as a security vulnerability, then we might see some progress. Unfortunately Google has a financial incentive to make Chrome trackable to advertisers.


Safari has already started introducing protections for attacks like font profiling and knowing the Chrome security team I would be surprised if they haven’t at least considered that as well.

Don't forget that the advertisers don't get income directly from tracking you. They get income based on how good a profile they have of your interests (and things that you would buy). If they can get a good enough profile without tracking then maybe they'll stop tracking. Or maybe it would be easier to block the few that continue tracking than wage war on an entire industry....

There's already been some talk about changing things so that tracking is less necessary (https://webkit.org/blog/8943/privacy-preserving-ad-click-att...) and I wouldn't be surprised if a lot more comes out by the end of the year.

Anyway, anyone that knows the details of the web could have seen this coming. [Private mode is not private](https://brave.com/private-mode-is-not-really-private/). Websites can calculate fingerprints from users just by running JavaScript (they can easily get the processor speed, GPU version, local IP, and other details) so they can identify you even if you clear cookies. And the funny thing is that all of these APIs have reasonable intended uses and getting rid of them would large portions of content on the web.
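As a rough sketch of how such a fingerprint comes together: collect a handful of stable attributes and hash them into one identifier. The trait list and names below are illustrative assumptions (a real script would read them from browser APIs), and the hash is a simple FNV-1a over UTF-16 code units chosen for brevity, not what any particular tracker uses.

```typescript
// Illustrative trait set; the field names are invented for this sketch.
interface BrowserTraits {
  userAgent: string;
  screen: string; // e.g. "1920x1080"
  timezone: string;
  gpu: string;
  fonts: string[];
}

// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex digits.
function fnv1a(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return ("00000000" + (h >>> 0).toString(16)).slice(-8);
}

// Fonts are sorted first so that enumeration order does not change the result.
function fingerprint(t: BrowserTraits): string {
  const fonts = t.fonts.slice().sort().join(",");
  return fnv1a([t.userAgent, t.screen, t.timezone, t.gpu, fonts].join("|"));
}
```

Nothing here touches cookies, which is the point: the same machine yields the same value again after clearing them, as long as the traits stay the same.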


>they can easily get the processor speed, GPU version, local IP, and other details

This is what I'm referring to. The browser should not leak this info to random websites.


I think you accidentally a word

What is nytimes.com using now to detect incognito? Enabling filesystem in incognito used to work until a few days ago.

The article says they are using the quota trick.
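For anyone who hasn't read it, the gist of a quota check can be sketched like this. The 120 MB cap is my assumption about the incognito storage limit at the time and may not match current Chrome; the function name is made up.

```typescript
// Shape of the object a browser's storage estimate call resolves to.
interface StorageEstimateLike {
  quota?: number;
  usage?: number;
}

// ASSUMPTION: incognito windows were reported to cap temporary storage at about
// 120 MB. Treat this number as a placeholder, not a documented constant.
const ASSUMED_INCOGNITO_QUOTA_CAP = 120 * 1024 * 1024;

// A suspiciously small quota is taken as a hint of incognito mode.
function quotaSuggestsIncognito(
  est: StorageEstimateLike,
  cap: number = ASSUMED_INCOGNITO_QUOTA_CAP,
): boolean {
  return typeof est.quota === "number" && est.quota <= cap;
}
```

In a page the estimate would come from something like `navigator.storage.estimate()`. Note the obvious false positive: a regular profile on a nearly full or very small disk can also report a small quota.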

Interesting on Firefox

>https://www.bleepingcomputer.com/PoC/incognito-detection/inc...

= NO Incognito

Meanwhile on NYTimes

= You’re in private mode.


IIRC, the trick is to time filesystem access and deduce you're running from RAM rather than disk, thus in incognito mode.

Sucks for you if you got ramdisk/optane, or have fsync disabled, I guess.

Interesting, if that's what they're using it should be possible to inject a JavaScript to replace the filesystem access function with a wrapper that induces delay.
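A generic version of such a wrapper might look like the sketch below. It does not name the actual filesystem function a site would call (that part is left as an assumption); it only shows wrapping an arbitrary async function so that each call is preceded by a random delay.

```typescript
type AsyncFn<A extends unknown[], R> = (...args: A) => Promise<R>;
type Sleep = (ms: number) => Promise<void>;

// Default sleep based on setTimeout; injectable so the delay source can be swapped.
const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Returns a function with the same signature as `fn` that waits a random
// time in [minMs, maxMs) before delegating to `fn`.
function withRandomDelay<A extends unknown[], R>(
  fn: AsyncFn<A, R>,
  minMs: number,
  maxMs: number,
  sleep: Sleep = realSleep,
): AsyncFn<A, R> {
  return async (...args: A): Promise<R> => {
    const delay = minMs + Math.random() * (maxMs - minMs);
    await sleep(delay);
    return fn(...args);
  };
}
```

An injected script would replace the original function with the wrapped one before the page's own code runs. Whether a fixed range is enough to look like a real disk is an open question, since a detector could look at the distribution of timings rather than a single measurement.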

> All that is left to do is get in bed with the ad network to ask them if they have good cookies for this session.

What's funny is that this is pretty much what ReCaptcha v3 is. Asking how "authentic" the Google cookie looks.


Maybe they should start randomly sharing adtech cookies through the cloud.

This is similar to what I'd like to see. Sharing cookies is somewhat dangerous since they could have login/user data, but it would totally work to avoid browser fingerprinting. With all the fingerprinting methods (screen resolution, GPU, HW, font-list, etc.) it's a losing proposition to attempt to remove all traces of a unique fingerprint.

Instead I'd like to see a browser that generates such a noisy fingerprint that it is useless: Each time I start an 'anonymous' session, grab a fingerprint from a pool that is sufficiently similar to mine that things render properly (matching resolution for example) but that has also been used by thousands/millions of others.


The big idea here is that the web advertising economy is user-hostile, and has to go. Someone - Apple, Chrome extension, whatever - will go nuclear, and craft the AI ad blocker that will do to ads what Gmail did to spam. God willing.

I think the solution lies in a system that doesn't use HTTP. Been thinking about it for years.

Why? Would you please be more specific?

Ultimately, if you block ads isn't the tracking useless?

Sure, you can figure out the perfect ad to show me, but if I'm never going to see it, you're wasting your time.


In the same way that uBlock is a must-have extension for many people, I won't browse without cookieAutoDelete [0]. Mostly I don't bother authenticating with websites anymore (about 3 maybe, and I just reauthenticate when I want to use those), and for mobile browsing I mostly use Firefox Focus [1]. The ad networks are down to browser fingerprinting now, which will be devastatingly accurate but will realistically be the next frontier in the privacy war. I habitually roll IP addresses from different VPS service providers but I'm not sure there's much more an end user can do currently. Fascinated to see how this front will develop - if things like GDPR are coming into effect, then it's only a matter of time before a gratuitous misuse of data is closed in on, though I suspect it's too late for the current generation.

[0] - https://addons.mozilla.org/en-US/firefox/addon/cookie-autode...

[1] - https://support.mozilla.org/en-US/kb/focus


The browser shouldn't be providing a User Agent anyways. No matter what device I use, my user agent should have no bearing on the HTML or JS I receive.

Random user agent to throw off the fingerprinting. Decentraleyes to defeat CDN tracking.

Another fun one is viewport tracking.

> .. Chrome browsing the web on your behalf, building up a bogus profile for the ad networks

Chrome is an ad network itself, well, the biggest player. I've been browsing in default Private mode on Safari for years, so yes, my fight is over, no cookies for me, lol. And on an always-on VPN.


This is annoying, so I just use a fresh browser profile every time I encounter such a site, i.e. I have a shortcut for:

  $ cat ~/bin/chrome-new
  #!/bin/sh
  # Run Chrome with a throwaway profile kept in RAM (/dev/shm), then delete it.
  TMPDIR=$(mktemp -d /dev/shm/chrome-XXXXX)
  google-chrome --user-data-dir="$TMPDIR" --no-first-run --no-make-default-browser "$@"
  rm -rf "$TMPDIR"

In FF, this seems to be solved with the mechanism of containers and extensions like ‘Temporary Containers.’

Temporary Containers is a savior when it comes to paywalled articles.

Firefox Containers¹ are nice for that. Put the website in its own sandbox.

1: https://addons.mozilla.org/firefox/addon/multi-account-conta...

My only gripe is that the containers won't be in sync across systems, and they are already time-consuming to set up.


This is really an awesome command line hack.

For those like me who need a bit more detail:

1) make a text file called chrome-new

2) put the following contents in the file in a POSIX-like OS

  #!/bin/sh
  TMPDIR=$(mktemp -d /dev/shm/chrome-XXXXX)
  chromium-browser --user-data-dir="$TMPDIR" --no-first-run --no-make-default-browser "$@"
  rm -rf "$TMPDIR"

3) make the text file executable:

  chmod u+x chrome-new

This also uses the chromium executable for chrome, which I think is the default on Debian-based systems. If this isn't applicable to you, change "chromium-browser" on the third line to whatever the executable is for Chrome on your system.
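
One possible refinement of the script, offered as a sketch rather than a tested recipe: the version above assumes /dev/shm exists (typical on Linux) and may skip the final `rm -rf` if the shell is interrupted while the browser is running. The variant below falls back to the normal temp directory and cleans up on interrupt. The function names and the `BROWSER_CMD` variable are made up for illustration; only the three browser flags come from the original script.

```shell
#!/bin/sh
# Sketch of a throwaway-profile launcher with cleanup on interrupt.
# BROWSER_CMD and all function names here are hypothetical.

# Prefer a RAM-backed directory; otherwise use the regular temp dir.
profile_base() {
  if [ -d /dev/shm ] && [ -w /dev/shm ]; then
    printf '%s\n' /dev/shm
  else
    printf '%s\n' "${TMPDIR:-/tmp}"
  fi
}

# Create a fresh, uniquely named profile directory and print its path.
new_profile_dir() {
  mktemp -d "$(profile_base)/chrome-XXXXXX"
}

# Launch the browser on a fresh profile and delete the profile afterwards.
run_with_fresh_profile() {
  profile_dir=$(new_profile_dir) || return 1
  # If the shell is interrupted, still remove the profile before exiting.
  trap 'rm -rf "$profile_dir"; exit 130' INT TERM HUP
  status=0
  "${BROWSER_CMD:-chromium-browser}" --user-data-dir="$profile_dir" \
    --no-first-run --no-make-default-browser "$@" || status=$?
  rm -rf "$profile_dir"
  trap - INT TERM HUP
  return "$status"
}
```

To use it as a script, add a last line `run_with_fresh_profile "$@"`, and set `BROWSER_CMD=google-chrome` if that is the executable on your system.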

Very nice!

Workaround for now:

1. Open the profile menu. (This is the icon in the top right, just to the left of the three vertical dots.)

2. Click "Manage people", click "Add person" (lower right).

3. Type "Darned Newspapers!" and click "Add".

4. When you get blocked, copy URL, use the profile menu to open a new "Darned Newspapers!" window, paste URL.

It's a real profile, so it should behave quite closely and should be harder to detect. Of course, unlike incognito mode, it will save your history, so be aware of that.


Another workaround:

Disable javascript on the newspapers' sites. They load the full article and then use JS to hide it. (They also use JS to do the incognito detection in the first place.)


Often doesn't work, they now take a new approach by only loading in the extra content if you pass their checks (using JS). Not using JS will get you the first paragraph.

Downside of search engine indexers evolving over the years to execute JS on crawled pages.

I hope this never changes. It’s so easy and works so well, even on mobile!

Talking about this makes me feel the same as when I discuss youtube-dl.


I have started giving websites a one-strike rule for javascript. If you pop up a paywall, a modal, an autoplay video, or anything else that distracts me, you permanently lose js execution privileges on my browser.

The world would be a better place if developers knew that they can’t count on javascript.


Alternatively you could use Firefox's Container Tabs.

This works especially well when combined with the Cookie auto-delete extension.


Your other left. :)

Oops, I'll fix that.

Client tracking through browser fingerprinting is a spooky tech. Check these two websites to see for yourself. Try using different browsers, even a VPN connection, and see how trackable you are:

1) https://panopticlick.eff.org 2) https://amiunique.org


The amiunique.org one was very enlightening - along with my user agent, my content language header was giving away a huge amount of information about me. Long ago I selected German (second language) as a secondary language, and that combination is the rarest thing I have going for me.

Goddamnit, I'm blocking everything except CSS and images through uMatrix and it was still able to determine I was unique due to the "If-None-Match" header.

These things state I'm unique every time I go to them. That's the problem with the tech, it can't track very well.

It's stable but not over a long period because your browser will change. But it could be used to link two identifiers together, for example when your IP address changes, to create a chain that can track you over time.

OK, this article was not what I expected. I use incognito mode whenever doing online banking, searching for medical information, etc. I don’t care if a web site knows I am in incognito mode. Indeed, I think it is a web site’s right to know because they may need to enforce free access quotas, etc.

I think they would have no right to USE YOUR COMPUTER to enforce such quotas, please.

I’m a bit surprised Chrome developers went the route of an in memory filesystem instead of trying to sandbox and clear real disk access. Silently using up to 120MB without realizing sounds pretty bad.

You can't allow for bytes to sit around on disk in case of crash.

Maybe they could encrypt with a key kept in memory? That'd still allow detection of use though.


I wonder if the privacy threat model includes being able to prove that you used incognito mode at all, when, or how much. I can imagine all sorts of leaks in that regard (how many incognito disk files were created, size/ctime/mtime, system logging, indexes etc). None of these would require physical access to the machine at the time of incognito browsing, just subsequently.

Might seem like an incidental concern, but being able to vacuum up a pattern of incognito sessions from a seized laptop (at a border crossing, say) and correlate it with the activity of an online pseudonym could be pretty useful.


You could encrypt both the regular one and the incognito one, with the only difference being that you persist the key for the regular one.

> "You can't allow for bytes to sit around on disk in case of crash."

Just check periodically (at startup?) for orphaned temporary storage data. I'm sure there are other parts of the browser that need to do this sort of thing anyway - expired cache data, for example.
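
As a rough illustration of that startup sweep, here is how it might look for the throwaway-profile scripts posted earlier in the thread. This is only a sketch under an invented convention: each launcher is assumed to write its PID to `owner.pid` inside its profile directory, which the original scripts do not do. A browser would implement the equivalent internally.

```shell
#!/bin/sh
# Sketch: remove profile directories under $1 whose owning process is gone.
# Assumes a hypothetical <dir>/owner.pid file written by whoever created <dir>.
sweep_orphans() {
  base=$1
  for dir in "$base"/chrome-*; do
    [ -d "$dir" ] || continue            # the glob matched nothing
    pid=$(cat "$dir/owner.pid" 2>/dev/null) || pid=
    # kill -0 sends no signal; it only checks that the process can be signalled.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
      continue                           # owner is still running, keep it
    fi
    rm -rf "$dir"                        # no live owner: treat as orphaned
  done
}
```

Calling `sweep_orphans /dev/shm` before creating a new profile would clear leftovers from a crash. Note that a directory with no readable PID file is treated as orphaned, which is a deliberate but debatable choice.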


Sure, but those are a different use case.

Why not just use a temporary file system like /tmp?

EDIT: I suppose these are often backed by memory anyway, so not sure if this would solve the problem, but interested in hearing arguments around it nonetheless.


Because there is zero guarantee /tmp gets cleared regularly, if ever.

But that onus is on the OS, not the browser.

Exactly? The browser shouldn't depend on the OS to DTRT.

Firefox will cache videos from sites like YouTube in /tmp last I checked.

On the surface this sounds like a good approach. The overhead of encryption should be very consistent and get buried in general file-system latency.

The point of incognito was never to hide that fact from the sites you're visiting. The point was to not leave a trace on the machine you're doing it from. Basically, it's for porn, not news.

Yeah, it's weird to do this strategic about-turn on it. It feels very much like the personal grievance of someone on the Chrome team rather than a thought-through feature.

But it is super easy to use 120MB by including images or other resources, or by creating elements, SVG, javascript etc.

A browser requires a generic solution to prevent denial of service due to excessive resource consumption.


If it were actually the case that these detection methods were not known when they shipped the in-memory storage solution.. I would find that incredible.

The Dutch cable company 'Ziggo' (owned by Liberty Global) also does Incognito mode detection in their web-based TV player and does not allow streaming.

https://imgur.com/a/TacdDRm

You can check it yourself here: https://www.ziggogo.tv/


I don't believe that's actually their fault, HTML5 EME implementations don't work in incognito mode in Chrome.

But that is another way in which newspaper sites could do this detection if they wanted to: send an HTML5 EME clearkey to a one pixel video in the corner and get back the error response.

I think Google are on to a complete loser here tbh, and I'm not sure why they're wasting development resource. As much as using incognito mode to bypass soft paywalls might be fun for a user, there's no real moral justification. There's no privacy issue here in a newspaper giving a clear and unambiguous statement before you enter that you've got to disable incognito, and a user can either choose or refuse to do it. It's probably the clearest consent screen in the world.


Yeah, I know, but the article seems to not know about this.

Another avenue could be that they just check the uniqueness of your signature; if it's too generic, block content, and only lift the block after installing a first-party extension or something. That way, for most people it will just work, and for the few that are false positives, you have a workaround. The whole goal of incognito mode is also a way to detect it.

It's the same for adblockers; just serve a unique content key with the ad and check back via the ad provider if it was loaded before proceeding to serve content.

Only serving ad-free content to crawlers is no problem either, because IP ranges for the big ones are known and you can't spoof them in TCP. It's all a question of effort vs reward. They probably know just a very small percentage of users will abuse it, so it's not worth it for them to spend a lot of effort blocking it.

For example: in a retail store, if there's a difference in expected vs actual money of ~€4, it's not even worth it to investigate, because it will cost more than you'll get back from resolving it. It's sometimes hard for me to comply because I always want to have stuff match 100%, but it's always an effort vs reward dilemma that you have to work with.


The moral justification could just as easily be turned around: a website does not get to run arbitrary code on my computer, so I can (and do, by default) turn off JavaScript.

If you want to you're very welcome to.

That's really not the same thing as Google actively developing a tool to block soft paywalls, that will primarily be used by people to just not pay for things who really don't give a stuff about people running things on their computer or not.


Partly for this reason I use Firefox with its "delete cookies when quitting" mode instead.

You may like the Temporary Container add-on https://addons.mozilla.org/en-US/firefox/addon/temporary-con...

Why not use Permanent browsing mode?

I meant Always use Private browsing checkbox.

I don't really care about tracking within a session and private browsing had other downsides (e.g. limited history support). It's the cross session stuff that tends to be creepier in my experience.

They can paywall / block incognito all they want.

Just delist the (paywall'd) articles. That's the annoying thing - when articles come up on Google and you can't read them. Please fix this.

If people want to pay, that's fine. Perhaps ISPs should pay for these websites via their plans so there's no more need to login.

I don't want to login to something just to browse a feed. I believe people would be happy to pay for these websites, but in a convenient way, ahead of time. Allow IP ranges, create a browser plugin that reauthorizes the site session even in incognito / w/o password saving. Innovate like Spotify did.

Annoying and badgering the user is 101 UX antipattern. One reason some don't buy is they don't want to encourage it.

It's fair to hold them to a high standard because many of these websites are articles and presentation is supposed to be a forte. You don't battle the adblockers and incognito modes - you fight to make it easier and more convenient for your readership.


Yes. I just open an incognito window in Chrome and go to www.instagram.com and then it prompts me to log in via a list of Instagram accounts I've ever logged into in the non-incognito window.

If you use Chrome, did you really expect any kind of privacy to be respected?

Well, they should be sued for false advertising. I know Google slices and dices all my info, but I expect them to honor my privacy when in incognito mode.

Seems like the TWTF is that in normal browsing mode, sites are allowed to use considerably more than 120MB of "temporary" storage on the filesystem.

I agree; by default it shouldn't be so much. (Maybe the user could configure it (to any value between zero and 4 GB, perhaps), but even then, the limit should not differ for incognito mode unless the user configures it that way.)

Keeping local storage in memory might make sense if Incognito Mode had separate storage for every tab. However, it's just another browser session that gets wiped when the last tab gets closed – you cannot have multiple parallel ephemeral sessions. I remember having to install Chrome Canary because I needed four separate Chrome sessions simultaneously.

Your workflow is your workflow, but as an FYI Chrome profiles _also_ get their own everything, plus each profile also gets its own incognito session

There is an added benefit to using profiles in that if any other window is open from a separate profile, then all profile context menus acquire an "Open link in ..." menu item which will then list all the other profile names. Unknown why it doesn't do that context menu modification all the time.

If your other profiles are only used for development-time scenarios, you can also choose to "Clear Browsing Data..." on them at will, since you won't be losing anything valuable


They can track Mirimir all they like. Not that it'll do them much good. Because Mirimir only does stuff that I'm OK with everyone knowing about Mirimir. And none of it is connected to my meatspace identity, or to other personas.

Unless you're really paranoid, all you need is a VM, which hits the Internet through a VPN service. You use the host machine for meatspace stuff, and the VM for private stuff.


If Google really wants to fix this, they could always have Incognito mode detection as a negative ranking factor.

I wonder if it's occurred to these sites that essentially they're poking the bear. Sure, Chrome can slowly patch these issues, but there's always a possibility that Google just turns around and says "Hey guys! Our search engine crawler looks like incognito Chrome! Fuck you!"

Alternatively, instead of fixing these issues by making the incognito mode detection break, they should fix them by making regular mode look like incognito mode. Suddenly, all readers are blocked from reading NYTimes for half a day while they scramble to remove it from their site. Then they can start teaching sites that relying on hacks isn't a viable, sustainable strategy.

Seems like it was a mistake to use a RAM disk to back the storage API in incognito mode?

Why not just create a new "real" storage db on disk, deleting it when the incognito window/tab is closed? It seems like this approach would defeat all of this class of attacks.


Why not just forbid the storage of user data? It would solve most problems. Why does a whole society need to be taken hostage for the profit of some companies?

I seriously do not understand.


Pretty embarrassing TBH. Both of those detection methods immediately come to mind when reading about their in-memory solution to the storage API problem. I can only imagine Google phoned it in on this one.

In a galaxy far far away Opera 12.x offered fully customizable local storage, with per domain/subdomain enable/disable/quota/delete on exit options. Google owns our browsers and won't let us do anything crazy like configure away stuff useful for ad targeting.

Well duh, it is Google we are talking about!

I believe google has cut a deal with advertisers to ensure they can always identify incognito.

So do random cookies?

Is this a cat-and-mouse game that's worth playing for Chrome?

Browsers are so complex that I imagine incognito mode is always going to leave some kinds of statistical signatures that can be detected and exploited through merely moderate cleverness, but will be much harder to hide on Chrome's side.

Is it worth it for Chrome? Or would resources be better spent on other parts of the browser?

It's not really a privacy/security problem or anything as far as I can tell -- just a way to bypass paywalls, right? "Sites not detecting my incognito mode" never felt like part of the web's "contract" to me.


One part of the "web's contract" is that googlebot and other scrapers should get what the users get. And vice versa - if you're telling googlebot that there's a particular text available to the public; I'd expect to get the same content that you just gave googlebot and not something else, thank you very much.

That used to be the case. But Google has long ago given certain sites a pass on that.

Did they? This fight against Incognito mode suggests otherwise

Ever try to click on a LinkedIn profile from a search result? Most of them just show a LinkedIn login screen, and none of the text from the search result's snippet.

Shocking I know, but it seems like Google contradicting other parts of Google due to a lack of strategic leadership, which almost never happens. Daily...

Google will index paywalled content from news sites provided they annotate it correctly. [1]

[1] https://developers.google.com/search/docs/data-types/paywall...


What do scrapers and indexing engines have to do with it? Is it normal for them to run a full headless browser in incognito mode?

I was under the impression that they're either just doing straight HTTP requests for HTML only... or they're running a full headless browser in normal mode.

So I'm not getting what's different here?

Sites have a long history of serving up different content to different users, e.g. to paying users, or blocking certain countries based on content contracts. It's certainly not part of the "web's contract" that scrapers get paywalled content they haven't paid for.


As far as cat and mouse games go, it's one that's tilted heavily in favor of the Chrome devs. The mice have to advertise themselves very conspicuously in order to gain utility from their workarounds. Cats can usually catch mice that announce themselves.

All they have to do is navigate to a site using incognito mode detection and briefly review the code to find the next hole to plug. In this case, probably stop advertising the correct limit for incognito mode, and introduce latencies on writes to mimic a real file system. These are not trivial fixes, but they also are not hard.


Pretty important for scrapers ...

"Can this fight ever be won?"

No, the point is to make the people subverting this for their own nefarious gains (looking at you, NYT) put so much effort, money, and time into it, that eventually they die a slow horrible death and, maybe, just maybe, something better and more relevant and less evil comes along (or maybe NYT changes their ways - either works).

I mean, look at this thread, so many great undermining methods! Beautiful.


After all the newspapers die a slow horrible death, I think the world will figure something better out.

But it might take a few decades of uninformed confusion.



