> suggesting that Stratechery includes trackers from Amazon because I rely on AWS is ridiculous.
On the contrary. Suggesting that Amazon aren't tracking requests to Cloudfront is ridiculous.
You don't need to have malicious intent in order to accidentally send your users' data to third-party companies. Every single time you load any resource from anywhere, you create at minimum a line in their log file that contains your user's IP address, their User-Agent string, the content their browser requested, and the time they requested it.
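To make the "line in their log file" concrete, here is a minimal sketch of what a third-party host can read back out of one request. It assumes the server writes the common "combined" access-log layout; the sample line, the IP address, and the `blog.example` domain are made up for illustration. Note that this layout also records the Referer, i.e. which page the visitor was reading.

```python
import re

# Assumption: the third party logs in the widely used "combined" access-log layout.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line: str) -> dict:
    """Return the per-request fields recorded for a single resource load."""
    match = COMBINED.match(line)
    if match is None:
        raise ValueError("line is not in the assumed combined format")
    return match.groupdict()


# Hypothetical log entry for one font request made by a visitor's browser.
SAMPLE = (
    '203.0.113.7 - - [12/Aug/2019:10:15:32 +0000] '
    '"GET /fonts/example.woff2 HTTP/1.1" 200 18432 '
    '"https://blog.example/post" "Mozilla/5.0 (X11; Linux x86_64)"'
)

entry = parse_log_line(SAMPLE)
for field in ("ip", "time", "path", "referer", "user_agent"):
    print(f"{field}: {entry[field]}")
```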
EDIT: Personally, I no longer load any resources from third-party domains in new code I write. Bootstrap, jQuery, etc. get served from the same domain as the rest of the website. Google Analytics is an absolute non-starter. And if you're concerned about wasting loading time or bandwidth on popular assets, you can use the "integrity" attribute to let the browser reuse a cached file from a different domain.
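For anyone who hasn't used the "integrity" attribute: its value is a hash algorithm name plus the base64-encoded digest of the file. A minimal sketch of generating it is below; `sri_hash` and `script_tag` are hypothetical helper names, and the file contents and URL are placeholders. As far as I know, the attribute itself only makes the browser verify the bytes it received; whether a cached copy gets shared across domains is up to the browser, so I wouldn't count on that part.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity value: '<algorithm>-<base64 of the raw digest>'."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(src: str, content: bytes) -> str:
    """Render a script tag that asks the browser to verify the fetched file."""
    return (
        f'<script src="{src}" integrity="{sri_hash(content)}" '
        'crossorigin="anonymous"></script>'
    )


# Placeholder bytes standing in for the real library file you vetted.
library_bytes = b"console.log('hello');\n"
print(script_tag("https://cdn.example/lib.js", library_bytes))
```

If the served file ever differs from the bytes you hashed, the browser refuses to run it, which at least covers the "modified by an attacker" case for that one file.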
> Suggesting that Amazon aren't tracking requests to Cloudfront is ridiculous.
Counting a solitary log that an IP loaded a resource is hardly the tracking that is harmful. This is the absolutism that derails the discussion for me. In one step, you have forsworn any service which doesn't own its entire stack from CDN to network switches to racks in the data center. How many websites do this? Worth noting that HN doesn't and is therefore allowing "third-parties" to "track" you. And yet you have no problem using HN.
> Counting a solitary log that an IP loaded a resource is hardly the tracking that is harmful.
Well, the thing is you don't know if it's a single log line. What if the 3rd party script you loaded has been modified by a malicious attacker, as happened with the Great Firewall DDoS on GitHub last year?
As server operators, we have a responsibility to preserve the privacy and security of our users. This includes the possibility to leverage 3rd party services when we need it, as long as we make it clear to the users.
Having 3rd party requests taking place on the backend means you take the security/performance measures to ensure your website operates properly. Loading these 3rd party resources on the client side means those operational costs/risks lie on your users. You are literally opening backdoors to the browsers of all your users.
Also worth noting, 3rd party resources & client side rendering (among other recent silicon valley fashions) play very poorly with higher-latency links. It's a truly degrading experience for your users.
> This is the absolutism that derails the discussion for me.
Sorry. Many of us have assumptions which are hard to change.
> In one step, you have forsworn any service which doesn't own its entire stack from CDN to network switches to racks in the data center.
Not quite so. There are modern tools which make owning the entire stack not absolutely necessary: end-to-end encryption and proxies, including Tor-like ones, come to mind. We're in a state of flux regarding privacy, with both losses and some gains, so it's not binary.
If HN uses it, it's a reason to improve things - not to dismiss as unattainable the privacy feature which seems so essential for many purposes.
HN doesn’t own their entire stack. Since there is one or more third parties involved, those third parties are tracking you as you read this ... or so GP would have you believe.
Do we know what tracking Amazon does and how harmful it is?
I mean, I'm willing to believe that Amazon just doesn't care what I get up to. But as somebody who has watched the Amazon website be ever more about revenue maximization without regard to my interests, I can also believe that they will use any and all data if it gives them as much as a 0.1% bump in a key metric.
I'll also note that a lot of what people shrug off as normal nowadays would have seemed wildly paranoid to most people 15 years ago. So I don't think our intuitions for "oh, they'd never do that" are particularly reliable.
Then again, Facebook uses your security QA pairs as personal info and adds them to your “what we know about you” profile. Even harmless security and logging features lately are being used for tracking beyond their original scope. It’s a bit of a difficult position to navigate.
Honest question: I have a personal page on S3 which I deliver using CloudFront. Should I think that this is unwanted behaviour for my (certainly few) visitors?
I am asking in full honesty, not trying to create an argument here.
It's less than optimal from a privacy perspective, and it's worth being cognizant of that, but it wouldn't be enough to stop me from looking at the website.
Just don't serve stuff out of CloudFront and then cry foul when people point out that you're helping Amazon track your users around the Internet. :)
There's virtually (heh) no way to host a web page that doesn't involve another company's infrastructure (AWS, your ISP, etc), even any number of nodes logging that some packets were forwarded from A to C.
If it's important to you, then the best that can be done is to find a company you can trust not to have a profit motive to use this data, and to always use HTTPS.
AWS is big enough that there's probably very little profit in analysing CloudFront logs but then again I don't know that.
What if I host my server at a data center? Does data center network hardware log connections? I know that technically it can do that, as it only has to watch for TCP connection starts (and, maybe, collect statistics for each connection), but are they really doing that?
I'm not saying whether they are or not, but I'm arguing from an absolutist point of view. They're probably not, but you could make just as strong an argument that GCP and AWS are probably not either, because at the end of the day we just don't know.
My point being, at some point in the chain you have to trust a corporation is not doing something evil. It's all a calculated trade-off at the end of the day.
The question is, in which aspects do you actually have to trust the intermediary's behavior, with no verification and no ability to enforce it? How costly would it be to avoid needing that trust?
My opinion, as I stated before, is to follow the money trail.
I'd trust Digital Ocean or Rackspace over Azure, for example, because Microsoft does get some money from advertising/tracking.
If you're terminating TLS directly in 'your' cloud VM in Azure/GCP/AWS, I think that's an acceptable boundary as I don't believe these companies would risk accessing customer VMs (since it's not worth the backlash they'd receive in return from being discovered).
I think you're mixing two different aspects of the Tor project.
On the server side, onion services (running a Tor daemon that reverse-proxies requests to your HTTP/mail server) provide end-to-end encryption and secure name resolution (transport security).
On the client side, the Tor Browser is a custom Firefox edition that aims to make user fingerprinting impossible among Tor Browser users. From a server's perspective, they are all the same; they also come from the same IP addresses (the Tor exit nodes) when the service is not an onion.
Enabling an onion service does not prevent 3rd party tracking, although it raises the bar. Third-party scripts/cookies can still do plenty of tracking. That's why those (as well as web fonts and SVG) are disabled in Tor Browser's Safer security settings, and JavaScript is completely disabled in the Safest mode.
So fingerprinting protection really is taken care of on the client side. Neither onion routing nor onion services will help you evade tracking if you use a classic browser.
Yeah, pretty cool, if you're into cypherpunky solutions to an extent where you don't mind the accompanying dystopia.
I get why some users need Tor, many of them for agreeable reasons, and that they would benefit from more people using Tor (and having more nodes available). But I think making it the default way of using the internet for everyone would be stupid. If the average connection goes through n hops, the entire infrastructure would have to be multiplied by that factor. Just the environmental impact (in terms of increased energy consumption and hardware production) would be significant. And when Tor is the only effective option you have for escaping unwanted tracking, you're disadvantaged as soon as you want to use a service that doesn't work well with the increased latency.
There has to be a better way. Not that I could say what exactly that would be, but what would be really cool to see is a world in which everyone has worry-free internet access. Multiplying the cost doesn't seem like the best approach.
There is a difference between there being "virtually" no difference and there being no difference. Allowing a third party to handle TLS for you (as CloudFront/Cloudflare and other providers often do) undermines the point of encryption. For a static site that makes little difference, but encryption herd immunity is still compromised.
Proper encryption is not just for important stuff, it's for everything, so that when we send important stuff it is not seen.
Usually I think Stratechery is spot on. But though I agree with a lot of points and it ends on a great note, I have some strong disagreements with this one!
(1) Anything loaded from a third party is potentially a tracker until proven otherwise. If you load Google Analytics, Google gets pinged some information about every visitor. Typekit definitely collects tracking data, although they claim it is anonymized and not identifiable. I believe the same goes for New Relic, and both share (or sell?) aggregate user data with other parties as far as I know. Etc.
(2) A beneficial outcome from Apple's stance and the privacy debate would be self-hosted analytics and login-type services that don't connect to any third parties. A site like Stratechery could pay for a license to use this software, but host it on their own servers, and have a strong privacy policy around what data is recorded/shared.
The response to (2) is likely to be more proxying of requests. Websites will still want to outsource and they can send data server-side, via rpc.
The current way of doing things via client-side requests is transparent to any web developer who can use the network panel in a debugger, though not to end users. Do it all server-side and nobody will be able to see what's going on. It will be more like how credit card data is shared.
That's a good point that it could decrease transparency in practice (of course in theory sites can already do server-side data sharing now). But it's beneficial if we have strong enough requirements for disclosing / privacy policies. It makes the website directly responsible since they're handling the data.
I found this article incredibly naive. A recent stint in ad tech showed just how deep the multi-company coordination and orchestration go. Assuming that linking to Google's fonts or images doesn't expose your users or that Adobe's Typekit isn't part of Adobe's Advertising Cloud seems like putting your head in the sand.
Particularly Pollyannaish is the author's handwaving Google Analytics away with an "only for counting conversions". Holy Crap!
Precisely. The article seems to take the tone of "All of this stuff isn't really for tracking" when that's not strictly true. Instead, he doesn't use it for tracking, but that doesn't mean the data isn't still generated and used by some of the third parties. Even if I grant him the necessity of doing things this way, as he seems to claim, that just bolsters the argument that invasive tracking is baked in to nearly everything, that you simply can't do business like this without it happening.
> Even if I grant him the necessity of doing things this way, as he seems to claim, that just bolsters the argument that invasive tracking is baked in to nearly everything, that you simply can't do business like this without it happening.
It's much the same problem as (among other examples) mobile device permissions.
Things have improved somewhat, but for years every Android app asked for permissions like "view and edit the files on your device". 99% of the time that was innocuous. Saving a single user setting could require creating and viewing a file. It's easy to see why devs with no good alternative would have been frustrated to hear "hey, why do you want permission to see all my stuff?!" But that didn't make it good or secure. Flashlight apps would get file permissions to save a single setting, then hoover up as much data as they could get with that permission.
The problem was precisely that the innocent, 'right' way of using permissions was potentially a channel for abuse. If people object that loading a single font from Typekit is inherently a violation of privacy, maybe they're fundamentalists. But in practice, what I see people point out is that it creates a threat to privacy, that the entire ecosystem of data access is inseparable from the ecosystem of tracking and advertising. And that's absolutely true.
With regards to Android apps asking for permissions like "view and edit the files on your device" -- I only recently stumbled into something about this while helping a friend with understanding app submission.
There is/was a long-standing bug on Android that as far as I can tell Google never completely fixed (though I might be wrong). For apps that have both an .APK and .OBB file, the Play Store would save the .OBB file to the user's device with the incorrect file owner. This would cause the app to fail (or at least the app wouldn't see the .OBB file already there and would, I think, re-download it). The workaround involved requiring the app developer to ask for this permission, even though their actual app didn't need it, in order to avoid this issue. Another workaround was to ask users to reboot their phone, since (I think) the permission would be corrected after a reboot (not sure, didn't delve too deeply into that in particular).
For those that are unfamiliar: .APK files are the application packages and they're limited in size by the Google Play Store. To get around the size limit, developers can create a larger .OBB file that they could host on Play (I think?) but then have to download. Google then implemented automatic downloading of the .OBB file by the Play Store, but as far as I can tell never got it right. The whole thing seems to be getting superseded by a new format called AAB (Android App Bundle).
What about desktop apps? Every Win32 app has permissions to view and edit all the user's files and data, arbitrarily snoop on and hook into other processes running as the user, etc. The situation on Mac and Linux for most users is similar, except for sandboxed Mac App Store apps.
On Windows there is a modern app model (UWP) that provides for app isolation and more granular permissions but the push to get software moved over to it seems to have stalled.
That's an excellent point. The article focuses on stratechery.com's intent while not understanding the privacy disaster they are happily participating in.
Asking us to trust him because he doesn't use it for tracking is much like a kid telling their parents that they want the PC "for education!". Sure, maybe some do.
The article kinda defeats itself with its own examples.
E.g. he includes JavaScript from Adobe for a font. Like why is that?
I don't know if Adobe uses that for tracking, but they very well could and that's hard to deny. They could control everything on his page, because he uses a font from them. Sometimes there are tradeoffs, but there's no possibly legit reason for this. A font is a static file that never changes. You don't need JavaScript to load a font and you certainly don't need to include something from a third-party host.
Typekit (acquired by Adobe) makes licensing and using web fonts MUCH easier than it is to do without. I share your distrust and dislike of Adobe, but let's not pretend that there's no value to that particular service.
I am a typography nerd but let's be real here: web fonts are not so important that you should trade away your visitors' privacy. A CSS system fonts approach [0] is not only great for user privacy but results in better performance, security and more consistent GUIs.
Typekit uses the same CSS system. One massive reason people use services like it is licensing. It’s a pretty cheap service, no per-font costs and you are sure that the way fonts are delivered is legal.
I am not using one of these services. I had to pay $600 for a font family, still have a page view count in my license (I am supposed to track that myself and purchase a bigger license if it exceeds) and I have to make sure fonts are not easy to download (license requires me to figure out how to prevent downloads myself).
So it’s not hard to see why Typekit is very appealing — you don’t have to deal with any of that.
There are! The second paragraph in my first comment goes into detail on this. You can purchase the font and get the files, but it's still on you to track views and make sure the files are delivered to the browser safely (preventing easy downloads).
Edit: it seems this entire thread is heavily downvoted -- both sides of the argument. Is it off-topic? Bad faith? Why the downvotes?
He's saying many of these resources are necessary to run the site, and that it takes effort to not collect data. I'm not disputing that, but isn't it somewhat beside the point? The data is still generated & collected, even if there's no intention of it being used (which doesn't mean it won't be used at some point). If anything, the argument from privacy would point this out as a deep systemic flaw in the system, that even sites that have no interest in any of this data still have little choice but to generate it if they want to avoid laborious & costly custom solutions to the task of running such a business.
> the data is still generated & collected, even if there's no intention of it being used.
As an aside, this practice is explicitly banned by the GDPR: if you cannot justify the collection of data based on actual, current business requirements, you cannot legally collect and store that data. From https://gdpr-info.eu/recitals/no-39/:
> the specific purposes for which personal data are processed should be explicit and legitimate and determined at the time of the collection of the personal data.
Oh shucks, they put up a fence post. No way around it, I guess, we'll turn around and we'll hold ourselves accountable, if nobody else will. I already see ministries of privacy on the horizon rivaling the IRS in proportions and geez am I glad that nobody is gaming that system.
I too remember being annoyed at the thought that my blog wouldn't have accurate analytics anymore if people's ad blockers blocked Google Analytics from loading.
Then I realized tracking even for "benign" purposes isn't a right for website operators and we need to have more respect for our users' computing decisions.
I took Google Analytics off completely. Had no analytics for a while, then tried out my own install of Matomo. That way at least if I'm allowed to "track" my audience, it's using an open source tool running on my own server.
My point is that I don't agree with the author at all. Every third-party resource is a potential vector for abuse, and even mere fonts or payment processors aren't immune from critique. Does this cause hassles for website operators? Sure. But that's the price of doing business online in 2019.
The web desperately needs a privacy course-correction, and the more operators resist the inevitable consumer backlash, the more they will shoot themselves in the foot.
And the HTTP protocol gives you and your users a way to negotiate a fair and mutually-beneficial deal. In its usual implementation, it's called auth, or more commonly, a (pay)wall.
> Each dot represents one tracking resource (like a script, tracking pixel or image), which would be blocked by an ad-blocker
[emphasis added]
Immediately after excerpting that, the critique:
> This strikes me as an overly broad definition of tracking; as best I can tell, Manjoo and his team counted every single script, image, or cookie that was loaded from a 3rd-party domain, no matter its function.
Ad blockers typically don't just indiscriminately block all third-party resources. I suspect that the author simply isn't aware of which things are and aren't tracking users, and assumes that they don't track users unless that's their explicit purpose.
> Ad blockers typically don't just indiscriminately block all third-party resources. I suspect that the author simply isn't aware of which things are and aren't tracking users, and assumes that they don't track users unless that's their explicit purpose.
Yep, and they seem to misunderstand the Webkit page too. As I read it, they're not even planning to block ad and tracking scripts at all. What they are blocking is:
1. Cross-site tracking (e.g. using a stored UUID or tracking cookie)
2. "Covert" means of tracking that fingerprint users (and therefore enable cross site tracking that can't be otherwise blocked by the browser).
I don't see any evidence that the Webkit devs consider third party requests to be cross-site tracking per se. You and I (rightfully) worry about tracking sites correlating users via IP addresses and other means, but what Webkit seems to be planning is simply isolating individual sites, kind of like Firefox's containers. It's not much. Maybe there's hope they'll clean referers too?
I can't help but notice this blog entry renders fine with all JavaScript disabled: https://0x0.st/z4bi.png
Despite this, the site wants to load 15 or more first party scripts and more still from third parties. Considering these scripts are evidently not necessary for the reader's benefit, what purpose am I to assume they serve that isn't contrary to my interests?
Most sites add analytics by default without having any CRO strategy in place, let alone an understanding of the privacy implications. It's just a case of "let's just add analytics and maybe we might find it useful sometime in the future". It doesn't help that Google Analytics, the biggest provider out there, has an atrocious, maze-like GUI that makes understanding these matters difficult for even experienced web professionals.
I have uMatrix set up to block third-party scripts by default and loads of sites work just fine. I am very liberal in approving things that are necessary for the functionality I want (such as third-party video players, etc) but a lot of the internet doesn't seem to degrade at all.
The key reason people have a problem with data-mongers is deceit. It's not that absolutists fail to understand why privacy is violated; the objection is to how. For instance,
- Data Collection: No way to opt-out. Ex: Analytics.
- User Studies: The default is almost always opt-in. Ex: Siri recordings.
This, coupled with the fact that a majority of internet and smartphone users do not understand the first thing about the ad industry, makes this a racket of gargantuan scale.
Also, whether one pays for the services or not, tech companies have made it routine to encroach upon personal data to increase revenue / profits without providing real value, or any value, to the end-user. There's always a grave concern about what happens to the collected data when a tech company goes under, and for a reason.
- Third-party services: Blatant disregard of end-user's privacy with separate terms, conditions, and policy to first-party. Ex: Payment gateways selling purchase data.
- First-party services: Often weasel-worded yet ever-changing, generic and vague privacy policy. And, more often than not, eventually, the user ends up being the product. Ex: Anti-virus software / Other free Google services like mail, search, maps.
The concerns raised by so-called fundamentalists aren't so much about privacy itself as about the kind of strategy it enables, with companies from all corners vying for users' ever-decreasing attention spans to keep them hooked and addicted so they can show more ads, coerce them into buying things they don't want, manipulate them to further someone else's political motives, sway their thoughts to sell more outrage and anxiety, expose them to content just to push the right buttons, target their emotions to control their behaviour... aren't all of these dark patterns abominable? Aren't these in use by the multi-billion dollar Internet businesses because it's an easy strategy?
There's more to privacy violations than meets the eye. I expected a more in-depth analysis than concluding that a compromise in privacy and security is okay 'cause it helps weed-out terrorists and pedophiles, and create successful business, like stratechery, online.
I find the last few paragraphs about Apple and Webkit to be super... ad hominem? "Apple is taking a stance with Webkit to be absolutist, but Apple violated the privacy of their users with Siri, so why should they be absolutist about Webkit" I really didn't follow that or think it was relevant to the actual discussion around tracking protection at all.
I understand the point about 'gray area', however there is truly a difference in power here, and end-users are never accurately informed about what of their data is available to 3rd party advertisers. In fact, I don't think content producers like Stratechery know what data ad trackers track and how it's used.
There is a problem about funding good content like Stratechery, but surely mass surveillance by private corporations isn't the only option?
I want everyone to be able to go on a website, checkout products in an online shop, look up the menu of a restaurant or post on a forum with an alias, without anyone keeping track. I want everyone to write emails encrypted by default. I want the data transaction to be transparent, temporary and local, like when I decide to buy something online, and need to give them my address. I want services like Siri, Alexa to work local, offline. I want authentication services using public/private keys.
I want all of those things set in stone, not to be overruled by a default checkbox or "I agree" button.
I want a massive, global deletion effort of all the data that had been acquired during the last decades.
> the widespread creation and spread of data is inherent to computers and the Internet
It is not. It is that way because we developed software to be that way.
Edit: I don't believe in absolute privacy, we never had that, but I do think people get surprised and find things creepy for a reason: we are abusing the trust that people had put in us software developers and hardware manufacturers.
Except I can't even visit the landing page of Outline without the page trying to jam analytics packages down my throat. The ubiquity of that garbage is the topic at hand, and it's refreshing to see tools that don't take that route.
We are deeply embedded in an era of surveillance-capitalism. Technology has made it trivial to passively gather information about people and inexpensive to store data indefinitely, so companies gather, stockpile, and re-sell everything they can. Crucially, the future applications of this data are not limited by our current imagination.
I find the dismissive attitude of this article extremely disappointing. Bad actors are getting away with overreach because they are given the benefit of the doubt. Stop enabling them. Re-evaluate every aspect of your interaction with third-party services and truly consider what they could be doing, from the least-generous perspective available.
Consider how many "best practices" and tools for web development (including browsers themselves) are ultimately designed and popularized by companies with an open interest in spying on users. Technologies like CDNs, AMP, embeddable analytics packages, generic authentication providers, and social sharing buttons all provide some utility or convenience, and may be acceptable or appropriate in some contexts, but it is absurd to consider them harmless. If you have a website, the buck stops with you.
Countless websites, including Stratechery, are literally selling the privacy of their users for kilobytes of hosting costs or mild conveniences. Step up your game, and don't pretend you can't treat your audience with more respect.
> the truth is that all of these “trackers” make Stratechery possible.
Do they? I use uBlock Origin in hard mode[1], meaning just about everything is blocked. Every third-party domain is blocked, as are webfonts and first-party JavaScript. Yet, the site works and I can read the article without a problem.
You could make the case that Stripe is necessary to “make Stratechery possible” because it allows people to pay for memberships, but all the others seem superfluous.
I'm quite torn that privacy protection tools, which I use liberally as a consumer, can foil fraud-detection tools that I also rely on in my business.
Browser/device fingerprinting is one of these things... bad when used to track you for advertising purposes across the web, good when it can be used to prevent account hijacking ("Hey, it looks like you're signing in with a new device! Please enter the code we just sent you via email.").
It seems to be a "this is why we can't have nice things" situation. The harder it is to identify someone, the harder it is to take proactive measures to defeat credential stuffing attacks and so forth.
You really shouldn't be fingerprinting at all, as sexy as a risk department may find it. Imagine what the social interaction would be like if you implemented that functionality in the physical world, but replace the computer with the user's home, and the service with you.
Before even so much as a how-Dee-do, you're running all over the house cataloging the exact position of everything in order to "decrease your risk exposure."
This would get you shot in some locales. Is it truly a mystery why this type of intrusion makes people uncomfortable?
It's one thing to explicitly ask to do so. Implicitly doing so without giving the client a chance to refuse is both rude and exploitive.
As much as I wish it was unnecessary, we run a cryptocurrency platform. There are constant attempts to sign up using fake/stolen credentials, and we have to make sure our users don't fall prey to various attacks that could lead to total loss of their assets. Fingerprinting isn't the be-all and end-all and it's not 100% reliable, but it's extremely useful as part of a broad set of signals for detecting bad actors.
The comparison to a physical address is silly - the address is the fingerprint. If I can meet you at your house, and obtain a legal document demonstrating that you own the house, I don't need to do anything else. There is no such equivalent for most internet-connected devices, or at least not one that's obtainable outside of law enforcement applications.
I'm all for fundamentalism on that front, but among the undesired consequences listed, those related to SSO and 3rd party auth are indeed problematic. Not that I think that the current protocols are holy and should be kept at all costs; on the contrary, I find the privacy issues in those solutions very problematic.
I avoid using Facebook or Google auth for any 3rd party service.
Nevertheless, at least in enterprise environments, there is a very strong and justified need for federated auth. This is an issue that must be solved to keep businesses and users secure.
I remember when I was young, being told not to say or do anything online I didn't want others to know (private messaging being the exception). When did we collectively forget this lesson?
Among other things, the line between online and local to your computer has been blurred quite a bit. Before all this JavaScript tracking became common, the only information a website would get when you visited a page was... the page you loaded and the IP you loaded it from. A third party may have gotten similar information if there were embedded resources.
Now, though, thanks to client-side analytics, that website and any of those third parties may be recording your mouse movements, checking to see how long you stay on the page, fingerprinting you based on what addons/fonts you have and what your screen size is, etc.
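To illustrate the fingerprinting part, here is a rough sketch of how a handful of individually boring attributes can be folded into one stable identifier. The attribute names and values are hypothetical, and real fingerprinting scripts collect far more signals than this; the point is only that no cookie is needed for two visits to line up.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a canonical encoding of browser attributes into a short identifier."""
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values a script could read from one visitor's browser.
visit_on_site_a = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "screen": [1920, 1080],
    "fonts": ["DejaVu Sans", "Fira Code", "Noto Serif"],
    "timezone": "Europe/Berlin",
}
# The same browser showing up on an unrelated site that embeds the same script.
visit_on_site_b = dict(visit_on_site_a)
# A different visitor who differs only in screen size.
other_visitor = {**visit_on_site_a, "screen": [1366, 768]}

print(fingerprint(visit_on_site_a) == fingerprint(visit_on_site_b))
print(fingerprint(visit_on_site_a) == fingerprint(other_visitor))
```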
It's more than just on the web. If you took a photo on a camera, it stayed on the camera. If you take a photo on your phone... is cloud sync enabled? Because it might be by default. I was pretty shocked at how pushy the preloaded apps were when I recently bought a budget Android phone.
You are being surveilled and tracked by governments and corporations regardless of what information you willingly provide or what actions you willingly take. People should be aware of how this is happening and of methods for protecting themselves. Simply not saying or doing anything online you don't want others to know is absolutely not an effective method of maintaining one's privacy in the modern tech landscape.
I mean... people have been using the internet for porn for a long, long time. We've never expected our email provider to also happen to collect analytics on our less public browsing habits. "Private" mode was originally just to keep your porn out of your browser history, not to defeat trackers.
We didn't. But both the scope of our online activities and the capabilities of online adversaries have increased. "Not saying or doing anything online" is no longer sufficient to make sure others don't know.
Happy to be a privacy fundamentalist and I'd also be happy to see the entire infrastructure that pays for the internet grind to a halt in favor of pervasive privacy.
I agree that GDPR has strengthened incumbents. I also agree that Facebook and Google are dangerous. But I think that appropriately siloed ad-tech can be valuable. The problem is that ad-tech refers to all technology aimed at advertising. These range from useful to creepy. I am omitting the outright fraud from this.
For example, things like television attribution providers which listen through your smartphone for watermarked advertisements and see whether you bought something that was advertised on the TV or radio are pretty creepy. And I have heard too many anecdotes of Facebook ads showing up after conversations not to suspect Facebook is doing the same with conversations (though I can't prove it and can't be sure). The level of intrusiveness there is very high and the creepiness factor is similarly troubling.
But now let's look at something very different. Suppose you look at providers that simply do analytics on something like purchases and only share the data with the client. To my mind there isn't so much wrong with this. And you could have third party referees that can then referee disputes on whether a given click led to a purchase or not, allowing better payment models. Those are not so problematic.
Everything I have seen of the advertising industry (print, online, etc.) suggests that it is rife with all kinds of problems. And yet the industry exists because advertising is an important part of reaching an audience business-wise and so people put up with all the crap. So I am not sure that ad-tech needs to die. We need a better ad-tech industry. One without the creepy parts.
To this end, I think we need more things like GDPR, not fewer.
Every now and then we see articles which dissect how the Internet works, and then are appalled at all the supposed privacy violations. Trackers are part of the web and have been for a long time. I don't see them going away soon. They fund the web and keep it moving along. We get into trouble when all our browsing is done in one session / container, with no effort made to containerize it so that chatty scripts and trackers can't talk to each other in any meaningful way via cookies as you traverse the web.
Personally I divide up my browsing into many different compartments to avoid or seriously hinder tracking. I also only ever enable JS on a site if I really have to. I am suspicious of news articles that demand JS just to read a small bit of text. This is not some magic bullet - I know I will be tracked; but at least now my identity is divided up so much that it's hard to discern who I really am. Couple this type of compartmented browsing with an anonymous mixing network like Tor and it's game over for trackers.
Having one single container / session for all your activity (and without an adblocker like uBlock Origin) is borderline stupid in this day and age and can do more harm than good. The modern web has been designed to be as hostile as possible when you surf 'bareback' without an adblocker and without dividing up your sessions into discrete containers. Of course you will get people who like to surf bareback, with literally zero configuration in their browser and without an adblocker. These are the people who are the most exploited, their every activity mined like it was oil - their whole life put into databases that could leak or be hacked at any moment, meaning they're very susceptible to 'doxing' or having their life ruined.
> Trackers are part of the web and have been for a long time.
And that's a big part of the problem. Just because something has wormed its way into common use doesn't mean that thing is acceptable.
> They fund the web and keep it moving along.
There are still large swaths of the web that this isn't true for. And for the parts that it is -- that's also a huge part of the problem and needs to stop.