I just wanted to chime in from Stack Overflow here and let people know: we are aware of the issue. And we're NOT okay with it. We're trying to sort out how to kill the audio behavior now. It's not very straightforward to find where it's coming from, but we are working on it. We've also reached out to Google for their assistance in tracking it down. If anyone can offer advice, we'll more than happily take it.
- Nick Craver, Architecture Lead at Stack Overflow
Let's be adults here. This is SO, and I imagine you've used and enjoyed the use of their services just like the rest of us. Support them by letting passive ads sit on the edges of the page, and appreciate that they are actually trying to solve this issue.
They could “solve” the issue by not having third party ads. Of all the sites on the internet, StackOverflow has the demographics that any advertiser would crave. How large of an inside ads sales force would you need to target higher than average income earners?
How large reputable sites trust third party ad servers is a mystery to me.
Besides, native ads served from Stack Overflow's own servers would be harder to block.
> Of all the sites on the internet, StackOverflow has the demographics that any advertiser would crave. How large of an inside ads sales force would you need to target higher than average income earners?
I'm curious what you think the world wide demographics of SO are.
That's just not how digital ad campaigns are run. Advertisers, agencies, media buyers and the rest of the supply chain don't negotiate with individual sites like that, not at any scale that can sustain a site like SO.
No. They already do this. This is how ads have been run on SO forever. The new ad network ads that they've started with are an aberration from SO's own established practice.
No they don't. They use their own adserver (bought from adzerk) to physically serve the ads, but they have always come through RTB connections to ad exchanges.
There are private marketplaces and "automated guaranteed" deals to isolate their inventory in its representation and pricing from the rest of the market but the actual campaigns they get exposure to, and the creatives delivered, aren't special to them.
They run house-ads for their own jobs board and products, and they will have private marketplaces inside the exchanges, but those campaigns and creatives will still go through the standard adtech supply chain with all the JS-based layers added on.
They do sell job postings and have sponsored tags so it's not all network ad revenue, but that's a minority of the income. Since they released their Q/A SaaS product now, maybe they’ll shift to selling that as the primary revenue stream.
I thought Penny Arcade does bespoke ads. Rock Paper Shotgun used to do bespoke ads. I always found them million times more effective, and more trust-inspiring of both ad AND site, to me.
Is SO a smaller business? Possible... just surprised.
Those sites are far smaller than Stackoverflow. Bespoke campaigns don't always pay more, and usually mean less total revenue for the site if that's all they run.
Tell that to John Gruber over at DaringFireball - He makes $6500 a week just from posting one ad in the RSS feed. If he can gross over $300K a year on a niche Apple blog without a sales team, can you imagine what a sales team could do at SO?
He also sells three ad spots on his mostly weekly podcast for $6000 each. He’s a one man business grossing over 1 million a year without a sales team.
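A quick back-of-the-envelope check of those figures, as a sketch only. The inputs are the numbers quoted above ($6500 per weekly sponsorship, three podcast spots at $6000 each). The 52 sold weeks and the `episodesPerYear` parameter are my assumptions, since the podcast is only described as "mostly weekly".

```typescript
// Figures quoted in the comments above.
const WEEKLY_FEED_SPONSORSHIP = 6500; // USD per week
const PODCAST_SPOTS_PER_EPISODE = 3;
const PODCAST_SPOT_PRICE = 6000; // USD per spot

// Assumption: every week of the year is sold.
const WEEKS_PER_YEAR = 52;

function feedRevenuePerYear(): number {
  return WEEKLY_FEED_SPONSORSHIP * WEEKS_PER_YEAR;
}

// episodesPerYear is a hypothetical input; the thread only says "mostly weekly".
function podcastRevenuePerYear(episodesPerYear: number): number {
  return PODCAST_SPOTS_PER_EPISODE * PODCAST_SPOT_PRICE * episodesPerYear;
}

// How many fully sold episodes are needed on top of the feed sponsorship.
function episodesNeededFor(targetGross: number): number {
  const remaining = targetGross - feedRevenuePerYear();
  const perEpisode = PODCAST_SPOTS_PER_EPISODE * PODCAST_SPOT_PRICE;
  return Math.max(0, Math.ceil(remaining / perEpisode));
}

console.log(feedRevenuePerYear());       // 338000
console.log(episodesNeededFor(1000000)); // 37
```

So "over $300K" from the feed alone checks out, and "over 1 million" needs roughly 37 fully sold episodes a year under these assumptions.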
I don't need to imagine, I've been in adtech for 12 years and built companies and sales teams. DaringFireball is 1 example of an extremely tiny minority that can earn that today after building up an audience and reputation for decades.
It doesn't just scale up linearly and getting to $10M is magnitudes more work, especially if they're going to place their own requirements on campaigns and creatives. Even buzzfeed went back to programmatic ads with layoffs because their custom articles didn't sustain the business.
It’s the whole “1000 True Fans” method of making money. If your company can stay small, it doesn’t take as much to have a sustainable “lifestyle business”.
Native is a format, RTB is ad transactions and delivery. They're not mutually exclusive. If you mean an entirely custom format then that reach is offset by the higher production and impression costs, which results in less overall demand.
Liquidity becomes an issue, which is why Reddit is a good example. They also use adzerk and built their own custom self-serve ad network but they make very little money compared to similar traffic using standard programmatic demand.
That would be the reddit example, just low overall demand for something like that. They could try it, but probably already did the analysis to see it's not worth it.
Youtubers run ads and sponsorships in their videos all the time (not the automated ads from Adsense). Squarespace, Dollar Shave Club, Ting, PIA; they all directly pay Youtubers to run ads. Youtube even had to block Youtubers from using brand's logos outside their ad network in videos.
There are a million channels for every one that gets any sponsorship, and there's an equivalent ratio for the pricing of these ads. The CPMs on sponsored content are usually lower than network CPMs until you cross over to the very popular channels, but even then it's really just shifting the production costs of a video ad to the youtuber and paying some premium for their reputation.
It's very hard to scale and again there are networks and agencies that aggregate channels for most campaigns and buyers. Anyway, Stackoverflow isn't producing their own content and the pricing mechanics of Youtube/video advertising is very different than display ads so this isn't really comparable.
Stack Overflow has a known demographic and is not some unknown site. Are you really saying that companies like Amazon (AWS),Microsoft, JetBrains, Google (GCP), Slack, etc would ignore a sales pitch from SO?
It doesn't work like that is what I'm saying. You don't just pitch those companies. They have layers of agencies (a master agency of record, creative agencies, media buying agencies) that handle all the advertising duties. These shops create and traffic these ad campaigns in demand-side platforms (DSPs) which connect and bid on inventory in adexchanges and supply-side platforms (SSPs).
Then there are layers of targeting, accountability, measurement, and insurance that get requested and bought from anti-fraud, brand-safety and verification vendors. That's likely where this fingerprinting script came from.
Going to individual agencies with your own different supply path is not going to get any attention and nobody is going to change the way they buy millions in advertising just for you. No single publisher has that much power these days, not even Stackoverflow.
of course you pitch this way. you hire an inside sales guy who has personal relationships with c-level execs, and those execs make it happen because they're personally invested.
Make what happen exactly? C-level execs do not run ad campaigns. At most they'll defer to the CMO, who then just tells you to talk to their agency.
There are private marketplaces and other deals you can work out with agencies but this is usually for inventory against existing campaign RFPs, and comes with all of the typical creative requirements. They're not going to run an entirely custom campaign under your own terms and restrictions, not at any sustainable scale.
Thanks for your insights here dude. It is wasted on HN crowd though, who are mostly clueless about business and have pre-conceived notions and hate about everything.
Any books / blog posts that you recommend? You should write a book :)
Most ad money is in walled gardens (Facebook running ads on facebook.com). SMBs and performance marketers buy billions without ever talking to a salesperson.
Advertising at big companies is the only space left for traditional sales teams and it's all outsourced to agencies. Adtech vendors and publishers sell to these agencies, not the client (and if they do, it's just redirected) but there aren't any long-term deals because everything is constantly shifting. Agencies have lots of teams, they win and lose accounts, work on multiple campaign initiatives and strategies with constantly changing budgets and requirements, and use dozens of vendors to create and execute campaigns.
This is the opposite of a traditional SaaS contract sold for a term directly to the people that will be using it. There's also a steady trend towards all inventory being traded programmatically which will eliminate most of the sales negotiations. Unique inventory and formats can still stand out but they also cost more in effort and money so there's less overall demand. It's very hard to scale custom sales like that, especially if one of your requirements is to avoid all the javascript verification, brand-safety and measurement.
Most media sales these days is more about biz dev to get the pipes connected to exchanges and represent your inventory in the best way possible while letting the market work.
No. We got content without this in the past, and we can do this in the future. And I will note THEY admit this is bad. Stop trying to defend the indefensible.
Who paid for the content I actually visit StackOverflow for? It surely wasn't SO; they provide a nice platform but they also get that content for free. This isn't a journalism site, the value in SO comes from freely provided user answers. Yes, SO provides some value vs. forums via their q/a platform, but it is a marginal amount of value.
Sure, SO is easier than parsing a forum thread, but the actual value that I care about is the answers provided for free by their users. I could easily return to 90's era usenet, it wasn't as convenient but it worked. What I couldn't deal with is a lack of a platform where people ask technical questions & get answers, I remember being on dial-up and reading paper manuals that were out-of-date/incomplete. But SO isn't irreplaceable, and I am oftentimes frustrated with finding questions closed for incorrect reasons, normally my answer is buried 2 links deep in SO because my DDG search (and Google too) takes me to an improperly closed question where the 'previously addressed' question is adjacent to my query.
StackOverflow does not provide an irreplaceable service; like github they do some nice things but there isn't any reason they must be the dominant platform. And the real value is in the answers, which SO gets for free.
"it wasn't as convenient but it worked" - this is the story of every single dead product. Zunes were less convenient than iPods and iPhones. Books are less convenient than Kindles. AltaVista and Yahoo Directories were less convenient than Google.
SO doesn't pay for the content, they pay for the space to host that content and community. Since users aren't paying, SO needs to monetize it somehow and that's what the ads are for.
I'm sure you could return to the old internet, but many billions of other internet users enjoy the content they consume for free. Use your adblocker and stick with paywall/subscription sites because we're unlikely to ever go back to a pre-commercial internet.
A tiny nitpick: nothing is free, the internet users just pay with a different currency: their data and attention. Both of these things are much more valuable than most internet users think.
As far as a non-commercial internet goes... well, we can always hope. We just have to wrangle the means of content production and control (heh, heh, see what I did there?). The resources are there to do that and have a free (both as in beer and in freedom) and high-quality internet, what we are missing is... attention of the masses, the most expensive thing.
It is straightforward, but not for publishers. The adexchanges know, and do business with shady companies on both sides because they make money from volume without any consequences.
It's ridiculous. It's a text-based ad. At worst, it's a clickable image. At what point did it become okay in your minds to let advertisers run arbitrary code?
I've left ads turned on specifically on StackOverflow because 1) I want to support StackOverflow, and 2) I trust them not to run malicious ads.
I don't even care that they're running ads network-wide. But if they're going to be running these kinds of ads anywhere on the site, they're going right on the ad block list along with everyone else.
It’s completely insane. Can you imagine a TV station receiving ads on tapes and playing them to their audience without looking at them first? Can you imagine TV stations occasionally showing ads containing porn, urging people to kill, showing extreme violence during cartoons, or containing specially crafted audio that blows out your speakers, and the TV station just shrugs and says they try their best to stop these things but they can’t stop everything?
Imagine a TV ad that tries to make your phone call a 1-900 number so they can rip you off, and the station says they don’t know where it came from but they’re trying real hard to put a stop to it. And somehow watching the ads themselves before broadcasting them never crosses their mind.
It’s worse than that. Imagine a TV ad which sends malicious code that gets executed to your television, which profiles the hardware in your TV and sends information about your viewing habits (tied to a unique ID) back to the advertiser.
In any other context we would call this a security vulnerability. I think that label also applies here.
The only reason it’s not the same situation is because they’re willing to throw their users under the bus for a little extra cash. If they wanted to exert more control, they absolutely could. Ads would cost more and we’d see fewer distinct ads as a result.
Digital ads could work where every single one is vetted by people before it’s served to any users. There is no reason it can’t work this way, other than it being a lot cheaper to skip that step.
All creatives (and the root templates of dynamically constructed ones) are actually audited on the advertiser-facing platforms before they ever get to the publisher.
Unfortunately running javascript means these ads can do anything at any time and change into malware. Other than adding some technical guardrails, the best practice would be to ban bad actors (of which many are known and usually the same shady people) but many large adtech companies look the other way because it makes money and they have no consequences.
Malware and adfraud is primarily a business problem, not a technical one.
It's not that simple. There are many layers in the supply chain that currently require JS. Publishers can't disable the JS and they can't demand JS-free creatives either.
Of course it’s that simple. Don’t let ads run JS. Done.
You’re saying that doing this would drastically decrease ad revenue. Which is what I’m saying too: it’s about money, not necessity.
Would a site like SO be unable to survive without ads that run arbitrary JS? I don’t know. Even if the answer is that they must do this to survive, it’s still insane that content companies let randos inject arbitrary code into their pages. If this is so entrenched in the industry that there’s no way around it, that just means the industry is insane.
Money is a necessity, that's how SO exists, and it wouldn't sustain its current size if it required JS-free network campaigns or tried to sell all ad space directly.
Simple doesn't mean it's easy or realistic. Yes, adtech has major problems but they're being slowly worked on and won't change overnight. This applies to any other industry where you think you can just walk in and solve everything if everyone just did X. Reality doesn't work that way.
We know that advertising can work and make money without arbitrary JS. When there’s a clear existence proof, is it really wrong to say that a problem could be solved by not doing the problematic behavior?
Of course reality doesn’t work that way. Ad companies aren’t going to change, because they like money and don’t give a shit about users.
We’re stuck in a local minimum. It’s insane. It could be easily fixed if everyone just stopped doing the insane things. And they won’t stop.
That sounds nice but is neither realistic nor even sensible. There are other solutions, like sandboxing to prevent access to features; it's not an unsolvable problem.
Billions? No single creative is seen by that many. In fact, with dynamic creative optimization (DCO) and all the optimization that happens, you can easily get creatives that are custom generated and only seen by a few individuals or even a single person.
Sites like StackOverflow require JavaScript to work (or at least, to work in a manner approaching interactivity). So, even someone who disables JavaScript normally, would presumably enable it in order to use this popular and useful site. Furthermore – and importantly – they place trust in StackOverflow not to abuse the privilege of executing arbitrary JavaScript. That is an entirely reasonable thing for a technically savvy modern web user to do.
By serving this ad with JavaScript not vetted to StackOverflow's presumed standard, StackOverflow has violated that trust. Thus the onus is on them, not the user, to remove the offending ad or risk damaging their brand.
Honestly, what you said is like saying "why would you ever not keep a hand on your wallet" after someone got pickpocketed in a nice restaurant. Reasonable people have reasonable expectations of safety in certain places which they trust to provide it for them. No-one should go around being constantly paranoid of pickpockets everywhere, no more than anyone on the web should be constantly paranoid of malicious JavaScript even on sites with established records of safety.
Not just arbitrary JavaScript, arbitrary JavaScript where they can’t easily even see where it came from! Sheesh.
Could we require advertisers to sign their ad code to have a trail of where it came from, prevent tampering, and make it easier to pull the plug on bad actors?
The people bearing the costs of the internet ad economy aren’t the people in any position to do anything about it. So there’s very little pressure to fix anything.
Maybe if the US government started threatening to enact something like GDPR unless the adtech industry gets its shit together.
Large adtech demand/sell side platforms do not want to remove these bad actors because they make money on percentage of spend. They are incentivized to increase volume and ad spend at all costs, and there is no regulation to stop them from continuing to deal with shady companies and known malware techniques.
If you're "NOT okay with it", how about stopping ads completely until you resolve this problem? That should give a bigger impetus to solve it ASAP as the bottom line gets hit for multiple stakeholders.
This is not just ads, but about fingerprinting and tracking users somehow or the other by third parties. It's plain evil, and not a decent thing to continue foisting on your unsuspecting users after you've known it. Tell management to take an ethical stance and preserve the reputation of SO.
Probably not his call. By "we" he's probably talking about the engineering team, which in many cases is nothing more than a conduit for whims of the marketing and sales teams.
The only time they'd do that is if the marketing team decided that the value-add from taking ads off cancelled out the profit loss from taking the ads off.
"The ad is attempting to use the Audio API as one of literally hundreds of pieces of data it is collecting about your browser in an attempt to "fingerprint" it... Your browser may be blocking this particular API, but it's not blocking most of the data."
Seems like killing the audio is the metaphorical putting a finger in the dyke of serving arbitrary JavaScript to your users.
> we are aware of the issue.
> We're trying to sort out how to kill the audio behavior now.
Are you really aware of the issue? The issue people have here is not the fact that the ad is trying to access the audio api per se but that it is trying to fingerprint the users.
This is the actual problem at the heart of it all. And even if it were more profitable to take subscription fees than to serve ads, what's stopping you from "double dipping" and serving ads anyway?
ArsTechnica (obviously a very different site compared to SO) has an ad free subscription model where it also removed all trackers for paying subscribers. It's possible to do this in an ethical way. Whether the site publisher is interested or not is a different matter.
You think the NY Times, Linkedin, etc. is going to have the same response as StackOverflow? Good luck even getting in touch with someone who knows what you're talking about.
Very likely. I'd pay hundreds of dollars a year to Google if they guaranteed* me, with severe legal repercussions otherwise, that they wouldn't track me, or allow a single bit of my data, anonymized or not, to leave their servers, or be used in any other way that wasn't for my own purpose.
Re-selling digital personas as commodities must be far more lucrative.
I hear people from multiple sides report receiving ads about topics they only talked to friends about but never entered in a search engine.
Google is currently about as far away as it could be from their previous world famous "don't be evil" corporate culture.
Other examples are AMP, where Google wants to make it harder to de-individualise URLs. This is being driven to an extent where Chrome on Android makes it harder to edit the URL.
Or games like Ingress or PokemonGo, which in my opinion help Google constantly update their WiFi SSIDs-To-GPS-location database. This database is then furthermore being used to track users' location through a little permission called "WiFi Control", which also cannot be found in the regular App Permissions settings entry.
To me WiFi-Control sounds nothing like location tracking. But I have to admit, I am not a native speaker. Therefore I might be misunderstanding something.
It's hard to read the obfuscated code and be sure what's being done with the browser environment information. This script seems to generate some hash and put in some global variables, presumably for some other script to consume. I don't know whether such scripts send it to a server, compare it locally to a previously-known value, or ignore it.
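For anyone wondering what "generate some hash" from browser environment information might look like in principle, here is a minimal sketch. It is not the script from the ad. The probe names and values are hypothetical, and FNV-1a is just a convenient small hash I picked for illustration (applied here to UTF-16 code units rather than bytes).

```typescript
// Hypothetical probe results; a real script would read these from browser APIs.
type ProbeResults = Record<string, string | number | boolean>;

// 32-bit FNV-1a over the string's UTF-16 code units. The multiply by the FNV
// prime is written with shifts so it stays exact in plain JS numbers.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Equivalent to hash * 16777619 modulo 2^32.
    hash = (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Serialize the probes in a stable key order, then hash the result.
function fingerprintOf(probes: ProbeResults): string {
  const canonical = Object.keys(probes)
    .sort()
    .map((key) => key + "=" + String(probes[key]))
    .join(";");
  return fnv1a(canonical);
}

const visitor: ProbeResults = {
  hasAudioContext: true,
  screenWidth: 1920,
  timezoneOffsetMinutes: -60,
  language: "en-US",
};
console.log(fingerprintOf(visitor));
```

The point is only that many individually boring values collapse into one stable identifier. What happens to that identifier afterwards (sent to a server, compared locally, ignored) can't be read off a sketch like this.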
How We Make Money at Stack Overflow: 2016 Edition: Quality ads. "...we don’t want to use an automated system that selects some ads for us. We looked at this. It didn’t allow us the control we required to maintain the level of quality we want to maintain."
How We Make Money at Stack Overflow: 2019 Edition: Taking money from Microsoft and Google fingerprinting our users 100+ ways
3. Look for a js sandbox -- this _will_ break arbitrary js, will not be supported in all browsers, and will require dev work on your side:
* Google Caja https://github.com/google/caja
* MentalJS https://github.com/hackvertor/MentalJS
other options are available as well, in varying levels of maturity and support.
I think using a sandbox iframe is not going to be able to defeat browser fingerprinting, because the sandbox control options are not rich enough. You would need to block all JS.
1. scrollbars and positioning can cause problems with iframes that an inline div doesn't have, especially if there are multiple small iframes on the page.
2. As soon as you allow script in the sandbox iframe, then you are susceptible to these types of fingerprinting attacks. The fact that you have origin isolation doesn't really block what the ad was doing. This is because iframe sandbox was never designed to block fingerprinting attacks, it was designed to create a separate origin that gave the dev broad control over features like 'allow js', 'allow access to origin', etc.
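To make point 2 concrete, here is a small sketch of building a sandboxed ad frame. The token names are real values of the HTML `sandbox` attribute; the helper names and the ad URL are made up for illustration. As far as I know there is no sandbox token that withholds screen size, audio support and similar signals once scripts are allowed.

```typescript
// A subset of the tokens defined for the HTML iframe `sandbox` attribute.
type SandboxToken =
  | "allow-scripts"
  | "allow-same-origin"
  | "allow-forms"
  | "allow-popups"
  | "allow-top-navigation";

function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Builds markup for an ad slot. adUrl is a hypothetical creative URL.
function sandboxedAdFrame(adUrl: string, tokens: SandboxToken[]): string {
  return (
    '<iframe src="' + escapeAttribute(adUrl) +
    '" sandbox="' + tokens.join(" ") + '"></iframe>'
  );
}

// The only switch that matters for this class of fingerprinting is whether
// script runs at all inside the frame.
function canRunFingerprintingScript(tokens: SandboxToken[]): boolean {
  return tokens.indexOf("allow-scripts") !== -1;
}

console.log(sandboxedAdFrame("https://ads.example/creative?id=1", ["allow-scripts"]));
console.log(canRunFingerprintingScript(["allow-forms", "allow-popups"])); // false
```

An empty `sandbox=""` is the strictest setting and blocks script entirely, which matches the "you would need to block all JS" conclusion above.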
>1. scrollbars and positioning can cause problems with iframes that an inline div doesn't have, especially if there are multiple small iframes on the page.
I'm not quite sure what you mean here, but I'm curious. Have any examples?
Ideally you would like the iframe to not be visible -- you don't want it to show scrollbars if the content overflows.
But at the same time, you want to see all the content in the iframe. If you knew ahead of time exactly the layout of the text in the iframe you could do this, but it's harder when you have dynamically generated content inserted into the iframe, and now add to that wanting the page to be on different devices with different viewports, resolutions, users resizing the page, users increasing or decreasing text sizes for accessibility or changing default fonts.
And if you don't control the content, some of it may contain fixed size elements or absolute positioning inside the frame.
It's a really difficult problem that we were struggling with before ultimately giving up on trying to use iframes for this purpose. And when you make a mistake you either get ugly scrollbars in your iframe or part of your content is cut off when the user resizes the page.
Maybe it's to identify users behind a VPN as this is fingerprinting the device, not the connection.
That's why I think the idea of running each site in a container is so effective.
And while we're at it the container should just spit out random shit like different resolution, audio api, user agent, once in a while (unless the user turns it off) to thwart such attempts.
Unfortunately, when the creator and maintainer of 67% of all browsers is an ad company that is exploiting this in the first place, there is no chance that this could happen.
> And while we're at it the container should just spit out random shit like different resolution, audio api, user agent, once in a while (unless the user turns it off) to thwart such attempts.
Wouldn't that break the legitimate feature-detection uses for these APIs? Asking the user to identify and whitelist each call is impractical, especially since the fail-case in this scenario would be subtle (you'd still see the page but it might randomly be in the wrong mode, or images might be scaled incorrectly, etc). At that point you might as well just turn Javascript off.
Yes, I thought about it; that's why the "unless the user turns it off" comment is in parens. I think out of 100 sites I visit every day, no website needs to access the audio API without my consent, except maybe one or two which I can whitelist. Same for user agent: I don't think it should break if the container says I'm running Firefox v65 or v67, etc.
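A rough sketch of how such a container could pick what to report, assuming it keeps the spoofed values stable per site for a while so pages don't flip modes mid-session. Everything here is hypothetical: the profile fields, the list of common screen sizes, the `epoch` counter that changes "once in a while", and the small seeded generator.

```typescript
interface ReportedProfile {
  browserMajorVersion: number;
  screenWidth: number;
  screenHeight: number;
  audioApiExposed: boolean;
}

// Hypothetical pool of plausible screen sizes to hide among.
const COMMON_SCREENS: Array<[number, number]> = [
  [1366, 768], [1440, 900], [1536, 864], [1920, 1080],
];

// Small djb2-style string hash, used only to seed the generator.
function seedFrom(text: string): number {
  let seed = 5381;
  for (let i = 0; i < text.length; i++) {
    seed = (seed * 33 + text.charCodeAt(i)) >>> 0;
  }
  return seed;
}

// Linear congruential step modulo 2^32 (stays exact in JS numbers).
function nextState(state: number): number {
  return (state * 1664525 + 1013904223) % 4294967296;
}

// Same site and epoch always give the same answer; allowlisted sites see the
// real profile, which is the "unless the user turns it off" escape hatch.
function profileFor(
  site: string,
  epoch: number,
  real: ReportedProfile,
  allowlist: string[]
): ReportedProfile {
  if (allowlist.indexOf(site) !== -1) return real;
  let state = seedFrom(site + "#" + epoch);
  const pick = (n: number): number => {
    state = nextState(state);
    return Math.floor((state / 4294967296) * n);
  };
  const [width, height] = COMMON_SCREENS[pick(COMMON_SCREENS.length)];
  return {
    browserMajorVersion: real.browserMajorVersion - pick(3), // real, real-1 or real-2
    screenWidth: width,
    screenHeight: height,
    audioApiExposed: false, // withheld unless the site is allowlisted
  };
}
```

This doesn't answer the breakage concern raised above; it just shows that "random" can mean "stable per site, different across sites", which is less likely to break layout than re-rolling on every call.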
It was already possible to write CSS that adapts to various screen sizes before CSS became a privacy issue (ie. before CSS 3); except it was called regular CSS, not responsive.
My guess is the difference between "regular CSS that adapts to screen size" and "responsive CSS" is that the former only has a single set of rules while the latter has different CSS rules that get enabled/disabled based on screen size.
Conditional rules -> different content gets loaded -> server gets notified of what rules are enabled -> fingerprinting
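A minimal sketch of that chain, to show why no JavaScript is needed for it. The breakpoints, the `.probe` class, the host name and the URL scheme are all invented for illustration; the mechanism is just ordinary media queries that each reference a different resource.

```typescript
// Width breakpoints a hypothetical tracker wants to distinguish.
const BREAKPOINTS = [480, 768, 1024, 1440];

// Emits one conditional rule per width bucket. Each rule points at a distinct
// URL, so whichever request arrives tells the server which rule matched.
function trackingCss(host: string): string {
  const rules: string[] = [];
  for (let i = 0; i <= BREAKPOINTS.length; i++) {
    const conditions: string[] = [];
    if (i > 0) conditions.push("(min-width: " + BREAKPOINTS[i - 1] + "px)");
    if (i < BREAKPOINTS.length) conditions.push("(max-width: " + (BREAKPOINTS[i] - 1) + "px)");
    rules.push(
      "@media " + conditions.join(" and ") +
      " { .probe { background-image: url(https://" + host + "/w/" + i + ".png); } }"
    );
  }
  return rules.join("\n");
}

// Server side: map the requested path back to a viewport width range.
function widthRangeFor(path: string): [number, number] | null {
  const match = /^\/w\/(\d+)\.png$/.exec(path);
  if (!match) return null;
  const bucket = Number(match[1]);
  if (bucket > BREAKPOINTS.length) return null;
  const low = bucket === 0 ? 0 : BREAKPOINTS[bucket - 1];
  const high = bucket === BREAKPOINTS.length ? Infinity : BREAKPOINTS[bucket] - 1;
  return [low, high];
}

console.log(trackingCss("tracker.example"));
```

A single set of rules that reflows without loading different resources gives the server nothing comparable to observe, which seems to be the distinction the guess above is drawing.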
Have you heard of those projects trying to defeat behavioral tracking where, whenever you visit a page, it simultaneously opens a bunch of other random pages in the background, hidden from you, and simulates activity on them, the idea being that Facebook has no idea what actual websites you like to visit because it's lost in the noise? What if instead, whenever you visit a page, your browser or a plugin or a proxy server or whatever opened the same page simultaneously in a bunch of hidden background windows, with a random configuration of audio enabled/disabled, user agent, screen resolution etc fingerprinted characteristics?
That way, the page displays correctly for you, but the server has no idea your actual fingerprint.
There's some trickiness to get this to work right; the collection of fake fingerprints would have to have a certain amount of persistence, because if it was regenerated every pageload, the server could probably tell that only one fingerprint kept showing up repeatedly. Maybe each fake fingerprint should have a completely realistic-seeming browsing session, happening in parallel with your real one, with half the collection continuing on browsing even after you're done? Except wait, ads could just separately target every fingerprint, and it doesn't matter if 99% of them are fake as long as its accuracy for your real one is still good. To defeat that you need the randomized activity using your real fingerprint.
The ideal would be if this was done through a proxy server, which would then know every fingerprint ever sent to a website. It could then provide you with a random collection of past fingerprints that have actually visited the same website, so every visitor gets a collection of fingerprints randomly drawn from the same "bag", rendering visitors indistinguishable.
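Here is one way the "bag" part of that idea could be sketched, with the caveat that this is my own illustration and not a working privacy tool. The class name, the string representation of a fingerprint, and the injectable `random` parameter are all assumptions.

```typescript
type Fingerprint = string;

class FingerprintBag {
  // Per-site list of fingerprints that have actually been presented there.
  private bags: Record<string, Fingerprint[]> = Object.create(null);

  // Record a fingerprint seen for a site, ignoring exact duplicates.
  remember(site: string, fp: Fingerprint): void {
    const bag = this.bags[site] || (this.bags[site] = []);
    if (bag.indexOf(fp) === -1) bag.push(fp);
  }

  // Draw up to `count` distinct fingerprints for one visit using a partial
  // Fisher-Yates shuffle. `random` is injectable so the draw can be tested.
  draw(site: string, count: number, random: () => number = Math.random): Fingerprint[] {
    const pool = (this.bags[site] || []).slice();
    const take = Math.min(count, pool.length);
    for (let i = 0; i < take; i++) {
      const j = i + Math.floor(random() * (pool.length - i));
      const tmp = pool[i];
      pool[i] = pool[j];
      pool[j] = tmp;
    }
    return pool.slice(0, take);
  }
}

const proxyBag = new FingerprintBag();
proxyBag.remember("example.org", "fp-a");
proxyBag.remember("example.org", "fp-b");
proxyBag.remember("example.org", "fp-c");
console.log(proxyBag.draw("example.org", 2));
```

Because every visitor draws from the same pool, the set presented on a visit says little about who is behind it. The persistence problem described above (the server noticing which fingerprint keeps recurring) is not addressed by this sketch.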
And this is why, even with the best intentions of site operators, my browser will continue to use the best ad-block tools I can get, and my networks will be protected by tools like PiHole.
In the 2005 era when I was a young video gamer I used to play World of Warcraft. There was a site, Thottbot, that players would use to find out information about things in game. I picked up a keylogger malware from their adservers. One of the advertisers had been hacked and was serving Malware every few thousand ads. Since that day I've used an adblocker and I'll always continue to do so.
Exactly. Market solutions for market problems. I'd love to see the Raspberry PI foundation develop and sell a home router with PIHole for regular consumer use.
Considering the alternatives, that sounds really appealing for me. I'd also buy it for my less tech-literate parents.
You can't profit your way out of a problem you profited yourself into. There will never be enough people setting up PiHoles to offset the value of spying, and it's publishing platforms like StackOverflow that suffer.
It’s incredibly disrespectful. Nobody wants some random ad listening to their microphone. That they’re trying it anyway indicates that they’re hoping to get some people with browsers that don’t block it, or trick some people into saying yes.
It’s not harmful, as long as you’re not one of the people who gets tricked. But it does indicate that they want to do you harm, and try to. That they failed doesn’t make it all better.
Arbitrary javascript execution is generally meaningless. Very rarely you'll get a zero-day or something, or maybe a site will use too much battery when focused.
Really? Malicious JavaScript can steal anything you can see and do anything you do. It can steal your passwords, bank account information, or transfer money out of your bank account. It just depends on what kind of page is serving up the arbitrary JavaScript.
It's pretty obvious that the only real fix is to accept money in exchange for putting an image with a hyperlink on your website.
Anything involving javascript will do shenanigans for various reasons. Fingerprinting via any means possible is industry standard ad-network behavior at this point. No one in the industry could imagine doing any less - it's impractical, it's absurd. But targeting! But fraud! But the only fix is to just give it all up, go back to how it was done in the 90s.
This makes little sense. If the data collection capabilities are more restricted, how would one be easier to identify? Firefox has a significant market share still.
Which is why one of several things should happen. The first option is that there be legal requirements to adhere to public standards if you are a content distributor. And that any standards compliant client side software be allowed to use the service. In some areas we're in a world where your telephone carrier sells you your telephone, e.g. twitter, apple's imessage, etc.
Other options would be that if you are a content distribution company, e.g. youtube, google, facebook, twitter, instagram, etc. then you cannot have any control of the client side applications that consume the content. Trustbusting would come into play here.
Or legal obligations to follow a user's desire not to be tracked with real criminal fines and jail time applied to executives, managers, and developers who failed to follow the law.
What kind of legal requirement can thwart a multinational corporation? The UN cannot enforce any laws, the only supra-national entity with this power is the EU (see GDPR).
It's easier to just use Firefox with uBlock Origin, Cookie AutoDelete, etc etc.
Why is this surprising to anyone? It is clear that ads use tracking mechanisms and cookies and this is no different.
Audio feature detection isn't even a novel technique.
I've seen trackers look at download stream patterns to detect whether or not BBR congestion control is used, I have seen mouse latency measured from the difference between mouse ups and downs in double clicks, and I have seen speed-of-interaction checks in mouse movements.
Just checking for the constructor of something an ad might legitimately use (like audio) is relatively benign, to be honest, and it is naive to expect ads not to do this. It is why I use an ad blocker even on sites without annoying ads.
But for code that's supposed to be so smart in trying to fingerprint people without them knowing, calling an API that throws a warning in the browser seems like a really stupid move. Especially since that can be checked through feature detection, which is literally what this code is doing...
And as a fun fact, network timing fingerprinting attacks work even if you don't have JavaScript enabled. I have been able to make a PoC that was very accurate (I did not release it, but I did disclose some bits to relevant parties).
There's a passage of Carl Sagan's "Contact" that's on point and interesting to read 34 years later. The billionaire who helps to decode the Message (from outer space) and ends up building the working copy of the Machine made his fortune by selling tools to detect and block ads from television.
There is some discussion of the technical cat-and-mouse game he has to play as advertisers try to make their content avoid detection and blend in with the regular programming. In this version of the future, the ad blockers eventually win and network television is destroyed. (The book also features networked computers and email ("telefax"), but the concept of ads appearing on them was still too futuristic for 1985.)
Adnix and Preachnix were the essence of capitalist entrepreneurship, he argued repeatedly. The point of capitalism was supposed to be providing people with alternatives.
"Well, the _absence_ of advertising is an alternative, I told them. There are huge advertising budgets only when there's no difference between the products. If the products really were different, people would buy the one that's better. Advertising teaches people not to trust their judgment. Advertising teaches people to be stupid. A strong country needs smart people. So Adnix is patriotic. The manufacturers can use some of their advertising budgets to improve their products. The consumer will benefit. Magazines and newspapers and direct mail business will boom, and that'll ease the pain in the ad agencies. I don't see what the problem is."
Adnix, much more than the innumerable libel suits against the original commercial networks, led directly to their demise. For a while there was a small army of unemployed advertising executives...
I feel that it may go the other way: that receiving communication from a source that is supported by ad revenue while knowingly and actively bypassing those same ads will be seen as theft. I fully expect lobbyists to push for this and see some success in the next 10 years.
When people refuse to watch ads, there is theft going on, but it's theft from the advertisers by the media owners. The viewers aren't guilty of anything.
I think ad blocking is a misnomer. What people are trying to do when blocking ads is prevent marketing people from spying on them. And the performance and resource consumption that comes from that.
Personal opinion: Laws are needed to make what advertisers are doing illegal. Advertisers are spying on people to the extent where if the government did it they'd need a warrant.
I'm only mildly bothered by the tracking, since it seems so inaccurate, but the ads themselves always drive me to adblockers. Taboola were running pictures of rotten teeth for a while which was intolerable; Youtube ads are often louder than the videos.
They may be inaccurate when you actively block trackers but they are surprisingly effective if allowed to do what they want. The whole “I think they’re listening to me” effect is because of how effective these trackers are.
Web tracking data gets combined with your real life digital breadcrumbs collected by the data aggregators. Their profiles of you are extremely accurate.
I disagree. The tech crowd is using adblockers to prevent spying and resource consumption. But the majority of people running adblockers just don't want to see ads.
In most cases, I don’t think it’s ads as concept that’s the problem. If websites only had static ads in the sidebar, I question how many people would bother with ad blockers.
But when ads block content; include flashing animations, audio, and video; and take up more layout space on a site than the actual content; then people have had enough.
I disagree that most people use ad blockers because they “don’t want to see [any] ads.”
Meaning, if advertisers hadn’t built more and more intrusive ads and had stuck with static ads that don’t severely harm the UX, then I doubt most users would bother with ad blockers.
Yeah, no one would care about magazine-style ads with an ordinary click-through link. Especially if clicking resulted in something useful instead of being the browsing equivalent of jumping into a dumpster fire.
The advertiser arms race has resulted in a classic tragedy of the commons. That's my diagnosis of the problem. Traditionally regulation is needed to fix that. Exactly what that entails is beyond me.
I think you are misinterpreting the comment you're replying to. It's arguing that people block ads because they dislike seeing the ads they're seeing (as opposed to for privacy or resource usage concerns), not because they dislike all possible ads, which is what you're arguing against.
I don’t think it’s ever going to be so simple that we can bucket all non-tech users together. I occasionally volunteer as IT support for a small non-profit where everyone is non-technical and in many cases old. Their understanding of browsers was:
- Chrome is the fast one.
- IE is the one that you have to use for some government/old websites.
- Ad Blockers are for safety (akin to anti-virus).
This was from a group that didn’t even know how to install Chrome on the new computers they got this year.
Group knowledge on this topic is largely going to be driven by what they’ve heard in the news or perpetuated by their social circles. And scary things will stick for long after they stop being true.
That doesn't mean the side-effect of most people avoiding spying at the same time is somehow an unwanted one.
I think most can agree here that this level of spying on users is bad. It's sorta like child labor but a lot less obviously bad, in that it is bad, nobody likes it, but there are enough people taking advantage of it not being illegal, so it's just a socially tolerated thing. But once made illegal it will be looked back on like "how the hell did we think that was okay? how the hell did we willingly let it occur?"
I think the reality is more complicated. People sort of suspect being spied on, but it's hidden, not so real. It's abstract so if you ask any reasonable person, they'll say they don't like the spying, but few will take an impairment in function for something they use in order to avoid it because it seems somewhat distant.
Anecdotally, that's the case for me. I've been blocking ads since ~2000 because I don't like ads. It's only more recently that I've really stepped up efforts by not allowing third party scripts, using temporary/multi-account containers, using Decentraleyes, stripping identifiers from query strings, etc.
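For anyone curious what "stripping identifiers from query strings" amounts to, here is a minimal Python sketch. The parameter names in `TRACKING_PARAMS` and `TRACKING_PREFIXES` are an assumed, deliberately tiny list for illustration; real extensions maintain much longer ones, and this is not how any particular extension is implemented.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed examples of tracking parameters; real tools ship far longer lists.
TRACKING_PARAMS = {"fbclid", "gclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url: str) -> str:
    """Return `url` with known tracking parameters removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the parameters we decided to keep.
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = strip_tracking("https://example.com/a?id=7&utm_source=x&fbclid=abc")
```

Parameters the page actually needs (like `id` here) pass through untouched; only names on the list are dropped.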
The sense of being spied on wasn't really what drove me to use an ad blocker. It was the fact that once or twice I got what appeared to be malicious code trying to take over my browser, go to pages I didn't want to go to, and prevent me from leaving, in order to promote some scam. I'm not in fact (even if it's naive) particularly scared of legitimate businesses or the CIA or whatever monitoring me.
Adsense is just going to start providing content for you to inline into your site.
Kind of like how https://old.reddit.com/r/gaming/ is just a sequence of ads being flawlessly delivered to an ad-averse demographic that eats the ads up.
I find it outright puzzling that CDN edge servers have not morphed into ad splicers yet; that business seems so obvious to me. The closest to a "guessplanation" I can come up with for why this is not happening is that there might be trust issues (overreporting/underreporting impressions) in the triangle of publisher, ad network and CDN/ad splicer. But I'm not convinced at all that this would outweigh the anti-ad-blocker advantages.
Until now, while most people don't use ad-blockers and browsers have accepted third-party cookies, there's an advantage to loading a resource from another domain, since an ID created and stored by doubleclick.net on site A could be read by doubleclick.net on site B, allowing for cross-domain tracking. As third-party cookies increasingly get blocked, I think we'll see that more and more.
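To make the cross-domain tracking point concrete, here is a toy Python model of a browser cookie jar. The `Browser` class and the domain names are hypothetical, and real browsers are far more nuanced (partitioned storage, SameSite rules and so on); the sketch only shows why an ID set by one embedded domain on site A is readable on site B, and what changes when third-party cookies are blocked.

```python
class Browser:
    """Toy cookie jar; not a model of any real browser's storage."""

    def __init__(self, block_third_party: bool):
        self.block_third_party = block_third_party
        self.cookies = {}  # cookie-owning domain -> ID value
        self._next_id = 0

    def load(self, site: str, embedded_domain: str):
        """Visit `site`, which embeds a resource served from `embedded_domain`.

        Returns the ID the embedded domain sees, or None if it gets no cookie access.
        """
        is_third_party = embedded_domain != site
        if is_third_party and self.block_third_party:
            return None
        if embedded_domain not in self.cookies:
            # First contact: the embedded domain sets a fresh ID cookie.
            self._next_id += 1
            self.cookies[embedded_domain] = f"id-{self._next_id}"
        return self.cookies[embedded_domain]


permissive = Browser(block_third_party=False)
seen_on_a = permissive.load("site-a.example", "tracker.example")
seen_on_b = permissive.load("site-b.example", "tracker.example")

strict = Browser(block_third_party=True)
blocked_on_a = strict.load("site-a.example", "tracker.example")
```

In the permissive browser `seen_on_a` and `seen_on_b` are the same ID, which is the cross-site link. In the strict one the tracker gets nothing, which is the pressure pushing ad delivery toward first-party domains.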
Desktop-wise, I've often thought [evergreen] client-side tools would emerge for content extraction via local [headless] browser automation. It's something I've contemplated building myself.
Is there an ad blocker that interrupts/blocks your profile (the data that would normally be sent to the ad company), lets you edit/alter it, and allow the resultant profile to be sent to the ad company? As a consumer, I prefer relevant ads to irrelevant ads, and I might even prefer very relevant ads to no ads, but I don't want the ad companies to know stuff about me that isn't okay with me.
Tracking is more than just ads. A website owner wants to know who his visitors are, where they come from, and which devices they use. Maybe he can support another language, optimize for other devices, or offer deals for a group of customers. But he doesn't want to risk being fined under GDPR, so he skips all this. Less optimisation, fewer/worse contacts: everybody loses.
Seems obvious without thought to me that it’s mostly moot. Very few people will be running machines like we have for the last 30-40 years, most will be on Android/iOS where ad blocking will be minimal.
Savvy users will continue to block on machines that aren’t walled gardens and through pi-hole style blocking.
I think the cat and mouse aspect will be completely overshadowed by tech giants continually neutering their users' ability to block ads.
On Android many links are opened in a WebView (e.g. opening links in the Gmail app) and many ads come through WebViews inside apps themselves (e.g. some ads inside the YouTube app itself).
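As a rough illustration of the "pi-hole style blocking" mentioned above, here is a minimal Python sketch of a DNS sinkhole decision. The blocklist entries, the suffix-matching rule and the `upstream` callback are all assumptions made for the example; actual tools have their own list formats and matching rules.

```python
# Hypothetical blocklist entries, for illustration only.
BLOCKLIST = {"ads.example", "tracker.example"}


def is_blocked(hostname: str, blocklist=BLOCKLIST) -> bool:
    """True if `hostname` or any parent domain of it is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "cdn.ads.example", then "ads.example", then "example".
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def resolve(hostname: str, upstream) -> str:
    """Answer blocked names with an unroutable address, else ask `upstream`."""
    if is_blocked(hostname):
        return "0.0.0.0"
    return upstream(hostname)


# `upstream` stands in for a real resolver; here it is a stub.
answer = resolve("cdn.ads.example", upstream=lambda name: "192.0.2.10")
```

Because the decision happens at name resolution, it covers every app on the network, but it cannot help when ads are served from the same hostname as the content.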
Safari on iOS allows limited content blocking. It doesn’t allow ad blocking anywhere else, which is most of the platform.
And, I was referring to the future and trends rather than the current situation. System wide ad blocking used to be possible on iOS without jailbreaking, now it’s not.
I expect in time Google will follow suit and change Android APIs, or Play Store rules, to do the same.
At that point I think most people wouldn't care. Nobody is offended because James Bond wears an Omega watch. When it is done not so subtly, it becomes weird (e.g. Transformers).
I'm hoping in 10 years the world will have figured out that allowing arbitrary Turing-complete code to automatically run on one's personal machine is a terrifically terrible idea, and that the World Wide Web will instead orient itself around something that doesn't make security and privacy extraordinarily difficult to achieve (whether that's still HTML/CSS or something entirely new).
At the very least, though, eventually advertising agencies will hopefully figure out that this sort of tracking is pointless; "newspaper-style" ads are more likely to actually engage with the people encountering those ads (since said ads would be selected based on the page content rather than the person reading that content). This is how DuckDuckGo's ads work; the sponsored results are selected entirely by the actual search query. If content-driven ads (plus affiliate links, but I somehow doubt that's enough of DDG's traffic to be a deciding factor here) is enough to pay for enough computational power (and the development team to run it) to serve up 30+ million queries a day, then there's no reason it can't be enough for any other site.
With absolutely no disrespect intended, the hope that we'll forget about the WORA dream is delusional. WORA is inevitable and the Web, for all its flaws (and they are plentiful), is far and away the closest we've ever come. Even on mobile, which was a bit of a setback for the Web as WORA, JS has only been getting better over time. There's just no turning back the clock.
Security-wise, I think the best we can hope for is more and more OS-like sandboxing and isolation, capability-based security, and other defense-in-depth measures.
Privacy-wise, for defeating tracking and the like, ideally I'd hope for technical countermeasures to win the battle, but if we do end up having to rely on legal measures, they have my full support, GDPR and CCPA included.
(Random idea for a technical countermeasure against fingerprinting: have you heard of those projects trying to defeat behavioral tracking where, whenever you visit a page, it simultaneously opens a bunch of other random pages in the background, hidden from you, and simulates activity on them, the idea being that Facebook has no idea what actual websites you like to visit because it's lost in the noise? What if instead, whenever you visit a page, your browser or a plugin or a proxy or whatever opened the same page simultaneously in a bunch of hidden background windows, with a random configuration of audio enabled/disabled, user agent, screen resolution etc fingerprinted characteristics?)
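A minimal Python sketch of that random idea, under loud assumptions: the attribute pools in `POOLS` are made up, `fingerprint` is a stand-in for whatever a tracker actually computes, and nothing here opens real pages. It only shows that hashing randomized configurations yields decoy identifiers for the real one to hide among.

```python
import hashlib
import random

# Hypothetical attribute pools; real fingerprinting reads many more signals.
POOLS = {
    "audio_enabled": [True, False],
    "user_agent": ["UA-1", "UA-2", "UA-3"],
    "screen": ["1280x720", "1920x1080", "2560x1440"],
}


def fingerprint(attrs: dict) -> str:
    """Stand-in tracker: hash the attributes in a canonical key order."""
    canonical = "|".join(f"{key}={attrs[key]}" for key in sorted(attrs))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


def decoy_profiles(count: int, rng: random.Random) -> list:
    """Build `count` randomized configurations for the hidden background loads."""
    return [{key: rng.choice(values) for key, values in POOLS.items()}
            for _ in range(count)]


real = {"audio_enabled": True, "user_agent": "UA-1", "screen": "1920x1080"}
decoys = decoy_profiles(5, random.Random())
observed = {fingerprint(profile) for profile in [real, *decoys]}
```

With pools this small the decoys collide often; the idea only has a chance if the randomized space is large and the decoy traffic is otherwise indistinguishable, which is the hard part.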
Indeed it is. It is not, however, dependent on running arbitrary Turing-complete code in my browser automatically and without my permission. Write-once-run-anywhere is perfectly possible and feasible under the traditional "download and install this program and run it" model.
I'm optimistic about WebAssembly (on that note) because of its usefulness beyond the browser; like I described in a different comment, it's only a matter of time before we start seeing GUI-enabled WASM runtimes that allow WASM-modules-as-programs to work as desktop or mobile apps indistinguishable from their native (or kinda-native, in the case of Android) counterparts.
> You can't build apps without turing complete code.
Sure you can. None of these things should require me to run your arbitrary Turing-complete code in my browser:
* Reading an article
* Writing an article
* Shopping online
* Searching for things online
* Reading social media posts/comments
* Submitting social media posts/comments
* Browsing a code repo
* Submitting issues / PRs / etc. to a code repo
* Reading documentation
That (non-exhaustive) category accounts for a solid 80% of everything I do online (and the other 20% are things which I'd rather be doing through native apps). All of these things should be possible (and indeed are possible) entirely with HTML (and optionally CSS) + a server somewhere handling the backend logic. If they're not, then your "app" is over-engineered, or it is indeed better off as something I explicitly download and install, which brings me to...
> We would be back to downloading and executing applications/programs.
Good. That's the direction the mobile world has already been going for a decade now. Native apps actually integrate with the platform. Web pages don't (or at least don't do so well). At least in that situation I'm explicitly "downloading and executing those applications/programs" by my own choice.
We even have things like WebAssembly now, with experiments and effort toward making it usable as a general-purpose compilation target/runtime outside a web browser. No reason why it'd take more than a decade for someone to figure out how to wire a WebAssembly module into some sort of Qt-based (or whatever) runtime + UI and get the best of both worlds.
> Good. That's the direction the mobile world has already been going for a decade now.
I genuinely don't understand this argument at all -- either you understand something about native platforms that I don't, or you're working under the assumption that all of your native apps:
a) aren't already vacuuming your data at the same rate as web apps.
b) wouldn't get considerably worse if they replaced the web ecosystem.
On the first point, native sandboxing is almost universally terrible. There's some promising stuff happening (notably with MacOS and with Flatpak/Wayland) but it's all just playing catch-up to where the web was years ago.
Pick just about any company that maintains both a website and a native version of the same app -- almost universally, the web version is safer to use. Nobody should be installing Facebook, Twitter, or Reddit on their phone. In fact, I would say the single best piece of advice I can give to anyone to improve their privacy/security on their phone is to stop installing things.
On the desktop, the situation is better, mainly because the desktop is very slowly turning into a niche platform and the web is a much more attractive place to put skuzzy, privacy-violating software. But this is a bit like the old argument that MacOS was more secure than Windows because no one was targeting Mac with viruses at the time. Get rid of the web and all of those skuzzy developers you hate aren't going to go away, they're just going to start making native apps. Where, again, the current sandboxing for most users and OSes is completely inadequate.
If your security model on the desktop is, "I'll only run code I trust", you can already do that on the web today. You can already turn off Javascript. And if you don't feel like the modern web-app ecosystem accommodates that decision, then what makes you think a theoretical, purely native world would accommodate you running a small, tight system that only includes code you trust? I can run a beautiful, tight Linux system because I don't have to install much software on it.
The unfortunate, horrible problem is that running code we don't trust is gonna be necessary, no matter what world we move to. Sandboxing and permission systems are something we are going to have to figure out. Web or not, there is never going to be a world where you'll be able to trust all of the code you run on your computer. And currently, despite the many problems that browsers have, they're still the best consumer-accessible solution for sandboxing code.
Of course integration and app performance suffers on the web. But frankly, neither of those are more important than sandboxing.
The more I think about your statement, the more I don't know what you are trying to say. Do you think there is a fundamental difference between software that is precompiled all at once and software that is interpreted or compiled on the fly?
Programmers make these tools. When challenging said programmers who work for companies that promote this kind of behavior (G) they suggest that they work for these evil companies because their job is interesting and it pays well.
This practice could stop tomorrow if the best and brightest of us decided so.
Google's recent stance on the matter of fingerprinting[2]:
>Chrome also announced that it will more aggressively restrict fingerprinting across the web. When a user opts out of third-party tracking, that choice is not an invitation for companies to work around this preference using methods like fingerprinting, which is an opaque tracking technique. Google doesn’t use fingerprinting for ads personalization because it doesn't allow reasonable user control and transparency. Nor do we let others bring fingerprinting data into our advertising products.
The important part being: _Nor do we let others bring fingerprinting data into our advertising products._
The same company advertises their fingerprinting capabilities:
>Browser and Device Analysis: We analyze the technological fingerprints of browsers and devices in order to uncover bots fraudulently posing as human users. We can validate what type of mobile or desktop device a browser is running on, providing additional context with which to identify fraud.
And it is this fingerprinting that gets them selected as a Google Brand Safety and Viewability Preferred Measurement Partner[1]
>New York, NY – Integral Ad Science (IAS) has been selected as a preferred partner in Google’s Measurement Program for both brand safety and viewability. Partners were selected after meeting rigorous standards for accuracy and using reliable methodologies to measure KPIs that matter for marketers. The program is designed to make it easier for advertisers to source trusted, third-party measurement providers.
The gist of it being that Google has heavy cognitive dissonance, with their advertising wing rewarding partners that fingerprint users (against their own policies), and the Chrome team barely managing to introduce some anti-fingerprint measures, which are clearly not enough.
Perhaps, but I think some of that behavior only appears dissonant. Like the NSA, Google often uses carefully constructed language that is designed to sound like a statement about a topic of concern without saying anything actually useful. For example:
> Google doesn’t use fingerprinting for ads personalization
The only reason to add "...for ads personalization" is if they are using fingerprinting for other purposes. This could include other ad-related purposes like attribution.
Google's claims about not using specific data for a specific purpose are probably true. They simply fingerprint (and probably correlate) everything else.
This issue (along with many others) is due to one simple fact -- the internet is still primarily about presentation and rendering not information. We had both client-side template-based rendering and Semantic Web initiatives -- these failed for various technical and non-technical reasons at the time, but I'm hoping we go in that general direction again at some point. Nobody else should be able to (definitively) decide what information I want and how it should be presented to me. We only get the Internet that the majority are willing to put up with.
I guess it's part of Google Ads' endless battle against "robot" clicks. A site as big as SO should not use Google Ads, but instead use their own ad service. Just make an automated system where people can sign up and show an ad. Make it cost $1 per 100 page views. That would probably earn SO two orders of magnitude more than they get from Google Ads.
Why can't Google come up with an AMP for ads? That will transpile a restricted javascript (or whatever) into a runtime that just doesn't do these things?
This would get rid of the greasy ads, and Google could focus on making tools that allow site owners to filter by "features used in ad", and ad developers could actually return to delivering ads, rather than collecting fingerprints?
"Caja uses an object-capability security model to allow for a wide range of flexible security policies, so that your website can effectively control what embedded third party code can do with user data."
Seems like classic fingerprinting behavior from Google Ads. It's unfortunate and hope they fix it quick but most importantly figure out a way to prevent it in the future
No, but SO has always prided themselves on reasonable, pro-consumer, safe advertisements. So the fact that SO is allowing this reflects on them a little. It's unlikely they know what's happening, but it's still a little gross.
I don't think it's a scandal. It's not new or surprising that ads use tracking techniques like this. Stack Exchange recently announced [1] that they will use ad networks as an experiment. That announcement was quite unpopular and met with resistance and pleas to allow only static images to avoid annoying ads as well as sophisticated tracking. So this is no scandal since they were open about it and knew the risks. It seems they ignored the community, though, so they probably lost some of the community's trust. I wonder if they will take action and stop the experiment.
It seems that the specific script comes from https://integralads.com/ as stated by another commentator. I think the blame is to be shared here.
integralads is guilty of developing and selling this technology.
Microsoft is guilty of buying it and using it
Google is guilty of serving it.
And, why not, StackOverflow is also guilty of offering that space to advertisers without enough vetting of their ads.
After reading about integralads I'm not even sure if the purpose is to fingerprint, it seems to be more targeted towards detecting fraud, which does not require fingerprinting necessarily.
My point is that it's not as easy as pointing to one company and blaming them. This is a problem that concerns everyone in the ad space.