
The issue is that, for example, the New York Times actually used WebRTC to gather data for exactly that purpose. https://webrtchacks.com/dear-ny-times/



Although NYT shouldn't get off scot-free, I think it's worth pointing out that they probably had very little to do with this and more than likely had no idea about it. The WebRTC 'tracking' came from an ad network's script[1] that was also used by at least Ars Technica and The Washington Post.

As someone who works on sites implementing display advertising using these sorts of networks, I have nothing but contempt for the developers who are writing this JS. I've lost count of all the JS errors they cause (currently I see a lot of `Can't find variable: _body`), or just errant console.log messages (one on every browser scroll or resize was fun).

[1]: Edit: Amusingly I found a developer arguing for this WebRTC-punching, who says he's from White Ops (whiteops.com) working on anti-bot tools https://github.com/EFForg/privacybadgerchrome/issues/431#iss...


Why are you even running scripts from a third party on your own site?

If the ad is a static image, use an <img> tag. If it's text, show the text. If it's a video, use <video>. If they want to run custom code, tell them to get lost.

Yeah, it's ultimately the ad networks' fault, but what did you expect?


The way it usually works (or at least how it works for 'us', at one of Australia's largest media orgs) is that we have our own ads library, which communicates with our trusted ad network, which in turn finds an ad 'provider' who might have about 10 possible ads for that spot. We insert an iFrame into the page and then insert their JS into the iFrame, which then loads the one ad to display.

This way it's a little bit more than just dumping a random script into the body. However, I don't do much with ad serving so I'm not sure exactly what there is technically to curb the iFrame interacting with the parent site (apart from extra console.log statements)
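One standard mechanism for curbing that interaction is the HTML `sandbox` attribute on the iFrame. Below is a minimal sketch, not our actual ads library: `buildAdFrame`, `escapeAttr`, and the example URL are hypothetical names made up for illustration, and refusing the `allow-scripts` plus `allow-same-origin` combination is a deliberately conservative assumption.

```typescript
// Hypothetical helper (not from any real ads library): builds markup for an
// ad iframe that uses the standard HTML `sandbox` attribute.
type SandboxToken =
  | "allow-scripts"
  | "allow-popups"
  | "allow-same-origin"
  | "allow-top-navigation";

// Escape a value so it is safe inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function buildAdFrame(
  src: string,
  width: number,
  height: number,
  tokens: SandboxToken[] = ["allow-scripts", "allow-popups"]
): string {
  // Without allow-same-origin the framed document gets an opaque origin, so
  // its script cannot touch the parent page's DOM or cookies. Combining it
  // with allow-scripts can let same-origin content undo the sandbox, so this
  // sketch simply refuses that combination.
  const hasScripts = tokens.indexOf("allow-scripts") !== -1;
  const hasSameOrigin = tokens.indexOf("allow-same-origin") !== -1;
  if (hasScripts && hasSameOrigin) {
    throw new Error("allow-scripts with allow-same-origin weakens the sandbox");
  }
  return (
    `<iframe src="${escapeAttr(src)}" width="${width}" height="${height}" ` +
    `sandbox="${tokens.join(" ")}"></iframe>`
  );
}
```

Calling `buildAdFrame("https://ads.example/slot", 300, 250)` yields an iframe whose script can run and open popups but cannot reach into the parent page. Whether a given ad network's creative still works under those restrictions is a separate question.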


Because it's an ad network and a script tag is how you pull the creative into the page. The whole point of using an ad network is that you are letting a 3rd party handle the management of your inventory; the ad creative is unknown until the call is made.


Gee, that sounds like a brilliant idea. I can't think of anything that could go wrong with this scheme.

(And yes, I know, this battle was lost in 1996 or thereabouts.)


Once there were millions of websites, it just wasn't practical to negotiate media buys with each of them individually and send them image assets or text ads that they would then have to host as a 1st party.

You also have to factor in all the things ad servers are designed to do like control the number of impressions shown, track views, clicks, and interactions, as well as allow advertisers to rotate new creative in on-the-fly.
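To make those responsibilities concrete, here is a rough sketch of impression capping, impression counting, and on-the-fly rotation. It is an illustration under my own assumptions, not how any particular ad server works: `AdRotator`, `Creative`, and `maxImpressions` are hypothetical names, and real systems also deal with clicks, pacing, targeting, and persistence.

```typescript
// Hypothetical sketch: names are illustrative, not a real ad server's API.
interface Creative {
  id: string;
  maxImpressions: number; // impression cap for this creative
}

class AdRotator {
  private creatives: Creative[];
  private served: Record<string, number> = {};
  private cursor = 0;

  constructor(creatives: Creative[]) {
    this.creatives = creatives.slice(); // copy so the caller's array is untouched
  }

  // Advertisers can rotate new creative in while the rotator is live.
  add(creative: Creative): void {
    this.creatives.push(creative);
  }

  // Number of impressions recorded so far for a creative.
  impressions(id: string): number {
    return this.served[id] || 0;
  }

  // Round-robin over creatives, skipping any that have hit their cap.
  // Returns undefined when every creative is exhausted.
  next(): Creative | undefined {
    const n = this.creatives.length;
    for (let i = 0; i < n; i++) {
      const index = (this.cursor + i) % n;
      const candidate = this.creatives[index];
      if (this.impressions(candidate.id) < candidate.maxImpressions) {
        this.served[candidate.id] = this.impressions(candidate.id) + 1;
        this.cursor = (index + 1) % n;
        return candidate;
      }
    }
    return undefined;
  }
}
```

The point is only that this state (counts, caps, the current rotation) has to live somewhere central, which is hard to do if every publisher hosts static assets itself.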


Usually the script (and ad) is in an iFrame, so that makes it slightly better.


> I have nothing but contempt for the developers who are writing this JS

I have nothing but contempt for the companies that accept advertising from untrusted third parties who can offer no assurance as to the security or even the content of the code their platforms allow to run on client browsers. That doesn't even get into the tracking that the advertising platforms themselves have access to.

Host your advertising yourself and I let it through with very little exception. If it comes from another server, it's blocked.


I'm not a fan of it, nor all the variations of invasive ads that are sold[1]. I get the sentiment of what you're saying, but reality is a bit more nuanced than that, and there's a lot of legacy and business reasons why it happens.

Using ad exchanges means that you always have ads available to make money from. When our ads team doesn't sell an ad directly, it'll go out to the ad exchange and get traded algorithmically.
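As a rough sketch of that fallback (my own simplification, not our real ad stack): `fillSlot`, `Ad`, and the "highest CPM wins" rule are assumptions for illustration, and real exchanges run far more involved auctions.

```typescript
// Hypothetical sketch of "direct-sold first, exchange as fallback".
interface Ad {
  source: "direct" | "exchange";
  creativeId: string;
  cpm: number; // price per thousand impressions
}

function fillSlot(directSold: Ad[], exchangeBids: Ad[]): Ad | undefined {
  // Anything the ads team sold directly takes priority.
  if (directSold.length > 0) {
    return directSold[0];
  }
  // Otherwise take the best bid from the exchange (simplified: highest CPM).
  let best: Ad | undefined;
  for (const bid of exchangeBids) {
    if (best === undefined || bid.cpm > best.cpm) {
      best = bid;
    }
  }
  return best; // undefined only if the exchange returned no bids at all
}
```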

If you're going to use display ads, you have little choice but to use an ad exchange, and no ad exchange is set up to avoid running content from a third party - they simply haven't considered, or don't care about, the disadvantages that you or I see in running third party content. That industry just isn't as 'progressive' or modern.

Thankfully though, media and publishing companies (or at least the one I work at) are starting to become more aware of the problems of relying on display advertising, and are starting to rely on it less and use other forms such as sponsorship deals or video ads[2].

[1]: Like this obnoxious wallpaper ad http://i.imgur.com/IPVAVwx.jpg although this is actually one of the better ones.

[2]: A 'new' tech is 'server side ad insertion', where the video ad is inserted into the video stream on demand on the server. Pretty cool stuff https://www.brightcove.com/en/once


As someone who works on ad networks, I agree. While they put a lot of work into the backend stack, the frontend is usually written in the worst possible way.

Tons of document.write, loading dozens of more tags, everyone has their own copy of jquery, etc.

The industry just doesn't have any technical leadership in the governing bodies so there's no accountability or any expertise to check that the networks are built right.


Ads are blamed for bloated pages, slow load times, JavaScript errors flooding console.log, and mixed content HTTP/HTTPS problems. I'm surprised that ad networks are not super optimized. For programmatic advertising, wouldn't serving ads faster allow for better/more/longer ad views?


Yea our network has put a lot of work into what actually runs in the browser for this exact reason.

However most networks go for the volume game so it just isn't that important to focus on JS performance. When you can spend time on jamming more expanding units and video into an ad that for the most part still works, that's better ROI than trying to optimize. Things are finally changing now with adblock and mobile usage but there are lots of long-tail shady networks who aren't legit with business practices in the first place (let alone dev) and the big companies just don't care because they're already big and engineering is a committee based process. Part of it is also the fact that there's no accountability in the industry, especially with tech.

I've been pushing for a technical certification process for ad networks (along with data/privacy handling) but it's a long road and won't happen anytime soon.


They aren't written for the user experience, they're written by low-level techs usually (because the JS part isn't as "cool/exciting/important" as the backend part) to just get the ad on the page somehow.

It's probably the worst of the worst in JS engineering sadly.


Probably so... maybe if the browsers put a limit on the amount of JS content that can go into a given iframe (including child frames), say 80KB, that would cut a lot of it out. It would still allow for a LOT of code, but not nearly the kitchen sink + the kitchen.
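To be clear, no browser enforces anything like this today as far as I know; the following is just a sketch of what such a check might compute. `FrameNode`, `totalScriptBytes`, and `withinBudget` are hypothetical names, and counting script bytes per frame tree is my own assumption about how the limit would be measured.

```typescript
// Hypothetical model of a frame tree: each frame lists the byte sizes of the
// scripts it loaded, plus any child frames it created.
interface FrameNode {
  scriptBytes: number[];
  children: FrameNode[];
}

const BUDGET_BYTES = 80 * 1024; // the 80KB figure suggested above

// Sum script bytes for a frame and, recursively, all of its child frames.
function totalScriptBytes(frame: FrameNode): number {
  let total = 0;
  for (const size of frame.scriptBytes) {
    total += size;
  }
  for (const child of frame.children) {
    total += totalScriptBytes(child);
  }
  return total;
}

function withinBudget(frame: FrameNode, budget: number = BUDGET_BYTES): boolean {
  return totalScriptBytes(frame) <= budget;
}
```

Counting child frames against the parent's budget is what stops a script from dodging the limit by spawning more iframes.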

Then again between Ghostery and uBlock, I don't see most of it.


There's quite a lot of case law that says that companies can be held responsible for their subcontractors. See e.g. Deepwater Horizon

I am not a lawyer. This is not legal advice.


Oh yes. I agree.

> Although NYT shouldn't get off scot-free

Legals aside, ultimately you're (where 'you' == 'the company') responsible for what ends up on your website.

What I'm saying is that this is more nuanced in practice. If you look at the JS console on some sites I work on at my company, you could come to the conclusion that we're bad developers because of all the JS errors you would see. Unfortunately, they're caused by others' code and we (developers) get little choice in the matter.


> The WebRTC 'tracking' was from a script from an ad network[1] used by at least Ars Technica and The Washington Post.

And then they wonder why we run AdBlock.


And this, everyone, is why using pay-per-view or pay-per-click ads online is so destructive.


And by "exactly that purpose" you mean preventing ad fraud [1], right? They weren't using WebRTC to put you in a "VPN user" advertising segment.

1. https://www.reddit.com/r/netsec/comments/3dgwee/how_the_new_...


Doesn't matter. Privacy is not about right or wrong, it is about privacy.


Privacy is not an absolute to be maximized at all costs. Do you have blacked out windows, or do you concede that the practical day-to-day infringement of your privacy is so minuscule and so easily mitigated by window shades that it's not worth the trade-off?


No, but I have windows with curtains on the inside. Not the outside.

The distinction is both important, and blatantly obvious. Privacy control must remain with the one whose privacy is at stake.


And sometimes you have these curtains open? But what about privacy? Do you agree that sometimes people being able to look into your living room is not a big deal, even if it decreases your privacy by some definition? Great, then you agree that maximizing privacy at all costs isn't your or people's tradeoff point. Same with WebRTC - your internal network's IP is not the kind of privacy most people care about, nor should they.

That said, WebRTC from behind a VPN exposing your personal IP is a bit different. That's kind of like a light you installed rendering your curtains translucent. I'm not sure if it's the curtain's fault, or the light's, but it's certainly not what anyone had expected!

Given that OpenVPN somehow works in a way that doesn't expose your personal IP [1], I'd blame the VPN providers for saying that their VPN anonymizes web traffic when it actually doesn't.

1. https://tlog.anfedorov.com/vpns-webrtc


[deleted]


> a random guy on the internet says something and you think it is true?

Yes, if they identify themselves and their company and what they say aligns with my personal experience.

> You don't have the foggiest idea what they are really doing with the data. All we know is that they are collecting the data without user's consent.

You never have any absolute certainty what anyone does with your data - all you have are hypotheses and probabilities. Who are "they" and what do you think they are doing?

If adtech companies cared whether you're behind a VPN, they would make or buy a list of IPs that provide VPN services and match against that list. That's a ton easier than implementing a STUN server that scales to handle traffic from every single person who views one of their ads.
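For a sense of how little code the list-matching approach needs, here is a minimal IPv4-only sketch. `isKnownVpnExit`, `inCidr`, and `ipv4ToInt` are hypothetical names, and the ranges used with it would be whatever list you made or bought; the ones in my usage note are reserved documentation ranges, not real VPN exits.

```typescript
// Convert dotted-quad IPv4 text to a non-negative integer.
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".");
  if (parts.length !== 4) {
    throw new Error("not an IPv4 address: " + ip);
  }
  let value = 0;
  for (const part of parts) {
    const octet = Number(part);
    if (!/^\d{1,3}$/.test(part) || octet > 255) {
      throw new Error("not an IPv4 address: " + ip);
    }
    value = value * 256 + octet; // plain arithmetic avoids signed 32-bit issues
  }
  return value;
}

// True if `ip` falls inside a CIDR block such as "203.0.113.0/24".
function inCidr(ip: string, cidr: string): boolean {
  const parts = cidr.split("/");
  if (parts.length !== 2 || !/^\d{1,2}$/.test(parts[1]) || Number(parts[1]) > 32) {
    throw new Error("not a CIDR block: " + cidr);
  }
  const size = Math.pow(2, 32 - Number(parts[1])); // addresses in the block
  const start = Math.floor(ipv4ToInt(parts[0]) / size) * size;
  const address = ipv4ToInt(ip);
  return address >= start && address < start + size;
}

// Hypothetical check against a purchased or self-built list of VPN ranges.
function isKnownVpnExit(ip: string, vpnRanges: string[]): boolean {
  return vpnRanges.some((range) => inCidr(ip, range));
}
```

For example, `isKnownVpnExit(requestIp, ["198.51.100.0/24", "203.0.113.0/24"])` runs entirely server-side against the IP the ad request already arrives from, with no WebRTC or STUN involved.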


It's not only about the VPN leak. WebRTC also leaks internal IP addresses which provide additional entropy that can be used for fingerprinting.
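The addresses show up in the ICE candidate strings that a page's script can read. A minimal sketch of pulling the address out of one and checking whether it is an RFC 1918 private address follows; `parseCandidate` and `isPrivateIPv4` are hypothetical helper names, the field positions follow the candidate attribute grammar from the ICE spec, and the sample strings in the usage note are made up.

```typescript
interface ParsedCandidate {
  address: string; // connection address field of the candidate
  port: number;
  type: string; // e.g. "host", "srflx", "relay"
  isPrivateIPv4: boolean;
}

// RFC 1918 ranges: 10/8, 172.16/12, 192.168/16.
function isPrivateIPv4(address: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(address);
  if (!match) {
    return false; // IPv6 or a hostname, not handled by this sketch
  }
  const first = Number(match[1]);
  const second = Number(match[2]);
  return (
    first === 10 ||
    (first === 172 && second >= 16 && second <= 31) ||
    (first === 192 && second === 168)
  );
}

// Candidate layout: candidate:<foundation> <component> <transport> <priority>
//                   <address> <port> typ <type> [extensions...]
function parseCandidate(line: string): ParsedCandidate | undefined {
  const fields = line.trim().replace(/^a=/, "").split(/\s+/);
  if (fields.length < 8 || fields[0].indexOf("candidate:") !== 0 || fields[6] !== "typ") {
    return undefined;
  }
  return {
    address: fields[4],
    port: Number(fields[5]),
    type: fields[7],
    isPrivateIPv4: isPrivateIPv4(fields[4]),
  };
}
```

Given `"candidate:842163049 1 udp 1677729535 192.168.1.23 58874 typ host"`, the parsed address is the internal one; a `srflx` candidate would instead carry the public address seen by the STUN server.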


Entropy that changes when your local IP does? That's worse than useless for ad targeting. Even if it were useful, I don't think there's much of a chance that adtech companies will build out STUN servers to handle the kind of traffic they do just to track down the 0.001% of users who do not accept third-party cookies. Can you even do WebRTC from an iframe?

Browser fingerprinting is absolute FUD. It makes no sense for advertisers, and it's pretty useless for anyone else, too. Every time I visit the EFF site that checks my fingerprint, it tells me I'm still unique. That's perfect anonymity!

Revealing a user's personal IP when they're using a VPN is a real problem, though, where the computer isn't doing what even an experienced user would expect.



