Ask HN: What are the security risks with web analytics and mitigations?

_b8r0 · on Oct 11, 2010

You're injecting another site's code into your site which is then rendered in your end user's browser. The security of your site (in terms of client side security) now becomes wholly reliant on the security of your analytics platform provider.

Additionally, you're also giving the provider the opportunity to scrape everything on every page that's served with the analytics code in it, as you have no way of knowing what was served to the browser (as you don't control it).

If you imagine taking the popular drive-by testing tool BeEF and including its JavaScript, that's effectively what you're risking with Google or any hosted service. Nobody's infallible. They may be better at web security than you, but maybe they're also more likely to be the target of an advanced attack?

rakkhi · on Oct 12, 2010

Thanks yeah I was thinking along the same lines, how is this risk best mitigated? By requiring that the analytics provider has had the code pen tested and testing it ourselves?

I don't understand your second point - what do you mean by scraping everything on the page with the analytics code? Doesn't the code, script or zero-pixel image simply provide the analytics solution with the tracking data? What is the security risk there?

_b8r0 · on Oct 12, 2010

> Thanks yeah I was thinking along the same lines, how is this risk best mitigated? By requiring that the analytics provider has had the code pen tested and testing it ourselves?

Ultimately you can't. For example, you include script src="http://some-analytics-server/foo.js" - at no point can you assert on behalf of your end user that the contents of foo.js are not malicious. Even if it was penetration tested, that only provides a snapshot in time. There's no reason why the .js returned for one request should be the same for all requests, and there's no real means of providing any assurance of its integrity.
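To make the "snapshot in time" point concrete, here's a rough Python sketch. It assumes you kept a hash of the copy of foo.js that was reviewed; the function names and the two script bodies are made up for illustration. Note the limit: this only tells you whether a copy you fetched yourself matches the reviewed one, and says nothing about what a given end user's browser was actually served.

```python
import hashlib


def fingerprint(script_body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched script body."""
    return hashlib.sha256(script_body).hexdigest()


def matches_reviewed(script_body: bytes, reviewed_digest: str) -> bool:
    """True only if this copy is byte-for-byte the copy that was reviewed."""
    return fingerprint(script_body) == reviewed_digest


# Hypothetical bodies standing in for two separate fetches of foo.js.
reviewed = b"function track(){/* reviewed build */}"
later = b"function track(){/* reviewed build */}steal(document.cookie);"

# Digest recorded at the time of the pen test / code review.
pinned = fingerprint(reviewed)

print(matches_reviewed(reviewed, pinned))  # True
print(matches_reviewed(later, pinned))     # False
```

A check like this can raise an alarm when the file changes, but the provider can still serve different content per request, which is the underlying problem.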

As for my second point, if you look at Google Analytics' code you'll see ga.js invoked via a script src. This means that the code is executed in the browser. That URL could serve anything. It's unlikely to in the case of Google without people noticing, but the fact is you have no control over the content or the protection of that content, and you have to adjust your risk assessment model to take that into account. It could contain a picture of a horse, it could redirect your users to goatse.cx, it could read the contents of the page it's loaded on (depending upon browser and server configuration), strip out the bits it wants and submit them to another server somewhere. You just don't know.

Another thing worth thinking about is that some browsers (thankfully not most modern ones as far as I know) will execute javascript in an img src tag if sufficiently seduced by malicious code.

rakkhi · on Oct 12, 2010

Good points. My thinking is that we are placing a level of trust in the analytics provider, and we will mitigate this via the standard mechanisms: due diligence, contract clauses, and requiring that they have secure development practices (this is where I was going with the pen test). But yes, I will capture in the residual risk that the script could contain malware or be subverted. However, the analytics vendor has a very large incentive to avoid this, and we will also put in service credits and cancellation penalties if this actually occurs.

On the point regarding img vs. script tags, I doubt the business will allow me to request no scripts and rely only on cookies, session identifiers and zero-pixel images. I wonder if the business benefit could be delivered without scripts...

_b8r0 · on Oct 12, 2010

With a blank GIF you'll only get what the browser sends. You could set a cookie and track that (to see who else visits sites using your analytics tool), but you're unlikely to be able to pull data about Flash configuration, JavaScript engines or browser resolution, and you certainly wouldn't be able to build heat maps.
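As a rough illustration of "only what the browser sends", here's a small Python sketch of what the server behind a tracking image could record for one request. The function name, field names and sample header values are hypothetical, and a real deployment would sit behind whatever web framework you use.

```python
def pixel_hit(headers: dict, remote_addr: str) -> dict:
    """Collect what a plain image request exposes to the server."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    return {
        "ip": remote_addr,
        "user_agent": h.get("user-agent"),
        "referer": h.get("referer"),          # page that embedded the image
        "cookie": h.get("cookie"),            # tracking cookie, if one was set
        "accept_language": h.get("accept-language"),
    }


# Hypothetical request for the zero-pixel image.
hit = pixel_hit(
    {
        "User-Agent": "ExampleBrowser/1.0",
        "Referer": "http://example.com/pricing",
        "Cookie": "visitor=abc123",
    },
    "203.0.113.7",
)

print(hit["referer"])          # http://example.com/pricing
print(hit["accept_language"])  # None
```

Nothing in that record covers screen resolution, plugin details or mouse movement; those need script running in the page.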