

Ask HN: What are the security risks with web analytics and mitigations? - rakkhi

My initial brainstorm was:
[+] controlling how and what data is captured and stored by the analytics platform
[+] if data is traveling to a remote service the transmission security and security in storage
[+] access controls to the analytics platform: authentication, authorization, logging
[+] where session identifiers and cookies are used protecting these from leakage to an attacker
[+] with iframes or zero-pixel images, is there any risk of an attacker being able to insert their own, or is this sufficiently protected by access control to content management?
[+] can any denial of service on the analytics scripts affect the performance of the service being monitored
[+] breach of privacy if any personally identifiable data is collected without consent?
[+] any risks with service mashups, e.g. web analytics pulling in geographic or post code data and not validating this input?

Anything else? Any of these that are not a risk?
======
iuguy
You're injecting another site's code into your site which is then rendered in
your end user's browser. The security of your site (in terms of client side
security) now becomes wholly reliant on the security of your analytics
platform provider.

Additionally, you're also giving the provider the opportunity to scrape
everything on every page that's served with the analytics code in it, as you
have no way of knowing what was served to the browser (as you don't control
it).

If you imagine taking the popular drive-by testing tool BeEF and including
that javascript, that's effectively what you're risking with Google or any
hosted service. Nobody's infallible. They may be better at web security than
you, but maybe they're also more likely to be hit by an advanced attack?

~~~
rakkhi
Thanks yeah I was thinking along the same lines, how is this risk best
mitigated? By requiring that the analytics provider has had the code pen
tested and testing it ourselves?

I don't understand your second point - what do you mean scrape everything on
the page with the analytics code? Doesn't the code, script, zero pixel image
simply provide the analytics solution with the tracking data? What is the
security risk there?

~~~
iuguy
> Thanks yeah I was thinking along the same lines, how is this risk best
> mitigated? By requiring that the analytics provider has had the code pen
> tested and testing it ourselves?

Ultimately you can't. For example, you include
<script src="http://some-analytics-server/foo.js"> - at no point can you
assert on behalf of your end user that the contents of foo.js are not
malicious. Even if it was penetration tested, that only provides a snapshot in
time. There's no reason why the .js served in one request should be the same
in all, and there's no real means to provide any assurance of its integrity.
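
To make the "snapshot in time" point concrete, here is a minimal sketch (not
part of the original comment) that pins a digest of the script as it looked
when it was reviewed and flags a later fetch that differs. The function names
and the two script bodies are hypothetical stand-ins for foo.js. Note the
limit: this only tells you about the copy your own check fetched, not about
what any given user's browser was served.

```python
import base64
import hashlib


def script_digest(script_bytes: bytes) -> str:
    """Return a 'sha384-<base64>' digest of a script body."""
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def has_changed(pinned_digest: str, fetched_bytes: bytes) -> bool:
    """True if the fetched script no longer matches the pinned digest."""
    return script_digest(fetched_bytes) != pinned_digest


# Hypothetical contents of foo.js at review time and at a later fetch.
reviewed = b"track('pageview');"
later = b"track('pageview'); send(document.cookie);"

pinned = script_digest(reviewed)
print(has_changed(pinned, reviewed))  # False: same bytes as reviewed
print(has_changed(pinned, later))     # True: provider served something else
```

Browsers that implement Subresource Integrity can enforce a digest in this
format through the script tag's integrity attribute, though that only helps
when the provider serves a fixed, versioned file rather than one it updates
in place.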

As for my second point, if you look at Google Analytics' code you'll see ga.js
loaded via a script src. This means that the code is executed in the
browser. That URL _could_ contain anything (although it's unlikely to in the
case of Google without people noticing, the fact is you have no control over
the content or protection of that content and have to adjust your risk
assessment model accordingly to take that into account). It could contain a
picture of a horse, it could redirect your users to goatse.cx, it could read
the contents of the page it's loaded on (depending upon browser and server
configuration), strip out the bits it wants and submit to another server
somewhere. You just don't know.

Another thing worth thinking about is that some browsers (thankfully not most
modern ones as far as I know) will execute javascript in an img src tag if
sufficiently seduced by malicious code.

~~~
rakkhi
Good points. My thinking is that we are placing a level of trust in the
analytics provider, and we will mitigate this via the standard mechanisms: due
diligence, contract clauses, requiring they have secure development practices
(this is where I was going with the pen test). But yes, I will capture as a
residual risk that the script could contain malware or be subverted. However,
the analytics vendor has a very large incentive to avoid this, and we will
also put in service credits and cancellation penalties if this actually
occurs.

On the point regarding img vs. script tags, I doubt the business will allow me
to request no scripts and rely only on cookies, session identifiers and zero
pixel images. Wonder if the business benefit could be delivered without
scripts....

~~~
iuguy
With a blank gif you'll only get what the browser sends. You could set a
cookie and track that (to see who else visits sites using your analytics
tool), but you're unlikely to be able to pull data about Flash configuration,
javascript engines or browser resolution, and you certainly wouldn't be able
to build heat maps.
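
As a rough illustration of "only what the browser sends", here is a minimal
sketch of the server side of a blank-gif tracker. It is an assumption-laden
example, not any vendor's implementation: the handler name, the `vid` cookie
name and the record fields are all hypothetical. Everything it can log comes
from the request headers and the connection. There is no field for screen
resolution, plugins or mouse position because the browser never sends them
without a script.

```python
import base64
import uuid
from http.cookies import SimpleCookie

# A widely used 1x1 transparent GIF, base64-encoded.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def handle_pixel_request(headers: dict, remote_addr: str):
    """Build a log record and a response for one request to the pixel URL."""
    cookies = SimpleCookie(headers.get("Cookie", ""))
    # Reuse the visitor id if the browser sent one, otherwise mint a new one.
    if "vid" in cookies:
        visitor_id = cookies["vid"].value
    else:
        visitor_id = uuid.uuid4().hex

    # All of this comes from the HTTP request itself.
    record = {
        "visitor_id": visitor_id,
        "ip": remote_addr,
        "user_agent": headers.get("User-Agent"),
        "referer": headers.get("Referer"),
        "language": headers.get("Accept-Language"),
    }
    response_headers = {
        "Content-Type": "image/gif",
        "Cache-Control": "no-store",
        "Set-Cookie": f"vid={visitor_id}; Path=/",
    }
    return record, response_headers, PIXEL


first, resp, body = handle_pixel_request(
    {"User-Agent": "ExampleBrowser/1.0", "Referer": "http://example.com/a"},
    "203.0.113.5",
)
# A later request that carries the cookie back is recognised as the same visitor.
second, _, _ = handle_pixel_request(
    {"Cookie": f"vid={first['visitor_id']}"}, "203.0.113.5"
)
print(first["visitor_id"] == second["visitor_id"])  # True
```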

