
The scourge of web analytics - StavrosK
https://www.stavros.io/posts/scourge-web-analytics/
======
jagthebeetle
Disclaimer: I'm a bit of a GA power user, although, being on the technical
side, I'm not really sympathetic to the marketing uses of analytics data.

There's a tough line to draw between excessive and sufficient instrumentation
for a given app, and a business will probably opt for more than less data. I'm
not sure that's a moral argument against tools like GA, though.

Especially given one particularly compelling feature of Universal Analytics
(GA's latest incarnation): you can definitely water down what you collect.
([https://developers.google.com/analytics/devguides/collection...](https://developers.google.com/analytics/devguides/collection/protocol/v1/))

Want to opt out of collecting advertising info? Say so in the UI.
Anonymize/obliterate the IP/geolocation info? Override the IP field to a
static value. Override the intrusive collection of User-Agent values? Override
the ua parameter. Prevent collection of granular page info? Override the dl,
dp, dt fields. Stop collecting referral info? Blank out the dr, cs, cm, cn,
and remove UTMs/GCLIDs from the URLs you report.
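
A minimal sketch of what such a watered-down hit might look like, assuming the Measurement Protocol v1 parameter names from the linked docs (`uip` for the IP override, `ua`, `dp`, `dr`). The helper name `buildMinimalHit` and the static placeholder values are made up for illustration; it only builds the URL and sends nothing.

```javascript
// Hypothetical helper: builds a Measurement Protocol v1 pageview URL
// with the identifying fields overridden by static values.
function buildMinimalHit(trackingId, clientId, path) {
  const params = new URLSearchParams({
    v: "1",          // protocol version
    tid: trackingId, // property ID, e.g. UA-XXXXX-Y
    cid: clientId,   // client ID; the value is up to the site
    t: "pageview",   // hit type
    dp: path,        // report only a coarse path (no dl or dt)
    uip: "0.0.0.0",  // static IP override (assumed placeholder value)
    ua: "generic",   // static User-Agent override (assumed placeholder value)
    dr: "",          // blanked-out referrer
  });
  return "https://google-analytics.com/collect?" + params.toString();
}

console.log(buildMinimalHit("UA-XXXXX-Y", "somevalue", "/somepath"));
```

Whether GA accepts every one of these placeholder values is not something this sketch verifies; the point is only that each field is under the sender's control.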

In fact, you can even implement measurement protocol serverside – GA's ancient
forebear, Urchin Analytics, actually began as a server-log parsing utility!
([https://urchin.biz/urchin-software-corp-89a1f5292999](https://urchin.biz/urchin-software-corp-89a1f5292999))

Reduce JavaScript use altogether but keep it clientside? Implement a static GA
pixel:

    
    
      <img src="https://google-analytics.com/collect?v=1&tid=UA-XXXXX-Y&cid=somevalue&dp=%2Fsomepath&ua=YourUserAgent&...">
    

Now, mistrust of a third party with said data is a relevant issue, but
slightly different from the technical bones of this piece's argument.

~~~
ge96
Curious how that pixel works... do you have to mouse over it? If so, that
seems like a bad approach/design: one pixel versus 1366x768 (at least). Also,
I realize why the code sample you posted isn't complete, but I wonder whether
it has JS bound to it or how it works technically (all in the source)... do
you still have to include a script? I should just Google it, haha.

~~~
coalescedfrog
Think it's been answered, but the link below might provide more information on
tracking pixels/beacons.

[https://en.wikipedia.org/wiki/Web_beacon](https://en.wikipedia.org/wiki/Web_beacon)

~~~
ge96
Thanks for that, I've seen/heard of tracking pixels but haven't tried them
yet.

------
js2
_I’ve been making web apps since 2003, which means that I’ve been doing this
for fourteen years now, or it means that I can’t count. So, there are few
people more qualified than me to tell you this:

The web is shit._

I've been doing this since 1996. I watched the transition from server-side
analytics (one of my first jobs was implementing access log collection and
analysis for a decently sized group of sites) to client-side.

I remember solutions like analog and on up to Urchin and then Urchin's
transition to client-side and finally its acquisition by Google.

By 2000, anyone paying attention knew it would not end well.

So here we are, bloated web. I'll be mildly amused if sites are forced to
return to server-side log collection because enough folks have started using
tracking blockers.

Everything old is new again.

[https://en.wikipedia.org/wiki/Analog_(program)](https://en.wikipedia.org/wiki/Analog_\(program\))
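
For readers who never saw the server-side approach, here is a rough sketch of the idea: read access-log lines and count page views. It assumes the logs are in Combined Log Format; the regex, the function names, and the sample lines are illustrative and are not taken from analog or Urchin.

```javascript
// Matches one line of Combined Log Format (an assumed input format).
const LOG_LINE =
  /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;

// Returns the interesting fields of a log line, or null if it doesn't match.
function parseLogLine(line) {
  const m = LOG_LINE.exec(line);
  if (!m) return null;
  return {
    ip: m[1], time: m[2], method: m[3], path: m[4],
    status: Number(m[5]), referer: m[7], userAgent: m[8],
  };
}

// Counts successful GET requests per path.
function countPageViews(lines) {
  const counts = new Map();
  for (const line of lines) {
    const hit = parseLogLine(line);
    if (!hit || hit.method !== "GET" || hit.status !== 200) continue;
    counts.set(hit.path, (counts.get(hit.path) || 0) + 1);
  }
  return counts;
}

const sampleLines = [
  '203.0.113.5 - - [10/Oct/2017:13:55:36 +0000] "GET /posts/a/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
  '203.0.113.9 - - [10/Oct/2017:13:56:01 +0000] "GET /posts/a/ HTTP/1.1" 200 2326 "https://example.com/" "Mozilla/5.0"',
  '203.0.113.9 - - [10/Oct/2017:13:56:02 +0000] "GET /missing HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
];
console.log(countPageViews(sampleLines));
```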

~~~
smnscu
I hope to have time to work on a server-side ad + analytics framework, purely
as an exercise. Might work as a business as well, as a "boutique" version of
an ad network for blogs and businesses who care about privacy/not shitting in
their users' browsers. I noticed this kind of small network with
"premium"/"nice" ads a couple of years ago, but I can't remember any names. I
really liked the idea tho.

~~~
netzone
That's an idea I've had for a long time actually, but never made anything of
it. Running ads server-side seems like a good idea to me, especially since the
server can preload ads so there is no overhead for the actual users in loading
the ad.

------
JohnTHaller
You can have a static button for "Share" on Facebook. I'm currently using
static image buttons that allow you to Share on Facebook, Tweet on Twitter,
Share on Google+, Share on LinkedIn and Share on StumbleUpon (should probably
remove that one) on my personal site:
[http://johnhaller.com/](http://johnhaller.com/)

The code for Share on Facebook is (hoping HN doesn't mangle this):

      <a href="https://www.facebook.com/"
         onclick="windowpop('https://www.facebook.com/sharer/sharer.php?u=', 500, 500); return false;"
         style="padding-right:15px;"><img
         src="/path/to/your/FacebookShareButton.png" width="57" height="20"
         alt="Share on Facebook" title="Share on Facebook"></a>

I've been using static social buttons for a while now on my personal site as
well as PortableApps.com to improve page load speed and user privacy. I've
been building web apps for 20 years since the original Active Server Pages 1.0
and 2.0, so there are few people more qualified than me to post this comment
:)

~~~
StavrosK
Thank you, but you forgot to include the "windowpop" function :)

I'd love it if I can replace both the Facebook button and GA on my site with
something static, thanks.

EDIT: Done, the URL is:

[https://www.facebook.com/sharer/sharer.php?u=<url>](https://www.facebook.com/sharer/sharer.php?u=<url>)
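
A small sketch of filling in that `<url>` placeholder safely. The function name `facebookShareUrl` is hypothetical; the only thing taken from the thread is the `sharer.php?u=` endpoint. The page address is percent-encoded so that any query string of its own survives as part of `u`.

```javascript
// Builds a plain share link; no third-party script is loaded.
function facebookShareUrl(pageUrl) {
  return "https://www.facebook.com/sharer/sharer.php?u=" + encodeURIComponent(pageUrl);
}

console.log(facebookShareUrl("https://www.stavros.io/posts/scourge-web-analytics/"));
```

The result can go straight into the `href` of an ordinary link around a static image.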

~~~
JohnTHaller
Ah, quite right, sorry...

    
    
      <script type='text/javascript'>
      function windowpop(url, width, height) {
          var leftPosition, topPosition;
          //Allow for borders.
          leftPosition = (window.screen.width / 2) - ((width / 2) + 10);
          //Allow for title and status bars.
          topPosition = (window.screen.height / 2) - ((height / 2) + 50);
          //Open the window, URL-encoding the current page address so its own query string survives.
          window.open(url + encodeURIComponent(document.URL), "Window2", "status=no,height=" + height + ",width=" + width + ",resizable=yes,left=" + leftPosition + ",top=" + topPosition + ",screenX=" + leftPosition + ",screenY=" + topPosition + ",toolbar=no,menubar=no,scrollbars=no,location=no,directories=no");
      }</script>

~~~
badosu
You can post content verbatim by indenting it by 2 or more spaces
([https://news.ycombinator.com/formatdoc](https://news.ycombinator.com/formatdoc))

~~~
JohnTHaller
Thanks, updated.

------
gator-io
I could never understand why any ecommerce company would use GA. They are
collecting your users' data for ad targeting. Well, guess what? They target
ads to your users for competitors' products.

A user browses a site that has GA on it looking to buy a bike, then that user
gets hit with bike ads on other sites that host Google ads. Almost every site
owner I've spoken to about this issue was unaware of it.

Server side analytics prevents this, or use an analytics package that has no
advertising business.

~~~
cm2012
99% of ecommerce folks are not at a scale where this would affect anything, and
even then Google display ads are less than 5% of the usual online marketing
mix. It's well worth the tradeoff for the power and flexibility of GA.

~~~
gator-io
Why would scale matter?

~~~
cm2012
As a small company, you are barely affecting Google's targeting algorithms on
your own.

~~~
gator-io
They target individual users based on their browsing history. Scale doesn't
matter.

------
spiderfarmer
I fully agree on the social buttons but Google Analytics is so much more than
just a statcounter. Goal tracking and segmentation are invaluable tools that I
believe no other tool provides.

And before you complain that 'spying on your users isn't needed to make
money', try working at a large corporation. Things just don't work the way you
(ideally) want to. You can only say "no" so many times before someone else
steps in and does it for you.

~~~
vmateixeira
If _spying on your users is needed_ for a company to make money... maybe that
company should shut down, as obviously their products/services are not needed
in society anymore?

I mean, something else really wrong must have happened in core business for
this to be an actual issue.

~~~
soared
Is GA 'spying on users'? If so.. you should never go anywhere in public, never
go shopping at a grocery store, etc. They all record everything that happens
and analyze it to get similar amounts of data. Your bank tracks you way more
(and sells all your data!).

~~~
ionised
> Your bank tracks you way more (and sells all your data!).

In the US they might. It's not true everywhere.

------
jastingo
I fully agree with the sentiment (if not the tone) of this post; namely,
consumer tracking across the web has run rampant without much consideration of
its long-term cost to both consumer privacy and user experience.

However, I do find that the blog post's own use of Google Analytics and the
Facebook like button both indicates and perpetuates the very problem the OP is
posting about. The simplicity and convenience to the content producer of using
services like Google Analytics and of offering a means of sharing their
content through a platform like Facebook clearly still outweighs any
objections, moral or otherwise, to the use of those services/platforms.

~~~
StavrosK
It turns out that there are easy alternatives to this, and some were posted in
this thread (like Shariff). I have replaced all the social buttons with static
alternatives, and soon I will be able to remove tracking as well in favor of
GoAccess, so it's not as bad as it sounds!

------
Theodores
The problem with web analytics is that it is a low barrier to entry. I have a
saying - 'if you can't code then do SEO' - and far too many companies have
some inexperienced, non-technical squeaky wheel insistent on the overbearing
analytics. It is not just the tracking scripts, there is a whole universe of
snakeoil built on top of that. In ecommerce there are tracking script things
that promise to deliver you the ultimate newsletter - if you sign up for it -
and if you don't then all your interactions will be recorded anyway.

This is all fantastic; however, there really just needs to be some compelling
CTA for the newsletter signup, which is doable. Instead, though, the things you
could simply ask the customer have to be magically inferred.

Another favourite is some stalking module for ecommerce that does personalised
recommendations. This adds to the bloat and does 'jumble sale
recommendations'. To do it properly, with code, is probably a simple SQL query
but you still need to build, test, deploy such a thing in a professional way.
But a third party script that magically does it, well, sign me up...! (Says
the non-technical guy).

On top of that is the instant help widget that someone in marketing expects.
There is no need for it if the website actually works and information expected
is provided. Again this widget needs to stalk 2000 people for the 1 person
that uses it.

So how do I face the challenge? Sometimes due to squeaky wheel noise it is not
possible. However, if you can do a better job of delivering what the required
end information is, then you are good to go.

Some people use analytics for actual sales information, again because some
report can be produced easily without any knowledge required. Again, to do a
proper report probably needs some knowledge of SQL and coding. So the result
is business decisions made on data that is not correct.

There is also a cost to some of these things. You can have a pop up giving
customers 10% off their first order with yet more script going on. The script
will charge an 8% cut to make it an 18% payout if anyone puts their email
address in the popup box.

Then there is affiliate marketing, which is okay but another level where more
script is added and more 'fee' paid out for the privilege.

So those scripts you don't see are costing money, real money. You as the
consumer pay 10% of the product price to some mystery javascript bloaters
every time you 'Buy Now' (which is always next day at the earliest).

~~~
StavrosK
For reporting, we use Django Explorer (there are definitely similar tools for
many stacks). Someone who knows SQL prepares a query once, then the person who
needs the report specifies the arguments and they get a well-formed CSV to
download immediately.
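
The last step of that workflow, turning query results into a well-formed CSV, can be sketched in a few lines. This is not Django Explorer's code: the `toCsv` and `csvField` names are made up, and a plain array of objects stands in for whatever rows the prepared SQL query returns.

```javascript
// Escapes one CSV field: quote it if it contains a comma, quote, or newline.
function csvField(value) {
  const s = String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Renders a header row plus one line per result row.
function toCsv(columns, rows) {
  const lines = [columns.map(csvField).join(",")];
  for (const row of rows) {
    lines.push(columns.map((c) => csvField(row[c])).join(","));
  }
  return lines.join("\r\n") + "\r\n";
}

// Stand-in for the rows a prepared query would return.
const reportRows = [
  { sku: 1, name: "Bike, red", orders: 12 },
  { sku: 2, name: 'Helmet "Pro"', orders: 3 },
];
console.log(toCsv(["sku", "name", "orders"], reportRows));
```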

------
vbernat
It's odd to keep Google Analytics because you don't have access to logs while
being able to provide self-hosted comments. Just add a self-hosted small image
on each page.
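
A rough sketch of the self-hosted image idea, assuming the site can run a small Node handler somewhere (for instance next to the self-hosted comments). The handler name `handlePixel` and the logged fields are illustrative. Each page embeds the image, and the page address arrives in the `Referer` header if the browser sends one.

```javascript
// Widely used 1x1 transparent GIF, base64-encoded.
const PIXEL = Buffer.from(
  "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7",
  "base64"
);

// Hypothetical request handler: logs one line per view, then serves the GIF.
function handlePixel(req, res) {
  console.log(
    new Date().toISOString(),
    req.headers.referer || "-",
    req.headers["user-agent"] || "-"
  );
  res.writeHead(200, {
    "Content-Type": "image/gif",
    "Content-Length": PIXEL.length,
    "Cache-Control": "no-store", // ask browsers to refetch on every page view
  });
  res.end(PIXEL);
}

// Usage (Node): require("http").createServer(handlePixel).listen(8080);
```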

~~~
StavrosK
Hmm, that's actually a great idea! Let me try that right now, thank you.

------
nachtigall
FWIW for non-tracking share buttons, there's also Shariff:
[https://github.com/heiseonline/shariff](https://github.com/heiseonline/shariff)

~~~
StavrosK
That is fantastic, thank you! I'll add them to the article right now, and my
site as soon as I can integrate them.

~~~
nachtigall
There are also many plugins for various CMS for Shariff, e.g. for WordPress I
can highly recommend
[https://wordpress.org/plugins/shariff/](https://wordpress.org/plugins/shariff/)

------
amiga-workbench
I've used Piwik for a few sites, I love that you don't need the JS crud to use
it, makes the overhead almost unnoticeable.

------
jackgolding
I've worked in web analytics for the last 4 years. It's a bit annoying that OP
doesn't differentiate "analytics tools" and "marketing tools"; nearly every
site I've worked on has its load time doubled by some arcane
synchronous-only loading iframe. Server side seems like a good idea until you
realise how many bots there are on the internet (which the business doesn't
care about), how hard it is to implement (one of the best and worst things
about client-side tracking is that on a lot of sites it can be implemented and
modified outside the deployment cycle) and how much information you get
through these pixel calls.
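
To make the bot problem concrete, here is a deliberately naive sketch of the kind of filter server-side counting needs. The substring list and the name `isLikelyBot` are assumptions for illustration; crawlers that send a browser-like User-Agent pass straight through, which is part of why this is hard.

```javascript
// Assumed heuristic: substrings that often appear in crawler User-Agents.
const BOT_PATTERN = /bot|crawler|spider|curl|wget/i;

// Treats a missing User-Agent as suspicious too.
function isLikelyBot(userAgent) {
  return !userAgent || BOT_PATTERN.test(userAgent);
}

const agents = [
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  "Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0",
  "curl/7.55.1",
];
console.log(agents.filter((ua) => !isLikelyBot(ua)));
```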

There is a massive lack of technical talent in this field that companies are
screaming out for. I'm very happy with it as a career. If you are interested
in some advice on getting into it send me an email. Also happy to answer any
questions.

~~~
ramblerman
Not to be rude, as I am genuinely interested in this area. But is there much
more to it than adding a GA tracker and setting up the dashboard?

~~~
jackgolding
Depends on the role. The bad roles generally are just that (report monkey and
a bit of project management around implementation). The tooling is either
Adobe Analytics or Google Analytics at most companies. There are 3
specialisations as I see it: data, outbound marketing and inbound marketing.

Data is to do with the limitations of GA and Adobe. Businesses want to use hit
or session level data for segmentation, retargeting, personalisation and all
the fancy machine learning stuff. This isn't easy to achieve with out of the
box analytics tools so it requires a bit more data processing.

Inbound marketing is to do with conversion rate optimisation, SEO and UX.
Questions like "is our left hand navigation better or worse than it was
before?" aren't super easy to answer in a commercial context.

Outbound marketing is to do with return on marketing investment for paid
advertising. Attribution modelling is a huge space which is VERY hard to
figure out for a business.

------
chrismorgan
Your static images could do with high-DPI versions of them (via `srcset="…
2x"` and the likes), or using SVG, so that they’re not fuzzy.

~~~
StavrosK
Good call, they do look weird on my high DPI screen. I'll replace them with
better-looking alternatives, thank you.

------
imron
I was just looking for something to replace Google Analytics with the other
day.

GoAccess looks great.

~~~
tombrossman
Here's a how-to I wrote recently, which includes automating GeoIP look ups
plus 'Referer Spam' blocking with simple cronjobs:
[https://www.tombrossman.com/blog/2017/faster-and-more-accura...](https://www.tombrossman.com/blog/2017/faster-and-more-accurate-analytics-with-goaccess/)

------
SimeVidas
So the author assumes that the reader doesn’t use ad blockers, nor incognito
mode for porn. Who is this post’s audience? Fox News viewers? Yes, websites
are bloated with ads and trackers, but we have tools to eliminate them.

~~~
StavrosK
This post's audience is the 3000 people that showed up reading it in my Google
Analytics dashboard.

------
JumpCrisscross
> _You’ll notice my hypocrisy in having social buttons at the bottom of this
> post...the Facebook button likes the page directly, which can’t be
> substituted with a simple link, but I encourage you to use Privacy
> Badger..._

This nukes the argument. At least for this page alone, trust your users to
copy and paste a URL to content they like.

~~~
StavrosK
You're saying it doesn't matter if it's as easy as possible for people to
share content?

EDIT: Reword.

~~~
JumpCrisscross
Not right after claiming such buttons "do nothing for the user that a simple
link wouldn’t" and assess their cost in "tracking and slowness". You
apparently think they do _something_ good. Otherwise you wouldn't include
them.

My complaint isn't with your using social media buttons. (I hated them enough
to block them; they no longer bother me.) It's in using them right after
claiming they are useless and parasitic.

~~~
StavrosK
The argument was "social buttons are bad for privacy. The ones that don't do
anything a link wouldn't are especially egregious". The Like button doesn't
fall in the latter category, as it does something a link won't.

