
My Messy Analytics Breakup - johannes1813
https://www.digitalinklingsblog.com/my-messy-analytics-breakup/
======
XCSme
Great article! Your dashboard is not ugly, it's special :)

I also broke up with Google Analytics, but I didn't sacrifice the insights, I
even gained more. I created a self-hosted analytics[0] platform to replace
Google Analytics which is not as complex, but powerful enough to give you all
the data that you need. I also had the fear of missing out if I removed GA,
but I realized that I only use a small subset of the data they store, and after
implementing those in my dashboard I no longer found the need to check out GA.
Today I rely solely on userTrack and removed all Google 3rd party includes
from my site (just yesterday I uploaded the Google fonts files to my server,
so I no longer include them from Google's domain). Another surprising benefit
is that my dashboard loads a lot faster than the GA dashboard.

[0]: [https://www.usertrack.net/](https://www.usertrack.net/)

~~~
kohtatsu
[https://dashboard.usertrack.net/sites/usertrack.net/visitors...](https://dashboard.usertrack.net/sites/usertrack.net/visitors/playback/7165)

Collecting scroll and mouse movements is enough to build a fingerprint on
people. It's also creepy, this kind of stuff needs to be opt-in. You could
record first and only send to the server after they've granted permission with
a clear dialog like "can we send your mouse movements and page interaction to
the webmaster?"
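
The "record first, send only after permission" idea can be sketched as a small
buffer. This is only an illustration of the logic, not how userTrack works:
the class name, the `send` callback and the `max_events` cap are all
hypothetical, and in a real page the same logic would live in the browser-side
tracking script rather than in Python.

```python
from typing import Callable, Dict, List

Event = Dict[str, float]  # e.g. {"t": 0.2, "x": 120.0, "y": 340.0}


class ConsentGatedBuffer:
    """Holds interaction events locally until the visitor opts in."""

    def __init__(self, send: Callable[[List[Event]], None], max_events: int = 500):
        self._send = send              # hypothetical uploader, called with a batch
        self._max_events = max_events  # cap so an undecided visitor costs little memory
        self._pending: List[Event] = []
        self._consented = False

    def record(self, event: Event) -> None:
        if self._consented:
            # Consent already given: forward immediately.
            self._send([event])
            return
        self._pending.append(event)
        if len(self._pending) > self._max_events:
            # Drop the oldest event once the cap is exceeded.
            self._pending.pop(0)

    def grant(self) -> None:
        """Visitor clicked "yes": upload what was buffered, then stream."""
        self._consented = True
        if self._pending:
            self._send(self._pending)
            self._pending = []

    def deny(self) -> None:
        """Visitor clicked "no": discard everything, nothing leaves the page."""
        self._pending = []
```

Nothing reaches `send` until `grant()` is called, and `deny()` throws the
buffered events away.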

No idea on how that plays into GDPR, but you'll want to take that into account
with something like this.

Overall it's better than Google having all that data, and congrats on building
something cool, I like it minus the minutiae that you capture.

~~~
XCSme
> Collecting scroll and mouse movements is enough to build a fingerprint on
> people.

I never thought of this. I do see the potential for fingerprinting, but I
don't think it actually works, as currently the mouse position and scroll are
sampled only every ~200ms, so you just get some random positions, not enough
to generate an accurate fingerprint. Plus it would require a lot of data and
ML, which I highly doubt would be worth the effort.
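
To make the ~200ms figure concrete, here is a rough sketch of that kind of
sampling. It is not userTrack's actual code: the function name and the
`(timestamp_ms, x, y)` tuple layout are assumptions for illustration, and the
input is assumed to be ordered by time.

```python
from typing import Iterable, List, Optional, Tuple

Sample = Tuple[int, int, int]  # (timestamp_ms, x, y), hypothetical layout


def downsample(samples: Iterable[Sample], interval_ms: int = 200) -> List[Sample]:
    """Keep at most one sample per interval_ms; input must be time-ordered."""
    kept: List[Sample] = []
    last_kept_at: Optional[int] = None
    for sample in samples:
        timestamp_ms = sample[0]
        # Keep the first sample, then only those at least interval_ms later.
        if last_kept_at is None or timestamp_ms - last_kept_at >= interval_ms:
            kept.append(sample)
            last_kept_at = timestamp_ms
    return kept
```

Everything that happens between two kept samples is discarded, which is why
the stored trail is a coarse outline rather than the full movement.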

> This kind of stuff needs to be opt-in

As I mentioned in the other comment, you can display an opt-in dialog if you
want to. Some related info: I don't know if you've heard of Hotjar before
(you probably have, as their ads are everywhere), but it was on something like
25% of the Alexa top 100 sites and on over 500k sites, and probably all of
them just bundle the consent with the other cookies or don't show any
information at all. I think the problem is that GDPR mostly refers to tracking
and personally identifiable data, and all those movements, heatmaps and
actions are not really enough to identify a person.

My current opinion about this: Although I agree it feels creepy, as a user I
don't really care if my actions are tracked on the website I go on, if there's
no connection made to my person or to other websites I visited. Also, tracking
mouse movement feels more creepy, but tracking all the content that you see
and buttons/links that you click on in order to show targeted ads is probably
worse. I think the big difference is that once you go to site X, you expect
the site to get some information about your usage on their site (what pages
you visit, what information was useful for you, where you got stuck on the
page) in order to improve your experience and for them to improve conversions,
but you don't expect for another 3rd party to get all this info about you and
use it for other purposes such as advertising or selling of personal
information.

~~~
kohtatsu
I'm happy to see you're putting this much thought into it, I appreciate it a
lot.

I think it is a dozen orders of magnitude better than 3rd party services
considering it's self hosted, which mostly nullifies fingerprinting concerns.
I firmly believe opt-in should be required for the scrolling and movements,
but I understand the climate isn't there yet.

Thanks for taking time to consider privacy, making it a priority, and taking
the time to respond here. I reckon you're well on the good side of the fight
for privacy just by decentralizing this data.

------
stevekemp
Ironically I decided to ditch analytics largely because I wanted a faster
site. Google's PageSpeed Insights page shows my site gets 100/100 on
mobile/desktop, which was not the case when I had analytics available.

Of course such measurements are meaningless, but I had already come to the
conclusion that I was adding analytics snippets to my sites as a cargo-cult
thing; I never actually viewed the stats, and when I wanted to get a quick
feel of the level of traffic I was receiving I'd just tail the server-log.
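
For anyone who wants a bit more than `tail`, a quick feel for traffic can be
had from a few lines of Python. This is a sketch that assumes the access log
is in the common/combined log format used by default in Apache and nginx; the
function name and the choice to count only 2xx responses are mine.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches the quoted request and the status code, e.g. "GET /post HTTP/1.1" 200
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) [^"]*" (\d{3})\b')


def top_paths(lines: Iterable[str], limit: int = 5) -> List[Tuple[str, int]]:
    """Count successful requests per path and return the busiest ones."""
    hits: Counter = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        # Only count 2xx responses so 404 scans do not look like popularity.
        if match and match.group(2).startswith("2"):
            hits[match.group(1)] += 1
    return hits.most_common(limit)
```

Passing an open log file to `top_paths` works, since a file object iterates
line by line; the log location depends on your server setup.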

These days with CDNs, caches, and similar, I don't really have a single
specific server-log to look at. But if the site is up, I know it has become
unexpectedly popular (because it was featured somewhere) when I get
spontaneous emails from strangers.

------
pabs3
There are open source analytics tools, Matomo is one of them:

[https://matomo.org/](https://matomo.org/)

~~~
xellisx
This is what Piwik ended up renaming itself to. I used Piwik 10 years ago, and
I remember it being nice.

------
pachico
I am building a web analytics platform that in theory could be a decent
substitute for GA. I'd like to share some notes with you.

In addition to all you have mentioned, GA has a problem now called adblockers.
It's actually not a problem for them as much as it is for website owners.
Depending on the country, independent agencies say adblockers are present in
more than 30% of browsers. I can't imagine what kind of business intelligence
tool provides that amount of error and is still considered ok.
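
To put a number on that error, here is a back-of-the-envelope sketch. It rests
on two simplifying assumptions that are mine, not measured facts: every
adblocker blocks the analytics script, and blocking users otherwise behave
like everyone else.

```python
def estimate_true_visits(observed_visits: int, blocked_share: float) -> float:
    """Scale observed visits up by the share of visitors the script never saw."""
    if not 0.0 <= blocked_share < 1.0:
        raise ValueError("blocked_share must be in [0, 1)")
    # observed = true * (1 - blocked_share)  =>  true = observed / (1 - blocked_share)
    return observed_visits / (1.0 - blocked_share)
```

With a 30% blocked share, 7,000 reported visits would correspond to roughly
10,000 real ones, so the report would be missing close to a third of the
audience.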

------
therestapi
And at the bottom of the page I see a Google Captcha. Facepalm...

~~~
johannes1813
This is fair criticism. Captcha is more of a stopgap measure to fix input
field spam until I can get around to finding a better way.

~~~
putsjoe
This may be a better alternative (no affiliation):

[https://www.hcaptcha.com/](https://www.hcaptcha.com/)

------
johannes1813
Does anyone know what is wrong with HN timestamps? I saw that there was a lot
of traffic to that article in the last few hours (using my own analytics
service of course), and the article is back on the HN homepage with a
timestamp that says 4 hours ago, but I posted it yesterday. The comments
below also have incorrect timestamps.

~~~
Tomte
Submissions the mods find interesting, but that aren't making it to the front
page, sometimes get a second chance.

They get posted in /new (or even at the bottom of the front page) again, with
a modified timestamp.

------
robin_reala
I guess the next step is to dump analytics completely and save your dopamine
for the things that matter. It’s one thing if you’re a large organisation
making iterative changes to a service based on usage data from analytics, but
what changes are you going to make to your blogging based on your analytics?

~~~
Aeolun
Write more posts?

------
njhaveri
Been using Netlify's (server-log-based) analytics offering with my site. It
feels maybe a little expensive for what it is, and definitely doesn't have
many bells/whistles, but it's been a good-enough alternative.

------
tarun_anand
Thanks for writing this. Appreciate the insights. We have taken the same
stance when building our privacy-preserving analytics and engagement tool,
appICE.

Do keep posting and we would be happy to share your story with our customers.

------
visaals
the things people do for that sweet, sweet analytics dopamine...

------
buro9
Recently I've been thinking there must be a better way.

I want rich insights from a session angle, but I mostly want to know
performance timings and what my end users experience. How can I improve my web
apps to increase engagement, and where are the issues on my website as seen
from a user's perspective?

My current thought (not yet implemented) is to take several existing things
and to glue them together.

1\. Boomerang
[https://github.com/akamai/boomerang](https://github.com/akamai/boomerang) is
a Real User Monitoring JS include that is open source and comes from the Yahoo
and Akamai performance teams

2\. Grafana Cloud
[https://grafana.com/products/cloud/](https://grafana.com/products/cloud/) has
Loki and Prometheus for log consumption and metrics respectively and now has a
self serve plan of $50 per month (but I think this is reasonable as the data
storage is managed for you and you could run many instances of RUM monitoring
via this to spread the cost reasonably)

3\. Write a simple server to receive the Boomerang requests and log each
request in full, structured detail. At the same time increment Prometheus
metrics. Let Loki and Prometheus then put that into Grafana Cloud.
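
Step 3 could look roughly like the sketch below, using only the Python
standard library. Several things here are assumptions rather than anything
tested against Boomerang: that beacons arrive as GET requests with query-string
parameters, that `t_done` carries the page load time in milliseconds, and the
`/beacon` path, port and `rum_` metric names, which are made up. A real setup
would more likely use an official Prometheus client library, and a log shipper
such as Promtail to get the JSON lines into Loki.

```python
import json
import sys
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict
from urllib.parse import parse_qs, urlparse

METRICS: Counter = Counter()  # in-memory counters, reset on restart


def record_beacon(query: str, metrics: Counter,
                  write: Callable[[str], object] = sys.stdout.write) -> Dict[str, str]:
    """Log one beacon as a JSON line and update the counters."""
    params = {key: values[0] for key, values in parse_qs(query).items()}
    write(json.dumps(params, sort_keys=True) + "\n")  # structured log line
    metrics["beacons_total"] += 1
    load_ms = params.get("t_done", "")  # assumed: page load time in ms
    if load_ms.isdigit():
        metrics["page_load_ms_sum"] += int(load_ms)
        metrics["page_load_ms_count"] += 1
    return params


def render_metrics(metrics: Counter) -> str:
    """Render the counters as Prometheus text exposition lines."""
    return "".join(f"rum_{name} {value}\n" for name, value in sorted(metrics.items()))


class BeaconHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        url = urlparse(self.path)
        if url.path == "/beacon":
            record_beacon(url.query, METRICS)
            self.send_response(204)  # nothing to return to the browser
            self.end_headers()
        elif url.path == "/metrics":
            body = render_metrics(METRICS).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("127.0.0.1", port), BeaconHandler).serve_forever()
```

Calling `serve()` starts the receiver; Prometheus would then scrape
`/metrics`, and because every site sends the same Boomerang fields, the same
Grafana dashboard could be reused across sites.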

Then it's just a case of using Grafana to configure a dashboard to offer
whatever views, alerts, analysis that you want.

The thing I like about this is that no matter what the website the inputs are
the same (whatever Boomerang provides), and so the dashboards are inherently
shareable so that others can use them.

