From Google Analytics to Fathom (jeffgeerling.com)
172 points by geerlingguy on Feb 9, 2019 | 58 comments

> Since the mid-2000s, right after it became available, I started using Google Analytics for almost every website I built (whether it be mine or someone else). It quickly became (and remains) the de-facto standard for website usage analytics and user tracking.

What I ask everyone to consider is whether they actually need analytics at all. I think in many cases the answer is "well, no, but it's nice to have…". Sure, it would be nice to know how many people visit my personal website, but I don't think it's worth sending extra JavaScript to people for, or dealing with the risk that I may be collecting more information than I actually want.

This is why I'm more comfortable with Fathom; I have removed GA from sites in the past when I realized I hadn't logged in to view stats for years. But for a few sites it is very helpful to have some data, especially to see if there's an older article getting a lot of consistent traffic (I'll brush it up and make sure it's more current).

I know most people don't care to look back much, and leave old content to rot... but most of the pages with over 100,000 lifetime visits are pages from 3+ years ago (like this ridiculously short post on how to sync a shared Google Calendar with Calendars on macOS/iOS[1]).

[1] https://www.jeffgeerling.com/blogs/jeff-geerling/sync-shared...

> But for a few sites it is very helpful to have some data, especially to see if there's an older article getting a lot of consistent traffic (I'll brush it up and make sure it's more current).

Usually my solution for this issue is to use people emailing me about things as a proxy for whether I should go back to something, and I think it works pretty well (though I don't have the data to compare, ironically ;) ). If something's useful to many people but wrong/broken, eventually someone is nice enough to let me know. This has the benefit of making user consent very explicit, since they can choose exactly how much they want to "share" with me.

Also, I'm enjoying the @media (prefers-color-scheme: dark) support on your website! I usually call these out when I see them, since they're still rare, but it was before sunset when I first commented so I wasn't using Dark Mode then.

Ha, thanks for noticing! I enjoy browsing in Safari Technology Preview more often than not lately because more tech sites are starting to adopt a dark mode color scheme.

There are still a couple of bugs I need to work out with code display in a few spots (mostly because of some old color-unaware plugins I use), but it was pretty quick and easy; I wrote up my process here[1].

[1] https://www.jeffgeerling.com/blog/2018/jeff-geerlingcom-now-...

Yeah, this basically mirrors my experience converting my own website. I browse exclusively in Safari Technology Preview, so I'm always pleasantly surprised when I come across websites that support the feature. (Interestingly, it seems like Twitter has a dark mode implementation, but only on their mobile website (?!); their normal desktop website, which is the only place where the browser feature is actually supported, doesn't have it.)

A number of the large ad networks I work with requested Google Analytics data for review when going through my application.

If you ever want to sell your site, having a decade of Google Analytics is valuable.

Yes, you could use a different analytics software or logs for the above. However, since Google Analytics is the standard, it's familiar, trustworthy, and performs equally across all sites. For example, Google Analytics shows I have about 1,000,000 page views a day. I have Apache blocking at least 30 user agents from common types of bots and scripts. Even so, my database logs show about 10,000,000 page views a day getting through to my site. That's a pretty large difference in reporting. This is why if someone is buying or analyzing a site, they want to compare the same source, such as Google Analytics, across all properties they're considering.

This is a problem I see - that no two analytics engines actually agree on the numbers, and none of them agree with my server logs.

I just flat don't trust GA, on anything. I can't trust it to give me the right numbers, so how do I know the rest of the data is actually correct, and what is the point of slowing my site down to have it?

I'll have a look at Fathom. If its numbers agree with my server logs then I'll think about adding it in. I can see the usefulness, but I'd prefer to just analyse the logs I'm already keeping (and maybe expand them to include user behaviour on the client).

If you ever want to sell your site, a decade of data is valuable. But the relevant part is when you start growing. If you have a couple of visitors and just maintain the site for fun (like my personal blog and some handy tools that I and a few others use), the data from that period is not going to be very interesting anyway. Once it starts growing and you start thinking "maybe this site has value for many people", then the analytics is more than -- as your parent comment says -- "well, no, but it's nice to have…". I don't think you're disagreeing: it's valuable if you want to sell, but if it's only nice to have and not necessary, you probably also don't have the kind of site worth selling.

So in conclusion, I support not putting analytics on sites that don't need it. And if it turns out you need it, most of it can be reconstructed from access logs (perhaps not the user's screen resolution, but definitely rough visitor counts). We have enough bloated pages already. But of course, if you already expect that you might need the data later, it's a different kind of site.
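To make the "rough visitor counts from access logs" point concrete, here is a minimal Python sketch. It assumes the server writes the common "combined" log format (the Apache/nginx default in many setups) and treats each distinct (IP, user agent) pair per day as one visitor, which is only an approximation. The function name and sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed input: "combined" log format lines. Only the fields we need are named.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def daily_visitors(lines):
    """Map each day to a rough visitor count based on successful requests."""
    seen = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m["status"].startswith("2"):
            # An (IP, user agent) pair is a crude stand-in for "one visitor".
            seen[m["day"]].add((m["ip"], m["agent"]))
    return {day: len(pairs) for day, pairs in seen.items()}

# Hypothetical sample data, not taken from any real site.
SAMPLE = [
    '203.0.113.5 - - [09/Feb/2019:10:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [09/Feb/2019:10:01:00 +0000] "GET /b HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [09/Feb/2019:11:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Safari"',
    '198.51.100.7 - - [10/Feb/2019:09:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Safari"',
    '192.0.2.9 - - [10/Feb/2019:09:30:00 +0000] "GET /nope HTTP/1.1" 404 0 "-" "curl/7.64"',
]

if __name__ == "__main__":
    print(daily_visitors(SAMPLE))
```

Shared IPs (offices, mobile carriers) and bots will skew this in both directions, so treat the numbers as trends rather than exact counts.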

I can’t imagine selling saagarjha.com ;)

Stats were very important to me and other people I worked with.

Up until the time that google started hiding the keywords people used to find the sites / pages.

Now stats are not very helpful for us really. I do occasionally scroll through them and look for anomalies. Like heavy errors or subdomains or something as an entry point for example.

However, just getting the basic server stats with awstats or analog or whatever is plenty. There is no need for third-party JavaScript and such.

Without the keywords matching up to entry pages, it's mainly cpu usage and error codes that are worth checking. ommv.

I'm currently enjoying Goaccess [0]. It's command-line based, works on my nginx logs and tells me exactly the things I want to know. Might not be for everyone but I thought I'd share in case someone else found it useful.

I wholeheartedly support Fathom's endeavour and will definitely use them if I go back to JS-based analytics.

[0] - https://goaccess.io/

Such solutions don’t work if your server is behind a proxy like Cloudflare, since most requests will be served from cache.

And btw, this implies that you’re keeping logs on your visitors for other purposes than security. Under GDPR that’s not OK.

A good analytics solution will anonymize the data. Including Google Analytics if you configure it properly.

The GDPR question depends on what the logs include. If you don't include IP (or anonymize it) and don't include sensitive data (like userids/cookies or POST data) then it should be okay under GDPR. You can for example have one log for security (fail2ban and so on) that includes IP and one for GoAccess that does not.
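As a rough illustration of keeping a second, IP-anonymized log for analytics, here is a small Python sketch. It assumes the client address is the first space-separated field of each log line, and it picks /24 (IPv4) and /48 (IPv6) as the truncation prefixes; those prefix lengths and the function names are my assumptions, and whether truncation is sufficient is a legal question this snippet does not answer.

```python
import ipaddress

def anonymize_ip(address: str) -> str:
    """Zero the host bits, keeping a /24 for IPv4 and a /48 for IPv6 (assumed prefixes)."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)

def anonymize_line(line: str) -> str:
    """Rewrite the first field of a log line; non-IP values are replaced with '-'."""
    first, sep, rest = line.partition(" ")
    try:
        first = anonymize_ip(first)
    except ValueError:
        first = "-"  # e.g. a resolved hostname; drop it rather than keep it
    return first + sep + rest

if __name__ == "__main__":
    print(anonymize_line('203.0.113.77 - - [09/Feb/2019:10:00:00 +0000] "GET / HTTP/1.1" 200 512'))
```

The security log (for fail2ban and the like) would keep the full address, while only the output of `anonymize_line` would be written to the analytics log.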

GoAccess has an option to anonymize IPs: --anonymize-ip.

While it looks 'cool', it only gives you the basics. Besides, you need to write some custom scripts to make sure all the logs are continuously fed into it across process restarts, and you need to filter out bots with grep.

Also, last time I used it, it didn't have an easy way to filter by date period, so it's only good for trends over the entire log span, which is not really useful by itself.
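For what it's worth, the bot and date filtering can be done in a small pre-filter before the log reaches any report tool. The Python sketch below assumes "combined" format lines with the user agent as the last quoted field; the list of bot hints is a made-up starting point and nowhere near complete.

```python
import re
from datetime import date, datetime

TIME_FIELD = re.compile(r'\[([^\]]+)\]')      # e.g. [09/Feb/2019:10:00:00 +0000]
AGENT_FIELD = re.compile(r'"([^"]*)"\s*$')    # last quoted field on the line
BOT_HINTS = ("bot", "crawler", "spider", "curl", "python-requests")  # assumed list

def keep(line: str, start: date, end: date) -> bool:
    """True if the request falls inside [start, end] and does not look like a bot."""
    t = TIME_FIELD.search(line)
    a = AGENT_FIELD.search(line)
    if not t or not a:
        return False
    when = datetime.strptime(t.group(1), "%d/%b/%Y:%H:%M:%S %z").date()
    agent = a.group(1).lower()
    return start <= when <= end and not any(hint in agent for hint in BOT_HINTS)

# Hypothetical sample lines.
SAMPLE = [
    '203.0.113.5 - - [09/Feb/2019:10:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '192.0.2.9 - - [09/Feb/2019:10:05:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '203.0.113.5 - - [01/Jan/2019:08:00:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for line in SAMPLE:
        if keep(line, date(2019, 2, 1), date(2019, 2, 28)):
            print(line)
```

The filtered lines can then be written to a file or piped into whatever tool produces the report.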

Fathom has custom date periods now. And yes, Fathom is meant to give the basics that most non-pro users require. It's a one-page dashboard.

I'm talking about GoAccess.

I was originally taking a look at that a few months ago, but since I have a few other small sites I also wanted to switch over (some of which aren't even on hosts I control), I decided to give Fathom a go.

Honestly, it would be best if there were an easy way to aggregate the logs into something like Fathom or GoAccess from any server with a little daemon running... but well, that's how I do it for larger sites with an EFK-based cluster. It just takes a lot more resources and maintenance!

> Before that you basically had web page visit counters (some of them with slightly more advanced features ala W3Counter and Stat Counter), and then on the high end you had Urchin Web Analytics

Before that you had access log summarizers such as analog:


Goaccess is a nice modern implementation of that: https://goaccess.io/

> Google Analytics is free, but Google uses the gobs of data about Internet usage it gets from tracking half the Internet for its own purposes

Does anyone have a source on this? I attended a talk from someone from Google Analytics a few years back, and they claim that Google doesn't use any data they get from analytics for ads targeting. Not sure if their policies have changed.

Their Privacy Policy: https://policies.google.com/privacy?hl=en

> And we also use data about the ads you interact with to help advertisers understand the performance of their ad campaigns. We use a variety of tools to do this, including Google Analytics. When you visit sites that use Google Analytics, Google and a Google Analytics customer may link information about your activity from that site with activity from other sites that use our ad services.

> Google and a Google Analytics customer may link information about your activity from that site with activity from other sites that use our ad services.

To clarify: Linking your Google Analytics account with your Google Ads account is an explicit feature that the property owner can do to get a fuller view of the conversion funnel.

As far as I know (having worked on that team several years ago),

> they claim that Google doesn't use any data they get from analytics for ads targeting.

was otherwise factual. (That is, Google didn't willy-nilly take GA data and use it for its own purposes, or even cross-contaminate it in any way.)

Though I don't know what might have changed in recent years.

This [1] happened in 2016. Google can change its privacy policy, but your data is still on their servers. The problem is not just a matter of what Google says about privacy, but a matter of trust, and Google has broken that many times already.

[1]: https://www.propublica.org/article/google-has-quietly-droppe...

I now use both Fathom and Google analytics.

Most of the time, the only thing I care about is where people come from, and which page gets accessed most often. I feel this should already be enough information for one to understand what is a popular topic.

I'm actually still super surprised that Fathom's javascript code is 1.6kb. Can't it be lighter?

"where people come from, and which page gets accessed most often" - I get this info from the server stats with something like awstats or analog stats (sometimes webalizer),

no google scripts, no extra javascript at 1.6k or less - much lighter, more privacy, faster.

Are these server-side stats tools not an option with your hosting? (They come included on some of my hosts; I had to wrangle to get one onto a FreeBSD host we run.)
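If no stats package is available but raw access logs are, "where people come from" and "which page gets accessed most often" take only a few lines of Python. This is a sketch assuming "combined" format logs; `own_host` is a hypothetical parameter used to drop internal referrers, and the sample lines are invented.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Path, status and referrer of GET requests in a "combined" format line (assumed).
REQUEST = re.compile(
    r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)

def top_pages_and_referrers(lines, own_host, limit=10):
    """Return the most requested paths and the most common external referrer hosts."""
    pages, referrers = Counter(), Counter()
    for line in lines:
        m = REQUEST.search(line)
        if not m or m["status"] != "200":
            continue
        pages[m["path"]] += 1
        host = urlsplit(m["referrer"]).hostname  # None for "-" or an empty referrer
        if host and host != own_host:
            referrers[host] += 1
    return pages.most_common(limit), referrers.most_common(limit)

# Hypothetical sample lines.
SAMPLE = [
    '203.0.113.5 - - [09/Feb/2019:10:00:00 +0000] "GET /post-a HTTP/1.1" 200 512 "https://news.ycombinator.com/" "Mozilla/5.0"',
    '198.51.100.7 - - [09/Feb/2019:10:01:00 +0000] "GET /post-a HTTP/1.1" 200 512 "https://news.ycombinator.com/item?id=1" "Mozilla/5.0"',
    '198.51.100.7 - - [09/Feb/2019:10:02:00 +0000] "GET /post-b HTTP/1.1" 200 512 "https://example.com/post-a" "Mozilla/5.0"',
    '192.0.2.9 - - [09/Feb/2019:10:03:00 +0000] "GET /post-a HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(top_pages_and_referrers(SAMPLE, own_host="example.com"))
```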

I use github pages, so I can't gather such statistics.

I'm still trying to find an affordable, reliable, privacy-conscious analytics tool for a blog.

I don't want to run my own server. I'm willing to spend money. I'd like to spend ~$5/mo. I write less than 6 posts per year. A good post gets tens of thousands of lifetime views. A mediocre post gets five to ten thousand. My server burden is pretty bare bones.

All I really want to know are page views and referrers. My next experiment is to use a custom "tracking pixel" served from S3 and use log tools. But I have no idea if webcrawlers will distort my data or not. This would at least allow me to reliably count page views even for people who use ad-blockers.

Edit: Blog is static design, hosted on Netlify free tier. https://www.forrestthewoods.com/

I’m really curious about this too, as I have very similar requirements, and also host on Netlify (where ‘server logs’ don’t exist). The possibilities I’ve come up with so far are:

- Hosting Fathom using Docker someplace affordably. I’ve tried this with Joyent, where I’ve been running a custom Docker container for ~1.5 months with no problems. It should be about $2.50/month, but they offered $250 in free credits for signing up (not sure if this expires at some point), so it hasn’t cost me anything yet.

- Using a very limited custom script with Keen.io to log only the timestamp, site visited, and referrer. It seems that Keen.io has a pretty good privacy policy, and my site’s traffic and my views of my dashboard seem to only cost about $0.01 to $0.02 per month so far, and I haven’t actually been charged that.

Happy to share more detail on the above, and curious to hear if anyone has other ideas.

I have pretty much the same requirements as you - ~5$/mo is the amount I'd be willing to spend for a personal blog.

Another alternative I discovered recently is https://simpleanalytics.io/ . At $9/mo it's still above my threshold, but it's a little bit cheaper than fathom. I guess I'll have to keep looking for an alternative.

Well, maybe that's just the price we have to pay for the privacy, aka not being a product ourselves ;)

How are you hosting your blog? If you're already self-hosting that, please know that you can easily host a Fathom instance using SQLite (because easy) on that same server without having to worry about it eating up your system resources.

From experience, a Fathom instance on a $5 DigitalOcean droplet can easily handle millions of pageviews a month. It's really quite efficient.

Static blog hosted for free on Netlify.

Does the Fathom script get picked up by ad-blockers? That's one of my concerns with any off-the-shelf solution. They all involve a one-line JavaScript injection which is trivial to block.

If your blog were WordPress-powered I would suggest the "WP Statistics" plugin, the one I finally found that stays self-hosted rather than third-party.

but I look at your .xyz and the source looks like it's something much leaner powering your blog there, so I don't think this one will work with that system.

I use awstats on some non-wp sites and it works server side / backend, giving plenty of details including most visited pages and referrers. It also sorts many of the bots into different tables and such. I think it's free / gnu. ( https://awstats.sourceforge.io/ )

Do you have access logs? Just parse those. You probably already have the analytics.

I’m pretty sure you can set up access logs for static sites on S3

Piwik isn't bad!

For those who haven't been following its development, it's been renamed Matomo[1]. Definitely an option worth considering for deeper analytics while still being a bit more performant and respectful of privacy!

[1] https://matomo.org

Thanks for sharing this summary. I had matomo in my notes, and was just wondering how it compares to Fathom.

If there are any more detailed comparisons, I'm all ears.

Good to know!

I tried it once on my potato server (which, at the time, handled the HN front page just fine with php5.4, which is not as fast as php7, and 3 mysql queries per pageload) and it was super slow. If you run Wordpress, it might not make a difference. But if you run something low-powered, Piwik will kill your server.

Edit, sort of: I should note that this was years ago, maybe four years or so. I still use the same server so it probably only got slower, but on the off chance that it actually got faster... you might give it a go.

Been enjoying countly on a DO droplet for $5/mo

Finally wrote up how I did it and what I learned about the different ways of doing it.


Countly needs more advertising or evangelism. I only recently heard of it on HN. It may have been easier if they had monthly plans rather than going straight to enterprise.

Fathom has a one-click on DO now too, for $5/mo. Although if countly is working for ya, I wouldn't switch either.

Likewise. Using Countly on a DO instance.

I track conversion rate on all my sites without JS, for free, and without hitting a DB on each request. I just download my access logs and parse them with a 5 line script.
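This is not the commenter's actual script, but a sketch of what such a log-based conversion count could look like in Python. It assumes "combined" format logs, identifies visitors by IP alone, and uses hypothetical `landing` and `goal` paths ("/" and "/thanks").

```python
import re

# Client IP, path and status of GET requests in a "combined" format line (assumed).
REQUEST = re.compile(r'^(?P<ip>\S+) .*?"GET (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def conversion_rate(lines, landing="/", goal="/thanks"):
    """Share of visitors to `landing` who also reached `goal` (both paths are placeholders)."""
    visitors, converted = set(), set()
    for line in lines:
        m = REQUEST.match(line)
        if not m or m["status"] != "200":
            continue
        if m["path"] == landing:
            visitors.add(m["ip"])
        elif m["path"] == goal:
            converted.add(m["ip"])
    return len(visitors & converted) / len(visitors) if visitors else 0.0

# Hypothetical sample: four visitors land on "/", one of them reaches "/thanks".
SAMPLE = [
    '203.0.113.1 - - [09/Feb/2019:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.2 - - [09/Feb/2019:10:01:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.3 - - [09/Feb/2019:10:02:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.4 - - [09/Feb/2019:10:03:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.2 - - [09/Feb/2019:10:09:00 +0000] "GET /thanks HTTP/1.1" 200 128 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(conversion_rate(SAMPLE))
```

Counting by IP undercounts visitors behind shared addresses, so the ratio is a trend indicator more than an exact figure.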

I don’t see the point in tracking anything but conversion. I don’t need to know screen size or browser breakdown, because my websites are responsive, and work in all browsers since IE8.

For those using Google Analytics and the like: what are you actually tracking and taking action on?

Browser breakdown is the one feature that I'd like from a contemporary analytics package that appears to be missing from Fathom, simply because I'd like some slightly smarter way of checking which feature set I can get away with using; I've been keen to roll out a fully CSS Grid layout on a couple of my own sites. It appears from my archaic analytics that I'm not quite able to yet, sadly. (I gutted GA from everything around GDPR time last year because, you're quite right, what DO I gain from it, fundamentally and actionably?)
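A crude browser breakdown can also be pulled from the user agent strings already sitting in access logs. The Python sketch below uses simple substring checks; the token list and family names are my own simplification, user agent sniffing is notoriously unreliable, and the order of checks matters because, for example, Chrome's string also contains "Safari".

```python
from collections import Counter

# Checked in order: more specific tokens first (simplified, assumed token list).
FAMILIES = [
    ("Edge", ("Edge/", "Edg/")),
    ("Opera", ("OPR/", "Opera")),
    ("Chrome", ("Chrome/",)),
    ("Safari", ("Safari/",)),
    ("Firefox", ("Firefox/",)),
    ("IE", ("MSIE ", "Trident/")),
]

def browser_family(agent: str) -> str:
    """Classify a user agent string into a rough browser family."""
    for family, tokens in FAMILIES:
        if any(token in agent for token in tokens):
            return family
    return "Other"

def breakdown(agents):
    """Return each family's share of the given user agent strings."""
    counts = Counter(browser_family(a) for a in agents)
    total = sum(counts.values())
    if not total:
        return {}
    return {family: n / total for family, n in counts.items()}

CHROME = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.96 Safari/537.36"
SAFARI = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_3) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0.3 Safari/605.1.15"
FIREFOX = "Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0"
IE11 = "Mozilla/5.0 (Windows NT 10.0; Trident/7.0; rv:11.0) like Gecko"

if __name__ == "__main__":
    print(breakdown([CHROME, CHROME, FIREFOX, IE11]))
```

The share of the "IE" bucket is the number to watch when deciding whether a CSS Grid layout is safe to ship without fallbacks.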

Does anyone have any experience with Open Web Analytics? It looks very feature rich and it doesn’t have matomo’s annoying business model


I recently looked at all the open-source and self-hosted analytics systems I could find. OWA felt very cumbersome, no native docker support, and I wasn't even sure if it was actively maintained. I ended up going with Matomo and haven't encountered any issues with their business model yet.

Unless I missed it here or elsewhere, GA is the only option that integrates event tracking. That's really important, especially in js-powered UIs. I so want to ditch GA, but I so need (integrated) event tracking.

We're working on this for Fathom (I'm the cofounder of Fathom).

If you want to leave Google, but are not opposed to proprietary infrastructure, you could use something else with event tracking, like HEAP (https://heapanalytics.com/), which is very good. Or if you prefer open source, Matomo seems like a good alternative, with both hosted and self-hosted options (open source, https://matomo.org).

Just curious, did you notice any change in your Google SERP rankings after the move?

So far, no. And hopefully it doesn’t affect SERPs, because that would seem like some sort of antitrust violation, no?

I've had Fathom on my sites for almost a year, no difference in ranking that I've seen.

For those not in the know, you can configure Google Analytics to:

1. not share data with Google for their own purposes

2. anonymize IPs

3. reduce the max age of their tracking cookie or disable it completely

GA is also compliant with GDPR if configured properly. At least one DPA thinks so.

Self-hosting of an open source solution is great, but I trust Google more than I can trust the hosting provided by startups. That's because Google is under constant scrutiny, whereas the startups working on these alternatives are not. Of course, you can make your visitors feel good about not using Google Analytics and get some free publicity, but that's another issue.

3 posts on the same topic within 24 hours :(
