I know most people don't care to look back much, and leave old content to rot... but most of the pages with over 100,000 lifetime visits are pages from 3+ years ago (like this ridiculously short post on how to sync a shared Google Calendar with Calendars on macOS/iOS).
Usually my solution for this issue is to use people emailing me about things as a proxy of whether I should go back to something, and I think it works pretty well (though, I don't have the data to compare, ironically ;) ). If something's useful to many people but wrong/broken, eventually someone is nice enough to let me know. This has the benefit of making user consent very explicit, since they can choose exactly how much they want to "share" with me.
Also, I'm enjoying the @media (prefers-color-scheme: dark) support on your website! I usually call these out when I see them, since they're still rare, but it was before sunset when I first commented so I wasn't using Dark Mode then.
There are still a couple of bugs I need to work out with code display in a few spots (mostly because of some old color-unaware plugins I use), but it was pretty quick and easy; I wrote up my process here.
If you ever want to sell your site, having a decade of Google Analytics is valuable.
Yes, you could use a different analytics software or logs for the above. However, since Google Analytics is the standard, it's familiar, trustworthy, and performs equally across all sites. For example, Google Analytics shows I have about 1,000,000 page views a day. I have Apache blocking at least 30 user agents from common types of bots and scripts. Even so, my database logs show about 10,000,000 page views a day getting through to my site. That's a pretty large difference in reporting. This is why if someone is buying or analyzing a site, they want to compare the same source, such as Google Analytics, across all properties they're considering.
I just flat don't trust GA, on anything. I can't trust it to give me the right numbers, so how do I know the rest of the data is actually correct, and what is the point of slowing my site down to have it?
I'll have a look at Fathom. If its numbers agree with my server logs then I'll think about adding it in. I can see the usefulness, but I'd prefer to just analyse the logs I'm already keeping (and maybe expand them to include user behaviour on the client).
So in conclusion, I support not putting analytics on sites that don't need it. And if it turns out you need it, most of it can be reconstructed from access logs (perhaps not the user's screen resolution, but definitely rough visitor counts). We have enough bloated pages already. But of course, if you already expect that you might need the data later, it's a different kind of site.
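As a rough illustration of reconstructing page views and visitor counts from access logs, here is a minimal sketch. It assumes the server writes the common "combined" log format (the usual Apache/nginx layout); the `summarize` helper and the idea of treating an IP plus user agent as one visitor are my own assumptions, not something any tool in this thread does.

```python
import re
from collections import Counter

# Assumed input: "combined" log format lines, e.g.
# 203.0.113.5 - - [10/Oct/2019:13:55:36 +0000] "GET /post HTTP/1.1" 200 2326 "https://example.com/" "Mozilla/5.0"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Return (page views per path, rough unique visitor count)."""
    views = Counter()
    visitors = set()
    for line in lines:
        m = LOG_LINE.match(line)
        # Skip unparseable lines and anything that is not a successful GET.
        if not m or m["method"] != "GET" or m["status"] != "200":
            continue
        views[m["path"]] += 1
        # IP plus user agent is only a rough stand-in for "one visitor".
        visitors.add((m["ip"], m["agent"]))
    return views, len(visitors)
```

Something like `summarize(open("access.log"))` would then give per-page counts and a ballpark visitor number; screen resolution and similar client-side details are indeed not recoverable this way.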
Up until the time that Google started hiding the keywords people used to find the sites / pages.
Now stats are not very helpful for us really. I do occasionally scroll through them and look for anomalies. Like heavy errors or subdomains or something as an entry point for example.
However, just getting the basic server stats with awstats or analog or whatever is plenty. There is no need for third-party JavaScript and such.
Without the keywords matching up to entry pages, it's mainly CPU usage and error codes that are worth checking. YMMV.
I wholeheartedly support Fathom's endeavour and will definitely use them if I go back to JS-based analytics.
 - https://goaccess.io/
And btw, this implies that you’re keeping logs on your visitors for other purposes than security. Under GDPR that’s not OK.
A good analytics solution will anonymize the data. Including Google Analytics if you configure it properly.
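For logs you keep yourself, one common form of anonymization is truncating the IP address before it is stored. A minimal sketch, assuming you control the logging pipeline; the `anonymize_ip` name and the /24 and /48 prefix lengths are illustrative choices of mine, not a legal standard:

```python
import ipaddress


def anonymize_ip(address, v4_prefix=24, v6_prefix=48):
    """Zero the host part of an IP address so it no longer identifies one machine.

    The prefix lengths are assumptions for illustration; choose values that
    match your own privacy requirements.
    """
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)
```

Whether truncation alone is enough under GDPR is a legal question, not a technical one, so treat this as a starting point only.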
Also, the last time I used it, there was no easy way to filter by date period, so it's only good for trends over the entire log span, which isn't very useful on its own.
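One possible workaround, sketched here as my own assumption rather than a feature of any tool mentioned, is to pre-filter the log by date and feed only the matching lines to the summarizer. This assumes combined-log-style timestamps such as `[10/Oct/2019:13:55:36 +0000]`; `lines_between` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Assumed timestamp layout: [10/Oct/2019:13:55:36 +0000]
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def lines_between(lines, start, end):
    """Yield log lines whose timestamp falls in the half-open range [start, end).

    start and end must be timezone-aware datetimes. Month names are parsed
    with %b, which expects English abbreviations under the default C locale.
    """
    for line in lines:
        m = TIMESTAMP.search(line)
        if not m:
            continue
        try:
            when = datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue  # bracketed text that is not a timestamp
        if start <= when < end:
            yield line
```

The filtered lines can be written to a temporary file or piped into whatever log tool you already use.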
Honestly, it would be best if there were an easy way to aggregate the logs into something like Fathom or GoAccess from any server with a little daemon running... but well, that's how I do it for larger sites with an EFK-based cluster. It just takes a lot more resources and maintenance!
Before that you had access log summarizers such as analog:
Does anyone have a source on this? I attended a talk from someone from Google Analytics a few years back, and they claim that Google doesn't use any data they get from analytics for ads targeting. Not sure if their policies have changed.
> And we also use data about the ads you interact with to help advertisers understand the performance of their ad campaigns. We use a variety of tools to do this, including Google Analytics. When you visit sites that use Google Analytics, Google and a Google Analytics customer may link information about your activity from that site with activity from other sites that use our ad services.
To clarify: Linking your Google Analytics account with your Google Ads account is an explicit feature that the property owner can do to get a fuller view of the conversion funnel.
As far as I know (having worked on that team several years ago),
> they claim that Google doesn't use any data they get from analytics for ads targeting.
was otherwise factual. (That is, Google didn't willy-nilly take GA data and use it for its own purposes, or even cross-contaminate it in any way.)
Though I don't know what might have changed in recent years.
Most of the time, the only thing I care about is where people come from, and which page gets accessed most often. I feel this should already be enough information for one to understand what is a popular topic.
Are these server-side stats tools not an option with your hosting? (They come included on some of my hosts; it took some wrangling to get one onto a FreeBSD host we run.)
I don't want to run my own server. I'm willing to spend money. I'd like to spend ~$5/mo. I write less than 6 posts per year. A good post gets tens of thousands of lifetime views. A mediocre post gets five to ten thousand. My server burden is pretty bare bones.
All I really want to know are page views and referrers. My next experiment is to use a custom "tracking pixel" served from S3 and use log tools. But I have no idea if webcrawlers will distort my data or not. This would at least allow me to reliably count page views even for people who use ad-blockers.
Edit: Blog is static design, hosted on Netlify free tier. https://www.forrestthewoods.com/
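On the crawler question: a rough sketch of counting page views and referrers from pixel-request logs while skipping obvious bots. It assumes the log lines have already been parsed into `(path, referrer, user_agent)` tuples; the marker list and function names are hypothetical, and user-agent matching only catches crawlers that identify themselves.

```python
from collections import Counter
from urllib.parse import urlsplit

# Heuristic substrings only; well-behaved crawlers announce themselves,
# but scripts that fake a browser user agent will slip through.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_probable_bot(user_agent):
    ua = user_agent.lower()
    # An empty user agent is treated as a bot too.
    return not ua or any(marker in ua for marker in BOT_MARKERS)


def count_views_and_referrers(hits):
    """hits: iterable of (path, referrer, user_agent) tuples."""
    views = Counter()
    referrers = Counter()
    for path, referrer, user_agent in hits:
        if is_probable_bot(user_agent):
            continue
        views[path] += 1
        # Group referrers by host; "-" or an empty value has no host part.
        host = urlsplit(referrer).netloc
        if host:
            referrers[host] += 1
    return views, referrers
```

Comparing the counts with and without the bot filter would at least show how much crawlers are distorting the numbers.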
Another alternative I discovered recently is https://simpleanalytics.io/ . At $9/mo it's still above my threshold, but it's a little bit cheaper than fathom. I guess I'll have to keep looking for an alternative.
Well, maybe that's just the price we have to pay for the privacy, aka not being a product ourselves ;)
- Hosting Fathom using Docker someplace affordably. I’ve tried this with Joyent, where I’ve been running a custom Docker container for ~1.5 months with no problems. It should be about $2.50/month, but they offered $250 in free credits for signing up (not sure if this expires at some point), so it hasn’t cost me anything yet.
Happy to share more detail on the above, and curious to hear if anyone has other ideas.
From experience, a Fathom instance on a $5 DigitalOcean droplet can easily handle millions of pageviews a month. It's really quite efficient.
but I look at your .xyz and the source looks like it's something much leaner powering your blog there, so I don't think this one will work with that system.
I use awstats on some non-WP sites and it works server side / backend, giving plenty of details, including most visited pages and referrers. It also sorts many of the bots into different tables and such. I think it's free / GNU. ( https://awstats.sourceforge.io/ )
I’m pretty sure you can set up access logs for static sites on S3.
If there are any more detailed comparisons, I'm all ears.
I don’t see the point in tracking anything but conversion. I don’t need to know screen size or browser breakdown, because my websites are responsive, and work in all browsers since IE8.
For those using Google Analytics and the like: what are you actually tracking and taking action on?
1. not share data with Google for their own purposes
2. anonymize IPs
3. reduce the max age of their tracking cookie or disable it completely
GA is also compliant with GDPR if configured properly. At least one DPA thinks so.
Self-hosting of an open source solution is great, but I trust Google more than I can trust the hosting provided by startups. That's because Google is under constant scrutiny, whereas the startups working on these alternatives are not. Of course, you can make your visitors feel good about not using Google Analytics and get some free publicity, but that's another issue.