I work in website analytics. Cloudflare's numbers are laughably bad. Really, they are not even tracking visits, they are tracking the number of times the link is requested. So every time the HN page gets updated, that link is getting clicked by the dozen or so bots that trawl that page.
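To make the requests-versus-visits gap concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Request` record, the `BOT_MARKERS` substrings, and the sample log are made up, and this is not how Cloudflare or any particular analytics package actually counts.

```python
from dataclasses import dataclass

# Hypothetical substrings for spotting bots; real detection needs a maintained list.
BOT_MARKERS = ("bot", "crawler", "spider")


@dataclass(frozen=True)
class Request:
    ip: str
    user_agent: str
    path: str


def is_bot(user_agent: str) -> bool:
    """Crude check: does the user agent contain a known bot marker?"""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def count_requests(requests, path):
    """Edge-style counting: every request for the path counts."""
    return sum(1 for r in requests if r.path == path)


def count_likely_visits(requests, path):
    """Unique IPs requesting the path whose user agent does not look like a bot."""
    return len({r.ip for r in requests if r.path == path and not is_bot(r.user_agent)})


# Made-up sample log (documentation-range IP addresses).
SAMPLE = [
    Request("203.0.113.5", "Mozilla/5.0 (X11; Linux x86_64)", "/post"),
    Request("203.0.113.5", "Mozilla/5.0 (X11; Linux x86_64)", "/post"),
    Request("198.51.100.7", "ExampleBot/1.0", "/post"),
    Request("198.51.100.8", "some-crawler/2.1", "/post"),
    Request("192.0.2.44", "Mozilla/5.0 (Macintosh)", "/post"),
    Request("192.0.2.44", "Mozilla/5.0 (Macintosh)", "/about"),
]

print(count_requests(SAMPLE, "/post"))       # raw requests for the page
print(count_likely_visits(SAMPLE, "/post"))  # deduplicated, bot-filtered
```

In this toy log the raw count is more than double the filtered one, and bots that fake a browser user agent would still slip through.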
It's a remarkably bad way to do tracking. At the very least, a page view shouldn't be tracked until the page is finished loading for the user. Otherwise you are collecting garbage data.
But fwiw, the specific number of page views is never that important in the grand scheme. You should really be looking at trends, and so long as you pick a number that is consistently measured as your baseline, you can leave well enough alone.
>Really, they are not even tracking visits, they are tracking times the link is requested. So every time the HN page gets updated, that link is getting clicked by the dozen or so bots that trawl that page.
Very true. I do think metrics are always a balance between precision and ease of collection, and counting IP addresses is definitely towards the ease of collection side. Regarding the bots crawling HN submissions, that would be backed up by an even higher rate of Cloudflare:Squarespace visitors for this submission (about 10:1 vs about 8:1).
>You should really be looking at trends, and so long as you pick a number that is consistently measured as your baseline, you can leave well enough alone.
Thanks for the perspective here - a kind reader sent me a similar message via email as well! The nice thing is that I don't have particular goals for my personal blog, so these analytics numbers feel more like interesting trivia than some OKR metric that I need to increase somehow.
> a page view shouldn't be tracked until the page is finished loading for the user
Isn't this a little more client-driven than Cloudflare would have visibility into, unless we narrow "page load" down to something like "probably read all of the input stream"?
I think that's the thing - Cloudflare's metrics are flawed in their premise.
If you owned a store at a mall, there are many decent thresholds for constituting foot traffic. But you should probably at least count people who step into the store and not just people who stop and look at it.
The advice I give more and more to my clients is to stop focusing on absolute numbers. Instead, pick a decent analytics package and use it to monitor changes over time. Focus on the relative, not the absolute. (You still need to be careful to make sure your changes over time aren’t because of an increase/decrease in bots.)
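A quick Python sketch of why the relative view holds up even when the absolute numbers are off. The weekly counts and both "tools" are invented for illustration, and the sketch assumes the inflated tool overcounts by a constant factor, which is exactly what stops being true if bot traffic changes over time.

```python
def relative_change(baseline: float, current: float) -> float:
    """Fractional change of `current` relative to `baseline` (0.25 means +25%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline


def trend(weekly_counts):
    """Express each week relative to the first week, which serves as the baseline."""
    baseline = weekly_counts[0]
    return [relative_change(baseline, count) for count in weekly_counts]


# Hypothetical weekly page views from two tools measuring the same site.
# tool_b is assumed to overcount by a constant factor of 8.
tool_a = [1000, 1100, 1250]
tool_b = [8000, 8800, 10000]

print(trend(tool_a))
print(trend(tool_b))
```

Both lists come out identical, so either tool tells the same story as long as it is measured consistently.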
Similarly, focus on a few core metrics and don’t go down the rabbit hole that something like Google Analytics tends to pull you into. Usually those metrics are a few basics (uniques) plus a small number of domain/operation specific conversion metrics.
Early Plausible customer here. Happy so far; the most important feature for me is the ability to proxy myhost.com/randomstats.js to their server to circumvent blocking.
These new-gen solutions all end up looking and working the same, so it's mostly about which UI you prefer. Can't do magic without cookies and stuff.
I use GoatCounter, self-hosting the script and proxying the metrics endpoint through my website's server. This way there are no 3rd party assets to block. Also, there's a noscript image tag in case JS is disabled that just lets me know how many times the page was accessed. The script filters out most bots, but the noscript tag probably causes a bunch of bots to be counted.