You don't need analytics on your blog (yossarian.net)
67 points by woodruffw 4 months ago | 73 comments



Knowing that people read the stuff I write is the only motivation I have for writing at all. If all my posts are seen only by bots, then the hours I spend making them are pointless.

I really enjoy the lightweight analytics I use for my personal blog. I self-host Plausible; it has no cookies or persistent tracking. Just user counts per page, device types, and stuff like that. There are zero valid privacy concerns from this setup.

I get a ton of satisfaction from seeing a spike in traffic to some post and then tracking down the tweet/reddit post/etc. that caused it.


I always viewed comments as the true decider, for me personally. A view could be anything, but someone taking the time to leave a comment or reply to somebody else is a great motivator for further writing.

But I'm one of the "a blog without comments is not a blog" folks.


It's 2023, just federate your blog and pull in comments from across the ActivityPub spectrum. If you're into low effort, just webmention.io it, else there are premade solutions for WordPress, 11ty, Jekyll, and many others.

Comments are so 2003 pre-Trackback/Pingback.


I don't have comment fields on my blog. Comments often feel low effort and noisy. Also no analytics.

Figure if someone wants to reach me there is email, and if someone wants a discussion, they can post the article here or on reddit or whatever...


Depends a lot on the niche. A lot of people are just readers on tech blogs, and only comment if you made a mistake. But if you have a personal blog, the most random opinion can get dozens of comments even with a fraction of the readers.


Different strokes for different folks, I guess.

I've been blogging for a decade and I never bothered with analytics of any kind.

Sure, it is a good feeling if you reach the front page of HN, but that's not why I blog. I do it for me, regardless of whether anybody reads it.


> Knowing that people read the stuff I write is the only motivation I have for writing at all. If all my posts are seen only by bots, then the hours I spend making them are pointless.

For me, it varies. If my aim is to make an impact around me, then the more traffic there is, the better. However, sometimes just clarifying my own thoughts is a good enough reason for writing. Anything beyond that is a positive side effect.


Interesting, I write just to have those ideas in the wild, not necessarily that anyone would actually read them. I will say that I have a private "blog" or journal as well as my public one. My private one is usually for day to day musings while if I find an interesting enough idea, I polish it up and "promote" it to the public blog on my website.


You can just use the logs of your webserver for that...


where you can't tell if they're all bots


Some (most?) of them announce they're bots with their user agent.
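
To make that concrete, here is a minimal sketch of filtering self-declared bots out of an access log. It assumes Nginx's default "combined" log format, where the user agent is the last quoted field. The marker list and the function name are illustrative assumptions, not a complete crawler list, and this says nothing about bots that pretend to be browsers.

```python
import re

# Assumes the "combined" log format: request, status, size, referer, user agent.
LINE_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical marker list; real crawlers use many more strings than this.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_self_declared_bot(line: str) -> bool:
    """Return True if the line's user agent contains a known bot marker."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return False  # unparseable line: make no claim about it
    agent = match.group("agent").lower()
    return any(marker in agent for marker in BOT_MARKERS)
```

Counting only the lines where this returns False gives a rough "probably human" number, which is about as good as the log alone can do.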


I find GoAccess the best solution for this if you host your website or web app on a VPS.

No client side analytics, tracking, or cookies at all!

GoAccess generates the same or similar dashboards as Plausible, without having to host or manage any additional self-hosted services.


I think the point was you don't need client side analytics (i.e. the "putting analytics on a page" type). Your server logs are more than enough to get all the information you're talking about.


The first thing they say is "I parse the access log". So they are also checking how many people are visiting.


Bit of a clickbait title. Should be changed to "You don't need an off the shelf client-side analytics script".


Same, but I use self-hosted Umami.


Could you describe how you set this up?


Not the above commenter but I used Coolify on Hetzner to self host Plausible and other services, works great.

https://Coolify.io


>the only way I track traffic trends on this blog is with vbnla, a hacky little Ruby script that summarizes my Nginx access log.

This is literally analytics. I didn't read the rest of the post, so if you're arguing against things like GA, fine, I'm with you. But any kind of analytics, even GA, ultimately does the same thing, just with a ton more complexity and additional tracking sources.

FWIW, this is what I've used for a long time: https://news.ycombinator.com/item?id=31841051


The post is about browser analytics; this is in the second sentence. It's possible you skipped that to get to the part about the script.

(I disagree that this adds any complexity or tracking sources. It's a 70 line script that took me about half an hour to write, and has required no significant changes since I wrote it ~5 years ago.)


Not the original commenter, but I've never seen the term "browser analytics" used before. That certainly doesn't help. Maybe "client-side analytics" would've been clearer.


This demonstrates my overall ignorance of analytics terminology :-). I'll add a footnote clarifying this, thank you.


Re: added complexity, I was referring to analytics solutions like Google Analytics: it adds a ton of sophisticated tracking stuff, but is ultimately a server (well, more like several thousand) somewhere with logs registering visits, not so fundamentally different from your script. I also do a very similar thing (also in Ruby!). Apologies for the lack of clarity, English can be so ambiguous.


No problem at all, thank you for clarifying!


> I removed them back in 2017, with no discernible negative impact on my writing motivation

In other words, “I don’t need analytics on my blog”? Sure, some people don’t need it, and sure focusing on analytics can have downsides, but why assume or project your assumptions on others? Maybe the slightly dramatic or hyperbolic title is something the author learned by watching analytics or seeing what other popular blog titles do… ;)

For anyone who wants to reach a wider audience, analytics can be great. Speaking from experience trying to write a blog for a startup -- analytics will show you that the things you care about aren’t the things your audience cares about. :) Analytics can help you learn what topics engage your audience. They can also show you how when you go deep and technical on things, hardly anyone reads to the end or engages. But! Sometimes those kinds of posts will have much more staying power in the long run. Personally analytics helped me bring more balance and variety to my posts, and not shun the deep dives necessarily, just avoid overdoing them. That was nice in a way because it made writing easier; I thought I needed to do a long high quality piece every time, and that assumption was wrong.


It's a post about a personal blog; I don't think the claims in it necessarily apply to a corporate or startup blog, where business interests may trump what (in my opinion are) healthy personal blog-writing habits.


Sure, though the title doesn’t say ‘your personal blog’, so if it was meant for a subset of blogs, that’s another way it’s making assumptions and could have been more specific… but then it wouldn’t have people reacting to it! ;) Personally, I think my point is valid for personal blogs too. Writers who would like to grow their audience can benefit from seeing the analytics.

I don’t know what healthy writing habits are… I suspect my HN commenting habits aren’t super healthy. ;) It’s definitely possible to focus too much on analytics, my point is simply that saying “you” don’t need them at all is perhaps silly. They’re a useful tool, one of many.


The second sentence of the post says that it's about personal blogs.

You're right that "I don't need analytics on my personal blog" would be a strictly more descriptive title. But I don't write blog posts for titles, I write them for their content, expecting people to actually read the content and contextualize it as necessary. One reasonable form of contextualization would be a value judgment like "actually, I do care about reader numbers" and ignoring the rest of the post's prescriptions.


You’re right. This is all in fun, the post is being a little cheeky. My crit is too. I don’t write posts for the titles either, but you know as well as I do that titles are especially important. There’s nothing wrong with trying to be a little intriguing or dramatic with a title, that is one of the valid functions of a title, but I think it’s fair game to call it out when the title is misleading or inaccurate or makes assumptions, no? The fact of the matter is there are plenty of people who like seeing analytics on their personal blogs.


Don’t need ‘em. Sure enjoy having ‘em. I use self-hosted Matomo to see what people are reading. And much like profiling any program I’ve written, the hotspot is always something different from what I’d have expected.

Having analytics doesn’t change what I write. I still enjoy the information.


Right, but the point was you don't need on-page analytics for that. Seeing what people read (and when, and from where) is already in your self-hosted server logs, or is one "enable server logs" setting away.


How can you know what they've read inside an article? Sometimes I'll embed Hotjar and similar services into my site just to see the heatmap of what people have moused over or which text they've selected. Still looking for an open source, self-hosted version though.


A far better question is: what does knowing let you do? Does it tell you how to be a better writer? Not really. Does it tell you what people want to read? Also not really, because you have no way to separate the "people spending time on a paragraph because it's high quality information" from "people spending time on a paragraph because it'll make for a good tweet quote".

If you care about your writing, then your aim is for people to read your whole post, not just parts of it. And you might think you want to know how much time people spent on your page (i.e. what your bounce rate is) but even that tells you nothing: what are you going to do with that information? And that's not rhetorical: do you actually work on your writing based on whether people scroll past the fold? Because if you do, I'd love to hear what your workflow around that is.

Unless you're trying to monetize your readership through ad placements, analytics like these are a little lie we like to tell ourselves matter, when really they don't. They're nice to look at, but eye candy isn't a good enough reason to run tracking and behavioral profiling on a blog.


From the initial parent comment which also serves as my answer to your question:

> And much like profiling any program I’ve written, the hotspot is always something different from what I’d have expected. Having analytics doesn’t change what I write. I still enjoy the information.

I like seeing the hotspot so I know what to focus on in the future. It's really no different than in startups, let's say you build features A, B, and C, but most people are focusing on C, knowing that info ensures you better serve your users. For a blog, your user is your reader. You might say server level analytics might work, but just as in startups, it assumes that there is only one feature (or for blogs, only one idea) per page. Hence, intra-page analytics can be useful.


This to me is intrusive and creepy. I'm glad I can block Hotjar easily, and I don't want to have to check for self-hosted scripts doing the same thing.


Not sure you can block self hosted Hotjar type scripts if I just embed them into the JS bundle itself for the site. You could try disabling JS but most people won't, and I could also disallow text being rendered if there's no JS. Not that I would do any of this of course, since I like no JS sites as well as SEO for my blog posts, but it's a hypothetical of what one could potentially do.


Blocking XHR requests by default would probably be a good start before 1st party JS. I already default to blocking 3rd party content.


That's true, depends on the kind of blog though, and whether it's a SPA or not.


Because Hotjar and other such things are incredibly invasive?


Well, it's my site and I can host whatever (legal) content I want on it, it's akin to having cameras in my store. Similarly, people can choose not to visit the site.


Embedding intrusive tracking code that is invisible to the average user is not akin to a visible camera in a store, at all.


How is it different? Cameras can be invisible in a store too and often are.


It's a common understanding that CCTV exists. It's not a common understanding that this level of user tracking is happening.

And being in public visiting a store vs. reading a website in the privacy of your own home are clearly very different situations.


Maybe it's not commonly understood but that is what is happening regardless. Being at home when reading content versus in public doesn't make any analogical sense because the internet is not a physical space, so it should have no bearing on your expectation of privacy if you visit someone else's site.


It's your analogy about shopping in a store vs. reading the web [anywhere, including in private].

>but that is what is happening regardless

So the justification is "everyone else is doing it" - OK.


Same with cameras in stores then, there is no expectation of privacy when entering a store, and similarly, there is none when visiting someone's server and website.


As already stated: most people are unaware of and do not expect this level of intrusive behaviour tracking. I don't think I can be any clearer on this point.

Enough going around in circles.


I'm just stating that the justification is not "because everyone else does it," because you could say the same about stores. That is why I mentioned that they are not "clearly very different situations" as you said, they are the same situation; one is going to a place that is not theirs so they have no expectation of privacy. Just because one is in cyberspace and the other is in meatspace doesn't mean that they are necessarily different in analogy.

That people know about this type of tracking with stores and do not with websites is not my problem, and at best, they should be educated about that, and indeed we have laws against this exact thing, such as CFAA, and no judge will let you off the hook just because you said it was in cyberspace and not meatspace that you invaded someone else's servers. Therefore, both are analogously the same.


Not your problem and someone else should educate people about what you're actively choosing to do, got it.


That is correct, people not knowing about CCTV is not the problem of the store. At best, they could have a sign stating so but are not obligated to do so.


Or, more likely, all of your cameras get disabled by uBlock.


It actually would be quite difficult for adblockers to block 1st party scripts if I just embed the tracking into my JS bundle. At that point the adblocker would have to deobfuscate the code and block selective parts of the JS.


Only if you self host or proxy all of the tracking requests. Otherwise, the tracking will occur but get trapped in the browser. The moment your code tries to send the request to hotjar or any other domain, it's blocked.


Yes, hence why I said in my initial comment that I'm looking for a self hosted version, mostly because Hotjar is bloated and has unnecessary amounts of extraneous JS that slows down the page.


Thing is, with Matomo you can use on-page analytics and server logs. That’s what I do. If I’ve gone through all the trouble of setting it up, feeding logs into it, and making it viewable, it’s practically no extra work to add the on-page stuff. There’s no drawback to adding that bit of JS, even if it only adds a little bit of extra information.


I reached a similar conclusion with my own blog[0] and run a small Rust process that parses access logs.

However, I could only find a need to know which posts are popular and which give 404 errors, nothing else. I'd argue that for a big chunk of personal blogs, collecting general geographic data and/or traffic sources is neither useful nor justified.
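
For anyone curious what that boils down to, here is a rough Python sketch (not my actual Rust code) that tallies successful hits and 404s per path. It assumes the access log uses the common "combined" request layout; the function name and regex are illustrative assumptions.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it,
# e.g. "GET /blog/a HTTP/1.1" 200
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def summarize(lines):
    """Return (hits, missing): per-path counts of 2xx responses and 404s."""
    hits, missing = Counter(), Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that don't look like GET/HEAD requests
        path, status = match.group("path"), match.group("status")
        if status == "404":
            missing[path] += 1
        elif status.startswith("2"):
            hits[path] += 1
    return hits, missing
```

Calling hits.most_common(10) on the result gives the popular posts, and missing shows which URLs are broken. No IP addresses, referrers, or user agents are kept.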

[0] https://tiuraniemi.org/blog/site-setup#_statistics


One of my 2024 goals is to write more and better. In 2023 I've written more than ever, but I've focused more on visits than on quality. This year I want to focus on getting better at writing. This article has convinced me to turn the analytics off, or at least to not check them as frequently as I do now.


This was a nice and short essay.

I think that it can be hugely motivating to see your essays reaching people. However, as the author pointed out, it can also be demoralizing if the ratings are lower than expected.

With that in mind, I don't think we should stop traffic analysis. The issue seems to lie in our initial expectations. I plan to apply some analytics, but I don't necessarily anticipate high ratings.

Also, the author probably isn't advising us to completely discard analytics. He actually provides us with a simplified method for it. I think what he is trying to do is suggest that we should not obsess over it too much. I agree that if analytics demotivate us from writing, it might be best to remove them completely.


If one wants server-side analytics with a little more info than the author's "hacky little script", there's always goaccess [1], which functions in broadly the same way. I even use it with Firebase Hosting-hosted sites via [2] (which I wrote).

[1] http://goaccess.io/

[2] https://github.com/Silicon-Ally/gcp-clf


The worst offender for this I ever saw was an analytics company using their SDK on their blog. It was generating 1 request a second, logging absolutely stupid metrics (for a blog), and my local DNS resolver was filled with >100k requests from a tab that was not even in the foreground!

Took a little bit of effort tracking down the tab generating this traffic and now they are banned from my local network forever.


You make a good point. I just removed GA from my blog.


Instead of the simple Ruby script, I think GoAccess is better. It's a simple command line tool available in nearly all package managers. You can simply pipe your logs to it and get an ncurses based summary of the site.

zcat -f /var/log/nginx/access.log* | goaccess --log-format=COMBINED -


I don’t have any on my blog at all. I just let go of any interest in knowing.


Of course you don't.

You don't need your blog either.

But analytics are a fun toy to mess around with, much like your blog is.


You don't need tracking, period. You don't need to know who, or even if anyone, reads your blog.


I think there are good logistical reasons to have logging; I enumerated some of them in the post.

Besides logistical reasons, I think there's very little value in refusing to acknowledge that many people find attention to be a useful motivator (even if it isn't the best or right motivator). The point of the post was to communicate that you can track a blog's attention without relying on third-party services or hosting your own heavyweight analytics solution.


Note that IP addresses are personal information, so if you're logging them like this, you'll need to get consent (or have some other legal basis) and also handle deletion requests for EU users. One of the many joys of GDPR compliance.
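
One common mitigation, if you only need rough counts, is to truncate addresses before storing or summarizing them. The sketch below is an illustration, not legal advice: the /24 and /48 prefix lengths and the function name are assumptions, and whether truncation is enough for your situation is a separate question.

```python
import ipaddress


def truncate_ip(raw: str) -> str:
    """Zero the host part of an address: /24 for IPv4, /48 for IPv6."""
    ip = ipaddress.ip_address(raw)
    prefix = 24 if ip.version == 4 else 48  # assumed prefix lengths
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)
```

With this, truncate_ip("203.0.113.77") returns "203.0.113.0", so distinct visitors on the same network collapse into one bucket.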


On a personal blog? Good luck enforcing that. If I host the thing myself and your IP address is reaching into my network to read my blog, you have no expectation that I'm not going to record the data if I care to.


These logs get rotated out after a week, and fall well within the "purpose limitation" clause of the GDPR.


I log traffic. I use log rotate with a 30 day rotation. No one can complain because it's gone within a GDPR compliant window of 30 days.


Or, you know, it's a personal blog: you don't, because access request IPs are part of how TCP/IP works (it's literally in the name) and cannot identify a person unless your website has a user management system and logs IPs in tandem with session ID activity so that the IP can be resolved to "a person".

This is one of those "if someone wants their IP scrubbed, they can request it, but lol no" situations: for blogs run by regular folks instead of services run by businesses, just pointing to "it's in the GDPR" is not enough. The GDPR gets a LOT of things intentionally wrong so it can go after businesses that gather PII in order to monetize it.


cgnat


Taking this opportunity to tell you about my open-source side project: I've built a very simple event-based analytics solution some time ago. You can simply host it yourself and track basic events. It's not possible to track any personal information, not even IP: https://github.com/shafy/fugu



