
Can someone explain the attraction of embedded JS for analytics? What exactly does it buy you versus log parsing?

Log parsing seems like the logical choice for the static site crowd but it seems like there's little interest there. I must be missing something.

I can tell you in one word: bots.

Any site is constantly being accessed by bots, only some of which announce themselves in the user agent. Some are deliberately designed to mimic human browsing, and you can only tell by carefully following their access pattern.
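To make the user-agent point concrete, here is a minimal Python sketch that flags only the bots that announce themselves. It assumes the logs are in the common Apache/nginx "combined" format; the token list and the sample lines are made up for illustration, and anything that fakes a browser user agent passes straight through.

```python
import re

# Assumed layout: the "combined" access log format
# (host, identity, user, [time], "request", status, bytes, "referer", "user agent").
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical token list; real crawlers use many more names than this.
BOT_TOKENS = ("bot", "crawler", "spider")


def is_declared_bot(line: str) -> bool:
    """Return True only when the user agent admits to being a bot."""
    match = COMBINED.match(line)
    if match is None:
        return False  # unparseable line: make no claim about it
    agent = match.group("agent").lower()
    return any(token in agent for token in BOT_TOKENS)


# Made-up sample lines, one self-identified crawler and one browser-like client.
sample_lines = [
    '203.0.113.5 - - [10/Oct/2021:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Oct/2021:13:55:40 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Firefox/93.0"',
]
declared = [is_declared_bot(line) for line in sample_lines]
```

The second line is classified as "not a declared bot", which says nothing about whether a human sent it. That gap is the whole problem.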

Filtering the bots out of the logs automatically seemed like too much work, so I implemented a JavaScript beacon. I figure anyone without JS probably doesn't want to be tracked anyway.
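One way to picture the beacon approach from the server side: the script on the page requests a tiny URL with the page path attached, and you count only those requests. The Python sketch below is an assumption about how that could look, not the setup described above; the `/b.gif` path and the `p` query parameter are hypothetical names.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

BEACON_PATH = "/b.gif"  # hypothetical beacon endpoint requested by the page's JS


def count_beacon_views(request_lines):
    """Count page views from beacon requests only, ignoring every other hit."""
    views = Counter()
    for request in request_lines:
        parts = request.split()
        if len(parts) < 2:
            continue  # malformed request line
        url = urlsplit(parts[1])
        if url.path != BEACON_PATH:
            continue  # a direct page or asset fetch, possibly a bot without JS
        # "p" is an assumed parameter carrying the path of the page being viewed.
        page = parse_qs(url.query).get("p", [None])[0]
        if page:
            views[page] += 1
    return views


# Made-up request lines: one direct fetch and three beacon hits.
requests = [
    "GET /about HTTP/1.1",
    "GET /b.gif?p=%2Fabout HTTP/1.1",
    "GET /b.gif?p=%2F HTTP/1.1",
    "GET /b.gif?p=%2Fabout HTTP/1.1",
]
views = count_beacon_views(requests)
```

Clients that never execute the script never request the beacon, so they drop out of the count without any pattern analysis.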

At the very least, static sites on places like GitHub Pages need client-side analytics, as you cannot access their logs.

The most obvious one I miss (I currently only use server-side analytics) is something like screen size. You can do user-agent sniffing to _guess_ the size of a mobile device, but it doesn't tell you whether you can stop wasting time making your content responsive for a tiny screen that no one uses anymore.

Are you talking about mobile? Sure, there are fewer pixels, but the size of my phone (i.e. the size of my pocket) is between 15 (laptop) and 25 (desktop screen) times smaller in area. My pocket will be smaller than my arms or my desk, so a different site presentation will be in order.

You can probably use UA Client Hints to do this; but it requires some custom-fu and doesn't work in all browsers.

You can do this with only media queries in CSS, most likely.

The disadvantage there is that sometimes you want to give the user a totally different site if their client is mobile. CSS media queries are indiscriminate in that a smaller browser window may trigger the “mobile” CSS. Likewise, many tablets have similar screen sizes to some laptops, yet often you don’t want to present the same UI to a tablet and a laptop.

Out of curiosity, what are the options for log parsing? The ones I know are Awstats and Webalizer. Is there any more up to date or more modern alternative to these two?

GoAccess is great nowadays.

Matomo supports log parsing.

There are 100x more bots that don’t run JS than bots that do, so the presence of a running JS environment is a helpful filter. User agent is not a good substitute.

So, so many bots. 100x might be hyperbole but only just.

We host websites and the bots are super annoying, because even the well-behaved ones throttle requests per domain, which means they just hit all of our customers at once. If our cache architecture were a little more rotten, like I’ve seen on other jobs, then bot-driven evictions would get ugly, instead of just spiking our traffic, increasing our overhead, and making it harder to get clear metrics.

I agree with you; however, the only popular static site host (other than running your own VPS or whatever) that I've seen add an analytics solution is Netlify. I use it and I'm pretty happy with it so far, but it's very barebones and pretty expensive ($9/month for just the analytics).

A few reasons have been pointed out by others, but let me include another.

Others have pointed out:

- Client-side SPAs sometimes don't hit server logs

- Some static sites are hosted in places where you don't have access to the logs (GitHub Pages, Netlify, etc.)

- Bots are sometimes defeated by a simple JS file

But another one that has not been mentioned affects large apps and services. Many of them don't run on a single server. Furthermore, servers are launched and destroyed on a whim to meet scalability needs. JavaScript analytics easily surmount this. I suppose it is still possible to feed multiple server logs into a single source of truth for analytics, but I don't know if such a solution exists right now.

We do something similar with logs. Everything gets fed into Splunk and we do analytics from there.
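For a rough idea of what "feeding multiple server logs into one place" involves at its simplest, here is a Python sketch that merges per-server logs into a single time-ordered stream. It is not how Splunk works internally; it assumes each server's log is already sorted by time and uses the bracketed timestamp of the combined log format. The sample lines are invented.

```python
import heapq
from datetime import datetime

# Timestamp layout inside the brackets of a combined-format log line.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def timestamp(line: str) -> datetime:
    """Extract the bracketed timestamp from one log line."""
    start = line.index("[") + 1
    end = line.index("]", start)
    return datetime.strptime(line[start:end], TIME_FORMAT)


def merge_logs(*logs):
    """Merge several already-sorted logs into one list ordered by time."""
    return list(heapq.merge(*logs, key=timestamp))


# Made-up logs from two hypothetical servers behind a load balancer.
server_a = [
    '10.0.0.1 - - [10/Oct/2021:13:55:36 +0000] "GET / HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2021:13:55:50 +0000] "GET /about HTTP/1.1" 200 734',
]
server_b = [
    '10.0.0.2 - - [10/Oct/2021:13:55:40 +0000] "GET /pricing HTTP/1.1" 200 901',
]
merged = merge_logs(server_a, server_b)
```

`heapq.merge` reads its inputs lazily, so the same idea works on file iterators rather than in-memory lists. The harder part in practice is collecting the logs from servers that may already have been destroyed.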

For dynamic sites, new page views often happen without any HTTP requests as the app renders the page and changes the URL entirely client side.
