An alternative to using Google Analytics on your website (plausible.io)
462 points by markosaric on April 8, 2020 | 301 comments



It sounds like a nice product, but sadly it's not exactly why GA is so popular (being a 'nice' product, that is). It is popular, because it's free, relatively easy to implement on any website, requires no self-hosting and powerful, once running. I've checked your website for pricing, but not only is there no word about free tiers, there isn't even anything on pricing at all. So, when it comes to the majority of GA users, it's not a competitive solution.


There's a big blue button at the top of the page that says "Start Free Trial". Click it and you get this:

- Quick and easy to set up

- No credit card required

- Unlimited use during the trial

- From just $6/mo after the trial

Looks like a flat fee? Hard to tell from that wording. They should make this more visible on the front-page and elaborate on the "from just $6/mo".


A free trial isn't the same as "free for low volume / hobby usage". Not sure if that's what the OP is after, but it's what I'm after.

And to head off a couple of popular distortions/digressions:

1. There's a difference between wanting a free hobby tier and believing that I'm owed a free hobby tier.

2. Just because a company should be paid for their work doesn't imply that I'm obligated to use their product.


In this space I think wordpress.com provides a decent free tier, though obviously only for WordPress sites. They charge for premium features instead of following Google's usage-based policy.


I hate free trials, especially for my site. They get you hooked and then you have to cancel. It's not productive.


They get you hooked, aka prove their worth, and when it’s time to pay they have proven their worth. Or not.

Free trials are just fine like that.


Usually I look for a 'pricing' tab. Clicking on 'start free trial' tells me they are going to ask me for my email, instead of telling me how pricing works.


Pricing can be seen at https://plausible.io/#pricing

6$ / month for a personal website.

12$ / month for a startup.

36$ / month for a business.


$6 is about the cost of monthly hosting for many personal websites. I'd argue there's not a good value proposition here, at the low end of the scale at least.


I think $6/mo. for fast, privacy-focused analytics for small sites is very reasonable.

There's no ad revenue to support this business — the website owner is the customer, not the product. That is a different business model from Google analytics or WordPress free analytics.


Depends what you use it for; if you're just vaguely curious about how many people visit which page on your site – which is probably what a large part of people use it for on their personal sites – then $6/month is comparatively a lot, especially if you consider there are competitors which offer it for free (not just GA).


“From X” doesn’t sound like a flat fee, but either tiered or usage based, where the $6/m is the lowest possible you can pay.


I think if they can just be clear that they will do 10 million impressions per property per month for free forever that would clarify the free tier (and match google I think).


We do over 25m impressions a month and we still get GA for free.


I don't see why the price of GA is relevant to plausible's pricing.

With Google analytics, you are the product, not the customer. Free ad-supported SaaS is really a different value proposition from non-ad-supported paid services.


Pricing is close to the bottom of the home page.


Thanks. I didn't check the homepage and looked for pricing info in the navigation links, felt it was needlessly difficult. Pricing should be easy to locate from every page for a product like this.


Discovering that users are having difficulty finding pricing information... might be one useful piece of data you could garner with a good analytics tool.

Also I just checked out the pricing's placement and that is utterly terrible UI - the site uses light greyshade striping in the background for different sections and then has the pricing block preceded by a very dark background block with no information immediately visible - this is a clear visual flag for the rest of a page being irrelevant and a footer.

And, just to clarify, the site's actual footer uses the exact same coloring for the footer along with a similar amount of excessive top-padding.


Genuine question, how would you find that users are struggling to find pricing using an analytics tool? How would you set that up / what would you look for to identify that? Something like scrolling around / back and forth implying they're struggling to find data or something?


A good signal would probably be dithering as you mentioned above - either scrolling back and forth through sections or navigating through different components if your website has a paged layout.

Additionally, as a business, you want to make sure users are following a pretty predictable flow through some limited entry points (like homepage, FAQ, information bulletins) - then either to pricing and then additional information or additional information then pricing (then, ideally, a sale).

If a user is returning to the entry point (i.e. a flow like information bulletins > home page > FAQ > information bulletins) and that happens enough then you can assume that some information users want after the FAQ (maybe, ideally, you think the pricing) isn't being delivered to them and so they're returning to their entry point to see if they browsed the information incorrectly.
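To make the entry-point loop concrete, here is a minimal Python sketch. It assumes you already have each session as an ordered list of page names; the function names and sample pages are hypothetical and only for illustration, not something a particular analytics tool provides.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def returns_to_entry(path: Sequence[str]) -> bool:
    """True if the visitor left the entry page and later came back to it."""
    return len(path) > 2 and path[0] in path[1:]


def entry_loop_rate(sessions: Iterable[Sequence[str]]) -> Dict[str, float]:
    """For each entry page, the share of sessions that loop back to it."""
    total = Counter()
    looped = Counter()
    for path in sessions:
        if not path:
            continue  # ignore empty sessions
        total[path[0]] += 1
        if returns_to_entry(path):
            looped[path[0]] += 1
    return {page: looped[page] / total[page] for page in total}


# Hypothetical sessions: the first one matches the flow described above.
sessions = [
    ["bulletins", "home", "faq", "bulletins"],
    ["bulletins", "pricing"],
    ["home", "faq", "pricing"],
]
rates = entry_loop_rate(sessions)
```

A high rate for one entry page is only a hint that something after it is hard to find; it does not say which piece of information is missing.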

If you've got good stats on how far down on a page people are scrolling that could also be something helpful to detect the specific UI issue I mentioned in the post above.

Edit: I might suggest Don't Make Me Think by Steve Krug[1] as a resource if you'd like to learn a bit more about user flow analysis.

1. https://en.wikipedia.org/wiki/Don%27t_Make_Me_Think


Hmm, fair enough. This is ~what I was thinking, making inference from their flow through the site. I have seen that book on the desk of design folks I've worked with, but I've not looked through it... not a bad idea though. Thanks.


Indeed it is, I stand corrected :) However, when I was looking at it, I've skimmed through the top menu (nothing), scrolled down to the footer (nothing) and finally clicked the Documentation link (nothing). Apparently the pricing info isn't exposed well enough.


> It is popular, because it's free, relatively easy to implement on any website, requires no self-hosting and powerful, once running

Tbh, I disagree. While a lot of the above is true, it's largely popular because it's a recognisable brand, and because the majority of frontend devs are already familiar with it and haven't felt the need to explore an alternative (network effect).

While an alternative does really need to be free and easy to implement, it also needs to address the above two hurdles, which is a tough ask.


GA is also widely supported by other tools marketers know, understand, and use daily. Its appeal is definitely not limited to developers - would-be alternatives might stand to benefit by considering other users.


Because of its popularity, GA also has support from many third-party libraries that you can just plug in and get started with.

e.g.

https://github.com/react-ga/react-ga

https://github.com/angulartics/angulartics2

https://github.com/MatteoGabriele/vue-gtag


Isn't GA itself a third-party JavaScript library that you can just plug-in and get started with?

Why do you need framework specific libraries on top of that?


You don't need framework libraries. The point is, for many frameworks installing GA is just adding one file to your dependencies and entering your UA number. Installing a comparable tool that doesn't have a library involves marginally more friction.


To handle single page sites or more specific actions


Photoshop is the de facto standard in image editing, yet clones which are free and copy the photoshop UI are rapidly catching up for a segment of the userbase, eg. photopea.com

If someone set up a Scroogle Analoytics which had a simple code snippet, was free, and had near-identical looking dashboard, I'd use it.


In general, you should be very careful about what third parties you allow on your page because you're delegating to them the ability to do anything that can be done from JS. I trust Google to take this seriously and not open me up to XSS, but I would be much more skeptical about Scroogle Analoytics.

(Disclosure: I work for Google, and have run Google Analytics on my site since before I joined)


As someone who runs one of those 3rd-party analytics: the JS you add to your site is small, readable, and can be downloaded and included from your own CDN. Hell, you don't even need the script, you can just write your own (not very hard actually). I took a look at Plausible a while ago, and I think the same applies here as well.

With GA, I can do none of that; it's just a massive unreadable blob I can only load from the GA servers. I just have to trust Google doesn't do anything I don't want (including XSS, but also other issues). It wasn't even easy to figure out exactly what information it collects the last time I looked. It's very untransparent and un-auditable and the website equivalent of loading binary kernel blobs.


The concern over xss is important. But the data you send to google opens your users up to a different type of risk.


> rapidly catching up for a segment of the userbase, eg. photopea.com

How true is this? There's popularity, and there's significance in the face of Photoshop's market hold. I'd love to believe there's some significant dent being made in Adobe's market share here, but I doubt it.


While GA is free, Photoshop has one of the most insanely bloated pricing approaches for casual usage - the fact it's handed out like candy to students (and various institutions allow this instead of familiarizing students with alternatives) is the only reason it's kept such a dominant market position.

If your boss asks you for some analytics about site usage the first thing searching the web will turn up is GA and you can have it set up on a test env in less than half an hour including acquiring an API key and reading enough docs to configure it.


I use Photoshop alternatives because they're cheaper than maintaining a perpetual license/subscription, but Photoshop is the big gorilla because it is the best photo editing app in the industry today. Bar none. And that's what lets them get away with their pricing structure.

Nobody would pay for Photoshop if it didn't provide value.


I am (1) a current Photoshop CC subscriber and (2) an owner of the second-to-last version of Photoshop that was available under a perpetual license, and I disagree that Photoshop is the best photo editing app in the industry today.

For photo retouching, I prefer Affinity Photo. One thing I especially appreciate about it is how efficient it is at handling large, 16-bit-per-channel images with many layers.

Photoshop is useful to have around mainly because it makes it somewhat easier to work with certain kinds of PSD files that I get from other people.


I also have Photoshop CC and Affinity Photo, and while I use AP for the simple stuff, it can't hold a candle to the more advanced stuff you can do in Photoshop that is still on the (long-term) plan for AP.


Many people are quite surprised to know that Photoshop has actual competition, even when they don’t really want to pay for the actual thing. It’s often easier to just pay for it than try an alternative which may or may not be right for their needs.


This is the first time I've heard of photopea, thanks for sharing that!


Free + because it's so popular, when you have a problem, there are a million resources to find answers. Not even Adobe Analytics can match that.

My mate worked at Yahoo, and they used Yahoo Analytics, and he said it was great, but that when you ran into any issues at all, good luck working out what to do!

The reality of an analytics product is that a non-techy marketing person needs to be able to solve their own issues without wasting tech time (which is the only reason why TagManager exists), and that's a REALLY hard problem to solve.


So Google Analytics is largely popular because it's a recognizable brand, and not really because it's actually an easy-to-use and useful product, and you're going to claim the same for Google Docs, and for Google Buzz and GChat and GPlus and Google Wave?


I had the free version of Google Analytics, but ended up buying MonsterInsights on sale today so that I can see deeper - demographics, etc.


thanks for your feedback! We had pricing on the home page but have now added a pricing link in the top menu too :)


Also, unlike many smaller analytics providers, Google Analytics doesn't have glaring CSRF vulnerabilities and doesn't have creepy features that let you spy on individual users (no idea if this applies to Plausible or not).


> "[GA]...doesn't have creepy features that let you spy on individual users"

google reserves that for itself so as not to lose its dominance of online advertising and targeting. why would they share their bread and butter with the little guys?


They made a statement within the last few years that they don't use GA data for their own purposes.

The scarier implication is they don't need GA to understand web trends. They can collect far more data using Chrome, AMP, Android, Ads, and more.


Is there a link to that? I've always suspected they feed GA data back into SER. But I don't know one way or the other.

It does seem like Google is in a textbook case of dumping to prevent new competitors from arising though.


Ah yes, I remember that statement - "Don't be evil" - that one, right?


Bull shit they don't. Let's see a sworn statement by a Google executive in court or else I don't believe it.


The point people are making is that it's subtler than that. They don't need GA for a surveillance machine. They've already got one.

https://www.youtube.com/watch?v=Ea8GyscSFaQ.

It's more valuable to them to be able to truthfully say they don't add GA data to it. Meanwhile offering GA for free chokes out that competitive avenue to anyone else.


No idea if that's true or not regarding Google Analytics, but you can opt out of all data sharing and I imagine Google would be in a lot of trouble if they didn't honour that.


There is no comprehensive opt out to Google tracking. If you disagree, show me the button that opts out of doubleclick (and all Google subsidiaries and data sharing partners) targeting ads at my home IP address, cell phone, car, etc.




Opting out of targeted ads is NOT the same as opting out of tracking.


But it's what GP asked for.

I'm also unclear on whether "opting out of tracking" under the definition you're implying is possible. Any entity tracking you needs to track that you've opted out of tracking, which requires tracking you. Whoops.


I reject the idea that you have to be tracked to not be tracked.

Imagine a system where you opt in to tracking, if you are not opted in, no data about you is written into any system used for business analysis.

Of course you can't/shouldn't stop all writes to your systems, like server access logs, but you can draw a line between operational and business data.

Of course if you have access to both operational and business data sets you can analyze data for people who have not opted in, but I would rather live in a world where we at least try to enforce a separation between the two.


Just add

  0.0.0.0         google-analytics.com
  0.0.0.0         www.google-analytics.com
to your /etc/hosts if you want a comprehensive opt-out.


You can opt out of all data sharing in Google Analytics on the account settings page.

There are also ways to opt out of ad targeting but that's irrelevant to this discussion.


I don't think that's true at all. You can opt out of seeing the results of Google tracking in your own Google profile, but it doesn't stop them tracking you.


i'm sure there are people at google who want to honor those opt-outs in good faith, and there are those who don't.

the massive organizational system itself, nevermind the massive computational systems, acts against doing this thoroughly and auditably.

it's also hard to believe that they'd get the data out of every last nook and cranny when they don't even swap out dead hard drives in their datacenters because of time and cost to find them (as related to me by a googler a number of years ago).

and they have an army of lawyers and lobbyists for anything that falls all the way through the cracks to the public in a way that allows lawsuits (gdpr and the like).


I don’t work at Google, but I’ve worked for and with many big corporations and government entities.

The army of lawyers look within too. The problem with big companies is usually that they lobby their way to do things that you don’t like. They are usually very good at working within those parameters.

That’s why you generally don’t see data breaches in places like Google. When you do, they are usually a combination of acute issues (hacking, etc) and incompetence (Equifax). There are overlapping controls and segmentation that make it harder to do rogue stuff without oversight.

At a startup or small adtech place, forget it. The controls are not there and the company has nothing to lose.


> Google would be in a lot of trouble if they didn't honour that.

There are plenty of examples where Google have not honoured such expectations of trust. They have not, to date, gotten into "a lot of trouble".

The GDPR is probably the piece of regulation most likely to possess teeth big enough to get Google into a little bit of trouble, and the task of enforcing that is down to 1 person in a deliberately underfunded department in the country where Google has chosen to locate its EU HQ[0]

There is no will to prosecute Google for misbehaviour in this area.

- [0] https://www.irishexaminer.com/breakingnews/ireland/over-6700...


Google Analytics consultant here.

Actually, GA does have a creepy feature that lets you spy on individual users. It's called the User Explorer report (under Audience in the left-hand nav). It lets you page through each hit from a particular user. This is also the report that lets you process Data Subject Access or Deletion requests under GDPR or CCPA.


Did you mean XSS?


I'm specifically talking about CSRF as I've found that to be quite widespread and some analytics providers have had flaws that can lead to your account being deleted or stats sent to someone else's email address, etc. I don't know if there's a similar problem with XSS, but I trust Google's security practices more than most other companies.


I see. I always have this idea in my head, to start a service and get people to put my script on their web sites, and pull a grand scam one day by changing the script to do something malicious on those poor sites that didn't add SRI.
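For anyone unfamiliar with SRI (Subresource Integrity): the site pins a hash of the script in the `integrity` attribute of its `<script>` tag, and the browser refuses to run a script whose bytes no longer match. A minimal Python sketch of how such a value is computed follows; the two script bodies are made up for illustration.

```python
import base64
import hashlib


def sri_hash(script_bytes: bytes) -> str:
    """Build an SRI value: algorithm prefix plus base64 of the raw digest."""
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


# Hypothetical script contents, before and after a malicious change.
original = b"console.log('count pageview');"
tampered = b"console.log('do something malicious');"

pinned = sri_hash(original)           # value the site owner puts in integrity="..."
still_valid = sri_hash(original) == pinned
would_load = sri_hash(tampered) == pinned  # False: the browser would block it
```

The catch is that pinning a hash also blocks legitimate updates from the provider, which is presumably why few sites bother for analytics scripts.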


Nothing is “free”. You are selling Google your visitor’s data so that they can let you take a peek at a subset of what they collect.


> Nothing is “free”

Can we stop with this nitpicking? You’re not paying any money, so it’s “free”. That Google analyses your data is another story: it could do it whether you’re paying or not.


Google cannot analyze your user traffic if they don't have a pixel on your site. What are they going to do? Look at your traffic logs?

Designing a website that has traffic insights while respecting your visitor's privacy requires effort.

If you want to delegate the insights, you can pay for it or let Google look at your data. The data goes straight into their ad exchange analysis. In fact, you cannot publish Google ads on your site unless you have had google analytics running on your site for a while.


Plausible, Fathom [1] and Simple Analytics [2] are very similar in features. While self hosting is not supported, plausible is Open Source [3] running on Elixir and PostgreSQL. It is also the most affordable of the three, starting at only $6.

I am not sure why HN is so negative about this post, when two years ago everyone was cheering for GA alternatives.

Although I do agree it needs to add "Pricing" in the Navigation section.

[1] https://usefathom.com

[2] https://simpleanalytics.com

[3] https://github.com/plausible-insights/plausible


I think everyone is so negative on the post because the post is a sleazy ad in disguise. If it had opened with "these are the reasons that we are unsatisfied with analytics which led us to launch our own alternate service, plausible" or even just "why we made plausible.io" then we would know up front that it's an ad but maybe still a worthwhile read. This hides the fact that they want you to maybe try plausible to the last section and the last bullet of the table of contents. It's nothing about the product and everything about the underhanded advertising.


Where's the sleaze? GA does shady shit, Plausible has a competitor product and they're making the pitch on their company blog. What am I missing?


They used a clickbaity title and deeply buried the lede by disclosing that this is an advertisement for a competing (paid, non-self-hosted, proprietary) product only at the very bottom of a long list. I would bet 95% of readers did not realize this was an advertisement until getting to the bottom.


Matomo is free, OSS, and takes just a few minutes to install.

https://matomo.org/


I remember using Matomo when it was Piwik and it was really easy to self host.

I do not see hosted analytics as real competition to GA since you have the same privacy problems.

Self-hosting should be the preferred option.


Fathom is very easy to self-host:

https://render.com/docs/deploy-fathom-analytics

Plausible should provide instructions to self-host for people who want to try it.


>plausible is Open Source

>self hosting is not supported

Ok, then how do you prove it's open source?


Perhaps because of the liberal use of the term "Open Source."

This product goes in a long and tedious rant, trying to justify why they are indeed open-source. (https://plausible.io/open-source-website-analytics)

If you need to do all these mental gymnastics to try to convince someone (or yourself) that this is a legit open-source software project, then you are probably not open source.

The only open-source thing about this is that the code is visible in a public repo. That's it. There's no invitation to contribute or requests to open PRs. There's no community of OS contributors, no open discussion about features, and no instructions on how to self-host (and according to their website, an explicit intention to never support or release a genuinely self-hosted distribution).

This is my personal opinion but just because your code is publicly available, that doesn't make it open source. It is at best transparent development tied to a permissive OS license.


plausible.io looks to be a one-developer show. pretty impressive.


Some open source alternatives:

- https://ackee.electerious.com/ (Node.js based)

- https://github.com/zgoat/goatcounter (Golang) (Commercial Licensed)

- https://snowplowanalytics.com (Core-ware)

- https://count.ly (Core-ware)

https://www.visitor-analytics.io is also a great GDPR compliant alternative (Not open source).


> - https://github.com/zgoat/goatcounter (Golang) (Commercial Licensed)

Small correction: it's not really "commercial licensed", it's completely Open Source/Free Software according to the OSI definition and "four freedoms", and can be used for any purpose including commercial (EUPL license, roughly similar to the AGPL).

Only the hosted goatcounter.com SaaS operates on a "free for personal use, pay for commercial use" basis, but there is nothing stopping you from running your own instance for commercial purposes.

I think I need to clarify the README on this a little bit, as I had someone else ask the same question yesterday as well :-)


Shout-out to GoatCounter which was both free and easy for me to set up for my blog.


Another happy GoatCounter user here!


I plan to move to goat counter simply because of the awesome name. That and it's truly FOSS.

All the other competitors of GA are hobbyists whining about not making enough money. Get a real job and stop complaining about GA!

GA above works for Google and goat.


What is crippleware? Does it mean it doesn't work?


I changed it to the more friendly "core-ware", means you get the basics - the rest is licensed.


Some pricing feedback:

The limitation of *K views per month only serves to introduce confusion into the buying process for me:

- I don't know how many thousand views per month my website gets, to begin with. This is not something I have ever had to be concerned about but with this product, suddenly I need to keep an eye on this?

- The first question that comes to mind is that what happens when I go over? I see you have a FAQ answer dedicated to this. (A lot of people won't bother to read the FAQ)

- According to the FAQ, a one time "spike" is OK, but if it happens two months in a row someone will contact me? This seems murky and introduces uncertainty. And not sure I want to sign up for a $6/mo. product where a vendor is going to be contacting me to "discuss upgrade options".

- That said, I don't know what "discuss upgrade options" means. Does this mean you'll contact me to force me to upgrade? Or is it just a suggestion? Is there a time frame in which I'd have to do it or will you cut off the service if a decision isn't made in a timely fashion?

- Your highest plan allows 1M pageviews per month. What happens if I go over that? Is there a higher, unlisted plan that I would be asked to upgrade to?

A lot of the above may seem silly but I'm trying to illustrate the kind of murkiness that might cause a user to think, "I don't know exactly what I'm signing up for" and move on to other solutions.


I have a hobby blog that I'm trying to run for <$5/month. I run it on Github Pages because it's a static site and GH takes care of just about everything. The only thing I don't get is analytics or server logs, I'm planning to build that myself with standard AWS components. I considered other options, but I didn't want to use GA (for many of the reasons mentioned in the blog), other tools like Plausible were outside of my budget, and the open source tools looked like more hassle to self-host than reinventing the wheel myself:

Browser (~3 lines of JS) -> API Gateway -> SQS Queue -> Lambda (ETLs queue into) -> Athena.
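As a rough illustration of the Lambda step in that pipeline, here is a minimal Python sketch. It assumes each SQS message body is a JSON pageview with hypothetical `path`, `referrer`, and `ts` fields (none of these names come from the design above), and it only builds the newline-delimited JSON that Athena could later query from S3. The upload itself is left out.

```python
import json
from typing import Any, Dict, List


def records_to_ndjson(event: Dict[str, Any]) -> str:
    """Turn the SQS records Lambda receives into newline-delimited JSON rows."""
    rows: List[str] = []
    for record in event.get("Records", []):
        try:
            hit = json.loads(record["body"])
        except (KeyError, TypeError, ValueError):
            continue  # skip malformed messages rather than failing the batch
        if not isinstance(hit, dict):
            continue
        # Field names are assumptions for this sketch.
        row = {
            "path": hit.get("path", ""),
            "referrer": hit.get("referrer", ""),
            "ts": hit.get("ts", ""),
        }
        rows.append(json.dumps(row, sort_keys=True))
    return "\n".join(rows)


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    """Lambda entry point; a real deployment would write `body` to S3 here."""
    body = records_to_ndjson(event)
    return {"rows": (body.count("\n") + 1) if body else 0}
```

Dropping bad messages silently keeps one malformed hit from blocking the queue, at the cost of undercounting; a dead-letter queue would be the more careful choice.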

I was originally just going to use Postgres, but RDS/Aurora are expensive and running Postgres in EC2 is going to be at least as much work (configuring SSH, process management, backups, monitoring, logging, image building, networking, etc).

My custom plan is ~$3/month all-in provided I stay below ~1M requests per month, and even then it scales very cheaply. Also, this design is highly scalable, although I doubt I'll ever take advantage of it; it's mostly just icing on the cake. The main motivation is that these components are available on the free plan ("forever", not just the first 12 months) and/or the pricing model makes the charges negligible for my super-low-volume use case.

Note that this doesn't give me pretty dashboards; the interface is SQL. Fortunately, there are other analysis tools that I have at my disposal which can plug in to SQL.

Lastly, of course my $5 budget doesn't do justice to the actual value of my time; it's more of a fun challenge. If I really just wanted web analytics, I should shell out the $6/month for plausible or similar.


Your choice is, of course, your choice, but I'd strongly suggest GA for this use case. I understand and agree that Google is in fact evil and we should avoid feeding the beast, but a hobby website using GA is too minor to care about for any of the privacy reasons the author states and if it's fully static pages then ideally the performance will already be light enough that the bloat of GA isn't going to make a difference.


> "...a hobby website using GA is too minor to care about for any of the privacy reasons the author states..."

while it's a whole lot less signal per site, long-tail websites can provide more signal per visit than large ones. knowing you like kazakh folk music more uniquely identifies you for targeting than knowing you read cnn.

and for small sites, you don't often need a lot of data. just knowing number of visits and some basic visitor characteristics (country, time, etc.) on a per-page basis is usually as much as you need to make use of. depending on the business, a large site can and possibly should build custom analytics as a competitive advantage that doesn't have google (and facebook, amazon, etc.) looking over their shoulder.

nobody really needs GA. many don't even need realtime analytics afforded by a js library loaded on every page. you can just do log analysis, à la the original urchin analytics that google bought long ago.
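for a sense of how little log analysis takes, here is a minimal Python sketch that counts successful page views per path. it assumes access logs in Common Log Format; the sample lines and the function name are made up for illustration, and it makes no attempt to filter bots.

```python
import re
from collections import Counter
from typing import Iterable

# Common Log Format: host ident user [time] "METHOD path protocol" status size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) ')


def page_views(lines: Iterable[str]) -> Counter:
    """Count successful GET requests per path, ignoring query strings."""
    views = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not a log line we understand
        method, path, status = match.group(3), match.group(4), match.group(5)
        if method == "GET" and status == "200":
            views[path.split("?", 1)[0]] += 1
    return views


# Hypothetical log lines using documentation-range IP addresses.
sample = [
    '203.0.113.5 - - [08/Apr/2020:12:00:00 +0000] "GET /post?ref=hn HTTP/1.1" 200 1234',
    '203.0.113.9 - - [08/Apr/2020:12:00:05 +0000] "GET /post HTTP/1.1" 200 1234',
    '203.0.113.9 - - [08/Apr/2020:12:00:09 +0000] "GET /missing HTTP/1.1" 404 0',
    'garbage',
]
views = page_views(sample)
```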


I have a blog hosted on GitHub Pages, and I only want to know which pages are viewed the most. Before using GA, I considered multiple alternatives. I didn't bother with the self-hosted alternatives because I don't want to maintain a server just for my blog. As for the paid hosted alternatives, I avoided them because I want it to be as cheap as possible. So I eventually chose GA, but in a different fashion, because I do care about its bloat. I simply wrote a short JavaScript snippet using the Measurement Protocol[1] to send only pageviews to GA, which works very well for me.

[1]: https://developers.google.com/analytics/devguides/collection...
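For reference, a pageview hit in the Universal Analytics Measurement Protocol is just a handful of form-encoded parameters sent to the collect endpoint. The sketch below builds that payload in Python rather than the browser JavaScript described above, and it does not send anything. The tracking ID is a placeholder and the client ID and path are made-up values.

```python
from urllib.parse import urlencode

# Universal Analytics collection endpoint (the payload below is not sent here).
COLLECT_URL = "https://www.google-analytics.com/collect"


def pageview_payload(tracking_id: str, client_id: str, page_path: str) -> str:
    """Form-encode the minimum parameters for a single pageview hit."""
    return urlencode({
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID, e.g. UA-XXXXX-Y
        "cid": client_id,    # anonymous client identifier
        "t": "pageview",     # hit type
        "dp": page_path,     # document path
    })


# Placeholder values for illustration only.
payload = pageview_payload("UA-XXXXX-Y", "555", "/posts/hello")
```

In the browser the same payload could be posted with a few lines of script, which is how this approach avoids loading the full GA library.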


I was in the same situation and ended up using GoatCounter (see one of the top level comments in this thread).


I've used GA before for this same blog, and the bloat was perceptible. Not only that, but the GA interface was poor and there was a lot of spam. Adding onto that the various ethical concerns (privacy and Google's questionable politics), I decided to go with something else.

Also, to reemphasize, the primary purpose of the blog and its infrastructure is to serve my own amusement and to indulge my technical interests.


For a light site, Google Analytics may be the majority of its “weight”. It’s not huge, but I’d like to think my readers came to my site for the content and not the analytics so it’s a bit ironic that the ratio doesn’t match that.


Have you looked at using Amazon Lightsail?

Linux VM for $3.50/month: 512 MB RAM, 1 processor core, 20 GB SSD, 1 TB/month outgoing transfer. For $5/month: 1 GB RAM, 1 processor core, 40 GB SSD, 2 TB/month outgoing transfer.

Should be enough for a hobby blog, especially if static.

They have images for Amazon Linux, two Ubuntu versions, two Debian versions, FreeBSD, OpenSUSE, and CentOS.

They also have a few images with various things preinstalled and preconfigured: WordPress; Joomla; Drupal; Django.


I haven't looked at it, but I want to get away from managing (pet) VMs. I want something that is fully git-opsed, and git-opsing VMs is a lot of work. It's much easier to string together a bunch of AWS resources via CloudFormation (or Terraform, although I'm less familiar), and then I can stand it up or tear it down whenever I want.


You might have a look at https://ownstats.cloud


I've done something similar to this that has worked for me without a lot of effort besides the first time setup. I use Fathom which is open-source and available to use on any hosting service. I use it with DigitalOcean so it's $5/mo and I get to run it on as many websites I have.

Might not be very helpful in your case but I'd recommend thinking about using a simpler setup than SQS Queue + Lambda, if that works for you


> Might not be very helpful in your case but I'd recommend thinking about using a simpler setup than SQS Queue + Lambda, if that works for you

I did try to find something simpler, and I remember looking at Fathom but I don't recall off the top of my head what my qualm was. Basically anything that involves a VM is more complex than the system I described and anything that involves Docker is considerably more expensive (Fargate/ECS/EKS pricing models).


You could use a VPS with hetzner or scaleway for $5/mo with decent specs, or use an older raspberry pi 2/3. I had metrics services like prometheus, grafana, and matomo analytics running on my old pi before I migrated them to a VPS.

It's a little more complex, yes, but running a VPS with a few services / docker is not really that complex in the grand scheme of things. It opens you to the world of open source and/or just running long-lived processes. You also could host your site on it, removing a dependency on github offering free static hosting forever.


I had to scroll a bit more than this thread would prefer to find the pricing. It says "get started for free" but the lowest pricing tier is $14.00/mo. Looks like "free" is only the 7 day trial.


They have a version "Fathom lite" which is nowhere mentioned on their site (anymore?). You can get it from Github though:

https://github.com/usefathom/fathom


Yes! I've just removed GA from one of my pet projects (shameless plug https://golang.cafe) - And started using Cloudflare NS https://www.cloudflare.com/dns/. Now I get users/analytics reports at DNS level. No bloated JS trackers, no privacy policies and weird stuff and I can track visits even when adblockers are enabled. This would definitely be overestimated as bots would be counted in but it seems pretty accurate so far! It also has unique users count (although this is overestimated)


I think trying to estimate site visitor/visit statistics solely from DNS traffic would lead to very inaccurate results.

However, if you're using Cloudflare as a reverse proxy, those numbers are pretty much as accurate as you can get. Way more accurate than GA and other JavaScript-based solutions.


Correct me if I'm wrong but cloudflare only provides analytics for "proxied" records.

And in my experience, proxied websites (on the free plan) are way slower compared to bare websites, like 300ms latency vs 50ms latency.


Is this somehow different than what you get from hosting on Cloudflare Dash? I have my GitHub pages DNS through Dash and the only additional thing I wish I had was referrer analytics.

Nice site, too. Bookmarked :)


I assume you mean you are running the whole site through the cloudflare proxy, not just DNS. You cannot get that advanced of DNS metrics by just using them for DNS.


I use cloudflare as nameserver that effectively takes over all DNS queries for my domain


I get that, but you don't get more information than number of DNS queries - nothing about users


How does Cloudflare NS deal with caching resolvers?


It even tells you the percentage and absolute number of cached DNS queries. Cloudflare is dope!


I think he meant queries from caching nameservers that cloudflare does not control. I would argue the majority of queries would be fulfilled by these caching nameservers.


Setting a sufficiently low DNS TTL will take most caching servers out of the question.


Got it. It works for sites up to a certain amount of traffic. Hugely popular sites might have more than one user per minute per DNS resolver.
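
The effect of caching resolvers can be sketched with a toy model, assuming each resolver asks the authoritative server at most once per TTL window and answers everyone else from its cache. The function name `authoritative_queries` and the numbers are made up for illustration; this is not how Cloudflare counts anything.

```python
def authoritative_queries(lookup_times, ttl):
    """Count lookups from ONE caching resolver that reach the authoritative server.

    lookup_times: sorted times (in seconds) at which users behind the resolver
                  look up the name.
    ttl:          record TTL in seconds; a cached answer is reused until it expires.
    """
    queries = 0
    expires_at = float("-inf")
    for t in lookup_times:
        if t >= expires_at:
            # Cache is empty or expired, so the resolver asks upstream.
            queries += 1
            expires_at = t + ttl
        # Otherwise the answer comes from cache and is invisible upstream.
    return queries


# Ten users behind one resolver, one lookup every 10 seconds: 0, 10, ..., 90.
lookups = [i * 10 for i in range(10)]

print(authoritative_queries(lookups, ttl=60))  # TTL of one minute
print(authoritative_queries(lookups, ttl=5))   # TTL shorter than the gap between users
```

In this toy model, a 60-second TTL hides eight of the ten lookups from the authoritative server, while a 5-second TTL lets all ten through. The busier the resolver, the shorter the TTL has to be before the query count says anything about users.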


Why don't they just title this "why we made plausible.io?" Make it clear up front that this is content marketing. It would still be a worthwhile read but it wouldn't be nearly as sleazy feeling as disguising an ad as a blog post type thing. I think a lot of people on this forum would enjoy reading about how the founders approached the problem and what they see as the solution, either out of curiosity about new tech being developed or to evaluate the product by reading about what the founders were trying to fix with competing solutions. This just doesn't read as an honest marketing attempt.


If you're going to tear down a competitor's product and shill your own you should probably be up front about it.


I think that if a blog shilling a company/product is hosted on said company's website, they're already sufficiently up-front about it.


I really don't think so. I've never heard of this company and I didn't know what they sell. I had to poke around the website and read the comments here to realize. That's really the exact opposite of being up front.


I’ve never heard of Plausible either but the instant I browsed to the article I suspected they were a competitor. I skimmed the table of contents and saw the last item was recommending a product with the same name as the website, and my suspicion was confirmed. They’re perfectly up front and obvious about it, not hiding anything.


>They’re perfectly up front and obvious about it, not hiding anything.

Wtf, this is the opposite of "up front." You had to poke around and figure it out yourself based on a suspicion. Up front means that they just tell you, no figuring out required.


I didn't poke around. I knew by the time I finished reading the Table of Contents, before I even got to the content.

Even if they had said in their very first sentence of the content "Disclaimer: We are a competitor to Google Analytics", I would have already known before that.

It's blatantly obvious, they didn't need to spoon feed me. This isn't some submarine article.

http://paulgraham.com/submarine.html


Your comment reminds me of this: https://rachelbythebay.com/w/2018/04/28/meta/

>"I am THE ONE. You only need to know me. I am better than all of you."


All the examples given in that article are people belittling complex programming tasks. But this was just a simple reading comprehension and logical deduction task, not even remotely in the same league.

But maybe to you this was a similarly difficult task, so I apologize if I was being insulting by belittling it. Sorry!


Wow what a pretentious person you are.


"Reading the link" isn't quite poking around, is it?


You don't know what it is from just reading the link. And even that itself is not upfront -- browsers have been trying to hide the URL for a while now. What if you read it on Google AMP? I agree with the other commenter that this is misleading.


I should have been more clear - reading the content of the link.


You mean at the very end of the article after they finish trashing GA?


No, just read the TOC. The last two items, especially the one that says “Give Plausible a Try”, make it clear when combined with the big “Plausible” logo in the top left of the website (or the “plausible.io” domain you can see in the HN submission title) that this is a blog post by the company Plausible listing reasons why their major competitor is flawed and advocating their own product as an alternative.

You don’t have to read to the end of the article to know this, it’s communicated in the TOC at the very start. They could hardly be more transparent if they tried. This is not a submarine article, people are knee-jerk making this into something it isn’t.

http://paulgraham.com/submarine.html


I think the title is misleading. It says "Why you should stop using Google Analytics on your website" but its content means "Why you should stop using Google Analytics on your website, and start using our software"


The blog seemed to very intentionally not mention the company's product until the very end. They should have led with that.


It's on the company's own blog under their own domain. It's not like this is a submarine article or the like.

http://paulgraham.com/submarine.html


Sure, but I had no idea the company was an analytics company until the last paragraph. I clicked on the article from HN.


You and me both. It's not obvious up front that this is a GA competitor if you just click on the link and read the blog post. You do have to delve a bit deeper, unless you know who Plausible.io is.


Seems like a fair number of articles on HN lately are of this vein. A pitch veiled as a criticism of another product.

I know a lot of people agree with the premise of ditching Google Analytics but don't necessarily want to read a biased pitch piece about it.


>lately

This has been the case as long as I've been here and probably long before.


[flagged]


That industry has existed everywhere for most of civilization. Perhaps even the second oldest profession.


Hmm is this a fair assessment? Seems like most consider modern advertising to originate in the late 1800s. At the very least you'd say there was an inflection point sometime after the printing press, no?


> These two tracking scripts combined add 45.7 KB of page weight to each and every page load.

Which is only true if there is no caching, and I highly doubt that these files are configured not to be cached after the first load.


The Google Analytics script itself, analytics.js, is cachable and cached.

By design, Google Tag Manager and gtag should not be cached. The behavior of those scripts can be configured from wizard interfaces, and there is an expectation that published changes take effect immediately.

Also, all of these scripts are deeply and thoroughly asynchronous. They are designed not to impact site speed. Not to say that they can't or won't (I've seen GTM do horrendous things when used improperly), but I would trust an actual timing metric over a file size metric.


It seems like Plausible is a GA competitor (and I didn't see disclaimers, but could have missed them). It seems very clean and well thought out though!


The biggest drawback to not having Google Analytics on your website is if you choose to run Google Ads for your site and you have to retarget/remarket on Google properties. Without Google Analytics you are out of luck as you can enable remarketing only in Google Analytics (and you would need Analytics linked to Google Ads). If you have a personal website or a business that grows organically without needing to use Advertising then you can go ahead with any analytics provider of your choice. But the value add provided by using Google Analytics far outweighs other negatives from a business point of view.


That's incorrect. Google ads have conversion tags you can do remarketing with, no need for Google Analytics.


You are right. I stand corrected. I always used Google Analytics for remarketing audiences because you can use any existing Google Analytics data to build audiences. You can't do the same with Google Ads remarketing tag according to the comparison table here: https://support.google.com/analytics/answer/2611268?hl=en

As a marketer you would want as much data as possible available for remarketing. The wider your audience the better your conversion rate. Without Google Analytics you will be remarketing only to the remarketing list created in Google Ads but not any and every audience that is being tracked by your Analytics property.


don't run google ads either


But what are the alternatives for websites with a small audience? not many choices out there.


That is not an option if you are a profit driven enterprise. There is no alternative that is as good/optimized as Google Ads is. You are literally bidding on search intent.


I find that for information search intent is a poor pre-qualifier for sales leads.


Evidence says otherwise. Hard data shows that conversions happen the most through these advertising networks. Can't deny the obvious. Unless you want us all to go back to the pre-Internet era where we cold called clients and vendors with a hope of getting leads. We have technology for a reason. Not all tech is bad. Not all advertising is bad either. It is the intent that matters.


Perhaps I wasn't clear.

Information search. The last time you looked up how to do something, for example, you bought a product off of a Google ad?

What do you define as a "conversion?" I specifically said "sales." People in the ad biz want to define "conversion" as something else- but those of us paying for the ads want to make a sale. If somebody (or a bot) who has no intention of buying clicks on the ad it's a mistake for both of us. Yet with Google anyway, any attempt I make to keep people from mis-clicking (by putting a price in the ad, for example, to indicate up front that if you're looking for free this is not your link) earns me a penalty by reducing something deceptively labeled as my "quality score."

I would love to see the numbers you are referring to, though. Maybe something would occur to me to help me see how to navigate the mess.


What would you recommend is better than someone literally searching for something you could provide?


Oh, that's too easy: Something that indicates the person is willing to make a purchase. My guess is you do searches all day long, and if you saw ads (you probably block them) you wouldn't buy much.

Where do you go when you want to buy something? Google? Probably not. That's why Amazon is dominating. Google can be used for branding, I'm told, and that's important- but requires a different mindset.


"Here are completely legit reasons to stop using Google Analytics." <gains credibility> <gains more credibility> <gets me thinking about switching to something else>

"Try our product instead!"

<closes tab in disgust>


We use Simple Analytics [https://simpleanalytics.com/] as alternative to GA and it's also privacy-friendly.

I saw Plausible appeared after Simple Analytics launched their product. I'm glad to see GA alternatives becoming popular.


Not sure who makes Plausible, but if the "live demo" is reflective of the actual product, it's sorely lacking. The reason that GA is so popular is because of the free tier, sure, but also because GA 360, the really amazingly expensive version, has features that let you really get insights about your visitors.

Here's a popular question these days: "when can I stop supporting IE 11?" With GA, the answer comes in a couple of clicks; revenue generated sliced by browser version. I don't even see the revenue generated section of Plausible, or really any way to associate any metadata with custom events. GA gives you this functionality without any coding.

If you don't want to use GA, try something like Amplitude, which has a lot more data manipulation options than Plausible.


What do people suggest if you have a website that sells something?

Knowing where your traffic is coming from, if your new redesign helped or hindered users in finding content, and knowing which traffic sources result in the most sales sounds business critical to me.

I've seen lots of situations where when we look into analytics, it becomes obvious users are having trouble finding content or don't know the content is there to be found (e.g. putting an important link behind a navigation menu was a bad idea).

I feel people can overly focus on the more manipulative side of A/B testing, but analytics is useful for improving your UX as well. Not everyone runs a personal blog without a care for monetisation or viewership either.


You can look over the product I'm building (self-hosted analytics): usertrack.net. It's not free though.


Hotjar, Fullstory, Crazyegg.


Does using Cloudflare count as well? I've been using Cloudflare for some time now, because I wanted to save some of my static website bandwidth and it offered HTTPS setup pretty easily. I've set up Google Analytics just to compare visitor statistics and this is what I've found over 1 month:

GA unique visitors: ~350 Cloudflare unique visitors: ~2900

So the difference is pretty overwhelming - I assume that Cloudflare counts some bot traffic as well or something?

Is it even a reliable source of stats (cloudflare)? If not, why?


Is this an ad?


Yes

> Give Plausible Analytics a try Plausible Analytics is built with simplicity, speed and privacy in mind. We’ve used Google Analytics for years and understand its pitfalls


It’s an ad but not a hidden/submarine one. It’s a blog post on the company’s blog giving a long list of reasons to try their product instead of their major competitor.


Yeah, this article was submitted by a SaaS marketing expert: https://markosaric.com/

There are some misleading parts of the article, which makes it even worse (e.g. by default Google uses the data from GA to profile users).


Yes, but you really have to read the whole thing to see it. Even then, they listed their own product last on a list of alternatives. I equate it to how Privacy Badger approached their product (although it's free): your privacy is compromised, so try their product which will protect your privacy. This type of "honorable" advertising is OK in my book.


If this is your personal website, one option to consider is dropping the analytics completely. Do you really need to know exactly (and it’s not even that exact, since people block or can’t load your analytics) how many people viewed your blog posts? Anecdotally, if you put your email at the bottom of the page you’ll get a pretty good idea just from the number of people who send you one.


> Anecdotally, if you put your email at the bottom of the page you’ll get a pretty good idea just from the number of people who send you one.

You can't be serious. Who in this day and age reads an article and thinks "Hmm, that was useful, I'll send the author a note of thanks"?

This might have been the case in 1995 when there were only a few websites around, and Internet use was considered leisure time and not part and parcel of everyday life.


> Who in this day and age reads an article and thinks "Hmm, that was useful, I'll send the author a note of thanks"?

I do. Maybe you should, too?


For what it's worth I find it minorly annoying when people send me emails about my posts. I'm happy for people to comment on them, but typically an email is an invitation to have a private discussion and I'd much rather talk publicly (https://www.jefftk.com/p/comment-dont-message)


And on the flip side, there are people like Noam Chomsky who reply to all emails they receive:

https://www.reddit.com/r/chomsky/comments/6a23eg/how_long_do...


You may get more comments/emails than I do. I think I've gotten 3 worthwhile comments and 2 emails in the 8 years my blog has been active. I actually turned comments off, because I was getting thousands of spam daily, and that seems like a waste of everyone's time.


I host my comments externally. For example, https://www.jefftk.com/p/mosh is currently on the HN frontpage as https://news.ycombinator.com/item?id=22810589, and I think of that like my comment section. I have my blog pull in the comments from there: https://www.jefftk.com/p/mosh#hn-22811119

I also pull comments from LessWrong and Facebook, though the FB comment integration is pretty temperamental.


Is there a way to comment on your post if it's not been posted to Hacker News/Facebook/LessWrong?


I post everything to FB and LW. If you want to comment on HN and you think it's a post that's relevant here you could submit it?


I don't submit things to Hacker News :(


I think comments are probably easier to interact with, but they're more hassle to set up and maintain :( FWIW, I often take interesting points from emails I get and add them as "updates" to blog posts.


A fair number of people, surprisingly. Some will also send emails with questions or comments.


My blog only has a few hundred hits, and I haven't put my email address on it or invited anyone to contact me, but I've still gotten emails from two people and had some pleasant conversations.


Hah, speak for yourself. These aren't as uncommon as you think.


Or just do it the old fashioned way - have the server keep an access log, then use something like GoAccess (https://goaccess.io/) to produce reports from it. It’s not real-time reporting, but then you probably don’t need real-time reporting for your personal site anyway.
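
A minimal sketch of the access-log approach, assuming the server writes the "combined" log format (what nginx produces by default, and a common Apache setup). The function name `summarize` and the sample lines are made up for illustration, and GoAccess itself does far more than this.

```python
import re
from collections import Counter

# Pattern for the "combined" log format; assumes no unusual escaping in fields.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count successful GET hits per path and the number of distinct client IPs."""
    hits = Counter()
    ips = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that do not parse
        if m["method"] != "GET" or not m["status"].startswith("2"):
            continue  # only count successful page fetches
        hits[m["path"]] += 1
        ips.add(m["ip"])
    return hits, len(ips)


# Hypothetical log lines using documentation-reserved IP addresses.
sample = [
    '203.0.113.5 - - [08/Apr/2020:10:00:00 +0000] "GET /post-1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [08/Apr/2020:10:01:00 +0000] "GET /post-2 HTTP/1.1" 200 4096 "https://news.ycombinator.com/" "Mozilla/5.0"',
    '198.51.100.7 - - [08/Apr/2020:10:02:00 +0000] "GET /post-1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [08/Apr/2020:10:03:00 +0000] "GET /missing HTTP/1.1" 404 162 "-" "Mozilla/5.0"',
]

hits, unique_ips = summarize(sample)
print(hits.most_common(), unique_ips)
```

Distinct IPs are only a rough stand-in for distinct visitors (shared networks and changing addresses skew it in both directions), but for a personal site that is often close enough.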


Alas, I don’t control my server, but if I did I would say you might not even need this. If you’re using Google Analytics right now it’s probably better than that, though.


I too don't control my server, but it turned out that my hosting provider (one.com) has pretty nice access analytics, e̶v̶e̶n̶ ̶f̶o̶r̶ ̶t̶h̶e̶ ̶p̶r̶o̶j̶e̶c̶t̶s̶ ̶I̶ ̶h̶a̶v̶e̶ ̶w̶h̶e̶r̶e̶ ̶o̶n̶e̶.̶c̶o̶m̶'̶s̶ ̶o̶n̶l̶y̶ ̶r̶o̶l̶e̶ ̶i̶s̶ ̶D̶N̶S̶ ̶r̶e̶d̶i̶r̶e̶c̶t̶ ̶t̶o̶ ̶a̶ ̶c̶h̶e̶a̶p̶ ̶V̶P̶S̶ (Edit:Nope it does not, how could it). When I discovered this I dropped GA for most of my projects. I have one remaining project that still uses GA but only because I haven't gotten around to fixing it. Maybe your hosting provider has analytics too? I had to poke around in the dashboard for one.com to find it.


I use GitHub Pages.


It is 100% real-time. https://rt.goaccess.io/


Oh cool, I didn’t realize GoAccess could do that! Thanks!


If you want high quality blog posts, you need to know which ones are popular and bring in new readers.

Without that, you'll lose a really strong signal for self improvement, and your blog might become write-only...


I don't see anything in what you said that explains the assumption that a popular post is a high-quality post.

Certainly if you're looking to build an audience, then the popularity of your posts is an important metric that will help you craft additional popular posts.

But it seems to me that the only _quality_ indicator isn't a metric at all - it's actual feedback from people who have read what you've written.


It doesn't really matter. The metric is popularity. Period.

You can delve into metrics like the time spent on a page but in the long run popularity is the king since it also reflects dwell time.


I honestly couldn’t care less which of my blog posts are popular. I write about things interesting to me, which I assume is often too boring or technical for most people. Sometimes people post my stuff to Hacker News and it does well, sometimes it doesn’t, but if I tried to “optimize” this by matching what the site seems to like I wouldn’t write some of my favorite posts.


You are right, you don't need to know it in a small blog or site

But metrics and statistics are the very foundation of science, and the grail of software engineering. I can't resist the temptation of knowing the browsers and countries of my website's users, or how many Hacker News visitors come to it every other season, so many possibilities... sigh


You really don’t need to know that. I mean, it’s often (but not always: see the perils of optimizing for “clicks”) nice if you could, but you are actively trading the privacy of your visitors for that information.


And you can determine all that from regular web-server logs.

No need to snitch on your visitors to big G.


Congratulations on getting your subtle advertising on the front page. I admire your restraint in only mentioning your product at the conclusion of a long article - nicely done!

That said, you state what many have been saying for years - GA is overkill for most sites and gives a stupid amount of information to Google. But sites are giving up their users' information to any 3rd party analytics package, not just GA.

Hosting your own analytics is the only ethical way to go, unless you really, really need the demographic data that GA provides (which you probably don't).

Having passive feedback that people are actually reading your posts is nice but a simple hit-counter would probably do the job and relieve your readers of losing another little slice of their privacy.

I'm going to take a leaf out of the poster's book and mention my own project in passing: https://sheep.horse/visitor_statistics.html


The million-dollar question: will removing GA hurt my search ranking?



No. Hacker News does not use GA. Yet it ranks at the top for many searches.


Hacker News also happens to have a monopoly on certain search terms ;)


If the answer is yes, then the next call to action should be stop using Google.


Google Analytics data isn’t tied to ad targeting. Google doesn’t profile you based on data from sites you visit with GA.


According to google, it is:

“Google uses the information shared by sites and apps to deliver our services, maintain and improve them, develop new services, measure the effectiveness of advertising, protect against fraud and abuse, and personalize content and ads you see on Google and on our partners’ sites and apps”.


Disclosure: Google Analytics consultant. Not a Google employee, but my bread is very clearly buttered on one side here.

The article's use of that verbiage is deceptive. I think it's not actually a lie, but it's misleading.

When you use the Google Analytics product, Google is a Processor under GDPR and a Service Provider under CCPA. They are contractually bound not to process that data beyond providing the service to you, which is dashboards. It is illegal for them to dip into this data for their own purposes without additional authorization from the customer. (Note that in contrast, when you include a Facebook share button, Facebook is a Joint Controller and not a Processor.)

Sharing Google Analytics data with Google is possible but optional. It's fairly easy to do and encouraged (it enables certain features), but those checkboxes have big legal terms next to them that tend to give people pause.

In particular, the Advertising Features setting allows Google to make a connection between Google Analytics data and Google Ads or DoubleClick data.


This is a really good point. It's very disingenuous of the parent article to not mention that. Google being a Processor of data removes their ability to use this data for targeting and profiling.


It may remove legal basis for using the data, but it certainly doesn't remove their abilities.


The implication here being Google might be conducting a massive illegal conspiracy to mine data they don’t have rights to use? I do not buy it.


The implication is that they have the ability. The only thing that removes their ability to target and track you is to block them at the browser or network level, i.e. don't give them the data in the first place.

Legal action against a multi-billion dollar corp. is the most expensive and about the last option for defending privacy of normal people.


Especially given the multi-billion dollar fine if they were caught.


I agree they probably aren’t doing it, but there is almost zero chance of a multi-billion dollar fine.

They’d just explain it away as an error and be given a few million dollar slap on the wrist.

It would be very easy for Google to make a plain language public statement about this and clear things up, the way Apple goes about their privacy/data handling.

It’s not a good sign that they choose not to.


I can't find this language in the GA terms and conditions.


Section 6 of https://marketingplatform.google.com/about/analytics/terms/u... says

>Google and its wholly owned subsidiaries may retain and use, subject to the terms of its privacy policy (located at https://www.google.com/policies/privacy/), information collected in Your use of the Service.

and https://www.google.com/policies/privacy/ says under "Provide personalized services, including content and ads"

>We use the information we collect to customize our services for you, including providing recommendations, personalized content, and customized search results.


On the other hand, take a look at this:

https://privacy.google.com/businesses/rdp/

Google's usage of Analytics data is restricted unless GA is integrated with Google Ads.

I think section 6 may actually be referring to collecting data about the Google Analytics user's usage of analytics, not data about the website visitor. I.e. if Bob puts GA on his website and Alice visits the website, then the sentence you quoted is about Google tracking Bob's usage of GA, not about Alice's visit. But I am very unsure about that.


The correct proposition instead would be to stop analytics on your website, period. There's many sites that benefit greatly from the insight, but I'd argue they're a minority. Your personal blog doesn't need analytics. Let's be honest, for many people, analytics is just a way to feed their ego — my ramblings reached 100 people, so you could say I'm kind of a big deal. Google is, obviously, very well aware of this target market, and tries to feed back into this loop as well. It's not without reason you get weekly summaries and more into your inbox by default. The reality, of course, is that it's serving them while hooking you, much like social media notifications driving engagement etc.


Analytics can absolutely be useful for your personal blog / website. Imagine you're an academic on the postdoctoral fellow / tenure track job market and you want to see both the numbers and the locations your hits are coming from; Google Analytics can help. For universities the ISP entry is often "X university", which makes things even clearer than the broad geo category of which city the hit comes from. This can be very informative in terms of what leads are likely to come up.


What is the created value from that data extraction, though? That you know that the uni you applied to looked at your website? That's nice, but again, simply a way to reinforce your ego as far as I see — you get insurance that you're important enough to get through the first filter. But what actionable data does this interaction give you?


The academic job market can be unforgiving, and for those who also have industry leads/offers it can be helpful to know whether there may be future offers from academic institutions when deciding whether to commit elsewhere. Even without industry offers, it isn't like all academic job offers come in at the same time and you may be second pick at your preferred institution so it could take some time for a possible offer to come through. If you're still getting website hits from a place it's a fair bet you're still under consideration and you maybe don't want to commit elsewhere.

It's not really about an ego boost, and in any case, egos can take a pretty major bruising during the very anxious weeks.


> For universities the ISP entry is often "X university"

Actually Google killed that report earlier this year. It's kind of a big deal, a lot of companies were using ISP dimension to distinguish external vs employee traffic.


Yes, but there are hacks to get that back: I have been using https://ipmeta.io/ with success.


With any website it's important to know how many visitors it is receiving. In the context of a blog knowing you're getting a lot of traffic can motivate you to publish more content and knowing more about your visitors can also feed into what sort of content you publish. You could look at your server logs, of course, but analytics is easier to set up and gives you higher quality data than something like AWStats. Since Google lets you opt out of sharing your data with them I don't see much argument against using it.


> With any website it's important to know how many visitors it is receiving. In the context of a blog knowing you're getting a lot of traffic can motivate you to publish more content and knowing more about your visitors can also feed into what sort of content you publish.

No, and trying to make your content more appealing based on analytics is often a great way to tend towards SEO optimized blogspam.


How does embedded script provide better data? You mean if the page is delivered from network cache you still get analytics or something else?


By filtering out bots and giving you the number of visits rather than just pageviews, and also giving you data on things like screen resolution and time on page that can only be retrieved in javascript.
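
As a rough illustration of the difference between pageviews and visits: a visit is usually counted as a run of pageviews from the same visitor with no long gap between them. The sketch below assumes a 30-minute inactivity cutoff and a naive user-agent substring check for bots. Both are assumptions made for illustration (as are the names `count_visits`, `SESSION_GAP` and `BOT_MARKERS`), not a description of how any particular analytics product works.

```python
from collections import defaultdict

SESSION_GAP = 30 * 60                       # assumed inactivity cutoff, in seconds
BOT_MARKERS = ("bot", "crawler", "spider")  # naive, illustrative list


def count_visits(pageviews):
    """Count visits from an iterable of (visitor_id, unix_time, user_agent) tuples."""
    by_visitor = defaultdict(list)
    for visitor, ts, agent in pageviews:
        if any(marker in agent.lower() for marker in BOT_MARKERS):
            continue  # drop self-identified bots
        by_visitor[visitor].append(ts)

    visits = 0
    for times in by_visitor.values():
        times.sort()
        visits += 1  # the first pageview always starts a visit
        for prev, cur in zip(times, times[1:]):
            if cur - prev > SESSION_GAP:
                visits += 1  # a long gap starts a new visit
    return visits


views = [
    ("a", 0, "Mozilla/5.0"),
    ("a", 600, "Mozilla/5.0"),    # same visit, 10 minutes later
    ("a", 7200, "Mozilla/5.0"),   # new visit, 110 minutes after the previous view
    ("b", 100, "Mozilla/5.0"),
    ("c", 200, "Googlebot/2.1"),  # filtered out as a bot
]
print(count_visits(views))
```

Here five logged pageviews collapse into three visits once the bot is dropped and the two nearby views from visitor "a" are merged. Bots that do not identify themselves would slip straight through a check like this.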


Or you can have comments or some social platform for interacting with users, which is more rewarding, direct, and useful for future direction or content suggestions.


But also requires things like moderation, especially if your topic of choice is remotely controversial. It also gives you individual data points rather than an overall picture, skewing towards people with strong opinions.


Yeah, I removed GA from my websites years ago, when I realized that I didn't look at GA output pretty much at all for many years prior. I also started being more careful about whom I give control over my domain/websites.

0 value for my personal websites. All the value for Google.


Let's be honest, many things in life (especially quantitatively measured) are done to feed the ego. There's no need to single out website views.


For personal projects, what about basic usage data from something server-side like nginx logs?

No JS, no tracking, just basic analytics. Even if it's only a personal blog, knowing what people care about is helpful

If anyone has recommendations for something self-hosted like this, I'm in the market :)


Do you mean GoAccess?


oh! This looks great, thank you!


> "The correct proposition instead would be to stop analytics on your website, period."

Analytics are for optimizing your business and presentation, for finding out how big your audience is, how it responds to new content or features, etc. And for users it's not a loss of privacy if the data is anonymized.

Even for your personal blog, if you treat it as a tool for reaching an audience, then analytics are useful for growing that audience. When you put effort into writing articles, you want to see some results, after all it's an investment. And even if it's a hobby, seeing the size of your audience can serve as motivation. Even if it's a hobby, as long as you publish it online, you want others to read it, so it isn't a hobby that you can do in isolation without feedback.

In fact analytics can be a legitimate interest under GDPR. Usage of Google Analytics might be in a gray area due to it being a third-party and it's debatable if they are GDPR compliant, but usage of analytics in general is perfectly legit without explicit consent, as long as the data is anonymized.


Plausible was hovering at fewer than 50 users on its own website, and I wouldn't risk sending my visitors to this relatively new web service either.

Analytics requires a lot of trust that it won't screw you up. It's a third-party script that you add to pretty much every page of your website. I'm still looking for a provider that I can tame with CSP rules and that comes with only the most minimal things I need to know. A pixel tracker would have suited the majority of us.


You can self host your own analytics, like matomo. [0]

[0]: https://matomo.org/


"Nobody was fired for using Google Analytics" (c). It is a financial burden to change any part of your product, including GA, and 99.99% of all companies in the world don't give a damn about the users of their products, so it will never happen unless some global power restructuring happens and Google falls out of grace. And even then it will most likely simply be replaced with an even worse GA-2.


I would say that 99.99% of businesses do give a damn about their customers. And particularly they care about their customers' preferences as demonstrated by their economic behavior. Very few customers care whether a business uses GA enough to change their purchasing decisions. Businesses therefore rightly view the decision as neutral with respect to customer preferences.


That customer model is a mental construct of said companies. Just because a customer does not proactively ask for, request, or mention something doesn't mean that he doesn't care about it. People implicitly assume that their personal information is not used for malicious purposes, even though that is no longer true for most companies. And in the era of monopolies/duopolies customers can't vote with their wallets anymore, unless they go full Luddite and live in a cave.


So, a few objections. Most sites that use Google Analytics are not owned by monopolies or duopolies, so that aspect of your comment does not really apply. I also don't know if your claim r.e. malicious purposes is true. Probably depends on whose definition of malicious you're talking about.

> That customer model is a mental construct of said companies.

Well enough, but as I said, it's the one they care about. You seem to have a dualist view of customers where there are some things they care about that do affect economic behavior, and others they care about but which do not affect their economic behavior. I don't buy that model. I prefer the monist model where in competitive markets the customer expresses the totality of their preferences through economic behavior. I know it's incorrect, like all models, but I think it's least incorrect.

I doubt there's more than a tiny fraction of customers that even know of the existence of Google Analytics. Of them, only a small fraction has any idea of its capabilities. And of those customers, only an even smaller fraction even checks whether a site is using it before deciding whether to shop. The notion that a large fraction of customers care about the use of GA just doesn't fit with these other observations and conjectures. Please correct me with data if you think I'm wrong.

And anyway, whether you're right or me, the monist model encompasses everything companies have any incentive to care about, so it's the one that will be used to drive decisions. It's also the one that we can use to predict company behavior.


I suspect that we won't agree on global and complex issues like GA or surveillance (metadata = surveillance after all, see (1)). But I want to discuss a more specific issue: the "monist model". I disagree that what people buy is approximately what people want. Yes, it is within some narrow scope, but that is not all. I can't buy what I want if it doesn't exist, right? And if I buy by picking from two set choices while I want something different, that doesn't mean that I wanted what I bought. Example: I have an Android phone, but I don't want it specifically; I don't want a gadget filled with permanently enabled Google surveillance. But there is only one alternative, Apple, which is slightly less so but still about the same. The only reason I'm buying them is because I want convenience more than privacy, and I'm only human after all. People don't know about GA, 99.9% don't know. But they do know about their home address or credit card number or SSN or photos. They just don't have the proper information to connect all these links.

(1): https://www.schneier.com/blog/archives/2014/03/metadata_surv...


Just a minor clarification. When I was speaking of customers, I was speaking in aggregate. I'm not making a claim that all individual customers are having their preferences adequately expressed by their purchasing decisions.


How is this different than Fathom? (https://usefathom.com/)


I really dislike the user experience of GA. I literally know 1% of the functionality it offers and I only _need_ 1% of what it offers. But I do like that it's free. Was planning to one day give https://simpleanalytics.com/ a shot.


GA is complete overkill for hobby websites that get ~100s of visits/day. If you have a commercial website selling stuff, or conduct A/B testing, or have marketing channels funnelling traffic to the site, then you pretty much have to use something like GA or Adobe Analytics.


I'm actually of the exact opposite opinion - for the majority of websites just shoving GA on it and calling it a day is quite sufficient... It's only if your website really lives and dies by making sure users are hitting the important components and converting into paying customers that a more feature-full analytics tool may be valuable.


I like that they gave some alternatives.

With the web there are a lot of crusade-like topics where, while the argument is technically correct, the maelstrom of finger-wagging blog posts doesn't really seem to change much, and few of them offer workable solutions for accomplishing the goals that the thing they don't want you to do accomplishes.


Even better is how they listed their own product last in the list of alternatives. Personally I think the self-hosted open source alternative will yield the appropriate compromise between what the business wants and preserving (some) privacy. In my experience these kinds of changes are unfortunately at the mercy of a top-down request which often trickles down from the C-level executives. Maybe if you turn this article into a pretty power-point presentation, you'll get their audience.


Yeah I was a lot less 'oh man it's an advertisement' when I saw multiple options listed.


One alternative that should be seriously considered is not having analytics at all. I don't doubt that they're useful for many people in many situations, but consider whether they're useful to you in your situation. When was the last time you looked at analytics? What did you learn from it? What actions did you take? Are you really gaining anything from them? I think most personal blogs, for example, gain nothing from having analytics running. In fact I'd argue almost all non-commercial activity has no reason to gather analytics.


I think that's a valid argument that is there in the blog post.

I do like that they did actually include some alternatives that still were analytics.


If you don't care if people are reading your blog, why bother posting it online at all? Why not just blog into a local Word document, like the character Creed on "The Office"?

If you do care whether people are reading what you write, then analytics is the obvious way to find out.


I think "analytics" and "traffic stats" are different things.


Traffic stats are a subset of all possible analytics you can gather about a website.

But even if all you want are traffic stats, a client-side analytics package is still a better way to gather that information.

Google Analytics--and similar client-side analytics packages--supplanted server-side log analysis for a reason. It's much easier to implement, and gives you much more accurate counts of real people visiting your site, as opposed to spiders, bots, scripts, and other automated traffic that doesn't matter.


They are not.


No I won't. It is free and fine.


What are good self-hosted alternatives? Something like Webalizer, but a bit more modern.



I have been working on one for many years now as a side-project (not free): https://www.usertrack.net

Since October I started working on it full-time and I'm really excited about the future updates. I want people to start using it not only because it's self-hosted but because it should lead faster to actionable insights than Google Analytics. I am also planning to provide a free version after I manage to earn enough to keep working on it.



Countly (https://count.ly) is open source and offers push notifications, analytics, A/B testing, and performance management, and it is deployable on-prem. You can also install a limited, free-to-use version on Digital Ocean.


We've been using Countly for years. It's got more than you'll ever need. Highly recommended.


https://posthog.com is another good one


For those interested, my own solution to this was to use AWStats with an anonymization script [1].

It provides high-level analytics such as unique visitors, referring sites and most viewed pages, without having to store any personal data such as IP addresses, and it's 100% server-side.

IP addresses are compared against a Bloom filter in order to count unique visitors, without actually permanently storing them.

[1] https://gitlab.com/jamieweb/web-server-log-anonymizer-bloom-...
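
For anyone curious how the Bloom filter part can work, here is a minimal sketch in Python. It is not the code from the linked script; the class name, the filter size, the number of hashes, and the use of salted SHA-256 are all assumptions made for illustration. The filter only keeps a bit array, so the raw IP addresses are never stored, at the cost of occasionally treating a new visitor as already seen (which makes the count a slight underestimate).

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash functions; stores no raw items."""

    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 7) -> None:
        # Both defaults are illustrative assumptions, not tuned values.
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive one bit position per hash from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add_if_new(self, item: str) -> bool:
        """Set the item's bits; return True if at least one bit was unset."""
        is_new = False
        for pos in self._positions(item):
            byte, mask = pos // 8, 1 << (pos % 8)
            if not self.bits[byte] & mask:
                is_new = True
                self.bits[byte] |= mask
        return is_new


def count_unique_visitors(ips, bloom=None) -> int:
    """Count IPs the filter has not seen before, without keeping the IPs."""
    if bloom is None:
        bloom = BloomFilter()
    return sum(1 for ip in ips if bloom.add_if_new(ip))
```

Feeding it the IP column of an access log gives an approximate unique-visitor count; a larger bit array lowers the chance of false positives.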


>Google, the world’s largest ad-tech company, [...]

This is kind of snarky, especially in the light that the author is writing this post solely to sell something in competition with Google.


Their own analytics dashboard is public: https://plausible.io/plausible.io


And it's really interesting: see the HN effect!


I liked it when you announced it's open source, till I checked your GitHub page:

> At the moment we don't provide support for easily self-hosting the code. Currently, the purpose of keeping the code open-source is to be transparent with the community about how we collect and process data.

So basically you ask people to work for free on something only you can use? No thank you.


I wrote a blog post about web analytics and authentication: https://www.mathieupassenaud.fr/webanalytics_enemy/ Using "authorization code grant" is not as secure as we imagine with these kinds of analytics.


I'm also working on an analytics & A/B testing tool (https://splitbee.io). The big difference is that it needs to pin unique users for A/B testing. It is a mix of Mixpanel, Google Analytics, and Optimizely!


The first reason is about Google being evil. They may well be. However, unless your alternative is as good or better, or has a unique selling point that GA doesn't have, you will find people won't use it.

Don't tell me about how your competitor is bad. Tell me what you offer in comparison to your competitor.


Posthog.com is another solid option for privacy-focused analytics. They too have a self-hosted version


If you have access to an AWS account, you can host your own website statistics with very little costs. That’s why I built https://ownstats.cloud/ Would love to get some feedback.


These posts are the trash of the internet. Why?

This post...

1 - Fails to explain until the end that they are a competitor writing a "bash" piece

2 - Gives conclusions based on incomplete data and comparisons (website speed is barely affected by 45kb these days, many websites have much bigger reductions on multi-mb loads)

3 - States giant scary click-baitish claim in bold and then in smaller print gives the real story. "It's a liability..." to "it is a potential liability"

4 - Gives a false sense of "many reasons" ("It's a liability" and "It uses cookies" are the same complaint)

5 - States opinions as facts ("It worsens the UX...") -- it might, but the vast majority of websites use cookies, and so this doesn't add anything that you wouldn't have to deal with anyway.


Those looking for more ethical alternatives to the spyware they're using, see https://switching.software.


Can you be penalized for not running Google Analytics? If you have a website with no analytics or non-Google analytics, will your Google ranking be worse?


Slightly off-topic, but are there any analytics for open source projects?

Even if they require the actual data to be public (which would actually be pretty nice)


"It's owned by Google." Who knew!?


Thanks, but no thanks. And stop using this type of advertisement for your products.

I would rather stick to GA.

And you never pay for analytics on a personal website.


What's the best server-side only, self-hosted analytics platform if I don't want to add any JS to the front-end?

Running nginx here


In my opinion there is no longer any really viable server-side analytics/log processing tool that's equivalent to what you get with GA, unless you want to either install some kind of server with a ton of features and dependencies or use a tool that will most likely also require the use of JS.

The number of crawlers, botnet spam, hacking or scanning activity and other non-visitor hits you get these days requires a lot of filtering, updating spam lists, blacklisting, etc., and you'll probably end up spending more time tweaking your web server and analytics setup than actually gathering any relevant information from said analytics.

That said, the "old" log processing tools like AWStats, Analog, Webalizer, etc. are still available, although I'm not sure if they've been kept updated. I'm sure there is also a lot more competition among self-hosted tools that try to replicate what GA does, but I'm not that familiar with any of them either.
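
To give a feel for the filtering involved, here is a rough Python sketch of server-side log counting. It assumes nginx's default "combined" log format; the regular expression, the `BOT_HINTS` list, and the `summarize` function are made up for illustration and are nowhere near a complete bot filter, which is exactly the maintenance problem described above.

```python
import re
from collections import Counter

# Matches nginx's default "combined" access log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately short list of user-agent substrings to skip.
BOT_HINTS = ("bot", "crawler", "spider", "curl", "python-requests")


def summarize(lines):
    """Return (page view counts per path, number of distinct visitors)."""
    page_views = Counter()
    visitors = set()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        agent = m["agent"].lower()
        if any(hint in agent for hint in BOT_HINTS):
            continue  # crude bot filter based on the user agent only
        if m["method"] != "GET" or not m["status"].startswith("2"):
            continue  # count only successful page fetches
        page_views[m["path"]] += 1
        # Approximate a "visitor" as a distinct (IP, user agent) pair.
        visitors.add((m["ip"], m["agent"]))
    return page_views, len(visitors)
```

Calling `summarize(open("/var/log/nginx/access.log"))` (the path is an assumption) returns the per-page counts and a visitor estimate. Bots that send a browser-like user agent pass straight through, so the numbers will still be inflated.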


This is helpful, thank you.

You mean to say that spambots don't respect robots.txt??

Good to think about. I'll check these out and see what they've done (if much) to mitigate that issue. I'd imagine that would need to be front and center, unless they're focused more on internal services.



I've been relying on Cloudflare lately for analytics. Gives me a rough idea, and "good enough".


Oh, nice. I didn't think of this. But good idea. I'll look into their analytics.


Is that Tailwind UI that you're using? First time I've seen that in the wild in that case!


For the average modern website, how much load time does shaving off 44kb save?
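
A back-of-the-envelope way to look at it is raw transfer time, size divided by bandwidth. The Python sketch below uses the 44 KB figure from the question; the three bandwidth values are arbitrary assumptions picked to span slow to fast connections. It ignores latency, compression, caching, and the time the browser spends parsing and running the script, so treat the results as rough orders of magnitude only.

```python
def transfer_seconds(size_kb: float, bandwidth_mbps: float) -> float:
    """Seconds to move size_kb kilobytes over a link of bandwidth_mbps megabits/s."""
    bits = size_kb * 1000 * 8
    return bits / (bandwidth_mbps * 1_000_000)


if __name__ == "__main__":
    # Bandwidths are illustrative assumptions, not measurements.
    for mbps in (0.4, 1.6, 25.0):
        ms = transfer_seconds(44, mbps) * 1000
        print(f"{mbps:>5} Mbps: about {ms:.0f} ms for 44 KB")
```

On the slowest assumed link this comes out just under a second, and on the fastest it is on the order of ten milliseconds, so the answer depends heavily on who your visitors are.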


Looks nice. Is the account multiuser with separation between sites? If it is, it's a service I'd consider throwing into my maintenance packages as an added value. Most clients don't need the complexity of Google Analytics and with GDPR the data is increasingly out of line with reality, given we load GA based on consent.


Thanks! At the moment, it's one user per account. However, you can create password-protected links to share the stats with your clients. Each link/password combination only gives access to the stats for a single site without having to create an account.


everyone hating on this in the comments, yet it's getting upvoted. Silent support is a bit annoying. I'd also rather not be told what to do with these titles "Stop doing this, etc"

sigh


Or Matomo. It's free and I've found it to be excellent.


All of these alternatives that people are posting (simpleanalytics.com, plausible.io) are not free, which is the main reason people use Google Analytics.


>It’s a bloated script that affects your site speed

Once I removed GA, the speed increase was surprisingly very noticeable. That alone was enough for me to keep it off.


How does this compare to SimpleAnalytics?


Stop using Google Analytics and pay for your project? Sorry. I'm not convinced by the logic.


Segment integration?


If you're so concerned about GDPR, probably hotjar is a good choice. It's free for personal use and it's all EU based.


I stopped reading after the words "surveillance capitalism".


Take this post with a pinch of salt... Out of the 10 points made, I'd argue only 2 are actually valid and another 2 are "technically correct but..."

  It’s owned by Google, the largest ad-tech company in the world
Yes. Google provides a free service and in return gets a view into your traffic. Very fair point, but not terribly controversial IMHO

  It’s a bloated script that affects your site speed
Um, not really. I'm a big web performance advocate but 45.7KB of JS is small for the service provided. You do not need to use Google Tag Manager either so if you care you can reduce the JS footprint to 18KB.

  It’s overkill for the majority of site owners
Yes. I'd agree with this. You probably do not need it if you are not a business. (Given the source of this point I'm guessing they do not offer most of the features of GA.)

  It’s a liability considering GDPR, CCPA and other privacy regulations
Hard disagree. You can limit PII in GA to a large degree, for example by not gathering the full IP address, and reduce logging.

If GA is your only reason to have a consent option then fine, it's a burden to do. However, if you have any other analytics or services that use cookies, it's a minor checklist item to cover this.

  It uses cookies so you must obtain consent to store cookies
They are repeating their previous claim in a different form here... skipping.

  It’s blocked by many plugins and browsers so the data is not very accurate
Technically correct, but you don't need highly accurate data to make product decisions. GA does sampling after a certain threshold anyway.

  It requires an extensive privacy policy
No. It does not. You should acknowledge it and talk about why you use GA and you can link to GA's privacy policy. Plenty of boilerplate text you can copy to cover this if you really care.

  It’s abused by referral spam that skews the data
Maybe your experience is different, but this is hardly a problem worth noting. Certainly not a GA-specific issue.

  It’s a proprietary product so you need to put your trust in Google
Yeah. Same with any service you use, but ok.

  It worsens the user experience due to the necessity for the annoying prompts
Again, most 3rd-party services on your site mean you need to go this path. There are good and bad ways to do this.

I'm sure this service is worth trying. I would also recommend looking at Simple Analytics: https://simpleanalytics.com/
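
On the point above about not gathering the full IP address: the general idea is to zero out the host part of the address before anything is stored. Here is a minimal Python sketch of that idea. The function name and the prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions chosen for illustration, not a description of how GA does it internally.

```python
import ipaddress


def truncate_ip(raw: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Keep only the network part of an address; host bits become zero."""
    addr = ipaddress.ip_address(raw)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)
```

With these defaults, `truncate_ip("203.0.113.77")` gives `203.0.113.0`, so everyone in the same /24 block looks identical in the stored data.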


If it weren’t controversial nobody would be discussing it.


Not sure how you draw that conclusion. It's made to sound controversial (by a would-be competitor).

My point is that it is really not when you look below the surface.


As someone who still uses GA for multiple sites, can we actually discuss the claims of the article and not the broad idea of it?

> "These two tracking scripts combined add 45.7 KB of page weight to each and every page load."

Google makes fantastic products that are available for free (with limits) because they make most of their money from ads. I may not like it, but gmail, Drive, YouTube are all great and I would prefer people who can't pay for such products have the free option.

> "It’s a bloated script that affects your site speed"

As others have said, no? Not with caching? Anyway, my websites are probably bloated by all sorts of bigger things. I am no web dev and have had no time to optimize them overmuch, and they still work fine. So guilt-tripping me about GA ain't gonna work.

> "It’s overkill for the majority of site owners"

I like having the flexibility; I seriously doubt this is a good pitch for any growing company or even project.

> "It’s a liability considering GDPR, CCPA and other privacy regulations"

Is it? It's not like Google is not addressing GDPR concerns (https://www.cookiebot.com/en/google-analytics-gdpr/)

> "It uses cookies so you must obtain consent to store cookies"

Okay, finally a good one -- nobody likes the cookie popup, so removing GA analytics would improve user experience. Then again, it's so common that people maybe are used to it?

> "It’s blocked by many plugins and browsers so the data is not very accurate"

'There’s no definite answer on how many people block Google Analytics as that depends on the audience of your site, but for a tech audience, you shouldn’t be surprised to see 50% or more of the visitors blocking Google Analytics.' Really? I kind of doubt it..........

> "It requires an extensive privacy policy"

Any analytics should have privacy disclosure, presumably.

> "It’s abused by referral spam that skews the data"

And other analytics aren't?

> "It’s a proprietary product so you need to put your trust in Google"

Fair, Open Source is preferable.

> "It worsens the user experience due to the necessity for the annoying prompts"

A repeat

-----------

So yeah. Look, I am uncomfortable with big huge giant companies like Google ruling the earth. But I also have quite limited time and money, and already use gmail and Drive. So far Google has not done anything to seriously hurt my trust and I think their products are excellent. Most of these points seem weak to me, as someone already using GA analytics.

Still, for someone building a new website, I could see this being enough to make the point. But, I think it would be far more effective if it was more concise and less easy to point out holes in the arguments.


GA can't really be replaced easily.

There is an army of web consultants ready to help you set up and track your marketing with GA, and just as many marketing consultants that can only work with GA.

While it's nice that someone works on alternatives—although I don't really see anything wrong with GA—it's here to stay.

Reminds me of Excel, people have been trying to replace Excel for 20 years now.


[flagged]


I've seen them, and they do seem to have a nice product, but similarly to Plausible, it isn't a direct competitor. Having to pay $19 per month for analytics on a website that doesn't bring in any income (perhaps not yet, but it aspires to, hence the need for analytics in the first place) is the most basic, and a huge, disadvantage versus GA.


Thanks for the mention elliebike!


Why would you need a third party for that? That is basic information your web server already knows.


Great if it's not written in some sort of exotic language and is just in PHP. Don't get me wrong, I'm not a fan of PHP, but if it's written in PHP it can run anywhere, not only on your private, specially configured VPS.



