Ask HN: Alternatives to Google Analytics in 2023
39 points by jimnotgym on Jan 15, 2023 | 36 comments
I have a Flask site hosted on Linux with Nginx and gunicorn. It is a site to catalogue development of Bonsai trees, so not ecommerce.

I would like to see visitor numbers at times of day and probably a few more things in the future. In the past I would have stuck Google Analytics on and got on with my day. Now I want to avoid the cookie banner, and the shitty tracking of my users. I don't want to code analytics, I want to work on features. It feels like my nginx logs have the data I need. Is there a simple tool that analyses this? Or should I just find a less crappy front end analytics service?

What is everyone using in 2023?

The site is at Bonsai-garden.com just so you can imagine what I am going to need.




As others have mentioned, the most popular simplified ones are: Cloudflare Web Analytics, Fathom, and Plausible. Cloudflare (as well as Simple Analytics) use only HTTP referrer to count visits, which makes them the most privacy-friendly but also limits what they can do in terms of tracking an individual session. Fathom and Plausible both use hashed IP and User-Agent, which allows for more functionality, though they still can't track returning visits across separate days because they rotate their salts every day.
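To make the "hashed IP and User-Agent with a daily salt" idea concrete, here is a minimal sketch. It is not how Fathom or Plausible actually implement it; the names `salt_for` and `visitor_id` and the in-memory salt store are hypothetical, and a real tool would also need to delete old salts for the rotation to have any privacy value.

```python
import hashlib
import secrets
from datetime import date

# Hypothetical in-memory salt store: one random salt per calendar day.
# (Assumption for illustration; old salts would have to be discarded.)
_daily_salts: dict[date, bytes] = {}


def salt_for(day: date) -> bytes:
    """Return the salt for `day`, creating a random one on first use."""
    if day not in _daily_salts:
        _daily_salts[day] = secrets.token_bytes(16)
    return _daily_salts[day]


def visitor_id(ip: str, user_agent: str, day: date) -> str:
    """Hash IP + User-Agent with the day's salt; raw values are not kept."""
    h = hashlib.sha256()
    h.update(salt_for(day))
    h.update(ip.encode())
    h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    h.update(user_agent.encode())
    return h.hexdigest()
```

The same visitor gets the same ID within one day, so pageviews can be grouped into a visit, but the ID changes once the salt changes, which is why returning visitors can't be recognised across days.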

GoAccess, AWStats, or other log analyzers can get you a lot of the same data, but they also have even more trouble identifying bots and have little to no ability for customization. Also, if you use client-side-only JavaScript functionality on your site, there's no ability to track that, so if you wanted to track how many people zoom in on cool bonsai tree pics you wouldn't be able to do that with a log analyzer. Those also don't work well with CDNs.

There are other more sophisticated tools like Matomo and Piwik Pro that are similar to GA3 in functionality but have the ability to work without cookies if that's what you want. Looks like you don't need something that involved. I'd probably go with either Plausible or possibly Cloudflare if you're looking for something free. Looks like you're already using Cloudflare for some CDN assets.

I've written a book on this subject that covers 15 different options: https://www.quantable.com/analytics/google-analytics-alterna...


> possibly Cloudflare if you're looking for something free.

I only tried Cloudflare in response to this thread, but I am considering their image hosting. One thing though, they only have very very basic analytics on their free plan, like total visits per day.

Edit: Correction. It is free but their dashboard tries to convince you to upgrade.


Cloudflare has two different analytics products which makes things pretty confusing. "Web analytics" is always free and under Analytics > Web Analytics at the account level. That's the one you want, not "traffic analytics" under Analytics > Traffic in the individual website.

But yeah, it is very basic: visits, pageviews, referrer, browser, RUM metrics and a few other things. Tools like Plausible give you some more features but still a lot less than GA3. Tools like Piwik Pro give you something equivalent to GA3. Product analytics tools like PostHog give you something more like GA4.


> though they still can't track returning visits across separate days because they rotate their salts every day.

Do we have something in between? Knowing return visitors and not GA?


The majority of GA alternatives use cookies, which of course do allow tracking returning visitors.

If you're asking what GA alternatives don't use cookies but DO allow tracking returning visitors, then the list gets pretty limited. The only tools I review in my book that don't use cookies by default but do allow tracking of return visitors are Clicky (which does it with User Agent + IP) and Visitor Analytics (which does it through fingerprinting).

User Agent + IP as used by Fathom and Plausible could track returning visitors if they didn't rotate their salts, but if they did this then you would need to ask for consent to track. IMHO, the issue that determines if you need to ask for consent is this persistent tracking, not the technology you use.


Can you review another tool? https://usermaven.com


Thanks for the suggestion. I've added it to my list of alternatives that I'm tracking, but it doesn't appear to have the install base required for inclusion in my book yet. I'm currently looking for GA alternatives that have at least .1% adoption within the top 1M sites.


Usermaven tracks returning visitors by fingerprinting.


One piece to consider is where to store data and how it may be leveraged for counts of things, over time. For basic analytics, a simple dashboard may be created independently of the data store.

FeatureBase is a super fast, highly efficient, in-memory analytical data store which may be queried using SQL. We have customers handling 10s of billions of events with it. It's free, Open Source and available here: https://featurebase.com/. The technology is based on Roaring Bitmaps: https://roaringbitmap.org/

There are reference Docker containers which may be used for development or reference for building a deployment: https://github.com/FeaturebaseDb/featurebase-examples

We have a Discord here if you'd like to discuss what can be done with the product: https://discord.com/invite/featurefirstai


I recommend Piwik Pro for web analytics. It is by far the best UA replacement as it's pretty much cloned its functionality (and added new features like custom dashboards). It's a spin-off from Matomo that has moved more towards enterprise and serves some very large privacy-centric EU companies and institutions.

It's got a generous free tier, but costs a bit if you go over 500k events a month.

It's built on ClickHouse, while Matomo is built on MySQL, which in my opinion is pretty much a nail in the coffin for Matomo's future proofing. Matomo is fine for simple analytics needs but it sucks for tracking sites with custom events with custom dimensions.


If you'd be content with the data already in your Nginx logs then there's GoAccess which is a command line tool that can parse web server logs and give you a breakdown of page hits/unique visitors over time, as well as data from user agents.

https://goaccess.io/
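For a sense of what such a log analyzer does under the hood, here is a rough sketch that counts hits and unique IPs per hour of day. It assumes nginx's default "combined" log format; the function name `hourly_counts` is made up for illustration, and this is nowhere near what GoAccess does in terms of bot filtering or reporting.

```python
import re
from collections import Counter
from datetime import datetime

# Assumes nginx's default "combined" log format:
# ip - user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+'
)


def hourly_counts(lines):
    """Return (hits per hour, unique IPs per hour) for the given log lines."""
    hits = Counter()
    ips = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not match the assumed format
        # The hour is taken as logged, i.e. in the log's own UTC offset.
        ts = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        hits[ts.hour] += 1
        ips.setdefault(ts.hour, set()).add(m.group("ip"))
    uniques = {hour: len(addrs) for hour, addrs in ips.items()}
    return hits, uniques
```

You would feed it an open log file, e.g. `hourly_counts(open(path))`, where `path` is wherever your nginx access log lives. Note that "unique IPs" is only a crude stand-in for unique visitors.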


That is really cool. I love it


Even if you find an existing tool to parse your web server’s logs, this will still require a time investment from your end. You’re right to conclude that that’s not your core business.

I use Plausible [1] which is basically GA but respects privacy and its code is much smaller (<1KB). I’m happy with it. Doesn’t do more or less than satisfying my need for visitor insights.

[1] https://plausible.io/


If you're looking for one that gives you more control, there is a starter kit from Tinybird to build your own GA replacement over a managed ClickHouse. It has the basics OOTB, but you can customise it as much as you want, capture different data, do different analytics, custom frontend, etc. https://github.com/tinybirdco/web-analytics-starter-kit


I'm using Fathom for my personal sites now and Plausible at work.

I did this 1) because of EU privacy rulings and 2) because Google Analytics is deprecating GA3 (which is website focused) for GA4 (which is app focused). The UX for GA4 SUCKS. It is so hard to find basic info like what pages are people most looking at, a realtime view, what are all the domains referring people to me, etc.

So far I like Plausible better; it's simple and focused on websites, whereas Fathom seems ready to hook into more complex martech that I have no interest in.

That said, Plausible supports Google Analytics already and Fathom doesn't yet. In contrast, Plausible doesn't support TFA and Fathom does.

Plausible also has alerts on spikes and sends summary emails at your desired frequency whereas Fathom doesn't seem to do either.

So I guess the analytics market is still somehow in early days.

Also, Michael Lynch pointed out to me that if you plan to sell a site, buyers expect access to Google Analytics data specifically. Something to keep in mind.

Also, it's good to have server side analytics since this will uncover at least 100% more legitimate users (on tech sites especially). So I tried out Fastly and Netlify which show basic analytics but don't give you access to access logs. I ended up just hosting on an OVH VM with 7day block-device backups in case it goes sideways.


Plausible is a good solution and pays a lot of attention to privacy issues. It's the other product I would look at along with PostHog. The analytics are based on ClickHouse and are very fast.

disclaimer: My company Altinity has worked with Plausible for a number of years.


I don't know if many still use it but AWStats [1][2] is one of the oldest self hosted solutions. I used it ages ago. It uses access logs vs. other platforms depending on javascript being embedded in pages. It's had some vulnerabilities in the past so it should be protected by authentication. It depends on Perl and a web server that supports CGI scripts. Stats processing is done in cron against the access logs so there is no added load based on page visits. It could even be run on a separate machine/node that logs are aggregated to.

Part of the installation is configuring which log format your servers use. If using a non standard format one can map the related fields.

If configuring AWStats to perform reverse DNS lookups, I would suggest also installing Unbound DNS specifically on that machine and setting the following in /etc/unbound/unbound.conf (perhaps with even higher values) to minimize load on your upstream servers:

    cache-min-ttl: 86400
    cache-max-ttl: 1209600
    serve-expired: yes
    serve-expired-ttl: 259200
    serve-expired-reply-ttl: 30
    serve-expired-ttl-reset: yes
    val-bogus-ttl: 600
    cache-max-negative-ttl: 86400
    serve-expired-client-timeout: 1800

The above is specifically for machines dedicated to looking up a very large number of PTR records, not for a home router. Some DNS providers rate limit mass lookups of IP addresses. PTR records are slow to change, so a high TTL override on a log processing box is generally fine unless you care about short-lived PTR records (AWS, Azure, etc.).

[1] - https://awstats.sourceforge.io/

[2] - https://github.com/eldy/AWStats


I think your title and your description may already show a little problem: You say you're looking for a GA alternative, but what if GA was never the right tool for your use case to begin with? Your use case description is pretty minimal:

> "I would like to see visitor numbers at times of day and probably a few more things in the future."

GA is a very complex tool, that often is complete overkill for small sites. So when you ask for GA alternatives, you will probably hear Matomo as a suggestion – but this also is in the same area of complexity as GA and would probably not be a good fit for your project (and it wouldn't solve your cookie problem without proper configuration).

So maybe a better question would be "What analytics tool should I use for project XYZ?"

If you want a minimal approach, take a look at Umami (https://umami.is/), which is about as minimal as it gets (when talking about frontend JS analytics tools).


https://posthog.com/ is quite good (just a happy user)


+1. My company hosts ClickHouse data for PostHog installations. The users I've talked to are very happy with it.

Disclaimer: We are a partner with PostHog.


Since you care about privacy, Cloudflare Web Analytics might be a good fit for you: landing page [0] and announcement blog post [1]

[0] https://www.cloudflare.com/web-analytics/

[1] https://blog.cloudflare.com/privacy-first-web-analytics/


Try https://usermaven.com. It has both web analytics and product analytics. You can choose the plan according to your business needs.

It is also privacy-friendly and fast (using ClickHouse).


Here are some for you to consider:

https://github.com/matomo-org/matomo

https://github.com/plausible/analytics

https://github.com/arp242/goatcounter

All 3 of them are open source and you can host them yourself. If you don't want to self-host, all 3 have hosted options for you.

I am currently self hosting Matomo and am happy with it.

Fun timing for you to ask, as I'm just running a poll to see what others are using:

https://fosstodon.org/@softinio/109689221718875173


Amplitude if you want to understand how a user interacts with your product. They do behavioral / product analytics.

They also have a series of books written on analytics and user behaviors, which you can find by searching for amplitude playbooks.

They do offer a free tier and can ingest from different data sources if you want to provide your own.

https://amplitude.com/

(Disclaimer: My wife works there, but my workplace also uses it to help determine UX issues in our product.)


Looks like a cool product but I am immediately turned off by having to “contact sales”.


Something new to keep your eye on: https://volument.com Generates insights. Privacy-friendly.


One alternative path is to still use GA, but don't expose their JS to your end users. Submit the data from your backend, via their api, and omit anything you don't want sent. https://developers.google.com/analytics/devguides/collection...


Are you aware of any drawbacks to this?

The page listed says explicitly that it's not intended to replace data collected from gtag, Tag Manager, and Google Analytics.


This:

"Geographic information is only available via automatic collection from gtag, Google Tag Manager, or Google Analytics for Firebase."

But otherwise, it seems useful.
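A rough sketch of what "submit the data from your backend" could look like, assuming the API in question is the GA4 Measurement Protocol. The endpoint URL, the `measurement_id`/`api_secret` query parameters and the `client_id`/`events` payload shape are my understanding of that API, so check them against the documentation linked above; the function name, event name and parameters are hypothetical. The code only builds the request and does not send it.

```python
import json
import urllib.parse
import urllib.request

# Assumption: GA4 Measurement Protocol collection endpoint.
MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"


def build_event_request(measurement_id, api_secret, client_id, name, params):
    """Build (but do not send) a POST request carrying a single event."""
    query = urllib.parse.urlencode(
        {"measurement_id": measurement_id, "api_secret": api_secret}
    )
    # Only include what you are willing to share; nothing else is added.
    body = {
        "client_id": client_id,
        "events": [{"name": name, "params": params}],
    }
    return urllib.request.Request(
        f"{MP_ENDPOINT}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it would be a matter of passing the result to `urllib.request.urlopen`. Since your server decides what goes into `params`, nothing reaches Google that you didn't put there yourself, at the cost of the automatically collected data (such as geography) mentioned above.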


Maybe try https://goaccess.io/

It parses your logs and generates reports in HTML or text form.


Check out Simple Analytics, https://www.simpleanalytics.com/en

Their summary basically is: Privacy protection is our business model / Your data is always encrypted / We never, ever, ever store any personal data about your visitors. No cookie banners. / We are an EU-based company with EU-based servers. / You own your data.


If you are into self-hosting, check out the platform I'm building (UXWizz). You can toggle specific tracking features on or off, and you can also store custom events/data if you need. It uses a LAMP stack, so it's also easy to connect to the MySQL database and query for specific data if needed.


Plausible is great. It’s not full-featured, but it’s enough for most. I’m a happy user.


I migrated from GA to Plausible.io and I’m very happy with it.


getinsights.io is good and privacy friendly and has a nice free tier.


Plausible.io



