In fact, we can find alternatives to most of the popular GitHub projects on LibHunt. All you need to do is open a GitHub repo and replace "github" with "libhunt" within the URL. Disclosure: LibHunt is a project I work on.
Side question. PostHog describes an event pipeline. Is there a SaaS/open source server with event pipelines where you can add logic to do things like: send email, send notifications, send an API call, maybe add custom logic, etc.? I've seen things like segment.io, but didn't find a straightforward solution.
Example:
- if a user finished three workouts, send this notification.
- if a user hasn't done a workout in a week remind them etc.
Nobody else does this, and I have hundreds of transactional use cases:
- Trial almost ending
- New signup
- CC expired
- Reminders, confirmations
- Workouts affirmation, reminder etc.
etc.
It's a pain to write a proper rule system and keep an overview of it in my own application.
The challenge is not the simple emails/notifications, but combining them with more complicated rules: AND conditions plus time conditions.
I already have the events in the system, and would love to transfer them and have a stable UI where I can add, create, pause rules, etc.
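To make the two example rules above concrete, here is a minimal sketch of what such a rule check could look like over a list of stored events. It is not taken from any of the tools mentioned in this thread; the `Event` shape, the `workout_finished` event name and the action names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    user_id: str
    name: str            # hypothetical event name, e.g. "workout_finished"
    timestamp: datetime


def finished_n_workouts(events, n=3):
    """True once the user has at least n 'workout_finished' events."""
    return sum(1 for e in events if e.name == "workout_finished") >= n


def inactive_for(events, now, days=7):
    """True if the user's most recent workout is older than `days` days."""
    workouts = [e.timestamp for e in events if e.name == "workout_finished"]
    return bool(workouts) and now - max(workouts) > timedelta(days=days)


def evaluate(events, now):
    """Return the (hypothetical) action names whose rules currently match."""
    actions = []
    if finished_n_workouts(events):
        actions.append("send_congrats")
    if inactive_for(events, now):
        actions.append("send_reminder")
    return actions
```

This only evaluates the conditions. A real system would also need to remember which notifications were already sent, so that the same rule does not fire on every evaluation, and that bookkeeping is arguably the hard part being described.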
PostHog released plugins recently (posthog.com/plugins) - these let you trigger webhooks in other systems based on realtime events that have taken place (as well as exporting/importing data/transformations etc).
You just write a few lines of TypeScript in the platform (avg LOC is something like 80) - see https://posthog.com/docs/plugins/build. There are a bunch of prebuilt ones too.
Regarding PostHog: if I am a startup still running my beta product, $1500 per month is very expensive for a managed service. They should charge by number of API requests. Then I could start using it and naturally grow into the product. I really do not have the bandwidth to self-host when I am busy solving the core customer problem.
I'm not affiliated with PostHog nor am I privy to their internal tradeoffs; I therefore can't comment on how they should charge. However, their self hosting modality is rather straightforward: it's a docker-compose. It takes a few minutes to set up and mostly just works.
We need to include high availability, disaster recovery, DB management, security and other good stuff if we need to run in production. This is where managed services excel. I do not want to do all of this if I am at the early stage of a startup.
I wrote two articles about this for LWN last year. Several of them are self-hostable. Summary:
https://lwn.net/Articles/822568/: lightweight options: GoatCounter and Plausible (open source), Simple Analytics and Fathom (closed)
https://lwn.net/Articles/824294/: more alternatives: Matomo and Open Web Analytics (fairly heavyweight but both open source), Countly (open core), Snowplow Analytics (open source but enterprise roll-your-own product), GoAccess (open source; analyzes web server logs)
Google may sometimes disable AdWords campaigns on sites that use Matomo. They "fix" it every once in a while when Matomo devs reach out to them, but the problem returns after a few months each time.
Is this the case only on the site it is configured in this manner? Or are you saying that Google analytics data is not collected or used on the site with this setting disabled, but that data from the site with the setting disabled may still be used for advertising purposes on other sites?
Sorry if my phrasing is strained, as I was trying to be precise but it may have impacted readability.
> Google may sometimes disable AdWords campaigns on sites that use Matomo.
I first used Matomo to monitor users' web app experience. It was amazingly simple to set up, had good privacy protection/anonymisation, and was perfect for insights on workflow patterns.
Fathom started as an open source project and then closed. When called on it, one of the cofounders got extremely hostile and lied about having said it would stay open (then blocked people who shared the screenshots).
Plausible on the other hand has been really engaged with the community on their Github Discussion board.
I've been a Snowplow user for nearly a decade. It's a bit of work to set up, but it's the best engineered of all those options.
Snowplow's JSON schema events and contexts give you complete flexibility to define a data model that suits your business. Combined with DBT and a BI tool, like Apache Superset, it's vastly more capable than Google Analytics. We have clients running Google Analytics 360 that can't do the stuff we're able to with Snowplow.
I've also used Snowplow fairly heavily (several years ago). It's good for big stuff where you need lots of control and data customization, but it's significantly overkill if you just want basic analytics for your blog or small business website.
It's quite common for us to rely on Snowplow as a source of truth, but use GA for quick exploration. Certain reports in GA are set up so nicely, like "navigation summary" and the more intuitive session definition for marketing. And while Snowplow is real time, there's no effective way to see the same reports as produced by GA.
I'm the maintainer of the project and it's so heartwarming to see it being recommended on this forum.
All the projects mentioned here are great. What I think sets Plausible apart is that we've managed to create a profitable business around a 100% AGPL-licensed codebase (i.e. no dual-license for enterprise version). This means we can keep investing into the product and adding new features without being in the 'thankless OSS maintainer' role that so often ends in burnout.
We're currently working on importing historical data from Google Analytics into Plausible[1] which should make switching even easier for many folks. Stay tuned.
I know that this isn't a support channel, but I set up Plausible on my blog (rmpr.xyz), was very happy with it, and decided to pay for one year. The thing is, when I try to pay, it says my bank refused the transaction (when I pay on other websites like Amazon, it's perfectly fine). One thing I should note is that I'm charged and then refunded a bit later. I don't know if the problem is my bank or your transaction provider. Thank you for such an amazing product, easy to use and without any overhead at all. Kudos to you
hi! we use Paddle for all the payments. i've sent an email to them now to see if they can tell us more. in the majority of cases when a transaction fails because the bank refuses it, they tell us to tell you to contact your bank or to try a different payment method. will get back to you with their response.
Plausible is enough for most, but it doesn't track as much as Google Analytics. It's enough to know which content works, but not enough to understand your users' behaviour.
However removing the cookie banner is a great UX improvement, and respecting your users' privacy is what you ought to do.
I'd also add that the maintainers were nothing short of excellent, and took care of two tickets I opened before.
I also like plausible - low cost for a low traffic site, but without all the more intrusive tracking/features that google has. I use it on about a dozen sites I develop/maintain.
Also has a self-hosted option which is 'free', but you need to pay to host it someplace. I just pay them instead.
I installed Plausible a few hours ago on my VPS. The installation process went very smoothly and it gives me the data I'm interested in. I really did not want to use Google Analytics, and since I have some capacity left on my server, Plausible seemed like a good bet.
I'm really into Plausible. I use their hosted version, and I really like their approach to privacy - it doesn't even use cookies, which means it doesn't trigger the need for an ugly GDPR cookie banner in the EU.
The standard recommendation is https://matomo.org/. I’ve used it in production once (when it was called Piwik) and it seemed reasonable, but I’m not sure how it stacks up right now.
Matomo is also dead-simple to develop plugins for. So if you're looking at a product that can be customized beyond simple analytics, Matomo is a great choice.
I also used it a lot back when it was Piwik and the biggest issue was that it is backed by a MySQL database, and the way the reporting engine was designed meant that it would get pretty slow to work with custom date ranges with large volumes of data. But it did support caching and would pre-build reports over certain date ranges (by day, by month, by week, MtD, YtD).
Currently using a single instance of Matomo for all of my sites, it's pretty good!
Doesn't take too many resources, the UI is pleasant, the functionality is everything that i'd expect from a lightweight analytics solution like Plausible plus bunches of additional stuff in regards to enriching the data (e.g. events, information about devices and software, even performance data in the latest releases).
My only concern so far is that if i try to open the aggregate view of all of my website data for the past month (<20 sites, the most popular of which has 50k views i think), the CPU usage for both the main instance and the DB behind it spikes for a while, which makes me think that the code underneath has either an N+1 problem, or there's inefficient data aggregation going on. I mean, if Zabbix can show me data about a dozen servers in a single view, why can't Matomo do something similar without sometimes throwing errors?
Apart from that nitpick, it's also pretty reasonable as far as GDPR goes (there's a whole section of tools for it) - currently i don't even use cookies for tracking (a global toggle in the options is available, as well as config for individual sites), anonymize the IP addresses but have a GeoIP database set up (for free) that lets me figure out the approximate location and still get the device info, which is enough to give me insights into who uses my sites and what screen resolutions etc. i should target.
Overall, i'm really pleased with my choice of Matomo!
Now, on the other end of things, i also think that most projects out there should have an APM solution of sorts, for which i use another piece of free software, Apache Skywalking, which is a bit more cumbersome, but still powerful and also Zabbix for server/VPS monitoring overall. It's really nice that we can set something like that up entirely for free (hosting costs aside).
ClickHouse is an extremely fast SQL data warehouse that supports vectorized query execution, column storage, and efficient compression. It's also Apache 2.0 licensed and sets up on a dev laptop in about 60 seconds. The SQL is somewhat idiosyncratic (in a good way, I believe). It takes some experience to use it to its full capabilities.
Not all projects move there but if you want fast analytic queries on a few billion rows up to a few petabytes, it's really hard to beat.
Disclaimer: I work for Altinity, which supports ClickHouse.
I switched to goat counter recently and I’m happy with it. It’s not exactly an open source alternative to Google Analytics, because it’s more privacy oriented. If you want to track and profile your audience, it’s probably not what you want. I just want to get traffic numbers by page, though, so it’s perfect. They even support a hit counter pixel instead of adding a script tag if you want even more minimal, privacy-friendly stats.
Huh, the UI seems refreshingly lightweight and boring! Feels very much like the first time i saw Kanboard and was surprised at how minimalistic and snappy it felt in comparison to something like Jira: https://kanboard.org/
(well, most software probably feels more snappy than Jira, but Kanboard was also amazing)
I have been building a self-hosted analytics platform[0] (note that it's not free or open-source, just partially source-available) that is focused on ease of self-hosting. It is most similar to Matomo but with better performance, a simpler UI and features that they only provide in their paid plans.
I used simple technologies (MySQL/PHP) for performance and portability reasons and, compared to other self-hosted alternatives, it provides features that you can only find on expensive SaaS product-analytics platforms (heatmaps, session recordings, A/B tests, etc.).
Let me know if you have any questions about UXWizz or self-hosting in general.
Looks like a great tool!
Questions: I think you might be based in the Netherlands(?), so how does UXWizz conform to European privacy laws? I see that it tracks IP addresses, is that legal? How does it handle cases where people don't respond to or opt out of EU cookie consent?
> I see that it tracks IP addresses, is that legal
My specific instance of UXWizz does not track IP addresses (you can disable that in the tracking settings), but only a hash of the IP+User Agent.
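For readers wondering what "a hash of the IP+User Agent" might look like in practice, here is a minimal sketch. It is not UXWizz's actual implementation, which is not shown in this thread; the use of a keyed HMAC-SHA-256 digest and the `secret` parameter are assumptions made for illustration.

```python
import hashlib
import hmac


def visitor_hash(ip: str, user_agent: str, secret: bytes) -> str:
    """Return a keyed SHA-256 digest of IP + User-Agent as a hex string.

    `secret` is a hypothetical server-side key: without one, the small
    IPv4 address space makes a plain hash easy to brute-force.
    """
    message = f"{ip}|{user_agent}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

The same visitor yields the same digest for as long as the secret stays the same, so sessions can still be grouped without storing the raw address. Whether such a digest still counts as personal data is a separate legal question.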
As for the privacy laws regarding tracking IP addresses, it's not very clear whether it requires consent or not as the law is ambiguous. The IP itself alone can not be directly considered personally identifiable information (PII) as having an IP address only, can not define which real person is the data associated with. Another thing to consider regarding IP addresses is that by default almost all services/servers/devices have an access log that does store IP addresses in plain text, so even if UXWizz does not store the IP, your OS/server/router/ISP might do.
> where people don't respond or opt out of eu cookie consent
As explained in the documentation above, UXWizz does not use any cross-session persistent storage on the user's system (not cookies nor localStorage).
Also, being self-hosted, no data is shared with/sent to 3rd parties.
If you need more detailed answers or want a deeper discussion on this subject you can contact me on Twitter or via the contact form on uxwizz.com.
Thanks.
I tried to optimize it as much as possible (mostly MySQL optimizations). All the data shown in the dashboard is generated on-the-fly in real-time (no archives/cached results). It's hosted on a cheap VPS in Germany (5EUR/month).
The performance does degrade a bit when your database reaches 20 million+ sessions tracked, but for the average website receiving <100k monthly visits it should be fast.
> As for the privacy laws regarding tracking IP addresses, it's not very clear whether it requires consent or not as the law is ambiguous. The IP itself alone can not be directly considered personally identifiable information (PII) as having an IP address only, can not define which real person is the data associated with.
"PII" has a specific meaning, in American law. Sites with references to it are likely not relevant to you, as you are based in Romania. The GDPR is crystal clear that IP addresses are personal data. There is no ambiguity. Depending on how you derive the hash of the IP and user agent, this could also be an "identifier" that may be personal data.
But! There are six different reasons you can legally process personal data. Consent is only one of them. It is quite likely that a website owner would have a valid legitimate interest basis for having analytics. This does not require consent from the user.
The only caveat is that if the analytics needs a cookie (or a local or session storage item), then you must seek consent for the cookie.
https://panelbear.com/ isn’t open source but privacy friendly. I found it on HN (the founder is on here), can’t speak more highly of it. If you are tracking more than one site and want to get a good overview I recommend it.
Panelbear is truly great. I moved crontab.guru to panelbear in October and I’ve been extremely happy. The site does millions of page views a month but the analytics are still fast and responsive.
While not focused on OSS, this was asked 20 days ago at https://news.ycombinator.com/item?id=29662859 as "Ask HN: Best alternatives to Google Analytics in 2021?". It listed some OSS versions there.
i've tried this for a high volume site, but since all events are stored in a normal RDBMS, it gets slower [1][2]. it works well for low-traffic sites though.
Another vote for Plausible, I pay for the hosted version as I like to support their way of doing things but I know people who self-host it and it's not hard to do.
It's much simpler, though. You get aggregate stats on who is sending you requests, which is useful but not what people have come to expect from "product oriented analytics."
Using a middle-man proxy for GA is an interesting idea I've seen. GA can take input from the backend, rather than the frontend, if you wish. So you could do some tokenizing/removal/etc. of sensitive data, like IP addresses, but still use GA for its reporting strengths.
Edit: The GA API does allow for things like overriding the geolocation, so that if you aren't sending the IP address, there's still relevant geo data to report on.
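As a rough illustration of the "removal of sensitive data" step such a proxy could perform before forwarding a hit, here is a minimal sketch that truncates IP addresses. The prefix lengths (/24 for IPv4, /48 for IPv6) and the `ip` field name in the hit dictionary are assumptions, not part of any GA API.

```python
import ipaddress


def anonymize_ip(ip: str) -> str:
    """Zero the host part of an address: /24 for IPv4, /48 for IPv6."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def scrub_hit(hit: dict) -> dict:
    """Return a copy of a hit with its (hypothetical) 'ip' field truncated."""
    cleaned = dict(hit)
    if "ip" in cleaned:
        cleaned["ip"] = anonymize_ip(cleaned["ip"])
    return cleaned
```

The proxy would call something like `scrub_hit` on each incoming request and only then pass the cleaned payload on to the backend collection endpoint.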
I have an indie project called Fugu (https://fugu.lol). Fugu is an open-source (https://github.com/shafy/fugu) and privacy-friendly product analytics. If you're looking strictly for web analytics, it might not be a good fit.
It's still an early version, but basic things like tracking events with properties, and analyzing those, work very well. I'm currently working on adding conversion funnels. It's free to self-host, and I provide a managed version for $9/month flat.
I've started Fugu because I wanted a product analytics software that is privacy-first (e.g., no possibility of tracking unique users), open-source and simple. I liked using PostHog but it got too fancy, complex and convoluted for my taste - a common theme among analytics software in my experience.
If you're looking for a pure web analytics solution, I can absolutely recommend Plausible (https://plausible.io). I also use it for my static page at Fugu.
As an FYI, Snowplow recently launched its Open Source Quick Start, a set of terraform modules, which automates the setting up & deployment of the required infrastructure & applications for an operational Snowplow open source pipeline, with just a handful of input variables required on your side.
Read more at https://docs.snowplowanalytics.com/docs/open-source-quick-st...
Cheers,
Eddie
I think Matomo is quite similar to Google Analytics which many people feel is bloated and confusing from the user's perspective. The idea with Plausible is to simplify web analytics and make it more understandable compared to what GA/Matomo offer.
Granted, Matomo does have more depth and features in some areas. It can be the better choice if you want to go very deep into analytics and need some power features that Plausible might not support.
We wrote a little (clearly biased) comparison with Matomo[1]. I hope we're not too harsh on it because Matomo is a great project and still a good fit for many people. But obviously we feel like a modern and simplified take on web analytics fits better for the majority of website owners.
PostHog is more focused on product analytics rather than web analytics. It's a very different product to Plausible so we didn't really do a comparison with them. I would say they're more of an alternative to Mixpanel, Amplitude and those type of products rather than an alternative to Google Analytics and other web analytics tools.
I find it a bit frustrating that most analytics tools do not support sending events through API calls, like in Mixpanel, where it is possible to send events from the backend rather than embedding a JS snippet on the frontend. Why is that?
I had to ask myself this question some time ago. I decided to build it myself instead. Starting point being the principle that Simple Analytics explained in their blogpost some time ago: if referrer is from the same site, then it's just a visit, otherwise it's a unique visitor. I mixed in some more browser details, also used some blacklist of known bots, and now it's functioning pretty accurately. I don't have the tracking that GA does, but that was exactly the point.
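The referrer principle described above can be sketched in a few lines. This is only an illustration of the idea, not the commenter's code or Simple Analytics' implementation; the bot marker list and the function name are hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical, illustrative list; a real blacklist would be much longer.
BOT_MARKERS = ("bot", "crawler", "spider")


def classify_hit(page_url: str, referrer: Optional[str], user_agent: str) -> str:
    """Classify a hit as 'bot', 'pageview' (same-site referrer) or 'unique'."""
    if any(marker in user_agent.lower() for marker in BOT_MARKERS):
        return "bot"
    if referrer and urlparse(referrer).hostname == urlparse(page_url).hostname:
        return "pageview"
    # No referrer, or a referrer from another site: count a new visitor.
    return "unique"
```

Nothing is stored per visitor here, which is the point of the approach: the count of unique visitors is derived from where each request came from, at the cost of some accuracy.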
I finally made the switch from GA to Open Web Analytics. I'm already fairly experienced with PHP, which I considered a point in its favor, but I honestly haven't had to do anything with it other than copying it to a server and configuring a few basic settings.
The tracking code seems very lightweight, and I haven't found it lacking any of the features I was using in GA. I've tried a few, and OWA was the first that met all my criteria (100% free software, open source, actually works).
Why are most tools for analytics so expensive? At least, I find them to be compared to other SaaS products. Is a significant amount of traffic the reason? I was thinking about implementing something myself once, but was worried about the cost of receiving all the event traffic.
Thanks for mentioning, glad to see counter popping up organically ;-)
I'd say what sets us apart from others is aiming at providing user value for free by aggressively optimizing hosting costs. But you don't need to care about that: it looks great, it works, and it's free. It has been for way over a year now. Cheers
Is there a good reason not to use GA for client side analytics? Is the juice worth the squeeze for a self-hosted/ FOSS solution?
Obvy client side isn't as 100% as looking at the logs, but I've worked with Adobe Analytics and GA (on the free plan) for a few years. UI in GA is much more intuitive to me than Adobe, and I use the tight integration with Google Datastudio pretty often to make reports or slide decks.
Personally I like the well documented GA API to run reports against, plus the Python and R API wrappers. For me the only downside is the level of sampling they use. With Adobe Analytics, the API is not as well documented, but they don't sample like GA. That said, I wouldn't want to foot the Adobe Analytics bill every month for my side projects.
I really liked RudderStack for gathering the data, also has some extra nice features for pushing upstream to other providers. Apache Superset for building the dashboards to display the data.
PostHog is the gold standard here. The feature-set goes far beyond GA and essentially replaces a lot of the other tools you may end up needing (FullStory, Amplitude, etc.)
Of the answers posted here I am curious if any of them are suitable for mobile (iOS/Android) or desktop (Mac/Windows) apps, or if they are all web-oriented?
PostHog has iOS/Android/Flutter/React Native libraries for analytics - full details https://posthog.com/docs/integrate#client-libraries. We don't have session recording / feature flags everywhere yet though (I work there)
Do the PostHog Android and iOS libraries automatically collect & report metrics like OS version & app version so I can easily track how many users are on each version? We're still using Flurry right now but looking for a replacement.
Every time this question / topic comes up, I read through the replies and can never find one for Mobile apps. I guess it might be time to start my own.
https://www.goatcounter.com/ is simple, effective visitor counting with a fast Go / Postgres solution. There's an easy JavaScript solution to count actions on a page. GDPR compliant! It just looks a bit spartan.
as a side note - is there something that does not require installing a shitload of unnecessary software, databases and services? Let's say simple PHP+MySQL
It has been used for about 3 years with metrics tracked directly via SQL queries. Today, I have a fancier dashboard that makes graphs from the SQL queries, but that part is optional (my Grafana dashboard screenshot: https://ibb.co/S3RTyjm).
After trying a few other analytics tools, it's hard to beat the power of just asking a question in SQL
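As a small illustration of "just asking a question in SQL", here is a sketch using an in-memory SQLite database. The commenter's actual database and schema are not described, so the `pageviews` table, its columns and the sample rows are all assumptions.

```python
import sqlite3

# Hypothetical schema: one row per page view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pageviews (path TEXT, viewed_at TEXT)")
conn.executemany(
    "INSERT INTO pageviews VALUES (?, ?)",
    [("/", "2022-01-01"), ("/", "2022-01-02"), ("/pricing", "2022-01-02")],
)


def views_per_path(connection):
    """Answer 'which pages get the most views?' with a single query."""
    return connection.execute(
        "SELECT path, COUNT(*) AS views FROM pageviews "
        "GROUP BY path ORDER BY views DESC, path"
    ).fetchall()
```

A dashboard tool can then be pointed at the same query to chart it, which matches the "optional" role the dashboard plays in the comment above.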
Afaik none of them offer it out of the box but I've seen devs recommend it.
Example: umami has a default umami.js file called by JS. uBlock Origin has a rule that blocks any file with that name. I rewrote its URL to something like somerandomfile.js using caddy. And now uBlock doesn't block it.
I was using Avodocs (https://www.avodocs.com) to produce a privacy policy for our MLOps platform, https://iko.ai, but they didn't have PostHog in the list for the "Analytics" section, and they assumed that doing analytics implied sending user data to a third party site or something.
I tweeted at them and they were lightning fast in reaching out and adding PostHog to the options of the privacy policy template. It's really cool: https://twitter.com/jugurthahadjar/status/144733750656389120...