
Umami: Self-hosted open-source alternative to Google Analytics - bananaoomarang
https://umami.is/
======
mcao
Hi everyone!

Author of Umami here. I totally did not expect this response so it looks like
you all hugged my little server to death. The demo should be back up now.

A little background. This is a side project I started 30 days ago because I
was tired of how slow and complicated Google Analytics was. I just wanted
something really simple and fast that I could browse quickly without diving
through layers of menus. So I created Umami to track my own websites and then
open sourced it. The stack is React, Redux, and Next.js with a PostgreSQL
backend.

Would be happy to answer any questions you have.

~~~
nodesocket
How does a post on HN that has 591 points and on the front page only have 1184
views and 567 visitors in the last 24 hours according to the live demo?
Something is not right. Should be seeing lots more page views and users right?

EDIT: just noticed the demo is for another site flightphp.com not the landing
page umami.is which is sort of weird. That explains it. The demo should really
be demoing the metrics for umami.is. Which is a shame, because that would
prove how scalable umami.is is. Unfortunately umami.is is not eating its own
dog food.

~~~
mcao
I actually am using it to record metrics for umami.is:

[https://app.umami.is/share/8rmHaheU/umami.is](https://app.umami.is/share/8rmHaheU/umami.is)

I'm using it for all my websites. The reason I went with another site for the
demo is because I wanted something with at least 30 days of data so users can
play around with the different settings. Once I get enough data, I'll switch
it over.

~~~
archon810
Just FYI, some mobile optimization is needed.

[https://imgur.com/a/j9MYG9z](https://imgur.com/a/j9MYG9z)

~~~
nodesocket
I recommend creating an issue on GitHub:
[https://github.com/mikecao/umami](https://github.com/mikecao/umami)

------
malisper
One of the claims of Umami is that it's GDPR compliant:

> Umami does not collect any personally identifiable information so it is GDPR
> and CCPA compliant. No cookie notices are needed because Umami does not use
> cookies.

From auditing the source code, this doesn't seem to be the case. First, it
claims it doesn't use cookies, but it clearly uses localStorage to store a
"sessionKey"[0].

The other claim, that Umami is GDPR and CCPA compliant because it does not
collect any personally identifiable information is only half true. While the
data collected isn't PII (because you can't use it on its own to identify a
user), it's still "personal data". This is because the "sessionKey" stored
alongside all events is actually a pseudonymous user identifier. It's really
just a hash of the user's IP along with a few other properties[1]. Because the
data Umami collects, when combined with some other data, can be attributed
back to the user, the data is still considered "personal data". That means
you're still subject to most of GDPR such as GDPR deletion requests[2].

[0]
[https://github.com/mikecao/umami/blob/f4ca353b5c68750bf391e5...](https://github.com/mikecao/umami/blob/f4ca353b5c68750bf391e5874f19c609b9c421ef/tracker/index.js#L44)

[1]
[https://github.com/mikecao/umami/blob/master/lib/session.js#...](https://github.com/mikecao/umami/blob/master/lib/session.js#L29)

[2] [https://gdpr-info.eu/art-17-gdpr/](https://gdpr-info.eu/art-17-gdpr/)
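To illustrate why a hashed session key still behaves like a pseudonymous identifier, here is a minimal sketch. It is not Umami's actual code: the function name, the choice of input fields, and the salt are all assumptions made for illustration. It only shows that hashing the same request properties always yields the same stable ID.

```python
import hashlib


def session_id(website_id: str, ip: str, user_agent: str, salt: str) -> str:
    """Derive a pseudonymous session identifier from request properties.

    The field choice and the salt are assumptions for illustration only.
    """
    material = "|".join([website_id, ip, user_agent, salt])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# The same visitor (same IP and user agent) always maps to the same ID,
# which is what makes the value a stable, pseudonymous identifier.
a = session_id("site-1", "203.0.113.7", "Mozilla/5.0", "secret")
b = session_id("site-1", "203.0.113.7", "Mozilla/5.0", "secret")
c = session_id("site-1", "203.0.113.8", "Mozilla/5.0", "secret")
```

The raw IP is never stored, yet anyone who holds the inputs can recompute the hash and link events back to a visitor.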

~~~
mcao
I am not a lawyer so I cannot say for sure what constitutes PII and what
breaches GDPR. I am using the same techniques as Fathom Analytics,
Plausible.io and other products. Everything is hashed into a unique session id
and none of the actual data like user agent or IP address is actually stored.
It is the same data that is found in server log files. In the strictest
interpretation of GDPR, I don't think any analytics product can exist.

As for the localStorage, it's just for performance so I don't have to
recompute the session hash. The product will work the same without it. But
seeing as it is a cause of contention I am probably going to remove it.

~~~
chmod775
An IP address is considered personally identifiable information in at least
Germany. If you're storing that you'll already have to think about the GDPR.

This is just another misguided attempt to adhere to the letter of the law
while going against its spirit. It is misguided because it's based on a wrong
understanding of what the letter of the law actually is. You see this a lot with
adtech and analytics companies who try to skirt regulations through elaborate
mechanisms but ultimately in vain.

~~~
dorgo
>This is just another misguided attempt to adhere to the letter of the law
while going against its spirit.

It's easy to say this and hard to draw a line between PII and what I can store
without consent. "yesterday I sold 5 products on my website" is not PII (I
hope). If I store the timestamps for each purchase I'm already in the grey
area. One could combine the timestamps with other data to identify my
customers.

------
eric4smith
Lots of home-grown analytics are very privacy focussed these days and do not
use cookies. That's a good thing.

For simple sites like blogs, simple low volume ecommerce, etc.

But for more "serious" eCommerce, SAAS based applications and sites that are
concerned with marketing on email, social and web, then optimizing what
you show them, and finally generating leads for salespeople to call or actual
sales...

Cookies or local storage, or some way of tracking the customer across all the
channels and their actions are essential.

If one can avoid using Google Analytics, then that's a good thing also.

But let's get real -- the idea of a cookie-less future is not gonna happen
because people actually do business on the web.

~~~
Malfunction92
Exactly, other than very minimal metrics you can't do much of anything without
cookies. It's great that there are now many alternative analytics services
available, but I feel like they all just do the exact same thing – stick a
two-line script on your website, then get some very minimal data about your
website. This is probably good enough for most people, but it becomes very
hard to actually do anything with this data if you're running a more "serious"
project.

But I'm always amazed at how much popularity these projects seem to gather. I
myself made a very simple landing page [1] for a similar service (but one that
caters more to the saas based applications), and it's managed to gather some
interest even though I've barely done any promotion to it.

[1]: [https://tinylens.io](https://tinylens.io)

------
andrewzah
I have been using goatcounter [0] and love the simplicity. I used to use
Matomo, but they want a lot of money to see the referrals from google
search/etc. And it's a heavier dependency. Goatcounter is a drop-in golang
binary.

[0]:
[https://github.com/zgoat/goatcounter](https://github.com/zgoat/goatcounter)

------
lxe
I've seen a bunch of these simple self-hosted log dashboards here on HN, but I
don't think they directly compare with google analytics, which is just a much
more powerful and much much more complicated product. Not to say this isn't a
great product, but it really isn't an alternative to GA.

~~~
slg
I wonder how many users actually use those advanced features. As someone who
has only ever used GA to help provide insight into developmental priorities
(i.e. not for marketing), this doesn't help too much. For example, this tells
you the browser but it doesn't tell you the browser version. It tells you the
device being used, but it doesn't tell you the resolution of that device. It
tells you the country of your visitors, but it doesn't tell you the user's
language. It tells you pages users visit, but it doesn't tell you the order in
which they visit them.

This isn't a criticism of Umami. It looks like a nice clean app that
accomplishes what it is trying to do. But if this is all you needed from
Google Analytics then that tool was overkill in the first place.

~~~
mcao
Agreed, saying it's a one to one alternative to Google Analytics is probably a
misnomer. I think a lot of people, myself included, used GA because there were
no simpler alternatives and better overkill than nothing.

------
arielm
This looks really nice! If... you’re only looking for high level numbers for
something like a personal blog or a simple landing page for a mobile app.

I wouldn’t call this a replacement to Google Analytics.

The reason to have something like Google Analytics is to track traffic at a
more granular level, and with very specific intent.

Some of the things I _rely_ on include:

- custom parameters
- segments
- goals
- A/B testing
- specific views

And that’s just the short list.

Now, I use Analytics heavily because we spend a lot of effort on growth, both
organic (content, seo) and paid (ads), so knowing what’s going on at that
level is essential.

If you don’t, there’s not much reason to use something like GA.

------
vs4vijay
Looks neat! will explore.

Also, I did research on alternatives to GA a few days back, might be helpful to
someone:

[https://github.com/Open-Web-Analytics/Open-Web-Analytics](https://github.com/Open-Web-Analytics/Open-Web-Analytics)

[https://matomo.org/](https://matomo.org/)

[https://github.com/matomo-org/matomo](https://github.com/matomo-org/matomo)

[https://github.com/usefathom/fathom](https://github.com/usefathom/fathom)

[https://www.goatcounter.com/](https://www.goatcounter.com/)

[https://plausible.io/](https://plausible.io/)

[https://github.com/PostHog/posthog](https://github.com/PostHog/posthog)

[https://www.usertrack.net/](https://www.usertrack.net/)

~~~
diafygi
Of these, do any have a funnel tracking feature that shows what visitors went
through a specific series of pages/events? Seeing how users moved about the
site and seeing how many converted is a deal breaker for me.
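For readers unfamiliar with the term, a funnel report can be sketched roughly as below. This is a simplified illustration, not how any of the listed tools implement it. The `sessions` mapping and the step list are hypothetical data.

```python
def funnel_counts(sessions: dict[str, list[str]], steps: list[str]) -> list[int]:
    """Count how many sessions reached each funnel step, in order.

    A session reaches step k only if it visited steps 0..k in that order
    (other pages may appear in between).
    """
    counts = [0] * len(steps)
    for pages in sessions.values():
        position = 0  # index of the next funnel step to look for
        for page in pages:
            if position < len(steps) and page == steps[position]:
                counts[position] += 1
                position += 1
    return counts


# Hypothetical page-view sequences keyed by session ID.
sessions = {
    "s1": ["/", "/pricing", "/signup"],
    "s2": ["/", "/blog", "/pricing"],
    "s3": ["/pricing", "/"],
}
result = funnel_counts(sessions, ["/", "/pricing", "/signup"])
```

Here three sessions reach the first step, two reach the second, and one converts. The drop between consecutive counts is the per-step drop-off.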

~~~
KaoruAoiShiho
Pretty sure posthog does.

~~~
timgl
(PostHog founder) Yes we do! PostHog gives you full funnel capabilities +
ability to see exactly what users dropped out where

------
thinkmassive
A comparison of Umami and Matomo (formerly Piwik) would be helpful since they
seem very similar. I looked at both websites and didn't see any mention of the
other project.

------
colechristensen
Is there a similar product that does this server side (without injected
javascript telemetry) with http logs?

~~~
anderspitman
Keep in mind if you're using a CDN (e.g. Cloudflare) your absolute numbers will
be way off.

~~~
rayshan
This. Just got bitten by this by using Vercel’s serverless functions edge
caching. More details:
[https://twitter.com/rayshan/status/1295521974479798274?s=21](https://twitter.com/rayshan/status/1295521974479798274?s=21)

------
ln_00
to be honest, if you are using nginx, just use / run
[https://goaccess.io/](https://goaccess.io/) It collects the same information
as umami and is even more lightweight, since it just runs whenever you tell it
to.

just add the command as a cron job, and you get an auto generated static
dashboard. very neat.
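As a rough idea of what log-based analytics does under the hood, the sketch below counts successful page views from access-log lines in the common "combined" format. It is not GoAccess's implementation. The regular expression and the sample lines are assumptions for illustration.

```python
import re
from collections import Counter

# Matches the request and status fields of a combined-format access log line.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_page_views(lines: list[str]) -> Counter:
    """Count successful GET requests per path."""
    views: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("method") == "GET" and match.group("status") == "200":
            views[match.group("path")] += 1
    return views


# Hypothetical log lines; real logs would be read from the server's log file.
sample = [
    '203.0.113.7 - - [18/Aug/2020:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.8 - - [18/Aug/2020:10:00:01 +0000] "GET /docs HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [18/Aug/2020:10:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [18/Aug/2020:10:00:03 +0000] "POST /login HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
]
views = count_page_views(sample)
```

Because this only sees requests that reach the web server, anything served from a cache in front of it will be missing from the counts.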

~~~
leephillips
Apache too (and first).

------
chrisblackwell
I'm very excited to see this space heating up. It seems for years we defaulted
to using Google Analytics and no one wanted in the market. Now there are
plenty of alternatives, with many of them open source.

------
dzink
It needs more granularity of OS versions and browser versions. Knowing which
iOS version your users have is important to decide on what base level version
you need for an iOS app, for example.

------
eden_h
When I've seen GA used or recommended to people, it's because their use case
is tracking the marketing performance of their website.

Tackling the privacy focus for GA is great, but there are a good deal of
products out there that already fill that niche, not to mention the
requirements of the privacy crowd usually being a venture unto itself.

If you wanted to make it relatively competitive for marketing, the simplest
addition would be adding labelling via regex for referrers.

i.e. - Some users want to be able to group Baidu, Google, DuckDuckGo, into a
single bucket for comparison. Some users want to break them down into common
market segments by country.
"[https://www.baidu.com/link?url=FyYbCZqj65Vc7A4XeSNrOcQCS2qFX...](https://www.baidu.com/link?url=FyYbCZqj65Vc7A4XeSNrOcQCS2qFXD_8SBAcDWSlJnm&wd=&eqid=d43dbd6c00005f90000000025f3c6d2a")

is from your live demo referrers, and makes it difficult to actually assess
the amount of traffic from Baidu. Using a regex label means that users can
break down traffic from Paid/Organic marketing fairly quickly, and start to
build up dashboards they can use.

If you ever extended it to allow multiple labels for each hit, could re-run
the regex over past data, and could build reports off it, you'd easily have a
benefit over GA that would start to wean the marketing crowd off it.
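The regex-labelling idea can be sketched in a few lines. This is only an illustration of the suggestion, not an existing Umami feature. The rule list, label names, and function name are hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical label rules: (label, pattern matched against the referrer host).
RULES = [
    ("Search", re.compile(r"(^|\.)(google|baidu|duckduckgo)\.")),
    ("Social", re.compile(r"(^|\.)(twitter|facebook)\.com$")),
]


def label_referrer(url: str) -> str:
    """Return the first matching label for a referrer URL, else 'Other'."""
    host = urlparse(url).hostname or ""
    for label, pattern in RULES:
        if pattern.search(host):
            return label
    return "Other"
```

Since the rules are applied to stored referrer URLs, changing a rule and re-running it over past data would relabel historical hits as well.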

------
hitekker
For something this simple, I was hoping to see an option for SQLite, not just
MySQL and PostgreSQL.

~~~
mcao
The app is using prisma.io which does support SQLite. I just haven't had the
time to implement it yet.

------
busymichael
Congrats on launching -- really impressive. One important issue that these
self hosted analytics solve is ad blocking. Ad blocking by users really
undermines the ability of a site or app to figure out what is working and not
working. When you host your own analytics, you can get usability information
for all of your users, not just those that don't block. That allows you to
make a better product.

I have been working on something similar at
[https://argyle.cc](https://argyle.cc) \-- we combine cloud analytics with a
self-hosted analytics collector js. That gives you the best of both worlds:
privacy focused, user respecting analytics, but full featured reporting in the
cloud and ad-blocker resistance. It also allows event tracking to be done over
js/web or in-line/server side.

------
marcus_holmes
I'd love to use this. But 34 dependencies?

I know ~10 of them are React, and there's some in there that make sense. But I
haven't got the time to audit them all, and re-audit them every time any of
those dependencies updates.

And escape-string-regexp? Really? it's literally 2 lines of code [0]. Why have
I got to give the maintainer of that project commit access to this program
that will be seeing potentially sensitive data?

Why, if the developer couldn't come up with those 2 lines themselves, isn't
this a Stack Overflow copy/paste?

[0] [https://github.com/sindresorhus/escape-string-regexp/blob/ma...](https://github.com/sindresorhus/escape-string-regexp/blob/master/index.js)

~~~
halfmatthalfcat
Would you also criticize someone for using Apache Commons StringUtils? The
fetishization of critiquing npm package choices is hilarious.

~~~
marcus_holmes
yes. And no, it's a major security problem that we're only just beginning to
realise is a major security problem.

------
m90
Is there a way for me as a user to opt out of this tool other than relying on
third party tools like uBlock? I'm starting to get annoyed by so many "privacy
focused" tools with literally no consent options at all.

~~~
mcao
I haven't implemented it yet, but I plan to make it read the user's do not
track setting and automatically opt out.
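One way such an opt-out could look on the collecting server is sketched below. It assumes the check happens server-side on the `DNT` request header. The author may equally well do it in the browser script, and the function name is hypothetical.

```python
def should_track(headers: dict[str, str]) -> bool:
    """Return False when the request carries the Do Not Track header 'DNT: 1'."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") != "1"
```

A request without the header, or with `DNT: 0`, is treated as not opted out in this sketch.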

~~~
m90
While this is much appreciated, keep in mind this does not work for Safari
users.

------
shattl
Wow, Piwik is now Matomo, how fast time flies!

------
gowld
If you want to respect user privacy while collecting analytics data, I
recommend using Local Differential Privacy (via Randomized Responses) when
collecting information from browsers.

[https://en.wikipedia.org/wiki/Local_differential_privacy](https://en.wikipedia.org/wiki/Local_differential_privacy)
and
[https://en.wikipedia.org/wiki/Randomized_response](https://en.wikipedia.org/wiki/Randomized_response)
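A minimal sketch of randomized response for a single yes/no attribute follows. It assumes the classic variant in which each client tells the truth with probability `p_truth` and otherwise reports a fair coin flip. The function names and parameters are illustrative only.

```python
import random


def randomized_response(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true answer with probability p_truth, else a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(observed_yes_rate: float, p_truth: float) -> float:
    """Invert E[observed] = p_truth * true_rate + (1 - p_truth) * 0.5."""
    return (observed_yes_rate - (1.0 - p_truth) * 0.5) / p_truth


def simulate(true_rate: float, p_truth: float, n: int, seed: int = 0) -> float:
    """Simulate n visitors and return the estimated true rate."""
    rng = random.Random(seed)
    yes = 0
    for _ in range(n):
        truth = rng.random() < true_rate
        yes += randomized_response(truth, p_truth, rng)
    return estimate_true_rate(yes / n, p_truth)
```

No individual report can be taken at face value, but the aggregate rate can still be recovered, with noise that shrinks as the number of visitors grows.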

------
nickthemagicman
Am I wrong for thinking that Google Analytics has bad UI?

As a noob at UI it was bizarre and unintuitive for me.

Just finding the region locations of the traffic was odd and didn't make
immediate sense.

~~~
edwinjm
It's a pleasure to work with compared to one or two years ago

~~~
nickthemagicman
Ok have to check it out. Last I looked was years ago. And I was like...there's
no excuse for this bad UI from such a great company.

------
llacb47
The whole script is 502...
[https://stats.umami.is/umami.js](https://stats.umami.is/umami.js)

------
markl42
I'll throw a shoutout for a top tier project -
[https://github.com/zgoat/goatcounter](https://github.com/zgoat/goatcounter)

Using it for some personal stuff, and does absolutely everything I need it to,
and then some.

I love the ethos of the project, and whilst it's open source, there's a hosted
option that looks super reasonable too.

------
epoch_100
This looks great! For what it’s worth, I also maintain an open source (and
self hosted) website analytics tool called Shynet [0] (someone else mentioned
it in this thread, but thought I’d share here as well). Really great to see
more options in this area!

[0] [https://github.com/milesmcc/shynet](https://github.com/milesmcc/shynet)

------
dylan604
How much are metrics from non-GA sources trusted when validating traffic for
various groups like investors, advertisers, etc.?

~~~
mxuribe
Quite intriguing! I have no experience pitching to investors or advertisers,
(but i do have web analytics exp.) and never would have thought that this
would even be a question! Curious, is this something that you encountered, or
is this hypothetical?

~~~
dylan604
I had heard in the past that if your numbers were not GA, then they did not
put much weight into them. Since you can grant access to other people directly
into GA, they can validate the data. Using awstats or other metrics were
deemed less trustworthy since they required someone gathering the data (which
allows for potential manipulation). Before the days of 3rd party advertising,
people tried to sell local ads just like a newspaper. The website with more
visitors could charge more for the ad banner space. Some "little" blog would
have to prove they received the amount of traffic.

~~~
mxuribe
I didn't know that. Unfortunately I can see the rationale in that...which
saddens me. <sigh>

I appreciate your teaching me something I didn't know. But now I feel worse
for us "little blogs". (Not your fault of course.)

------
te_chris
This would be amazing if, out of the box, it sent data to BigQuery and/or
Redshift. Postgres is fine, but for most companies this data is most useful in
the warehouse. If this was a simple, drop-in solution to get well formatted
data into BQ plus a bit of easy vis, that would be cool and VERY useful.

------
hugey010
I've used [https://count.ly/](https://count.ly/) instead of Google Analytics
to gather exception data and business analytics from mobile and web apps.
Relatively cheap for decent scale and they're very nice and helpful.

------
superlupo
Slightly off-topic: Does anyone have recommendations for self-hosted open
source analytics that can handle a large volume site (think 500.000.000
impressions per month)? I can't imagine systems with MySQL/PostgreSQL as
database can handle this.

~~~
AlfeG
500 000 000 requests per month is just about 200 requests per second. Why
should there be any problem for any DB?

As for question - I saw a lot of great reviews on ClickHouse DB

~~~
rsa25519
> 500 000 000 requests per month is just about 200 requests per second

Not if you assume that some hours will have more web traffic than others.

~~~
superlupo
Yes, there are some peaks with 10-20000 per sec.

Clickhouse seems very suitable as database. Does anybody know open source
analytics tools that use it? Two parts would be needed: the client javascript
tracker which injects into the database, and a GUI for reports.
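The arithmetic in this sub-thread can be checked with a short sketch. It assumes a 30-day month and evenly spread traffic for the average. The peak figures are the ones quoted above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(requests_per_month: int, days_in_month: int = 30) -> float:
    """Average requests per second, assuming traffic is spread evenly."""
    return requests_per_month / (days_in_month * SECONDS_PER_DAY)


avg = average_rate(500_000_000)  # roughly 193 requests per second

# Ratio of the quoted peaks (10,000 to 20,000 per second) to the average.
low_peak_ratio = 10_000 / avg
high_peak_ratio = 20_000 / avg
```

So the average is indeed close to 200 per second, but the quoted peaks are on the order of 50 to 100 times higher, which is the load the database would actually need to absorb.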

------
buraksarica
I wish to see a line about backend platform on installation documentation.
Yes, it's simple, but IMO no one will find "Umami requires bla bla platform on
bla bla operating system." sentence useless.

~~~
sradman
From a quick scan of the GitHub repo [1], this is a JavaScript client, like
Google Analytics, that sends data to a self-hosted Node.js backend that stores
the data in MySQL or PostgreSQL using the Prisma database toolkit:

[1] [https://github.com/mikecao/umami](https://github.com/mikecao/umami)

------
dandigangi
I always like seeing competitors to GA but the website could really use some
more information on why you should use it and the features it gives you. It's
hard to beat top competitors in a saturated space.

------
zanecraw
Awesome project! Google is definitely fading out for sure. I know many
businesses and developers are tired of it. Looking forward to seeing what
other inventions will give Google some competition.

------
mikece
Are there any "Google Analytics" alternatives that aren't based on Python,
Node, Go, etc but something with a PHP back-end that can be deployed to any
commodity LAMP hosting provider?

~~~
wizzwizz4
You could check the ReverseEagle list:
[https://developers.reverseeagle.org/replace/google-analytics...](https://developers.reverseeagle.org/replace/google-analytics/)

Also, anyone with a Tedomum account? It'd be nice if you could open an issue
about adding umami.is.
[https://forge.tedomum.net/ReverseEagle/developers/-/issues](https://forge.tedomum.net/ReverseEagle/developers/-/issues)

~~~
rapnie
Thanks! PS. there is a typo here (should be 'statisfy'):
[https://developers.reverseeagle.org/replace/google-analytics...](https://developers.reverseeagle.org/replace/google-analytics/#satisfy-for-wordpress)

------
tobilg
If you have an AWS account, you can use
[https://ownstats.cloud](https://ownstats.cloud) to self-host website
analytics

------
anderspitman
I've been happily paying for GoatCounter for several months. I don't imagine
I'll ever need to self host but it's nice knowing I can if necessary.

------
songzme
noticed you didn't write any tests:
[https://github.com/mikecao/umami](https://github.com/mikecao/umami)

What was your reasoning? Personally, I write tests for all my projects, it
forces me to really think hard about how to break down the different
components and functionalities and it helps others feel more confident to
contribute.

------
dirtnugget
Can anyone tell how this holds up against Matomo?

------
anvarik
I liked the image on the home page, how can I create such an image to
showcase my product? Is PS the only way?

------
1f60c
From the screenshots, the design looks very slick and I can’t wait to give it
a try!

------
armandososa
This is super cool!

FlightPHP looks nice, too, why didn't you use that for the backend?

------
dsalzman
+1 to [https://goatcounter.com/](https://goatcounter.com/). I use it for my
personal blog [https://dannysalzman.com](https://dannysalzman.com) . This is a
good reminder to donate.

------
quaffapint
Any suggestions for collecting server side logs via nginx pods in k8s?

------
sahnasidol
Is there any way to create and track custom events like clicks?

~~~
mcao
Yes, event tracking is already in the current build. I just haven't finished
the UI components or documentation yet. But basically all you have to do is
add a CSS class to an element and it will automatically start tracking. Like
this:

    
    
      <div class="button umami--onclick--signup-button">Signup</div>
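To make the class-name convention concrete, here is a small sketch that parses such class attributes. The pattern `umami--<event type>--<event name>` is inferred from the single example above and is an assumption, as are the regular expression and the function name. The real tracker may accept other forms.

```python
import re

# Assumed convention, inferred from the single example above:
# umami--<event type>--<event name>
EVENT_CLASS = re.compile(r"^umami--([a-z]+)--([\w-]+)$")


def parse_event_classes(class_attr: str) -> list[tuple[str, str]]:
    """Extract (event_type, event_name) pairs from an HTML class attribute."""
    events = []
    for token in class_attr.split():
        match = EVENT_CLASS.match(token)
        if match:
            events.append((match.group(1), match.group(2)))
    return events
```

Under that assumption, the example element above would yield one event of type `onclick` named `signup-button`, and ordinary classes such as `button` are ignored.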

~~~
tekkertje
Would be great to see official custom event tracking support in the future.

------
gitgud
Demo is throwing back "502 Bad Gateway"

Hacker News hug of death?

------
takein
Good job. Getting 502 error!

------
ethor
Seemingly no api? :(

------
gramakri
Demo gives a 502

~~~
mcao
Should be back up now.

------
kmfrk
Does this use cookies and similar to warrant a GDPR prompt?

~~~
mcao
No, it does not use cookies so no cookie prompt is needed.

~~~
eclipsetheworld
I just checked your tracking code. It looks like you're using the local
storage to set a session id to track uniqueness. According to this [0]
Stackexchange answer you will still have to display a cookie banner.

[0]
[https://softwareengineering.stackexchange.com/questions/2905...](https://softwareengineering.stackexchange.com/questions/290566/is-localstorage-under-the-cookie-law)

~~~
mcao
The local storage is mainly for performance. It's to prevent a round-trip to
the database to figure out the session again. The session id will be the same
regardless and it can function without local storage. But I do see your point.
I may consider removing it just to be safe.

------
bambam24
Clicked on live demo and got 502 bad gateway. I hope they analyzed it

------
rockwotj
Demo website 502's for me :/

~~~
gilli
Same, maybe we already hugged it to death?

