
Ask HN: How should we implement tracking as privacy conscious startups? - kylerpalmer
As a consumer, I use an adblocker because I care about my privacy. As a startup founder, I use tools like Google Analytics to drive my understanding of how to reach users and solve their problems. When I visit my own website, the irony is not lost on me that my adblocker blocks GA and other tracking tools.

I'd like to know: what are HN's opinions and/or solutions to this dichotomy?
======
ziddoap
My 2c:

1) Don't go overboard. GA is one thing. GA plus twenty other obscure analytics
cookies and scripts is another. If your site looks like [1], there is 0 chance
of ever being on my white-list for ads.

2) Just be honest and transparent. You don't need a big "WE USE COOKIESSSSSS"
banner, but a short little message along the lines of "We use GA to help
develop a better experience for you. All data is in aggregate, and not
identifiable. Please consider white-listing our website if you would like to
participate in making this website even better".

[1] [https://imgur.com/a/7AHbQ5E](https://imgur.com/a/7AHbQ5E)

~~~
kylerpalmer
Thanks for the tips. How do you feel about anonymized-only vs. aggregated
data? For example: in-app (where user info is available to you), do you carry
out any behavior tracking, a/b testing, etc.?

~~~
ziddoap
So, again, this is just for myself as a consumer: it's all about approach.

Ideally when I visit a site I'd like to see minimal tracking by default with
an opt-in model. If a site tracks the bare minimum by default and
unobtrusively asks me if I'd be okay with sharing a little bit more (or
participating in certain testing, etc.) I will generally say yes. Showing me
you care enough to let me opt-in vastly increases my trust in you as a
provider. Opt-in is best, opt-out is not ideal but acceptable if clear and
easy, no option = no whitelist.

Also, I think if you are asking as a privacy conscious startup, you should
consider taking a step back and really asking yourself how much you really
need to track. What is essential vs. what is nice to have? Anything that you
implement that is non-essential is another step away from being privacy
conscious, in my books at least.

We're living in a time where the default seems to be to track as many data
points as possible, then figure out if they are relevant/useful. This is
backwards. Implement the absolute minimum and see how it goes. A month down
the road you might realize you desperately need to see X metric, so you
implement X tracking. However, and more likely in my opinion, you will
realize that X metric adds minimal value to you - yet it tracks significantly
more PII from your customers. It's all about balance.

------
jamieweb
My own solution to this was to write a custom web server log anonymizer[1],
and then feed the logs into AWStats.

My anonymizer tool uses a bloom filter to determine which IP addresses are
unique, maintaining a semi-accurate unique visitor count. User agents and
referrers are also anonymized.

My blog post explains this in more technical detail:
[https://www.jamieweb.net/blog/using-a-bloom-filter-to-
anonym...](https://www.jamieweb.net/blog/using-a-bloom-filter-to-anonymize-
web-server-logs/)

[1] [https://gitlab.com/jamieweb/web-server-log-anonymizer-
bloom-...](https://gitlab.com/jamieweb/web-server-log-anonymizer-bloom-filter)
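
For readers unfamiliar with the technique, here is a minimal sketch of the bloom filter idea described above. It is not the code from the linked repository: the class name, the filter size, the number of hashes, and the salted SHA-256 hashing scheme are all assumptions made for illustration. Each IP is hashed to a few bit positions, and an address counts as a new visitor only if at least one of those bits was still unset, so no address is ever stored.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array with several hash positions per item (illustrative sizes)."""

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive one position per hash by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        """Insert item; return True if it was definitely not seen before."""
        is_new = False
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                is_new = True
                self.bits[byte] |= 1 << bit
        return is_new


def count_unique_visitors(ips, bloom=None):
    """Count unique IPs without keeping any address in memory or on disk."""
    if bloom is None:
        bloom = BloomFilter()
    return sum(1 for ip in ips if bloom.add(ip))
```

The count is only "semi-accurate" because a bloom filter can produce false positives: a never-seen address whose bits were all set by other addresses is treated as a repeat visitor, so the total can undercount but never overcount.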

------
Nextgrid
For analytics try to use a self-hosted solution or even rely on your existing
logs (if a feature is hitting endpoint X and you want to know how frequently
it is used you can just _grep_ for that URL).
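
As a rough sketch of the log-based approach, the snippet below does roughly what _grep_ plus a count would do, but grouped by request path. It assumes access-log lines in the common/combined format, where the request appears as a quoted `"METHOD /path HTTP/x"` string; the function name and the sample line are made up for illustration.

```python
import re
from collections import Counter

# Matches the quoted request part of a common/combined log line.
REQUEST_RE = re.compile(r'"(?:GET|POST|PUT|PATCH|DELETE|HEAD) (\S+) HTTP/[\d.]+"')


def count_endpoint_hits(lines):
    """Return a Counter of request paths, with query strings stripped."""
    hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            path = match.group(1).split("?", 1)[0]
            hits[path] += 1
    return hits


# Hypothetical sample line; in practice you would pass an open log file.
sample = ['203.0.113.5 - - [10/Oct/2019:13:55:36 +0000] "GET /api/export?format=csv HTTP/1.1" 200 512']
export_hits = count_endpoint_hits(sample)["/api/export"]
```

Because only paths are counted, the result says how often a feature is used without recording who used it.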

If you really need to use a third-party, I'd prefer one where you pay for the
service (MixPanel, etc) rather than a "free" one like Google Analytics. At
least the paid one has less incentive to use the data for their own purposes
while the whole point of Google Analytics (and the reason for it being free)
is to provide data for Google's advertising business.

~~~
kylerpalmer
Do you have an opinion about situations where 'tracking' is really what is
needed? For example, keeping track of a referral from a partner, so that you
can pay them for converted customers?

~~~
ziddoap
Are you paying a flat rate per conversion or a flex rate based on customer
spending or some other metric?

If it's just a flat rate per conversion, this could be accomplished with
minimal (if not no) PII collected from the customer.

~~~
kylerpalmer
Ongoing based on revenue. Most PII is in our payment processor, and never
stored on our servers.

------
laurentl
You don’t have to use GA for your own needs. If you have a traditional web
site, you can get a lot done with nginx logs. Things get hairier with an SPA,
but it isn’t too complicated to replicate Google Analytics’ “event” features
on your own.

I recall someone posting a privacy-friendly analytics tool on show HN a few
months back, so you don’t have to roll your own.

The downside is that you lose GA’s insight into who your users are (gender,
interests, age group...). But that is some seriously creepy stuff when you
stop to think about it.

------
morningmoon
What do you need to track, and how will you take action from it? It’s valuable
to figure that out first.

You may not need user tracking at all. You can track how many signups came
from which domain by setting a cookie with the referrer and incrementing a
count for that domain at signup. Here’s an interesting post about it
[https://doingdone.app/blog/building-a-startup-without-
user-t...](https://doingdone.app/blog/building-a-startup-without-user-
tracking)
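
The referrer-counting idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not the linked post's implementation: the cookie name `ref_domain`, the function names, and the in-memory `Counter` (standing in for a small per-domain table) are all hypothetical, and cookies are modeled as a plain dict so the sketch stays framework-agnostic.

```python
from collections import Counter
from urllib.parse import urlparse

signup_counts = Counter()  # stand-in for a per-domain counter table


def referrer_domain(referrer_header):
    """Reduce a Referer header to its host, dropping path and query."""
    host = urlparse(referrer_header or "").hostname
    return host or "direct"


def on_first_visit(cookies, referrer_header):
    """Remember only the referring domain, and keep the first one seen."""
    cookies.setdefault("ref_domain", referrer_domain(referrer_header))
    return cookies


def on_signup(cookies):
    """Increment the aggregate count; nothing is linked to the new account."""
    signup_counts[cookies.get("ref_domain", "direct")] += 1
```

The only thing persisted is one integer per referring domain, so there is no per-user record to anonymize or delete later.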

------
gorkemcetin
I also feel worried when my uBlock Origin warns me with big fonts. At least
you can go with an on-premise option (e.g. Countly, Fathom, etc.). Both can be
deployed on a DigitalOcean instance via Marketplace in less than 60 seconds.
Then, make sure you change your privacy policy to reflect that you don't share
data with 3rd parties.

------
ecesena
DuckDuckGo gave a great talk at Shmoocon 2019 about their tracking and a/b
testing: [https://youtu.be/oFkmqiwX5vk](https://youtu.be/oFkmqiwX5vk)

------
kylerpalmer
A few topics for discussion might include:

      * Anonymized and/or aggregated data
      * Behavior tracking via cookies (in app vs marketing?)
      * Referral/Affiliate tracking

------
mars
we use matomo, an open source clone of ga (previously called piwik). if you
don't want a big disclaimer to be eu/gdpr compliant you can turn off cookie
dropping in the preferences. you will still be able to track sessions and
conversions, but unique user counts will be off.
[https://matomo.org/what-is-on-premise/](https://matomo.org/what-is-on-premise/)

