
Ask HN: How do you manage event tracking at your workplace - onebyzero2506
I was wondering how different workplaces track user actions/events from a backend perspective. Popular options include Segment and mParticle, but they seem to be quite expensive for small/medium enterprises.
======
scottfr
Google Cloud Function streaming inserts to BigQuery. Then analysis is carried
out with the BigQuery UI (ad hoc analyses or scheduled queries for pipelines),
DataStudio (visualizations -- we use this infrequently, it's really not great)
or Google Colab (statistical analyses, complex visualizations).

The pros are:

* full SQL access to the data via BigQuery

* simple to set up (yes, you have to write custom code; but a basic implementation is on the order of a couple dozen lines of code)

* we have full ownership of the data across the pipeline (better for user privacy than using another 3rd party)

Prior to this we used Google Analytics, but their paid solution is too
expensive for us, and their analytics/aggregation API (though quite powerful)
samples a subset of the data, which was not acceptable for some of our use
cases.
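For what the comment above describes, a minimal sketch of an HTTP-triggered Cloud Function doing a streaming insert into BigQuery might look like this. The table name, field names, and `event` payload shape here are illustrative assumptions, not the commenter's actual implementation:

```python
# Hedged sketch: an HTTP-triggered Cloud Function streaming events into
# BigQuery. Table/column names and the payload shape are assumptions.
import datetime
import json

def build_row(event: dict) -> dict:
    """Shape an incoming event dict into a flat BigQuery row."""
    return {
        "event_name": event.get("name", "unknown"),
        "user_id": event.get("user_id"),
        # Store arbitrary attributes as a JSON string column.
        "attributes": json.dumps(event.get("attributes", {})),
        "received_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

def track_event(request):
    """Cloud Functions HTTP entry point (name is an assumption)."""
    from google.cloud import bigquery  # deferred: only available in GCP runtime
    client = bigquery.Client()
    row = build_row(request.get_json(silent=True) or {})
    # Streaming insert; insert_rows_json returns per-row errors (empty on success).
    errors = client.insert_rows_json("my_dataset.events", [row])
    return ("", 500) if errors else ("", 204)
```

Serializing free-form attributes to a JSON string column keeps the table schema fixed while still allowing ad hoc SQL over the payload later.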

------
buremba
That's what we do at Rakam. If you have less than ~10M events per month,
PostgreSQL works smoothly if you use features such as partitioned tables,
parallel queries, and BRIN indexes. The only limitation is that, since it's
not horizontally scalable, your data must fit on one server. We have SDKs that
provide ways to send the event data in the following format:

rakam.logEvent('pageview', {url: 'https://test.com'})

The event types and attributes are all dynamic. The API server automatically
infers the attribute types from the JSON blob and creates the tables which
correspond to event types and the columns which correspond to event attributes
and inserts the data into that table. It also enriches the events with visitor
information such as user agent, location, referrer, etc. The users just run
the following SQL query:

SELECT url, count(*) from pageview where _city = 'New York' group by 1

The whole project is open-source:
[https://github.com/rakam-io/rakam](https://github.com/rakam-io/rakam)
Would love to get some contributions!
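The dynamic-schema idea described above (infer column types from the JSON blob, create the table to match) can be sketched roughly like this. The type mapping and DDL shape are illustrative assumptions, not Rakam's actual implementation:

```python
# Hedged sketch: infer PostgreSQL column types from a JSON event and emit
# the CREATE TABLE DDL. Mapping and naming are assumptions, not Rakam's code.
def infer_pg_type(value) -> str:
    """Map a Python/JSON value to an assumed PostgreSQL column type."""
    if isinstance(value, bool):  # check bool before int (bool subclasses int)
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double precision"
    return "text"  # strings and anything else fall back to text

def ddl_for_event(event_type: str, attributes: dict) -> str:
    """Build a table per event type, one column per attribute."""
    cols = ", ".join(f"{name} {infer_pg_type(v)}" for name, v in attributes.items())
    return f"CREATE TABLE IF NOT EXISTS {event_type} ({cols})"

print(ddl_for_event("pageview", {"url": "https://test.com", "duration_ms": 1200}))
# CREATE TABLE IF NOT EXISTS pageview (url text, duration_ms bigint)
```

A real implementation would also need to ALTER TABLE when a previously unseen attribute appears on an existing event type.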

------
Redsquare
We use [https://snowplowanalytics.com/](https://snowplowanalytics.com/)

~~~
namanyayg
How did you set it up? In-house?

------
namanyayg
We're using Snowplow and it's quite great. The setup process is complicated
and you have to declare event schemas before you start tracking, but it's
quite cost-effective if that's your priority.

We have millions of rows coming in from unique users in realtime at $600/mo;
with Segment this would cost at least $5000/mo.

We then use Redash to prepare charts, tables, etc for analysis.
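The up-front schema declaration mentioned above is done with self-describing JSON schemas in Snowplow's Iglu format. A minimal example might look like this (the `com.example` vendor, event name, and properties are placeholder assumptions):

```json
{
  "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
  "self": {
    "vendor": "com.example",
    "name": "pageview",
    "format": "jsonschema",
    "version": "1-0-0"
  },
  "type": "object",
  "properties": {
    "url": { "type": "string" }
  },
  "required": ["url"],
  "additionalProperties": false
}
```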

~~~
chrisan
snowplow looks like $1500/month with no realtime until the $5000/month plan?

[https://snowplowanalytics.com/pricing/](https://snowplowanalytics.com/pricing/)

~~~
namanyayg
That's for the hosted version; you can self-host on AWS.

------
mosselman
The PaperTrail gem in Ruby on Rails. If that doesn't exist for your platform,
just roll your own little table logging who did what, when. This is so easy to
do; why pay thousands per year for some third party to do it, and at the same
time siphon off possibly sensitive data as a side effect?

------
redis_mlc
Most product analytics companies offer a free startup account for a year, so
check on that.

If you just need web server log processing, AWStats is free and pretty good:
[https://awstats.sourceforge.io/](https://awstats.sourceforge.io/)

------
bobbydreamer
Firebase & cloud functions could be a start. I use this to track events and
take according actions via cloud functions but they are not user events. In a
user scenario it can be used.

------
polskibus
Is there a good solution for event/usage tracking for intranet apps without
access to Internet? Or is the only option to roll out everything by yourself?

------
bradwood
Debezium on all the relational databases into Kafka.
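For reference, registering a Debezium source connector with Kafka Connect is a small JSON config along these lines. The connector name, hostnames, credentials, and table list here are placeholder assumptions (and `topic.prefix` applies to Debezium 2.x):

```json
{
  "name": "orders-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "db.internal",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "******",
    "database.dbname": "orders",
    "topic.prefix": "app",
    "table.include.list": "public.orders"
  }
}
```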

------
contingencies
Logfiles.

~~~
dajohnson89
but that doesn't look good on my resume!

~~~
envolt
Generating logs in a specific format. Push that to Elasticsearch, visualize
with Kibana.

This might!
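The "logs in a specific format" step above usually means one JSON object per line, so a shipper (e.g. Filebeat) can forward them into Elasticsearch. A minimal sketch; the field names are assumptions, not a standard:

```python
# Hedged sketch: emit one JSON object per line for log shipping.
import datetime
import json

def format_event(event, **fields):
    """Render an event as a single JSON log line."""
    record = {
        "@timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "event": event,
        **fields,  # arbitrary extra attributes become top-level fields
    }
    return json.dumps(record)

print(format_event("pageview", user_id="u1", url="/pricing"))
```

Keeping one flat JSON object per line means Elasticsearch can index every field without any parsing rules on the ingest side.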

