Ask HN: How do you document analytic events?

PaulHoule · on June 11, 2019

I am a big fan of Sphinx

http://www.sphinx-doc.org/en/master/index.html

which is sorta like "Javadoc" except you can make custom "domains", out of the box

http://www.sphinx-doc.org/en/master/usage/restructuredtext/d...

for instance you can document C++ or HTTP APIs. It's pretty easy to write your own domains and I wrote an RDF domain that can be used to represent just about anything that you can represent in RDF.

It's particularly good when you need to link multiple domains, such as an API and its documentation in code. For instance, you could document the event definitions in comments in your TypeScript code and have Sphinx cut them out and make user docs for them.
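As a rough sketch of the idea of keeping event documentation next to the code: the block below swaps comment extraction for a typed event catalog that is rendered to reStructuredText, which Sphinx could then include. The `EventDoc` shape, the `toRst` helper, and the sample entry are all hypothetical, and this is not a Sphinx API, just one way the input to it could be produced.

```typescript
// Hypothetical event catalog: every name and field here is illustrative.
interface EventDoc {
  name: string;
  description: string;
  trigger: string; // when, or by what, the event fires
  properties: Record<string, string>; // property name -> meaning
}

const catalog: EventDoc[] = [
  {
    name: "hotspot clicked",
    description: "A user clicked a hotspot inside a widget.",
    trigger: "Click handler on the hotspot element.",
    properties: { widgetId: "Widget that contains the hotspot." },
  },
];

// Render one event as a reStructuredText section followed by a field list.
function toRst(doc: EventDoc): string {
  const lines = [
    doc.name,
    "-".repeat(doc.name.length), // section underline must match the title length
    "",
    doc.description,
    "",
    `:trigger: ${doc.trigger}`,
  ];
  for (const prop of Object.keys(doc.properties)) {
    lines.push(`:${prop}: ${doc.properties[prop]}`);
  }
  return lines.join("\n") + "\n";
}

console.log(catalog.map(toRst).join("\n"));
```

Because the trigger is a required field, a non-developer reading the generated page gets the "when does this fire" answer without opening the code.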

It's an excellent answer to the problems of code and documentation, although I have a few complaints: they made an RST reader without making an RST writer, it uses global variables way too much, etc. But really, I don't know of anything better that attacks these problems.

am1nix · on June 12, 2019

Interesting! I will check it out, thank you.

arnihermann · on June 12, 2019

This problem can be so detrimental for teams who want to measure the success of their products it's hard to put into words (ಥ﹏ಥ)

This is exactly the reason why we built Avo (https://www.avo.app). It's designed for everyone on the team: product managers, analysts, data scientists and developers. It works by generating code derived from an analytics spec, which you create and maintain in a user-friendly web app.

You can check it out and reach out to me at a@avo.sh; I'm happy to help any way I can.

bryanrasmussen · on June 11, 2019

One option is to build your own analytics system, which I did when I needed one. Data was stored in Elasticsearch as analytics events, and I had a query that told me every property that was in use.

One of the things in my properties was an identifier added at the build stage for every analytics call in the code, so if I wanted to figure out which analytics properties came from which lines of code, it was relatively easy.

So I could track when properties were added and how widely they were used, and, using git blame, figure out who added them. But we were also a small team, so usually I had added them.
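A minimal sketch of that kind of property inventory, done in memory rather than as an Elasticsearch query (the actual query is not shown in this thread). The `StoredEvent` shape, the `callSiteId` field name, and the sample data are assumptions made for illustration.

```typescript
// Hypothetical shape of a stored analytics event; field names are assumptions.
interface StoredEvent {
  name: string;
  timestamp: string; // ISO 8601 in UTC, so string comparison orders by time
  callSiteId: string; // identifier injected at build time for each analytics call
  properties: Record<string, unknown>;
}

interface PropertyUsage {
  firstSeen: string; // earliest timestamp at which the property appeared
  count: number; // number of events carrying the property
  callSites: Set<string>; // build-time identifiers that emitted it
}

// Summarize every property in use: first appearance, spread, and originating call sites.
function propertyInventory(events: StoredEvent[]): Map<string, PropertyUsage> {
  const inventory = new Map<string, PropertyUsage>();
  for (const event of events) {
    for (const prop of Object.keys(event.properties)) {
      const usage = inventory.get(prop);
      if (usage === undefined) {
        inventory.set(prop, {
          firstSeen: event.timestamp,
          count: 1,
          callSites: new Set([event.callSiteId]),
        });
      } else {
        usage.count += 1;
        if (event.timestamp < usage.firstSeen) usage.firstSeen = event.timestamp;
        usage.callSites.add(event.callSiteId);
      }
    }
  }
  return inventory;
}

// Made-up sample data, only to exercise the function.
const sampleEvents: StoredEvent[] = [
  { name: "hotspot clicked", timestamp: "2019-06-03T09:30:00Z", callSiteId: "cs-002", properties: { widgetId: "w-1", position: "top" } },
  { name: "hotspot seen", timestamp: "2019-06-01T10:00:00Z", callSiteId: "cs-001", properties: { widgetId: "w-1" } },
];

const inventory = propertyInventory(sampleEvents);
inventory.forEach((usage, prop) => {
  console.log(prop, usage.firstSeen, usage.count, Array.from(usage.callSites));
});
```

Mapping a `callSiteId` back to a file and line would rely on whatever table the build step writes out; from there git blame gives the author.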

am1nix · on June 11, 2019

It's not a bad idea for building the event list, but it doesn't solve the problem of documenting when or by what the event is triggered, something that a product owner or a data scientist who doesn't know the code might want to know to do some analysis. (Sorry, I forgot to mention it needed to be for non-developers.)

bryanrasmussen · on June 11, 2019

Well, that is what the normal analytics functionality of the system did: on meaningful events you would call the analytics function and pass it an object with the properties you wanted to analyze, and it would add some global properties and send it off.
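A sketch of what such an analytics function could look like, assuming a `createTracker` factory whose `getGlobals` and `send` arguments are injected; all of these names, and the sample global properties, are hypothetical rather than taken from the system described here.

```typescript
type Properties = Record<string, unknown>;

interface AnalyticsEvent {
  name: string;
  properties: Properties;
}

// Hypothetical factory: `getGlobals` and `send` are injected so the sketch
// runs without a real backend.
function createTracker(
  getGlobals: () => Properties,
  send: (event: AnalyticsEvent) => void,
) {
  return function track(name: string, properties: Properties = {}): void {
    // Event-specific properties override globals on a key clash
    // (a design choice made for this sketch).
    send({ name, properties: { ...getGlobals(), ...properties } });
  };
}

// Collect events in an array instead of sending them over the network.
const sent: AnalyticsEvent[] = [];
const track = createTracker(
  () => ({ site: "example.com", sessionId: "s-1" }),
  (event) => { sent.push(event); },
);

track("hotspot clicked", { widgetId: "w-42" });
console.log(sent[0]);
```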

There were of course various levels of abstraction, but sure, I would know when an event happened because a user scrolled by a widget on an external site or on the main domain, clicked on the widget, liked it, etc.: all the normal things you would want to capture if you are measuring widget interactions across sites.

On edit: so in your first scenario, if a non-developer wanted to look over all events, they might see that there were events named "hotspot seen", "hotspot hovered", "hotspot clicked" and so forth.

Those events actually carried the id of the widget the hotspots were in and, of course, the original interaction id, so you could draw a graph of how many users progressed from hotspot seen to product liked and so forth. But at some point a non-developer is probably going to have to ask a developer or Elasticsearch expert: I need this data out of the system, summed in this way.
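The progression count described above can be sketched as a small funnel over events grouped by interaction id. This is an in-memory illustration, not the Elasticsearch aggregation itself; the `FunnelEvent` shape and the sample data are assumptions, and the sketch ignores event ordering in time.

```typescript
interface FunnelEvent {
  name: string;
  interactionId: string; // ties together the events of one user interaction
}

// For each step, count interactions that reached it and every step before it.
function funnel(events: FunnelEvent[], steps: string[]): number[] {
  const seen = new Map<string, Set<string>>(); // interactionId -> event names
  for (const event of events) {
    let names = seen.get(event.interactionId);
    if (names === undefined) {
      names = new Set<string>();
      seen.set(event.interactionId, names);
    }
    names.add(event.name);
  }
  const counts = steps.map(() => 0);
  seen.forEach((names) => {
    for (let i = 0; i < steps.length; i++) {
      if (!names.has(steps[i])) break; // interaction dropped out at this step
      counts[i] += 1;
    }
  });
  return counts;
}

// Made-up sample data: "d" clicked without a recorded "seen", so it is not counted.
const funnelEvents: FunnelEvent[] = [
  { name: "hotspot seen", interactionId: "a" },
  { name: "hotspot hovered", interactionId: "a" },
  { name: "hotspot clicked", interactionId: "a" },
  { name: "hotspot seen", interactionId: "b" },
  { name: "hotspot hovered", interactionId: "b" },
  { name: "hotspot seen", interactionId: "c" },
  { name: "hotspot clicked", interactionId: "d" },
];

const funnelSteps = ["hotspot seen", "hotspot hovered", "hotspot clicked"];
console.log(funnel(funnelEvents, funnelSteps)); // → [ 3, 2, 1 ]
```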