
Ask HN: How to implement Telemetry for an Open Source project correctly? - cube2222
Hey, I’m one of the main OctoSQL[1] contributors.

For a while now I’ve been wondering how to do telemetry for open source projects well, in a way that doesn’t end in widespread user rage about privacy.

With an open source project on GitHub, all you really have is the star count and the download count (which is not that useful, because in this case most users “go get” octosql anyway).

You need telemetry to know if anybody is actually using your software, what kind of workloads they’re running (to optimise for cases which are actually useful to users), what versions people are running, and what kinds of systems they’re running on (big servers, laptops, etc.).

Now my thought was to just embed a constant URL in the open source repo where information like this gets sent. It would be on by default, with a warning on first run (almost nobody would make an intentional effort to turn it on if it weren’t on by default), and could be disabled using an environment variable. The sent telemetry could be locally browsable so you know what you’re sending. The gathered data could be publicly displayed (it’s an open source project after all).

However, I’d expect a system like this to be met with backlash. Is this a vocal minority? How would you solve this problem? Is there any “open source way” to do this?

[1]: https://github.com/cube2222/octosql
======
Khelavaster
Good thing you're open-source. If your users don't like your telemetry they
can remove it for their own builds.

~~~
cube2222
Currently there's no telemetry :)

But yeah, firstly we could provide an environment variable to disable it, and
additionally a build flag to remove it completely.
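
A rough sketch of what the environment-variable part could look like in Go.
The variable name `OCTOSQL_NO_TELEMETRY` and the helper `telemetryDisabled` are
made up for illustration; nothing like this exists in OctoSQL today.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// telemetryEnvVar is a hypothetical variable name chosen for this sketch.
const telemetryEnvVar = "OCTOSQL_NO_TELEMETRY"

// telemetryDisabled reports whether the given value of the opt-out variable
// should turn telemetry off. An unset/empty value, "0" and "false" leave
// telemetry on; any other value is treated as an opt-out.
func telemetryDisabled(value string) bool {
	switch strings.ToLower(strings.TrimSpace(value)) {
	case "", "0", "false":
		return false
	default:
		return true
	}
}

func main() {
	if telemetryDisabled(os.Getenv(telemetryEnvVar)) {
		fmt.Println("telemetry: disabled")
		return
	}
	fmt.Println("telemetry: enabled (set " + telemetryEnvVar + "=1 to disable)")
}
```

For the build flag, one option would be a Go build constraint: keep the sending
code in a file marked `//go:build !notelemetry` and a no-op version in a file
marked `//go:build notelemetry`, so that `go build -tags notelemetry` leaves the
sending code out of the binary. The tag name is again just an assumption.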

~~~
egdod
Opt-out telemetry is shitty. Make it opt-in or don’t do it at all.

~~~
cube2222
You really have to expand if you make a statement like this.

~~~
egdod
It seems pretty self explanatory. What’s your question?

------
egdod
> You need telemetry

Lemme stop you right there.

~~~
cube2222
I simply don't agree.

Without telemetry, at least in the early stages where you don't get 1000000
issues every day, you have no idea what your users are actually finding
useful.

This may lead to you developing features which nobody cares about, wasting
precious time you could spend creating actually useful stuff.

It's a win-win, as long as nobody's privacy is violated. And that part is
basically what this whole question is about. How to do telemetry in a way that
won't feel privacy-violating to users (and won't actually be privacy-
violating, as opposed to "good feelings by privacy violation obfuscation").

~~~
qplex
You make it completely optional.

Just opening an outbound connection can be a privacy issue for some users.

And preferably you don't ever ask normal users to enable telemetry (and if you
do ask, default to no).

Telemetry is only a feature for those who wish to actually help you develop
your software, not just use it.

~~~
cube2222
I'm worried though that in practice you'll end up with almost nobody making an
effort to enable it.

I don't mind VS Code sending telemetry, but I surely wouldn't spend even 3
minutes turning it on if it weren't on by default.

~~~
yorwba
As a compromise solution, you could collect telemetry locally and prompt the
user to upload it, e.g. if there was an assertion failure or an operation took
much longer to complete than expected.

That way, the value of providing telemetry data is much clearer to the user
and in the end it remains their decision whether to provide it or not.

