I'm not talking about GDPR specifically, I'm talking about embedding a third par...

jamiequint · on Oct 30, 2019

I think that’s a fair point from a security standpoint, but there are clear, well-established technical solutions for sending telemetry data to third-party scripts without allowing page access (e.g. safeframes), yet the conversation isn’t about that. The conversation (more like coordinated uni-directional screeching) is instead an irrational moral panic that is not justified by the facts at hand.

I don’t see whether or not you have a relationship with the company as relevant. The mechanism for passing the data is just an implementation detail. You don’t have control over what relationships the company has on the backend (e.g. what if they store your telemetry data in BigQuery or Snowflake, or keep your log data in Loggly?), so I don’t see why this expectation suddenly applies when the data is sent from the frontend instead.

orf · on Oct 30, 2019

In self-hosted instances, which is what I was talking about, you do have control over the backend. At least more than you do with the managed version. It pretty much boils down to this: can something leak sensitive information to a third party? If so, then it's a no-go.

If you have a contract with that third party, and you deem that third party to be a safe harbour for your data (yes, that includes gitlab.org, AWS, etc), then that's a different case.

If Gitlab had instead said:

1. We are going to enable telemetry on all public repositories on Gitlab

2. On self-hosted instances we will provide you with the ability to embed your own analytics, from a company of your choosing

Then there would have been less screeching (and I fully agree it is screeching). Unfortunately, with self-hosted instances, you simply cannot allow Gitlab to leak information like that to a company you don't have a direct relationship with. I'm not sure how else to phrase this concept or explain it, and it doesn't really matter whether you use safeframes or not.

jamiequint · on Oct 30, 2019

I see your point with respect to self-hosting. I get that most of the reason for running on-prem is data security and privacy and I can see why people might get annoyed at something they thought they were buying not really being there.

That said, while I'm not familiar enough with the details of how Gitlab supports self-hosting to comment on whether their particular case still gives customers control over the backend, many self-hosted AWS solutions are implemented as marketplace AMIs, which the end user can run in their own VPC but still without control over what is running inside the AMI. It's not necessarily that odd for software implemented this way to still phone home with telemetry.

orf · on Oct 30, 2019

They do indeed have opt-out, instance-wide telemetry. This is restricted to specific site-wide activity: number of merge requests created, users active, gitlab version, usage of feature X, etc. It doesn’t send back any sensitive data (project names, namespaces, comment text, diffs), or give a third party the potential to access it.

You can also view all the data it sends back in the admin console, and disable it.

Again, it’s the trust aspect. Sure, gitlab could just silently implement a phone home with all your private data (even by accident). They would be put out of business if they did, for breaking their contract with us and others. Nobody would trust them.