
Microsoft .Net Core telemetry is not opt-in - setquk
https://github.com/dotnet/cli/issues/3093#issuecomment-392663561
======
dnomad
It boggles the mind that a bunch of very smart people at Microsoft thought
this was a good idea. Say what you will about Oracle but Java has never been
malware that continuously spies on users. Unfortunately the GDPR explicitly
allows this kind of IoT-style ubiquitous anonymized monitoring: as long as the
information cannot be tied back to an individual person (meaning no IP
addresses but GUIDs are fine) and the business can claim a legitimate business
interest they're likely to get away with it. This sort of ubiquitous
monitoring is only going to get more popular. In Hong Kong we're seeing new
Condo apartments that have full-fledged "internal sensor grids" a la Star Trek
(as it was described to me). In theory though the data collection firm (which
is not the same firm that owns the building) has no way of correlating it back
to a specific person/apartment number.

~~~
manigandham
Like all things, there must be a sense of context and proportionality. This
issue is specifically about the .NET Core SDK command line tools.

Getting data about which commands are used most, if they're successful or
throwing errors, possible typos, unclear command flags, etc, are useful when
shipping software frameworks used by 10s of millions to build products that
run the entire Fortune 500 and thousands of other businesses.

This is not ubiquitous spyware, and is about as generic as it gets while still
remaining useful. Every software company (many of which are featured here on
HN) does the same analytical tracking to figure out how their products are
being used and it results in better features. What exactly is the danger here,
especially with the lack of personal info and an opt-out method?

They even make the data public along with insights into how it helped:
[https://blogs.msdn.microsoft.com/dotnet/2017/07/21/what-weve...](https://blogs.msdn.microsoft.com/dotnet/2017/07/21/what-weve-learned-from-net-core-sdk-telemetry/)

~~~
0x0
Collecting command line arguments is a HUGE invasion of privacy.

Imagine I ran a command "make.exe my-secret-project-name" or "make.exe
process_gdpr_removal_request SOME_SSN".

And as the linked page admits, they really do collect all command line
arguments, not just a white list of valid commands. Perhaps someone
accidentally pastes an AWS API Secret into the wrong terminal window and it
ends up as "dotnet.exe MY_AWS_SECRET". (_"You will notice misspellings,
like “bulid”. That’s what the user typed. It’s information"_).
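
To make the "white list of valid commands" idea concrete, here is a minimal sketch of what allowlisting the first argument before it is recorded could look like. The command names and the `telemetry_command` helper are hypothetical, chosen for illustration; nothing here is taken from the SDK's actual implementation.

```python
# Hypothetical allowlist; these names are illustrative, not the SDK's real list.
KNOWN_COMMANDS = frozenset({"build", "run", "test", "restore", "publish"})


def telemetry_command(argv):
    """Return the value a telemetry event would record for the first argument."""
    if not argv:
        return "<none>"
    first = argv[0]
    # Anything outside the allowlist collapses to a fixed token, so a typo
    # such as "bulid" or a pasted secret is never recorded verbatim.
    return first if first in KNOWN_COMMANDS else "<unrecognized>"


print(telemetry_command(["build", "-c", "Release"]))  # build
print(telemetry_command(["MY_AWS_SECRET"]))           # <unrecognized>
```

The trade-off is that misspellings like "bulid" would no longer show up in the data, which is exactly the information the quoted blog post says it wanted.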

~~~
philliphaydon
They don't collect command line arguments according to the github issue.

~~~
0x0
They do collect at least the first argument, how else do they end up with
"bulid"? Accidents happen and someone could paste or tab-complete a
confidential thing in that spot...

~~~
dingo_bat
So if you push your api key to github by accident is github responsible?

~~~
0x0
Invalid comparison, in my humble opinion.

Pushing a secret API key to github, even accidentally, does not happen
covertly or as an unexpected side effect of some other tool written by github,
and you'll soon realize what happens because you'll see your key sitting
there. You also have a general idea of what you are playing with when you
start using github, whether it's private repos or public repos.

Having a local CLI tool submit your secret API key as part of telemetry
collecting command line arguments is totally unexpected and invisible for the
user, and you won't even see it has happened.

------
nhebb
I have been contacted twice in the past year by Microsoft reps who both told
me that one of my products is among the top installed add-ins for Excel (they
didn't say top "X", just "top" - but I'll take it). I assume they collect
this data via telemetry.

What bothered me is that when I asked questions back, I would get no reply. If
they are going to collect this data, the least they could do is help
developers out with some aggregate data. I've been selling this product for
10+ years, and to this day, I still vacillate on which version of the .NET
framework to target.

~~~
SmellyGeekBoy
I've always gladly submitted bug reports and begrudgingly accepted some
telemetry in the belief that it helps the developers fix bugs and build a
better product. But apparently this data isn't even passed back to developers?

------
yummybear
.NET Core does not "violate" GDPR any more than Google Analytics does, or any
other system that receives data. It's up to the data controller to ensure that
no personal data is sent, which is done by setting the environment variable
DOTNET_CLI_TELEMETRY_OPTOUT.

There are fair arguments against the nature of opt-out telemetry, but saying
"they violate GDPR" is just hyperbole, imo.
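
For anyone launching the CLI from a script, a minimal sketch of setting the opt-out variable for a child process might look like this. The `dotnet_env` helper is a made-up name for illustration; the variable name is the one mentioned above, and the value `1` is assumed to be the accepted opt-out value.

```python
import os


def dotnet_env(base=None):
    """Copy an environment mapping and set the telemetry opt-out variable."""
    env = dict(os.environ if base is None else base)
    # Assumption: "1" is the value the CLI treats as "opted out".
    env["DOTNET_CLI_TELEMETRY_OPTOUT"] = "1"
    return env


env = dotnet_env({"PATH": "/usr/bin"})
print(env["DOTNET_CLI_TELEMETRY_OPTOUT"])  # 1

# Usage sketch (assumes a `dotnet` executable is on PATH):
# import subprocess
# subprocess.run(["dotnet", "build"], env=dotnet_env())
```

This only affects processes started with that environment; a machine-wide opt-out still needs the variable set globally.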

~~~
SmellyGeekBoy
It violates GDPR because it defaults to enabled. GDPR states that data
collection should be opt-in.

~~~
philliphaydon
The data is neither personal nor sensitive. So which data collection needs to
be opt-in?

~~~
opless
Of course all metadata is subjectively speaking personally identifiable
information. But which data are you claiming isn't?

~~~
philliphaydon
All the data they collect. At least based on the GDPR training thing I had to
do yesterday for work.

------
fiiv
From their perspective, they claim that it is not personal info since it's
anonymised. On that, they're right.

But it still makes a shitty user experience to do something that is counter to
what your users expect and want.

~~~
setquk
Maybe, maybe not. They record 3 octets of your IP address and the experience
is opt-out rather than opt-in. ECJ ruled that IP addresses are PII but did not
specify if using a partial address is OK. On the basis it is the 3 most
significant octets, then this could be used to identify a company or a pool of
users or a unique user on an unpopulated network.
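
As a rough illustration of what keeping the three most significant octets means, the sketch below zeroes the last octet of an IPv4 address and counts how many addresses share the result. It assumes the truncation is a plain /24 mask, which is a guess about how it is done; the address is a documentation-range example.

```python
import ipaddress


def truncate_ipv4(addr):
    """Zero the last octet, keeping the three most significant ones."""
    net = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(net.network_address)


def candidates(truncated):
    """Number of full addresses that map to the same truncated value."""
    return ipaddress.ip_network(f"{truncated}/24").num_addresses


masked = truncate_ipv4("203.0.113.77")
print(masked)              # 203.0.113.0
print(candidates(masked))  # 256
```

At most 256 addresses hide behind each stored value, so on a small or sparsely used network the truncated form narrows things down considerably.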

Edit: also from [1]: _"Hashed MAC address — Determine a cryptographically
(SHA256) anonymous and unique ID for a machine. Useful to determine the
aggregate number of machines that use .NET Core. This data will not be shared
in the public data releases."_

I think a machine could be a user in this circumstance? Also a hash isn't
anonymous if it's deterministic and based on the MAC address alone, as it's
merely a derived value.
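
A small sketch of the "derived value" point: an unsalted hash of the same input always gives the same output, so it works as a stable pseudonym for the machine. How the MAC is encoded before hashing is not stated in the quote; hashing the lowercase colon-separated text is an assumption made here, and the MAC values are made up.

```python
import hashlib


def mac_id(mac):
    """SHA-256 of a MAC address in lowercase text form (encoding is assumed)."""
    return hashlib.sha256(mac.lower().encode("ascii")).hexdigest()


first_run = mac_id("00:1A:2B:3C:4D:5E")   # made-up MAC
second_run = mac_id("00:1a:2b:3c:4d:5e")  # same machine, later run
other = mac_id("00:1a:2b:3c:4d:5f")       # a different machine

print(first_run == second_run)  # True
print(first_run == other)       # False
```

Every event carrying that identifier can therefore be linked to the same machine, even though the raw MAC is never sent.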

[1] [https://blogs.msdn.microsoft.com/dotnet/2017/07/21/what-weve...](https://blogs.msdn.microsoft.com/dotnet/2017/07/21/what-weve-learned-from-net-core-sdk-telemetry/)

~~~
tscs37
At least per German court rulings, which at that time largely covered what is
defined as PII under GDPR, the full IP is PII but the first three octets are
not.

~~~
annabellish
The last three octets absolutely could be in conjunction with some other
pieces of data, though.

~~~
tscs37
Of course, yes. Once you can identify a unique person it's personal data, even
if you cut off octets. If you only log the IP for example for detecting
duplicate accounts, it should be good enough.

------
alkonaut
Telemetry can be opt-OUT just as long as it displays very clearly to the user
what is going on.

I'd prefer opt-IN but I fully understand if Microsoft don't.

Having something be opt-OUT and _not_ displayed to the user (meaning they are
in no position to opt out because they aren't aware they even need to) is of
course entirely unacceptable no matter which way you turn it. It can't be in
fine print it needs to be in big bold letters on first run, install, etc.

------
jenscow
Why can't they just make it opt-in?

When one of my kids asks for something and I say no, they might ask again. But
the more they continue to ask, the more determined I am to say no. The
difference between me and Microsoft (in this context), is I'm trying to bring
up respectful children.

This is just what Microsoft are doing - judging by the +1's, there's a huge
majority of users who just don't want this. Instead of giving their customers
what they're asking for, they just dig their heels in further. This is one of
the main reasons I started the transition away from Windows since the 8
preview.

They said that making it opt-in would mean less data. Well, doesn't that tell
them something?

I'm sure the .NET team would have got the same answers if this were opt-in -
and let's face it, the discoveries aren't exactly ground breaking.

If you _ask_ me, I probably would have opted in. I'm sure other people would
do the same (if they could). But it's not, so I jumped through the hoops to
disable it (added it to /etc/environment). You don't ask, you don't get.

And another thing, why not disable it using the registry when running on
Windows? Personal experience tells me that only a minority of Windows devs
even know what an environment variable is, let alone how to set one globally
and persistently.

It's arguable whether or not this is personally identifiable. Even if it's
not, it still leaves a bad taste as it's still classed as "spying" - they're
sneaking the data out by making people take effort to disable it; it's beyond
a simple Y/N, people have to actually learn how to do something. And it's also
giving people the impression that their own apps will be contaminated.

It's a command-line tool I'm running on my local computer for professional
use, not a click-bait website or social media platform.

------
rkeene2
Joy [1] also does this, with no obvious warning, and no way to opt out. I
reported it and the contributors said they had no problem with it, and if I
didn't like it I could build from source, because correcting my activities
would be useful to them.

[1]
[https://github.com/matthewmueller/joy/issues/79](https://github.com/matthewmueller/joy/issues/79)

------
rehemiau
I believe it's the same with VSCode. All "telemetry.enableCrashReporter",
"telemetry.enableTelemetry" and
"workbench.settings.enableNaturalLanguageSearch" from user settings default to
true.

------
znpy
Imho people can spend all the time they want discussing this.

The only way to sort this out is someone suing Microsoft and then let a judge
declare whether telemetry not being opt-in is a GDPR violation or not.

------
MatthewWilkes
This has been covered here many times, but consent is not the only basis for
lawful processing under GDPR.

------
moron4hire
This title reads like .NET Core is forcefully replacing all other .NET
frameworks on your system.

~~~
setquk
It is. Future versions of powershell will ship with .Net core only.

~~~
moron4hire
PowerShell is an extremely small part of the .NET ecosystem. It's more like a
REPL for a programming language than a shell. And regardless, that a future
version of a piece of software would be implemented in the latest version of
the framework is not at all the same thing as what I'm talking about.

------
jasonkostempski
I read 'not' as 'now'. For a second there I thought the world was a slightly
better place today :/

I'm all for continuing the fight, but I'm surprised this very old news is on
the front page again.

------
he0001
I bet we’ll see research articles about how people use companies' public data
to pinpoint individuals in different ways. I’d think we all are going to be
surprised what you can find from that data.

------
CodesInChaos
Note that recovering the plain MAC address from SHA256(MAC) costs a single GPU
day on a 1080.
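
To show why an unsalted hash of a MAC is recoverable at all, here is a toy sketch that brute-forces a deliberately tiny slice of the space on a CPU. It assumes a single SHA-256 over the six raw MAC bytes and a known three-byte prefix, neither of which is confirmed for the real telemetry; the prefix and MAC are made up.

```python
import hashlib


def sha256_mac(mac_bytes):
    """Single unsalted SHA-256 over the raw MAC bytes (assumed scheme)."""
    return hashlib.sha256(mac_bytes).digest()


def recover_suffix(target, prefix, suffix_bits=16):
    """Try every MAC of the form prefix + 3-byte suffix below 2**suffix_bits."""
    for n in range(1 << suffix_bits):
        candidate = prefix + n.to_bytes(3, "big")
        if sha256_mac(candidate) == target:
            return candidate
    return None


prefix = bytes.fromhex("001a2b")            # made-up vendor prefix
secret = prefix + bytes.fromhex("00beef")   # made-up MAC to recover
found = recover_suffix(sha256_mac(secret), prefix)
print(found.hex(":"))  # 00:1a:2b:00:be:ef
```

The real search is the same loop over all 2^48 values, which is where the GPU time estimate comes from.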

------
SwingingShips
Oh this will get exciting

------
oaiey
Can I suggest changing the title? It is misleading in its current state.

PS: this should be opt-in due to community request and not by some philosophy
or law

~~~
sctb
Sure thing, we've added the missing “telemetry”. Thanks!

------
manigandham
This should link to the entire github issue, not a single person's recent
comment in a 2 year old thread. Also the title should be changed to clarify
"telemetry".

------
polskibus
Please change the title to include "telemetry".

------
flukus
This has been an issue for 2 years now, so what is it they've done with this
oh so valuable data? What improvements have been made as a result of the data
collected?

I suspect the answer is that no concrete improvements have been made and all
it's accomplished is to make them look bad.

~~~
yread
[https://blogs.msdn.microsoft.com/dotnet/2017/07/21/what-weve...](https://blogs.msdn.microsoft.com/dotnet/2017/07/21/what-weve-learned-from-net-core-sdk-telemetry/)

~~~
setquk
And there's the money shot:

_"Hashed MAC address — Determine a cryptographically (SHA256) anonymous and
unique ID for a machine. Useful to determine the aggregate number of machines
that use .NET Core. This data will not be shared in the public data
releases."_

~~~
JorgeGT
Someone please correct me if I'm wrong, but (2^48 possible MAC addresses) /
(60000MH/s) / (3600 s/h) = 1.3 h to calculate the SHA256 sum of every MAC
address in an AWS p3.16xlarge instance (~$50)

[https://medium.com/@iraklis/running-hashcat-v4-0-0-in-amazon...](https://medium.com/@iraklis/running-hashcat-v4-0-0-in-amazons-aws-new-p3-16xlarge-instance-e8fab4541e9b)
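
The arithmetic above can be checked with a few lines. The 60000 MH/s figure is the one quoted in the comment and is taken on trust here; it also assumes one plain SHA-256 per candidate.

```python
MAC_SPACE = 2 ** 48            # every possible 48-bit MAC address
HASH_RATE = 60_000 * 10 ** 6   # 60000 MH/s, as quoted above (hashes/second)

seconds = MAC_SPACE / HASH_RATE
hours = seconds / 3600

print(f"{hours:.1f} h")  # 1.3 h
```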

~~~
viraptor
You may be right. But it may not be a straight sha256. It could be a multi-
round hash based on sha2. In the same way shadow hashes are by default called
sha512, but in reality it's a 5000 round version, so your price could be
$250000 instead. (Or less/more)
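
The $250000 figure is just the single-round estimate multiplied by the round count. A sketch of that scaling, assuming cost grows linearly with rounds and using the ~$50 and 5000-round numbers from this thread:

```python
def scaled_cost(single_round_cost, rounds):
    """Brute-force cost if every candidate needs `rounds` hash iterations."""
    return single_round_cost * rounds


SINGLE_ROUND_COST = 50   # dollars, the rough estimate given upthread
ROUNDS = 5000            # round count of the shadow-style scheme mentioned

print(scaled_cost(SINGLE_ROUND_COST, ROUNDS))  # 250000
```

Whether the telemetry hash is iterated at all is unknown; the quoted description only says SHA256.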

~~~
Cakez0r
You could cut down the search space a lot by only enumerating the MAC
addresses of known vendors
([https://gist.github.com/aallan/b4bb86db86079509e6159810ae9bd...](https://gist.github.com/aallan/b4bb86db86079509e6159810ae9bd3e4)).

That cuts the search space down to 23000 vendors * 0xFFFFFF = 385875945000.
With a hashrate of 60000MHs, you could SHA256 hash that entire space in 6.5
seconds. If you have an NVidia GTX 1080, you can do it in ~2 minutes 16
seconds.
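
The same numbers worked through in code, using only the figures from this comment (23000 vendors, 0xFFFFFF suffixes, 60000 MH/s, and 2 minutes 16 seconds on a GTX 1080). The exact suffix count per prefix is 2^24, one more than 0xFFFFFF, which does not change the result meaningfully.

```python
VENDORS = 23_000
SUFFIXES = 0xFFFFFF                  # as written above; exact count is 2**24

space = VENDORS * SUFFIXES
print(space)  # 385875945000

# Time at the quoted 60000 MH/s rate.
fast_seconds = space / (60_000 * 10 ** 6)

# Hash rate implied by "2 minutes 16 seconds" on a GTX 1080.
gtx_seconds = 2 * 60 + 16
implied_rate = space / gtx_seconds   # hashes per second

# Full 2**48 space at that implied rate, in days.
full_space_days = 2 ** 48 / implied_rate / 86_400

print(f"{fast_seconds:.1f} s, {full_space_days:.2f} days for the full space")
```

The last figure lands a little over one day, which lines up with the "single GPU day on a 1080" estimate elsewhere in the thread.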

------
akerro
Does that mean that every .Net application has a built-in botnet unless
disabled by developers? Can I disable it as a user?

~~~
manigandham
This has nothing to do with your application or code. This is the .NET Core
command line tools that send generic telemetry on which commands are used
most, timings, and SDK versions. It shows a message when you first use the
tools and can be disabled, check the docs:
[https://docs.microsoft.com/en-us/dotnet/core/tools/telemetry](https://docs.microsoft.com/en-us/dotnet/core/tools/telemetry)

------
partycoder
What do you need to stop trusting Microsoft?

They haven't changed a lot. They're still patent trolls, and if they still
exist, it is mostly because they're still living off the momentum of their 90s
monopoly.

This is the type of BS that comes out of the great minds at Microsoft:
[https://www.cbsnews.com/news/hiybbprqag-how-google-tripped-u...](https://www.cbsnews.com/news/hiybbprqag-how-google-tripped-up-microsoft/)
<- a great example of how your analytics data might be used.

