Hacker News
Microsoft: Require user consent before sending any telemetry (github.com/microsoft)
128 points by rawland 6 months ago | 122 comments



Have you noticed that MS mostly stopped using EEE, and changed strategy to just ignore rules/laws/licenses, and wait to see what happens? We hear it frequently that "today's MS is not the same as the old MS", but I have my doubts.

This particular one is just the latest. But the really big one (IMHO) is the one where they simply started to ignore the EFF[0] when it was asking them about the copyright status of Copilot. If the court decides against the EFF, that will have a lot of effect on the legality and enforcement of most OSS licenses (though I'm an armchair lawyer, and not even in the US). Fun times ahead.

[0]: If I remember correctly, it was the EFF that mentioned that MS stopped responding to them. I have found the lawsuit, but it was not filed by the EFF. Google is more useless by the day.


> Have you noticed that MS mostly stopped using EEE,

No, I haven't. Notice that MS now loves Linux... provided you run it on Azure or as a component of Windows (WSL). They adopted Chrome...'s rendering engine and then abused their desktop OS market share to shove the result down people's throats. They don't have the leverage they once enjoyed, but the approach didn't change, at least not in general.


Hardly any different from FAANG so beloved by FOSS folks....


I don't see the connection between Amazon, Facebook, and Netflix and free software.


Apparently everyone celebrates the stuff they put out into FOSS, whereas their agenda is hardly any different from Micro$oft.


In the case of Go quite recently, there was such outrage about opt-out telemetry that the proposal was walked back, and implemented (correctly) as opt-in instead.


Yeah, however there have been other cases where the outrage was ignored, like how the modules story went.


Modules isn't any worse than NuGet for privacy. Neither are good, but it's hard to say the outrage was "ignored" - people bring it up constantly.


> people bring it up constantly.

Of course. Because the Go team ignored everyone and implemented it anyway.


...yes, those are also poorly regarded:

- Facebook honestly mostly seems orthogonal to open source concerns

- Apple most certainly does get criticized for the incompatibility of their App Store with copyleft licenses, and the way they deliberately avoid GPLv3

- Amazon is the poster child for bad interactions with the open source ecosystem

- Netflix... honestly I don't know of any problems aside from DRM (and I'm pretty sure that's imposed on them, not really their choice)

- Google is absolutely criticized on a regular basis for how they interact with the wider community

Though of course even if your claim were correct, it wouldn't really matter; whataboutism is a bad argument because two bad actors isn't zero bad actors


> Hardly any different from FAANG so beloved by FOSS folks....

I don't see what you mean. I'm not sure about "FOSS folks" but at least in the HN community FAANG has little sympathy.


I think they were referring to how React etc. is looked upon favorably by HN readers while .NET etc. is criticized, and most of that criticism is invalid.


I think this is a tendency of all international mega-corporations. Law is not homogeneous around the world, and since you are consequently in violation somewhere anyway, you learn how to use that in your favor and ignore it for quite a while. And then, once it starts to be annoying, you can finance an army of lawyers to delay or even change the law.

On the one hand it is quite reasonable to work like that; on the other it is really unethical and bad for society as a whole.


The current system highly incentivises sufficiently large corporations to embrace the Nike principles: Break the rules, fight the law

In the worst-case scenario, if you lose a game stacked in your favour several times in a row, you pay a pittance, or performatively correct a now-obsolete injustice.

VS Code telemetry will remain opt-out because it yields very valuable information. Microsoft is not a democracy, and the outcry here is less than a rounding error, a footnote in some internal director’s morning agenda.


The current system highly incentivizes pretending not to know.

Obtaining power at any cost requires the internal director to pretend he doesn't know what he's doing.

The vast majority of social capital is made by lying to people, pretending to not know you've done it and dropping relationships with anybody who is not pulling in your direction.

Silence is vastly underrated, I say ironically, so I shouldn't be typing this out.


Oh, I find this note quite interesting. I noticed that every time when a BigCo's shenanigans come to light, many people are ready to scream Hanlon's razor!one111!! - and BigCo is very happy to accept this reasoning every single time. It's like they are prepared in advance to use it as a defense, in case their doings come to light. Even if the same thing keeps repeating again and again and again and...


I do the same... But in the last month I got three traffic tickets: one for not using my blinker, one for not keeping my lane, and one for speeding.


To be fair if someone comments to me with things like:

> Please give an answer within the next week until the 16th of June.

I wouldn't respond to them either out of spite


The issue with society, or one of them, is thinking it's acceptable for a corporation that is breaking the law to feel spite. The guy was not talking to a person; he was talking to a shitty corp breaking the law.


Which law? Instead of shit talking, they can report it or file a lawsuit.


G.D.P.R., it says so in the thread.

And Europe is not a litigious environment, we start with complaints first.


There is a suggestion that some of the data sent is in violation of the GDPR. There are no specifics about what exactly would be in violation, however. I think 90% of sites with cookie banners are blatantly violating the GDPR, but whether I'm correct in that assessment is anyone's guess. It would depend on court processes that haven't happened yet. It's based on my understanding and interpretation of the regulation, nothing else. I guess it's the same with the complaint here. If there is a question of a violation then it's probably due to Microsoft and the commenter having different interpretations about specific data such as hashed MAC addresses (which certainly isn't clear cut).


Yup, there are lots of violations: https://noyb.eu/en

And I do think Microsoft has good lawyers and believe they reviewed any activity prior to receiving consent quite carefully.


Are there any concluded legal processes that are concerning

1) the form of consent banners

2) consent vs legitimate interest for ip transmission as part of http request headers

3) whether ads are a legitimate interest for web sites?

Those seem to me to be the 3 “big questions” of the GDPR. The regulation and most legal processes however seem to focus more on large scale data storage cases, failure to answer user requests etc. And those are important from a privacy standpoint but from a technical standpoint to software developers the 3 above seem much more interesting, yet mostly ignored by courts? I get a feeling they don’t want to touch it because they are a can of worms


1) YES https://noyb.eu/en/where-did-all-reject-buttons-come

2) Consent vs legitimate interest is not for data but for data processing purposes. A company, say Paypal, may have a legitimate interest to process (and, therefore, collect) your IP address in fraud detection systems. If you don't give consent, fraud prevention dept cannot share your IP address with the marketing dept (and would have to erase it once it's no longer used by the fraud detection system). Which is why you get pestered with more consent requests than needed. Consent also has to be freely given: https://edpb.europa.eu/news/national-news/2019/facial-recogn... + https://noyb.eu/en/pay-or-okay-tech-news-site-heisede-illega...

3) Ads are not personal data (data that came from you and/or data related to you, not necessarily PII). GDPR is only about careful handling of all personal data. It does not prevent a company from showing you ads. But GDPR does prevent tracking to show targeted ads: https://noyb.eu/en/norway-temporary-ban-behavioral-ads-faceb...


The complaint should come from some authority or have legal backing. The poster assumes that they are breaking the GDPR and is seeking an explanation, with some shit talk to make it sound like legalese.

Companies, as a policy and by logic, don't reply to such comments/posts because the response becomes a legal document. So any expectation of an answer is futile.


Articles 16 to 21 provide you, the end user, a range of grounds for a complaint: https://gdpr-info.eu/chapter-3/

Article 12 requires a response within one month. However, you should not post comments on a repo issue to get a response but write to the DPO instead: https://learn.microsoft.com/en-us/compliance/regulatory/gdpr...

Read more on your GDPR rights and how to exercise them: https://noyb.eu/en/exercise-your-rights


Maybe you come from a place where citizens just kiss up to corporations and count for nothing, but a GDPR complaint here can come from anyone; even a citizen can sue: https://commission.europa.eu/law/law-topic/data-protection/r...


You got no idea bud...


Well, sure, it's just a general observation, so knock wood.


Sorry did I offend your papacorp


Are you the OP from the git thread? Why so butthurt?


Uh? No


The requested deadline is likely done ahead of filing a complaint in Europe, to show they gave ample warning.

Also remember he's not talking to a human, but to a soulless corporation. He was as cordial as could be given the circumstances.

And finally, remember that it doesn't matter if a product Microsoft develops to increase their control over developers (via vendor lock-in, mindshare, and forced telemetry) happens to result in a decent free text editor for the user. No one owes them gratitude. This isn't charity.

P.S. Did you know VSCode lets extensions not respect the user's "no telemetry" choice? It's been an open ticket for like 4 years now, that MS have no intention to ever fix, even though all it would take is a simple VSCode Extension Store EULA change.


I've written to companies in the UK before with similar deadlines, it can be statutory - I am giving you notice that this communication starts the clock on the 30 day period I am required to allow you to give me a satisfactory resolution before I will escalate this case to the relevant authority.

Last time I had to use that sort of language was with a deranged ISP who had failed to deliver an internet connection, then decided to chase a debt for unpaid bills for this non-existent connection two years later.


Virgin Media by any chance? I had them do that to me when I clawed back the money through my bank that they took for an install they never delivered.


This was Bulldog Broadband, back in the mid 00s when they were about the first to advertise 8Mbit in London at a relatively reasonable price, but were then swamped with orders and couldn't seem to even keep track of what they were doing, let alone deliver anything.

But you're right it could be any one of a number of them! I had problems with quite a few over the years.

Among the amusing misfeatures of bulldog broadband was their cancellation process, which required confirming by sending an email to "cancellationconfirmation@..."

Said cancellation confirmation address had not been set up and would just bounce.


It's not the only nor the first comment. They had plenty of time to comment back before.


Yes, but that’s our childish instinct to be affronted at being held to account for what we know we’re responsible for.


I suspect they are dating it to trigger some terms of the GDPR, e.g. reasonable response times when notified of an infraction.


That opens another question: which means of communication would count for that? Does commenting on a GitHub issue really count? Wouldn't you have some sort of contact details specifically for that in a license agreement or similar?


Usually, for the statutory thing, you do have to be able to prove the counterparty received the communication, so registered mail is often the best way, because someone has actually signed for it at that point.

But if they respond at all using other channels, you probably still have enough.


Seems GDPR is pretty explicit since it requires specific documentation (Data Protection Statements/Privacy policy) and explains that they should contain complaint contact information. I think it would be pretty easy to contact Microsoft on the "expected" channel for a GDPR complaint.


Of course not. You have to go through a Data Protection Officer (DPO): https://learn.microsoft.com/en-us/compliance/regulatory/gdpr...


GDPR terms allow them to ask for any data about them personally. And Microsoft can say no if for example all the telemetry data is anonymous and aggregated. These attempts at sounding like a lawyer with demands to answer make the issue commenters sound like they are 14 years old and any engagement with that issue will never end unless it's locked.


Well, we are talking about GDPR. Setting a date to comply by is part of the enforcement of the GDPR afaik. I bet someone is setting up the points of a legal case, e.g. MS can say "oh no one explicitly stated a set date and GDPR" - now they can't use that excuse.


I don't see anything here[1] that mentions that made up karen legalese is any part of the process

[1] https://commission.europa.eu/law/law-topic/data-protection/r...


I always mention in my GDPR requests/complaints that I would kindly like to get a response within a month, in line with Art. 12 GDPR. Not because it's a "karen legalese" but to let the company know that I am exercising a specific right, not just asking for something random out of the blue.


A user should be able to configure a program (or all programs) such that outgoing communication is blocked, logged, or both. It really shouldn't be up to the program to decide what it wants to send, as it could easily scan the entire hard drive on the user's behalf.


Little Snitch on macOS and simplewall on Windows. Must-have tools for anyone serious about their right to privacy.


Have you tried running a firewall with explicit prompts? Everything connects home now. It's infuriating.


The majority of FOSS programs don't connect anywhere - although there has been an increase, for sure.

Last year we had an argument about this regarding LibreOffice, where an option to collect some telemetry was suggested as a nagging opt-in. Opponents argued against this because some fraction of our users will press Accept just to get through the installation, or without understanding what they're accepting; plus we just didn't want this kind of mechanism in a respectable piece of software. For now the idea seems to be dead in the water.


"The majority of FOSS programs don't connect anywhere" is a very weak position. I'm not sure if English has an equivalent for a saying from another language, but it translates to "trust but verify". So the parent's comment makes sense:

Get a global firewall for outgoing connections.


When the owner of a device is using it, they should have the right to inspect all data on that machine in plain language and to inspect all communications to and from that machine (again in plain language.) They should have the right to stop any communications at any level they choose using plain language menus.


they need to do this with cars too. Tesla and VW are collecting massive amounts of data. Should be opt-in


I'd support that law.



I'm not qualified to weigh in on the merits of the request, but asking a corporation to change something and then throwing in a bunch of legalese about compliance and GDPR seems like an excellent way to guarantee that the poor reviewer of the requests is not going to deal with it, let alone quickly.

At best, they raise it to their internal legal contact. The inhouse lawyer rapidly advises them to not respond in any written or recorded medium. Issue goes nowhere.

At worst, they realize that this is a hairball with "vaguely legal stuff" and decide to review some other issue instead for a more productive and less stressful day. Issue goes nowhere.


This language in the bug makes it easier to build a legal case against them.


It's very easy: complaints should be directed to whoever is listed in the Personal Data Protection Policy issued by (in this case) Microsoft. The privacy notice (which nicely seems to be the same one across Microsoft products!) clearly says how to complain, as it should: https://privacy.microsoft.com/en-us/privacystatement

And that method is not a GitHub comment.

The commenter might have followed the correct route to complain too, but could then at least have said that "I have contacted microsoft at [..] as outlined in the Personal Data Protection Policy and expect a response within [..]"


Truly anonymous data is not subject to the GDPR. So the question is whether the data they are collecting is truly anonymous. They seem to be claiming or suggesting "Yes it is" https://code.visualstudio.com/docs/getstarted/telemetry#_gdp....


It is neigh impossible to send truly anonymous data as telemetry. As soon as you're using the internet, you're disclosing an IP address, which is PII. If you add anything to link two subsequent telemetry reports together, that thing is PII (e.g. a hash or a uuid). If the telemetry report is detailed enough that they become somewhat unique, it's PII.

That said, consent is not the only grounds on which you can process PII. Contract, legal obligation, vital interests, public task, or legitimate interests are also valid grounds. Of these, legitimate interests is the most applicable in this situation.


> As soon as you're using the internet, you're disclosing an IP address, which is PII

Yes it's PII which of course is why no one who does Telemetry in a GDPR compliant way would store the IP address. The fact that it's "sent" (in order to send anything at all over http) isn't relevant. Only what's stored, for what reason, and for how long.

> If you add anything to link two subsequent telemetry reports together, that thing is PII (e.g. a hash or a uuid)

Again, no. PII is only information about physical people. Unless the data becomes enough to identify a person (in itself or together with other data), the data is not PII. Having a browser history associated to a random guid might be PII (because the browser history might pinpoint the user, not the guid!). But having a random guid associated to say "has run VS code 12 times this year" is not.


>legitimate interests

No, telemetry is not something MS needs to fulfil the primary purpose of VS Code. Best example is that the OSS version is there, without any telemetry enabled by default, still doing by and large the same job.


Legitimate interest doesn't mean absolutely essential.

The OSS version obviously benefits from the telemetry (to the extent telemetry is useful) because it's downstream of the version developed based on the telemetry.


"Disclosing an IP address" may just be a consequence of the medium of communication being TCP/IP. If MS does not log or store the IP in a meaningful/reversible way, are they processing PII?


In the Google Fonts CDN case the court ruled that it's irrelevant whether the website or Google had the opportunity to link the IP address to the user. The mere possibility of this is enough to consider it protected PII.


Question is whether Google Fonts CDN/server was storing the IP address or not. Linking to a user is secondary. If a server does not log or store raw IPs in the first place, where's the fault?


My man, you are arguing with an established case verdict: https://rewis.io/urteile/urteil/lhm-20-01-2022-3-o-1749320/ The wording that it is irrelevant what Google does with the IP (just the theoretical possibility of misuse is enough) is in the verdict.


Per Wikipedia, Germany's legal system doesn't have the concept of binding precedent. (And even if it did, in no country is the decision of a trial court binding precedent.)


With that argument - would it hypothetically be legal for anonymised telemetry to be submitted over Tor?


No, the IP should not be exposed to any third party, not only the final destination. Tor would hide the IP from the final destination but still expose it to the first relaying party.


The first relaying party would see the IP address, but none of the telemetry data. I think it's only the combination of the two that is legally a problem.


> It is neigh impossible

Haha sorry I couldn't continue past that! Neeeiiigggh!


Among the telemetry data:

> MacAddressHash - Used to identify a user of VS Code. This is hashed once on the client side and then hashed again on the pipeline side to make it impossible to identify a given user. On VS Code for the Web, a UUID is generated for this case.

A hash of a hash is about as expensive as a hash and it still uniquely identifies a machine, tying telemetry events to a specific user's machine. Microsoft's own telemetry description generator calls the field "EndUserPseudonymizedInformation". Pseudonymisation is inherently not anonymisation.

This bullshit is why I keep my PiHole on for my dev environment.


Unless there is any PII associated with the pseudonym, there is nothing specifically in GDPR that says you can’t or shouldn’t do this so long as it’s not information that can identify a physical person. Note that being able to attribute multiple pieces of data to the same anonymous person does not necessarily identify them (and it’s important to not accidentally do so):

It’s important, though, if you e.g. have multiple products, to use a _different_ pseudonymization (hash salt or whatever) for each; otherwise you run the risk of storing data that links too much information to one user, thereby de-pseudonymizing them in the worst case even though no individual app does. Having a user's behavior across multiple applications could pose such a risk in extreme cases.
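
A minimal sketch of the "different pseudonymization per product" idea, assuming a keyed hash (HMAC-SHA256) stands in for "hash salt or whatever". The keys, product names and user ID below are hypothetical and purely for illustration:

```python
import hashlib
import hmac


def product_pseudonym(product_key: bytes, user_id: str) -> str:
    """Derive a pseudonym that is stable within one product only."""
    return hmac.new(product_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical per-product keys; each product would keep its own.
editor_key = b"hypothetical-editor-key"
browser_key = b"hypothetical-browser-key"

user = "user-1234"
editor_id = product_pseudonym(editor_key, user)
browser_id = product_pseudonym(browser_key, user)

# The same user gets unrelated pseudonyms in the two products, so the two
# data sets cannot be joined on the pseudonym alone.
print(editor_id != browser_id)
```

Within one product the pseudonym stays the same from event to event, so per-product analysis still works.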

Edit: I think it's important to separate "hashing" and "hashing". A properly hashed identifier uses a salt that is generated on the client, so that it can't be used to identify the user. Basically: the first time the app runs, you generate a random salt which is only stored on the client and NEVER sent in telemetry. Anything you would like to transmit over the wire that would risk identifying the user (e.g. a computer name or MAC address) you hash with this local salt. This way no one can go to the database on the server side and try to match any data, e.g. check if the hash abc123 matches the computer name jimbob because hash("jimbob") = abc123. Just sending hash(MacAddress) without a local random salt would NOT be properly pseudonymous, because an attacker on the server side could ask and answer the question "Does this come from the address macaddress?".
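
Roughly what that could look like, as a sketch only. The salt file name, the 32-byte salt size and the use of SHA-256 are my assumptions, not anything VS Code is known to do:

```python
import hashlib
import secrets
import tempfile
from pathlib import Path


def load_or_create_salt(path: Path) -> bytes:
    """Return the per-install salt, creating it on first run. It is never sent."""
    if path.exists():
        return path.read_bytes()
    salt = secrets.token_bytes(32)
    path.write_bytes(salt)
    return salt


def pseudonymize(salt: bytes, identifier: str) -> str:
    """Hash an identifying value together with the local-only salt."""
    return hashlib.sha256(salt + identifier.encode("utf-8")).hexdigest()


# Simulate one installation across two runs (the temp dir stands in for the
# app's local config directory).
with tempfile.TemporaryDirectory() as tmp:
    salt_file = Path(tmp) / "telemetry_salt.bin"  # hypothetical file name
    first_run = pseudonymize(load_or_create_salt(salt_file), "jimbob")
    second_run = pseudonymize(load_or_create_salt(salt_file), "jimbob")

# A different installation has a different salt, and the server cannot
# recompute either value from the name alone.
other_install = pseudonymize(secrets.token_bytes(32), "jimbob")
unsalted = hashlib.sha256(b"jimbob").hexdigest()
```

Here `first_run` and `second_run` are identical, so events from the same install can still be correlated, while `other_install` and `unsalted` differ from both, so someone holding only the server-side database cannot test a guess like "jimbob".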


The hash used, at least when I looked into it last, was a plain sha256 hash, no salt or pepper. That's a unique identifier.

I think the massive amounts of behaviour analysis Microsoft does should be considered PII. They know when you turn on Visual Studio in the morning, and when you leave. They know when you go to lunch and don't click any buttons for a while, and they can see the colleagues with you in that boring meeting also not clicking any buttons at the same time. This type of behaviour analysis over time can associate you and the people you interact with, even if it's not directly tied to a reversible hardware ID.

This is why pseudonymisation isn't anonymisation, and why pseudonymisation isn't sufficient to comply with laws like the GDPR.

If the behaviour analysis was done without identifiers at all, you could say they're just counting button clicks, but they intentionally associate this data with your stable personal identifier for analysis over time.

MAC addresses aren't that big of a collision space either, any consumer GPU can generate a list of all hardware MAC addresses in use in a reasonable amount of time. MAC addresses may theoretically be 2^48 in size, but most of the space hasn't been assigned to vendors yet. It takes about 12 minutes to reverse any given MAC address when you rent a single cloud GPU. The double hashing should take about twice that time.
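
To illustrate why an unsalted hash over a small input space is reversible, here is a toy sketch. The MAC string format fed to the hash is an assumption, and the search is deliberately shrunk to the last two bytes under a known four-byte prefix so it finishes instantly; a real attempt would enumerate the assigned vendor prefixes instead:

```python
import hashlib


def mac_hash(mac: str) -> str:
    """Unsalted SHA-256 of a MAC address string (input format is assumed)."""
    return hashlib.sha256(mac.encode("ascii")).hexdigest()


def double_hash(mac: str) -> str:
    """Hash of the hash, mimicking 'hashed once on the client, again on the server'."""
    return hashlib.sha256(mac_hash(mac).encode("ascii")).hexdigest()


def candidates(prefix: str):
    """Yield every MAC that shares `prefix` (first four bytes), varying the last two."""
    for n in range(1 << 16):
        yield f"{prefix}:{n >> 8:02x}:{n & 0xff:02x}"


def recover(target_hash: str, prefix: str, hash_fn):
    """Enumerate candidates until one hashes to the target; None if not found."""
    for mac in candidates(prefix):
        if hash_fn(mac) == target_hash:
            return mac
    return None


secret_mac = "00:11:22:33:be:ef"  # made-up address for the demo
recovered_single = recover(mac_hash(secret_mac), "00:11:22:33", mac_hash)
recovered_double = recover(double_hash(secret_mac), "00:11:22:33", double_hash)
print(recovered_single, recovered_double)
```

Both calls return the original address: the second hash only doubles the work per guess, it does not make the search space any larger.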

The weird thing is that Microsoft intentionally chose to use a MAC address rather than a UUID like they use on their web version. If this was just a unique user token, they wouldn't need to use any hardware identifiers at all.


You are right in the edit. The hash needs to be using a secret salt that is unavailable to any potential attacker to not be PII.

You're mixing up the terms pseudonymization and anonymization, though. If something is provably not PII, it is considered anonymous. Pseudonymization specifically means to keep the data as PII, but where the risk of misuse is reduced by making the identification hard.

In practical terms, pseudonymous data is data that someone like a data scientist will only be able to link to a person if making a deliberate effort to do so, which will almost certainly mean that she KNOWS she is breaking some law. And it may also mean that the link between the person and the pseudonym is stored in a locked down database where most data scientists (or others that may have interest in doing the linking) do not even have access.

The GDPR does promote the use of pseudonymization as a layer of protection, and if a business does keep some PII data around, properly categorizes their data as such (in compliance with Article 30 of GDPR, with a defined "Legal Ground" for processing activities) AND properly protects the data both through "Security by Design" and "Privacy by Design" (of which pseudonymization is an important element), their legal exposure can be either completely negated or at least radically reduced if the "Legal Ground" is challenged.

Overall, though, fully understanding GDPR is terribly difficult, as it requires significant understanding of both Law (International AND local within each country covered by the GDPR), Computer Science (development AND IT security) AND a good understanding of Data Science.

I rarely meet people with enough understanding of all 3 to assess practices that are in the gray zone.

Lawyers (and most DPOs) tend to have little understanding of the IT or Data Science aspects, but tend to be good at stretching a "Legal Ground" to whatever is needed by the business to continue to be profitable.

Data Scientists tend to know how to de-pseudonymize data, and may even be taught "Privacy by Design" (this usually has to be forced on them, though, as it makes their job harder). Most data scientists struggle with IT security aspects, though, and would in many cases happily download all data to their laptops if they could.

Developers/engineers may understand concepts such as hashing, and even know the difference between hashed and encrypted data. However, as they live in a boolean world of True vs False, using judgement to evaluate the risk impact of some practice for data subjects tends to be alien to them. In a black and white world, this group tends to think that every bad practice is equally bad, instead of going for the "lesser wrong" or "good enough". Especially if the measures needed to be "good enough" make the coding harder or the system slower.

Finally, IT security (the experts, not the drones) MAY have a better understanding of degrees of risk than developers, but tend to know/care less about the actual data than any other group.

And each group tends to hold the other groups to a higher standard than their own. The lawyers tend to assume that all aspects of development and infrastructure are properly hardened. Data Scientists tend to interpret the "Legal Ground" to cover whatever they want to use the data for. Developers tend to think that the infra that runs their systems is fully secured by shell protection, and may even store "secrets" in more or less open git repos (and even if they delete them later, they don't clean up the git history or create new secrets). And networking often does not even care about anything at the "Application Level" or higher of the networking stack.

So in practice, any large corporation will have a huge number of vulnerabilities. The only way any sensitive asset (from a privacy, intellectual property or operational stability perspective) can be considered properly protected is to have multiple layers of protection, all or most of which must fail for major incidents to happen.


I use pseudonymization in the sense of having persistent identifiers for users/machines/etc that cannot be reversed on the server side.

Basically: just like the usernames on hn are pseudonyms it’s important they are persistent so you can follow who wrote what despite not being able to attribute posts to physical persons. That is: hn is a pseudonymous forum rather than anonymous.

The hash(localSalt + PII) is provably not PII. But it’s still making the data possible to correlate. The telemetry event I send on Monday can be attributed to the same source as the event I send on Tuesday.


what's the definition of truly anonymous? they don't know your name? or there isn't enough data to identify you? I've heard that in the US, birthday and postal zip code is enough to identify you in most of the country, but that could be considered anonymous.

if data of multiple users is aggregated, that is I think more of what people are thinking when they think "anonymous"


There are multiple definitions. The most basic (and common) is k-anonymity [1]. Basically, if for a given collection of data you group by all variables that are already non-anonymous (like age, address, gender, occupation) and end up with groups of fewer than k people (where k=5 is common), then any other data items in the data set linked to the same individual also become non-anonymous (PII).
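
As a rough sketch of that check (the records and column names are invented for illustration, and real tooling would do considerably more):

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return the quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {group: n for group, n in counts.items() if n < k}


# Invented data: five people share one combination, one person stands alone.
records = [{"age": 49, "zip": "12345", "gender": "M"} for _ in range(5)]
records.append({"age": 31, "zip": "12345", "gender": "F"})

violations = small_groups(records, ["age", "zip", "gender"], k=5)
print(violations)
```

The group of five passes for k=5; the lone 31-year-old is flagged, so anything else stored about that record would have to be treated as identifying.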

Even if you have groups of size greater than k, though, information elements may be non-anonymous if there is not enough diversity in the group. For instance, if every 49-year-old male on a given postal code in a given occupation has a certain religion, then religion is non-anonymous for that group, according to l-diversity [2].
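
A sketch of the simplest form of that idea, sometimes called distinct l-diversity: flag any group whose sensitive attribute takes fewer than a minimum number of distinct values. The data and field names are made up:

```python
from collections import defaultdict


def low_diversity_groups(records, quasi_identifiers, sensitive, min_distinct=2):
    """Return groups whose sensitive attribute has fewer than min_distinct values."""
    values = defaultdict(set)
    for r in records:
        values[tuple(r[q] for q in quasi_identifiers)].add(r[sensitive])
    return {g: v for g, v in values.items() if len(v) < min_distinct}


# Invented data: both groups have five members, so both pass k=5, but in the
# first group everyone has the same value for the sensitive attribute.
records = [{"age": 49, "zip": "12345", "religion": "A"} for _ in range(5)]
records += [{"age": 31, "zip": "12345", "religion": "A"} for _ in range(3)]
records += [{"age": 31, "zip": "12345", "religion": "B"} for _ in range(2)]

flagged = low_diversity_groups(records, ["age", "zip"], "religion")
print(flagged)
```

Only the 49-year-old group is flagged: knowing someone belongs to it reveals the sensitive value even though the group is large enough for k-anonymity.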

This can be narrowed down even more by t-closeness [3].

  [1] https://en.wikipedia.org/wiki/K-anonymity
  [2] https://en.wikipedia.org/wiki/L-diversity
  [3] https://en.wikipedia.org/wiki/T-closeness


There is no such thing as truly anonymous. In order to send any data you need to connect to a server. At that moment you are in violation of GDPR because you are exposing the user's IP, which is protected by GDPR. See the case where even linking to a CDN requires GDPR consent: https://www.cpomagazine.com/data-protection/leak-of-ip-addre...

And before the army of those who don't understand GDPR comes up with "but then the whole internet cannot work": the crucial distinction comes in the answer to the question "can this tool fulfill its purpose without this connection?" If not, then the connection is essential to its functioning and does not require consent; if the tool can fulfill its purpose without this connection, it's optional and does require consent.

GDPR makes a distinction between connections that are required to fulfill the purpose of the tool and connections that are not essential. So a VS Code connection to a Microsoft server to, let's say, update or download an extension is allowed and does not require consent, because without that connection VSCode cannot fulfill its purpose of providing functionality.

Telemetry is not functionality and VSCode can execute its purpose without this connection, so that makes it subject to the user consent requirement.


By that logic, Ubuntu performs a connectivity check behind the scenes polling connectivity-check.ubuntu.com every few mins to detect if internet connectivity has been lost.

I do not recollect seeing any opt-in Privacy prompt enabling this feature. Surely an OS can function without the internet so it's not "essential to its functioning".

Same with Firefox's captive portal check [1] that helps determine if a Wifi network requires a web-based sign-in or acceptance of terms of use.

[1] https://en.wikipedia.org/wiki/Captive_portal


Yes, Ubuntu is in violation of GDPR too if the connection is not for essential functionality. One essential function that is acceptable for any OS is checking for updates, because security is an essential part of an OS.


Wouldn't even checking Microsoft's server be an unnecessary connection? You could argue that VSCode would still work, as updates are basically optional and could be triggered manually, too.


Yes, I meant connecting to update/install in response to a user action, where the user wants to install an extension for "X functionality".


> There is no such thing as truly anonymous. In order to send any data you need to connect to a server. At that moment you are in violation of GDPR because you are exposing the user's IP, which is protected by GDPR.

This is misinformed. There is nothing in the GDPR that relates to "exposing" or "transmitting" anything (other than transmitting further from a processor to a third party). GDPR relates to how data is stored or processed. A program can make any number of http requests, for any reason no matter how unnecessary, so long as that PII (the IP, or similar) isn't stored or otherwise processed/transmitted to a third party in a way that the GDPR concerns. The download web server's logs are such a storage (which is why these days you clear them every day, or never log IPs in them at all).

> Telemetry is not functionality and VSCode can execute its purpose without this connection, so that makes it subject to the user consent requirement.

No. It's required because the telemetry data is stored whereas the IP of the update request is not. Had Microsoft wanted to store every IP of everyone downloading an update, then that database of IPs/downloads would of course have been subject to the GDPR too. The data isn't less sensitive just because it was from a necessary function. Microsoft's responsibility for that data is exactly the same.

But the easiest way of doing telemetry properly and not worrying about GDPR is to not store anything that is PII at all. And it's pretty easy to do so too. Nothing is "truly anonymous". Telemetry is usually pseudonymous. But properly pseudonymous telemetry is normally not a privacy concern in any way. The true gripes about telemetry (there are a few valid ones) aren't about that; they are:

- People getting a worse experience e.g. a slower product

- People not trusting the companies to adhere to the GDPR with the data transmitted, e.g. you might not trust the server to clear IPs from the transmission (basically the only piece of PII that can't be cleared on the client side, because then the packet never arrives). But if you don't trust the company to adhere to the GDPR, then why would you trust that their opt-out does anything? Running any kind of software basically means trust to some extent.

- People feeling cheated because of automatic or hidden opt-in

- People on paid internet connections spending money to send the telemetry.
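
To make the "don't store anything that is PII" idea concrete, here is a minimal Python sketch of a server-side scrubbing step. The field names and the allow-list are hypothetical, and the sketch takes no position on whether this alone settles the GDPR questions argued about in this thread.

```python
# Hypothetical allow-list: only these fields are ever written to storage.
ALLOWED_FIELDS = {"event", "app_version", "os", "duration_ms"}


def scrub_event(payload, remote_ip):
    """Build the record to store from an incoming telemetry request.

    remote_ip is whatever the web server saw for the connection. It is
    accepted here only to show that it is deliberately never copied into
    the stored record.
    """
    del remote_ip  # never stored
    return {key: value for key, value in payload.items() if key in ALLOWED_FIELDS}


incoming = {
    "event": "open_tab",
    "app_version": "1.2.3",
    "os": "linux",
    "duration_ms": 140,
    "username": "alice",        # would be PII, dropped by the allow-list
    "hostname": "alice-laptop",  # likewise dropped
}
stored = scrub_event(incoming, remote_ip="203.0.113.7")
print(stored)
```

An allow-list is used rather than a block-list so that a new field added on the client is dropped by default until someone decides it is safe to keep.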


I last studied the gdpr years ago but that most definitely appears false, provide your sources.

The GDPR deals with "processing" and this is the definition of processing:

" ‘processing’ means any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction; "

Note the "transmission, dissemination or otherwise making available".


I could be mistaken but I think whether the http request makes anything ‘available by transmission’ is down to the definition of who is the data controller and which data processors exist. So in the case of telemetry where no PII changes hands, and no PII is stored, then I can’t see how it applies. That is, assuming that the Telemetry backend here belongs to the same entity that made the app. Such as if a microsoft product phones home to its own backend.

Apps that make http requests to other endpoints belonging to third parties are much murkier.

As far as consent is concerned: Whether consent is required for making a http request containing an IP in the header based on legitimate interest is also murky. Consent is only one way of permitting the processing. Whether telemetry is a legitimate interest I don’t think is established. But it’s important to remember that it's not only “absolutely essential” functionality that is a legitimate interest. That is: something isn’t automatically not legitimate because it could be removed and still deliver the functionality to the user. Online ads are contested (because profit can be a legitimate interest). The same goes for telemetry. It’s certainly of interest to the developer to get the data. I have not seen any rulings on that yet, but Microsoft has made a pretty decent legal analysis when they conclude that they will never need consent here.

A web server owner can even store data for some time since preventing denial of service attacks could mean they need to store IPs for a short while before deleting. As that’s a legitimate interest, this would not require user consent from visitors.


So first of all you said "There is nothing in the GDPR that relates to "exposing" or "transmitting" anything (other than transmitting further from a processor to a third party). GDPR relates to how data is stored or processed."

That was false, since the definition of processing explicitly includes transmitting.

VS Code requires accepting the all-encompassing Microsoft privacy statement, and I couldn't find quickly what legal reasons they use for telemetry.

"Legitimate reasons" can practically indeed mean almost anything, and the only limits to it are those placed by subsequent guidances or interpretations of the central or local privacy authorities. It's what largely makes the gdpr a joke. It's very likely that Microsoft relies on it, whether that's acceptable or not.

You seem to consider local software to be part of the infrastructure of the software's copyright holder, and that appears ludicrous. Transmission of usage data from a local application to another company's server is most definitely transmission.

Whether VS Code's telemetry is legal or not I don't know, and I'm not interested in delving into it right now. If I had to use it I'd block the telemetry, and I probably wouldn't use it if blocking became impossible.


It would also be great if VSCode stopped putting random directories into $HOME, even when running in "portable" mode.


No answer is forthcoming from the VS Code team, because they know you won't like the answer.

Microsoft trawls their[1] endpoints mercilessly for every bit of telemetry that they possibly can, and they go out of their way to prevent customers from disabling this.

Windows 10 or 11 with Office requires something like 200+ individual forms of Microsoft telemetry to be disabled!

Notably:

- They keep changing the name of the environment variables[2] that disable telemetry. For unspecified "reasons".

- They've been caught using "typosquatting" domains like microsft.com for telemetry, because security-conscious admins block microsoft.com wholesale.

- Telemetry is implemented by each product group, which means each individual team has to learn the same lessons over and over, such as: GDPR compliance, asynchronous collection, size limiting, do not retry in a tight loop forever on network failure, etc...

- Customers often experience dramatic speedups by disabling telemetry, which ought not be possible, but that's the reality. Turning off telemetry was "the" trick to making PowerShell Core fast in VS Code, because it literally sent telemetry (synchronously!) from all of: Dotnet Core, PowerShell, the Az/AAD modules, and Visual Studio Code! Opening a new tab would take seconds while this was collected, zipped, and sent. Windows Terminal does the same thing, by the way, so opening a shell can result in like half a dozen network requests to god-knows-where.
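
For reference, the lessons listed above (asynchronous collection, size limiting, no tight retry loop) can be sketched in a few dozen lines of Python. This is an illustration only, not how any Microsoft product does it; the class name, parameters and the injected `send`/`sleep` callables are invented for the example.

```python
import queue
import threading
import time


class TelemetrySender:
    """Sends events on a background thread so the caller never waits."""

    def __init__(self, send, max_queue=100, max_retries=3,
                 base_delay=0.5, sleep=time.sleep):
        self._send = send                  # callable that performs the upload
        self._queue = queue.Queue(maxsize=max_queue)  # size limit
        self._max_retries = max_retries
        self._base_delay = base_delay
        self._sleep = sleep
        self._lock = threading.Lock()
        self.dropped = 0
        threading.Thread(target=self._run, daemon=True).start()

    def record(self, event):
        """Queue an event without blocking; drop it if the queue is full."""
        try:
            self._queue.put_nowait(event)
        except queue.Full:
            self._count_drop()

    def flush(self):
        """Block until every queued event was sent or given up on."""
        self._queue.join()

    def _count_drop(self):
        with self._lock:
            self.dropped += 1

    def _run(self):
        while True:
            event = self._queue.get()
            try:
                self._deliver(event)
            finally:
                self._queue.task_done()

    def _deliver(self, event):
        for attempt in range(self._max_retries):
            try:
                self._send(event)
                return
            except OSError:
                # Exponential backoff between attempts, then give up for good.
                if attempt + 1 < self._max_retries:
                    self._sleep(self._base_delay * 2 ** attempt)
        self._count_drop()


# Usage with a fake uploader that fails twice, and a fake sleep that only
# records the delays instead of waiting.
attempts, delays = [], []

def flaky_send(event):
    attempts.append(event)
    if len(attempts) < 3:
        raise OSError("network down")

sender = TelemetrySender(flaky_send, sleep=delays.append)
sender.record("open_tab")
sender.flush()
print(delays, sender.dropped)
```

The point of the design is that `record` returns immediately whatever the network is doing, so a slow or dead endpoint cannot make opening a tab take seconds.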

[1] You thought, wait... that it's your computer!? It's Microsoft's ad-platform now.

[2] Notice the plural? It's one company! Why can't there be a single globally-obeyed policy setting for this? Oh... oh... because they don't want you to have this setting. That's right... I forgot.

Windows: https://learn.microsoft.com/en-us/windows/privacy/configure-...

PowerShell: https://learn.microsoft.com/en-us/powershell/module/microsof...

DotNet Core: https://learn.microsoft.com/en-us/dotnet/core/tools/telemetr...

Windows Terminal: https://github.com/microsoft/terminal/issues/5331

Az module: https://learn.microsoft.com/en-us/dotnet/api/microsoft.azure...

Etc...


> They've been caught using "typosquatting" domains like microsft.com for telemetry, because security-conscious admins block microsoft.com wholesale.

This seems interesting. Do you have any references for this? I would assume that the main use of such typo-squatting domains is a simple redirect, a la [0][1].

[0]: https://gogle.com [1]: https://gooogle.com


We need a "just say no" campaign that boycotts companies employing these slimy behaviours.


I've been boycotting Microsoft for around 25 years now... But I've noticed most people, even in the tech world, don't mind supporting companies with slimy behavior.


Microsoft’s own telemetry solutions (AppInsights/LogAnalytics) seem perfectly capable of handling async/buffering/backoff etc.

I agree there should be a single place, at least in Windows to control Microsoft telemetry on a per app basis. It should be very easy to accomplish. On other platforms less so.

In a desktop product I do for work we had the dilemma of opt-in versus opt-out, and of showing the query clearly versus hiding it in settings. We ended up with the middle ground of showing it but having the checkbox checked (so uncheck to opt out). We were still worried this would leave too few opting in, but it meant over 95% did.

For command line I’d be 100% happy with a note on first use describing that telemetry is enabled and how it is disabled. Leaving it disabled by default and requiring user action to enable is not realistic in such a situation.


A pre-enabled checkbox is invalid for obtaining gdpr consent


We are assuming here (incorrectly or not) that since no PII is transmitted or stored, the GDPR doesn’t come into play, and the consent is just asking for permission and not “gdpr consent”

Of course it’s impossible to actually transmit anything anywhere without including the source IP in the http header - a fact we are ignoring completely. But that’s similar to the topic of this discussion: Microsoft does exactly this under the same assumption, that non-PII data can be sent (even via http) without gdpr coming into play. Otherwise they couldn’t have it enabled by default. If there is a ruling that says otherwise then everyone will need to change.

It could also be that first party servers (Microsoft app talking to Microsoft servers) is acceptable and then everyone would route telemetry to their own servers.


I haven't checked how they handle it for VS Code, but you probably agreed to some term before using it, and they're probably relying on legitimate interest

My gdpr is quite rusty anyhow


this is why o&o shutup is invaluable


Is it fairly effective these days?


to the best of my knowledge, yes


I don't get people who request for software and websites to become nagware by asking for consent.


I don’t get why software needs a blank check to report and constantly send surveillance about you


It is information about how the software is operating.


The simple solution is to not do anything that requires consent?


Anonymous/pseudonymous telemetry doesn't necessarily require user consent other than for being polite. If you store PII you do, but if you do that you also aren't really doing anonymous/pseudonymous telemetry to begin with.


As was said elsewhere, since telemetry itself is not a functionality, ip address is personal information and requires consent.


Read the replies elsewhere. GDPR doesn’t care about whether a http request containing an IP is necessary or not. The GDPR is not in any way regulating how or why any PII is “transmitted” out of your system.


…at least not if the transmission is within the supervision of the same controller as we are discussing here.


They aren't


They aren't what?


> people who request for software and websites to become nagware by asking for consent

What? Lol. How is this the users fault?

That's just dark patterns by companies to bend users into enrolling. It doesn't have to be like this. It could be opt-in under settings, like just about anything else.

It's all about power play.


>How is this the users fault?

If a user asks for the software to nag people and then the developers make the software start nagging proper then it is the fault of the user for suggesting that behaviour be implemented.

>It could be opt-in under settings, like just about anything else.

Or there could be an opt out in settings like how it already works.


Disingenuous take IMO

> If a user asks for the software to nag people and then the developers make the software start nagging proper then it is the fault of the user for suggesting that behaviour be implemented.

People are not asking for software to nag. They're asking for the software to NOT send telemetry at all unless the user agrees to it. As it stands now, vscode sends out telemetry before the user has a chance to opt out.

What people want is for software to not be hostile to the users in that way. Failing that, at least give the option before the hostile behavior begins. But really.. It's not the users' fault. It's the software maker's fault for integrating that behavior in the first place and ramming it down our throat, whether we like it or not.
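
The behaviour being asked for (nothing leaves the machine until the user has actually said yes) is small enough to sketch. The names below are invented for illustration and say nothing about how VS Code is implemented.

```python
from enum import Enum


class Consent(Enum):
    UNDECIDED = "undecided"   # first run, user has not been asked yet
    GRANTED = "granted"
    DENIED = "denied"


class GatedTelemetry:
    """Forwards events only after consent was explicitly granted."""

    def __init__(self, send):
        self._send = send
        self.consent = Consent.UNDECIDED  # default is "do not send"

    def record(self, event):
        """Return True if the event was sent, False if it was discarded."""
        if self.consent is not Consent.GRANTED:
            return False
        self._send(event)
        return True


sent = []
telemetry = GatedTelemetry(sent.append)

before = telemetry.record("startup")      # user has not chosen yet
telemetry.consent = Consent.GRANTED       # user ticks the box in settings
after = telemetry.record("open_file")
print(before, after, sent)
```

Because the undecided state is treated the same as a refusal, events from before the choice are discarded rather than queued and sent later.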


>People are not asking for software to nag.

This issue literally is and describes what the popup should include.

>hostile behavior

Telemetry is not hostile. It is a standard feature for understanding how a product operates or is being used.


> This issue literally is and describes what the popup should include.

> Telemetry is not hostile. It is a standard feature for understanding how a product operates or is being used.

Except there already is a welcome screen that gives a "choice" (between quotes) about sending telemetry or not, if I remember correctly.

This however does not prevent sending telemetry. In fact, telemetry is sent to MS before the user has a choice about sending telemetry.

So whether you agree that telemetry is useful for understanding a product and so on (a whole separate discussion), the fact that the user does not have a real choice IS user hostile.

The user should have a real choice here.


Looks like the monthly “people absolutely lose their minds over VS Code telemetry” thread. The same people would then be complaining if VS Code crashed constantly from bugs that, with no telemetry in place, they would also never report.


This ridiculous false dichotomy of "if not for excessive telemetry it would be crashy" is so beyond reason. If it crashes, just pop up the crash reporter and prompt the user with a button to send the crash report in. Done. No ethical issues there.

But no, apparently you think Microsoft needs a constant faucet of your information to prevent crashes. Golly, I wonder how developers managed before said faucets.


[flagged]


> How would they know if performance took a hit or any number of other issues that don’t result in a crash?

Testing, a QA team, and an opt-in bug reporting mechanism.


You're frothing in favor of telemetry


Not really, I'm not the one being vitriolic.



