
https://github.com/Microsoft/calculator#data--telemetry

"This project collects usage data and sends it to Microsoft to help improve our products and services. Read our privacy statement to learn more. Telemetry is disabled in development builds by default, and can be enabled with the SEND_TELEMETRY build flag."

Even on a simple calculator.




Here's what they are actually collecting: https://github.com/Microsoft/calculator/blob/master/src/Calc...


That's some pretty boring information overall.

One which might be cause for concern is LogInvalidInputPasted() specifically because pastedExpression is included in the Telemetry Event.

I also have questions on LogConversionResult(). Why is that recording NetworkAccessBehavior or the raw conversion values? Wouldn't knowing the FromUnit/ToUnit be enough to know how much something is being used?
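To make the pastedExpression concern concrete: one way an event like this could avoid carrying the pasted text itself is to log only its shape. The sketch below is hypothetical; `PasteSummary` and `SummarizePaste` are made-up names, not anything from the Calculator source, and it assumes the team only needs to know roughly what kind of input was rejected.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical summary of a rejected paste. These names are illustrative
// and do not come from the Calculator source.
struct PasteSummary {
    std::size_t length = 0;   // total number of characters pasted
    std::size_t digits = 0;   // characters '0'..'9'
    std::size_t letters = 0;  // alphabetic characters (default C locale)
    std::size_t other = 0;    // everything else (punctuation, spaces, ...)
};

// Counts character classes so that only the shape of the input, never the
// input itself, would need to be attached to a telemetry event.
PasteSummary SummarizePaste(const std::string& pasted) {
    PasteSummary s;
    s.length = pasted.size();
    for (unsigned char c : pasted) {
        if (std::isdigit(c)) {
            ++s.digits;
        } else if (std::isalpha(c)) {
            ++s.letters;
        } else {
            ++s.other;
        }
    }
    return s;
}
```

With this, a pasted "hunter2!" would be reported as 8 characters (6 letters, 1 digit, 1 other) rather than as the string itself.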


Conversion between currencies requires access to the API that provides conversion rates.


I presume that LogConversionResult() logs the result of a currency conversion that has already accessed Microsoft's API. So why does Microsoft need to know the results of my calculations? The function in question isn't GetCurrencyRate().


> might be cause for concern

Only "might"? If you just accidentally paste your password to the calculator...


Why have all the single-use constant strings instead of using the corresponding literals in-place? It’s additional lines of code and doesn't help readability/maintainability (there's already one unused const - EVENT_NAME_HIDE_IF_SHOWN). Additionally, the code mixes string literals with constants:

    fields.AddString(L"AddSubtractMode", isAddMode ? L"Add" : L"Subtract");
    LogTelemetryEvent(EVENT_NAME_DATE_ADD_SUBTRACT_USED, fields);

But then it doesn't use constants where they might make sense, when a string is used more than once. For example, the literal string "WindowId" is used a dozen times.
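A rough sketch of the split I have in mind: a constant for the name that is shared across events, a literal for the name that is used once. `Fields` here is a plain `std::map` standing in for the real logging-fields type, and `FIELD_WINDOW_ID`, `DateAddSubtractFields`, and `WindowResizedFields` are hypothetical names, not identifiers from the repo.

```cpp
#include <map>
#include <string>

// Stand-in for the real logging-fields type, which is not reproduced here.
using Fields = std::map<std::wstring, std::wstring>;

// Shared by many events, so a single constant guards against a typo in one
// of a dozen literal copies silently creating a new field name.
constexpr auto FIELD_WINDOW_ID = L"WindowId";

// Hypothetical builder for the date add/subtract event's fields.
Fields DateAddSubtractFields(int windowId, bool isAddMode) {
    Fields fields;
    fields[FIELD_WINDOW_ID] = std::to_wstring(windowId);
    // Used in exactly one place, so a literal reads fine in-line.
    fields[L"AddSubtractMode"] = isAddMode ? L"Add" : L"Subtract";
    return fields;
}

// Hypothetical second event that reuses the shared field name.
Fields WindowResizedFields(int windowId) {
    Fields fields;
    fields[FIELD_WINDOW_ID] = std::to_wstring(windowId);
    return fields;
}
```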

Separately, I question whether anyone is looking at the telemetry on the backend. In my experience, developers add this stuff because they think it will be useful, then it never or rarely gets looked at. A telemetry event here, a telemetry event there, pretty soon you're talking real bandwidth.


So nothing really nefarious then.


Well, it must include your IP address too, and they know the time and date it was received. And then it gets bundled with the rest of the data they collected.

I don't even want them knowing when I'm using my computer.

“What gets measured gets managed.”


Errant pasted data sounds like a bad thing to log.


Because it is a simple calculator. Now think about what they might be collecting from the rest of the system.

All the privacy issues with Facebook or any other company should be scrutinized rigorously by its users. Telemetry at the OS level is one of the worst offending practices IMO.


You don't have to imagine, you can view the data for yourself: https://docs.microsoft.com/en-us/windows/privacy/diagnostic-...


If we're expected to trust the party spying on us to reveal exactly the extent to which they spy on us via their proprietary, closed-source spying data viewer tool, then we deserve exactly whatever the fuck it is we end up getting.


If they cheated on the documentation matching the binary to try and sneak in more telemetry, the backlash would be phenomenal. I don't know if Twitter could handle that kind of load =]

It's always good to trust but verify, although here the safeguard is pretty strong.


It doesn't always have to be explicitly intentional as you described, but the simple fact is that they are incentivized to gather this data and they will not, in fact, suffer major repercussions commensurate with the amount of data that is collected/gathered. Look at all of the penalties that have been levied for all the other data breaches and how negatively affected those businesses were. [1]

[1] Very little to not at all.


"MAIL_FOLDER_LIST_EXPANDED", "REFRESH_INBOX_BUTTON_USED", ... :P


‘’’In a brightly lit but still somehow nefarious meeting room, two devs discuss their latest findings: Dev1: “These users searching for ‘transfer bank savings’ also have a 15% chance of pasting these long invalid numbers into the calculator. Look at all of them.” Dev2: “Odd, those look like bank account numbers...” Dev3: “I think we may have an early retirement opportunity...” ‘’’


You raise a valid issue, but that issue has nothing to do with the telemetry itself; it has to do with the brokenness of the US banking system. Essentially everywhere else, the bank account number is simply a public address that you can use to send money there and nothing else.


So Microsoft should just ignore all the lessons that we've learned over and over about being data driven?


> Even on a simple calculator.

Why not? If they're paying people to build this software, they presumably want to make it good. With this data, they can learn all kinds of things to help improve future versions: when/why is it crashing, what features are people using most/least, are there patterns of clicks that suggest people are mis-understanding or mis-clicking certain features, etc, etc.

edit: Another one that's often useful is looking at different usage patterns across countries. For example, if nobody in Israel is using COSINE, then there's likely a problem with the right-to-left internationalization that you should fix.


Virtually all of the complaints Microsoft receives about telemetry could be addressed by simply allowing people a global way to turn all telemetry for Microsoft apps off. If you want to support the improvement of apps by allowing this form of telemetry, you can leave it enabled. If you consider it a violation of your privacy, you should be able to easily turn it off.

There's a lack of user control and a lack of transparency. Even if you are a Windows 10 Enterprise user (lucky you, you can turn off system telemetry, unlike the plebes who run Windows 10 Professional), and you turn off telemetry for Windows as a whole, does that include the Calculator or not? Maybe it does. It's really unclear to most people.


> If they're paying people to build this software, they presumably want to make it good.

I find it more likely that the real answer is they're trying to "maximize shareholder value." If Microsoft wanted to make good software, there are several thousand other things it could do other than collecting information on people using a calculator.


> Why not? If they're paying people to build this software, ...

Even from people who Do Not Want their data collected for this purpose?


Basically every Android and iOS app you use collects telemetry (which you usually cannot disable) for UX purposes, but suddenly when Microsoft does it it's a Big Deal. I don't see the issue with refining UX using anonymous data if it improves the product for the end users. Every high quality commercial app does this.


F-Droid maintains a repository of open-source-only Android apps, and either doesn't allow apps that have telemetry or at least flags apps which may have such anti-features:

https://f-droid.org/en/packages/


> Basically every Android and iOS app you use collects telemetry

So it's OK, then. Jimmy jumped off a bridge, so it's OK if Johnny does it, too. No harm done!

> but suddenly when Microsoft does it it's a Big Deal

No, it's a big deal because Microsoft feels entitled to collect information about people using... wait for it... a calculator.

> using anonymous data

It's not anonymous. We know it's going to at least get a timestamp and IP address.

> I don't see the issue with refining UX

Neither do I. Do it by bringing people in and testing the program on them, the way it was done for decades before "telemetry" became one of those SV euphemisms.

> Every high quality commercial app does this.

No. They don't.


Is this data actually anonymous? I can't find anything saying so.

Apple does not collect data at anywhere near this granularity, e.g. which buttons were pressed.


You can effectively disable telemetry on Android by getting Blokada from F-Droid.


Can it run on regular Android?


You don't need to root your phone to run Blokada, if that's what you're asking... I installed it sometime last year and as of right now, it claims to have blocked 595944 requests.


I had an interview last year with the Windows and Devices Group. On the day of the interview, I was told it would be with people that work on telemetry.

One interviewer told me that keystrokes were being measured by conhost.exe. Not just for internal testing. He clarified that it was done on customer machines. It wasn't clear if they collected individual key presses because we were discussing something you could infer by knowing keystroke timing.

Windows 10 is the most prolific malware in the world.


Speaking as the engineering lead owning conhost.exe: I can't find anything in our code that would trace user input, or data effectively derived from user input, to even a local event stream.

There are a couple instances where we trace API call timings, but only that of API calls initiated by an attached console application.


Open the code up then..


I highly doubt he is the one who has the power to make that happen.


When are we going to see unlimited command line buffers :)?


Hey, I work on the commandline team too - We can't give out exact timelines for anything we work on, but I can tell you that it's pretty high up on the backlog. It's closer now than it's ever been, that's for sure.


That is such an outrageous claim I had to check for myself.

From looking at conhost.exe in IDA Pro, there are some references to telemetry in a function HandleKeyEvent: https://i.imgur.com/MwaBLcW.jpg

But looking more closely it's just used to increase some counters when Ctrl+C or Ctrl+V are used: https://i.imgur.com/29e2qLD.png

There is another one for Ctrl+A elsewhere in the function as well.

Seems these are for when clipboard operations happen, there is other telemetry nearby for when key activity causes a paste: https://i.imgur.com/xONKrGJ.png

So Microsoft are indeed collecting keystroke telemetry in conhost.exe, but it's just some boring old counters to measure clipboard usage.


But how can that be when "... the engineering lead owning conhost.exe: I can't find anything in our code that would trace user input, or data effectively derived from user input, to even a local event stream."


I mean, I suppose that the key sequences around copy/paste are technically user input. Those events get processed into telemetry as to whether they were done in processed or raw input mode, and whether Quick Edit was engaged at the time.

Additional telemetry points include whether the "Find" dialog is being used (explicitly, whether "find next" has been clicked, and how long the search string is -- no other user-generated search content is included in event logs), when a process detaches from the console, and when the window is resized.



Sorry. This is a subthread on the Console and conhost.exe, and I don’t consider myself qualified to talk about Calculator’s telemetry points.


I think this disassembly supports their assertion. The number of times a user copies and pastes isn't in itself input to the console, just a metric of some out of band events.


> it's just some boring old counters to measure clipboard usage.

Today. After the next (forced?) upgrade...?


Built-in applications are great opportunities to dogfood new platform features. I'm speculating, but Calculator may have added telemetry as a way of testing the relevant SDKs, APIs, data visualization systems, etc., not necessarily because a PM at Microsoft somewhere actually cares what percentage of users press the "multiply" button.


It's not dogfooding if you unleash it on the public.

Surely Microsoft has an accountant or two on staff. Let them test it in-house.


Well, not unless they consider their users to be dogs.


This seems to be the reason they put the telemetry in: they were fixing a bug related to the time calculation feature and needed to discover whether any customers had issues after the change was made.

https://github.com/Microsoft/calculator/blob/057401f5f2b4bb1...


On my Windows machine I use Windows 10 Firewall Control (an outbound Little Snitch-type thing, except crappy). I don't think it was Calculator, but there was some other basic Microsoft app that wanted to make an outbound connection. I remember thinking, "WTF?! Why would this ever need to connect to the Internet?" Maybe it was calc. Is this telemetry currently enabled in the released version, or is it only in the OSS code?


I'm not sure if I should be disturbed or amused by Microsoft putting telemetry in (at this rate) literally every one of its products.

Like, what the hell do they even track? I guess the obvious ones are things like application crashes, but now I'm curious if they have the data on the most common calculations, folks' favorite numbers, etc.


I'm pretty sure I've entered 58008 in every calculator I've ever touched. That's probably in my Microsoft file somewhere...


And I'm sure your ads are targeted accordingly ;)


The Scientific/Math/Programmer expanded calculators have a bunch of features. Maybe they can test how often the "cube root" button is used versus Arcsinh or something.


This is actually pretty relevant because TI has been goofing up the cube root function in its calculators for decades.


How is that telemetry relevant? The bug needs to be fixed no matter how many people use it.


TI doesn’t think anyone actually uses it in a nontrivial manner.


I have a hard time believing this and can't find any details online. Source please?


I should specify that I mean the ones they market to middle and high schools: TI-89, Nspire, etc.


Is there a bug in there?


TI will tell you, “It’s a feature.” Play around with the cube root of negative real fractions (with even and odd denominators) and integers (starting with -1) and see what you find.
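The thread doesn't spell out what the TI calculators actually return, so this is only a guess at the kind of discrepancy meant. A common source of surprise with cube roots of negative reals is that "the real cube root" and "x raised to the 1/3 power" are not the same computation. A minimal C++ illustration using only the standard library (the three function names are mine):

```cpp
#include <cmath>
#include <complex>

// Real cube root: defined for negative inputs, e.g. -8 -> -2.
double RealCubeRoot(double x) {
    return std::cbrt(x);
}

// Fractional power on reals: std::pow returns NaN for a negative finite base
// with a non-integer exponent, so this is not a usable cube root for x < 0.
double PowOneThird(double x) {
    return std::pow(x, 1.0 / 3.0);
}

// Principal complex cube root: for a negative real this is the root with
// argument pi/3, e.g. -8 -> 1 + 1.732...i, rather than the real root -2.
std::complex<double> PrincipalCubeRoot(double x) {
    return std::pow(std::complex<double>(x, 0.0), 1.0 / 3.0);
}
```

All three answers are defensible depending on whether the tool is working over the reals or the complex numbers, which is presumably how a vendor ends up calling the behaviour a feature.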


I mean, I guess it would be cool to know what kind of operations people are doing most often? Maybe you send different ads to people who only use addition/subtraction than people who use sin / cos?


or target people who use large numbers with financial planning ads, for example?


Not like I'm going anywhere with this, just morbidly curious but...this is spoken (typed) in parody/jest, right?


Don't give them ideas. ;)


i can't even tell anymore...


Are there that many rich physicists?


What do you imagine would differentiate +/- people from sin users?


Telemetry is not used for ads, but since it is open-source you can check and get a pretty good idea what they actually send.


Here is the list of events, the jokes seem pretty on point to me: https://github.com/Microsoft/calculator/blob/057401f5f2b4bb1...

A small selection:

    constexpr auto EVENT_NAME_INVALID_INPUT_PASTED = L"InvalidInputPasted";
    constexpr auto EVENT_NAME_VALID_INPUT_PASTED = L"ValidInputPasted";
    constexpr auto EVENT_NAME_BITFLIP_PANE_CLICKED = L"BitFlipPaneClicked";
    constexpr auto EVENT_NAME_BITFLIP_BUTTONS_USED = L"BitFlipToggleButtonUsed";
    constexpr auto EVENT_NAME_ANGLE_BUTTONS_USED = L"AngleButtonUsedInSession";
    constexpr auto EVENT_NAME_HYP_BUTTON_USED = L"HypButtonUsedInSession";
    constexpr auto EVENT_NAME_FUNCTION_USAGE = L"FunctionUsageInSession";
    constexpr auto EVENT_NAME_BITLENGTH_BUTTON_USED = L"BitLengthButtonUsed";

Do you have a source for "not used for ads"?


How can you be sure it's not used for ads? Their privacy statement says it is [1]:

"You provide some of this data directly, and we get some of it by collecting data about your interactions, use, and experiences with our products."

"Microsoft uses the data we collect to provide you with rich, interactive experiences. In particular, we use data to:

...

Advertise and market to you, which includes sending promotional communications, targeting advertising, and presenting you with relevant offers."

[1] https://privacy.microsoft.com/en-US/privacystatement


You're looking for: https://privacy.microsoft.com/en-gb/privacystatement#mainadv... (data seems different on the US site?)

TL;DR: AdvertisingID handles ad tracking in Windows and can be opted out of during install, in settings, and globally on your account. AFAIK you can also wipe any previously tracked data. AID tracking is unrelated to telemetry.

Telemetry does not contribute to advertising at all unless "tailored experiences" is enabled in settings, and even then the data is mostly used for Microsoft to suggest features and fixes for problems.


> Telemetry is disabled in development builds by default

So it's not ideal from a privacy perspective to have to worry about usage telemetry in a calculator. But at least there is the silver lining of having telemetry disabled by default _somewhere_, even if it is only for dev builds.


To be fair, it's a nightmare looking at reports that are all clogged up with crashes and whatever from your developers making messes at their desks running non-release code.

Obviously one option is just a dev build flag in your telemetry payload, but I can see the logic in just turning it off altogether.
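Roughly, the two options look like this. It's a sketch only: `TelemetryEvent`, `ReleaseEventsOnly`, and `ShouldSend` are hypothetical names, and while `SEND_TELEMETRY` is the flag named in the README, how the project actually consumes it is an assumption here.

```cpp
#include <string>
#include <vector>

// Hypothetical event record; the field names are illustrative only.
struct TelemetryEvent {
    std::string name;
    bool isDevBuild;  // stamped by the client according to how it was built
};

// Option 1: dev builds keep sending, but every event is tagged so that
// reports can drop developer noise after the fact.
std::vector<TelemetryEvent> ReleaseEventsOnly(const std::vector<TelemetryEvent>& all) {
    std::vector<TelemetryEvent> kept;
    for (const auto& e : all) {
        if (!e.isDevBuild) {
            kept.push_back(e);
        }
    }
    return kept;
}

// Option 2: nothing is sent unless the build opted in with the flag.
bool ShouldSend() {
#ifdef SEND_TELEMETRY
    return true;
#else
    return false;
#endif
}
```

Option 1 relies on every report remembering to apply the filter; option 2 means the data never leaves the developer's machine in the first place.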


There is always the HN-beloved ChromeOS as an alternative.


I'm just surprised it's opt-in for dev builds.


Developers use software in weird ways. I've written scripts to hit buttons thousands of times for hours on end. It makes sense that they don't want developers' janky experiments polluting data used to drive design decisions.


Then why the hell do .NET Core dev utilities have opt-out, and sometimes mandatory, telemetry?


The label "developer" is relative. Any given product wants to exclude telemetry from developers developing the product itself. There's no need to exclude telemetry from developers using the product like a user. The point is to exclude developers running the software in weird ways to debug it. Not just to exclude anyone that calls themselves a developer.



