Tracking down regressions, crashes, and perf issues is very hard without good telemetry about how often they're happening and in what context. Issues that might have taken a few days to resolve with good info become multi-week reproduction efforts with little information.
It simply boils down to the fact that we can't build a better browser without good information on how it's behaving in the wild.
That's the pain point anyway. Mozilla's general mission, however, makes it very difficult to collect detailed data - user privacy is paramount. So we have two major issues that conflict: the need to get better information about how the product is serving users, and the need for users to be secure in their browsing habits.
We also know from history that benevolent intent is not that significant. Organizations change, and intents change, and data that's collected now with good intent can be used with bad intent in the future. So we need to be careful about whatever compromise we choose, to ensure that a change of intent in the future doesn't compromise our original guarantees to the user.
This is a proposed compromise that is being floated. Don't collect URLs, but only top-level+1 domains (e.g. images.google.com), and associate information with that. That lets us know broadly what sites we are seeing problems on, hopefully without compromising the user's privacy too much. Also, the information associated with the site is performance data: things like the duration of the longest garbage-collection pause, or paint janks.
This is a difficult compromise to make, which is why I assume it took so long for Mozilla to come around to proposing this. These public outreaches are almost always the last stage of a lengthy internal discussion on whether proposals fit within our mission or not.
I'm not directly involved in this proposal, but I personally think it's necessary, and strikes a reasonable balance between the privacy-for-users and actionable-information-for-developers requirements.
If that's what you're aiming at. Collect the data but keep it local. Install some sort of responsiveness/"problem" monitoring. Ask the user to send data relevant to the problem if a problem occurs. IMHO there is no need to systematically collect user data for that.
Or get the data from a random sample of users. You don't need data from everyone.
That seems like a reasonable compromise to me. I'm happy to send logs if my browser crashes whenever I visit a certain page, and if I know I'm gonna be monitored for that period, I'll isolate my browsing habits to only visit that page. I do not consent, however, to sending everything--even anonymized--on the off chance that Mozilla will see the crash events and use it to flag that domain and maybe fix the issue on that particular page.
That sounds way more reasonable to me.
To my amateur ear, that actually sounds like a good compromise to lessen the blow somewhat more. You should suggest it to Mozilla :)
And if I opt-in to data collection, why would it matter to me whether the stats I'm sending are a result of me being selected as part of a random sample or not? Might as well just _always_ send those stats; it doesn't matter to me.
"What we plan to do now is run an opt-out SHIELD study to validate our implementation of RAPPOR. This study will collect the value for users’ home page (eTLD+1) for a randomly selected group of our release population. We are hoping to launch this in mid-September."
"this study will collect ... for a randomly selected group"
 - https://wiki.mozilla.org/Firefox/Shield/Shield_Studies
Then don't make the compromise.
As others have expressed here, the reason few people opt in to data collection may be that they have chosen to use a Web browser that does not mandate the collection of data.
I'm assuming there will always be an opt out which I shall add to my list of things I have to do when installing Firefox.
There will be. Sorry for the hassle :(
The ESR track presumably will have the default flipped, because corporations get funny about data transfers to remote servers - mind you, Microsoft seems to be getting away with it for businesses that don't have a full-on Enterprise setup.
I'm not sure I like that gamble.
I use Firefox and always opt into any telemetry that sends data back to Mozilla. You could say I am a fanboy. I think it is a HORRIBLE idea and Mozilla should scrap it yesterday and never bring it up again. If people bring it up again, send them to the roof team (if it doesn't exist, create one). If they come downstairs, fire them. You already have people like me who are willing to opt in to every single thing you can try. For example, Firefox nightly on Android has consistently crashed for me about every five minutes since last weekend, and yet I keep using it. Don't throw away this goodwill.
Lack of reporting from non-technical people who aren't aware they can opt-in cannot be corrected statistically, as the two categories of people (technical, non-technical) use the browser very differently.
For a made-up example: if you type "Yahoo" into the search bar, then type "Search" into the field, and then type your search into the third page, you'll be acting as many normal users do, and you may uncover crashes on page #2 at Yahoo that a technical user would never encounter, simply because they wouldn't type the word "Search" into the search field at Yahoo and trigger a JS bug where "Search" or "Yahoo" gets used in one too many places and ends up crashing the CSS parser because it race-conditions with repaint.
If that problem affects 0.01% of the Firefox population, that's a lot of people who don't think technically, and do feel regret when we crash and can't help them because we can't see where it crashed.
(Yes, employed. No, I didn't talk to anyone else before I posted here. My own thoughts, I am not a number^Wcorporation, etc.)
> This is a proposed compromise that is being floated. Don't collect URLs, but only top-level+1 domains (e.g. images.google.com), and associate information with that. That lets us know broadly what sites we are seeing problems on, hopefully without compromising the user's privacy too much.
Sure, there's no problem with images.google.com because it's generically innocuous. But what about pornhub.com for users in Saudi Arabia? Or some Japanese site that's essentially child porn for users in the US? The top-level+1 domain in many cases is totally incriminating.
> Also, the information associated with the site is performance data: things like the duration of the longest garbage-collection pause, or paint janks.
Maybe so. But it's collection of the top-level+1 domain that's the problem.
> I'm not directly involved in this proposal, but I personally think it's necessary, and strikes a reasonable balance between the privacy-for-users and actionable-information-for-developers requirements.
Fine. But then, make it opt-in, to protect users.
1. You're proposing a mechanism for collecting data, and a strategy for extracting more data than you currently do. You have not figured out the type of data that you will finally need, only a set of things that you currently envision. Naturally, the data that you will collect in the future will be more than what you currently envision. There is built-in mission creep that is dangerous.
2. What you currently envision is not fleshed out as especially useful. You only believe it is useful. The pain point of biased data is a red herring. Your concern is more about not having enough data.
3. You have found a technology which you believe will allow you to collect a lot of data anonymously. But none of you seem to understand the technology very well. It seems like a shiny toy that you are eager to go to town with. I am not sure this is the right attitude.
4. You're proposing to use your users in lieu of proper testers, or to save time. There are many ways to properly test software and to save time. Have they been explored? There used to be a time when beta software was a thing. Prompt the users to become testers for your beta software. If users don't want to be testers then don't collect data from them. How much data do you actually need anyway? Have you fully utilized your existing data?
Over all, I see this as a nice-to-have luxury, not some life-and-death situation, and subverting the goodwill of users is not worth it, IMHO.
Firefox already has opt-in telemetry, and Firefox already has a beta channel. It's unclear to me how it would help to tie telemetry to the beta channel; that would just make the existing problems (not enough data, and biased data) even worse, since there are probably far more users willing to share telemetry data than to use beta software.
I don't know, is it? How would I check, if I consider apple an untrusted actor?
It's an interesting compromise... because without improved performance and features, we'll lose Firefox entirely, and all of the relative privacy / security gains that entails. This is a good example where "perfect" privacy that reaches only a few is the enemy of "good" privacy that reaches more people.
Only collect top-level domains in the Alexa top 1k. Knowing that users are on a highway is less sensitive than knowing they're on a specific street with only 5 homes, and it reassures users that private domain names won't be leaked.
Send the data through Tor. That way you only get the data about the browser <-> site interaction, not user<->browser<->site interaction.
And make it opt-in and notify users of the purpose of the data collection. A good model to follow here is the Debian installer and popcon. Follow the good practices of data collection in the free software world and do not use dark patterns.
EDIT: It should also be completely disabled in Private Browsing mode -- otherwise the optics are even worse than they are now.
The OP actually discusses a very interesting method for doing exactly that using differential privacy techniques. I personally think that's a very good compromise for this use-case.
From the OP we can also see that they don't intend to store IP addresses, but it will always be possible. By using an anonymity network they can reassure the user and at the same time eliminate the risk that a malicious actor will in the future silently manage to start tracking which websites users go to. An additional benefit is that Mozilla also won't become a target for governments, a risk that no organization can ever be safe from once it starts gathering information about users.
It is not enough to strike a reasonable balance between the privacy-for-users and actionable-information-for-developers. You also need to find a balance between risk management and time spent on reducing risks. What I propose primarily is that they spend a bit more time on reducing risks, as that would benefit everyone.
EDIT: Don't forget that the DNS resolution for porn sites can be deanonymized and resold by your internet provider - there's nothing we can do to protect you from DNS being a cleartext, sniffable, mitm'able protocol.
1. Crash reports only report crashes. We also want to see perf issues like GC and paint jank, etc.
2. Crash reports don't sample the general population, so statistically the information is less useful. If we get a perf issue, it's very important to know whether that issue is suffered by 10% of the users in general pop, or 0.5% of users in general pop. You want to prioritize the stuff that has the greatest impact on the general user population.
3. Crash reports are sort of a boolean filter - you only get the people that crash. The things I'd like to know to help in my development are things like "what is the histogram of max GC pause times on docs.google.com". Getting that info requires a good random sampling of the population, not just those who exhibit problems.
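For concreteness, a minimal sketch of what "random sampling plus a max-GC-pause histogram" could look like on the client. All names, the sample rate, and the bucket edges here are my own illustration, not Firefox's actual telemetry code:

```python
import random

SAMPLE_RATE = 0.01  # e.g. report from ~1% of sessions, decided once at startup
BUCKETS = [8, 16, 32, 64, 128, 256, 512]  # ms, exponential bucket edges

def bucket_index(pause_ms: float) -> int:
    """Map a pause duration to its histogram bucket."""
    for i, edge in enumerate(BUCKETS):
        if pause_ms < edge:
            return i
    return len(BUCKETS)  # overflow bucket

class PauseHistogram:
    """Accumulates max-GC-pause samples locally; only the bucket counts
    (no raw timings, no URLs beyond eTLD+1) would ever be reported."""
    def __init__(self):
        self.counts = [0] * (len(BUCKETS) + 1)

    def record(self, pause_ms: float):
        self.counts[bucket_index(pause_ms)] += 1

random.seed(0)
enrolled = random.random() < SAMPLE_RATE  # one coin flip enrolls this session
hist = PauseHistogram()
for pause in [5.0, 40.0, 300.0, 41.0]:  # simulated max pauses per page load
    hist.record(pause)
print(enrolled, hist.counts)
```

The point of the coin flip is that enrollment is independent of whether the session had problems, which is what makes the resulting histogram representative.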
"Hi! It seems that this page is loading unusually slowly, would you mind submitting more details to help Mozilla diagnose the issue?
Click `More Details` to see exactly what information is being reported."
You even already have a good entry point for one of these - the "unresponsive script" dialog.
2. If the issue is reported 10x more often on docs.google.com than on obscure.yahoo.com only because docs.google.com is far more common (even though the problem happens only on 0.00001% of visits to docs.google.com but on 10% of visits to obscure.yahoo.com) it does indicate that the issue in docs.google.com is more important. Sure it is rarer per visit, but a user is still 10x more likely to encounter it.
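To make that arithmetic concrete, here are hypothetical visit counts (my own numbers, chosen only so the quoted per-visit rates reproduce the "reported 10x more often" ratio):

```python
# Hypothetical traffic volumes; the per-visit failure rates are the
# ones quoted above (0.00001% vs 10%).
sites = {
    "docs.google.com":   {"visits": 10_000_000_000, "rate": 0.0000001},
    "obscure.yahoo.com": {"visits": 1_000,          "rate": 0.10},
}
for name, s in sites.items():
    incidents = s["visits"] * s["rate"]  # expected users who hit the bug
    print(f"{name}: ~{incidents:.0f} users hit the bug")
```

Under these assumed volumes, docs.google.com produces ~1000 affected users versus ~100 for obscure.yahoo.com, even though its per-visit rate is a million times lower.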
Thanks for bringing that up.
PLEASE do not go down this road. Look where "optimizing" video card drivers has led the video game industry. Game engine developers and game developers are lazier than ever. It is not up to you to make sure docs.google.com runs well in your browser. It is up to you to provide a browser that adheres to (and defines, if it must) standards. It is up to the web developers at docs dot google dot com to make their application work on Mozilla Firefox.
A program written by a developer and used by a user is a relationship between that developer and the user. I just work on the platform that allows that relationship to exist. I feel it's overstepping our boundaries as platform providers to say "we're not going to make this platform faster for you because we think developers are writing bad code using that performance as a crutch".
It feels like I'd be setting myself up as a self-appointed clergy over moral matters in software development. It's not a hat I'm comfortable with.
Making bad code run faster is overstepping the boundaries.
But consider the examples I mentioned: histograms of max GC pause times on a particular website, particularly bad janks, or long amounts of time spent in JS which might be the result of poor JS execution.
None of these optimize "bad code". They're just standard platform performance optimizations that help all programs. That will include "bad" programs as well.
If mozilla can't see how utterly insane this is then there is no hope left.
What about the relationship between you/Mozilla and the firefox users? This thread is evidence that at least some of the users are not happy that you are (in their eyes) sacrificing their privacy for future performance gains.
Again, the question should be if something "benefits the entire web", how can we discover it without an opt-out anti-feature? If the answer is we can't, then we don't want it. It is as simple as that.
Uh - these are the most important people. The. Most. Important. The people you just pissed off by taking a header in the middle of whatever it was they are doing. Your performance noodling is irrelevant if you aren't addressing those issues.
I'm sorry, but you make the team sound incredibly out of touch with statements like this. To offset the other platforms' advantages in marketing visibility, Mozilla has to be better across the board to survive, so unless you guys aren't crashing at all now, I'd say that this should be job #1.
I look forwards to the fork.
They say they have to make this opt-out because otherwise users will not enable this type of data collection. That, in my opinion, should be the #1 indicator that users DO NOT want this collection happening in the first place. They make it opt-out because they know most people won't opt out, either because they forget, aren't aware that it is happening, or various other reasons.
I'm okay with them collecting any data they want so long as it is opt-in (because I never will opt in). Mozilla is slowly eroding its original, core values.
Using "images.google.com" as an example is too convenient.
That would be great if you could also add whatever TLD+1 most people would rather keep private as another example right after "images.google.com".
Until sites start programmatically generating a unique subdomain for each [Firefox] user.
Do you consider images.google.com to be eTLD+1? The eTLD would be .com; so, eTLD+1 would be google.com; and hence, images.google.com would be eTLD+2?
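Right - under the Public Suffix List's terms, "com" is the eTLD and "google.com" the eTLD+1. A toy sketch of the extraction (real implementations, including Firefox's, consult the full Public Suffix List; the hardcoded suffix set here is only for illustration):

```python
# Toy eTLD+1 extraction. Real code uses the complete Public Suffix
# List (thousands of entries, including multi-label suffixes like
# "co.uk"); this tiny set just demonstrates the algorithm.
SUFFIXES = {"com", "org", "co.uk"}

def etld_plus_one(host: str) -> str:
    labels = host.lower().split(".")
    # Try the longest candidate suffix first, then keep one extra label.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in SUFFIXES:
            return ".".join(labels[max(i - 1, 0):])
    return host  # no known suffix: return the host unchanged

print(etld_plus_one("images.google.com"))     # -> google.com
print(etld_plus_one("myname.example.co.uk"))  # -> example.co.uk
```

So yes: images.google.com would be eTLD+2, and under the proposal only "google.com" would be reported.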
Sorry, I do not accept this compromise. Mozilla seems to have lost its way of late. Sad to see a company that was at the forefront of privacy and security abandon that in the name of market share and performance.
I would rather sacrifice performance for privacy, not the other way around.
From EME, to the adoption of Browser Extensions as the only customization option, now this.... Mozilla and FF is changing in ways that are harmful to the open, secure, and private web. Following the trends and policies of MS and Google is not the correct path.
That said, I don't feel that we have a choice but to compromise. If we don't build a better browser, then the other browsers will win by default, which means you lose all those privacy and security motivations anyway.
This is not some gleeful romp down the yellow brick road of data collection. It's a hard-searched, difficult compromise to a question that there are no good answers to, and LOTS of disagreement about.
I have used FF since Ver 1.0 for a few reasons, the top ones being that it is Open Source, it has always been the most privacy- and security-focused browser, and it was a strong advocate of Open Standards that were interoperable on ALL platforms without vendor lock-in.
FF is still open source.... the rest that seems to be in flux now.
I don't see it as an either-or, but rather a balance to strike. A perfectly private browser with no marketshare doesn't help users. A completely compromised browser with 100% marketshare doesn't help users either.
Mozilla is not happy with us current users, though; they would much rather trade us for Edge and Chrome users.
Mozilla has made it clear it does not value the users that desire privacy, customization, and power in the hands of the user. Mozilla has dreams of "beating Chrome", a pursuit I have no interest in and place no value on.
The only hope is that one of the forks of earlier versions manages to get enough developers and an institution behind it that they can bring it back to popularity, but before that happens we might be calling the internet "Chromenet" and google won't allow you to visit their sites unless they have been signed with a valid Chrome developer key.
Edit: I've been with you guys since the beginning, but the line is drawn here.
If Mozilla wants perf data, collect it and then prompt the user "crash reporting" style.
I would totally opt-in to prompts. Give it a threshold and ask, "This page seems to frequently perform less well on your computer, would you like to send us a report?"
Random sampling and privacy run into conflicts not just in the browser space, but everywhere else. For example, recently the Canadian government went through a period where it allowed census respondents to optionally answer some questions that were previously mandatory (using privacy arguments). The result was several years of poor census information. The recent government reinstated the mandatory census questions.
The browser is just one arena where this everpresent conflict between knowledge and privacy plays out.
What has changed so much in the last 5 years or so that now you have to get all this data? What is wrong with just building a standards-compliant browser that runs JS fast and has easy-to-understand settings (where I don't have to go to about:config to disable WebRTC/telemetry/Pocket etc.)?
To be honest, a lot. Once again, this is my personal take on the matter, not Mozilla's view.
First off, browsers were a LOT simpler back then. The sophistication and complexity in a browser has grown significantly in the last decade or so.
Secondly, browsers have matured. Remember that this software category has only been around for 20 years or so. Compared to the code quality in browsers today, browsers of 10 years ago were crude and simple. As a software category matures, the low-hanging fruit dry up, so it's harder and harder to improve your product.
Lastly, competition. Firefox had the luxury of being released when the biggest competitor (Microsoft) wasn't putting real effort into its browser product. Google will not make that same mistake with Chrome.
Basically, we needed less information back then, because the problems were much more obvious, because the whole industry was still pretty young. Now browsers are much more mature, the ecosystem is much more complex and has a much wider user base, and the problems are becoming harder and harder to pin down.
> A perfectly private browser with no marketshare doesn't help users. A completely compromised browser with 100% marketshare doesn't help users either.
But for things like perf and regression? Really?
You might miss out on issues if users don't submit, but each submission is an indication of a problem (because it's Firefox that decides a problem is bad enough). And you can still prioritize based on how common that problem is.
A random sample of users experiences perf issues, a random sample of users opts-in to the collection, you get a random-sample of data. (If you suggest they opt-in to continued collection, you might even get a continuous stream of samples from the same user.)
Yes, that data won't cover the people who don't have issues, but do you need to optimize for them? It also won't cover people who have issues but still don't opt-in, but do you think that is somehow correlated to the severity of the issue? Otherwise the data will be mostly unbiased. The variance will be higher than if you made it opt-out, but if you are doing sound statistics, you will have to handle that anyway.
You people are going ahead with this idiotic plan - because that is what Mozilla does, asks for feedback and then proceeds to ignore feedback - and you will lose another 2% market share.
The reason is painfully obvious: You betrayed one of the core principles of Firefox, which is privacy. You pissed off a lot of people which will NEVER come back because you stabbed them in the face and spat in the wound.
You also gave Microsoft and Google a freebie. Now they have something else to throw back at you: your supposed "more private" Firefox phones home with your users' browsing history (not strictly true, but people don't dig that deep into the minutiae).
How's that stopping them from winning by default? You basically just disqualified yourself...
Make this thing optional, otherwise you are dead meat. If you can't "win" without betraying your principles, it's time to either throw down the towel and give up or just be upfront and admit that you are going to go all in, users be damned.
That last option would actually probably gain you a few users.
It can be a factor in security, both positive and negative: XUL was very powerful and could be abused, but it was also used by some projects to enhance the security of FF or provide other security-related functionality that is now no longer possible unless FF allows it or builds it into the browser directly. Same for privacy.
So, since Web Extensions / Browser Extensions were started by all 3 of those entities, with FF adopting them, I am very, very cautious of them.
Who decides what is a "better" browser?
1. Is it the authors? Do they write the software for themselves and agree to share it for free with anyone who may want to use it?
2. Is it the users? Do the authors solicit feedback from users to determine what users want? If users demanded a browser with no default telemetry, would the authors comply?
3. Is it third parties who have an interest in the behavior of users? For example, domain name industry, ad-supported businesses, their employees or advertisers themselves. Are the authors on salary, compensated indirectly from advertising revenue? Or does it come from somewhere else?
4. Is it all of the above? If we follow the money where does it lead? Whose decision of what is "better" is the most important?
Mozilla is descended from a defunct 1990's company that aimed to license a web browser to corporations for a fee. It would have been very clear in that case who the browser was being written for. But today, it is not so clear who Mozilla is serving. It resembles some sort of "multi-stakeholder" project.
It would be nice to have a browser that fits description 1 or 2. I believe there are plenty of folks, including some developers, who would appreciate a browser with no default telemetry. By virtue of the total absence of data collection, they might consider it "better" than alternative browsers that "need telemetry" for whatever reason.
The same concern will of course apply to any other data harvesters, but that's for another thread
Now, here's my concern. I DO NOT want compromises. I DO NOT want to balance anything. I DO NOT want this telemetry crud on my browser spewing out my browsing history to anyone, no matter how anonymous you people claim it will be.
I just want a decent web browser.
What are my options? "Mozilla's way or the highway"? Redirect evil.telemetry.things.mozilla.org to /dev/null? Go back to elinks?
Or will there be a "disable this piece of crap utterly and completely" button somewhere not hidden under an URL? Or even better, a compile flag?
The main reason to collect data is monetization. People don't like to think they're being sold, so it's justified on other grounds. That's a universal. Since the way data is monetized is to track and segregate users, claims that it can be done in a privacy-respecting fashion are, therefore, specious.
There is one conclusion to be drawn here, and it isn't that Mozilla is going to respect my privacy.
Also interesting: the method they plan on using for anonymising this: https://en.wikipedia.org/wiki/Differential_privacy#Principle...
If that is not sufficiently anonymous, then please submit the reasoning why to Mozilla.
EDIT: OK. It's boolean flags (like use of flash) plus an eTLD+1 (example.org; not myname.example.org?). Even so, I believe this tracking should be opt-in with a disclosure screen that explains exactly what Mozilla is recording. Informed consent is a practice we should be promoting, even if it seems unnecessary.
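For anyone curious how the anonymisation works: the building block of RAPPOR is randomized response. A minimal one-bit sketch (the noise parameter f here is illustrative, not Mozilla's actual choice):

```python
import random

def randomized_response(true_bit: int, f: float = 0.5) -> int:
    """With probability f, report a uniformly random bit; otherwise
    report the truth. Any single report is deniable, but aggregates
    over many users can be corrected for the known noise rate."""
    if random.random() < f:
        return random.randint(0, 1)  # noise: uniform coin flip
    return true_bit

def estimate_true_rate(reports, f=0.5):
    # Invert the noise model: observed = f/2 + (1 - f) * true_rate
    observed = sum(reports) / len(reports)
    return (observed - f / 2) / (1 - f)

# Simulate 100k users, 30% of whom truly have the bit set.
random.seed(1)
reports = [randomized_response(1 if random.random() < 0.3 else 0)
           for _ in range(100_000)]
print(round(estimate_true_rate(reports), 2))  # close to 0.3
```

Worth noting: RAPPOR's full scheme also memoizes a "permanent" noisy answer per reported value, so that repeating the same report over time doesn't let the noise average away for an individual user.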
Doesn't the differential privacy system described above prevent even that from being an issue?
Not to mention, people will tend to visit the same websites repeatedly. The entire premise of DP is that the real data will stand out from the noise, creating a compelling picture of what an individual visits on the web. How will that aggregate data be anonymized, when it is reported with (a minimum of) an IP?
In short, this still requires a lot of trust in Mozilla, even with the DP algorithm, to not do the wrong thing with the dataset. And, in my eyes, making this opt-out and not opt-in already compromises that trust.
Mozilla has been violating even the minimal legal standards in the EU for years, and no one cares.
It’s insanity that an organization promoting its products on privacy doesn’t even meet the minimal legal standards. We’re seeing Google Analytics tracking in parts of the browser (the "Get new Add-ons" page, for example), without even the legally required cookie warning.
EU law is clear on this: as soon as you store any data, do any tracking, connect to any third party, or transmit anything for analytics, you have to get opt-in.
Wouldn't that still leak health information? Less overall, but if any is bad, this still isn't acceptable.
5%^10. Very very unlikely. Sounds very similar to "guilty beyond all reasonable doubt".
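Assuming that shorthand means ten independent 5% events, the arithmetic:

```python
# Probability that ten independent 5%-likely events all occur.
p = 0.05 ** 10
print(p)  # roughly 9.77e-14
```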
A URL must not contain PHI. If it does, a breach has already occurred.
And Firefox is only collecting the domain names, it looks like.
I'd argue that domains are the same- there are tons of domains that clearly indicate what they're about (e.g. stop-drinking.example)
You can't, but that can't be part of Mozilla's threat model, and it's not relevant here anyway because Mozilla isn't collecting it.
And even if they were, that's not considered PHI legally. You are free to type any information about your own health that you want anywhere; that doesn't make it legally PHI, unless you are providing it to a Covered Entity.
> I'd argue that domains are the same- there are tons of domains that clearly indicate what they're about (e.g. stop-drinking.example)
This information is not legally considered PHI. As for privacy, SNI means that all domains you visit are already visible in transit, even if you are using SSL. Domain names are not considered private.
Do you have any sources that go into more detail?
When I've worked on PII in analytics, even TLDs were treated carefully. (obviously not the same from a legal perspective...)
PHI is an incredibly well-defined term legally and is not equivalent to PII. Some things that constitute PHI actually wouldn't qualify as PII.
There are a lot of resources that explain HIPAA in great detail; if you want to know the specifics like here, you have to read the bill and the case law itself.
How normal everyday people actually use their product can't be a part of the threat model... Really?
That is scary...
That said, when a "breach" has occurred is a legal distinction involving the control of information -- when protected data moves beyond those who have a duty to protect it. Saying that a particular technical approach creates a breach is inaccurate.
I very much hope that the Debian maintainers (and hopefully also the guys preparing Fennec in F-Droid) will disable such data collection mechanisms, either completely or hidden behind an explicit opt-in instead of the opt-out suggested in the e-mail.
Do you have a citation for that broad assertion? My understanding is that this is highly variable across legal jurisdictions and even in Europe, which typically leads the way in privacy, it's not that simple. See e.g.
https://www.whitecase.com/publications/alert/court-confirms-... discussing an EU Court of Justice ruling that had two requirements: the ISP can link that IP address to an individual AND the website operator can get that information from the ISP.
Legally, though, Firefox would be allowed to collect this anonymous data by having users send it, e.g., to an API endpoint they provide via IP-based communication; they would just not be allowed to associate the data with the IP address of the user submitting it. In the end, it comes down to trusting the party that collects the data, at least if they don't anonymize the IP address via other means, e.g. by passing the information through a third-party proxy server.
BTW, the GDPR does forbid turning on such data collection by default (privacy by default), so they would be required to get an explicit opt-in from the user for that.
My understanding is that many of these details are yet to be settled under the GDPR. The case referenced above was not interpreted under the GDPR, which has yet to take effect. The definitions of personally identifiable data are rather vague, and precedent has not been set. A quick search showed conflicting opinions, but one perspective to consider is quoted below:
> In addition, businesses should note that Recital 26 to the recently adopted EU General Data Protection Regulation ("GDPR") states that the test for whether a person is "identifiable" (considered in detail above) depends upon "all the means reasonably likely to be used" to identify that person. The CJEU in Breyer did not directly consider the issue of likelihood of identification. If the BRD was not reasonably likely to attempt to identify Mr Breyer from his IP address, this could potentially give rise to a different analysis under the GDPR. Consequently, it may be necessary for the CJEU to revisit this issue after enforcement of the GDPR begins on 25 May 2018.
This is a few years old, so if you know of some new decision or regulation that clarifies it would be great to know!
Now, you could of course argue that often it's not possible to infer the identity of a person from an IP address (e.g. because it is an IP address dynamically allocated by an ISP, or the IP address of a proxy server through which many users connect to the Internet) and therefore store it. But it would be very hard, if not impossible (IMHO), to ascertain that none of the IP addresses you store could be used to identify a specific person (what if, e.g., 5% of the IPs in your data are static?). This in turn would make treating all of your IPs as non-personal data a risky business, to say the least, as there will almost certainly be a way to identify at least some of your users from their IP addresses. The fact that you don't know about a particular way of doing this identification is not relevant here.
My advice: if you do not use a very robust method for making sure that all the IPs you store are non-identifiable, I would recommend not storing them at all (or at least truncating them to 24 bits, though even that does not always eliminate the deanonymization risk).
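For IPv4, that truncation step can be sketched like this (a hypothetical helper using Python's standard `ipaddress` module; the address and prefix are just examples, not anything Mozilla has proposed):

```python
import ipaddress

def truncate_ip(addr: str, prefix: int = 24) -> str:
    """Zero out the host bits of an IPv4 address, keeping only the
    first `prefix` bits (here the /24 network). This reduces, but
    does not eliminate, the re-identification risk."""
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)

print(truncate_ip("203.0.113.87"))  # -> 203.0.113.0
```

Note that on sparsely populated networks even a /24 can still narrow a report down to a handful of subscribers, which is why truncation alone is not a guarantee.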
Are you saying that people can not be identified by their IP address?
I think it's important to talk about this issue – especially the importance of not storing it long-term — but from my perspective the real concern is the industry dedicated to linking and sharing your online activity. Without that an IP has little value and with it they can deanonymize most people without using IPs.
And requires an opt-in under EU law, which makes this entire thing even more ridiculous.
Otherwise literally everything that connects to the internet in some way would have to be treated that way, and that's not how the law is currently enforced.
Not true. Tor has demonstrated that it's entirely possible to transmit data over the internet without revealing your IP address to the party you're transmitting to.
Latency also doesn't matter here; this telemetry could take 5 minutes to reach its destination and it wouldn't matter, so long as the data is eventually received.
I don't want my browser to collect any kind of data.
No, seriously; why? I don't get this mentality at all.
Let's ignore the exact implementation here for a moment, and assume that Firefox is somehow magically doing this data collection in such a way that it is guaranteed the data collected cannot be traced back to you as an individual. (E.g. "sufficiently anonymous".)
What problem do you have with that, specifically? How does this harm you in any way?
That's kinda beside the point here though, as the GP seemed to be against collecting this data _regardless_ of whether it's anonymized. I'm interested in hearing why.
I'm ok with testing things and sending feedback, but when I switch to a production environment, I just want my tools to behave like my tools, not the testing farm for somebody else.
Why should I prove to you that it would harm me? I just do not want it, it should be enough.
Obviously that's ridiculous, right? If the default setting was to not seed, torrent clients would be much less useful for everyone involved. Browsers sending usage stats are much the same way. While no individual user benefits from _their machine_ sending those statistics, it's better for the user population as a whole if the default setting is to send them, since those stats help the browser vendor build a better browser. (And before you cry "privacy", remember that in this context we're talking about a situation where the statistics are being sent in a way that is "sufficiently anonymous" such that privacy isn't an issue. See the GP.)
So while I agree you certainly should have the right to disable sending usage statistics if you wish (just as many Torrent clients let you disable seeding), expecting that to be the default setting is a bit strange.
Imagine I came to your house, and photocopied all your documents.
Don’t worry, I blanked out the name, so it’s completely anonymous, and everything is where it used to be.
Would you be okay with that?
I certainly wouldn’t.
Making this opt-in or opt-out is a question of consent, and choosing opt-out shows that you don’t give a flying fuck about me, and only want your own benefit.
There is a swearword used, but in this context it is not in any way uncivil, as the plural "you" it refers to is an abstract person, a hypothetical entity – not any actually involved person. (In this case, the potential future group of people at Mozilla who might decide to override an explicit choice I made for their own convenience.)
The topic is a choice that Mozilla plans to make and is asking users about.
The decision has not been made.
My argument is that, if Mozilla (and whatever users Mozilla asks to give an opinion), choose to override the current decisions of users who do not want telemetry, and require a new hidden opt-out, then that would be proof that they (as group) don’t really care about the users choices.
The user I was talking to has no power in making that choice, nor do I. Nor is all of Mozilla making that choice.
I was using "you" with the meaning of the German word "man", (I’m natively German): one; you; they; people (indefinite pronoun; construed as a third-person singular).
Remove all "we respect privacy" from the advertisement.
I like your faith. However, if this change goes in and the capability is there, it will get misused. Because, statistically, that's how these things go on this planet under capitalism.
Or are you saying you're worried that they could _start_ collecting non-anonymized data in the future? If so, I don't really get that argument either. People always have the ability to change what they're going to do in the future, Mozilla deciding now not to collect this data wouldn't change that.
1. Run an opt-out SHIELD study to answer the question: "how many people can find an 'opt-out' button?". That's all. You launch this at people with as much notice as you would plan on doing for RAPPOR, and see if you get a 100% response rate. If you do not, then 100% - whatever you get are going to be collateral damage should you launch DP as opt-out, and you need to own up to saying "well !@#$ them".
2. Implement RAPPOR and then do it OPT-IN. Run three levels of telemetry: (i) default: none, (ii) opt-in: RAPPOR, (iii) opt-in: full reports. Make people want to contribute, rather than trying to yank what they (quite clearly) feel is theirs to keep. Explain how their contribution helps, and that opting-in could be a great non-financial way to contribute. If you give a shit about privacy, work the carrot rather than the stick.
3. Name some technical experts you have consulted. Like, on anything about DP. The tweet stream your intern sent out had several historical and technical errors, and it would scare the shit out of me if they were the one doing this.
4. Name the lifetime epsilon you are considering. If it is 0.1, put in plain language that failing to opt out could disadvantage anyone by 10% on any future transaction in their life.
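For intuition on what an epsilon like 0.1 means, here is a minimal sketch of binary randomized response, the textbook epsilon-differentially-private mechanism (this is not RAPPOR itself, and the parameter value is purely illustrative):

```python
import math
import random

def randomized_response(truth: bool, epsilon: float) -> bool:
    """Report the true bit with probability e^eps / (e^eps + 1),
    otherwise flip it. Observing the report shifts the odds of any
    hypothesis about the true bit by at most a factor of e^eps --
    that is the differential privacy guarantee."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return truth if random.random() < p_truth else not truth

# epsilon = 0.1 bounds the odds shift at e^0.1, i.e. roughly 10%
print(math.exp(0.1))  # ~1.105
```

A larger epsilon gives the analyst more accurate counts but weakens the bound, which is why the lifetime epsilon (summed over all reports ever sent) is the number that actually matters.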
I think the better experiment that is going on here is the trial run of "we would like to take advantage of privacy tech, but we don't know how". I think there are a lot of people who might like to help you on that (not me), and I hope you have learned about how to do it better.
If they start opt-out tracking using the same approach as Google I do not see any reason to use it nor install it for my friends and family. That's some data for you, Mozilla.
You want Firefox to succeed as a browser, but to be able to better compete it needs better usage data.
Wouldn't you prefer for Firefox to be the best browser available, AND also be considerate towards your privacy rights?
At this point why would anyone stay with the company B which broke its promise once, just in the hope that it won't break the promise again? It has already lost the trustworthiness and it also has the worse product. Might as well use products from company A.
Privacy is not a boolean.
If yes (and that’s what you get when you choose opt-out), then we’re done. There is no gradual change there, it’s a binary question if you value the user or your own benefit more.
And that’s strictly incompatible with mine.
First ask, then fuck up. Is that concept so hard to understand?
If you’d do that IRL to someone they’d never talk to you again, it’s the same with Firefox if they do this.
And of course this all started with you saying that you may as well switch to another company's products, a company which you know violates your privacy quite significantly. You still haven't explained why Firefox collecting a small amount of data in a way that tries to minimize any privacy violations means you should just give up any semblance of privacy and use a product that tries to collect as much personal information as possible.
I’ve dealt with these issues before myself.
And I understand well what they collect, how, and why. I understand how painful it is when you have no data on what is used and how, or not even crash reports.
But there also is a limit to how far you can go, and where consent is required.
And when transmitting anything, or collecting anything, consent is required.
You could make it dependent on situation. If a performance issue occurs, show a bar: "Is this website slow? Click [Here] to submit a report so it can be improved. [Details] [X] Always submit".
This gives the user a far better understanding of what is submitted and why it is needed; it is contextual, and it is still opt-in (but with far better conversion).
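The flow described above can be sketched roughly like this (the threshold, function name, and prompt wording are all invented for illustration; this is not any actual Firefox API):

```python
# Hypothetical sketch of a contextual, opt-in report prompt that only
# fires when a problem actually occurs.
JANK_THRESHOLD_MS = 250  # assumed cutoff for "this page is janky"

def on_paint_jank(site: str, jank_ms: float, always_submit: bool):
    """Return the prompt to show, "submit" to auto-send, or None."""
    if jank_ms < JANK_THRESHOLD_MS:
        return None          # no problem: no prompt, no data collected
    if always_submit:
        return "submit"      # user previously checked "Always submit"
    return (f"Is {site} slow? Click [Here] to submit a report "
            "so it can be improved. [Details] [X] Always submit")
```

The key property is that nothing ever leaves the machine without either an explicit click or a prior explicit "always submit" choice.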
And if the way Mozilla gathers data is much more considerate, what results can I expect from it? Better parallel requests and data fetching, hardware acceleration, etc are all features that are missing for me as a Linux user. They don't need my dataset for that, it's probably all in their bug tracker.
I prefer absolute privacy over some minor advantages on irrelevant webpages.
How do you even think this system would work in restricted environments such as governments where even the presence of code that could collect data is an absolute no-go?
For more information, see https://en.wikipedia.org/wiki/Differential_privacy for instance.
If the mechanism works, fine, but why should I use Firefox over Chromium then? Opt-out data collection is in violation of my core beliefs and of what I believed to be Mozilla's principles.
Collecting data without asking the user about it is - to me - in violation of the very definition of privacy, and calling some way of anonymising data (who guarantees that the cryptographic approach is not obsolete in a few years?) "differential privacy" is at the very least dishonest.
Two - DP is only really private over a small data set per individual. If DP were enabled for even two days, you could get a very accurate picture of the sites I visit, since a majority of the domains reported would necessarily be accurate values.
Two: That's an interesting question. You'd need to ask it to someone with more domain knowledge than me.
Here's a concern that comes up from that implementation option: any outlier from the set of existing domains (which would likely be implemented simply as a list of strings) could immediately be called out as a "True" value, while a domain reported only a single time could reliably be called out as a "False" value. Unless, of course, you choose a randomization algorithm that exhibits a very strong clustering trait.
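A toy illustration of that outlier concern, with an invented whitelist and flip probability (this is not RAPPOR's actual encoding, which uses Bloom filters and two-stage randomization):

```python
import random

# Hypothetical whitelist of domains the report vector covers.
DOMAIN_WHITELIST = ["example.com", "news.site", "video.site"]

def noisy_report(visited: set, flip_prob: float = 0.25) -> set:
    """Toy randomized response over a fixed domain list: each
    whitelist-membership bit is flipped with probability flip_prob,
    so any single on-list domain in a report is deniable."""
    report = set()
    for domain in DOMAIN_WHITELIST:
        truth = domain in visited
        if random.random() < flip_prob:
            truth = not truth
        if truth:
            report.add(domain)
    # Domains NOT on the list have no cover traffic: if they are
    # passed through at all, their presence reveals a true visit
    # with certainty -- the outlier problem described above.
    report |= (visited - set(DOMAIN_WHITELIST))
    return report
```

This is why limiting reports strictly to the whitelist (or hashing everything into a fixed-size structure) matters: noise can only hide values that every client might plausibly report.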
You could also limit reports to those domains which are in the whitelist, but that would voluntarily neuter the reporting; something they seem less-than-eager to do.
Ultimately, it will all come down to the implementation details, which are unlikely to be available until after the opt-in release, and auditable by a remarkably small number of people in the open source community.
> For Firefox we want to better understand how people use our product to improve their experience.
No phoning home. No telemetry, no data collection. No "light" version of the same, no "privacy-respecting" what-have-you. No means No. Nada. Zilch. Try and shovel any of that down people's throats and the idea of Firefox as a user's browser will die.
And now this :-(
I have been using Firefox since before it was called that. I develop my apps in it, even though most of my colleagues have switched to Chrome years ago. Even though it is (or was for a while) slower than Chrome for things like Canvas.
But I use it because I believe in Free Software. But Mozilla keeps disappointing: DRM, bundled third-party apps, analytics, tracking... It is just so very sad. :-(
Also, I have 17 add-ons installed (11 active). At present, of these 17, only 2 will continue working after November when the switch to WebExtensions is enforced.
Where to go from here?
Mozilla fought DRM until the very end and lost. If Firefox is to have any chance at remaining a mainstream browser it needs to support Netflix and the likes. You can't seriously blame them for this, because they are damned if they do and damned if they don't.
EME is implemented as unintrusively, securely and privately in Firefox as possible. No DRM is downloaded or run on your computer until you specifically consent to it, and the DRM components run in a sandbox.
Yes I can, and I will, because they sold out. They sold out their principles for the sake of market share. (And looking at their marked share, fat lot of good that did for them anyway.)
Excuse me, but did you support Mozilla with time/money?
> They sold out their principles for the sake of market share.
12% is still better than 1%, and the thing that mostly changed the landscape was the fact that mobile Internet heavily disfavors Mozilla (e.g. Android ships with Chrome, iPhone with Safari), and Google has a heavy advantage when it comes to advertising and engineering.
Yes, I have done. Thank you for the snark.
"It's as if the order to block/redirect the network request was silently ignored by the webRequest API, and this causes webext-based blockers to incorrectly and misleadingly report to users what is really happening internally."
There are probably security reasons why add-ons can't modify about:add-ons. Imagine an add-on that could hide itself by modifying that page.
Please don't spread FUD.
In this scenario, how exactly would Firefox's actions here compromise anyone's privacy?
Instead, it's telling that they are choosing to force people to opt-out. They know that their users don't want this, but don't care.
They still _are_ planning to let people decide for themselves whether to participate (via opt-out), they're just using a default that's more likely to result in unbiased sample data.
Again though, what's your actual concern? Provided this feature doesn't compromise anyone's privacy even _if_ its enabled, what's wrong with having it be opt-out?
If, as some commenters here [have suggested], this telemetry would help improve Firefox by significantly reducing the amount of time it takes Mozilla to fix bugs and performance issues in the browser, what makes you think that's not worth the risk when other features (such as the performance fixes themselves) are?
You just seemed to be arguing that _any_ amount of risk would be too much, which in my view is ridiculous since, as I said, all new features carry with them some amount of risk.
Unfortunately that's exactly the kind of thing I was talking about, extending arguments to ridiculous extremes.
I have never said any amount of risk would be too much. In this particular instance, I think the risk and the unknowns are clearly too much.
But why? I don't claim to know enough about RAPPOR to say for sure that the risk _is_ worth it, but it seems a little presumptuous to claim it isn't without knowing _anything_ about the project or Mozilla's proposed use of it.
That's why I assumed you were arguing that _any_ amount of risk would be too much; you didn't include any sort of analysis of the risk/reward in your previous comments, and without knowing the risk the only way to conclude this feature is definitely _not_ worth it would be if you already considered the acceptable level of risk to be zero.
It’s Firefox without telemetry, or no Firefox at all.
The only Anonymous Data is data that is never collected. If they collect data it is a violation of privacy.
This doesn't really seem unreasonable to me. Obviously part of the inherent cost of not wanting to be tracked is not having your raw user data included in evaluations of what people want.
> in this case you do not get to set it
Nothing's been decided yet. If this is something you want to advocate for, maybe consider suggesting that in the thread linked in the OP?
Attacks aside, the point is really that in this age of statistical machine learning we should be vigilant against even this sort of data collection. A leak is a leak. Ideally people can opt into providing just enough information for the statistics they want to participate in and no more; realistically, more is always collected.
If your answer is "nothing" then I think you're being unreasonable. Firefox risks compromising security/privacy with _every_ new feature they implement, not just this one, and it's clear from [other comments] in this thread that this feature is just as important for the overall functionality of Firefox as any other feature would be.
You must be kidding me.
That to me suggests the problem isn't that too many people are opting-out, it's that not enough people are opting-in.
This trend towards paternalism in software, especially software that is supposed to be user-driven, is frankly a steaming pile of garbage.
If you have any shred of pretense of being pro-privacy and pro-user, don't do this, Mozilla.
Additionally, it's not that Mozilla just disregards user privacy here: differential privacy being used would mean that no user has to reveal their private information, but looking at all the data in aggregate would still allow Mozilla to gain useful information on how to make Firefox better.
Because most people don't care, it was decided to implement a feature that is flat out contrary to people caring.
Management decisions like this don't exactly inspire confidence about the future of the browser.
IMHO, this is a bad idea. Many people I know already use Firefox because they're wary of giving Google (Chrome) all their data.
Firefox should make this feature opt-in only.
It's not just about the data, it's about the lack of consent. If you just ask people for permission on the initial startup, I'm sure most people will be fine with enabling it. Last time I installed Firefox, it just showed a tiny bar at the bottom of the window, which is pretty easy to miss. I'd expect fewer dark patterns from Mozilla, that's the kind of shady behavior you see coming from Microsoft. I always try my best to disable or block anything which phones home without explicitly asking for consent.
Tunnelblick is a good example of this being done well. On first run it asks if you want to enable automatic updates. That includes the option to disable sending anonymous system information, as well as a disclosure widget with a brief explanation and a table showing the information that would be sent.
I agree, but note that they are explicitly trying to get more info than they can from the small, biased sample that is users opting in.
Just starting to collect your browsing data is a bad idea (tm) especially if your main claim is "more privacy".
Maybe because most people using Firefox use it precisely because they don't want the browser vendor to track their behaviour?
I wonder how the Tor Browser folks will deal with this.
They get good enough data from the people that have volunteered it. I don't know what makes them think it's biased but I seriously doubt that is true.
Informed, constructive opinion there.
One clear sign of the bias is that the crash rate of the browser goes up massively every time a new version transforms from beta to release. Clearly, it's not renaming that string that makes the browser crash. The populations are just fundamentally different.
To give an obvious example, beta users are overwhelmingly more likely to have up-to-date video drivers. (Which can be seen in crash reports, but is also very logical.)
Most users do not care because they do not understand the true ramifications of not caring. It is not as if they looked at all the data and then made an informed choice to share everything.
FF should be at the heart of caring for users privacy EVEN IF THEY THEMSELVES DO NOT.
The average person does not understand technology, how much data they are leaking about themselves, or how this data can be used against their interests.
Taking advantage of that ignorance for any kind of gain is unethical, IMO. Most companies willfully exploit this collective ignorance; Mozilla should be better than most companies.
Quite the opposite: if they focus too much on regular users, they might get too much noise and never notice issues in the more complex features that only power users tend to use.
You want the heavy users of your product sending in reports, not the average Joe, because he is less likely to even notice an issue.
Higher-level features are less likely to be covered by tests and more likely to break just because of their complexity; however, you won't have many average people using them.
Former Opera people can tell a few stories there.
[EDIT: Firefox branding used to use the word privacy a lot. I can't find it on their website much at all anymore.]
> Firefox doesn’t sell access to your personal information like other companies. From privacy tools to tracking protection, you’re in charge of who sees what. Here’s how Firefox protects your privacy.
For instance, from the same page 8 years ago: "we have experts around the globe working around the clock to keep you (and your personal information) safe."
Or this quote from the equivalent site 6 years ago:
"And, as a non-profit organization, protecting your privacy by keeping you in control over your personal information is a key part of our mission."
These users will be installing Chrome, not Firefox.
The switch barrier is non existent.
Most of the replies in this thread are ideological. Nobody is arguing CSS rendering speed comparisons and such.
People use Firefox because they like it.
People are irrational but like is huge. Toyota over Hyundai. Vacations at the sea instead of skiing. Firefox over Chrome.
The like is a habit, but if all things are equal and free, a very flexible one.
And this move will also drive them away.