Tracking down regressions, crashes, and perf issues without good telemetry about how often they're happening and in what context. Issues that might otherwise have taken a few days to resolve with good info become multi-week efforts at reproducing the issue with little information.
It simply boils down to the fact that we can't build a better browser without good information on how it's behaving in the wild.
That's the pain point anyway. Mozilla's general mission, however, makes it very difficult to collect detailed data - user privacy is paramount. So we have two major issues that conflict: the need to get better information about how the product is serving users, and the need for users to be secure in their browsing habits.
We also know from history that benevolent intent is not that significant. Organizations change, and intents change, and data that's collected now with good intent can be used with bad intent in the future. So we need to be careful about whatever compromise we choose, to ensure that a change of intent in the future doesn't compromise our original guarantees to the user.
This is a proposed compromise that is being floated. Don't collect URLs, but only top-level+1 domains (e.g. images.google.com), and associate information with that. That lets us know broadly what sites we are seeing problems on, hopefully without compromising the user's privacy too much. Also, the information associated with the site is performance data: the time spent by the longest garbage-collection, paint janks.
This is a difficult compromise to make, which is why I assume it took so long for Mozilla to come around to proposing this. These public outreaches are almost always the last stage of a lengthy internal discussion on whether proposals fit within our mission or not.
I'm not directly involved in this proposal, but I personally think it's necessary, and strikes a reasonable balance between the privacy-for-users and actionable-information-for-developers requirements.
If that's what you're aiming at. Collect the data but keep it local. Install some sort of responsiveness/"problem" monitoring. Ask the user to send data relevant to the problem if a problem occurs. IMHO there is no need to systematically collect user data for that.
Or get the data from a random sample of users. You don't need data from everyone.
That seems like a reasonable compromise to me. I'm happy to send logs if my browser crashes whenever I visit a certain page, and if I know I'm gonna be monitored for that period, I'll isolate my browsing habits to only visit that page. I do not consent, however, to sending everything--even anonymized--on the off chance that Mozilla will see the crash events and use them to flag that domain and maybe fix the issue on that particular page.
That sounds way more reasonable to me.
To my amateur ear, that actually sounds like a good compromise to lessen the blow somewhat more. You should suggest it to Mozilla :)
And if I opt-in to data collection, why would it matter to me whether the stats I'm sending are a result of me being selected as part of a random sample or not? Might as well just _always_ send those stats; it doesn't matter to me.
"What we plan to do now is run an opt-out SHIELD study to validate our implementation of RAPPOR. This study will collect the value for users’ home page (eTLD+1) for a randomly selected group of our release population. We are hoping to launch this in mid-September."
"this study will collect ... for a randomly selected group"
 - https://wiki.mozilla.org/Firefox/Shield/Shield_Studies
Then don't make the compromise.
As others have expressed here, the reason few people opt in to data collection may be that they have chosen to use a Web browser that does not mandate the collection of data.
I'm assuming there will always be an opt out which I shall add to my list of things I have to do when installing Firefox.
There will be. Sorry for the hassle :(
The ESR track presumably will have the default flipped because corporations get funny about data transfers to remote servers - mind you Microsoft seem to be getting away with it for businesses that don't have a full-on Enterprise setup.
I'm not sure I like that gamble.
I use Firefox and always opt into any telemetry that sends data back to Mozilla. You could say I am a fanboy. I think it is a HORRIBLE idea and Mozilla should scrap it yesterday and never bring it up again. If people bring it up again, send them to the roof team (if it doesn't exist, create one). If they come downstairs, fire them. You already have people like me who are willing to opt-in to every single thing you can try. For example, Firefox nightly on Android has consistently crashed for me about every five minutes or so since the last weekend and yet I keep using it. Don't throw away this goodwill.
Lack of reporting from non-technical people who aren't aware they can opt-in cannot be corrected statistically, as the two categories of people (technical, non-technical) use the browser very differently.
For a made-up example, if you type "Yahoo" into the search bar and then type "Search" into the field and then type your search into the third page, you'll be acting as many normal world users do, and you may uncover crashes on page #2 at Yahoo that a technical user would never encounter, simply because they wouldn't type the word "Search" into the search field at Yahoo and trigger a JS bug where "Search" or "Yahoo" gets used one too many places and ends up crashing the CSS parser because it races with repaint.
If that problem affects 0.01% of the Firefox population, that's a lot of people who don't think technically, and do feel regret when we crash and can't help them because we can't see where it crashed.
(Yes, employed. No, I didn't talk to anyone else before I posted here. My own thoughts, I am not a number^Wcorporation, etc.)
> This is a proposed compromise that is being floated. Don't collect URLs, but only top-level+1 domains (e.g. images.google.com), and associate information with that. That lets us know broadly what sites we are seeing problems on, hopefully without compromising the user's privacy too much.
Sure, there's no problem with images.google.com because it's generically innocuous. But what about pornhub.com for users in Saudi Arabia? Or some Japanese site that's essentially child porn for users in the US? The top-level+1 domain in many cases is totally incriminating.
> Also, the information associated with the site is performance data: the time spent by the longest garbage-collection, paint janks.
Maybe so. But it's collection of the top-level+1 domain that's the problem.
> I'm not directly involved in this proposal, but I personally think it's necessary, and strikes a reasonable balance between the privacy-for-users and actionable-information-for-developers requirements.
Fine. But then, make it opt-in, to protect users.
1. You're proposing a mechanism for collecting data, and a strategy for extracting more data than you currently do. You have not figured out the type of data that you will finally need, only a set of things that you currently envision. Naturally, the data that you will collect in the future will be more than what you currently envision. There is built-in mission creep that is dangerous.
2. What you currently envision is not fleshed out as especially useful. You only believe it is useful. The pain point of biased data is a red herring. Your concern is more about not enough data.
3. You have found a technology which you believe will allow you to collect a lot of data anonymously. But none of you seem to understand the technology very well. It seems like a shiny toy that you are eager to go to town with. I am not sure this is the right attitude.
4. You're proposing to use your users in lieu of proper testers, or to save time. There are many ways to properly test software and to save time. Have they been explored? There used to be a time when beta software was a thing. Prompt the users to become testers for your beta software. If users don't want to be testers then don't collect data from them. How much data do you actually need anyway? Have you fully utilized your existing data?
Overall, I see this as a nice-to-have luxury, not some life-and-death situation, and subverting the goodwill of users is not worth it, IMHO.
Firefox already has opt-in telemetry, and Firefox already has a beta channel. It's unclear to me how it would help to tie telemetry to the beta channel; that would just make the existing problems (not enough data, and biased data) even worse, since there are probably far more users willing to share telemetry data than to use beta software.
I don't know, is it? How would I check, if I consider Apple an untrusted actor?
It's an interesting compromise... because without improved performance and features, we'll lose Firefox entirely, and all of the relative privacy / security gains that entails. This is a good example where "perfect" privacy that reaches only a few is the enemy of "good" privacy that reaches more people.
Only collect top-level domains within the Alexa top 1k. That users are using a highway is less sensitive than a specific street where only 5 homes exist, and it reassures users that private domain names won't be leaked.
Send the data through Tor. That way you only get the data about the browser <-> site interaction, not user<->browser<->site interaction.
And make it opt-in and notify users of the purpose of the data collection. A good model to follow here is Debian installer and popcon. Follow the good practices of data collection in the free software world and do not use dark patterns.
EDIT: It should also be completely disabled in Private Browsing mode -- otherwise the optics are even worse than they are now.
The OP actually discusses a very interesting method for doing exactly that using differential privacy techniques. I personally think that's a very good compromise for this use-case.
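For anyone unfamiliar with the idea, here is a minimal sketch of randomized response, the basic building block that RAPPOR-style schemes elaborate on. It is not Mozilla's implementation: the function names, the noise parameter `f`, and the simulated population are all assumptions made up for illustration.

```python
import random

def randomized_report(truth: bool, f: float, rng: random.Random) -> bool:
    """Report the true bit with probability 1 - f, otherwise a fair coin flip."""
    if rng.random() < f:
        return rng.random() < 0.5
    return truth

def estimate_rate(reports, f: float) -> float:
    """Invert the noise, using E[observed] = (1 - f) * rate + f / 2."""
    observed = sum(reports) / len(reports)
    return (observed - f / 2) / (1 - f)

rng = random.Random(0)   # fixed seed so the simulation is repeatable
f = 0.5                  # hypothetical noise level
true_rate = 0.10         # assumed: 10% of simulated users have the trait
truths = [rng.random() < true_rate for _ in range(20000)]
reports = [randomized_report(t, f, rng) for t in truths]
estimated = estimate_rate(reports, f)
print(round(estimated, 3))
```

Any single report is deniable because it may have been a coin flip, yet the population-wide rate can still be estimated from the aggregate. How much protection that gives in practice depends on the noise level and on how often the same user reports.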
From the OP we can also see that they don't intend to store IP addresses, but it will always be possible. By using an anonymity network they can reassure the user and at the same time eliminate the risk that a malicious actor in the future will silently manage to start tracking information about which websites users go to. An additional benefit is that Mozilla also won't become a target for governments, a risk that no organization can ever be safe from if they start gathering information about users.
It is not enough to strike a reasonable balance between the privacy-for-users and actionable-information-for-developers. You also need to find a balance between risk management and time spent on reducing risks. What I propose primarily is that they spend a bit more time on reducing risks, as that would benefit everyone.
EDIT: Don't forget that the DNS resolution for porn sites can be deanonymized and resold by your internet provider - there's nothing we can do to protect you from DNS being a cleartext, sniffable, mitm'able protocol.
1. Crash reports only report crashes. We also want to see perf issues like GC and paint jank, etc.
2. Crash reports don't sample the general population, so statistically the information is less useful. If we get a perf issue, it's very important to know whether that issue is suffered by 10% of the users in general pop, or 0.5% of users in general pop. You want to prioritize the stuff that has the greatest impact on the general user population.
Lastly, crash reports are sort of a boolean filter - you only get the people that crash. The things I'd like to know to help in my development are things like "what is the histogram of max GC pause times on docs.google.com". Getting that info requires a good random sampling of the population, not just those who exhibit problems.
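To make "histogram of max GC pause times" concrete, here is a rough sketch of the kind of bucketing meant. The bucket boundaries and sample values are invented for illustration and are not taken from any real Firefox telemetry probe.

```python
from collections import Counter

# Hypothetical bucket upper bounds in milliseconds.
BUCKETS_MS = [10, 50, 100, 500]

def bucket_for(pause_ms: float) -> str:
    """Return the label of the first bucket whose upper bound covers pause_ms."""
    for upper in BUCKETS_MS:
        if pause_ms <= upper:
            return f"<={upper}ms"
    return f">{BUCKETS_MS[-1]}ms"

def histogram(max_pauses_ms):
    """Count how many sessions fall into each pause-time bucket."""
    return Counter(bucket_for(p) for p in max_pauses_ms)

# Made-up per-session maximum GC pauses for a single site.
sample = [4, 12, 48, 75, 120, 900, 8]
hist = histogram(sample)
print(dict(hist))
```

The point of sampling the general population is that the low buckets get filled in too; crash-only or complaint-only data would show just the tail.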
"Hi! It seems that this page is loading unusually slowly, would you mind submitting more details to help Mozilla diagnose the issue?
Click `More Details` to see exactly what information is being reported."
You even already have a good entry point for one of these - the "unresponsive script" dialog.
2. If the issue is reported 10x more often on docs.google.com than on obscure.yahoo.com only because docs.google.com is far more common (even though the problem happens only on 0.00001% of visits to docs.google.com but on 10% of visits to obscure.yahoo.com) it does indicate that the issue in docs.google.com is more important. Sure it is rarer per visit, but a user is still 10x more likely to encounter it.
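Worked through with numbers, the argument looks like this. The per-visit rates are the ones quoted above; the visit counts are hypothetical, picked only so that the report ratio comes out at 10x.

```python
from fractions import Fraction

# Per-visit failure rates from the comment above.
rate_docs = Fraction(1, 10_000_000)   # 0.00001% of visits
rate_obscure = Fraction(1, 10)        # 10% of visits

# Hypothetical visit counts (assumed, not measured).
visits_obscure = 1_000
visits_docs = 10_000_000_000

reports_docs = rate_docs * visits_docs            # expected number of reports
reports_obscure = rate_obscure * visits_obscure
ratio = reports_docs / reports_obscure
print(reports_docs, reports_obscure, ratio)
```

So even with a per-visit rate a million times lower, the busy site generates more total incidents, which is the quantity that matters when prioritizing by user impact.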
Thanks for bringing that up.
PLEASE do not go down this road. Look where "optimizing" video card drivers has led the video game industry. Game engine developers and game developers are lazier than ever. It is not up to you to make sure docs.google.com runs well on your browser. It is up to you to provide a browser that adheres to (and defines if it must) standards. It is up to the web developers at docs dot google dot com to make their application work on Mozilla Firefox.
A program written by a developer and used by a user is a relationship between that developer and the user. I just work on the platform that allows that relationship to exist. I feel it's overstepping our boundaries as platform providers to say "we're not going to make this platform faster for you because we think developers are writing bad code using that performance as a crutch".
It feels like I'd be setting myself up as a self-appointed clergy over moral matters in software development. It's not a hat I'm comfortable with.
Making bad code run faster is overstepping the boundaries.
But the example I mentioned: histograms of max GC pause times on a particular website. Or particularly bad janks, or long amounts of time spent in JS which might be the result of poor JS execution.
None of these optimize "bad code". They're just standard platform performance optimizations that help all programs. That will include "bad" programs as well.
If mozilla can't see how utterly insane this is then there is no hope left.
What about the relationship between you/Mozilla and the firefox users? This thread is evidence that at least some of the users are not happy that you are (in their eyes) sacrificing their privacy for future performance gains.
Again, the question should be if something "benefits the entire web", how can we discover it without an opt-out anti-feature? If the answer is we can't, then we don't want it. It is as simple as that.
Uh - these are the most important people. The. Most. Important. The people you just pissed off by taking a header in the middle of whatever it was they are doing. Your performance noodling is irrelevant if you aren't addressing those issues.
I'm sorry, but you make the team sound incredibly out of touch with statements like this. To offset the other platforms' advantages in marketing visibility, Mozilla has to be better across the board to survive, so unless you guys aren't crashing at all now, I'd say that this should be job #1.
I look forwards to the fork.
They say they have to make it opt-out because otherwise users will not enable this type of data collection. That, in my opinion, should be the #1 indicator that users DO NOT want this collection happening in the first place. They make it opt-out because they know most people won't opt out, either because they forget, aren't aware that it is happening, or various other reasons.
I'm okay with them collecting any data they want to so long as it is opt-in (because I never will). Mozilla is slowly eroding their original, core values.
Who decides what is a "better" browser?
1. Is it the authors? Do they write the software for themselves and agree to share it for free with anyone who may want to use it?
2. Is it the users? Do the authors solicit feedback from users to determine what users want? If users demanded a browser with no default telemetry, would the authors comply?
3. Is it third parties who have an interest in the behavior of users? For example, domain name industry, ad-supported businesses, their employees or advertisers themselves. Are the authors on salary, compensated indirectly from advertising revenue? Or does it come from somewhere else?
4. Is it all of the above? If we follow the money where does it lead? Whose decision of what is "better" is the most important?
Mozilla is descended from a defunct 1990s company that aimed to license a web browser to corporations for a fee. It would have been very clear in that case who the browser was being written for. But today, it is not so clear who Mozilla is serving. It resembles some sort of "multi-stakeholder" project.
It would be nice to have a browser that fits description 1 or 2. I believe there are plenty of folks, including some developers, who would appreciate a browser with no default telemetry. By virtue of the total absence of data collection, they might consider it "better" than alternative browsers that "need telemetry" for whatever reason.
Using "images.google.com" as an example is too convenient.
That would be great if you could also add whatever TLD+1 most people would rather keep private as another example right after "images.google.com".
Until sites start programmatically generating a unique subdomain for each [Firefox] user.
Do you consider images.google.com to be eTLD+1? The eTLD would be .com; so, eTLD+1 would be google.com; and hence, images.google.com would be eTLD+2?
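To make the distinction concrete, here is a small sketch of eTLD+1 extraction. The three-entry suffix set is a stand-in I made up; a real implementation would load the full Public Suffix List, which also has wildcard and exception rules that this ignores.

```python
# Tiny stand-in for the Public Suffix List (assumption for illustration only).
SUFFIXES = {"com", "org", "co.uk"}

def etld_plus_one(host, suffixes=SUFFIXES):
    """Return the registrable domain (public suffix plus one label), or None."""
    labels = host.lower().rstrip(".").split(".")
    # Check the longest candidate suffix first so "co.uk" is tried before "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # no known suffix matched

print(etld_plus_one("images.google.com"))
print(etld_plus_one("news.bbc.co.uk"))
```

Under this reading, images.google.com would be reported as google.com, so the subdomain is dropped before anything is sent.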
Sorry, I do not accept this compromise. Mozilla seems to have lost its way of late. Sad to see a company that was at the forefront of Privacy and Security abandon that in the name of market share and performance.
I would rather sacrifice performance for privacy, not the other way around.
From EME, to the adoption of Browser Extensions as the only customization option, now this.... Mozilla and FF are changing in ways that are harmful to the open, secure, and private web. Following the trends and policies of MS and Google is not the correct path.
That said, I don't feel that we have a choice but to compromise. If we don't build a better browser, then the other browsers will win by default, which means you lose all those privacy and security motivations anyway.
This is not some gleeful romp down the yellow brick road of data collection. It's a hard-searched, difficult compromise to a question that there are no good answers to, and LOTS of disagreement about.
I have used FF since Ver 1.0 for a few reasons, the top ones being that it is Open Source, it has always been the most privacy and security focused browser, and Mozilla were strong advocates of Open Standards that were inter-operable on ALL platforms without vendor lock-in.
FF is still open source.... the rest seems to be in flux now.
I don't see it as an either-or, but rather a balance to strike. A perfectly private browser with no marketshare doesn't help users. A completely compromised browser with 100% marketshare doesn't help users either.
Mozilla is not happy with us current users though, they would much rather trade us for Edge and Chrome Users..
Mozilla has made it clear it does not value the Users that desire Privacy, Customization, and Power in the hands of the user. Mozilla has Dreams of "beating chrome" a pursuit I have no interest in and place no value in.
The only hope is that one of the forks of earlier versions manages to get enough developers and an institution behind it that they can bring it back to popularity, but before that happens we might be calling the internet "Chromenet" and google won't allow you to visit their sites unless they have been signed with a valid Chrome developer key.
Edit: I've been with you guys since the beginning, but the line is drawn here.
If Mozilla wants perf data, collect it and then prompt the user "crash reporting" style.
I would totally opt-in to prompts. Give it a threshold and ask, "This page seems to frequently perform less well on your computer, would you like to send us a report?"
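Something like the following is what I have in mind, as a rough sketch only. The class name, the 3-second cutoff, and the "three strikes" threshold are all invented; nothing here reflects how Firefox actually measures page loads.

```python
from collections import defaultdict

class LocalPerfMonitor:
    """Keeps slow-load counts on the user's machine; nothing is sent anywhere."""

    def __init__(self, slow_ms=3000, prompt_after=3):
        self.slow_ms = slow_ms            # hypothetical "slow load" cutoff
        self.prompt_after = prompt_after  # hypothetical number of slow loads before asking
        self.slow_counts = defaultdict(int)

    def record_load(self, site, load_ms):
        """Return True when the user should be asked whether to send a report."""
        if load_ms < self.slow_ms:
            return False
        self.slow_counts[site] += 1
        return self.slow_counts[site] >= self.prompt_after

monitor = LocalPerfMonitor()
results = [monitor.record_load("example.org", ms) for ms in (3500, 200, 4100, 5000)]
print(results)
```

Only when `record_load` returns True would the browser show the prompt, and only a "yes" from the user would cause anything to leave the machine.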
Random sampling and privacy run into conflicts not just in the browser space, but everywhere else. For example, recently the Canadian government went through a period where it allowed census respondents to optionally answer some questions that were previously mandatory (using privacy arguments). The result was several years of poor census information. The recent government reinstated the mandatory census questions.
The browser is just one arena where this ever-present conflict between knowledge and privacy plays out.
What has changed so much in the last 5 years or so that now you have to get all this data? What is wrong with just building a standards-compliant browser that runs JS fast and has easy-to-understand settings (where I don't have to go to about:config to disable WebRTC/telemetry/Pocket etc.)?
To be honest, a lot. Once again, this is my personal take on the matter, not Mozilla's view.
First off, browsers were a LOT simpler back then. The sophistication and complexity in a browser has grown significantly in the last decade or so.
Secondly, browsers have matured. Remember that this software category has only been around for 20 years or so. Compared to the code quality in browsers today, browsers of 10 years ago were crude and simple. As a software category matures, the low-hanging fruit dry up, so it's harder and harder to improve your product.
Lastly, competition. Firefox had the luxury of being released when the biggest competitor (Microsoft) wasn't putting real effort into its browser product. Google will not make that same mistake with Chrome.
Basically, the information we needed back then was less, because the problems were much more obvious, because the whole industry was still pretty young. Now browsers are much more mature, the ecosystem is much more complex, has a much wider user base, and the problems are becoming harder and harder to pin down.
> A perfectly private browser with no marketshare doesn't help users. A completely compromised browser with 100% marketshare doesn't help users either.
But for things like perf and regression? Really?
You might miss out on issues if users don't submit, but each submission is an indication of problem (because it's Firefox that decides a problem is bad enough). And you can still prioritize based on how common that problem is.
A random sample of users experiences perf issues, a random sample of users opts-in to the collection, you get a random-sample of data. (If you suggest they opt-in to continued collection, you might even get a continuous stream of samples from the same user.)
Yes, that data won't cover the people who don't have issues, but do you need to optimize for them? It also won't cover people who have issues but still don't opt-in, but do you think that is somehow correlated to the severity of the issue? Otherwise the data will be mostly unbiased. The variance will be higher than if you made it opt-out, but if you are doing sound statistics, you will have to handle that anyway.
You people are going ahead with this idiotic plan - because that is what Mozilla does, asks for feedback and then proceeds to ignore feedback - and you will lose another 2% market share.
The reason is painfully obvious: You betrayed one of the core principles of Firefox, which is privacy. You pissed off a lot of people who will NEVER come back because you stabbed them in the face and spat in the wound.
You also gave Microsoft and Google a freebie. Now they have something else to throw back at you: your supposed "more private" Firefox phones home with your users' browsing history (not strictly true, but people don't dig that deep into the minutiae).
How's that stopping them from winning by default? You basically just disqualified yourself...
Make this thing optional, otherwise you are dead meat. If you can't "win" without betraying your principles, it's time to either throw in the towel and give up or just be upfront and admit that you are going to go all in, users be damned.
That last option would actually probably gain you a few users.
It can be a factor in security both positive and negative as XUL was very powerful and could be abused, but it also was used by some projects to enhance the security of FF or provide other security related functionality that is now no longer possible unless FF allows or builds it into the browser directly. Same for Privacy.
So since Web Extensions / Browser Extensions was started by all 3 of those entities with FF adopting them I am very very cautious of them
The same concern will of course apply to any other data harvesters, but that's for another thread
Now, here's my concern. I DO NOT want compromises. I DO NOT want to balance anything. I DO NOT want this telemetry crud on my browser spewing out my browsing history to anyone, no matter how anonymous you people claim it will be.
I just want a decent web browser.
What are my options? "Mozilla's way or the highway"? Redirect evil.telemetry.things.mozilla.org to /dev/null? Go back to elinks?
Or will there be a "disable this piece of crap utterly and completely" button somewhere not hidden under a URL? Or even better, a compile flag?
The main reason to collect data is monetization. People don't like to think they're being sold, so it's justified on other grounds. That's a universal. Since the way data is monetized is to track and segregate users, claims that it can be done in a privacy-respecting fashion are, therefore, specious.
There is one conclusion to be drawn here, and it isn't that Mozilla is going to respect my privacy.
Also interesting: the method they plan on using for anonymising this: https://en.wikipedia.org/wiki/Differential_privacy#Principle...
If that is not sufficiently anonymous, then please submit the reasoning why to Mozilla.
EDIT: OK. It's boolean flags (like use of flash) plus an eTLD+1 (example.org; not myname.example.org?). Even so, I believe this tracking should be opt-in with a disclosure screen that explains exactly what Mozilla is recording. Informed consent is a practice we should be promoting, even if it seems unnecessary.
Doesn't the differential privacy system described above prevent even that from being an issue?
Not to mention, people will tend to visit the same websites repeatedly. The entire premise of DP is that the real data will stand out from the noise, creating a compelling picture of what an individual visits on the web. How will that aggregate data be anonymized, when it is reported with (a minimum of) an IP?
In short, this still requires a lot of trust in Mozilla, even with the DP algorithm, to not do the wrong thing with the dataset. And, in my eyes, making this opt-out and not opt-in already compromises that trust.
Mozilla has been violating even the minimal legal standards in the EU for years, and no one cares.
It’s insanity that an organization promoting its products with privacy doesn’t even meet the minimal legal standards. We’re seeing Google Analytics tracking in parts of the browser ("Get new Addons" page, for example), without even the legally required cookie warning.
EU law is clear on this: as soon as you store any data, do any tracking, connect to any third party, or transmit anything for analytics, you have to get opt-in.
Wouldn't that still leak health information? Less overall, but if any is bad, this still isn't acceptable.
5%^10. Very very unlikely. Sounds very similar to "guilty beyond all reasonable doubt".
A URL must not contain PHI. If it does, a breach has already occurred.
And Firefox is only collecting the domain names, it looks like.
I'd argue that domains are the same- there are tons of domains that clearly indicate what they're about (e.g. stop-drinking.example)
You can't, but that can't be part of Mozilla's threat model, and it's not relevant here anyway because Mozilla isn't collecting it.
And even if they were, that's not considered PHI legally. You are free to type any information about your own health that you want anywhere; that doesn't make it legally PHI, unless you are providing it to a Covered Entity.
> I'd argue that domains are the same- there are tons of domains that clearly indicate what they're about (e.g. stop-drinking.example)
This information is not legally considered PHI. As for privacy, SNI means that all domains you visit are already visible in transit, even if you are using SSL. Domain names are not considered private.
Do you have any sources that go into more detail?
When I've worked on PII in analytics, even TLDs were treated carefully. (obviously not the same from a legal perspective...)
PHI is an incredibly well-defined term legally and is not equivalent to PII. Some things that constitute PHI actually wouldn't qualify as PII.
There are a lot of resources that explain HIPAA in great detail; if you want to know the specifics like here, you have to read the bill and the case law itself.
How normal everyday people actually use their product can't be a part of the threat model... Really?
That is scary...
That said, when a "breach" has occurred is a legal distinction involving the control of information -- when protected data moves beyond those who have a duty to protect it. Saying that a particular technical approach creates breach is inaccurate.
I very much hope that the Debian maintainers (and hopefully also the guys preparing Fennec in F-Droid) will disable such data collection mechanisms, either completely or hidden behind an explicit opt-in instead of the opt-out suggested in the e-mail.
Do you have a citation for that broad assertion? My understanding is that this is highly variable across legal jurisdictions and even in Europe, which typically leads the way in privacy, it's not that simple. See e.g.
https://www.whitecase.com/publications/alert/court-confirms-... discussing an EU Court of Justice ruling that had two requirements: the ISP can link that IP address to an individual AND the website operator can get that information from the ISP.
Legally though Firefox would be allowed to collect this anonymous data from the user by having him/her send the data e.g. to an API endpoint they provide via IP-based communication, they would just not be allowed to associate the data with the IP address of the user submitting the data. In the end, it comes down to trusting the party that collects the data, at least if they don't perform anonymization of the IP address via other means, e.g. by passing the information through a third party proxy server.
BTW, GDPR does forbid to turn on such data collection by default (privacy by default), so they would be required to get the explicit opt-in from the user for that.
My understanding is that many of these details are yet to be settled with GDPR. The case referenced above was not interpreted under GDPR, which has yet to take effect. The definitions of personally identifiable data are rather vague, and precedent has not been set. A quick search showed conflicting opinions, but one perspective to consider is quoted below:
> In addition, businesses should note that Recital 26 to the recently adopted EU General Data Protection Regulation ("GDPR") states that the test for whether a person is "identifiable" (considered in detail above) depends upon "all the means reasonably likely to be used" to identify that person. The CJEU in Breyer did not directly consider the issue of likelihood of identification. If the BRD was not reasonably likely to attempt to identify Mr Breyer from his IP address, this could potentially give rise to a different analysis under the GDPR. Consequently, it may be necessary for the CJEU to revisit this issue after enforcement of the GDPR begins on 25 May 2018.
This is a few years old, so if you know of some new decision or regulation that clarifies it would be great to know!
Now, you could of course argue that it's often not possible to infer the identity of a person from an IP address (e.g. because it is dynamically allocated by an ISP, or belongs to a proxy server through which many users connect to the Internet), and that it is therefore fine to store it. It would be very hard to impossible though (IMHO) to ascertain that none of the IP addresses you store could be used to identify a specific person (what if, e.g., 5% of the IPs in your data are static?). This in turn would make treating all of your IPs as non-personal data a risky business to say the least, as there will almost certainly be a way to identify at least some of your users from their IP addresses. The fact that you don't know about a particular way of doing this identification is not relevant for this.
My advice: If you do not use a very robust method for making sure that all the IPs you store are non-identifiable I would recommend not storing them at all (or at least truncating them to 24 bits, which does also not always eliminate deanonymization risk though).
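For anyone wondering what "truncating to 24 bits" looks like in practice, here is a minimal sketch using Python's standard `ipaddress` module. The function name `truncate_ipv4` is made up for illustration; it just zeroes the last byte of an IPv4 address so that up to 256 addresses collapse into one value.

```python
import ipaddress

def truncate_ipv4(addr: str, keep_bits: int = 24) -> str:
    """Keep only the first `keep_bits` bits of an IPv4 address, zeroing the rest."""
    # strict=False lets ip_network() accept an address with host bits set
    # and mask them off for us.
    network = ipaddress.ip_network(f"{addr}/{keep_bits}", strict=False)
    return str(network.network_address)

print(truncate_ipv4("203.0.113.77"))  # 203.0.113.0
```

Note that this only hides the user among the other addresses sharing the same prefix, which is why it reduces the risk rather than removing it.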
Are you saying that people can not be identified by their IP address?
I think it's important to talk about this issue – especially the importance of not storing it long-term — but from my perspective the real concern is the industry dedicated to linking and sharing your online activity. Without that an IP has little value and with it they can deanonymize most people without using IPs.
And requires an opt-in under EU law, which makes this entire thing even more ridiculous.
Otherwise literally everything that connects to the internet in some way would have to be treated in that way, and that's not how the law is currently enforced.
Not true. Tor has demonstrated that it's entirely possible to transmit data over the internet without revealing your IP address to the party you're transmitting to.
Latency also doesn't matter here; this telemetry could take 5 minutes to reach its destination and it wouldn't matter, so long as the data is eventually received.
I don't want my browser to collect any kind of data.
No, seriously; why? I don't get this mentality at all.
Let's ignore the exact implementation here for a moment, and assume that Firefox is somehow magically doing this data collection in such a way that it is guaranteed the data collected cannot be traced back to you as an individual. (E.g. "sufficiently anonymous".)
What problem do you have with that, specifically? How does this harm you in any way?
That's kinda beside the point here though, as the GP seemed to be against collecting this data _regardless_ of whether or not it's anonymized. I'm interested in hearing why.
I'm ok with testing things and sending feedback, but when I switch to a production environment, I just want my tools to behave like my tools, not the testing farm for somebody else.
Why should I prove to you that it would harm me? I just do not want it, it should be enough.
Obviously that's ridiculous, right? If the default setting was to not seed, torrent clients would be much less useful for everyone involved. Browsers sending usage stats are much the same way. While no individual user benefits from _their machine_ sending those statistics, it's better for the user population as a whole if the default setting is to send them, since those stats help the browser vendor build a better browser. (And before you cry "privacy", remember that in this context we're talking about a situation where the statistics are being sent in a way that is "sufficiently anonymous" such that privacy isn't an issue. See the GP.)
So while I agree you certainly should have the right to disable sending usage statistics if you wish (just as many Torrent clients let you disable seeding), expecting that to be the default setting is a bit strange.
Imagine I came to your house, and photocopied all your documents.
Don’t worry, I blanked out the name, so it’s completely anonymous, and everything is where it used to be.
Would you be okay with that?
I certainly wouldn’t.
Making this opt-in or opt-out is a question of consent, and choosing opt-out shows that you don’t give a flying fuck about me, and only want your own benefit.
There is a swearword used, but it’s not in the context in any way uncivil, as the plural "you" that it is referring to is an abstract person, a hypothetical entity – not any actually involved person. (In this case, the potential future group of people at Mozilla who might decide to override an explicit choice I made for their own convenience)
The topic is a choice that Mozilla plans to make, and is questioning users and others about.
The decision has not been made.
My argument is that, if Mozilla (and whatever users Mozilla asks to give an opinion) choose to override the current decisions of users who do not want telemetry, and require a new hidden opt-out, then that would be proof that they (as a group) don’t really care about the users' choices.
The user I was talking to has no power in making that choice, nor do I. Nor is all of Mozilla making that choice.
I was using "you" with the meaning of the German word "man", (I’m natively German): one; you; they; people (indefinite pronoun; construed as a third-person singular).
Remove all "we respect privacy" from the advertisement.
I like your faith. However, if this change goes in, and the capability is there, it will get misused. Because, statistically that's how these things go on this planet+capitalism.
Or are you saying you're worried that they could _start_ collecting non-anonymized data in the future? If so, I don't really get that argument either. People always have the ability to change what they're going to do in the future, Mozilla deciding now not to collect this data wouldn't change that.
1. Run an opt-out SHIELD study to answer the question: "how many people can find an 'opt-out' button?". That's all. You launch this at people with as much notice as you would plan on doing for RAPPOR, and see if you get a 100% response rate. If you do not, then 100% - whatever you get are going to be collateral damage should you launch DP as opt-out, and you need to own up to saying "well !@#$ them".
2. Implement RAPPOR and then do it OPT-IN. Run three levels of telemetry: (i) default: none, (ii) opt-in: RAPPOR, (iii) opt-in: full reports. Make people want to contribute, rather than trying to yank what they (quite clearly) feel is theirs to keep. Explain how their contribution helps, and that opting-in could be a great non-financial way to contribute. If you give a shit about privacy, work the carrot rather than the stick.
3. Name some technical experts you have consulted. Like, on anything about DP. The tweet stream your intern sent out had several historical and technical errors, and it would scare the shit out of me if they were the one doing this.
4. Name the lifetime epsilon you are considering. If it is 0.1, put in plain language that failing to opt out could disadvantage anyone by 10% on any future transaction in their life.
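For context on where the "10%" in point 4 comes from: under the standard definition of epsilon-differential privacy, the probability of any outcome may differ by at most a factor of e^epsilon depending on whether one person's data is included. A small sketch of that arithmetic follows; the function names are mine, and the second function uses basic sequential composition (epsilons simply add up across reports), which is the simplest bound and not necessarily the accounting Mozilla would use.

```python
import math

def max_probability_ratio(epsilon: float) -> float:
    """Worst-case factor by which any outcome's probability can change."""
    return math.exp(epsilon)

def composed_epsilon(per_report_epsilon: float, reports: int) -> float:
    """Basic sequential composition: the privacy loss of repeated reports adds up."""
    return per_report_epsilon * reports

# A lifetime epsilon of 0.1 allows roughly a 10% shift in any probability.
print(max_probability_ratio(0.1))

# Ten separate reports at epsilon 0.1 each spend a total budget of about 1.0,
# which allows a much larger shift.
print(max_probability_ratio(composed_epsilon(0.1, 10)))
```

This is why the lifetime figure matters more than the per-report one: the guarantee weakens with every additional report.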
I think the better experiment that is going on here is the trial run of "we would like to take advantage of privacy tech, but we don't know how". I think there are a lot of people who might like to help you on that (not me), and I hope you have learned about how to do it better.
If they start opt-out tracking using the same approach as Google I do not see any reason to use it nor install it for my friends and family. That's some data for you, Mozilla.
You want Firefox to succeed as a browser, but to be able to better compete it needs better usage data.
Wouldn't you prefer for Firefox to be the best browser available, AND also be considerate towards your privacy rights?
At this point why would anyone stay with the company B which broke its promise once, just in the hope that it won't break the promise again? It has already lost the trustworthiness and it also has the worse product. Might as well use products from company A.
Privacy is not a boolean.
If yes (and that’s what you get when you choose opt-out), then we’re done. There is no gradual change there, it’s a binary question if you value the user or your own benefit more.
And that’s strictly incompatible with mine.
First ask, then fuck up. Is that concept so hard to understand?
If you’d do that IRL to someone they’d never talk to you again, it’s the same with Firefox if they do this.
And of course this all started with you saying that you may as well switch to another company's products, a company which you know violates your privacy quite significantly. You still haven't explained why Firefox collecting a small amount of data in a way that tries to minimize any privacy violations means you should just give up any semblance of privacy and use a product that tries to collect as much personal information as possible.
I’ve dealt with these issues before myself.
And I understand well what they collect, how, and why. I understand how painful it is when you have no data on what is used, and how, or not even crashreports.
But there also is a limit to how far you can go, and where consent is required.
And when transmitting anything, or collecting anything, consent is required.
You could make it dependent on situation. If a performance issue occurs, show a bar: "Is this website slow? Click [Here] to submit a report so it can be improved. [Details] [X] Always submit".
This gives the user a far better understanding of what is submitted, why it is needed, it is contextual, and it is still opt-in (but with far better conversion)
And if the way Mozilla gathers data is much more considerate, what results can I expect from it? Better parallel requests and data fetching, hardware acceleration, etc are all features that are missing for me as a Linux user. They don't need my dataset for that, it's probably all in their bug tracker.
I prefer absolute privacy over some minor advantages on irrelevant webpages.
How do you even think this system would work in restricted environments such as governments where even the presence of code that could collect data is an absolute no-go?
For more information, see https://en.wikipedia.org/wiki/Differential_privacy for instance.
If the mechanism works, fine, but why should I use Firefox over Chromium then? Opt-out data collection is in violation to my core beliefs and what I believed to be Mozilla's principles.
Collecting data without asking the user about it is - to me - in violation to the very definition of privacy and calling some way to anonymise data (who guarantees that the cryptographic approach to this is not obsolete in a few years?) "differential privacy" is at the very least dishonest.
Two - DP is only really private over a small data set per individual. If DP were enabled for even two days, you could get a very accurate picture of the sites I visit, since a majority of the domains reported would necessarily be accurate values.
Two: That's an interesting question. You'd need to ask it to someone with more domain knowledge than me.
Here's a concern that comes up from that implementation option: any outliers from the set of existing domains (which would likely simply be implemented as a list of strings) would immediately be able to be called out as a "True" value, while a single reporting of a domain could reliably be called out as a "False" value. Unless, of course, you choose a randomization algorithm which exhibits a very strong clustering trait.
You could also limit reports to those domains which are in the whitelist, but that would voluntarily neuter the reporting; something they seem less-than-eager to do.
Ultimately, it will all come down to the implementation details, which are unlikely to be available until after the opt-in release, and auditable by a remarkably small number of people in the open source community.
For Firefox we want to better understand how people use our
product to improve their experience.
No phoning home. No telemetry, no data collection. No "light" version of the same, no "privacy-respecting" what-have-you. No means No. Nada. Zilch. Try and shovel any of that down people's throats and the idea of Firefox as a user's browser will die.
And now this :-(
I have been using Firefox since before it was called that. I develop my apps in it, even though most of my colleagues have switched to Chrome years ago. Even though it is (or was for a while) slower than Chrome for things like Canvas.
But I use it because I believe in Free Software. But Mozilla keeps disappointing. DRM, bundled third-party apps, analytics, tracking... It is just so very sad. :-(
Also, I have 17 add-ons installed (11 active). At present, of these 17, only 2 will continue working after November when the switch to WebExtensions is enforced.
Where to go from here?
Mozilla fought DRM until the very end and lost. If Firefox is to have any chance at remaining a mainstream browser it needs to support Netflix and the likes. You can't seriously blame them for this, because they are damned if they do and damned if they don't.
EME is implemented as unintrusively, securely and privately in Firefox as possible. No DRM is downloaded or run on your computer until you specifically consent to it, and the DRM components run in a sandbox.
Yes I can, and I will, because they sold out. They sold out their principles for the sake of market share. (And looking at their market share, fat lot of good that did for them anyway.)
Excuse me, but did you support Mozilla with time/money?
> They sold out their principles for the sake of market share.
12% is still better than 1%, and the thing that mostly changed the landscape was the fact that mobile Internet heavily disfavors Mozilla (e.g. Android ships with Chrome, iPhone with Safari), and Google has a heavy advantage when it comes to advertising and engineering.
Yes, I have done. Thank you for the snark.
"It's as if the order to block/redirect the network request was silently ignored by the webRequest API, and this causes webext-based blockers to incorrectly and misleadingly report to users what is really happening internally."
There are probably security reasons why add-ons can't modify about:add-ons. Imagine an add-on that could hide itself by modifying that page.
Please don't spread FUD.
In this scenario, how exactly would Firefox's actions here compromise anyone's privacy?
Instead, it's telling that they are choosing to force people to opt-out. They know that their users don't want this, but don't care.
They still _are_ planning to let people decide for themselves whether to participate (via opt-out), they're just using a default that's more likely to result in unbiased sample data.
Again though, what's your actual concern? Provided this feature doesn't compromise anyone's privacy even _if_ it's enabled, what's wrong with having it be opt-out?
If, as some commenters here [have suggested], this telemetry would help improve Firefox by significantly reducing the amount of time it takes Mozilla to fix bugs and performance issues in the browser, what makes you think that's not worth the risk when other features (such as the performance fixes themselves) are?
You just seemed to be arguing that _any_ amount of risk would be too much, which in my view is ridiculous since, as I said, all new features carry with them some amount of risk.
Unfortunately that's exactly the kind of thing I was talking about, extending arguments to ridiculous extremes.
I have never said any amount of risk would be too much. In this particular instance, I think the risk and the unknowns are clearly too much.
But why? I don't claim to know enough about RAPPOR to say for sure that the risk _is_ worth it, but it seems a little presumptuous to claim it isn't without knowing _anything_ about the project or Mozilla's proposed use of it.
That's why I assumed you were arguing that _any_ amount of risk would be too much; you didn't include any sort of analysis of the risk/reward in your previous comments, and without knowing the risk the only way to conclude this feature is definitely _not_ worth it would be if you already considered the acceptable level of risk to be zero.
It’s Firefox without telemetry, or no Firefox at all.
The only Anonymous Data is data that is never collected. If they collect data it is a violation of privacy.
This doesn't really seem unreasonable to me. Obviously part of the inherent cost of not wanting to be tracked is going to be not having your raw user data included in evaluations of what people want.
> in this case you do not get to set it
Nothing's been decided yet. If this is something you want to advocate for, maybe consider suggesting that in the thread linked in the OP?
Attacks aside, the point is really that in this age of statistical machine learning we should be vigilant against even this sort of data collection. A leak is a leak. Ideally people can opt into providing just enough information for the statistics they want to participate in and no more; realistically, more is always collected.
If your answer is "nothing" then I think you're being unreasonable. Firefox risks compromising security/privacy with _every_ new feature they implement, not just this one, and it's clear from [other comments] in this thread that this feature is just as important for the overall functionality of Firefox as any other feature would be.
You must be kidding me.
That to me suggests the problem isn't that too many people are opting-out, it's that not enough people are opting-in.
This trend towards parentalism in software, especially software that is supposed to be user driven is frankly a steaming pile of garbage.
If you have any shred of pretense of being pro-privacy and pro-user, don't do this, Mozilla.
Additionally, it's not that Mozilla just disregards user privacy here: differential privacy being used would mean that no user has to reveal their private information, but looking at all the data in aggregate would still allow Mozilla to gain useful information on how to make Firefox better.
Because most people don't care, it was decided to implement a feature that is flat out contrary to people caring.
Management decisions like this don't exactly inspire confidence about the future of the browser.
IMHO, this is a bad idea. Many people I know already use Firefox because they're wary of giving Google (Chrome) all their data.
Firefox should make this feature opt-in only.
It's not just about the data, it's about the lack of consent. If you just ask people for permission on the initial startup, I'm sure most people will be fine with enabling it. Last time I installed Firefox, it just showed a tiny bar at the bottom of the window, which is pretty easy to miss. I'd expect fewer dark patterns from Mozilla, that's the kind of shady behavior you see coming from Microsoft. I always try my best to disable or block anything which phones home without explicitly asking for consent.
Tunnelblick is a good example of this being done well. On the initial run they ask if you want to enable automatic updates. It includes the option to disable sending anonymous system information, as well as including a disclosure widget with a brief explanation and a table showing the information that would be sent.
I agree, but note that they are explicitly trying to get more info than they can from the small, biased sample that is users opting in.
Just starting to collect your browsing data is a bad idea (tm) especially if your main claim is "more privacy".
Maybe because most people using Firefox use it precisely because they don't want the browser vendor to track their behaviour?
I wonder how the Torbrowser folks will deal with this.
They get good enough data from the people that have volunteered it. I don't know what makes them think it's biased but I seriously doubt that is true.
Informed, constructive opinion there.
One clear sign of the bias is that the crash rate of the browser goes up massively every time a new version transforms from beta to release. Clearly, it's not renaming that string that makes the browser crash. The populations are just fundamentally different.
To give an obvious example, beta users are overwhelmingly more likely to have up-to-date video drivers. (Which can be seen in crash reports, but is also very logical.)
Most users do not care because they do not understand the true ramifications of their not caring. It is not like they looked at all the data, then made an informed choice to share everything.
FF should be at the heart of caring for users' privacy EVEN IF THEY THEMSELVES DO NOT.
The average person does not understand technology, how much data they are leaking about themselves and how this data can be used against their interests
Taking advantage of that ignorance for any type of gain is unethical IMO. Most companies willfully exploit this collective ignorance; Mozilla should be better than most companies.
Quite the opposite: if they focus too much on regular users, they might get too much noise and never notice issues in the more complex features that only power users tend to use.
You want the heavy users of your product sending in reports, not the average Joe, because he is less likely to even notice an issue.
Higher-level features are less likely to be covered by tests and more likely to break just because of their complexity; however, you won't have many average people using them.
Former Opera people can tell a few stories there.
[EDIT: Firefox branding used to use the word privacy a lot. I can't find it on their website much at all anymore.]
Firefox doesn’t sell access to your personal information
like other companies. From privacy tools to tracking
protection, you’re in charge of who sees what.
Here’s how Firefox protects your privacy
For instance, from the same page 8 years ago: "we have experts around the globe working around the clock to keep you (and your personal information) safe."
Or this quote from the equivalent site 6 years ago:
"And, as a non-profit organization, protecting your privacy by keeping you in control over your personal information is a key part of our mission."
These users will be installing Chrome, not Firefox.
The switch barrier is non existent.
Most of the replies in this thread are ideological. Nobody is arguing CSS rendering speed comparisons and such.
People use Firefox because they like it.
People are irrational but like is huge. Toyota over Hyundai. Vacations at the sea instead of skiing. Firefox over Chrome.
The like is a habit, but if all things are equal and free, a very flexible one.
And this move will also drive them away.
I think not having perfect information about the users is a trade-off that should be made in order to stay an alternative to most other browsers. There are still ways to get more data by other means, though. When it comes to most visited websites, for instance, the Alexa ranking should give a good, if not perfect, idea.
Data is both highly alluring and addictive, as evinced here by Mozilla potentially being willing to shoot itself in the foot to get some. What's to keep this from becoming a frog-in-boiling-water kind of situation? How can I trust that Mozilla is going to adhere to their own stated standards? The easiest answer is that I won't have to because I can just use something else. Personally, the only reason I use Firefox is because it's slightly less convenient to set up a security-patched version of Chromium.
Other people in this thread have made the excellent point that not enough people opting in to data collection is in itself a critical piece of data. Moreover, questions such as "Which top sites are users visiting?" can be answered by looking at data from page ranking services, and then they can go to those sites on their own testing equipment to answer their other questions. A little investment in acquiring this data by not spying and maybe getting a wider array of testing equipment is probably less costly than the potential for loss in market share that they're already struggling to hold.
So we will toothlessly complain, but then the changes will be shoved down our throats, because obviously why would one care what the non-targeted demographic whines about. And of course it will be framed as being 'for our own good', and half of the people complaining will just deal with it, just like the majority already does.
How does seeing which sites users use that need Flash drive their decision-making? Either they support Flash, or they don't.
And- ditto for "Jank" (not sure I understand that term, frankly- why is it capitalized?). Some developers don't optimize well- how is Mozilla going to use this? I think they do a good job over on MDN.
I guess I'd like to be sure I understand what problem they are trying to solve. Maybe they feel like without understanding their users they can't keep up with Chrome. I see people talking about how good Chrome is. And I must admit- it is sweet for me too. But that may be because (1) I don't have it loaded up with add-ons like I do Mozilla and (2) they have optimized for certain sites like youtube and gmail and I just can't get Firefox to work all that well on those sites.
But I'm not convinced that they need my data to fix that.
EDIT: On the other hand, Chrome seems to lose my passwords on every upgrade so it won't be my main browser until it fixes that little issue, which is going on, what, 5 years now?
"Jank" is our internal term for slow, non-responsive interaction with the browser (the capitalization of it in the original message is a little peculiar). If you click your mouse button, and then a second or more later, the item that you were clicking on the screen responds? That's jank. That input form that's not keeping up with your typing? That's jank. And so on.
We can (and do) collect statistics on how much jank people are experiencing, and we can look for ways to improve those statistics, but knowing what particular sites (not complete URLs, just eTLD+1 sites) jank occurs on is much more actionable. Browser developers can go visit particular sites to experience and analyze the jank for themselves, or we can see what janky sites are particularly popular in a given region and focus our efforts on improving those sites--either by doing things more efficiently in the browser, or reaching out to the site developers and asking them to consider changing things to make their site work better in Firefox. (Complete URLs would be even more actionable, but we don't want to collect your complete browser history.)
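To make "eTLD+1" concrete: it is the registrable domain, i.e. the public suffix plus one more label. Below is a rough sketch of the reduction. It assumes a tiny hard-coded suffix set (`TOY_PUBLIC_SUFFIXES`, invented for this example) in place of the real Public Suffix List, and the function name is hypothetical; it is not how Firefox does it internally.

```python
from typing import Optional

# Toy stand-in for the Public Suffix List; a real implementation would load
# the full, regularly updated list instead of these three entries.
TOY_PUBLIC_SUFFIXES = {"com", "org", "co.uk"}

def etld_plus_one(hostname: str) -> Optional[str]:
    """Reduce a hostname to its public suffix plus one label, or None if unknown."""
    labels = hostname.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest so that
    # "co.uk" is matched before a bare "uk" could be.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in TOY_PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the hostname is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # suffix not in the toy list

print(etld_plus_one("images.google.com"))  # google.com
print(etld_plus_one("www.bbc.co.uk"))      # bbc.co.uk
```

Paths, query strings and subdomains are all dropped, so every page on a site reports under the same value.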
The argument for Flash is similar: we can get aggregate usage numbers for Flash, and perhaps see how that correlates to jankiness (or crashiness, or what have you), but having some information on what sites are using Flash makes the data even more actionable, for similar reasons as those given above.
I have been a Mozilla supporter for more than a decade, but this is the wrong move.
I'm not saying whether I support this proposal or not, but here's an example of why this could be useful: Chrome was considering deprecating some API, because it wasn't supported by other browsers and they didn't think that it was used very much.
They collected generic statistics about how much it was used, but the numbers turned out to be much higher than expected, so they were considering leaving it alone. What if some fairly popular website they just hadn't heard of used the API? You might not want to break it, or at least you'd want to get in touch with the site to see if they could move to a more widely supported API.
In the end, they somehow (maybe through spidering, or somebody just happened across it in their own browsing) figured out that the high usage was due to being used by some ad network for fingerprinting. Not only was this not a reason to keep supporting the API, it was a reason to stop supporting it!
Well... there's actually a field for that. I forgot what they call that field because of how niche it is but my friend at google is doing just that.
He said there are math theorems to prove that it's sufficiently anonymized.
He gave the example of the Netflix competition, where researchers were able to deanonymize the data they were given. And his job was to prevent that at Google.
I can see why if you're trying to sell users data while maintaining privacy.
Which, according to Google’s FAQ, https://support.google.com/analytics/answer/2763052?hl=en, just blanks out the last byte of the IP.
Which is useless, because it still includes enough personalized data as to be completely and utterly reversible.
In essence, Firefox will ask itself whether it visited website X and flip a coin. If it's heads, it will lie to the server and send a random boolean. If it's tails, it will not. This way there is no way for anyone (including Mozilla) to know whether you actually visited the website. But the statistics will work out such that the collective data from everyone will give a good representation of all users.
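Here is a minimal sketch of the coin-flip scheme just described (classic randomized response), to show how the aggregate can be recovered even though each individual report is deniable. The function names and the simulated 30% visit rate are my own assumptions for illustration; RAPPOR itself is more elaborate (it adds Bloom filters and two layers of randomization on top of this idea).

```python
import random

def randomized_report(truth: bool, rng: random.Random) -> bool:
    """Heads: send a uniformly random boolean. Tails: send the true answer."""
    if rng.random() < 0.5:
        return rng.random() < 0.5
    return truth

def estimate_true_rate(reports: list) -> float:
    """Invert the noise: P(report is True) = 0.25 + 0.5 * true_rate."""
    observed = sum(reports) / len(reports)
    return 2 * (observed - 0.25)

# Simulate 100,000 users, 30% of whom really visited the site.
rng = random.Random(0)
truths = [i < 30_000 for i in range(100_000)]
reports = [randomized_report(t, rng) for t in truths]
estimate = estimate_true_rate(reports)
print(f"estimated visit rate: {estimate:.3f}")  # lands close to 0.30
```

Any single "True" report is consistent with the user never having visited the site, which is where the deniability comes from; the estimate only becomes accurate with many users.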
I find this a neat technology to collect data in a privacy-preserving way. And there's an opt-out (opt-in won't work because it creates bias and provides messy results).
I really, honestly don't understand why people are so upset.
I am starting to think that they just don't want people to use Firefox.
Yeah, I know it's free software, so I have no right to complain. I just wonder why?
People are insulting the developers, saying Chinese owned, VPN-operating Opera would be better for privacy... there is a lot of nonsense here.
IMO this is not the most needed feature, and I would be happy for Firefox to keep in mind its reputation as a product focused on user privacy.
2. Frankly even opt-out is not acceptable. I can't recommend any software that periodically asks users for data access, since there exist non-technical users who have a nonzero chance of clicking yes to everything. If they are related to me in some way this compromises my privacy also.
This isn't true. Panopticlick collects a ton of data about your browser that this proposal will not. There has been a lot of research done in this area and we know how to collect anonymous datasets. https://arxiv.org/abs/1407.6981
1. The concept is sound.
2. It is implemented as described.
3. It is implemented with no bugs.
4. Mozilla is trustworthy
5. Any third-parties Mozilla involves in this process are also trustworthy.
6. All of the above will remain true.
Doing this would take a tremendous amount of both time and expertise, if even possible. If every piece of software I use makes me do this every year or so, I would get nothing else done.
In practical terms, your argument is no better than just saying, 'trust us, we're good for it', regardless of the merits of your tech. And we know Mozilla baked Google Analytics into FF's addon page, so trust is in short supply.
This is true for the user, too. If the only viable choices are 'verify claims at great cost and no gain every few months', or 'use some other privacy-respecting browser', I am going to recommend the second.
It seems they've convinced themselves that the only way to improve the product is to collect data on their users, rather than continuing to push the idea of privacy - which, in my opinion, if marketed correctly, could win over a lot of users. The browser is still fundamentally awesome.
This seems like the kind of thing they could push through their TestPilot program and just market it, rather than pushing it to everyone by default. But I imagine they want to push it to everyone specifically so they can take advantage of those who are ignorant to the ability to opt-out.
Otherwise I might as well just use Chrome. Hopefully some PR guy will pour some water on this before it turns into a dumpster fire.
I would say that is none of the browser vendors business.
Please stay away with your opt-out stuff - it bothers me. Make it opt-in, always and forever.
Why participate in a no-op?
source? I never saw anything addressed other than "don't worry about it, it's for your own good"
This is the best I can do not being involved and two years after the fact.
1. You can not collect anything without explicit opt-in
2. You can not transmit any data to a third party
3. If a user requests it, you have to provide all data stored about them, and have to provide a way for them to delete all of that. (And you have to provide this at least once every 12 months via letter, fax or email for free) (compare §34 BDSG)
That includes IP addresses (just connecting to a socket without a user explicitly starting that action), names, emails, hashed IPs, it includes usernames, CC data, messages, interactions with webpages.
Anything that in any way is connected to a person is covered by this.
This directive is also the origin of the cookie disclaimers, which require opt-in before collecting statistics or loading any third party tracking solution.
But be aware, in May 2018 it all changes as the new EU GDPR comes into force, and that’s a bit more restrictive (and even applies to anyone processing or storing data of EU citizens, no matter where the processing entity is located)
(well, unless you're going to switch to Safari, since Apple also cares about privacy, though, spoiler alert, Safari's also using Differential Privacy to collect data).
First it was killing customization.
Now they are killing Privacy.
Why should I use this browser again?
Who runs Mozilla, do they understand why anyone would choose Firefox over Chrome?
Maybe it's time to put a spotlight on the management and decision making structures of increasingly important open source projects like Firefox to ensure they are being run in the public interest.
By now people should be aware that it is not just the content that is important, but also the metadata. A browser that phones home with information on users' browsing habits is not acceptable to many of us, who will move to forks or a different browser altogether. This from one of the people who "doesn't complain, but just never goes back."
Could someone explain to me how this information is useful to a browser vendor? It's not as if they are optimizing on a site by site basis.
I'm sure you already know all this and I'm sure people are getting sick of hearing rants about it every time it comes up. This is the second time in a week for a Mozilla product. I suspect they're trying to exhaust the ranters so they're just left with the users who don't care, "have nothing to hide", or think it's their duty to help the browser vendor squash bugs. No software or service should be trusted until it's absolutely necessary to get the job the user wants done, not the job the browser vendor wants done. It will never be necessary for a browser to send browsing data back to the browser vendor to get to a website.
How many people here use website/app analytics to improve products they work on?
No, it is not. Especially not for something such as a browser which is mostly transparent to the content.
Or why any external data would be needed at all, let alone why the opt-in data would not be sufficient?
> Name resolution APIs and libraries SHOULD recognize "invalid" names as special and SHOULD always return immediate negative responses. Name resolution APIs SHOULD NOT send queries for "invalid" names to their configured caching DNS server(s).
It's only SHOULD, not MUST. And in fact, the glibc resolver (and I bet also other major implementations) does send such queries to the DNS server.
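To make the quoted SHOULD concrete, here is a minimal Python sketch of what "recognize 'invalid' names as special" could look like in a resolver wrapper. The names `is_invalid_name` and `resolve` are hypothetical, and this is not how glibc or any other real resolver is implemented; it only shows the short-circuit the quoted text asks for.

```python
import socket

def is_invalid_name(hostname: str) -> bool:
    """True if the name is "invalid" or ends in ".invalid" (case-insensitive)."""
    name = hostname.rstrip(".").lower()
    return name == "invalid" or name.endswith(".invalid")

def resolve(hostname: str, port: int = 80):
    """Hypothetical wrapper: answer "invalid" names locally, without a DNS query."""
    if is_invalid_name(hostname):
        # Immediate negative response; nothing is sent to the DNS server.
        raise socket.gaierror(socket.EAI_NONAME, "special-use name: invalid")
    # Every other name goes through the normal system resolver.
    return socket.getaddrinfo(hostname, port)
```

A resolver that skips the check simply forwards the query, which is the behaviour described above for glibc.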
Using the glibc resolver as baseline is a bad idea, it’s broken beyond hope.
Try resolving http://-emmawatson.tumblr.com/, which is a valid URL under newer standards, and works on all other systems. The Glibc authors refuse to merge patches fixing this, because they disagree with the standard.
For those interested in understanding more about this project and why we're doing it, here you can find an introduction to Differential Privacy and what we're trying to do. https://twitter.com/Alexrs95/status/896366072240144385
1. You will absolutely obliterate any trust you have with actions like this. This is important. If you continue to ignore this, you will have tons of data but you will be absolutely clueless as to why your product and brand are completely abandoned.
2. This data isn't worth that much to begin with. Here is a crazy idea, try to make a better browser instead.
Also, just because a site is part of the top-N doesn't mean that it's part of the top-N for Firefox users.
Meeting the standard of true differential privacy is one of the strongest known unconditional privacy guarantees. It will prevent Mozilla from being able to answer any user specific questions. For example, they might have an accurate count of how many people visit Google.com (say 60% of their user base), but they will be mathematically unable to point to exactly which 60% visited the site.
Differential privacy in the RAPPOR implementation is peer reviewed and well understood. We can also review the actual code that ships in Firefox, which is a big plus over the Chrome implementation. There are some caveats -- what epsilon are they setting, are they adding an appropriate amount of noise, how do they protect against repeated queries, etc. but all of these can and will be reviewed by the differential privacy community.
I am not affiliated with Mozilla or Google, though I do work in the field of differential privacy. On mobile now, but I am happy to provide links or answer questions to people who might have any when I am back at a laptop.
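For readers who want to see how an accurate aggregate count can coexist with deniability for each individual, here is a minimal Python sketch of plain randomized response. This is the textbook building block, not RAPPOR itself (RAPPOR layers Bloom filters and two rounds of randomization on top), and the parameter `p_truth`, the function names, and the simulated population are all assumptions made for illustration.

```python
import random

def randomize(visited: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true bit with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return visited
    return rng.random() < 0.5

def estimate_fraction(reports: list, p_truth: float) -> float:
    """Undo the noise in aggregate: E[observed] = p_truth * f + (1 - p_truth) / 2."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) / 2) / p_truth

rng = random.Random(0)
p_truth = 0.5  # with this value a "yes" is reported w.p. 0.75 or 0.25, so epsilon = ln 3
truth = [i < 60_000 for i in range(100_000)]  # simulated: 60% of users visited the site
reports = [randomize(v, p_truth, rng) for v in truth]
estimate = estimate_fraction(reports, p_truth)  # close to 0.6 for a population this large
```

Any single report is wrong a quarter of the time, so it says little about that one user, yet the noise averages out across the population. Lowering `p_truth` strengthens the guarantee and makes the estimate noisier, which is the epsilon trade-off mentioned in the caveats above.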
> We can also review the actual code that ships in Firefox, which is a big plus over the Chrome implementation.
Sure, but note that it's been implemented already and will be pushed to the users as an add-on, without going through the full release process. Even this HN post seems to have been prematurely buried.
It would have been possible for this to be deployed without anyone knowing. Post-hoc reviewing of functionality as sensitive as this is not the ideal solution.
More Data Collection < Less Data Collection
I design optimization algorithms and software professionally and the majority of that software is released open source. Now, does my software likely run terribly on some problems that my users give it? Absolutely. That probably costs me business because they get frustrated, give up, and go somewhere else. And, to combat that, I could absolutely engineer my libraries to send anonymized information about their problem structure back to my company. Certainly, it would help me improve my software and algorithms. I also view it as horribly unethical, a breach of my customers' trust, and an unacceptable course of action. Look, I want my software to work well for everyone, but it's part of my job to figure out when things don't work well and fix that without automatically scraping information about my customers' uses.
I contend this is a terrible idea and very much would like Mozilla to abandon it.
If it is opt-out, then Mozilla would have to be extremely open about how to opt-out and exactly what is tracked.
The real issue here is with ethos and perspective. I use Firefox because the ethos of the company and its employees, and their general "take" on issues like this allows (or has allowed) me a general sense of trust in them.
Even the very existence of this discussion erodes that trust. This says to me "the people making this browser don't understand the importance of consent, and have a vastly different perspective on the value of privacy to mine".
If your developers need more data from Telemetry, get consent and collect more data. Establish trust in users in what you do with that data.
> One recurring ask from the Firefox product teams is the ability to collect more sensitive data, like how features perform on specific sites.
> [for example]: "Which sites does a user see heavy Jank on?"
On the other hand, everyone complains Firefox is slow.
So, few pay, few opt in, and everyone complains.
There's the answer. And the response? "Tough shit", we'll take away that choice granularly. For our own good, apparently.
Moz has been giving tough shit with caveats for more than a few years now. Perhaps that is why market share is falling?
The question is whether people are saying no because they are privacy conscious, or because they don't care. My money is on the latter. In general, more people care about Firefox being fast than about security.
What's a bigger issue for Firefox is deprecating its add-ons. That's going to hurt its marketshare way more than telemetry data.
I really don't understand this from the user, when in the previous sentence he writes:
"but I will say that I believe Opt-in is pro-privacy, while Opt-out is anti-privacy."
FHR is also opt-out, i.e., enabled by default. If you do not like this, you may want to disable this as well.
On Windows, Microsoft Edge is of course a lean and capable browser, but the OS itself is also collecting telemetry on you at all times, including browsing habits.
Hopefully someone will fork Firefox for Windows/Mac/'nix and strip out all the telemetry and data gathering bits, otherwise there's not much choice left for a privacy focused, full featured, fully supported browser on all platforms.
But you are correct, if Firefox falls as the last bastion in the web browser world to protect users' privacy, there is little choice left among really nice browsers.
If / when Firefox does this however there is really no real reason to even pick Firefox to begin with since all the other major players do the same.
"Brave makes money by taking 5% of any donations and -- after it is fully implemented -- a small cut of advertising that is placed. Brave even shares some revenue with you -- at least as much as we receive."
Then there's this:
If they are planning to inject ads into the browser and somehow pay their users a kickback, how do they expect to maintain a reputation as a privacy focused project? Even if they offer to pay in cryptocurrency, they are still tying browsing habits and targeted advertisements to a trackable user. No thanks.
I had switched from Chromium back to Firefox after Google was caught injecting binary blobs into Chromium at build time, and so last year when I decided to drop all Google products from my life, I already had a great (if slow) browser making it less of a hurdle. Now, though, I think I'll stick with Safari on my phone and Mac, and find a way to sync bookmarks from Safari to Midori on my other systems.
But I much rather just continue to use Firefox.
The real problem seems to just be marketing though. Regular people either don't see any reason to consider looking for an alternative browser or don't understand the differences. Years ago Firefox had a larger market share because the internet as a whole had a larger share of tech-savvy people and they had IE as a competitor.
Greed and monetization of users' data - this is the only real business model which brings profit. There is, probably, already a long queue of customers willing to pay for the data.
And, of course, "anonymity" is nonsense. The whole point of collecting data is to classify users into target groups and model users' behavior. In other words - to collect the data for machine learning algorithms and sell access to the datasets and other services.
With the 300+ upvotes and active discussion still occurring at hour 7 since posting, care to explain how the algorithm has relegated this discussion to the fourth page and is still dropping? Is the much less active, day old posting of a SF author's death more heavily weighted than an inconvenient topic that is important to several more factors of readers?
In my experience, the quickest, most reliable way to contact the mods is via the Contact link in the footer. You might want to try that as well if you're looking for an expedient response.
One of the last truly shiny examples of open source is losing the plot. Not only does it require PulseAudio (ALSA?), it is also getting harder and harder to use it normally with FreeBSD. Now this.
I've had enough, testing links -g and it works well for most of my browsing needs.
In theory it can be done, but in practice they are competing with Chrome and their team has waaay more data to use. And this data gives them an edge at least on the decision which parts are worth improving.
So they can either start collecting some data and really piss off their most vocal privacy minded users and try to use this data to improve FF and steer it away from the death spiral it's on. Or they can keep the vocal privacy minded people happy, continue to work in the dark, and pretty much ensure that FF will become one of the insignificant .3% market share browsers.
Because somehow I kind of feel that multimillion fundraisers to make FF popular again aren't gonna happen a second time.
The problem with our own browsing data--by which I'm assuming you mean the browsing habits of our ~1000 employees--is that it's wildly non-representative of the broader population. For instance, people here routinely have browser sessions with 10, 100, or even 1000+ tabs. These numbers also indicate that the browser is an application you start, and then you just leave up for a while, perhaps until you restart your computer or you have to update for whatever reason.
The latest statistics we collected on a broader sample of users indicate that the average number of tabs is...2. The average session length is on the order of minutes, not days. Such knowledge leads to very different choices when deciding what browser features to prioritize.
And it's not just browsing usage, either: most employees probably have a top-of-the-line (or close to it) Mac laptop, Windows desktop, or Linux desktop; developers have a machine with four, eight, or even more cores. These machines are hardly representative of the wider Firefox user base: a significant majority of our users (~70%) have a machine with two cores, and users with a single core in their machines outnumber users with 8+ cores. We'll not even cover graphics hardware or screen resolution here; see https://hardware.metrics.mozilla.com/ for more examples.
Using our own browsing habits and our own machine specs for making decisions is not feasible.
I value the expertise at Mozilla. Could you point to a browser that might fit me?
It is not on a death spiral because of the missing user tracking. It's on a death spiral because of Google and Chrome. And tracking is a way of catching up. It's much easier to improve when you know what your users do with your software.
> They don't plan on collecting URLs, just (eTLD+1).
This is true as of right now, but can change at any time in the future. From the post:
> What we plan to do now is run an opt-out SHIELD study to validate our implementation of RAPPOR. This study will collect the value for users’ home page (eTLD+1) for a randomly selected group of our release population
This test consists of collecting domains, indeed, but that doesn't say anything about what will happen in the future.
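For anyone unsure what "eTLD+1" keeps and what it throws away, here is a minimal Python sketch. It assumes a tiny hand-picked suffix set (`PUBLIC_SUFFIXES`) purely for illustration; real implementations consult the full Public Suffix List, and `etld_plus_one` is a hypothetical helper, not Firefox code.

```python
from urllib.parse import urlsplit

# Assumption: a tiny hand-picked suffix set for illustration only.
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}

def etld_plus_one(url: str):
    """Return the registrable domain (public suffix plus one label), or None."""
    host = (urlsplit(url).hostname or "").rstrip(".")
    labels = host.split(".")
    # Scan from the longest candidate suffix to the shortest, then keep one more label.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return ".".join(labels[i - 1:]) if i > 0 else None
    return None

home = etld_plus_one("https://images.google.com/search?q=private+query")  # "google.com"
```

The path, the query string, and the subdomain are all dropped before anything would be reported; only the registrable domain survives.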
> Note: "planning" means "reaching out for feedback about".
Planning means planning. Today they're reaching for feedback, and the plans might change or not.
> Hello, Redditors...
This is my fault, I suppose, for posting the link here :). Many of the angry comments are uninformed, but the users, educated or not, are stakeholders here and Mozilla should be prepared for the fallout. There have been situations in the past (Pocket, Google Analytics) where well-formulated feedback from users was readily dismissed.
> One recurring ask from the Firefox product teams is the ability to collect more sensitive data, like top sites users visit and how features perform on specific sites. Currently we can collect this data when the user opts in [...].
Does anyone know what this is about? Telemetry? Because I will disable it if so.
> Allow Firefox to install and run studies
This is from the Nightly settings page but is pointing to https://support.mozilla.org/en-US/kb/shield, which doesn't exist (yet?). For anyone interested, there's a wiki page about them https://wiki.mozilla.org/Firefox/Shield/Shield_Studies.
> What we plan to do now is run an opt-out SHIELD study to validate our implementation of RAPPOR.
This still sounds bad enough to forever poison "SHIELD" for me. It's also terribly named because it doesn't "protect" anyone.
> No telemetry, no data collection.
Without telemetry it would be almost impossible for the developers to figure out what works or not, and what's fast or not in Firefox. There's a whole spectrum here from "no telemetry" to "creepy". Please don't ignore this.
> Now they are killing Privacy.
Please try to get informed. A Mozilla employee in this thread (alexrs95) posted a series of tweets about what's being proposed: https://twitter.com/Alexrs95/status/896366072240144385. It's short enough, so please read at least that before complaining.
> What's your favorite open-source browser?
> I've removed all URLs from about:config and replaced them with localhost (search for "http"). This should help with privacy-related issues as long as no API endpoint is hardcoded.
Beware of SHIELD, as Mozilla may still have the ability to push extensions to the browser.
> He said there are math theorem to prove that it's sufficiently anonymize.
I've not dug deep enough into the RAPPOR paper, but they do consider in passing the possibility of an attacker that has access to all of the collected data (think https://en.wikipedia.org/wiki/National_security_letter).
> Everyone else
Please be kind.
EDIT: Looks like this post might have been pushed back from the front page by a moderator. I'm not sure I'm fine with that.
My reasoning is that people might search for my comment (I sometimes do when others post), but by the time I wrote it, the first comment page was full and it ended up on the second one.
It's like describing to a spouse a system of sex with strangers that includes blindfolds and a hazmat suit. Such a system could be a great way for a person to learn more about their sexual tastes and improve coitus overall. If the spouse is anxious about the system, then all they have to do is find a problem with the hazmat suit that would endanger the...
Wait, spouse, you haven't even studied the system that I so carefully designed to protect you from the possibility of...
Has anyone seen my spouse?
EME is not DRM, it's a fully open source spec to support third-party DRM modules. If you don't actively choose to install a DRM module, there is no DRM in your Firefox.
> 3-rd party apps
Like what? Pocket is fully owned by Mozilla.
> analytics, tracking
So far this has been 100% opt-in. It might change with this new thing, but even that's not for certain.
Please don't use uncivil internet tropes on HN. If you have a substantive point to make, make it thoughtfully. Your comment would be fine without the first sentence, but experience unfortunately teaches that flamebait has more impact than its accompanying substance does.
We detached this subthread from https://news.ycombinator.com/item?id=15072694 and marked it off-topic.
Can we please stop with using the word FUD for things that are not? The very idea of accepting DRM as a possibility in the browser was a slap in the face to those who believe in internet freedom.
> Like what? Pocket is fully owned by Mozilla.
Check your facts: it was added to FF 1.5 years before Mozilla bought it. It was an example of them just not giving a shit and adding it anyway.
> So far this has been 100% opt-in.
Check your facts: the Google Analytics in the extensions page that I linked is explicitly "opt-out" and even that happened only because people found out about it and (rightly) raised a stink.
(And really, of all the analytics choices, they fucking picked Google Analytics?!)
Pocket's integration during that time amounted to a button which sent a request to a potentially useful service. Such buttons existed in Firefox long before Pocket ever existed, and you are of course welcome to not click on them.
I think perhaps you are saying that a pattern of information should not be locked up, but instantiations of a pattern can be.
So, it exists only to facilitate DRM. And I could have sworn that it defaults to enabled?
> Pocket is fully owned by Mozilla
It wasn't when it was added.
>> analytics, tracking
> So far this has been 100% opt-in
Other than the Google Analytics on the add-on pane, maybe.
It does not.
firefox.exe 3588 TCP pc-name 49172 ec2-35-167-184-4.us-west-2.compute.amazonaws.com https ESTABLISHED 3 667 5 3,334