Hi all, Founder and CEO of DuckDuckGo here. I’m literally just waking up and reading the comments here.
I’m new to this issue and happy to commit us to move to doing this locally in the browser and will have us move on that ASAP.
That said, I want to be clear that we did not and have not collected any personal information here. As other staff have referenced, our services are encrypted and throw away PII like IP addresses by design. However, I take the point that it is nevertheless safer to do it locally and so we will do that.
If anything, it's much better than 'no big deal'. It's "We made this design decision, thought you would like it -- we've learnt, changed, and will avoid it later".
Can you imagine Google doing something similar? Heck, they're just about to throw the Android rooting community under a hardware-attestation DRM-filled bus.
Could have been quicker but it's still far better than some smarmy non-apology and continuing as usual, which is sadly what we've come to accept. I don't think they've done too poorly.
They started a fire through mild negligence, denied the fire existed, and only put out the fire when the entire neighborhood started yelling.
It was a forgivable-but-negligent decision to write/approve that code in the first place. It was a sign of a bad process that a reported security vulnerability was not escalated to people security-conscious enough to immediately identify this as a major problem.
I don't agree with the outrage. Anyone who has followed DDG knows they're legit. They just need to do a bit better. They probably will.
Their main feature is privacy. They should be at least as sensitive to privacy vulnerabilities as their most aware users.
DDG should announce that they now pay out privacy-related vulnerabilities like this and send the reporter $5k. It would be good honest PR and well worth the expense.
Correct me if I’m wrong, but by default DDG uses redirects to prevent leaking your search queries through the referrer, so they already can technically see every URL you visit. Except their whole product and system is designed around protecting privacy and not storing that data. If the favicon endpoint respects the same rules (which it obviously does), it is no different.
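For anyone unfamiliar with the redirect trick mentioned above, here is a minimal sketch of the idea. It is not DDG's actual code: the endpoint `search.example/l/` and the parameter name `u` are made up for illustration. The results page links to a redirect URL that carries only the destination, so the destination never sees the search URL (and its query) as the referrer. The flip side, as noted, is that the redirect endpoint sees every clicked URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical redirect endpoint; host and parameter name are assumptions.
REDIRECT_ENDPOINT = "https://search.example/l/"


def wrap_result_link(target_url: str) -> str:
    """Build the link shown on the results page: it carries only the target URL."""
    return f"{REDIRECT_ENDPOINT}?{urlencode({'u': target_url})}"


def unwrap_result_link(redirect_url: str) -> str:
    """What the redirect endpoint does: read the target and send the browser there."""
    query = parse_qs(urlsplit(redirect_url).query)
    return query["u"][0]


if __name__ == "__main__":
    link = wrap_result_link("https://example.org/page?x=1")
    print(link)
    print(unwrap_result_link(link))
```

The search terms never appear in the wrapped link, which is the whole point; whether the endpoint logs the target is a policy question, not something the code can show.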
Wow. That was really not very complex at all... Did they leave off some capabilities that the server-side version performed, or was the complexity overstated?
Creating a throwaway account to disparage a particular point of view is also questionable, no?
At what point did we stop taking people's word and commitment as valid? Sure, I too want to see proof that they are doing the right thing here (because I don't understand the design decisions that led to the creation of that service in the first place), but because these changes are not immediate, this statement does at least answer some of the questions I personally had (Like "will this be fixed"? "when"?).
> At what point did we stop taking people's word and commitment as valid?
At the point that they were caught violating privacy while claiming to respect it, and standing to make a profit off of violating it.
If I trusted people as blindly as people seem to trust DuckDuckGo, I would have trusted Google to not be evil and never have switched to DDG in the first place. This breach of trust destroys the whole point of using it, so I switched my default search to Searx between reading this submission and writing this (had been procrastinating the switch for a while and this was exactly the motivation I needed).
I will of course be keeping an eye on whether or not they follow through, but the admission of this being a problem and committing to fix it is -far- more than we see from most companies these days. Is it everything I ever could have wanted out of the company? No. However, it is a positive direction I would like to see more of, so displaying my appreciation is hopefully validation for the company to continue in that direction.
It's a little confusing to see 2 accounts, with very different usernames (epi0Bauqu and yegg), that appear to be posting as Gabriel Weinberg. Are these both legitimate? And if they are, what's the reason for two of them?
Yeah that's my bad. epi0Bauqu is my original account, but since no one knew who I was under that account I, some years later, made the yegg account and I try to post from there. The issue here is I logged in to post the comment, and then switched to my laptop where I was already logged in this other account.
It was only a minor issue that got resolved quickly thanks to your question, but DDG's response to this whole issue of losing users' trust seems to be accurately characterised by the phrase "a succession of embarrassing own goals". :-/
We never used this data, other than showing favicons.
In fact, we didn't (and don't in general) collect any user-level data in the first place, per our strict privacy policy: https://duckduckgo.com/privacy
In this case, the way it works is you hit our favicon service and it returns the favicon, not using any PII in the process, and our web servers are configured not to log any PII. In other words, our system is technically designed and we are legally bound by our privacy policy to not use this data for anything other than showing favicons.
You might want to clarify you never retained any PII, unless you can 100% confirm that that favicon request didn't include any embedded user- or installation-id in cookies, headers or the like?
However, this problem demonstrates gross incompetence for a browser team supposedly concerned with privacy. Will you please do a post-mortem on how this code made it through your code review process in the first place, as well as how it managed to stay in place for a full year after it was pointed out that it represented a privacy problem?
"Sends every URL you visit to the vendor's servers" is the single worst thing DuckDuckGo could have done for privacy in this web browser, and that needs to be accounted for. There was a major failure in the code review process, ticket review process, and in how you treat your community. A standard marketroid "by design" response with washy promises that "we'll take very good care of this highly sensitive personal data, just trust us" is not something I want to see in the future from this team.
I’ve worked with many companies who have demonstrated “gross incompetence” when it comes to privacy and information security. This is absolutely not an example of gross incompetence.
I agree that for a company built around privacy even the appearance of impropriety needs to be avoided. DDG holds themselves to a higher standard and their users hold them to a higher standard.
This was a design flaw and a process flaw. DDG prioritized speed and efficiency over privacy (or in this case, perceived privacy) and I suspect there isn’t a soul on HN who hasn’t made that trade off at some point. They assessed the cost/benefit and risk/reward and it turned out their assessment was wrong. Now they’re fixing it. It happens. But to call this gross incompetence is really blowing it completely out of proportion.
I'm not blowing it out of proportion. This one specific "design flaw", if we're being generous, has been raised many times with many different browser vendors and add-on vendors as a very bad thing that you cannot do. There is plentiful wisdom on this issue.
The first rule of privacy is never handle the private data in the first place. An accidental leak is one thing, but deliberately designing a feature whose side effect is exfiltrating heaps of private data, then doubling down on it for a year after it's pointed out to you, then doubling down again when it's raised on HN - this is gross incompetence.
You can’t think of anything they could have done that would be worse than sending URLs to their lookup server? It’s the single worst thing?
My browser syncs URL history between my devices, and that’s a feature that I value about it. Your comments on this topic seem to suggest that all users are making the same decisions about what is acceptable usage of their data, and that’s pretty obviously not true.
Maybe you’re taking this a bit too far? They explicitly state they will not store your data anywhere, and the main safeguard you have for that is your trust in them, not this one specific line of code somebody happened to notice which can’t even break that promise on its own.
Privacy is never built on trust. It's built on mathematical and logical facts. The only effective way to keep data private is to never handle it in the first place.
Looks like they developed the favicon service for their search engine so they could show favicons in search results.
It actually increases privacy there: DDG already knows what domains your search returned, and the alternative would be fetching favicons directly from websites, leaking information to them before you even click.
Then they re-used the favicon service for their browser without rechecking the privacy issue and realising that browsers have different privacy needs.
Sure, humans make errors, but for a privacy-concerned company this service is like you or me walking through a red traffic light without looking left or right and being hit by the biggest truck one can imagine. Anyone who believes this was an error is beyond help.
The issue was highlighted from open-source code from a company who proactively works to reduce privacy.
It seems rather reasonable to assume that in the goal of maximising privacy in search engines, it would be wise to add user-friendly and convenient features so that the alternative is appealing and does not sacrifice too much convenience.
> That said, I want to be clear that we did not and have not collected any personal information here.
So, you do collect information just not the kind you would classify as 'personal information'. I wonder if my personal domain with my full name qualifies?
There is no way this feature would be created in a company built on privacy considerations.
Sure it could. A junior programmer aware of the values but not the repercussions could easily overlook this without any malice or disregard for privacy.
> So, you do collect information just not the kind you would classify as 'personal information'. I wonder if my personal domain with my full name qualifies?
Considering they have operations in the EU, I would imagine that would fall under what the GDPR considers personal information (https://gdpr.eu/eu-gdpr-personal-data/), or risk being in breach of it.
To be more clear, your staff, and you, have said PII ‘like IP addresses’, and have said ‘thrown away’ some places and ‘not collected’ others.
Contrary to this framing, it’s not possible to avoid incidentally becoming aware of every single browser user’s usage timing and IP address if the browsers are phoning home this way. That is the colloquial understanding of ‘collect’, not the James Clapper NSA dodge definition of ‘collect’. Most normal people think of ‘collect’ as ‘become known’, not as ‘permanently store’. You knowing it means others can know it if you break trust or are required to comply with authorities.
And regardless of end-to-end encryption, that this user is phoning home to your favicon endpoint, when, and from what IP, is revealed to every ISP in the chain. You’re leaking browser usage telemetry to every single party to that traffic: the source IP address PII you mention is in unencrypted metadata.
The fact this browser connects to that endpoint reveals demographics (choice of privacy browser) and behaviors (when and how much web surfing) to e.g. ISP or nation state firewall operators who are certainly not bound by your ‘just trust us’ privacy policy.
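To make the metadata point concrete, here is a rough model of who sees what for a single TLS-protected request. It is a simplification under stated assumptions: the field names are mine, and I assume the destination host is visible on the path (via the destination IP, and typically the TLS SNI), while the request path is only visible to the endpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    source_ip: str    # visible in the IP header
    dest_host: str    # assumed visible on the path (destination IP, typically SNI)
    timestamp: float  # any observer can record when the packet passed
    path: str         # inside the TLS payload, e.g. which site's icon was requested


def on_path_view(req: Request) -> dict:
    """What an ISP or firewall operator on the path can record."""
    return {
        "source_ip": req.source_ip,
        "dest_host": req.dest_host,
        "timestamp": req.timestamp,
    }


def endpoint_view(req: Request) -> dict:
    """What the server terminating TLS can see before it throws anything away."""
    return {**on_path_view(req), "path": req.path}


if __name__ == "__main__":
    # Host name and path are illustrative placeholders, not a real endpoint.
    req = Request("203.0.113.7", "favicons.example", 1593600000.0, "/example.org.ico")
    print(on_path_view(req))
    print(endpoint_view(req))
```

Even with the path hidden, the on-path view already says "this IP runs this browser and was active at this time", which is the telemetry being described.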
Privacy policies are a patch for insufficient privacy engineering.
To be a strong privacy browser you could consider what it would take to be “NSL proof” such that if handed a national security letter with gag order, you cannot comply. That is not the case with this favicon telemetry endpoint.
Just to be fair: by surfing that site you are already revealing to your ISP and state actors that you are surfing it. Changing where the icon comes from (origin vs. DDG) will not change this fact.
It is all about DDG not getting to know which sites you visit when you are not searching for them on DDG. Which should, indeed, be a no-brainer.
"You surfing that site" does not add a single surveillance point for every user of a particular browser, and it does not compromise your other privacy measures (e.g., a VPN that prevents your ISP from knowing) by sending your browsing data out to that surveillance point.
Getting the icon from each site means surveillance would have to be at the origin of every site, while telemetry going to DDG gives a single surveillance point.
I've already posted this somewhere else, but I'll copy it here again as well:
It's not immediately obvious whether it is more privacy preserving if the client automatically makes a request to each site in the search results while scrolling through the results, especially since you're already trusting DDG when performing the search.
Maybe this should be an opt-in rather than an opt-out feature?
All in all it's really not as big of an issue as people here make it out to be.
This was not about search results, where their favicon webservice is in fact privacy increasing, but about the privacy browser and the favicons it displays, where it is privacy decreasing as it involved sending information about visited sites to a central authority while you are not on the DDG search engine. For example the TabRenderer will fetch the favicons from DDG instead of from the site you are actually visiting: https://github.com/duckduckgo/Android/blob/db728523240e37727...
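A minimal sketch of the difference, not the app's actual code: the host `favicons.example` and the URL layout are placeholders I made up, and the "local" variant assumes the icon lives at the conventional `/favicon.ico` path. The point is only which hosts end up receiving a request when you load a page.

```python
from urllib.parse import urlsplit

# Hypothetical central favicon service; host and URL layout are assumptions.
CENTRAL_FAVICON_HOST = "favicons.example"


def central_favicon_url(page_url: str) -> str:
    """Icon URL when the browser asks one central service for every site's icon."""
    host = urlsplit(page_url).hostname
    return f"https://{CENTRAL_FAVICON_HOST}/{host}.ico"


def local_favicon_url(page_url: str) -> str:
    """Icon URL when the browser fetches the icon from the visited site itself."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/favicon.ico"


def hosts_contacted(page_url: str, use_central_service: bool) -> set[str]:
    """Hosts that receive a request when the page and its favicon are loaded."""
    if use_central_service:
        icon_url = central_favicon_url(page_url)
    else:
        icon_url = local_favicon_url(page_url)
    return {urlsplit(page_url).hostname, urlsplit(icon_url).hostname}


if __name__ == "__main__":
    page = "https://example.org/some/page"
    print(hosts_contacted(page, use_central_service=True))
    print(hosts_contacted(page, use_central_service=False))
```

With the central variant, every visited domain shows up in a request to a third host; with the local variant, only the site you were already talking to is contacted.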
Thanks for pointing out that the service was already in use on their search results pages. To me, this goes a long way toward explaining how this could have happened:
Scenario #1 - "We need to show favicons in our browser tabs. Let's develop an API that requires every domain be sent to us!"
Scenario #2 - "We need to show favicons in our browser tabs. Hey look, we've already got a service that provides this. We know it collects no PII and our users trust it already."
Obviously the second scenario is flawed thinking, because (of course) it's better to not send that info at all. However, I can easily see how their developer(s) may have arrived at the conclusion that this is still compliant with their privacy ethos.
The fact that the favicon service already existed (and was trusted by users) before this was implemented, makes it much easier to understand how this could have been a legitimate mistake and thus, they deserve the benefit of the doubt.
Ah, I guess I should have read TFA, because search results have a similar feature (that is opt-out). For a browser I agree it makes more sense to get the icon directly from the site being visited without any privacy risks.
Yes, after consulting with staff, I understand we thought it was more privacy protecting because we know our services are already encrypted and throw away PII, and so to get the favicon you could either (a) make another request to our known anonymous service or (b) make a request (or possibly multiple) to a non-anonymous service. On the other hand it is another request to a distinct domain that traverses another path on the Internet, albeit an encrypted one.
Exactly. Everyone should decide for themselves how big of an issue it is for them and not one person who just shrugs it off. That person made their statement sound like a universal fact.
I think the most charitable (and probably the default) reading of that clause has an implied "I believe" prepended. It certainly does not have an implied "It is a universal fact that" prepended.
By way of analogy, when you earlier said "This is ludicrous. Can we switch to the metric system already?" no one thinks that you're speaking for everyone, but rather are advancing your own belief in the "This is ludicrous." sentence.