Duck Duck Go: Illusion of Privacy (2013) (etherrag.blogspot.com)
248 points by awaisraad on Sept 23, 2017 | 123 comments



I think DuckDuckGo is unfairly singled out here. They do more than most companies to protect privacy, and most of their users are specifically trying to deprive Google of more feed for its data silo. Of course they can't protect you from the NSA. Extremely few actors can.

If your threat model includes actors within the US Federal Government (especially the intelligence community), run. Yesterday. That's a statement about our times, not about any particular company.

The solution ought to be browbeating the US Government for unethical practices, not browbeating a company that does privacy better than most, and not as well as would be necessary to stand toe-to-toe with some of the most powerful and far-reaching organizations in the world.


The article was a response to a Guardian article that ultimately cited https://siliconangle.com/blog/2013/06/14/duckduckgo-the-pris...

> “By not storing any useful information, DuckDuckGo simply isn’t useful to these surveillance programs,” says Weinberg. “We literally do not store personally identifiable user data, so if the NSA were to get a hold of all our data, it would not be useful to them since it is all truly anonymous.”

DDG is "unfairly singled out" for good reason, namely that company representatives made an incorrect assertion. DDG is still useful for ongoing surveillance, as the article pondered:

> But what if DuckDuckGo provided a splitter-feed to the NSA? DuckDuckGo can claim without lying that they store no personal information, but that speaks nothing of a collaborating partner storing it.


Not to get off topic, but there's a part of me that suspects the Equifax hack has the NSA behind it (or that the data will ultimately filter back to them). When I read Dragnet Nation a couple of years ago, one of the things that left an impression on me was the fact that the gov can buy "private" personal data on the open market just like anyone else can. That is, it's not spying (and a violation of rights / laws) if the data is on the free market.

Obviously, DDG isn't perfect. I'm not naive. But for me there's some value in trying not to succumb to Google's desire to have us all assimilate.


> there's a part of me that suspects the Equifax hack has the NSA behind it

This is ridiculous. It makes no sense why the NSA or any other members of the US Intelligence Community would cause this sort of reputational damage and nationwide outrage to a company when they could simply walk up to them with one of those fancy national security letters with a built in gag order and take the information with much less of a fuss.

Other than that, I agree. Google should buy me a god damn drink if they really want to get to know me. :)


Ridiculous? Not at all.

In the context of the history of the CIA and NSA it's standard procedure. And that's just the stuff we know about. Certainly Snowden taught us anything is possible, that they have no restraint.

On the other hand, politely asking an outfit like Equifax for (just about) ALL their data with no real reason would be out of character. Why bother? Why go on record?

It's pretty simple. Why would they get it via a hack? Answer: Because they can.


> Snowden

Snowden, a man who worked from home in Hawaii as an NSA contractor making over $200k a year with his smoking hot girlfriend, who suddenly decided he had a moral compass, was able to download tons of classified files over VPN to his laptops and run off to meet Greenwald et al. to divulge all this information, then flee to Russia where he sends out anti-Russian tweets.

Meanwhile, nothing has happened. The NSA is still spying. PRISM seems to be something everyone has forgotten about (did it even exist? Most companies denied it). His evidence was all redacted PowerPoint slides.

Question the narratives that are given to you.


What does Snowden's girlfriend have to do with... well, anything?


I'm not sure what the poster was getting at - are security contractors with smoking hot girlfriends making 200k a year less credible such that we should explore conspiracy theories?

If so, poor security contractors (and their apparently less appealing girlfriends) with our cynical assumptions.


Probably just salty that Snowden gets to have a hot girlfriend and also be a hero.


The accessed data at Equifax are SSNs, DOB and address. Why should the state hack a company to access data they already have? After all, SSN is state-issued and addresses at Equifax will be known to at least one agency (probably IRS).


That's a good question. My reply would be I'm not sure the intelligence community / legal profession can collect and hoard such things without legal approval.

Certainly Equifax has other data. Maybe they didn't get what they wanted? Maybe Equifax isn't being transparent? Maybe they don't know?


That's a lot of maybes to support an already unlikely scenario.


Unlikely? Because in the history of the NSA or CIA there has never been a "No way??!! They did what???"

What do you think the NSA is doing with all that computing (and storage) power? Processing cat photos? :)


This vaguely reminds me of the time a paranoid coworker told me not to store tax documents on the cloud because the government could get at it.


You're missing the point. That is, it's possible, if not probable (per books like Dragnet Nation, Snowden, etc.), that data is being collected and you have no idea why, by whom, etc.

I'm not saying I think that's what happened. I'm saying there's a part of me that has good reason to think it's possible. Crazy? Not more than some of the things we know the intelligence community has done or tried to do.


I'm not missing the point. I'm not saying it isn't happening because it's some crazy Tom Clancy plot. I'm saying the intelligence agencies have better ways of getting that data without also giving it to a ton of other bad actors


You should read Dragnet Nation. I'm not talking about bad actors. I'm talking about legit data collection and resale outfits.

We're being tracked. One click/visit at a time.


>But what if DuckDuckGo provided a splitter-feed to the NSA? DuckDuckGo can claim without lying that they store no personal information, but that speaks nothing of a collaborating partner storing it.

Not to mention they don't need to provide anything themselves. Unless DDG has their own cables to users' homes, after DDG connects to the internet backbone and before the user connects to DDG, the various agencies have all kinds of opportunities to get their feeds. It surely isn't SSL that will prevent them.


Can you be a bit more specific on your claim that the various agencies can break TLS?


They don't have to. DDG runs on AWS. In theory, court orders could grant access to the underlying VMs via Amazon without DDG even being notified.


They provide a Tor hidden service.

3g2upl4pq6kufc4m.onion


If I were a conspiracy theorist, I'd think there was something nefarious going on when I see articles like this. What if the intended result is not actually browbeating DDG but, rather, making people think that DDG is no better than Google in the privacy arena so why invest the energy in switching?

If DDG isn't any better, then maybe nobody is, so we might as well get used to the lack of privacy. Why switch if you're just going to get worse results, expend more energy, and not actually get increased privacy? You might just as well give up the struggle...

Fortunately, I'm not a conspiracy theorist.


Conspiracy just means multiple people working together in secret toward some common goal. People do that all the time. Why just last week we had a closed door meeting at my company about how we might conspire to create more interest in the user community for our product. That's a conspiracy.

Of course there are nutty conspiracy theories but there are nutty math and physics papers too. I feel like the blanket ridicule of any notion of conspiratorial action is meant to discourage any form of investigative journalism or deeper probing below the surface of the narrative.

Your hypothesis is reasonable and plausible.


Sorry, my post was mostly satire/tongue-in-cheek.

If I were in a position of power, and I knew that DDG was more difficult to get user data from, then I'd absolutely find a way to have articles published like this. I imagine that disenfranchised people are less likely to seek alternatives and are more likely to resign themselves to being monitored. It also seems like it would be more politically acceptable as well as offering a layer for plausible deniability.

Were I in charge, and a nefarious actor, I'd be slipping stories like this into the media quite frequently. It's low cost, low risk, and probably has a good chance at success with at least some percentage of the population.

I'm not sure I have the lack of ethics required to really do such a job. I'd like to hope that I'd resign before I helped do those sorts of things.


If you were in that position of power and had this article published you might not be doing such a great job. Streisand effect maybe?

Since this article I have switched from Chrome to Firefox with DuckDuckGo as my main search engine, a move that was in the making for some time already. Added Disconnect and uBlock Origin as well. And yes, I realize it doesn't help against NSA tracking of a certain magnitude.


It's good to finally see someone else say that. If you want more data to use, I tried to hit this in a systematic way in a counterpoint to Bruce Schneier on psychology of conspiracy theory:

https://www.schneier.com/blog/archives/2013/06/the_psycholog...

I meant to turn that into a standalone essay, too, but forgot about it. Thanks for the reminder.


>Conspiracy just means multiple people working together in secret toward some common goal.

I checked various dictionaries available online and all of them have a common definition like "a secret plan by a group to do something unlawful or harmful."


I don't get why Google is always singled out either. When was the last time Google leaked your data? As you said, very few actors can protect your data from the NSA, so the next best thing is to have it protected from hackers and leaks. Every other company out there keeps on getting hit left and right, but for their size, Google is one of the only companies that has never messed up with user data, and you know they are probably one of the most targeted, especially by state actors such as China and Russia.


As others have noted, Google gets singled out because they collect so much information.

It's true, however, that they don't want to share it. It may even be true that they specifically don't want the NSA etc to steal it from them. And indeed, I recall an unofficial Google response to Snowden's leaks:

> Fuck these guys.

> I've spent the last ten years of my life trying to keep Google's users safe and secure from the many diverse threats Google faces… But after spending all that time helping in my tiny way to protect Google -- one of the greatest things to arise from the internet -- seeing this, well, it's just a little like coming home from War with Sauron, destroying the One Ring, only to discover the NSA is on the front porch of the Shire chopping down the Party Tree and outsourcing all the hobbit farmers with half-orcs and whips.

https://www.theverge.com/2013/11/6/5072924/google-engineers-...

So anyway, DDG arguably gives users some privacy from Google. But neither has a better claim to providing privacy from the NSA etc. Indeed, maybe it's Google that does, given their greater resources.

If you want more than that, use some mix of VPNs and Tor. Or if you're not feeling lucky, stay off the Internet ;)


>When was the last time Google leaked your data[?]

According to the State of California Department of Justice [0], the last publicly acknowledged data breach from Google was March 9th, 2017. Before that it was August 10th, 2016, and before that March 29th, 2016.

[0] https://www.oag.ca.gov/privacy/databreach/list


According to the notification letters on that site, those three incidents all involve Google employees' information being leaked by third parties, not Google leaking users' data.

Here's a quote from one of them:

"We recently learned that certain hotel reservations made for Google business travel were among the many reservations affected by a security incident impacting a third-party provider’s electronic reservation system that serves thousands of travel agencies and hotels. This did not affect Google’s systems. However, this incident impacted one of the travel providers used by Googlers, Carlson Wagonlit Travel (CWT)."


I'd say every single company sitting on billions in assets is on the target list. Facebook, Apple, Microsoft, Google, Amazon, Uber, everyone (although Microsoft is the least attacked here). But the illusion here is that we only see the big guys; we rarely question the smaller guys, unless someone publishes a CVE or a disclosure.

Many users are paranoid about ToS and how companies (esp. the big guys) make money either by learning your behaviors (search preferences, sites you visited) or selling your data to a partner (Foursquare, although they claim to only sell location data which is anonymous). We place blind trust in service providers. We let service providers collect everything about ourselves, but internally they can decide whether to discard "sensitive data" early on or not during data processing (but web logs would have the trace).

Has anyone ever inspected the traffic, or reverse-engineered the API, of <your wearable device> (e.g. Fitbit)? What about file sharing companies? Are they storing your data in a secure way and without reading what's in your file? What about sites that let you compare prices across multiple stores? What about medtech startups? What about when the company is acquired?

Because the giants are more eye-catching, we don't see the smaller guys; but we are willing to give away sensitive and private data to the smaller guys because? If the argument is "well the big guys should have known better and have more resources to do the right thing", then I'd argue that by 2017 new startups should be doing the right things too (not repeating well-known security flaws, for example). I have my doubts; I doubt many achieve 50% of what is on the imaginary checklist. The claim "we build MVP" is the equivalent of the big guys saying "we know what we are doing, don't worry."

No, we don't know better. No, your MVP should be secure enough so users can trust you. I almost never try a newly launched service because I really don't want to be a lab rat. I am sorry if that sounds cynical, but I don't trust myself doing everything right. If you let me choose between a new file sharing startup vs others, I'd go with either Dropbox, Google Drive or OneDrive (FWIW, Google is replacing Drive with a new service). Why? Because if the big guy is compromised, well, shit, thousands or millions will be affected. The least cynical version is, well, they are too big to do stupid things (of course not true in reality).

Our biases create illusions.


As others have kinda pointed out: it's the amount of data they store.

I think it is less about who you should trust to do the right thing and more of how much you trust an individual group to be able and willing to maintain the security of the data they have.

That is to say: it is more about the ratio of <stored data>/<security of stored data> rather than either of those values.

Google stores more data than I personally trust anybody to be able to store. Largely because they make themselves a huge target.

Equifax is a great example of storing WAY too much data. Sure, their practices are to blame, but I'd argue that them storing half the information they do is better than doubling their security (whatever that means). This is due to the compounding effect of being a smaller target.

DDG at least claims to not store as much of that information.

> When was the last time Google leaked your data.

If Google has one major leak, does the historical quantity of leaks matter?

EDIT: more thoughts.


Google has a vast number of security researchers hired. In terms of security, there's probably no other company that can do a better job.

That Google has your search or email data and that they can always sell it, or that the NSA has a direct link to them, those are separate discussions.


Protection from the Nation State Actors cannot come from companies. One must implement protections on one's own client-side. Proper encryption always. Tor when needed. Software and, where possible, hardware only from trusted sources.


Even then the concept of "trusted sources" is a dubious one. A source only needs to be trusted until it sells you down the river.


We need more than "trust". We need open software and open hardware.


So Equifax were good guys until they ditched all our data?


I care about "random idiot at NSA sitting in front of a computer being able to make my life miserable".

I don't care about "NSA has decided to invest $10 million to make me personally miserable." At that point, they will simply fabricate the evidence they need.

If someone at the NSA wants to come after me, I simply want someone to actually have to sign a piece of paper--cut a check, file a warrant, send a human being, etc.--rather than just write a script in Python.

If someone in a bureaucracy has to sign their name and take responsibility, I'm fairly safe. Simply the possibility that something might cause political fallout will stop 99.9% of all such actions.


If they use Amazon servers, NSA can get data from Amazon without dealing with DDG at all.


My beef with this article is that it's unreasonably reductionist to conclude that DDG provides an "illusion" of privacy based on the fact that they're as vulnerable to being targeted by the NSA as anyone else. The issue of privacy is so much bigger than that.

If you use Google Search and someone obtains access to the data they have on you, legally or illegally, they could end up obtaining many years of your browsing history. If you use DDG they have nothing, and the most they can do (as the article states) is start collecting your search habits from that point onward.

I don't want huge companies to amass giant archives of data about me. There are so many ways it can be abused by a multitude of actors. It's a selling point to me when a service retains little or no information, and if it needs to retain something, it requests limited permission in clear and simple terms.


The only conclusion I can draw from this article is to avoid services hosted in the USA, but even that is not guaranteed to work, bearing in mind that US agents have been known to go abroad to request access to foreign companies' servers. (They were even supposedly thrown out of Iceland once, assuming that wasn't a honey pot propaganda operation to lure people to host stuff in Iceland, of course.)

What's left for the people who aren't criminals but don't like being spied on? PGP and keys that are exchanged physically, by hand?

If somebody can physically spy on the infrastructure cables that your traffic goes through, will SSL protect you? As written in the article -- no it will not, because the certificate can be obtained, even if it takes some time and strong-arm effort to do so. But when a country can order you to give up private keys and keep quiet about it, really, what can you do?

At this point, full decentralization, mesh networking and something many times better than Tor built into 100% of the network code seems to be the only way out. Maybe a combination of IPFS and FreeNet, full packet-level encryption and keys that expire in 1 minute and are auto-generated for every transaction?


I've argued here at HN before that I don't think this is a technological problem, but a social one. There is nothing that stops a powerful enough actor from breaking encryption with a rubber hose, except for a strong stigma against that kind of behavior. We need to give digital privacy the same social protection. The other problem with making a purely technical solution is that you leave out people who are not capable of using that solution because they do not have the resources, education or capability.


I agree. The biggest problem in this age is having strong encryption that is user-friendly. The common wisdom says it's impossible to combine the two. I disagree with it but I don't have the time to try and work in the area, nor am I an expert. IMO it's a good cause to work on anyhow.

Furthermore, spies aren't stopped by social stigma. Even if the whole planet agrees in one voice wiretapping shouldn't be done (never gonna happen) the spies can always deny that they're spying. It's not like any of us can actually prove that any agency is indeed wiretapping.

Morality is, in the technical sense, optional. It cannot be enforced. Thus it's unreliable.


> strong encryption that is user-friendly

Not enough, in my opinion. Where most people will fail is on the opsec side. They just don't understand security best practices. I realize that not all problems can be solved, but technology is not just encryption, it's understanding how the technology works so you can avoid leaking information in the hundreds of other ways that are possible on the Internet.


Solving the problem with technology is 1000000 times easier than solving it from the "social" side.


Is it? For who?

Do you think people's data would be more secure at the border if

- You kernel-hacked iOS so that it booted into a vanilla account upon entry of a certain passcode, and encouraged people to install your hack from GitHub, potentially borking their phones

- People couldn't be compelled (or face being denied entry) to allow search of their electronic devices

?

What about trying to do everything via a VPN and spoofed UA strings vs. PII being banned from sale, heavily taxed, or a meaningful opt-out existing? Or even just DNT having a legal basis?


> Is it? For who?

For HN readers. And probably in general.

As for your questions, I would definitely like to first use software which doesn't compromise my privacy and security and only as a very distant second have some bureaucrat who would maybe in the best case scenario fine a company which leaks my data.

The vote I cast by running a Tor relay is much more meaningful and valuable defence of privacy than a vote in the general elections. By orders of magnitude.


Don't get me wrong, I'm all for using technology to preserve privacy.

> The vote I cast by running a Tor relay is much more meaningful and valuable defence of privacy than a vote in the general elections.

I kind of agree, but, not if the new leader outlaws using Tor tomorrow, as they have in China and Iran, and are attempting to do in Russia and France.

> bureaucrat who would maybe in the best case scenario fine a company which leaks my data

You may be selling the power of legislation short. It has the power to entirely transform the default business model of the Internet away from surveillance capitalism, for example. In terms of privacy, this would eliminate entire classes of "threat".

We frequently underestimate our power as technologists to influence these things; we have very much more than a single vote. The narrative of 'technology' in the media is almost entirely that of billionaires and spies, and the media are gradually starting to realize that they're being led along. Getting the voice of technologists to explain the societal impacts of technology policy is a problem that we need to identify with as a group.


Let's say you have your data secured with Ultimate Encryption Method. A couple of thugs appear at your door and mention that if you don't un-encrypt that data, they will send these very compromising photographs of you to very public places.

Ultimate Encryption Method has just been trivially bypassed.


Agreed, and I am very sure governments have been doing that many times since they can force you to keep quiet -- and they got away with very criminal methods probably tens of thousands of times.

Hence, the good crypto is really only the most basic first step. Full privacy should be the second and a very important step. Fine, I might have an account in <insert-trendy-social-network-here>. Fine, my real name is known there.

OK.

But there is a plethora of cases where having a user profile and personally identifiable information is simply not necessary, like web search. Even e-commerce like Amazon can be mostly anonymized: you can pay with an e-gold kind of currency (Bitcoin, Ethereum) and Amazon can only politely ask another API if the goods are deliverable to your address (while Amazon won't have any idea about your physical address). Then, even if that mystical other API has your physical address, at least it's not in the hands of Amazon. Fragmentation of personal info gives us a small degree of protection, even if Amazon can eventually procure the info under the table. Hey, every little bit helps. Make surveillance expensive and many will drop it (not everybody of course, I recognize that). Again, every little bit helps.

Of course, due to a mountain of vested interests, nobody is talking about such technical solutions. "They" want your info all over the net. It's easier for them, so why change anything? The current status quo is sadly very logical.

Back to your example, thugs will just be angrily gnawing at their nails if they have no idea who you are and where you live.


A decentralized technology only works until some corporation with large advertising resources creates a proprietary centralized service around it and most people flock to their offering. E.g. Bitcoin is becoming centralized around popular online wallets, exchanges and pools; email was supposed to be decentralized, but now most people use Gmail; etc. We will not achieve full decentralization until it becomes both a technological and a social norm.


Sadly I agree, Homo Sapiens hasn't collectively evolved enough to recognize the dangers of centralization.

However, what we can do is make decentralized tech absolutely idiot-proof and put it in the hands of non-technical users. If it's convenient, fast and reliable, it will at least have an equal footing against the centralized services. Let's get to that point and fight the other battle you mentioned then.


Sure, if you're a fairly affluent and educated hacker news reader. But there are far more people who I routinely work with who struggle with the concept of a password but need to use the internet to apply for work, register for disability, social security, communicate with family, etc. Do those people deserve less privacy?


No, they deserve the same privacy. I wish there was a solution for them, but I don't think there is.

Better software and newer generations that know how to use it will, however, come before anyone can make a government that respects its population.


Stuff like Apple's FaceID might help. I am eagerly waiting to see how Apple will entrench it in their services, because I am sure they will.


If one's threat model includes state actors that target that person then all regular methods are useless. The best we can do is to protect against passive attacks and that's where PGP, double ratchet schemes, Tor etc come in handy.


True enough, sadly. Still doesn't mean we shouldn't push back with the means we have in our hands. Open-source firmware might be a VERY good first step in that direction (one of the reasons I am replacing my ASUS router with Mikrotik; another being that ASUS routers are laggy pieces of crap, even those that cost north of $200).

Also, we all know about Intel ME, right? It's baffling how most people using PCs have a hardware-level backdoor and the world hasn't lost its shit. It's a very sad epoch we live in. :(

A solution right now is to simply not get on the state adversaries' bad side, maybe. And utilize the blockchain for anonymity, I guess.


Here's another solution, from the late Pieter Hintjens: https://www.indiegogo.com/projects/edgenet#/


> The only conclusion I can make from this article is to avoid services hosted in the USA

The thing is, there are absolutely no guarantees it's any better in other countries: the fact that NSA activities were revealed doesn't mean other countries aren't doing just as badly.

It's probably still a good idea to segment services across countries, though, so that no single country has access to all the data.

I was thinking something else, lately (and it was really weird to me, since I'm a webdev): why do we need webapps for everything? Maybe we wouldn't have so many problems if we weren't centralizing so much data. There are probably many apps for which native apps and p2p would do.


I have asked myself the same. I see a two-fold answer:

(1) Businesses are familiar with web tech, and for them it's simply good business not to invest in stacks and data formats that are entirely new to their dev teams.

(2) Political pressure: we know that the NSA successfully pushed broken crypto once in the past, so it's not a stretch to theorize that influential people were whispering in the right ears at the right time.

Even though #2 is just as plausible as #1, please note that I think #2 probably happened much, much less than #1. We should never attribute to malice that which can be attributed to incompetence and short-sightedness.

EDIT: There are several very good candidates for data formats out there which are much more efficient than JSON. Adoption is seldom an issue; if one corporation pushes for the format, it's a matter of several months at the most before that format becomes widely supported. Replacing JS might be much harder but efforts continue even today.


Indeed, especially #1. There's a thing with the web: it's easy. If we start making more native apps, especially desktop apps, we then have to consider users' environments (we do not control the execution environment, like with a server), we have to implement error reporting, manage the propagation of updates, etc. Centralizing data is easier too: no need to manage data synchronization, it's just about putting everything in the same database.

It seems like we'll have to wait for users to complain more about privacy issues before the effort is worth it, from a business point of view.

The alternative, of course, being to find a way to make native apps as easy. Nothing prevents us from making updates apply automatically (many apps do that, actually). Applications can run in some sort of VM/container so the environment is controlled. And regarding distributed databases, we already have a good idea of what it could look like with blockchains, even if it would obviously be way better to build something with databases in mind from the start. For example, servers could be used for people to register to a service and be added to groups. Then, clients contact servers to get a list of peers, and then they sync their group database by executing transactions (somewhat like migrations).


From previous discussions I've learned that USA companies are the only ones that actually are protected from the USA government.

So feel free to build a company in Sweden, but the US is actually legally permitted to wiretap the crap out of it.


Except that's not how it works. They will take everything including those USA companies' data and then they will go to a judge to ask for a warrant to make interception after the fact legal. It's US citizens that are exempt and their data is only looked at in exceptional cases but on the whole you should not assume the data is not recorded. The legal fiction used to protect this abuse of your rights is that they claim that as long as nobody looks at it your data wasn't really collected.


From all the leaks we've seen in the last 4 years, the three lett3r agenc1es weren't deterred by the laws at all, wouldn't you agree?


There are two levels at play here: who is saying what and who is talking to who. The second one is extremely hard to protect against and already plenty useful on its own.


A fact is a fact. IMO we should approach the problems one by one, even if that means that privacy will remain a pipe dream for a while still. But we can gradually tighten the grip and I think every little bit will make the adversaries panic a little.

Example: make a strong standardized crypto (I still have to work around API requests to several servers and hardcode TLS v1.2 as a requirement, which is not okay!), then work on making a Tor-like net tech, then integrate the blockchain in the picture so anonymity is stronger, then probably put all that in mesh networking, etc. etc. The current internet is broken, many of us know it.

I am not an expert so my example might be naive and stupid -- but recently I am very interested in the area and I'd contribute. After I educate myself first, though.


Recently I have been using the free and open source Searx more and more (admittedly mostly using the !searx shortcut from DDG). Results seem better than DDG sometimes. Would be interesting to try and host my own instance or write something that picks a random public instance.

https://asciimoo.github.io/searx/
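
For the "picks a random public instance" idea, something like the following might be enough as a starting point. The instance URLs are placeholders, and the `/search?q=` path is an assumption that should be checked against the instance you actually use:

```python
import random
from urllib.parse import urlencode

# Placeholder hosts; substitute real public instances you trust.
INSTANCES = [
    "https://searx.example.org",
    "https://search.example.net",
]


def random_search_url(query, instances=INSTANCES, rng=random):
    """Return a search URL on a randomly chosen instance."""
    base = rng.choice(instances).rstrip("/")
    return f"{base}/search?{urlencode({'q': query})}"


print(random_search_url("warrant canary"))
```

Spreading queries over instances only helps if the operators are independent, and each instance still forwards the query to the upstream engines.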


Along that same line is YaCy.

https://yacy.net/en/index.html


I think YaCy didn't give me very good results last I tried it. It's not a fair comparison, as it has been a few years for sure, but if I had to guess then I would say that Searx, being just a search engine aggregator (it doesn't do crawling), gives better results.


I tried it last about six months ago. It was a marked improvement over what it was a couple of years prior. I like that it is decentralized and stores no private data. To me, the decentralized part is important. But, I'm not sure I'd say it's ready for prime time.

With Searx, you still need the regular search engines. I suspect that your traffic can still be identified, say through timing requests or from piecing together behavioral data. I haven't really dug that deep to investigate the risks.

I will give it a shot tomorrow, just to try it out. I've seen it in passing but you're the first person I've seen, in the wild, that uses it. I'll give it a test run.


Wow, very cool project! Thanks for sharing.


I saw there's an Installation page, but do you know of an easy step-by-step tutorial for setting this up? Perhaps with a recommended low-cost host, etc.?


There's a Dockerfile, just spin that up. (Or, if needed, find a brief step-by-step on running a docker container, because that's all there is to getting searx running.)


To be fair, here is the CEO's response, quoted:

Hi, this is Gabriel Weinberg, CEO and founder of DuckDuckGo. I do not believe we can be compelled to store or siphon off user data to the NSA or anyone else. All the existing US laws are about turning over existing business records and not about compelling you change your business practices. In our case such an order would further force us to lie to consumers, which would put us in trouble with the FTC and irreparably hurt our business.

We have not received any request like this, and do not expect to. We have spoken with many lawyers particularly skilled and experienced in this part of US and international law. If we were to receive such a request we believe as do these others it would be highly unconstitutional on many independent grounds, and there is plenty of legal precedent there. With CALEA in particular, search engines are exempt.

There are many additional legal and technical inaccuracies in this article and I will not address all of them in this comment. All our front-end servers are hosted on Amazon not Verizon, for example.


Here is the link to the CEO's response for those interested: http://etherrag.blogspot.in/2013/07/duck-duck-go-illusion-of...


The amount of unearned confidence the author of the blog displays in his reply is kind of embarrassing. "I read a CNET article once that contradicts what you're saying!"

He created that blog account just to write that article and it's the only one up there. Stuff like this makes me wonder what motivates some people.


Like Google, by default DDG tracks what results the user clicks on. URLs are prefixed with a DDG URL. Users' HTTP requests are forwarded through DDG servers.

By default, DDG "lite" does not set cookies or use Javascript. However, if the user wants to change the default "settings" (HTTP has no state so this is a fiction), then AFAICT she has to enable Javascript and accept cookies. Privacy conscious users do not want Javascript or cookies.

DDG could achieve the same result by simply providing an alternate URL, something like /lite2 in addition to /lite.

Whether DDG saves this data I have no idea. But one has to wonder why, if privacy is a goal, DDG is collecting it to begin with.

If DDG believes it is doing this for the benefit of users, it is not convincing because there are alternative ways to achieve the same benefit that do not require prefixing URLs, Javascript or use of cookies.

For example, browser settings already allow the user to control HTTP Referer headers, assuming queries were submitted using GET. The user can change the settings in the browser so that no referer is sent, or to send a custom referer of her choosing.

Another example is if DDG accepted queries via POST method in addition to GET. No search terms would be leaked in the URL or in any HTTP referer.
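
To make the GET vs. POST difference concrete, here is a small sketch that only builds the two requests without sending them. The endpoint is a placeholder and the `q` field name is an assumption:

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_ENDPOINT = "https://search.example/"  # placeholder, not a real service


def build_get(query):
    # The search terms end up in the URL, so they can appear in
    # browser history, server logs, and Referer headers.
    return Request(f"{SEARCH_ENDPOINT}?{urlencode({'q': query})}")


def build_post(query):
    # The search terms travel in the request body; the URL stays generic.
    body = urlencode({"q": query}).encode("ascii")
    return Request(SEARCH_ENDPOINT, data=body, method="POST")


get_req = build_get("sensitive topic")
post_req = build_post("sensitive topic")
print(get_req.full_url)    # URL contains the query string
print(post_req.full_url)   # URL contains no search terms
```

Either way the search engine itself still sees the query; POST only keeps it out of the URL.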


DuckDuckGo staff here - just want to clarify a couple of points:

* We don't track result clicks. URLs are no longer prefixed with a DDG URL by default except for old browsers (although this is controllable in the settings: https://duckduckgo.com/settings#privacy ), but even if this is in effect we don't store which sites users visit. We started stripping search queries in referrer headers in 2010 and you're right, current web standards make it possible to do this without us having to redirect through our own servers.

* We have an alternative non-JavaScript URL - https://duckduckgo.com/html - which tries to offer a fuller search experience than the minimalist https://duckduckgo.com/lite

* Cookies are used to store settings but if users prefer to block them, preferences can still be "saved" by using URL parameters, listed here: https://duckduckgo.com/params These can be used either to form a local bookmark/start page or anonymously in the cloud with a password only (no username or other data).

> But one has to wonder why, if privacy is a goal, DDG is collecting it to begin with.

I'm not sure which data this is referring to but we don't collect or share personal information. There's more on this in our privacy policy: https://duckduckgo.com/privacy


I forgot to add that we do accept POST request queries as an alternative to GET, again from the settings.


According to DDG, they prefix the links to prevent the destination websites from obtaining the search terms used via the HTTP Referer header.

Source: https://duckduckgo.com/privacy


Most of the points argue that the NSA could compel the company Duck Duck Go, Inc. to install equipment and then forbid the company from disclosing that fact.

Doing so does carry quite a bit of political risk. There have been quite a few lawsuits from the EFF and ACLU over this, and as the CEO of Duck Duck Go says in the comment thread, all existing cases have been about turning over records. Compelling people to install hardware and keep the operation going would be a further step.

I doubt DDG is currently worth the political risk. There are likely much easier targets to attack first in order to get 100% of the world's search data.

*down votes? Explanation?


I'm not sure if they really need to compel DDG in the first place.

I know if I was a three letter agency, I'd start a "secure" service like DDG myself as kind of a honeypot.

Not that I'm saying that such an agency is actually behind DDG -- I have no way of knowing. But I would be very surprised if a large number of services promising "security" and "privacy" weren't run by such agencies or their agents.

That's why I believe that frequent, independent third-party auditing (by multiple trusted groups like the EFF) would be necessary to gain any kind of confidence in such services. Even then, it'll be no guarantee that they're not compromised, but it would just make such compromise significantly more difficult and less likely to be effective.


While that is always possible, I think it's more plausible that they then simply buy out key companies rather than found a bunch of new companies in hope that one will succeed. The question then is, what is the likelihood that the NSA is the secret owner and operator of Microsoft, Apple or Yahoo, each of which would likely be the cost-effective choice if one wanted access to all search queries done on the Internet.

Independent third-party auditing is useful. There is the occasional fundraising for auditing of software (Truecrypt comes to mind), but I don't recall hearing of one for search engines.


"I think it's more plausible that they then simply buy out key companies rather than found a bunch of new companies in hope that one will succeed."

How many new search engines were focusing on privacy and security as their main differentiator from the competition?

I know of only one: DDG


Here is a meta article with the clicky title "5 best search engines that respect your privacy" and the less clicky official title "Privacy Search Engines 2017 Group Review": https://www.bestvpn.com/privacy-search-engines/

https://en.wikipedia.org/wiki/List_of_search_engines

That list is long and shows that quite a few people have tried to get momentum in the search engine space. Would be interesting if anyone did a meta study to see how many use the words "security" and "privacy" as marketing to gain users.

I think we should also include open source search engines and p2p, since if the NSA developed them they could build backdoors into them. Most of them seem to have "privacy and security" as explicit goals.


The hard part of being a search company isn't starting the company, it's attracting users. Which is almost completely a crapshoot.

Far more viable to let someone else take those risks, then provide a compelling case for them to assist you in your inquiries and/or interests.


With DDG's focus on privacy, I wonder why they never had any warrant canaries set up?


>Most of the points argue that the NSA could compel the company Duck Duck Go, Inc. to install equipment and then forbid the company from disclosing that fact.

You don't need to do anything like that. DDG doesn't crawl the web itself, it uses API providers like Bing (Microsoft/NSA) and Yandex (Russian/FSB). They're legally required to disclose that on their site.

It's possible to identify people solely through anonymised credit card transactions[0], so doing the same with search queries should be about as feasible.

DDG isn't private, it just gives the illusion of privacy, same as TOR. That said though, if you're a high-profile target then there are much more direct means to track what you're searching.

[0]http://www.sciencemag.org/lookup/doi/10.1126/science.1256297


Recently I had a series of unfortunate plumbing mishaps at my home that set me back a bunch of money. I did very minimal google searching (just confirming the spelling of the plumber's name), but ads offering emergency home loans have started popping up in my browser.

If I can go to a search engine that doesn't sell the fact of possible financial problems to whatever loan shark is willing to pay the most to get to me, I see that as a win.


Privacy requires full transparency. Where is it documented which FOSS software DDG runs on, and where can I find trusted audit reports?


Even if they were completely open source how would you verify that they are using the same software on their servers? That the hardware is not compromised?

Audit reports? How trustworthy are they if Symantec was able to provide good reports for such a long time for their certificate issuance when things were clearly not OK?


For example if they used AGPL software, it would be much harder for them to cheat. But you can never get 100% confidence.


ddg's server and js are closed/proprietary, if I remember right.


Duck Duck Go is a company that I want to succeed, as they are clearly making a stand on user privacy.

However it never made sense to me why people would use those DDG bangs.

I mean privacy is the main selling point, so why in the world would you send the searches you make on other websites to DDG, when the browser is perfectly capable of being configured for "search keywords".

In Firefox, go to amazon.com (or any website you want), right click on their search bar and select "Add a Keyword for this search...". Add "!a" or whatever you want. There, you've got your own bangs.
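
What the browser does with such a keyword is roughly a local template substitution, so the query never passes through a third party. A sketch of that expansion, where the keyword table and the URL templates are illustrative assumptions rather than anything exported from Firefox:

```python
from urllib.parse import quote_plus

# Hypothetical local keyword table; "%s" marks where the search terms go.
KEYWORDS = {
    "!a": "https://www.amazon.com/s?k=%s",
    "!w": "https://en.wikipedia.org/wiki/Special:Search?search=%s",
}


def expand(text):
    """Expand 'keyword terms...' into a site search URL, or return None."""
    keyword, _, terms = text.partition(" ")
    template = KEYWORDS.get(keyword)
    if template is None:
        return None  # not a known keyword; fall back to the default search engine
    return template.replace("%s", quote_plus(terms))


print(expand("!w warrant canary"))
```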


Not all browsers are configurable to search keywords (particularly on mobile).

DDG is consistent across hosts / browsers / OSes.

DDG maintain (and fix) the bang searches as they break (which ... happens).

I appreciate being able to !bang away in my Android browser(s) navbars. There are other options. Surfraw (a Linux CLI utility) is an example, though my problem is that 1) I can't remember the aliases and 2) they interfere with other commands I use (there are ... a lot of surfraw elvi).


If you're worried that DDG may log your IP you can simply use it with the Tor Browser (it's the default search engine) or use their onion service (https://3g2upl4pq6kufc4m.onion/) for increased security and anonymity.


Tor is far from perfect and there are several ways in which one could connect traffic at some endpoint with a user at a specific IP. Do not rely on Tor if you really want anonymity.


> Tor is far from perfect

No one made such a claim, not even the Tor Project themselves.

> and there are several ways in which one could connect traffic at some endpoint with a user at a specific IP.

DDG traffic is e2e encrypted, especially if you use their onion service, since it won't use exit nodes. The most they can learn is that someone did some unknown search on DDG.

> Do not rely on Tor if you really want anonymity.

Tor is currently the best low-latency anonymity system.


Tor offers a layer of protection. It is possible to stay anon on Tor.


> It is possible to stay anon on Tor.

That is an extremely dangerous statement to make and one I do not agree with.

Keep in mind that:

- you will have to trust that a large chunk of the nodes is not in the hands of someone that you count as your enemy

- that even if your enemy is not in charge of a substantial part of the network they may still be monitoring entry and egress and that that alone can be enough to figure out who is talking to who

- that any data present at egress that can be intercepted might still reveal who you are

So no, Tor is not 100% secure and it is very well possible that even if you use Tor your identity will be connected with some activity or even all of your activity while using the network.


You didn't actually disagree with your parent comment. It is possible to be anonymous on TOR. It does not come for free.


Well, if you want to trust your anonymity to luck or not being monitored then yes, you can be anonymous on Tor. But that's little comfort. It's possible to cross a highway blindfolded too. But it isn't smart to do so and it is even less smart to assume that it will always work just because you can't see the danger with your silly blindfold on.


Thankfully, none of your assumptions are true if you consider TOR one part of anonymity as your parent comment did.

Not to mention, you conspicuously avoid comparing degrees of anonymity. Obviously TOR is better than SSL, which doesn't provide any anonymity.


If you use Tor as part of a group of privacy measures then you can protect your privacy online, but it's useful to know what the potential weaknesses of Tor are so you know what other measures to take.


> you will have to trust that a large chunk of the nodes is not in the hands of someone that you count as your enemy

Why not run your own guard node or even an obfuscated bridge and then connect to it? That way you can make sure that no one will do traffic correlation (except, of course, a global adversary) since that would require controlling both the guard node and the exit used in the circuit (which changes every 10min, and in the Tor Browser you get a new circuit for each website).

> that even if your enemy is not in charge of a substantial part of the network they may still be monitoring entry and egress and that that alone can be enough to figure out who is talking to who

That's not possible in practice, quoting from the Tor Browser design documentation [1]:

> In the case of this attack, the key factors that increase the classification complexity (and thus hinder a real world adversary who attempts this attack) are large numbers of dynamically generated pages, partially cached content, and also the non-web activity of the entire Tor network. This yields an effective number of "web pages" many orders of magnitude larger than even Panchenko's "Open World" scenario, which suffered continuous near-constant decline in the true positive rate as the "Open World" size grew (see figure 4). This large level of classification complexity is further confounded by a noisy and low resolution featureset - one which is also relatively easy for the defender to manipulate at low cost.

> To make matters worse for a real-world adversary, the ocean of Tor Internet activity (at least, when compared to a lab setting) makes it a certainty that an adversary attempting examine large amounts of Tor traffic will ultimately be overwhelmed by false positives (even after making heavy tradeoffs on the ROC curve to minimize false positives to below 0.01%). This problem is known in the IDS literature as the Base Rate Fallacy, and it is the primary reason that anomaly and activity classification-based IDS and antivirus systems have failed to materialize in the marketplace (despite early success in academic literature).

> Still, we do not believe that these issues are enough to dismiss the attack outright. But we do believe these factors make it both worthwhile and effective to deploy light-weight defenses that reduce the accuracy of this attack by further contributing noise to hinder successful feature extraction.
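
The base-rate argument in the quoted passage can be checked with a few lines of arithmetic. This is only a sketch: the base rate and true positive rate below are made-up illustration values, and only the 0.01% false positive rate comes from the quote:

```python
def precision(base_rate, true_positive_rate, false_positive_rate):
    """Fraction of flagged flows that are real hits (Bayes' rule)."""
    true_hits = base_rate * true_positive_rate
    false_hits = (1 - base_rate) * false_positive_rate
    return true_hits / (true_hits + false_hits)


# Assumed: 1 in a million observed flows is the page the adversary looks for,
# and the classifier catches 90% of those. False positive rate: 0.01%.
p = precision(base_rate=1e-6, true_positive_rate=0.9, false_positive_rate=1e-4)
print(f"{p:.2%} of alerts are real")
```

Under these assumed numbers fewer than 1 in 100 alerts is genuine, which is the "overwhelmed by false positives" effect the design document describes. A rarer target makes it worse.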

And just recently netflow padding has been added to Tor 0.3.1.x.[2]

> So no, Tor is not 100% secure and it is very well possible that even if you use Tor your identity will be connected with some activity or even all of your activity while using the network.

That still doesn't disprove the fact that Tor is the best low-latency anonymity system and that not using Tor is much much worse than using it.

[1] : https://www.torproject.org/projects/torbrowser/design/

[2] : https://bugs.torproject.org/16861


Tor is designed to make sure that the most active sheeples stay close and so it is cheaper to deal with them.


The issues brought up in this post apply to every single service operating online, and it only applies to DuckDuckGo in any special way because of their increasing size. This includes "client" encrypted webmail and similar applications: they can be forced to deliver malicious JS that gives up your keys, or the JS client delivery can be MitM'ed.

Many people seeking enhanced privacy from DuckDuckGo are seeking privacy from Google, not from state actors. For that, you'd need additional measures like Tor, for which DuckDuckGo provides a convenient .onion service. Even if DDG is secretly tracking all our searches, they have less data to correlate it with.

My current privacy complaint on DuckDuckGo, combined with browser search UI issues (looking at you, Chrome) is over the !bangs. If you're doing "!w [sensitive topic]" instead of tabbing to Wikipedia search in your browser and searching that way, you're risking DDG or anyone who's compromised DDG seeing your Wikipedia searches, when the search should go straight to Wikipedia, Twitter, Stack Overflow, and so on.


DDG has https://DuckDuckGo.com/lite

For non js. There are of course other vectors and many not even search engine dependent.


Thanks for that. We also have https://duckduckgo.com/html

(Disclaimer: DDG staff)


I use Duckduckgo because I don't like monocultures.


Very noble.


I am participating in a peer-to-peer search engine based on free software, http://yacy.net. But I am not sure it can save us from NSA... We have to take political steps against them anyway.


A comparison of using DDG vs. Google over Tor is enlightening (GIF):

https://twitter.com/cyphunk/status/849615910545620992


Google's anti-bot measures DoS the service for me rather frequently.

(Perhaps I am a robot?)


Collecting meta-data is not benign at all, it's trivial for the usual suspects to de-anonymise, and profile based on browsing habits.

Fat protocols should marshal the true web 2.0 along with DAOs.


So does DDG produce a transparency report and if not then why not?


This is something that's always been fascinating to me. In any thread about privacy, there's always a comment along the lines of "if your threat model is a nation-state, then you're screwed." You hear it about DDG, Tor, client-side but web-delivered encrypted email, etc.

What if your threat model is a nation state? What's the proper way to ensure your privacy that does not require abstaining from the internet? Is a high degree of privacy even possible?


Privacy from the state never really existed, even before the internet. Paperwork always allowed the state to know things. Information has always been power, and information is an important tool for governments so they can be able to work. I think it always has been.

I'm more worried about privacy from private interests. The issue is what the governments do with data, and if the government let private parties access it, and where do you draw the line between the government having right to access, and companies being allowed to access it, because you will often have situations where things are not clear.

To be honest I will always have a problem with the whole privacy/surveillance debate, because there are things the government should know, but only because it is the government. Private companies are now being able to track people and have the same kind of data the government has.

So there is a big nuance, and it is often shut out by the outrage, which frankly comes from a libertarian agenda, which I have a problem with.


I wrote my own search engine and am using it. Not very difficult.


Is it a proxy to other search engines or do you search your own database of websites?


Sorry for the delay in my reply.

I search my own 1GB database. It covers 80% of my needs.


Interesting. Thanks for your reply.



