Servers - hosted using Amazon (AWS)
Bangs aren't safe. For example typing “!g kittens in basket” and hitting return, drops you off on the Google website to display your results (thus logging your IP, search term and browser info immediately).
DuckDuckGo is owned by Gabriel Weinberg, who is the founder, current CEO and controlling shareholder. Investors/shareholders include Union Square Ventures and several others. DuckDuckGo generates its income from advertising (Bing Ads) and collects affiliate revenue (Amazon, eBay).
DuckDuckGo and Yahoo partnership
DuckDuckGo has no audit
DuckDuckGo gives out an HTTP header field that identifies the address of the webpage.
Both companies were asked "if you were ordered to compromise your service/customer privacy in any way would you"
DuckDuckGo – Gabriel Weinberg said: “No one is preventing me from doing that.”
The jurisdiction and infrastructure aren’t very relevant, DDG is about protecting yourself from advertisers, not the US government.
Bangs aren’t supposed to be safe. DDG is quite upfront about that.
> Remember, though, because your search is actually taking place on that other site, you are subject to that site’s policies, including its data collection practices. 
Yes, DDG did use Yahoo to power the searches, now they use Bing. Searches are forwarded anonymously. More than enough to protect you from advertisers.
They do advertise based on search keywords. This does not conflict with their mission. Same with affiliate links.
The lack of an audit is a fair criticism.
Can you elaborate on the header?
I would expect DDG to comply with government orders, because I don’t expect DDG to protect me from the government.
> Bangs aren't safe. For example typing “!g kittens in basket” and hitting return, drops you off on the Google website to display your results (thus logging your IP, search term and browser info immediately).
The whole point of bangs is that they redirect you to the website. How do you imagine DDG redirecting you to Google without giving Google your IP address?
> DuckDuckGo is owned by Gabriel Weinberg, who is the founder, current CEO and controlling shareholder. Investors/shareholders include Union Square Ventures and several others. DuckDuckGo generates its income from advertising (Bing Ads) and collects affiliate revenue (Amazon, eBay).
Is that a bad thing? Ads are keyword based, why is that a bad thing?
> DuckDuckGo gives out an HTTP header field that identifies the address of the webpage.
What does this even mean? Referer?
Even with these hangups, I still think any near-viable option to Google should exist.
[EDIT] - just realized it's only been 20 minutes (not a day) -- looking forward to DDG's response
In practice, DDG's revenue depends on tracking visitors.
DDG asks third parties to track (Yahoo/Bing ads), so if someone asks "do you track?", they can answer "NO! WE DON'T".
They don't need to create a user profile, the keyword based ads pay enough.
This is the catch, "we don't track, but we ask our partners to do it for us".
The same with Amazon affiliate revenue.
Did you know that DuckDuckGo had access to the whole history of items bought through their affiliate link?
It's a detailed item list, like "3 x Happy Belly Dried Mango, 500 g"
Every, single, item.
No tracking (◠ ‿ ◠)
I think people don't want to be reminded that they're being tracked, which is possibly why it went poorly, but this will be a problem with any company that wants to do opt in tracking.
ISPs make money with the data they gather. Once you remove that as a revenue stream, it'll naturally increase the prices those companies are willing to charge.
So you are going to decide whether they are making enough money? Don't like tracking? Switch over to a privacy respecting ISP.
As if this was a readily available option.
If I happen to know that you like cheese sandwiches, is that really data? When did we decide that knowing something about somebody else was such a big deal?
I feel like this whole brouhaha about data has left me behind.
When data became a commodity that can be analysed, acted upon, sold and used to make predictions about groups or individuals.
Credit reporting agencies were always all about knowing something about somebody else en masse as a commodity that can be analysed, acted upon, sold and used to make predictions about groups or individuals. But these are companies best represented by people who wear proper suits and play proper golf. Nobody batted an eye.
Contrast that to today's barrage of tracking and advertising. Firefox Focus tells me that, to date, it has blocked 65k trackers in the past ten months. I mention a specific tool I need for a project to my dad, he searches for it, and then I see advertisements.
The "nerds" didn't forget to wear suits and play golf, they expanded the data collection net and feedback loop for advertisements to an unprecedented scale.
Exactly. It's intellectually bankrupt to suggest that tech is just like everything else before, all while conveniently ignoring the change in scale. Scale matters, and scaling up changes a system's implications and risks. It's irresponsible to think otherwise.
Just because I'm OK with using party poppers doesn't mean I want a flashbang to blow up next to me.
A very simple and generic example is if you prime someone with a picture of an American flag, they're more likely to vote Republican. So with a slight tweak to the algorithm, on election day we can guarantee more or fewer Republican votes just by changing the ranking of posts with a flag on Twitter.
And the more we know about you as an individual, the more we can control you by priming you in this way.
Would be careful with that. AFAIK priming studies generally fail to replicate.
The sheer scale of the intrusion as technology has advanced is the problem. You're not telling someone "I love pizza," they are stalking you across the internet and deciding that you must love pizza, because their mountains of data confirm it.
You won't have much luck getting these algorithms to change their mind about you, since they're pulling it from your browser footprint, your IP, your connected networks, social media buttons, cookies, data sold by companies you've shopped with, etc...
Why yes, Pampers, I am in the market for adult diapers. And denture cleaner.
While I think this may help maintain my privacy, I worry that on a societal level it may help us lose our sanity.
With accompanying HN discussions from 2017: https://news.ycombinator.com/item?id=14002995
If this technique became widespread, advertisers could compare users A, B, C, … and Z to filter out the “random clicks” (since they would all just as equally click these advertisements). This will leave A’s extra clicks still in the data sets, letting the advertiser build a profile for A.
I mean sure it _might_ work against something very basic but AdNauseam is a bit more advanced than that. I also avoid trusting any marketing lingo as a technological solution.
The fact is it would be very hard to catch this, even the most basic implementation. Not only that, but it's absolutely not worth doing, as only a tiny minority of profiles are compromised; we're talking 0.01% of all profiles here.
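A rough sketch of the filtering idea described above, assuming the injected noise is spread evenly across ad categories for every user. The function name, the categories and the click counts are all made up for illustration; nothing here is taken from a real ad network.

```python
from collections import Counter


def residual_interests(user_clicks: dict, population_clicks: list) -> dict:
    """Return the user's share of clicks per category minus the population's share.

    A positive residual means the user clicks that category more than the
    (noisy) crowd does, which is the signal an advertiser would keep.
    """
    population_total = Counter()
    for clicks in population_clicks:
        population_total.update(clicks)
    population_sum = sum(population_total.values())
    user_sum = sum(user_clicks.values())
    return {
        category: user_clicks[category] / user_sum
        - population_total[category] / population_sum
        for category in user_clicks
    }


# Hypothetical data: every user's extension clicks 10 ads in each category.
categories = ["cars", "cycling", "travel", "finance"]
noise = {category: 10 for category in categories}
others = [dict(noise) for _ in range(25)]  # users B..Z, noise only

# User A has the same noise plus 20 genuine clicks on cycling ads.
user_a = dict(noise)
user_a["cycling"] += 20

residual = residual_interests(user_a, others)
top_interest = max(residual, key=residual.get)
print(top_interest, residual)
```

If the noise were not uniform across users, or the genuine clicks were rare compared to the noise, this subtraction would be much less clean, which is roughly the disagreement in the replies.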
Plus, let's go to yellow pages and randint(1,999999) for a business listing and spend some time browsing that.
Plus, a random selection of Google trends in various niches.
Plus, maybe the best selling items in a few Amazon verticals.
Maybe toss in some "how to treat"+ WebMD topics.
How about beginning to target dating sites to your partner because they've figured out you are having an affair before the partner has? Or divorce attorneys?
Or notifying your employer that recent changes in your patterns make you look like a high flight risk? Or your insurance company that you may have an unreported condition? The same basic data sources can feed into that analysis.
Those are just obvious ones, and I'm not claiming they are specifically currently being done ... but the basic technology isn't a barrier.
This is why it is a bit disingenuous to echo the common refrain "what's so bad about better targeted ads" as if that adequately describes what is at stake.
And it's not just how it's used today. It's how it can be archived and analyzed 10 years down the road with all the additional information gleaned from you in that time.
And that's using 7 year old tech.
If one day I develop diabetes and start researching insulin and products, I don't need a company automatically sending me flyers for blood strips.
Oh look, now we have a historical catalog of all the customized mail that was sent to you, by companies who know what you're looking for.
And we all know how reliable and secure government and corporate databases are. /s
With the risk of pulling a semi-Godwin, you should visit the Stasi museum here in Berlin. It gives you a good insight into how powerful “knowing something about somebody” can be.
Except that the internet has made all transient data permanent (i.e. my jokes from high school are free to see on Twitter), and our social systems simply haven't kept up with this change.
Insurance is in a funny spot, because the whole premise of insurance is to protect against the unknown, but better and better data is making things known that weren’t before, requiring a change in pricing (lest their competitors beat them to it).
Would an individual change their practices if they were told that something they were doing was harmful? But they were never given that choice. Instead, the data is collected, categorized, and sold. Insurance companies can collate all the data and extrapolate trends at scale.
Now, these insurance companies also have a fiduciary responsibility to their shareholders to lower costs and raise profits. What that amounts to is that every insurance company will have to do the same, with slight variations. But we the populace will be worse off, with unknown black boxes saying worse things about us (mortality, health, safety) and few ways for us to change things aside from the gross generalizations our doctors give us.
This is a thoroughly debunked lie
Yodlee almost certainly has a record of the vast majority of your credit/debit transactions. It doesn't sell the ability to advertise to you as a specific demographic, it sells that data directly.
If you log into the free public wireless network available in most areas, your presence at that location will be logged, correlated to your identity and sold. Again, Foursquare sells this data directly, not the ability to advertise to you.
Thasos sells your location history based on mobile data. Slice sells your location data based on mobile SDKs it provides developers in return for your data. I can go on and on.
Is there at least an attempt to anonymize this data? Usually, yes. Does that work in practice? Sometimes, often even. But frequently it does not. You need fewer than 33 bits of independent data to uniquely identify anyone on Earth. There are only so many other people in your area and age demographic who share similar waking hours, interests, habits and location data.
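The 33-bit figure is just arithmetic on the world population, and it can be checked quickly. A small sketch follows; the population figure is a round assumption, and the attribute fractions are hypothetical numbers chosen for illustration, not measured data.

```python
import math

WORLD_POPULATION = 8_000_000_000  # assumption: rough current world population

# Bits needed to single out one person among everyone alive.
bits_needed = math.log2(WORLD_POPULATION)


def bits_revealed(fraction_sharing: float) -> float:
    """Bits of identifying information in an attribute shared by this fraction of people."""
    return -math.log2(fraction_sharing)


# Hypothetical attributes and the fraction of people sharing each value.
attributes = {
    "lives in a city of ~1 million": 1_000_000 / WORLD_POPULATION,
    "born in a given year": 1 / 80,
    "born on a given day of the year": 1 / 365,
    "gender": 1 / 2,
}

# Adding bits is only valid if the attributes are independent of each other.
total_bits = sum(bits_revealed(fraction) for fraction in attributes.values())

print(f"bits needed:   {bits_needed:.1f}")
print(f"bits revealed: {total_bits:.1f}")
```

Under these assumed fractions, city, birth date and gender already account for roughly 29 of the just-under-33 bits, which would leave a pool of fewer than twenty candidates before any behavioural data is added.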
It is a myth that companies don't sell user data directly because it's their golden goose. Facebook and Google don't, but most of the data they have can be reconstructed independently from a variety of other sources that freely advertise it.
The problem becomes even bigger depending on the company's agenda and the political context.
I get to decide who knows whether or not I like cheese sandwiches. If I am a free person, equal with you under the law, then how is it ok for you to violate my consent and persistently stalk me to find information about me for your own gain?
If you eat lunch at the same place, and the guy who takes your order notices you like cheese sandwiches, did he "violate your consent"?
If the same guy happens to be in line behind you a few days in a row and notices that you always order cheese sandwiches, did he "violate your consent"?
I agree the behavior is a bit creepy in the same way I (very) mildly dislike it when the sandwich shop guy knows my order, but it is also unreasonable to expect information you share with others to be kept private. It's not "stalking" for someone to observe things about the world that were made available to them. To change that would necessitate some pretty ugly regulations that would infringe on the rights of everyone (including you).
I don't like that sort of thing because you can't depend on it. Even if you had the perfect workflow for it, what about any hotel booking you have that doesn't send an email for Gmail to parse, or Gmail can't parse it? Just seems like an annoyingly probabilistic system. Aside from the obvious issues of privacy and creepiness, like how did Google Maps get information that was supposedly just between me and the hotel?
But my girlfriend absolutely loved it. She just wished it also showed us our bus route and schedule like it apparently sometimes does for her.
Roses are red,
Violets are blue,
The rest of the content
is on page 2.
This is representative of the exchange a regular person gets when browsing mainstream sites without an ad blocker.
It's not "not personal data" because it's trivial. It's not "not personal data" because I voluntarily gave it to someone.
I only want sites to store the data necessary to carry out their function, and only for as long as they need it to do so (note: keeping the lights on with ads does not count as a function).
Data and personal data are two different things. "Data" is any information. "Personal data" is information that is about you, connected to you, or identifies you personally. I recommend poking around at how GDPR defines personal data, for example, since it's at the heart of the new emerging global privacy debate.
> If I happen to know that you like cheese sandwiches, is that really data?
Yes. It's both data, and personal data.
> When did we decide that knowing something about somebody else was such a big deal?
That's not the issue. The issue is companies knowing something about everyone else, and selling that knowledge to people you didn't give it to, and using that knowledge to do things you don't really want, starting with advertising.
Is data really that ephemeral to people? Maybe it's because I worked with intelligence systems in my first career, but the idea of controlled data seems pretty straightforward. What I choose to reveal about myself should be in my hands, and part of that decision is brokering trust that it isn't shared.
What Is Data?
It's a record. It isn't hand-wavy to me or most.
If you don't think that's a big deal then you need to get to a point that it IS a big deal for you.
Okay I hear you saying “well maybe your rates should go up if you eat a lot of red meat. You’re costing everyone else a lot of money.” I’d say you have some highly misguided morals, and I’d say that such a scheme would never lead to savings for people who don’t eat red meat, but almost surely additional costs for those who do. The thing to remember is these schemes never, ever benefit you. They make other people money, generally at your expense.
Sure, sometimes (as a matter of probability). But are you really claiming that all data collection is a zero-sum game wherein only the collectors can possibly win?
Data collection isn't a zero-sum game. Between you and collectors, it's a positive-sum game with you almost always having negative gains. At the society level, this game is embedded in the negative-sum game of advertising.
I mean MI5 might be tracking me right now, but it doesn't make any difference whatsoever until the point at which they take action.
 but not me!
What is asymmetrical information permanently captured?
Is ignorant coercion the same as consent?
So, yes, it is personal data.
People generally prefer freedom over being controlled and manipulated.
The only reality left is radical vulnerability. If Google knows I was caught DUI one night 6 months ago because I was a couple percent over the limit driving home from a nice party, then everybody has to know. I have no choice but to tell the truth all the time, lest exposure be my downfall. This applies to every single bit of data gathered and every bit of information gathered from the analysis of that data.
There is a DEF CON talk that describes a German judge who jacked off to porn in chambers nearly every day, including at times he really shouldn't have. No one in his social or work life knew or was negatively affected. But these DEF CON engineers were able to deanonymize his data and reveal his behaviour. This applies to everybody now. Radical vulnerability is the only thing we have left to maintain social order. It's hell for those that want to minimize vulnerability for the sake of extreme competence.
People in positions of power or capital can access that information and I gain no social security or social safety from having secrets anymore.
It's whatever I want it to be, and I should be able to change my mind about it. It's not hard.
so FB/G send that advertising and you harvest the click-data and now have a catalogue of people that FB/G believe have heart problems. You then sell that data, or a company owning that data, or whatever, to insurance companies to use in their premium pricing.
If the information is going to be used to select users to show a particular advert to then that information is effectively able to be liberated to benefit the advertiser (the one buying the adverts on FB or G) or their customers (the ones buying the amalgamated data).
Then there's all the FB linked surveys, like "When will you die? Answer 10 questions to find out!" and that information just happens to be prime info for insurance people to use that they're not allowed to ask you for directly.
Maybe the survey company has to make a new "life metric" and sell that data to avoid the insurance company doing something they're not allowed, but surely this is how it's done?
As a privacy and FLOSS advocate, "personal data" just makes absolutely zero sense to me. There's private data and public data. Private data is your passwords and stuff not exposed to the public internet; everything else is public data and it's no longer "yours" unless it's copyrightable content.
The system would be all-encompassing but organized by agreement type: titles, loans/mortgages, insurance, purchases/warranties, ad networks, bets, etc. All managed by an open system but used by private parties. I don't advocate for this to be done by, say, Ethereum; I still think the classic system can be used to decide disagreements, but at least all of what you agreed to will become very explicit. And there can be ways to "break out" of agreements, with whatever very explicit ground rules to be followed after that.
It's not even hard in concept. Of course because data export/import mechanisms are so baroque and error-prone it will take effort to implement but that's already true with all existing systems.
Any time you export data you sign the transfer. Anyone else who then re-exports it has to sign, incorporating your signature into the export, and so on.
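A minimal sketch of what such a chain of signed exports could look like. To stay within the Python standard library it uses HMAC with a per-party secret key as a stand-in for a signature; a real scheme would presumably use public-key signatures so that third parties could verify without holding the keys. The record layout, function names and party names are all hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_export(payload: dict, exporter: str, key: bytes,
                previous: Optional[dict] = None) -> dict:
    """Create an export record that embeds the previous exporter's signature."""
    record = {
        "exporter": exporter,
        "payload": payload,
        "previous_signature": previous["signature"] if previous else None,
    }
    message = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(key, message, hashlib.sha256).hexdigest()
    return record


def verify_export(record: dict, key: bytes) -> bool:
    """Recompute the signature over everything except the signature itself."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    message = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])


# Hypothetical parties and keys.
user_key = b"user-secret"
broker_key = b"broker-secret"

data = {"likes": "cheese sandwiches"}
first = sign_export(data, "user", user_key)
second = sign_export(data, "broker", broker_key, previous=first)

# The broker's record is bound to the user's original signature.
chain_ok = verify_export(first, user_key) and verify_export(second, broker_key)

# Swapping in a different upstream signature invalidates the broker's record.
forged = dict(second, previous_signature="0" * 64)
forged_ok = verify_export(forged, broker_key)
```

Because each record signs over the previous signature, a re-exporter cannot quietly detach the data from where it came from without the check failing.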
It would actually make keeping corporate-held data clean and healthy rather much simpler, which is something people spend considerable time and money on already. And it's a basic policy mechanism to implement subject-dictated controls rather than vague, invisible, and unenforceable corporate-dictated controls such as exist today.
We gladly set up large pipelines and infrastructure to let data flow from users, through message queues, into databases, and from there into analytics workflows. But we balk at the thought of this process being anything but unidirectional, or at implementing exportable logs to track how data is transferred, combined, or analyzed.
If the way user data propagates through third parties were auditable and visible, it would definitely at least double the work of setting up user analytics. But other industries make do with similarly powerful regulations. If you can't afford to let users see what you're doing with their data, should you be allowed to collect user-level metrics anyway?
We could also blacklist a limited set of data types, as is effectively done with HIPAA, to better enforce privacy. However, even HIPAA is not restrictive enough, and there is a whole subfield of academia engaged in privacy research which has shown that even HIPAA-compliant datasets (in the sense that they don't contain certain columns of data) can be used to reveal sensitive information using relinkage against public forms of data [0, 1, 2, 3, 4]. But the tech industry is better equipped than any other industry to enforce algorithmic privacy and be good stewards of data. We just don't want to, because it's hard. Building structures up to building safety code is also hard (and, in some ways, too bureaucratic/poorly implemented. Sometimes private companies can actually copyright building code laws), but it's good that we do it, in general.
 https://en.wikipedia.org/wiki/Differential_privacy famously used by Apple
P.S. : I do think HIPAA, GDPR, etc. have their flaws. But that just means we should try to do better, rather than just blindly oppose any attempt to do better. The vast majority of privacy gains can be accomplished with the simplest changes: anonymization, pseudonymization, limits on time/spatial granularity, etc.
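To make "the simplest changes" concrete, here is a small sketch of pseudonymization plus coarser time and location, applied before an event is stored. The event fields, the key and the chosen granularity (hour, two decimal places of latitude/longitude) are assumptions for illustration only.

```python
import hashlib
import hmac
from datetime import datetime

SECRET_KEY = b"rotate-me-regularly"  # hypothetical key, kept apart from the data


def pseudonymize(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash so the raw ID is never stored."""
    return hmac.new(key, user_id.encode(), hashlib.sha256).hexdigest()[:16]


def coarsen(event: dict) -> dict:
    """Keep only what analytics needs: pseudonym, hour, and a rounded location."""
    timestamp = datetime.fromisoformat(event["timestamp"])
    return {
        "user": pseudonymize(event["user_id"]),
        "hour": timestamp.replace(minute=0, second=0, microsecond=0).isoformat(),
        "lat": round(event["lat"], 2),  # two decimals is on the order of a kilometre
        "lon": round(event["lon"], 2),
    }


# Hypothetical raw event.
event = {
    "user_id": "alice@example.com",
    "timestamp": "2019-03-01T14:37:52",
    "lat": 52.520008,
    "lon": 13.404954,
}
coarse = coarsen(event)
print(coarse)
```

This is pseudonymization, not anonymization: as the relinkage research mentioned above suggests, a stable pseudonym with enough coarse location points can often still be tied back to a person, so it reduces risk rather than removing it.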
Data monetization is an inherently harmful and unwanted business practice.
I’m thinking something like the minimum civil court valuation per month of storage or use. Likeness and data would be defined by a jury of their peers.
Our societies are shaped as much by technology as by the incessant greed of a few often couched in euphemisms like 'innovation' and 'drive' to justify their value but these only accrue to a few. Behavioral targeting and surveillance have negative externalities for everyone not making money from it, and even for them in the wider societal and long term context.
If this is the behavior we are incentivizing then either we provide strong regulations to counter greedy and unethical behavior or accept these as our fundamental driving values without fabricating a 'feel good' alternative reality as a fig leaf or feigning shock at mercenaries in our midst.
Actually, he was saying we should _have to_ opt in to data tracking, if it's something you want. [/Pedant]
It would be nice to have a global clickbait variable attached to a URL or a domain, so that it can be community-updated when titles such as this one are used; a browser extension could then read it and warn before you click.
Of course I would expect heavy abuses of such a tool...
PS: Firefox Quantum is great
Nice overview with examples (if somewhat lengthy):
Maybe it's just an interesting (sad) fact that an article about data privacy is full of trackers?
I think we all got what you were trying to say and we're just being pedantic, sorry.
One (which, to be clear, is not how I interpret your comment) is the tired gotcha "if you hate society, why do you partake in it?", which is simply not substantial enough to answer.
The other argument I've seen is actually worth some thought, though I'm not ready to give a definitive answer in either this case or as a general principle:
Entities that play a certain game effectively endorse that game by playing it.
The devil lies in the corollary to that observation: whatever prizes are awarded by that game will only go to entities that are willing to play that game.
What was the question?
I just can't find the right word to describe a point raised about a paradoxical relationship. When I read it, I find an implicit question, though I cannot put to words the exact inquiry. Maybe "Why?"
In a way, all replies on a public forum are implicit questions. (This is where I get even more hand-wavy, sorry)
Worst case for the game-hater is play once and fork off.
In some debates, it is used as a quick way to endorse the status quo without having to analyse it further. A way to "win" a debate that shouldn't be won by either party.
Instead of debating the current practices or the proposed alternative approach on their own merits and circumstances, you first require that one of the participants (the one proposing a different approach) step down from the debate and only come back when the proposed approach is fully baked, working, and has all the optimizations of the current approach in place.
It ignores the differences between the two participants — be it background, power, money, skills, context, future commitments.
Then yes, you are not tracking.
what should they do instead?
What is your suggestion for an alternative message? "We are trying to make money like everyone else, please use our search engine that is not an altruistic public good and is trying to influence regulation that will hurt our competitors".
Doesn't really roll off the tongue.
If health insurance companies lobbied the government to require that all citizens have health insurance, I'd take a ton of issue. But if health insurance companies lobbied to require doctors to get consent if they want to harvest tissue or organ samples from patients for personal research, I'd support that change.
It's all grey area but as of yet I don't think DDG is at all in the wrong.
What do you mean by that?
Regulation like GDPR only makes it more difficult (impossible) for competition to emerge. It basically secures the future dominance of big players like Google and Facebook.
Making people pay a monthly fee to get their data collected and analyzed seems like the next big business model. We will soon reach a limit on the usefulness of passively collected data and will need to switch to a more active model.
I would prefer to have complete access to data about myself, but it would be unreasonable to expect a monopoly on it.
To a certain extent, I'd agree. Specifically in the case of things like analytics data.
The real problem is people not knowing this data is being collected in the first place where you can't make any kind of informed choice on opting in.
"the data is not available publicly without explicit, informed consent, and cannot be used to identify a subject without additional information stored separately. No personal data may be processed unless it is done under a lawful basis specified by the regulation, or unless the data controller or processor has received an unambiguous and individualized affirmation of consent from the data subject. The data subject has the right to revoke this consent at any time.... Data subjects have the right to request a portable copy of the data collected by a processor in a common format... violators of the GDPR may be fined up to €20 million ..."
Key phrases here: 'explicit, informed consent' ... 'lawful basis' ... 'right to revoke this consent' ... 'fined'