Hacker News
Interview with DuckDuckGo CEO Gabe Weinberg (vox.com)
323 points by tchalla on May 27, 2019 | 167 comments

DuckDuckGo - based in the US, which has very weak privacy laws

Servers - hosted on Amazon Web Services (AWS)

Bangs aren't safe. For example typing “!g kittens in basket” and hitting return, drops you off on the Google website to display your results (thus logging your IP, search term and browser info immediately).

DuckDuckGo is owned by Gabriel Weinberg, who is the founder, current CEO and controlling shareholder. Investors/shareholders include Union Square Ventures and several others. DuckDuckGo generates its income from advertising (Bing Ads) and collects affiliate revenue (Amazon, eBay).

DuckDuckGo and Yahoo partnership: https://web.archive.org/web/20160724030640/



DuckDuckGo has never had an independent audit.

DuckDuckGo gives out an HTTP header field that identifies the address of the webpage.

Both companies were asked, "if you were ordered to compromise your service/customer privacy in any way, would you?"

DuckDuckGo – Gabriel Weinberg said: “No one is preventing me from doing that.”

I'm not sure what the purpose of this comment is. You're really grasping at straws if you're trying to argue that DDG is the same as Google/Bing etc.

> Bangs aren't safe. For example typing “!g kittens in basket” and hitting return, drops you off on the Google website to display your results (thus logging your IP, search term and browser info immediately).

That's the whole point of bangs: they redirect you to the other website. How do you imagine DDG redirecting you to Google without giving Google your IP address?
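For what it's worth, a bang is essentially just a client-side redirect to the target site's own search URL. A minimal sketch (the !g URL template matches Google's public search endpoint; the function and dictionary names are mine, not DDG's implementation):

```python
from urllib.parse import quote_plus

# Hypothetical sketch of how a bang works: the bang prefix is parsed
# off the query and the browser is redirected to the target site's
# search URL -- so the query necessarily reaches that site.
BANGS = {"!g": "https://www.google.com/search?q={}"}

def bang_redirect(query: str) -> str:
    bang, _, terms = query.partition(" ")
    template = BANGS.get(bang)
    if template is None:
        return ""  # no bang: DDG serves results itself
    return template.format(quote_plus(terms))

# bang_redirect("!g kittens in basket")
# -> "https://www.google.com/search?q=kittens+in+basket"
```

There is no way to build this feature without handing the query (and the user's IP, via the redirect) to the target site.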

> DuckDuckGo is owned by Gabriel Weinberg, who is the founder, current CEO and controlling shareholder. Investors/shareholders include Union Square Ventures and several others. DuckDuckGo generates its income from advertising (Bing Ads) and collects affiliate revenue (Amazon, eBay).

Is that a bad thing? The ads are keyword-based; why would that be a problem?

> Duckduckgo gives out a HTTP header field that identifies the address of the webpage.

What does this even mean? Referer?

I feel you may be expecting DDG to be something it’s not meant to be.

The jurisdiction and infrastructure aren't very relevant; DDG is about protecting yourself from advertisers, not the US government.

Bangs aren’t supposed to be safe. DDG is quite upfront about that.

> Remember, though, because your search is actually taking place on that other site, you are subject to that site’s policies, including its data collection practices. [1]

Yes, DDG did use Yahoo to power its searches; now they use Bing. Searches are forwarded anonymously, which is more than enough to protect you from advertisers.

They do advertise based on search keywords. This does not conflict with their mission. Same with affiliate links.

The lack of an audit is a fair criticism.

Can you elaborate on the header?

I would expect DDG to comply with government orders, because I don’t expect DDG to protect me from the government.

[1] https://duckduckgo.com/bang

As a fan of DDG (I use it daily), I had no idea there was this much risk -- I was surprised enough to find out they offloaded search to Bing a while ago (but I don't blame them; running your own index is probably pretty hard).

Even with these hangups, I still think any near-viable alternative to Google should exist.

[EDIT] - just realized it's only been 20 minutes (not a day) -- looking forward to DDG's response

There is so much to say.

In practice, DDG's revenue depends on tracking visitors.

DDG lets third parties do the tracking (Yahoo/Bing ads), so if someone asks "do you track?", the answer is "NO! WE DON'T!" They don't need to create a user profile; the keyword-based ads pay enough.

This is the catch, "we don't track, but we ask our partners to do it for us".

The same goes for Amazon affiliate revenue. Did you know that DuckDuckGo had access to the whole history of items bought through their affiliate link?

It's a detailed item list, like "3 x Happy Belly Dried Mango, 500 g". Every. Single. Item.

No tracking (◠ ‿ ◠)

AT&T tried this and it didn't go well. For a time you could get a discount on your internet if you opted into their data tracking programs. From what I heard, there was a lot of complaining and little uptake, which is backed up by the fact that they cancelled the program less than a year later and just started charging everyone the lower price (and claimed that they stopped the tracking).

I think people don't want to be reminded that they're being tracked, which is possibly why it went poorly, but this will be a problem with any company that wants to do opt in tracking.

At 13m in, he uses the fact that around 10% have turned on DNT in their browser as proof that people actually care about tracking, but haven't multiple browsers started turning on DNT by default?

Yes, Microsoft enabled it by default in their browser (I can't remember if this was in the days of IE or Edge), and advertisers used that as an excuse that "DNT doesn't represent user choice" and that they could ignore it.

Safari is removing (or has already removed?) DNT, simply because websites kept tracking visitors regardless of whether DNT was enabled.

And because DNT was used as an additional data point to fingerprint the browser and enable more tracking.

Just confirming, it was already removed!

Discounting Internet by axing privacy is a nasty idea. Privacy should be available by default without any added price tags.

This isn't some moral position, it's an economic one.

ISPs make money with the data they gather. Once you remove that revenue stream, they'll naturally raise the prices they charge.

They make more than enough money by providing the Internet service. They should have nothing to do with your data, especially considering how easily they can spy on you. It's as if the post office charged you more for not sniffing through your mail. This shouldn't even come up as a "feature"; it should be the always-enabled default.

The USPS likely would charge you more if they weren't already in the business of selling both address lists and access (carrier-route deliveries of bulk mail). So they might not be sniffing your mail, but it seems the postal service needs your data to survive.

That's why it should be a government service. No more tracking by private companies reselling your data to everyone and their dog. No more foreign governments selling you crazy ads.

It's not like government can't abuse privacy all the same.

> They make more than enough money by providing the Internet service

So you are going to decide whether they are making enough money? Don't like tracking? Switch over to a privacy respecting ISP.

It's obvious greed. There is no need to violate people's privacy for the ISP. Which you yourself point out. They don't do it out of need, but because they can.

> Switch over to a privacy respecting ISP.

As if this was a readily available option.

Verizon did it too; they would give you a token amount of "rewards points" or something like that, which you could use on accessories or a bill credit, if I recall.

We use this term "personal data" so much. But what IS data?

If I happen to know that you like cheese sandwiches, is that really data? When did we decide that knowing something about somebody else was such a big deal?

I feel like this whole brouhaha about data has left me behind.

>When did we decide that knowing something about somebody else was such a big deal?

When data became a commodity that can be analysed, acted upon, sold and used to make predictions about groups or individuals.

No, it was when the nerds started doing it.

Credit reporting agencies were always all about knowing things about somebody else, en masse, as a commodity that can be analysed, acted upon, sold and used to make predictions about groups or individuals. But these are companies best represented by people who wear proper suits and play proper golf. Nobody batted an eye.

It's not about nerds and suits, it's about how frequently we are exposed to data collection. Credit reporting agencies collected a powerful set of data (personal finance information), but you wouldn't really notice it until you applied for a loan, mortgage, or credit card.

Contrast that to today's barrage of tracking and advertising. Firefox Focus tells me to date that it has blocked 65k trackers in the past ten months. I mention a specific tool I need for a project to my dad, he searches for it, and then I see advertisements.

The "nerds" didn't forget to wear suits and play golf, they expanded the data collection net and feedback loop for advertisements to an unprecedented scale.

>The "nerds" didn't forget to wear suits and play golf, they expanded the data collection net and feedback loop for advertisements to an unprecedented scale.

Exactly. It's intellectually bankrupt to suggest that tech is just like everything else before, all while conveniently ignoring the change in scale. Scale matters, and scaling up changes a system's implications and risks. It's irresponsible to think otherwise.

Just because I'm OK with using party poppers doesn't mean I want a flashbang to blow up next to me.

It's also important to understand that the knowledge of how to utilize this data to better control your actions has grown exponentially too. And the scale of computing power lets that manipulation be tailor fit to you as a target. We know exactly what to show you, when to show it to you, and in what order so that we can influence your behavior.

A very simple and generic example is if you prime someone with a picture of an American flag, they're more likely to vote Republican. So with a slight tweak to the algorithm, on election day we can guarantee more or fewer Republican votes just by changing the ranking of posts with a flag on Twitter.

And the more we know about you as an individual, the more we can control you by priming you in this way.

> A very simple and generic example is if you prime someone with a picture of an American flag, they're more likely to vote Republican.

Would be careful with that. AFAIK priming studies generally fail to replicate.

Never heard of this phenomenon, any study/source info?

You are saying that people like the banks more than they like tech companies? Or that credit companies had anything nearly approaching the data that is available today -- 24/7 location data, all purchases, all associates, and more?

Credit reporting bureaus are nerds in suits. I'm inclined to think of them like actuaries or insurance guys.

It's easy to dismiss privacy because it's the absence of something: namely the absence of information that can be leveraged against you. It's a preventative measure to avert future harms like discrimination, data leaks to adversaries, and misguided decision-making. It's cheaper and more effective to cut that off at the source rather than trusting companies (who were sneaky in collecting that data in the first place) to be responsible stewards of that information.

Knowing something is one thing, but having a perfect memory of everything you've ever desired and liked is completely different.

The sheer scale of the intrusion as technology has advanced is the problem. You're not telling someone "I love pizza," they are stalking you across the internet and deciding that you must love pizza, because their mountains of data confirm it.

You won't have much luck getting these algorithms to change their mind about you, since they're pulling it from your browser footprint, your IP, your connected networks, social media buttons, cookies, data sold by companies you've shopped with, etc...

I almost wonder if best practices in the future will incorporate "data hygiene" involving 15-20 minutes of random browsing of topics you're completely uninterested in, just to add noise to your profile.

Why yes, Pampers, I am in the market for adult diapers. And denture cleaner.

That doesn't work at all. Are you going to fill your network with people who aren't really your friends? Visit locations you don't want to visit? Devote more than half your time to this disinformation?

I think we're already starting to see it, but not by that name (I love that name, btw). I think the rise in disinformation is a reaction to how hard it is to keep things private. If I can't hide information, then I'll insert false information so that nobody knows what the true information is.

While I think this may help maintain my privacy, I worry that on a societal level it may help us lose our sanity.

There's at least one plugin for that: https://noiszy.com/

With accompanying HN discussions from 2017: https://news.ycombinator.com/item?id=14002995

That is easy to filter out.

No, it's not easy. Hell, the systems fail on bloody misclicks. Click on a steak link by accident and prepare yourself for a month of steak ads, even though you have been vegan for the better part of a decade and never browsed anything steak-related before.

It’s not a threat model that advertising systems care about. If it becomes widespread, the filters will be smarter.

How, for example?

Most humans are bad at coming up with random data and queries; these things would tend to be obviously unrelated to their actual profile. Not to mention that such "randomisation sessions" would be very bursty time-wise unless done automatically and continuously.

As far as I understood, OP implied it would be an automated process, which already exists: https://adnauseam.io/

See the technique Apple uses to gather aggregate data about their customers: https://en.wikipedia.org/wiki/Differential_privacy

If this technique became widespread, advertisers could compare users A, B, C, ..., Z to filter out the "random clicks" (since all users would be equally likely to click these advertisements). This leaves A's extra clicks still in the data sets, letting the advertiser build a profile for A.
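For contrast, differential privacy's classic building block, randomized response, shows how per-user noise can coexist with accurate aggregates. This is a toy sketch of the general idea, not Apple's actual mechanism:

```python
import random

def randomized_response(truth: bool, rng: random.Random) -> bool:
    # Flip a coin: heads, answer truthfully; tails, answer randomly.
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5

def estimate_true_rate(answers):
    # E[yes] = 0.5*p + 0.25, so invert to recover the population rate p.
    yes_rate = sum(answers) / len(answers)
    return 2 * (yes_rate - 0.25)

rng = random.Random(42)
truths = [i % 10 == 0 for i in range(100_000)]  # true rate: 10%
answers = [randomized_response(t, rng) for t in truths]
# estimate_true_rate(answers) lands close to 0.10, even though no
# single noisy answer reveals any individual's truth.
```

The catch the parent points out still stands: noise that is uniform across users averages out, so only the aggregate is protected, not necessarily any one user's distinctive behavior.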

> to filter out the “random clicks” (since they would all just as equally click these advertisements).

I mean, sure, it _might_ work against something very basic, but AdNauseam is a bit more advanced than that. I also avoid trusting any marketing lingo as a technological solution.

The fact is it would be very hard to catch this, even the most basic implementation. Not only that, but it's absolutely not worth doing as long as only a tiny minority of profiles are compromised; we're talking 0.01% of all profiles here.

Not exactly this, but yes.

Plus, let's go to the yellow pages, pick business listing number randint(1, 999999), and spend some time browsing that.

Plus, a random selection of Google trends in various niches.

Plus, maybe the best selling items in a few Amazon verticals.

Maybe toss in some "how to treat"+ WebMD topics.

What for? What is your threat model? Getting ads for things you are actually interested in?

I haven't really resolved my own thinking about all of this, but I think that take is overly simplistic.

How about beginning to target dating sites at your partner because they've figured out you are having an affair before your partner has? Or divorce attorneys?

Or notifying your employer that recent changes in your patterns make you look like a high flight risk? Or your insurance company that you may have an unreported condition? The same basic data sources can feed into that analysis.

Those are just obvious ones, and I'm not claiming they are specifically currently being done ... but the basic technology isn't a barrier.

This is why it is a bit disingenuous to echo the common refrain "what's so bad about better targeted ads" as if that adequately describes what is at stake.


And it's not just how it's used today. It's how it can be archived and analyzed 10 years down the road with all the additional information gleaned from you in that time.


And that's using 7 year old tech.

If one day I develop diabetes and start researching insulin and products, I don't need a company automatically sending me flyers for blood strips.


Oh look, now we have a historical catalog of all the customized mail that was sent to you, by companies who know what you're looking for.

And we all know how reliable and secure government and corporate databases are. /s

Historically people have not liked other people building and curating dossiers about them.

If I told you that I like cheese sandwiches, then no. However, in normal, human interactions there's the reasonable expectation that if I tell you something, you won't go and sell it to the highest bidder who can then in turn do whatever they please with the information and further propagate it. Privacy isn't a huge problem in society because as humans most people have adequate social skills and understand what is and isn't appropriate to disclose and share. The problem is that tech companies don't treat their users as humans and don't follow basic conventions and common sense.

> When did we decide that knowing something about somebody else was such a big deal?

With the risk of pulling a semi-Godwin, you should visit the Stasi museum here in Berlin. It gives you a good insight into how powerful “knowing something about somebody” can be.

Echoing a comment I made a while back, there's a difference between transient (a humorous remark amongst friends) and permanent information (a humorous remark printed in a book).

Except that the internet has made all transient data permanent (i.e. my jokes from high school are free to see on Twitter), and our social systems simply haven't kept up with this change.

No, no one cares about your lunch today. But if I have your history of eating sandwiches every day, and sell it to your insurance company, who now increases your quote because an algorithm says that you have a higher risk for heart attacks or colon cancer; then it might matter.

They can effectively do the same thing by offering discounts to those who submit their lunch data, like they already do with exercise data from wearable devices.

Insurance is in a funny spot, because the whole premise of insurance is to protect against the unknown, but better and better data is making things known that weren’t before, requiring a change in pricing (lest their competitors beat them to it).

In other words, they ask you to opt in.

But the sick part is the asymmetry: the company(ies) know, and the individuals don't.

Would an individual change their practices if they were told that something they were doing was harmful? But they were never given that choice. Instead, the data is collected, categorized, and sold. Insurance companies can collate all the data and extrapolate trends at scale.

Now, these insurance companies also have a fiduciary responsibility to lower costs and raise profits for their shareholders. What that amounts to is that every insurance company will have to do the same, with slight variations. But we the populace will be worse off, with unknown black boxes saying worse things about us (mortality, health, safety) and leaving us few ways to change anything beyond the gross generalizations our doctors give us.

> fiduciary responsibility to lower costs and raise profits to their shareholders.

This is a thoroughly debunked lie

Good thing none of the major advertisers would even consider selling the actual data connected to you to anyone else. They will happily sell the ability to advertise to you, but that is a very different thing.

That's not really correct.

Yodlee almost certainly has a record of the vast majority of your credit/debit transactions. It doesn't sell the ability to advertise to you as a specific demographic, it sells that data directly.

If you log into the free public wireless network available in most areas, your presence at that location will be logged, correlated to your identity and sold. Again, Foursquare sells this data directly, not the ability to advertise to you.

Thasos sells your location history based on mobile data. Slice sells your location data based on mobile SDKs it provides developers in return for your data. I can go on and on.

Is there at least an attempt to anonymize this data? Usually, yes. Does that work in practice? Sometimes, often even. But frequently it does not. You need fewer than 33 bits of independent data to uniquely identify anyone on Earth. There are only so many other people in your area and age demographic who share similar waking hours, interests, habits and location data.
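The 33-bit figure is just information theory: the base-2 logarithm of the world's population. A quick sanity check (population figure approximate for 2019):

```python
import math

# ~7.7 billion people (2019); each independent bit of information
# about a person cuts the candidate set in half.
world_population = 7.7e9
bits_needed = math.log2(world_population)
# bits_needed comes out just under 33, so roughly 33 independent
# bits are enough to single out one person on Earth.
```

ZIP code, birth date, and gender alone already account for a large share of those bits, which is why "anonymized" records with a few demographic fields re-identify so easily.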

It is a myth that companies don't sell user data directly because it's their golden goose. Facebook and Google don't, but most of the data they have can be reconstructed independently from a variety of other sources that freely advertise it.

If, as an individual, I simply know that, it's fine most of the time. If I start taking notes on paper, the chances it gets weird increase. If I do it systematically for a lot of things in your life, we enter creep territory. Now if, instead of an individual, a company does that, and instead of one person it does it to millions, there is a problem.

The problem becomes even bigger depending on the company's agenda and the political context.

It's a huge deal. I didn't tell you I like cheese sandwiches. You found out against my consent. You violated my consent, which is a huge deal for me.

I get to decide who knows whether or not I like cheese sandwiches. If I am a free person, equal with you under the law, then how is it OK for you to violate my consent and persistently stalk me to find information about me for your own gain?

If you order cheese sandwiches every time you eat out, and your friends observe that you like cheese sandwiches without you telling them, did they "violate your consent"?

If you eat lunch at the same place, and the guy who takes your order notices you like cheese sandwiches, did he "violate your consent"?

If the same guy happens to be in line behind you a few days in a row and notices that you always order cheese sandwiches, did he "violate your consent"?

I agree the behavior is a bit creepy in the same way I (very) mildly dislike it when the sandwich shop guy knows my order, but it is also unreasonable to expect information you share with others to be kept private. It's not "stalking" for someone to observe things about the world that were made available to them. To change that would necessitate some pretty ugly regulations that would infringe on the rights of everyone (including you).

I disagree with that. It's not someone remembering I ate a cheeseburger; they're writing down that I ate a cheeseburger and using that information for profit or to keep tabs on me. You need consent for that. They didn't notice I eat cheeseburgers as a matter of coincidence; they intentionally set out to see what food I eat and use that information. There is a term for that: stalking (which is illegal).

You told someone you like cheese sandwiches. You don't get to decide if they remember you telling them that or not.

You also don't get to decide who they pass the information on to. The sandwich maker analogy isn't that great because IRL robbers off sandwich shop owners daily as a means of keeping their data private.

Is it stalking if you're willingly giving information in the form of search queries?

Would you be willing to send me, a stranger, your complete browser history?

Well, what will you give me in return? If you give me something good like high quality maps and directions then maybe.

It’s often not a trade or is a trade for something rubbish.

What do you mean by “something rubbish”? Can you give an example? As they say, one man’s trash is another’s treasure.

I am going on a trip with my girlfriend, and while looking at her Google Maps, I saw that it had marked the hotel we were staying at along with the dates we would be there. Right there next to the hotel marker. Impressive, really.

I don't like that sort of thing because you can't depend on it. Even with a perfect workflow, what about a hotel booking that doesn't send an email for Gmail to parse, or one that Gmail can't parse? It just seems like an annoyingly probabilistic system. And that's aside from the obvious issues of privacy and creepiness: how did Google Maps get information that was supposedly just between me and the hotel?

But my girlfriend absolutely loved it. She just wished it also showed us our bus route and schedule like it apparently sometimes does for her.

I'll give you a poem.

  Roses are red,
  Violets are blue,
  The rest of the content
  is on page 2.
Please send me your data now.

This is representative of the exchange a regular person gets when browsing mainstream sites without an ad blocker.

Yes, absolutely, that is (personal) data if it's attributable to a person such as myself.

It's not "not personal data" because it's trivial. It's not "not personal data" because I voluntarily gave it to someone.

I only want sites to store the data necessary to carry out their function, and only for as long as they need it to do so (note: keeping the lights on with ads does not count as a function).

> We use this term "personal data" so much. But what IS data? [...] I feel like this whole brouhaha about data has left me behind.

Data and personal data are two different things. "Data" is any information. "Personal data" is information that is about you, connected to you, or identifies you personally. I recommend poking around at how GDPR defines personal data, for example, since it's at the heart of the new emerging global privacy debate.





> If I happen to know that you like cheese sandwiches, is that really data?

Yes. It's both data, and personal data.

> When did we decide that knowing something about somebody else was such a big deal?

That's not the issue. The issue is companies knowing something about everyone else, and selling that knowledge to people you didn't give it to, and using that knowledge to do things you don't really want, starting with advertising.


Is data really that ephemeral to people? Maybe it's because I worked with intelligence systems in my first career, but the idea of controlled data seems pretty straightforward. What I choose to reveal about myself should be in my hands, and part of that decision is brokering trust that it isn't shared.

What is data? It's a record. It isn't handwavy to me, or to most people.

Ultimately your fondness for cheese sandwiches ends up in an aggregated dossier of all information about you. The type of bread you like. Your SSN. Your income.

If you don't think that's a big deal then you need to get to a point that it IS a big deal for you.

Suppose someone has collected data on my groceries. They now know I buy a lot of red meat, and may then figure I must be eating a lot of red meat. Suppose my insurance rates now go up.

Okay I hear you saying “well maybe your rates should go up if you eat a lot of red meat. You’re costing everyone else a lot of money.” I’d say you have some highly misguided morals, and I’d say that such a scheme would never lead to savings for people who don’t eat red meat, but almost surely additional costs for those who do. The thing to remember is these schemes never, ever benefit you. They make other people money, generally at your expense.

So you are saying that insurance company should just have to accept higher loss from you?

I’d say that for-profit insurance as a concept is unethical, but also yes.

> They make other people money, generally at your expense.

Sure, sometimes (as a matter of probability). But are you really claiming that all data collection is a zero-sum game wherein only the collectors can possibly win?

Not sometimes, almost always.

Data collection isn't a zero-sum game. Between you and the collectors, it's a positive-sum game in which you almost always have negative gains. At the societal level, this game is embedded in the negative-sum game of advertising.

OTOH I'm not sure why stalking is legal, provided you do it in bulk.

Someone [1] might instead ask "why is stalking illegal if you're not doing anything that causes a [deleterious] effect on the person?"

I mean MI5 might be tracking me right now, but it doesn't make any difference whatsoever until the point at which they take action.

[1] but not me!

And then, you may ask "why bulk stalking is legal given that it has deleterious effect on people". Deleterious by means of ads, where the point of most of them is to distract you from what you were doing and convince you to make a purchase against your best interest.

A big example is Facebook looking at how long people spent on their feed, how much stuff they liked, how much they responded, whether those responses were negative or positive, whether they spent more time on negative content, etc. Once they had a decent understanding of that, they literally A/B tested whether people stayed on Facebook longer if they made them sad or if they made them happy. So yeah, if 'data' allows you to run clandestine psychology experiments at massive scale, that is a big deal.

> their feed

Questioning 'what is data' feels disingenuous, hoping the essence is lost in the smoke and noise.

What is asymmetrical information permanently captured?

Is ignorant coercion the same as consent?

If you're a person and I've never met you, it would be creepy if you knew what kind of cheese sandwiches I particularly like.

So, yes, it is personal data.

They have met you. You order sandwiches in their cafe every day, if the metaphor is followed.

Except that companies like Facebook have an employee at the exit of every shop/cafe/bar/cinema/park you visit. This employee notes the time, place, and your name in a little notebook. Every night the notebooks are shipped to HQ and the data is combined into a profile about you. Somehow this is not creepy.

How we spend our time and money is important personal data. I bet that nobody will part with this information in a totally transparent way to people they personally know.

The whole reason for having this amount of surveillance is to control behavior.

People generally prefer freedom over being controlled and manipulated.

I honestly don't understand how we're supposed to operate being monitored 24/7. The idea that any information that contains a personal, social or corporate advantage is likely going to go through an electronic device attached to FAANG, is 1984 on steroids. FAANG + MS has the information to out-compete, out-advantage anybody at anytime.

The only reality left is radical vulnerability. If google knows I was caught DUI one night 6 months ago because I was a couple percent over the limit driving home from a nice party, then everybody has to know. I have no choice but to tell the truth all the time, lest exposure be my downfall. This applies to every single bit of data gathered and every bit of information gathered from the analysis of that data.

There is a DEF CON talk that describes a German judge who jacked off to porn in chambers nearly every day, including at times he really shouldn't have. No one in his social or work life knew or was negatively affected. But these DEF CON engineers were able to deanonymize his data and reveal his behaviour. This applies to everybody now. Radical vulnerability is the only thing we have left to maintain social order. It's hell for those who want to minimize vulnerability for the sake of extreme competence.

People in positions of power or capital can access that information and I gain no social security or social safety from having secrets anymore.

> We use this term "personal data" so much. But what IS data?

> If I happen to know that you like cheese sandwiches, is that really data? When did we decide that knowing something about somebody else was such a big deal?

It's whatever I want it to be, and I should be able to change my mind about it. It's not hard.

When my health insurance goes up because of it?

FB and Google will never share your PII with a 3rd party in the way you're describing (on purpose - and hacks are few and far between). They will sell permission to advertise to you - and even then, only if you're targeted in a bucket of a couple of thousand real matches or more.

"We want to target all people who have a heart condition" ...

so FB/G serve that advertising, you harvest the click data, and now you have a catalogue of people that FB/G believe have heart problems. You then sell that data (or a company owning that data does, or whatever) to insurance companies to use in their premium pricing.

If the information is going to be used to select users to show a particular advert to then that information is effectively able to be liberated to benefit the advertiser (the one buying the adverts on FB or G) or their customers (the ones buying the amalgamated data).

Then there's all the FB linked surveys, like "When will you die? Answer 10 questions to find out!" and that information just happens to be prime info for insurance people to use that they're not allowed to ask you for directly.

Maybe the survey company has to make a new "life metric" and sell that data to avoid the insurance company doing something they're not allowed, but surely this is how it's done?

Given that you can track who clicked on your ad yourself, this essentially turns Google and FB into a query engine on personal data. Placing a highly-targeted ad is like SELECT [what you can track yourself] FROM person WHERE [match against targeting criteria];. The recent voting manipulation scandals were essentially about using Facebook like this.
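To make the analogy concrete, here's a toy sketch of the "ad platform as query engine" flow described above. None of these names correspond to a real ad API; the profiles and targeting criteria are invented:

```python
# What the platform knows; the advertiser never sees these profiles directly.
people = [
    {"id": 1, "interests": ["running", "heart condition"]},
    {"id": 2, "interests": ["cooking"]},
    {"id": 3, "interests": ["heart condition"]},
]

def place_targeted_ad(criterion):
    """Model of a targeted ad buy: the platform matches internally,
    but each click lands on the advertiser's own site, where the visitor
    can be identified -- effectively leaking the WHERE clause."""
    matches = [p for p in people if criterion in p["interests"]]
    return [p["id"] for p in matches]

# Roughly: SELECT id FROM person WHERE 'heart condition' IN interests;
clicked = place_targeted_ad("heart condition")
print(clicked)  # -> [1, 3]
```

The point is that even though the platform never hands over the profile table, a sufficiently targeted campaign plus the advertiser's own click tracking reconstructs its contents.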

The data is too messy for that to work in practice.

I've been hammering this into people's heads for over a decade now. Anything you put in public is not "personal data". If you go to the town square and yell that you like cheesecake, do you then expect everyone to forget it upon request? Should the local cake shop forget that you just said you like cheesecake?

As a privacy and FLOSS advocate, I find "personal data" just makes absolutely zero sense. There's private data and public data. Private data is your passwords and stuff not exposed to the public internet; everything else is public data, and it's no longer "yours" unless it's copyrightable content.

Not only should we opt in but there should be a (digitally) signed chain of custody showing the exact derivation of any information held by 3rd parties that in itself, or when combined with others, can identify and reveal information about individuals.

Be real. You are saying that data should not be collected at all.

I've long dreamed of a contract system that does this. Every contract you're a part of becomes explicit. We get paid for our data based on our preferences. Let's say the cost of showing me an ad is $51; so be it. If that means I can't read a blog post, fine: at least I agreed to the terms at a price.

The system would be all-encompassing but organized by agreement type - titles, loans/mortgages, insurance, purchases/warranties, ad networks, bets, etc. - all managed by an open system but used by private parties. I don't advocate for this to be done by, say, Ethereum; I still think the classic system can be used to decide disagreements, but at least everything you agreed to would become very explicit. And there could be ways to "break out" of agreements, with whatever very explicit ground rules to be followed after that.

I'm absolutely being real, and I know how to do it.

It's not even hard in concept. Of course because data export/import mechanisms are so baroque and error-prone it will take effort to implement but that's already true with all existing systems.

Any time you export data you sign the transfer. Anyone else who then re-exports it has to sign, incorporating your signature in to the export, and so on.

It would actually make keeping corporate-held data clean and healthy rather much simpler, which is something people spend considerable time and money on already. And it's a basic policy mechanism to implement subject-dictated controls rather than vague, invisible, and unenforceable corporate-dictated controls such as exist today.
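The signing mechanism described in the two comments above can be sketched in a few lines. This is a minimal illustration only: HMAC stands in for real public-key signatures (e.g. Ed25519), and the keys, record format, and party names are all made up:

```python
import hashlib
import hmac
import json

def sign_transfer(key: bytes, record: dict, prior_sig: str = "") -> str:
    """Sign a data export, folding in the previous holder's signature
    so every re-export extends an auditable chain of custody."""
    payload = json.dumps(record, sort_keys=True) + prior_sig
    return hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()

alice_key, broker_key = b"alice-secret", b"broker-secret"
record = {"subject": "user42", "fields": ["email", "zip"]}

sig1 = sign_transfer(alice_key, record)         # original export
sig2 = sign_transfer(broker_key, record, sig1)  # re-export chains sig1

# An auditor holding the keys can re-derive and verify the whole chain:
assert sig2 == sign_transfer(broker_key, record, sign_transfer(alice_key, record))
```

With asymmetric signatures instead of shared-key HMAC, the subject could verify the chain without trusting any single holder of the data.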

I work with user data as part of my job.

We gladly set up large pipelines and infrastructure to let data flow from users, through message queues, into databases, and from there into analytics workflows. But we balk at the thought of this process being anything but unidirectional, or of implementing exportable logs to track how data is transferred, combined, or analyzed.

If the way user data propagates through third parties were auditable and visible, it would definitely at least double the work of setting up user analytics. But other industries make do with similarly powerful regulations. If you can't afford to let users see what you're doing with their data, should you be allowed to collect user-level metrics at all?

We could also blacklist a limited set of data types, as is effectively done with HIPAA, to better enforce privacy. However, even HIPAA is not restrictive enough, and there is a whole subfield of academia engaged in privacy research which has shown that even HIPAA-compliant datasets (in the sense that they don't contain certain columns of data) can be used to reveal sensitive information via re-linkage against public forms of data [0, 1, 2, 3, 4]. But the tech industry is better equipped than any other industry to enforce algorithmic privacy and be good stewards of data. We just don't want to, because it's hard. Building structures up to building safety code is also hard (and, in some ways, too bureaucratic/poorly implemented; sometimes private companies can actually copyright building code laws [5]), but it's good that we do it, in general.

[0] https://en.wikipedia.org/wiki/K-anonymity

[1] https://en.wikipedia.org/wiki/L-diversity

[2] https://en.wikipedia.org/wiki/T-closeness

[3] https://en.wikipedia.org/wiki/Differential_privacy famously used by Apple

[4] https://en.wikipedia.org/wiki/De-identification

[5] https://techcrunch.com/2019/04/09/can-the-law-be-copyrighted...

P.S. : I do think HIPAA, GDPR, etc. have their flaws. But that just means we should try to do better, rather than just blindly oppose any attempt to do better. The vast majority of privacy gains can be accomplished with the simplest changes: anonymization, pseudonymization, limits on time/spatial granularity, etc.
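For anyone unfamiliar with k-anonymity [0], here's a toy check. The dataset, column names, and choice of quasi-identifiers are invented; real audits also have to handle the sensitive-attribute weaknesses that l-diversity [1] and t-closeness [2] address:

```python
from collections import Counter

# A dataset is k-anonymous (w.r.t. chosen quasi-identifiers) if every
# combination of those identifiers appears in at least k rows.
rows = [
    {"zip": "19104", "age_band": "30-40", "diagnosis": "flu"},
    {"zip": "19104", "age_band": "30-40", "diagnosis": "asthma"},
    {"zip": "19103", "age_band": "20-30", "diagnosis": "flu"},
]

def is_k_anonymous(rows, quasi_ids, k):
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    return min(groups.values()) >= k

# The third row is unique on (zip, age_band), so re-linkage against a
# public dataset with those columns would single that person out.
print(is_k_anonymous(rows, ["zip", "age_band"], 2))  # -> False
```

The point of the research cited above is that stripping "identifying" columns is not enough; it's the joint distribution of the remaining columns that leaks.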

The problem for businesses, of course, is that nobody would opt in. Data collection is something literally nobody wants, and the only reason it survives is because most people don't know what these companies are doing, or aren't able to keep up with the ways to decline being included in it.

Data monetization is an inherently harmful and unwanted business practice.

No one would partake in something they don’t like if they’re made aware of it and have a choice... well, good?

Let’s take it a step further and have AdBuddies from Maniac

I loved that part the most about Maniac. I hate ads, but at least AdBuddy is entirely more obvious. It's like sponsored living.

Or make a kind of asset out of a persons likeness and data. Endow individuals with ownership of their likeness and data. Give them something they can demand payment for storing, using, and selling.

I’m thinking something like the minimum civil court valuation per month of storage or use. Likeness and data would be defined by a jury of their peers.

If anyone is convinced what they are doing is good for end users they would let them make the choice. But expending time and energy to design and code deceptive dialogs with misinformation and avoid transparency betrays the opposite. You don't need an ethics course for this, it's willful fraud and deception.

Our societies are shaped as much by technology as by the incessant greed of a few, often couched in euphemisms like 'innovation' and 'drive', whose benefits accrue only to that few. Behavioral targeting and surveillance have negative externalities for everyone not making money from them, and even for those who are, in the wider societal and long-term context.

If this is the behavior we are incentivizing then either we provide strong regulations to counter greedy and unethical behavior or accept these as our fundamental driving values without fabricating a 'feel good' alternative reality as a fig leaf or feigning shock at mercenaries in our midst.

TLDR; Most money made from advertising is still contextual, however because it's possible for G/FB to monopolize behavioral advertising and there is no regulation to prevent it, they do it anyway. Opt out is stupid: I'd like food that isn't tainted, please. Filter bubbles are bad news macro-politically. Regulation is rarely effective but still a good idea. Corporate fines should be two orders of magnitude larger. Instagram does evil things to your brain.

It’s already there. That’s one of the European GDPR core rules.

we should, but can we? of course not... it needs to be a law

What a badly copy-edited headline (from the original article). Maybe it's just me, but my first reaction reading it was "Why is the CEO of DuckDuckGo advocating opting in to data tracking?"

Actually, he was saying we should _have to_ opt in to data tracking, if it's something you want. [/Pedant]

In cases like this we give up trying to finesse it and just say "Interview". Title changed thus above.

This is why this is blatant clickbait

The only way to stop this is by starving their hunger for clicks.

It would be nice to have a global clickbait flag attached to a URL or a domain, community-updated when titles such as this one are used; a browser extension could then read it and warn you before clicking.

Of course I would expect heavy abuses of such a tool...
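As a thought experiment, the lookup side of such an extension could be as simple as this. The registry format, scores, and URLs are entirely hypothetical:

```python
# Hypothetical community-maintained registry mapping URLs to a
# crowd-sourced clickbait score in [0, 1].
clickbait_registry = {
    "example.com/headline-you-wont-believe": 0.9,
    "example.org/sober-analysis": 0.1,
}

def warn_before_click(url: str, threshold: float = 0.5) -> bool:
    """Return True if the extension should warn before following the link.
    Unknown URLs default to 0.0 (no warning)."""
    score = clickbait_registry.get(url, 0.0)
    return score >= threshold

print(warn_before_click("example.com/headline-you-wont-believe"))  # -> True
```

The hard parts, of course, are governance and abuse resistance of the registry, not the lookup.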

The thing that really bothers me is that it ought to read "opt in to", but I agree with your sentiment.

The irony here is that this article has at least 20 trackers.

PS: Firefox Quantum is great

Is it really ironic though? This article isn't published by DuckDuckGo - it's published by Vox, which doesn't necessarily agree with their viewpoint.

I didn't say it was hypocritical. Just ironic.

It's neither ironic nor hypocritical that the Vox website reporting on DDG's view about data privacy has 20 trackers.

Nice overview with examples (if somewhat lengthy): https://www.quora.com/What-is-the-difference-between-irony-a...

> Nice overview with examples

And trackers.

What definition of irony would this fit? Honestly asking.

The Alanis Morissette definition.

I sincerely don't think it fits but I'm no expert in philosophy.

Maybe it's just an interesting (sad) fact that an article about data privacy is full of trackers?

I think we all got what you were trying to say and we're just being pedantic, sorry.

Because Vox is using the concept of losing your privacy to ad trackers and companies to generate clicks, whilst providing those companies with valuable user data and impressions.

There are at least two possible arguments behind this question.

One (which, to be clear, is not how I interpret your comment) is the tired gotcha "if you hate society, why do you partake in it?", which is simply not substantial enough to answer.

The other argument I've seen is actually worth some thought, though I'm not ready to give a definitive answer in either this case or as a general principle:

Entities that play a certain game effectively endorse that game by playing it.

The devil lies in the corollary to that observation: whatever prizes are awarded by that game will only go to entities that are willing to play that game.

> There are at least two possible arguments behind this question.

What was the question?

"Question" as used here is closer to "matter" or "problem" than to asking.

I just can't find the right word to describe a point raised about a paradoxical relationship. When I read it, I find an implicit question, though I cannot put to words the exact inquiry. Maybe "Why?"

In a way, all replies on a public forum are implicit questions. (This is where I get even more hand-wavy, sorry)

Corollaries are full of holes: _whoever knows the prizes and doesn't like the game can recreate them._

Worst case for the game-hater is play once and fork off.

Pointing out hypocrisy is now brushed off as a "tired gotcha". What a world we live in.

Pointing out hypocrisy by itself is not a virtuous action; it is a neutral vehicle for a deeper analysis of traditions, opinions, mores and values. This resulting analysis can then be virtuous or not.

In some debates, it is used as a quick way to endorse the status quo without having to analyse it further. A way to "win" a debate that shouldn't be won by either party.

Instead of debating the current practices or the proposed alternative approach on their own merits and circumstances, you first require that one of the participants (the one proposing a different approach) step down from the debate and only come back when the proposed approach is fully baked, working, and has all the optimizations of the current approach in place.

It ignores the differences between the two participants — be it background, power, money, skills, context, future commitments.

The opt-in/opt-out argument is a valid one, and DDG IS my primary search engine and I love it. BUT I must say this pseudo-activism by DDG has become too much now; they're building this pure and altruistic image when they will obviously benefit directly from regulation hurting Google!

What is "too much" and "pseudo" about their activism? Yes, they indeed directly benefit from that. What is wrong with having a privacy-respecting monetization?

Their own analytics/tracking is opt-out ;)

DuckDuckGo staff here. Just want to point out that we do not track users - opt in or opt out. We do serve ads that can be disabled (opt-out) but they’re contextual, based on individual search terms rather than any form of tracking.

Thank you for taking the time to set the record straight. I've been a DDG user for a number of years now, and I'm so glad that it exists. Its demonstrated integrity feels increasingly relevant, in contrast with state and corporate behavior these days.

Ah true, if you don't count logging search queries, storing device information on improving.duckduckgo.com, sending all data to your partners Yahoo/Bing, and using affiliate tracking codes.

Then yes, you are not tracking.

I think it could hurt their image in the long-term. I've already seen others complain here and there about the pushy narrative, they just need to be more subtle about it. Again, I'm using DDG and want them to expand so I want what's best for them!

The problem with people in the tech community is thinking that anything involving self-promotion is being too pushy. DDG needs to do interviews, needs to be out in front of the issue, and needs to be doing activism in order to make changes. Just having a browser and shutting their mouth so as not to sound "pushy" isn't going to help them grow and make the changes that WE need.

> I've already seen others complain here and there about the pushy narrative, they just need to be more subtle about it.

what should they do instead?

Have more of a sense of humour about it. Most complaints I've seen about this seem to be mostly tonal.

Are you suggesting DDG stop marketing the primary benefit of their product in an effort to be "real" with its customers?

What is your suggestion for an alternative message? "We are trying to make money like everyone else, please use our search engine that is not an altruistic public good and is trying to influence regulation that will hurt our competitors".

Doesn't really roll off the tongue.

They could do what they're already doing without mentioning Congress or altering regulations. I use DDG and will continue to, but can't say I care for them (or anyone/anything else I support) slamming down on that "let's get the government to help me" lever.

I think you can take each one on a case by case basis.

If health insurance companies lobbied the government to require that all citizens have health insurance, I'd take a ton of issue. But if health insurance companies lobbied to require doctors to get consent if they want to harvest tissue or organ samples from patients for personal research, I'd support that change.

It's all grey area but as of yet I don't think DDG is at all in the wrong.

No. I just want them to be smarter about how they deliver the message. I would love them to succeed and this is just a feedback from a regular user who follows news about DDG.

"Be smarter" doesn't explain what it is you want them to do.

If they are successful I think it’s entirely possible that people will switch back to using Google and away from DDG if they can be certain their data is kept private.

In addition to not stalking you across the web DDG also does not store data on you even when using their products directly. For me that is still cause for my use of DDG.

“DuckDuckGo is a general internet privacy company at this point, and we help you essentially escape the creepiness and tracking on the internet.”

The Internet has always been and will always be creepy. As a search engine DDG isn't that great.

>The Internet has always been //

What do you mean by that?

Back before chat had pictures there were creepers creeping on the Internet, it's not an artificial sandbox like Disneyland where everything is perfect.

Like Mozilla, made even worse by the fact that later Mozilla proceeds to stab you in the back :P

Does anyone remember the outcry of advertisers after Microsoft made the absolutely useless and toothless Do-Not-Track option on by default instead of opt-in? They were stinking and whining for months and finally got what they wanted. The answer is no: unless some "communists" regulate this spy community of advertisers, they will never agree to lessen data collection (including the switch to opt-in).

Personally, I think even without Internet Explorer ruining Do-Not-Track (which I also think Microsoft torpedoed on purpose by making it the default), the tracking companies wouldn't have respected it. They were looking for an excuse not to comply with DNT after promising Congress they would self-regulate, and IE gave them a fig-leaf to go back on their word.

IIRC the argument was that DNT being enabled by default in a browser made its "advisory" nature to the server on the other end even more toothless, because it could be argued that for the majority of users with it enabled, it would not be a signal of any actual decision on their part.

Yeah, fuck them if they can't take a joke. Aside from all the valid reasons why the spyware is horrid, there are also practical ones: I get virtually no benefit from gigabit Ethernet at the house when surfing the web, due to all the extra HTTP requests.

I don't think it's up to the user to decide how data about them gets used.

Regulation like GDPR only makes it more difficult (impossible) for competition to emerge. It basically secures the future dominance of big players like Google and Facebook.

Making people pay a monthly fee to get their data collected and analyzed seems like the next big business model. We will soon reach a limit on the usefulness of passively collected data and will need to switch to a more active model.

I would prefer to have complete access to data about myself, but it would be unreasonable to expect a monopoly on it.

> I don't think it's up to the user to decide how data about them gets used.

To a certain extent, I'd agree. Specifically in the case of things like analytics data.

The real problem is people not knowing this data is being collected in the first place where you can't make any kind of informed choice on opting in.

Under the GDPR,

"the data is not available publicly without explicit, informed consent, and cannot be used to identify a subject without additional information stored separately. No personal data may be processed unless it is done under a lawful basis specified by the regulation, or unless the data controller or processor has received an unambiguous and individualized affirmation of consent from the data subject. The data subject has the right to revoke this consent at any time.... Data subjects have the right to request a portable copy of the data collected by a processor in a common format... violators of the GDPR may be fined up to €20 million ..."


Key phrases here: 'explicit, informed consent' ... 'lawful basis' ... 'right to revoke this consent' ... 'fined'

I know the regulation. That's not what I'm arguing here.
