
As noted in the article, apparently Google is presenting the top four search engines used in a given country. So presumably this means they're seeing a lot more DuckDuckGo searches in the data they're collecting from Chrome users.

It's also a solid choice for them to hedge against antitrust claims, if they can point to having just added them to their browser, regardless of the fact that Google is the default and they do not present a choice screen like Microsoft had to in the EU.

Antitrust was the first thing I thought of when I saw this. Doing it automatically based on statistics would work toward that end as well.

What do you bet they went looking for an automated way to avoid antitrust and it wasn't a happy coincidence?

An automated way to avoid a political (human) outcome is the definition of insanity.

> As noted in the article, apparently Google is presenting the top four search engines used in a given country

Good. 4 is a good number. It's on the low end of the number range people think of as "enough choice."

4 is the new 3

4 has been the 3 of China for 4000 years.

care to explain?

Uhh, I'll do my best.

At its simplest, in the West we have a thing for threes. Three bits of God, three little pigs, three branches of government (in the USA at least), "things come in threes", three books/movies in a series (a trilogy), stories that have a "beginning/middle/end".

Bottom line, the West tends toward organizing and thinking of things in threes. Some might even be superstitious about threes (perhaps a Pythagorean influence).

In China the number 4 plays a similar role. I don't know much about 'numerology' in China, save to say that recently the number 4 (which apparently sounds like 'death' in Chinese) has been considered bad luck. Here's a better explanation than I could give: https://www.quora.com/In-Chinese-culture-why-is-the-number-4...

4 seasons, the 4 corners of the world, the 4 cardinal directions, the 4 bodily humors (https://en.wikipedia.org/wiki/Humorism), you can find a ton of 4's in the West.

In China the number 8 means wealth or fortune. It's not a coincidence that Brooklyn Chinatown is on 8th Ave. Their 'numerology' is interesting

My company just skipped the number 13 in a software version number.

In Hong Kong, "Chinese" buildings skip floor 4, 14, ..., while "Western" buildings skip floor 13. More recent culturally inclusive buildings skip floors 4, 13, 14, ... The most prestigious floors are 8 and 88.

To add a complication, Chinese tends to use the US convention (with the ground floor being floor 1), while the English convention is the British one (with the ground floor being floor 0).

By happy coincidence, then, the 13th floor is also 十四樓, ie the 14th floor, so you only need to skip one floor, rather than two. That explains why HK skyscrapers are so high.

8 sounds like

3 is important in decision making because it makes it easier to form a consensus. If I disagree with your idea, you have another entity to act as an arbiter.

Did the Hitchhiker's Guide to the Galaxy trilogy in four parts do well in China?

The increasingly inaccurately named Hitchhiker's Guide Trilogy was, finally, in five parts; see Mostly Harmless.

Soon it will be 6 parts. For some reason they (whoever they are) have decided another book is a good idea.

Reminds me of the song "Three is a magic number".

It’s 3 more choices than most in the US get for broadband providers.

It was an odd thing to me but Chrome would not list DuckDuckGo until after you had visited DuckDuckGo.com manually. Once on DDG it became an option. That's been around for a while, as I've had DDG as my Chrome default for a couple years. I presume it's now an option even if you've never visited.

I believe that's because of opensearch and not necessarily a Chrome thing.

Similarly, how many different options are available for similar classes of items found at Costco?

Say, frozen chicken, napkins, instant noodles, paper cups, etc. In some cases there is only 1 option offered, sometimes 2, rarely are there ever more than 4 options offered at Costco for a single type of item. When you trust that you are being offered the best choice or a top choice, well, we know what happens at Costco. People buy pallets in that warehouse.

> Similarly, how many different options are available for similar classes of items found at Costco?

Effectively, Costco shoppers are people who already have chosen, "the cheapest fairly good quality option."

Costco doesn't have a near-monopoly on grocery shopping.

Yes, but (1) Costco has nowhere near a monopoly on grocery, clothing, and electronics and (2) not even Costco customers do 100% of their grocery, clothing, and electronics shopping at Costco.

Costco is the "I'm feeling lucky" button.

"... in the data they're collecting from Chrome users."

What percentage of Chrome users consented to the data collection? (Is consent even required?)

Does the data represent all Chrome users or only those who have consented?

Actually consented, as in understood the implications and freely decided that Google should have this data, probably none. That would take a lot of generosity, especially to pay that team of lawyers and technical experts, so that you have any chance of actually understanding the implications.

Unwillingly consented, that's the vast majority of Chrome Sync users. Unless you enable the end-to-end encryption (for which they require a second passphrase, so probably less than 0.1% actually use that), they will use your data for ad profiling etc. Yes, that is on page 1312 of the Chrome Sync privacy statement. (They're only required to write it in there if they do it, so it is quite certain that they didn't just want the bad PR for nothing.)

Is consent required? Assuming they actually do collect this data from their Chrome Sync data or through similar personally identifiable ways, consent would be required in many jurisdictions, especially the EU.

However, if they cared enough, it would be possible for them to collect this particular data point without personal identification. You could for example create a UUID per installation that's only associated with this one data point. Or you could have a time-based solution where each Chrome instance goes out to "vote" for their default search engine e.g. every 4 weeks. If you then look at the statistics on a weekly basis, you can just take these values times 4 to get roughly correct numbers. It's certainly going to be representative enough; you don't need every browser instance to have their vote in every week's statistic.
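
A rough sketch of the time-based "vote" idea, in Python. Everything here is hypothetical (the function names, the 4-week period, and using a local install counter to pick a reporting slot); it is not how Chrome actually reports anything. The point is only that the server sees one engine name per vote, and that scaling a single week's tally by the period approximates the full population.

```python
from collections import Counter

# Assumption for illustration: every install reports once per 4-week period.
REPORT_PERIOD_WEEKS = 4


def reports_this_week(install_id: int, week: int) -> bool:
    """Decide locally whether this install votes in the given week.

    install_id is a hypothetical local-only value used to pick a slot;
    it is never sent along with the vote.
    """
    return install_id % REPORT_PERIOD_WEEKS == week % REPORT_PERIOD_WEEKS


def weekly_estimate(defaults: list[str], week: int) -> Counter:
    """Tally the anonymous votes cast this week, then scale by the period."""
    votes = Counter(
        engine
        for install_id, engine in enumerate(defaults)
        if reports_this_week(install_id, week)
    )
    return Counter({engine: n * REPORT_PERIOD_WEEKS for engine, n in votes.items()})


# Hypothetical population: 80 installs default to google, 20 to duckduckgo.
population = ["google"] * 80 + ["duckduckgo"] * 20
estimate = weekly_estimate(population, week=0)
```

With a real population the slots would be spread randomly rather than evenly, so the scaled numbers would only be approximately right, which is all the comment above asks for.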

Look at the linked patch.

These metrics are from UMA stats. They are collected from everyone who ticks the box to report stats when installing Chrome.

They only get histograms of counts of visits to search engines, not the entire URL, and not search engines or other sites not in the list of things they track (which is at the bottom of the file).
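
To make the "histogram of counts, not URLs" distinction concrete, here is a minimal Python sketch. It is not the UMA code from the patch; `TRACKED_HOSTS` and `record_visit` are made-up names, and the host list is invented for illustration. It only shows the general shape: a visit increments a bucket if the host is on a fixed list, and nothing else about the URL is kept.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical tracked list; the real one is at the bottom of the file in the patch.
TRACKED_HOSTS = {
    "www.google.com": "google",
    "duckduckgo.com": "duckduckgo",
    "www.bing.com": "bing",
}


def record_visit(histogram: Counter, url: str) -> None:
    """Increment the bucket for a tracked search engine; ignore everything else.

    Only the bucket name is counted. The path and query string are discarded.
    """
    host = urlsplit(url).hostname
    bucket = TRACKED_HOSTS.get(host)
    if bucket is not None:
        histogram[bucket] += 1


histogram = Counter()
record_visit(histogram, "https://duckduckgo.com/?q=something+private")
record_visit(histogram, "https://duckduckgo.com/?q=another+query")
record_visit(histogram, "https://example.com/not-tracked")
```

After these three calls the histogram holds a single bucket with a count of 2, and neither the queries nor the untracked site appear anywhere in it.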

It is ticking that box I was wondering about. How many users tick it?


and other chrome-urls

These can provide useful data for me, but I'm not sure why I would want to send the data to Google.

So that Google can make Chrome render fast on the sites most people use the most often, for example?

Well, the first question is why are the pages rendering slow to begin with?

One way to make the pages I visit load faster is to disable Javascript. Another is to remove (or block) advertising. Another is to put DNS data for these sites into local hosts or zone files.

Those actions are how I prefer to approach the problem.

However as far as I can tell, those are not actions Google wants to take. They have their own preferred approach.

It is possible there are users who are aligned with Google in terms of how they want to approach the problems created by misuse/overuse of Javascript and advertising.

It is also possible there are some users who have no idea why pages are slow to load.

Those groups might want to send usage data to Google.

However I am not in either group. I dislike the web advertising business that Google depends on and therefore must nourish and support.

As such, there is no reason I can think of why I would want to send data to Google.

Also, I have not checked but I wonder if Google is restricted in how they can use the collected diagnostic data. Are they prohibited from using it for the purposes of selling advertising?

Usage data helps us make UI changes. For example, if not a ton of people are using some functionality, we might prioritize modifying or removing it. When we make a change, seeing how it affected usage is an important part of verifying we did the right thing.

So if Chrome's ever made a UI change you disagreed with, then you're in a group that would have benefitted from sending Google usage data.

In terms of the restrictions on usage data, see https://www.google.com/chrome/privacy/whitepaper.html#usages... .

Having grown tired of graphical software back in the 90's I have little interest in graphical user interfaces and interactive use. Chrome has never made a UI change I disagreed with because I do not care about the popular graphical browsers.

I care about command line programs, less-interactive and non-interactive use. Truly, the best interface is no interface.

The whitepaper.html appears to explain how usage data is utilised in ways that help Chrome improve but does not appear to contain any restrictions on use of the data to help further Google's ad sales business, whether directly or indirectly.

It is the business model that I do not wish to support.

Producing software such as Chrome is just something the company is doing in the course of selling advertising and collecting maximal amounts of data from users, whether the data is anonymised or not.

I assume it's at best opt-out so 90-98% would be my guess. Although I'm talking out of my ass.


Take a look at Google's user-directed legal jargon.

They have slimmed it down to only a few pages and now have very simplified statements.

Obviously every statement is now very carefully worded...

> So presumably this means they're seeing a lot more DuckDuckGo searches in the data they're collecting from Chrome users

Reminds me of the sort of advantage Facebook had from its VPN app to identify competitors early to kill/acquire them.

> So presumably this means they're seeing a lot more DuckDuckGo searches in the data they're collecting from Chrome users

There's a lot to unpack in that statement... Is there any recent analysis on the usage stats that chrome is reporting back that someone could point to?

They might obtain these statistics by mining the browser history associated with Google Accounts synced to the cloud.

No clue at all why politician was downvoted for this.

Isn't it well known that Google scoops up web history from the browser or have they stopped doing/never done this? In the latter case any pointers would be appreciated.

Dunno about currently, but about three years ago, if you opened your own site in Chrome, a couple minutes later you'd get a visit from Google bot on the same url.

Google has that data only when Chrome Sync is enabled, and only when you haven't set a Sync custom passphrase (which encrypts it end-to-end).

== most of the time. The majority of people don't care enough to change the defaults. Most of the time, they don't even think about whether there even is something to change. The overwhelming majority thinks roughly like this: "Ooo, Computer just knows all of this about me? Neat!".

Source: I work in education - even in a highly educated area in a developed EU country, young and old alike think like this.

In the default config the sync data is encrypted end-to-end with the user's Google account password. However, there is also an option to share browser history with Google for telemetric reasons, and it's on by default (regardless of sync encryption settings)

Isn't Chrome Sync opt-out at this point? I seem to recall some small controversy about that a while ago, and setting a passphrase seems like something that it's unlikely most people will do.

A default open browser history synced across devices seems like exactly the sort of thing that would show that DDG has increased its market share.

Sync is opt-in.

Look at the linked patch. They use the UMA histogram reporting mechanism, not Chrome Sync.

It need not be Chrome users; Google could (ironically) get relatively good metrics just from how many times DDG has been googled.

Or from the Google router. Or their DNS service. Or ....

Or from how many times people have shebanged their way to Google from DuckDuckGo.

Shebanged or banged?

Banged. Shebang is #!

> Is there any recent analysis on the usage stats that chrome is reporting back that someone could point to?



> Dunno about currently, but about three years ago, if you opened your own site in Chrome, a couple minutes later you'd get a visit from Google bot on the same url.

I don't believe this to ever have been the case.

Not only is Google's indexing infrastructure not that fast, but they deliberately don't do that because some poorly designed sites have passwords or unique keys in the URL that should not be used to retrieve content for the public search index.

This isn't sourced from any Chrome-collected data. See https://news.ycombinator.com/item?id=19393019 .

Regarding "data they're collecting": The list here is based on popularity of search engines in different locales, determined using publicly available data.
