
To Break Google’s Monopoly on Search, Make Its Index Public - JumpCrisscross
https://www.bloomberg.com/news/articles/2019-07-15/to-break-google-s-monopoly-on-search-make-its-index-public
======
nostrademons
Ex-Google-Search engineer here, having also done some projects since leaving
that involve data-mining publicly-available web documents.

This proposal won't do very much. Indexing is the (relatively) easy part of
building a search engine. CommonCrawl _already_ indexes the top 3B+ pages on
the web and makes it freely available on AWS. It costs about $50 to grep over
it, $800 or so to run a moderately complex Hadoop job.

(For comparison, when I was at Google nearly all research & new features were
done on the top 4B pages, and the remaining 150B+ pages were only consulted if
no results in the top 4B turned up. Running a MapReduce over that corpus was
actually a little harder than running a Hadoop job over CommonCrawl, because
there's less documentation available.)

The comments here that PageRank is Google's secret sauce also aren't really
true - Google hasn't used PageRank since 2006. The ones about the search &
clickthrough data being important are closer, but I suspect that if you made
those public you still wouldn't have an effective Google competitor.

The real reason Google's still on top is that consumer habits are hard to
change, and once people have 20 years of practice solving a problem one way,
most of them are not going to switch unless the alternative isn't just better,
it's way, way better. Same reason I still buy Quilted Northern toilet paper
despite knowing that it supports the Koch brothers and their abhorrent
political views, or drink Coca-Cola despite knowing how unhealthy it is.

If you really want to open the search-engine space to competition, you'd have
to break Google up and then _forbid any of the baby-Googles from using the
Google brand or google.com domain name_. (Needless to say, you'd also need to
get rid of Chrome & Toolbar integration.) Same with all the other monopolies
that plague the American business landscape. Once you get to a certain age,
the majority of the business value is in the brand, and so the only way to
keep the monopoly from dominating its industry again is to take away the brand
and distribute the productive capacity to successor companies on relatively
even footing.

~~~
appleshore
If there were viable alternatives, people would shift over time.

If I type in “<name> Pentagon” on Google, the first link is LinkedIn.
DuckDuckGo doesn’t even list it at all. There are countless examples where
DuckDuckGo just can’t find basic information. DDG is just unreliable beyond
its silly name.

~~~
jazoom
I try to use and like DDG, but the results just aren't as good. For example,
it seems to be completely unaware of Docker Hub. Like, pages from that entire
subdomain never show up. I can search "Docker hub" and it doesn't even show
up.

~~~
ben509
For that specifically, use !dhub or !dockerhub to search the site directly.
Really, the magic of DDG is bang queries.

(Search for bang queries with, not surprisingly, "!bang".)
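
For anyone who wants to script this, here is a minimal sketch. It assumes the
usual `q` query parameter on duckduckgo.com; `bang_url` is a hypothetical
helper name for illustration, not something DDG provides.

```python
from urllib.parse import urlencode


def bang_url(bang: str, terms: str) -> str:
    """Build a DuckDuckGo search URL whose query starts with a bang shortcut."""
    # Accept "dhub" or "!dhub"; the query always starts with exactly one "!".
    query = f"!{bang.lstrip('!')} {terms}".strip()
    # urlencode percent-encodes "!" and turns spaces into "+".
    return "https://duckduckgo.com/?" + urlencode({"q": query})


if __name__ == "__main__":
    print(bang_url("dhub", "nginx"))
    print(bang_url("!g", "docker hub"))
```

The bang is just the leading token of the query, so the same helper works for
any of them.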

~~~
jazoom
usually I just do !g and that solves the problem ;-)

But also, thank you. I didn't realise there were so many bangs.

------
noahl
Okay, this is a relatively serious proposal to require Google to allow API
access to its search index, with the premise that it would democratize the
search engine ecosystem. There are some issues with the regulations he
proposes (you have to allow throttling to prevent DDoS attacks, and you can't
let anyone with API access add content to prevent garbage results), but it's
roughly feasible.

The main problem is, I think the author is wrong about what Google's "crown
jewel" is. Yes, Google has a huge index, but most queries aren't in the long
tail. Indexing the top billion pages or so won't take as long as people think.

The things that Google has that are truly unique are 1) a record of searches
and user clicks for the past 20 years and 2) 20 years of experience fighting
SEO spam. 1 is especially hard to beat, because that's presumably the data
Google uses to optimize the parameters of its search algorithm. 2 seems
doable, but would take a giant up-front investment for a new search engine to
achieve. Bing had the money and persistence to make that investment, but how
many others will?

~~~
tryptophan
>2) 20 years of experience fighting SEO spam.

Tangential - but does anyone else feel that google results are useless a lot
of the time? If you search for something, you will get 100% SEO optimized
shitty ad-ridden blog/commercial pages giving surface level info about what
you searched for. I find for programming/IT topics it's pretty good, but for
other topics it is horrible. Unless you are very specific with your searches,
"good" resources don't really percolate to the top. There isn't nearly enough
filtering of "trash".

~~~
packet_nerd
Someone linked to an interesting site talking about how to make homemade hot
sauce here on HN. I partly read it and thought it was a great clean site and
something I wanted to try. Later going back to find it again I literally spent
hours searching, even though I'm pretty sure I remembered some of the exact
phrases. For some reason recipe related search results are really really
terrible on both Google and Bing.

~~~
thinkalone
Could you not find it again via the HN site search?
[https://hn.algolia.com/?query=%22hot%20sauce%22%20recipe&sor...](https://hn.algolia.com/?query=%22hot%20sauce%22%20recipe&sort=byDate&prefix&page=0&dateRange=all&type=comment)

~~~
packet_nerd
This is awesome and helped me find it again! Thank you!

------
nova22033
Caveat: The author is not a technologist

 _Robert Epstein (born June 19, 1953) is an American psychologist, professor,
author, and journalist. He earned his Ph.D. in psychology at Harvard
University in 1981, was editor in chief of Psychology Today,_

He has also made some questionable claims about google manipulating search
results to favor Hillary Clinton.

[https://en.wikipedia.org/wiki/Robert_Epstein#cite_note-15](https://en.wikipedia.org/wiki/Robert_Epstein#cite_note-15)

His research is based entirely on his own experience

 _“It is somewhat difficult to get the Google search bar to suggest negative
searches related to Mrs. Clinton or to make any Clinton-related suggestions
when one types a negative search term,” writes Dr. Robert Epstein, Senior
Research Psychologist at the American Institute for Behavioral Research and
Technology._

~~~
tomweingarten
The comments he made are not just questionable, they're outright wrong (and a
great example of the problem with cherry-picking data):

[https://www.vox.com/2016/6/10/11903028/hillary-clinton-googl...](https://www.vox.com/2016/6/10/11903028/hillary-clinton-google-debunked)

[https://www.politifact.com/punditfact/statements/2016/jun/23...](https://www.politifact.com/punditfact/statements/2016/jun/23/andrew-napolitano/did-google-adjust-its-autocomplete-algorithm-hide-/)

(Disclosure: I work at Google, but this opinion is my own)

~~~
ar-jan
Google's claim that the algorithm is generic is demonstrably false. Type in
"hillary clinton e" and there is no suggestion for "email", type "donald trump
e" and email is the first suggestion. Given the news content that we know is
out there, that can only be the result of adjusting the results for clinton
specifically (if anything, we would not expect "email" to be autocompleted for
trump). This is not research that tells us what exactly Google is doing, but
you cannot deny the example.

~~~
root_axis
This is not "research" period. Using one arbitrary search comparison to draw
conclusions about the nature of a system that processes billions of queries a
day is pretty weak. Additionally, I don't get the same results you do.
"hillary clinton e" does not bring up emails, nor does "donald trump e" bring
up emails (the first results I see are election, education, england visit, ex
wife).

I'm not ruling out the possibility that google actually is manipulating search
results, but this is not proof of that.

~~~
bodono
Try "hillary clinton emai". From a fresh chrome session in NYC I get nothing,
not a _single_ autocomplete result. On the other hand "donald trump emai"
gets:

* donald trump email
* donald trump email address
* donald trump email list
* donald trump email newsletter
* donald trump email list signup

And just to drive the point home I tried "root_axis emai" and got "root_axis
email". Try anyone else and you get similar results, 'barack obama emai',
'george bush emai', etc etc. So yes, this is proof that the results are
scrubbed for Clinton email.

~~~
scott_s
I got curious and tried the names of a bunch of public figures. Some of
"<first and last name> e" yielded "email" as the suggestion. But these did
not: elizabeth holmes, tom jones, tom cruise, brad pitt, gwyneth paltrow,
roger federer, will smith, jimmy carter.

Since Hillary Clinton is not unique, then it's _not_ proof that her results
are treated differently.

~~~
buzzerbetrayed
Why would you expect “brad Pitt email” to be something that auto completes?
You would, on the other hand expect “Hillary Clinton email” to auto complete
because there was a huge controversy about it.

I’m not saying google is manipulating auto complete intentionally (though they
might be), I’m just saying your counter examples are irrelevant.

It would be like “Donald trump Russia” NOT auto completing then someone saying
“but neither does Taylor swift Russia, so we’re good.”

~~~
scott_s
The poster claimed Hillary Clinton was unique, meaning the _only_ person that
applied to. For me, she was not. Since she's not unique, then her being unique
can't be used as evidence.

Claiming that she's _unusual_ , since you expect it to work for her based on
stories written about her is a different claim.

~~~
ar-jan
You're misrepresenting my claim.

1\. I said "only for Clinton" in the context of Trump vs Clinton. Then I
compared to other U.S. politicians, where the example still holds. The
intended meaning is perfectly clear.

2\. Obviously the fact that Clinton was the topic of a scandal involving email
is the assumed context here. That's why I said "if anything, we would not
expect "email" to be autocompleted for trump" (implied: but for Clinton, email
_is_ a more relevant search term based on published news, etc.).

------
zubspace
From the article:

"But what about those nasty filter bubbles that trap people in narrow worlds
of information? Making Google’s index public doesn’t solve that problem, but
it shrinks it to nonthreatening proportions. At the moment, it’s entirely up
to Google to determine which bubble you’re in, which search suggestions you
receive, and which search results appear at the top of the list; that’s the
stuff of worldwide mind control. But with thousands of search platforms vying
for your attention, the power is back in your hands. You pick your platform or
platforms and shift to others when they draw your attention, as they will all
be trying to do continuously."

But this is a huge problem. I'd rather have 10 independent search providers
instead of 10 companies proxying the results of google. It's worse if I don't
even know which index the results come from. I guess many people don't
know that Startpage shows you Google results.

I don't want Google results! I want different web crawlers ordering the
results according to my taste without tracking each and every page impression
of me. Give me that and I'll switch in a heartbeat.

~~~
foota
> I don't want Google results! I want different web crawlers ordering the
> results according to my taste

Okay, great, we'll see how you browse and use that to determine...

> without tracking each and every page impression of me. Give me that and I'll
> switch in a heartbeat.

Uh? How exactly do you want them to do that?

~~~
zubspace
The search engine basically owns an index of all websites. I assume, that
crawling websites and storing an index is a solved problem. Not trivial on
that scale and expensive. But conceptually solved.

What google does really well are two things:

1) It ranks those pages according to the well-known PageRank algorithm,
although the details are kept secret. They also learned how to punish sites,
which try to mislead the algorithm (Duplicate content, excessive SEO, link
farms, etc)

2) What they also do well is returning search results in a very short time,
based on a relevancy of your search term combined with the precalculated
ranking. It also looks at your history in this step and the data it tracks
about you.
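
As a rough illustration of the ranking idea in point 1, here is a sketch of
the textbook PageRank power iteration as originally published. It is not
Google's production ranking, whose details are secret. The `pagerank`
function, the 0.85 damping factor and the toy link graph are assumptions
chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of page -> score; scores sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly over its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page (no outbound links): spread its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    scores = pagerank(toy_web)
    # "c" and "a" end up highest; "d", which nobody links to, ends up lowest.
    for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

The anti-spam part (duplicate content, link farms) is exactly what this
sketch leaves out.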

My preferred search engine would open up both bullet points, giving you the
option how pages are ranked and how relevancy is determined. For example: Rank
personal blog posts higher or rank news higher or only use an inbound link
count. Or you could get fancy and crowd source ranking (similar to hacker
news). You could also configure search term relevancy. For example: did you
ever search google for something specific and it automatically adjusted your
term to something similar, more popular?

You could open up all those decisions and give the user options. Google on the
other hand keeps everything secret and uses your personal data in unknown
ways. I would love to have a search engine which is configurable. Yeah, not an
easy thing to do. But it would be awesome.
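
To make the "configurable" idea concrete, here is a minimal sketch of
user-adjustable ranking. Everything in it is hypothetical: the `Page` fields,
the preference keys and the scoring formula are assumptions for illustration,
not how any existing search engine works.

```python
import math
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text_match: float    # assumed 0..1 relevance of the page text to the query
    inbound_links: int   # assumed count of links pointing at the page
    kind: str            # assumed label such as "blog", "news" or "other"


def score(page, prefs):
    """Combine signals using weights the user picked."""
    total = prefs.get("text_match", 1.0) * page.text_match
    # log1p keeps huge link counts from drowning out everything else.
    total += prefs.get("inbound_links", 0.0) * math.log1p(page.inbound_links)
    # Flat bonus per content type, e.g. {"blog": 0.5} to prefer personal blogs.
    total += prefs.get("kind_boost", {}).get(page.kind, 0.0)
    return total


def rank(pages, prefs):
    """Return pages ordered best-first under the given preferences."""
    return sorted(pages, key=lambda p: score(p, prefs), reverse=True)


if __name__ == "__main__":
    results = [
        Page("https://blog.example/hot-sauce", 0.8, 5, "blog"),
        Page("https://news.example/hot-sauce", 0.7, 5000, "news"),
    ]
    prefer_blogs = {"kind_boost": {"blog": 0.5}}
    links_only = {"text_match": 0.0, "inbound_links": 1.0}
    print([p.url for p in rank(results, prefer_blogs)])
    print([p.url for p in rank(results, links_only)])
```

Same two pages, two different orderings, depending only on the switches the
user flipped.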

~~~
foota
While you may be able to open up some switches, I don't think you can build a
reasonable user experience on something like that, since anything that allows
you significant control over how the results are chosen (beyond the simple
control you've mentioned, like prefer certain types of results) will be both
too complex for anything other than like distributed common sets of options
(like a plugin that comes with presets) and/or have so many knobs as to make
tuning for good results infeasible.

I would argue Google already supports some of what you want, like allowing you
to search just for news, although additional filters based on something like
the niche you're looking for or the type of content don't hurt.

------
jacknews
Maybe this article should be made public too.

~~~
unreal37
The irony of this being behind a paywall. An expensive paywall at that.

------
eaenki
The effect of this on Alphabet's revenue would be nil.

The majority of Google's Revenue comes from Google, Youtube, Gmail and Play.
They make so much $ because they have the biggest network effect of
advertisers-eyeballs in the world along with Facebook. That. Is. Unbreakable.
Even more than a social network's network effect, because the friction to
switch budgets and people in a company is higher than a guy telling their best
friends to download an app.

And then, YT is a network effect. And then, Play/Android is also a network
effect. And then there's the branding. But presumably every big company has
the latter. Still, what a brand. Everyone knows what Google or Android is.
Every. Single. Human.

Finally, because they make all this money, they can pay to be the default on
the other half of the devices, Apple's devices. Last
time I checked, $5B a year.

Hence, this article is so bad.

I don't even care about Google, just saying.

edit: did I mention Chrome? They've got chrome too, with the googleverse as
default.

~~~
boringg
Also, given the number of people who would game the system afterwards, all the
search results would be utterly useless.

~~~
eaenki
Ye and anyway you could return x10 worse results than Google's current results
and still become the new dominant search engine if you've got infinite $B a
year to outbid Google to be the default on browsers and operating systems and
a big salesforce to onboard the advertisers. Man this is the exact playbook
they used to become the search engine in the first place. They literally were
Yahoo's search bar at some point two decades ago.

~~~
toast0
Microsoft has infinite money, pays users with rewards directly, and has barely
gotten any marketshare with Bing.

In order to compete, you would need good results and good performance for your
organic results and your ad program in addition to a huge budget for traffic
acquisition and patience and ability to execute on strategy consistently over
multiple years.

Also, keep in mind that people will judge based on perceived quality, not
objective quality. Simply being shown as a Google result increases the
perceived quality for most results -- in order to be seen as equal, the
competitor will need to have consistently better results.

~~~
eaenki
As you can clearly see for yourself, Microsoft is not as interested in
acquiring market share for Bing. Not as much as to outbid Google's contracts
with Firefox and Apple to be the default. Bing is clearly a second thought for
Microsoft even tho it's big. That's probably because they wouldn't get the
same bang for their buck anyway. They have far fewer advertisers on board.
Anyway, according to Microsoft, they own 33% of the US market. That is quite
the market share. Especially once you consider it has mostly been acquired
thru sneaky toolbar installations and IE.

~~~
gscott
Microsoft is solving their biggest problem by switching IE to Chromium. Right
now everyone uses IE to download Firefox or Chrome. But now that IE will be
Chrome they are hoping that enough people will not switch their browser and
will start using the defaults (bing, outlook, etc). It's genius. In the past
IE was the dominant browser, so it is not impossible that they can claw back
some of that market share.

------
taf2
So although everyone likes to believe Google is a monopoly, it’s far from it.
You have choices: Bing, Baidu, Yandex, DuckDuckGo... There is also nothing
about Google's search position that prevents you from building a competitor.
What we do have is Peter Thiel backing an administration that’s anti-Google,
and Russia and China, which are anti-Google. Why? It’s a source of truth that
challenges their lies. We also have an emergent, cult-like backlash
against personalized ads. Combine all of these factors and you get a lot
of pressure and misinformation telling you Google is evil. Additionally, karma:
Google led the charge against Microsoft with Google's do no evil position
against Microsoft, which did have an OEM monopoly preventing others from
competing. Anyways, that is how I see it... So is Google near to being a
monopoly? No, I think they would need to be doing a lot worse things, and there
is room to compete and people should.

~~~
celticmusic
ah yes, the only reason you could ever have for being anti google, et al, is
that you want to lie...

You started with a decent premise, that google isn't an actual monopoly
because it has "competitors" (and yes, those quotes are intentional). But then
you go off the deep end and completely lost me to the point that I didn't even
finish the entirety of your post.

There are many many reasons that someone could be against google's absolute
market dominance.

~~~
taf2
Where do you see their absolute market dominance... I checked here for example
[http://gs.statcounter.com/](http://gs.statcounter.com/) and for browsers
globally it's 63% that's not a monopoly... I checked here
[https://www.statista.com/statistics/216573/worldwide-market-...](https://www.statista.com/statistics/216573/worldwide-market-share-of-search-engines/)
for search engine and 90% is pretty close but it's not a
monopoly... contrast that with MS in the late 90s according to
[https://www.forbes.com/sites/timworstall/2012/12/13/microsof...](https://www.forbes.com/sites/timworstall/2012/12/13/microsofts-market-share-drops-from-97-to-20-in-just-over-a-decade/#5361b19b51cf)
and you
have 97% that's a monopoly if it's abused which it was proven they did..
[https://en.wikipedia.org/wiki/United_States_v._Microsoft_Cor...](https://en.wikipedia.org/wiki/United_States_v._Microsoft_Corp).
again i find it interesting it's hard to talk about google being a monopoly
and not mentioning MS...

~~~
celticmusic
Google doesn't have market dominance in the same way that Verizon doesn't have
market dominance in a specific region because you can also get dialup.

Technically correct, but not truthful.

------
ga-vu
This is dumb. You mean if I work two decades to develop tech that nobody else
can copy, I have to open-source it because my competitors are dumbasses?

This is not how it's supposed to work.

~~~
dpacmittal
I think the bigger issue is how it affects people and the power this tech
gives to control various things about socio-political landscape which a single
corporation shouldn't be trusted with.

~~~
root_axis
If we're going to start nationalizing the resources of private companies in
the name of public good, I think internet tech giants that offer up their
services for free should be last on the list. On the top of the list should be
corporations that dominate resources like life-saving drugs and residential
property.

------
iainmerrick
I don’t see any mention in this article of what seems like the most obvious
way to split up Google, separating their search and ad businesses. _(Edit to
add:_ although maybe the effect would end up being similar, if API users serve
their own ads but without access to Google’s ad infrastructure.)

That obviously wouldn’t be a simple job, of course, and maybe there are some
interesting reasons why it wouldn’t work well.

~~~
ailideex
> I don’t see any mention in this article of what seems like the most obvious
> way to split up Google, separating their search and ad businesses.

What would be the revenue model for their search then?

~~~
samscully
The ad business would pay the search business for ad placement. The search
business would also be free to auction ad space to other providers.

~~~
epicide
> The search business would also be free to auction ad space to other
> providers.

Doesn't that make them an ad business?

~~~
ailideex
I agree, to me it seems like that really does not change anything except it
mandates one more middle man.

~~~
GordonS
Yep, it just makes what likely already operates as an "internal" customer an
"external" one. Not really sure how it helps with competition?

~~~
iainmerrick
That’s exactly the point, to make that internal customer external, and make
any special access or APIs they have available to competitors on an equal
footing.

------
astonex
A list of websites and their content is really not useful at all. Anyone can
get this themselves with some really simple programming.

The actual hard part is when it comes to ranking and sorting the data in any
useful way, and doing it within like 100ms. Plus various other issues like
spam protection etc. This is where Google excels (at least in my opinion).
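
To show how little the "list of websites and their content" part takes at toy
scale, here is a minimal inverted-index sketch. The function names and sample
documents are made up for illustration, and it deliberately has no ranking,
no spam handling and no latency budget, which is the hard part described
above.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in docs.items():
        for token in tokenize(text):
            index[token].add(url)
    return index


def search(index, query):
    """Return URLs containing every query token (unranked boolean AND)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


if __name__ == "__main__":
    docs = {
        "https://a.example/": "Homemade hot sauce recipe",
        "https://b.example/": "Hot takes on search engines",
    }
    index = build_index(docs)
    print(search(index, "hot sauce"))
```

It returns an unordered set: deciding which of a few million matches goes
first is where the real work starts.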

~~~
colinmhayes
I wouldn't say that anyone can create an index with really simple programming.
There are quite a few technical obstacles that "really simple programming"
probably couldn't solve. That being said, I agree that any legitimate company
would be able to create an index easily enough. The hard part is ranking and
spam detection.

------
seisvelas
This seems a bit silly - it's not like Google's search results are that much
better than Bing or DuckDuckGo (or better at all most of the time). Google has
Google Colab, Chrome, and Android, and the whole Docs ecosystem, and they've
integrated those things together pretty well while still being loosely coupled
enough to switch out any part of the ecosystem with a competitor's product.

Is there a reason to break up Google other than that they are doing well?
Other search engines seem to have no problem establishing their own niche and
doing well.

~~~
ma2rten
Colab? I don't think that google's position is held in power by that.

~~~
seisvelas
I wanted to give a small example to represent the zillion other little small
examples that I didn't list

------
gigatexal
I'm of two minds here: Google's whole reason for ascending to where they are
is the PageRank algorithm, which Google was created to monetize. I see this
in similar veins to Apple and iOS: would we support calls
for Apple to be forced to allow iOS to be installed on non-Apple hardware? If
not, then why would we insist on Google giving up its reason for being, the
reason a lot of us use it to find relevant information?

Then again, the concentration of power in a handful of operators likely
threatens the open internet.

~~~
epicide
Not a lawyer, but this seems to be conflating patents with copyrights?

i.e. iOS (especially new versions) would fall under copyright protection [0].

PageRank is a patented technique for search. The patent apparently ran out
about 6 weeks ago [1].

While both copyright and patents are intended to protect creators for a
certain period of time, copyright protects a specific work and patents protect
an idea. Patents should generally expire much more quickly since they cover a
much broader topic.

I also realize both systems are completely crippled at the moment, but I'm
trying to stick to what they're at least _intended_ to be.

[0]: I'm sure they have tons of patents centered around iOS but this is about
protecting the OS itself.

[1]: [https://pulse2.com/googles-pagerank-patent-expired/](https://pulse2.com/googles-pagerank-patent-expired/)

~~~
gigatexal
Makes sense actually. Yeah I think I am conflating them.

The patent is expired but could you argue the index Google is accumulating and
aggregating could be copyrighted work in the same way that iOS is? It has
taken considerable effort, expense, and expertise to cultivate a useful index.

Full disclosure I’ve no idea what I’m talking about just thinking I can’t see
what basis people have to force Google to open up their key money maker just
because. Other people/companies can create their own indexes with the now
expired patent, no?

------
peteretep
I don’t want to break Google’s monopoly on search. Google’s search is
fantastic. It’s their advertising business knowing too much about me I care
about.

~~~
JumpCrisscross
> _Google’s search is fantastic_

The author's proposal, making Google's index publicly accessible, would leave
Google search intact. Google's algorithm would presumably remain proprietary.

For people who like Google today, nothing would change. For people for whom
Google falls short, there would be new options. Looks like a clear win-win for
consumers.

~~~
sametmax
Like you said, the value of Google is the algo. The index is worth little.

The limits of Bing, Qwant, DDG, etc. are not the number of the indexed pages.

It's that, give them the same pages, and the same search terms, and they don't
return results as good.

~~~
JumpCrisscross
> _the value of Google is the algo. The index is worth little._

But it's worth something. Giving DuckDuckGo direct access to Google's index,
including the ability to train models on said index, would improve the
competitive landscape.

~~~
vntok
Or, you know, they could innovate by themselves instead of relying on Google
to do all the heavy lifting?

~~~
sct202
Yeah it seems like this would spawn a lot of weak competitors who only exist
because of the protections and would die off the second they're cut off from
the API.

------
sterlind
I lean pretty far left, but this proposal makes me really uneasy. It seems
very coercive and ham-fisted. I think instead, Bing should consider opening
_its_ index up. It'd be a massive PR coup, they wouldn't lose anything of much
value, and anyone who put the index to better use than them would help them by
breaking Google's stranglehold over search.

------
dalbasal
So... this article is a good example of how ??!? it gets once you move from
"We gotta do something about these tech monopolies" into the "what should we
do?" phase.

How exactly _do_ you break up a Google or a FB so that (only one possible
reason, but the one cited here) they don't control too much media/mind share?^

Laws usually want to be general, and my suggestion doesn't necessarily lend to
that, but I'll suggest it anyway.

Facebook doesn't need to be broken up into several companies. _It can just be
shut down._

I don't mean that it should (justice-wise) be shut down. I just mean that we
won't lack for social media. We will have social media alternatives the day
after FB shuts down. There's a chance we'll get something more open instead.
There's a good chance we'll get several small replacements instead. There is
0-chance that we'll lack for ways to share posts and post pictures. This isn't
Bell, where we need to keep the phones working. The phones will work fine with
or without Facebook.

There is no need for a company to generate $70bn in revenue, in order for us
to have social media. That's a key difference from all antitrust cases of the
past.

YouTube is another sort of example. If it shuts down, alternatives will pop up
immediately... maybe open ones.

There are justice questions (is it fair to shareholders/employees/zuck?) There
are legal validity questions (why FB and not apple?). But, for the practical
questions... the problem is an easy one.

^fwiw, I also think this is the most worrying part. These companies have a
tremendous control over how and what people think. They make Murdoch media
look quaint.

~~~
empath75
> How exactly _do_ you break up a Google or a FB so that (only one possible
> reason, but the one cited here) they don't control too much media/mind
> share?^

The problem isn't so much that google has a monopoly on search as that they
abuse the monopoly to force their way into new businesses. Force google to
spin search off as a separate entity from the rest of the company.

~~~
dalbasal
Isn't it? I'm not even sure that "monopoly" is the right frame. Google's
"price" in their high market share areas (eg search) is free anyway.

To me, the problem is power, not necessarily market/pricing power or even
exclusion power. The problem is that the "public square" is essentially a
handful of private monopolies. What Google considers "inappropriate" is banned
or disadvantaged on YouTube. What FB (or Twitter) considers distasteful can
become absent from the public or political debate.

It's no joke, at this point. Things "woke up" when it seemed that shady stuff
happened during trump's election. But.. several revolutions had already
started on social media, coups, counter-coups, civil wars. 10s of thousands of
public officials around the world have been elected on FB.

That is a _lot_ of power in very few hands, under a veil of secrecy, often
with bad incentives. It was cheap printed pamphlets and union halls that made
fascists-vs-communist street fights a thing in the 30s... mediums. These days,
an algorithm notices that telling (proverbial) fascists about communists is
good for engagement... And the algorithm optimizes. I'm not accusing them of
doing this deliberately, but the feed algorithms optimize this way
for a reason.. incentive.

------
ng12
I don't understand why we are talking about this. Google is far from the first
"monopoly" that I would like to see broken up.

My guess is that Google's not lobbying effectively.

~~~
Nasrudith
Because their even more monopolistic competitors are pissed, journalists are
pissed because the world changed on them and they have to compete, combined
with radicals with a persecution complex and a "no the children are wrong"
attitude towards their opinions being unpopular. The tech break up push is one
big hypocritical circle jerk from every party.

~~~
Loughla
This just sounds like a pissed-off rant. Can you draw out your statements a
little and explain them?

------
ngngngng
Fun timing to read this, this last weekend I was playing around with making my
own search engine to understand better how ElasticSearch and Lucene work.

It occurred to me that the two most powerful things Google has to work with
are records of clicks, and the time users spent on the webpages Google
returned. I've argued against Google monopoly before because I can throw
together a web crawler and search engine in a weekend, so it's not like it's a
hard market to enter.

> According to W3Techs, Google Analytics is being used by 52.9 percent of all
> websites on the internet

This is the real problem though. When a search engine sees a new query, it
uses everything it's got to assert which pages the user wants, but with Google
Analytics, they can test their assertions constantly to see if a user actually
wanted that web page. Then your future queries could be compared against
previous queries that were validated by a user spending several minutes of
active time on the returned page.

I'm sure Google's algorithm is great and all, but I really think this is what
sets them apart.

~~~
tantalor
> but with Google Analytics...

No: "Search does not use Google Analytics for ranking"
[https://www.youtube.com/watch?v=LLmO1GE4GvI](https://www.youtube.com/watch?v=LLmO1GE4GvI)

~~~
ngngngng
I think you misunderstand what I'm saying. I'm not saying Google analytics
will get you ranked higher or lower. Just that Google can use the data from
analytics to tell if their results were what the user wanted.

~~~
tantalor
> tell if their results were what the user wanted

That's ranking.

~~~
marcosdumay
That's validating the rank. Ranking is done in another step.

If they use the data for ranking, they'll lose the huge amount of value it has
for validating.

------
jedberg
Good idea! We should make Bloomberg’s stock data public too!

------
sct202
I wonder if it's even in America's interests to create a weaker
Google/Facebook/Amazon/Microsoft. These companies are dominating globally
(excluding China) and bring back so much money, jobs, and influence to
America. Weakening them might allow real foreign competition to flourish.

------
ksahin
"DuckDuckGo, which aggregates information obtained from 400 other non-Google
sources, including its own modest crawler.)"

I looked into it, and it seems DDG is using the Bing and Yahoo search APIs and
lots of other sources. I looked into the pricing of Yahoo's search API / Bing
search API; it ranges from $0.80 / 1000 queries to several dollars per thousand
queries.

It seems too expensive to be economically viable with ads. What am I missing?

~~~
api
They've negotiated a lower rate?

~~~
londons_explore
Running a search engine is pretty expensive. $0.80 per 1000 queries doesn't
seem orders of magnitude off the actual running costs.

For comparison, that works out to ~$15 per year for the average user's search
volume.

If you can't make $15 per user per year on search, you have bigger problems...
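To make that arithmetic explicit, here is a rough back-of-the-envelope sketch. It only uses the two figures quoted above ($0.80 per 1000 queries and ~$15 per user per year); the function names are made up for illustration, and the implied per-day query volume is derived from those figures, not measured.

```python
# Figures quoted in the thread; everything else is derived from them.
PRICE_PER_1000_QUERIES = 0.80   # USD, low end of the quoted API pricing
COST_PER_USER_PER_YEAR = 15.0   # USD, the rough per-user estimate above


def implied_queries_per_year(cost_per_year, price_per_1000):
    """How many queries a yearly budget buys at a given per-1000 price."""
    return cost_per_year / price_per_1000 * 1000


def annual_cost(queries_per_day, price_per_1000, days=365):
    """Yearly API cost for a user issuing `queries_per_day` searches."""
    return queries_per_day * days * price_per_1000 / 1000


queries_per_year = implied_queries_per_year(
    COST_PER_USER_PER_YEAR, PRICE_PER_1000_QUERIES
)
queries_per_day = queries_per_year / 365

print(f"implied volume: {queries_per_year:.0f} queries/year "
      f"(about {queries_per_day:.0f} per day)")
print(f"round trip: ${annual_cost(queries_per_day, PRICE_PER_1000_QUERIES):.2f}/year")
```

So the $15 figure corresponds to someone searching roughly fifty times a day; a lighter user would cost proportionally less.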

------
pointillistic
If DuckDuckGo can change their stupid name they will gain market share
overnight.

~~~
colordrops
Many just call them duck or ddg now. They bought duck.com

~~~
gourou
It might have been for free; it's unclear whether they bought it

[https://www.theverge.com/2018/12/12/18137369/duckduckgo-
duck...](https://www.theverge.com/2018/12/12/18137369/duckduckgo-duck-com-
google-acquisition)

~~~
judge2020
They added an intermediate page around July 2018
[https://web.archive.org/web/20180721002315/https://twitter.c...](https://web.archive.org/web/20180721002315/https://twitter.com/robshilkin/status/1020458019174223872)
and probably saw that 99% of people clicked on the DDG link.

------
nothis
I don't mind google search. Let them have search, they're obviously the best
at it. What I _have_ a problem with is literally every other service they
provide. Google shouldn't own the most popular browser, the most popular video
streaming platform, the most popular map service, the most popular operating
system and (one of the?) most popular email services. Why is a single company
allowed to have control over essentially half the internet? I don't think
there's a precedent to that and it's about time it's tackled.

~~~
drstewart
Why is it people care so much about Google being in many different markets but
I never see anyone calling for Samsung or Berkshire Hathaway to be broken up?

Just goes to show how easy populism is to push. Google and Facebook are in the
news a lot and used by the average consumer, making them the perfect
scapegoats.

Meanwhile, the average consumer just sees Samsung as the TV people (actual
industries: apparel, automotive, chemicals, consumer electronics, electronic
components, medical equipment, semiconductors, solid state drives, DRAM,
ships, telecommunications equipment, home appliances) or Berkshire Hathaway as
that old rich guy that drives a cheap car (actual industries: property &
casualty insurance, Utilities, Restaurants, Food processing, Aerospace, Toys,
Media, Automotive, Sporting goods, Consumer products, Internet, Real estate),
so not a peep is made about them or every other horizontally distributed
company in the world.

~~~
izacus
> Meanwhile, the average consumer just sees Samsung as the TV people (actual
> industries: apparel, automotive, chemicals, consumer electronics, electronic
> components, medical equipment, semiconductors, solid state drives, DRAM,
> ships, telecommunications equipment, home appliances)

Samsung was also building military jets and artillery pieces. Imagine the
media response if Facebook went into weapons production :)

------
anonuser123456
I think a distinction needs to be made between monopoly and 'best product'.
Search is winner take all in many respects. As a user, I don't particularly
care about the second best (because I don't pay).

The best way to fix the situation is to get users to pay. And this could be
done. Users can pay negative dollars (e.g. get paid). Why would a search
engine want to pay its users? Why would a retailer want to pay me to look at
their ad? Why wouldn't they?

Example: My wife and I buy a not insignificant amount of stuff from
Lululemon. So I have a deal for Lululemon. Pay DuckDuck half what you pay
Google to show me your stuff. And DuckDuck, pay me half what they pay you.
Right now DuckDuck gets zero. Google probably makes 20+ bucks a month off me.
In this math, DuckDuck nets $5!
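A minimal sketch of that split, assuming (as a simplification) that the whole $20 a month Google makes is paid by the one advertiser. The function and parameter names are hypothetical; the two "half" ratios are the ones from the example.

```python
def split_ad_spend(paid_to_google, advertiser_ratio=0.5, user_ratio=0.5):
    """Split a monthly ad payment under the proposed deal.

    paid_to_google   -- what the advertiser currently pays Google (assumed figure)
    advertiser_ratio -- fraction of that amount offered to the alternative engine
    user_ratio       -- fraction of the engine's take passed on to the user
    """
    paid_to_engine = paid_to_google * advertiser_ratio
    paid_to_user = paid_to_engine * user_ratio
    return {
        "advertiser_saves": paid_to_google - paid_to_engine,
        "engine_nets": paid_to_engine - paid_to_user,
        "user_gets": paid_to_user,
    }


deal = split_ad_spend(20.0)
for party, dollars in deal.items():
    print(f"{party}: ${dollars:.2f} per month")
```

With $20 as the input, the advertiser saves $10, the engine nets $5, and the user gets $5, which matches the numbers in the example.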

There is a natural equilibrium here. Advertisers quickly start paying zero to
people that don't buy stuff, so users can't farm ads. But relevant customers
get discounts.

So now I actually care about making the trade off between second best but
effectively cheaper or best but more expensive. E.g. a functioning market.

To do this though, you have to start tracking users in some way. But Google
already does this, so the situation doesn't change. Some form of surveillance
is required to improve relevance. The difference here is, I can choose who
surveils me and to what extent rather than being stuck with the winner take
all.

------
lucaspottersky
I disagree. To break the Monopoly a new & innovative way to search needs to
rise, not just a clone or "Google wanna be" like Bing/DuckDuckGo etc.

So, somebody gotta do what Google did 20 years ago.

~~~
jdironman
I don't necessarily think re-inventing the wheel every so often is the way to
go on some things. When it comes to search, you want to type in your query,
and search. It doesn't really get much simpler than having a search box, and I
am having a hard time understanding how that can be innovated upon. I am
however not saying it CAN'T be done, I just don't see why or how it would be.

------
richliss
Unfortunately, people like him and consultants who will be interested in doing
business with Google will be the ones that politicians listen to.

I'd personally do this to solve the problem:

1\. Separate Google from Alphabet and separate Google Search from everything
else they do.

2\. Nationalise the Google brand and search engine along with Bing. Everyone
else can stay as-is for now.

3\. Allow applications for companies to become search result endpoints.
Ownership would need to be completely transparent. Prevent incumbents from
being involved - Amazon, FB, Google, Apple, Microsoft, Samsung etc.

4\. All Google and Bing result listings would then be evenly distributed
across the search result endpoint providers.

5\. The propositions for each of these companies can now be customised knowing
that they will get a maximum number of likely searches per year.

6\. Prevent any mergers or foreign ownership of these new companies.

Sure the search would develop at a slower pace but its power would devolve
into utility again rather than as a way to control what people see.

------
pliny
None of the rules he sets out for access to index are true for ICs within
google trying to run over even the top/smallest tier of pages in the index.

------
linuxftw
This idea is a pipe dream. It would be nationalizing property, it's never
going to happen in the US.

If you want to enforce anti-trust actions against google's app store policies,
I'm all for that. But forcing them to give away property is insane.

~~~
JumpCrisscross
> _This idea is a pipe dream. It would be nationalizing property_

The author references precedent in the 1956 consent decree with Bell Labs [1].
If anything, that was more extreme. AT&T developed those patents.

The author's proposal doesn't involve Google surrendering its algorithm. Just
the index it compiled from public resources using a publicly-subsidized
Internet infrastructure. All without content owners' permission.

[1]
[https://economics.yale.edu/sites/default/files/how_antitrust...](https://economics.yale.edu/sites/default/files/how_antitrust_enforcement.pdf)

~~~
glitchc
What about the computing and infrastructure resources that Google dedicated to
build the index, and continues to dedicate to keep the index up to date?

This is a failed model. There is no incentive for Google to continue to update
the public version, and it would quickly fall out of date while Google focuses
on their own internal copy. It’s a solution suggested by lawmakers who
fundamentally don’t understand how computers work.

~~~
JumpCrisscross
> _What about the computing and infrastructure resources that Google dedicated
> to build the index_

How is this different from the 1956 consent decree? AT&T spent money
developing its patents. But on the basis of longstanding law around public
interest, it was forced to license them to third parties. (Note: not give them
away.)

> _and continues to dedicate to keep the index up to date?_

The author explicitly contemplates, again within the context of the 1956
consent decree and many subsequent and preceding actions by the U.S.
government (mostly around pipelines, _et cetera_ ), use fees paid to Google.

> _There is no incentive for Google to continue to update the public version,
> and it would quickly fall out of date while Google focuses on their own
> internal copy_

Google wouldn't be permitted to maintain dual states. Its algorithms would
have to use the public index.

~~~
glitchc
Conflating AT&T with Google is incredibly misleading. AT&T had become a de facto
monopoly approx. 20 years prior to that consent decree, not to mention being
subjected to an anti-trust lawsuit ten years prior to the decree.

The article makes the conceit of equating an online data store to the
telephone infrastructure. Google is not preventing other companies from
indexing the internet in any way, shape or form. There is no barrier to
another company building a similar data store from scratch using the existing
internet infrastructure. On the other hand, laying out your own wires has
physical limitations, especially when another company has claimed the most
optimal route. In such a scenario, your costs will always exceed theirs (more
infrastructure to reach the same consumers).

Regarding forcing Google to use the same index: How does DoJ plan to enforce
this? How would DoJ ever be sure that Google is in compliance? I once again
assert that the law magically assumes new technology to solve old technology
problems without fundamentally understanding how computers work. Computers are
copy-on-write by design. Delete always requires an extra step, and the
verification that only one copy exists is impossible to make in the digital
domain.

~~~
JumpCrisscross
> _Google is not preventing other companies from indexing the internet in any
> way_

AT&T wasn't blocking anyone from laying interstate copper. It's just
prohibitively expensive to do so. The analogy, down to the network effects, is
quite apt.

------
ohiovr
Google search is by far the company's most profitable service or product.
Compare the investment of putting a dollar into search over a year vs. putting
a dollar into android or youtube. You can legally force a company to work
against itself but we may have to say goodbye to all the side projects.

Also, this didn't sit well with me:

"Access. There might have to be limits on who can access the API. We might not
want every high school hacker to be able to build his or her own search
platform. On the other hand, imagine thousands of Mark Zuckerbergs battling
each other to find better ways of organizing the world’s information."

------
gopher2
I don't think this would help our social fabric, which the article says Google
is "tearing apart." Why do we need to break Google's monopoly on search again?
Which they don't actually have.

~~~
Nasrudith
Well, look at what they mean: whenever people complain about tearing the social
fabric, it usually means "other people aren't conforming to how I think they
should, this is new and I don't like it so therefore it is all its fault and
we are doomed if we don't get rid of it".

Given that the previous culprit of that exact same charge was gay marriage,
it is clearly a meaningless "family values" style euphemism to try to make
their complaints seem valid.

------
dekhn
Making the index available would be very valuable for scientific research.
When I was a Googler, we used the index (and Google Scholar's index) to do all
sorts of interesting science projects (DNA search, gene search, etc) But we
couldn't publish the results for various reasons. If the index (or a fragment
of it) was available sitting in parquet files (or possibly a better format for
indices) you could easily sit around doing spark jobs to extract all sorts of
interesting web data.

~~~
sytelus
There are tons of projects with serious sizable open source index, for ex,
[http://commoncrawl.org/](http://commoncrawl.org/). Even crawling 15B pages on
AWS is not super expensive on your own using well established OSS tool chain.
Bing already provides APIs if you don’t want to do crawling.

Web index, while important, is by no means most important part of search
engine (which I'd say is relevance). The OP article is written by someone
who has little clue that search engine involves massive amount of technologies
besides index and even relevance - everything from spell correction to
recommendations to query rewriting to answers to segment specific searches
like images/video/maps/local/product/news/entertainment/events, so on and on
and on. This is above and beyond the gigantic infrastructure needed to run all
these at scale, speed and cost effectively. All of these need to work
harmoniously with each other and designed keeping in mind weaknesses and
strength of each component. For example, index for news must be refreshed
almost in real time and relevance needs higher emphasis on location.

Search is not free and no one has figured out if someone just can provide
index and someone else can provide relevance and everything can just work as
efficiently as before (data point: each query consumes 0.3 Watt-hour of energy
at Google).

------
w_s_l
The index word suddenly gave me an idea: the entire internet transformed into
tabular data, searchable and open to public.

Even if we got Google's data, there's a whole lot of scraping to transform
irregular and disparate data. Typically you would have to google a keyword, look
through the search results, visit different websites (research mode), and then
consolidate separate sources of truths to build your own understanding.
Building a scraper for each website in the search results and displaying a
complete table data of the website. Why click through 100s of pages of
profiles or data when you can have it all in one view? Why bother with HTML,
when all a data-centric individual desires is data. Getting to the data is so
tedious and a long journey. Scrape the website, clean the data, make it
available for consumption, schedule & consolidate updates. For instance, a
hedge fund that scrapes certain group of websites to execute market orders for
an automated trading system.

A tabular data focused search engine would return tabular data, there would be
no HTML medium, just straight up raw data. For instance, instead of seeing the
comments rendered in a normal browser, imagine tabular data that describes
all the usernames, post times, and comments, minus the hierarchy.
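As a toy illustration of that idea (not the Web Scraping Language mentioned in this comment, just plain Python), here is a sketch that flattens a plain-text thread dump into rows. It assumes a dump format where a line of `------` starts a top-level comment, a line of `~~~` starts a reply, and the first non-empty line after either separator is the username. That format carries no post time, so the rows only have a username and a comment; the sample text and usernames are made up.

```python
SEPARATORS = ("------", "~~~")


def thread_to_rows(text):
    """Flatten a plain-text thread dump into rows, dropping the reply hierarchy."""
    rows = []
    author, body, expect_author = None, [], False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in SEPARATORS:
            # A separator closes the previous comment (if any) and opens a new one.
            if author is not None:
                rows.append({"username": author, "comment": " ".join(body)})
            author, body, expect_author = None, [], True
        elif expect_author:
            # The first non-empty line after a separator is the username.
            if stripped:
                author, expect_author = stripped, False
        elif author is not None and stripped:
            body.append(stripped)
    if author is not None:
        rows.append({"username": author, "comment": " ".join(body)})
    return rows


# Hypothetical sample in the assumed dump format.
sample = "\n".join([
    "Some page title that is not a comment",
    "------",
    "alice",
    "Indexing is the easy part.",
    "",
    "~~~",
    "bob",
    "Ranking is harder,",
    "as far as I can tell.",
])

rows = thread_to_rows(sample)
for row in rows:
    print(f"{row['username']}\t{row['comment']}")
```

Anything before the first separator is ignored, and wrapped lines are joined with single spaces so each comment becomes one cell.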

To build a focused crawler quickly, I came up with Web Scraping Language
([https://scrapeit.netlify.com](https://scrapeit.netlify.com)), and
essentially what I want to do is hire people to write WSL to scrape the web
and then sell a subscription to data-centric customers.

What do ya say HN?

------
zepearl
> Fortunately, there is a simple way to end the company’s monopoly without
> breaking up its search engine, and that is to turn its “index”—the mammoth
> and ever-growing database it maintains of internet content—into a kind of
> public commons.

Partially bull__it - even forgetting about the $ that anybody would have to
pay to Google for I/O traffic to scan its index (on which Google would put its
% of fair earnings for at least the I/O generated plus maybe the "discovery"
service for new URLs) that way any competing search engine would only have
access to what Google decides that is worth indexing => OK from the point of
view of the competing market (more search engines can rely on the same source
data and potentially extrapolate different results) but in many cases NOK for
the end-users as in all search engines that would use such service only the
data that Google would decide that is worth to be indexed could be used
(therefore a "closed bubble" set by Google).

If I were Google this proposal would probably be OK (more $ earned to cover
the fixed costs of indexing by selling I/O to outsiders and at the same time
it would lower the pressure/attention from regulators).

On the other hand non-indexed information like all that comes from the
Android OS (e.g. position of users, apps used by whom&when&etc), usage and
actions on web-pages tracked by g-analytics, DNS access (e.g. 8.8.8.8), and
etc... would be excluded - these are nowadays probably the most lucrative
sources of infos of "what's happening now" \+ "what are people doing now
where" which in turn generate the best answers to what "most" users want to
know, which is mostly about "now" and their regional area.

------
klntsky
> Google is especially worrisome because it has maintained an unopposed
> monopoly on search worldwide for nearly a decade. It controls 92 percent of
> search, with the next largest competitor, Microsoft’s Bing, drawing only
> 2.5%.

1\. The definition of the word "monopoly" is not applicable to this situation.

2\. Property expropriation reduces the competition within the society, while
also being unethical.

------
babypuncher
I don't think this would do anything. Indexing the web is comparatively easy
next to querying that index to find results relevant to the user's query.

On top of this, Google search is as good as it is thanks in no small part to
business practices that make Google as a company unsavory. Tracking your
browsing habits, reading your emails, and watching your geographical location
all feed the search algorithm, providing more seemingly prescient search
results. For people who are OK with this, I don't see how an alternative can
show up and compete with the years of data Google has already collected on a
given user.

Toppling the Google monopoly is going to be more about a cultural shift than
just giving competitors the data and algorithms they need to become viable.
Users need to be made more aware of how Google uses their data, then
alternatives can pop up marketing themselves on how they do (or don't)
leverage user data. I'm skeptical that enough users will actually care to make
this difference, at least in the short term.

------
goldenshale
It might be true that Google is in such a dominant position that the greater
society needs to think about regulation, transparency, etc., but I think we
need to develop better terminology that clearly differentiates this form of
dominance from an anti-competitive
monopoly. This is not a monopoly in the standard sense, and since anyone can
crawl the web and build their own index Google isn't hindering anyone from
competing.

Through hard work and good strategy they have achieved a dominant position,
and I think that as a society we should both honor that and recognize that
they have been pretty good players in the market. At the same time, if Google
were instead run by the Koch brothers we could be in dire straits, and so we
should also realize that we can't safely leave such power in the hands of
private citizens without demanding accountability, transparency, openness to
scrutiny, open dialogue, etc.

~~~
core-questions
> This is not a monopoly in the standard sense, and since anyone can crawl the
> web and build their own index Google isn't hindering anyone from competing.

It's close. It's as close to a monopoly as MS was with IE built into Windows
98 - while you can find and use alternatives, it's on you to do this, and
Google is the default all over the place.

Remember, too, how much power this gives Google - the top five search results
for a keyword are intensely powerful, for everything from directing commerce
to a particular site, to controlling news narratives, and more. Companies pay
millions for SEO "professionals" and software like Moz / Stat to ensure
they're in the top rankings. The latter in turn need to defeat google's anti-
bot stuff to get search results, so they're in their own arms race...

The whole ecosystem is dirty and gross and gives too much power to too few
people with too much money. It'd be fine if the average consumer was conscious
of this and could reasonably make the choice to use a different engine, but
this is highly controlled as well - think of the Google donations to Mozilla
that they may not be able to live without.

------
thom
I don't personally think Google's 'index' becoming public would make a single
bit of difference on its position as number one search engine (and therefore
advertiser) on the internet, unless you expand that word to include all the
algorithms and infrastructure around it.

------
w0mbat
Ex-Googler here. This proposal makes no sense to me.

Anybody can write a web-crawler and get a searchable indexed snapshot of the
web. What Google is good at, is ranking your search results based on relevance
and quality so the best results are at the top and you never see the terrible
results.

------
tdeck
I am no expert on Google search, but it seems to me that "accessing the index"
and "retrieving ranked results" must be one and the same. That is, the very
structure of the index is designed around a particular ranking scheme in order
to make accessing it computationally tractable. In light of that, how would
one expose an API to offer access to the index _without_ having something
about Google's curation or ranking built in. A query API is clearly out
because query responses include some results and leave out others, but even a
dump runs into the problem that the data was collected, parsed, and structured
with a particular scheme in mind.

------
martin_drapeau
The article mentions a precedent: "in the 1956 consent decree in the U.S. in
which AT&T agreed to share all its patents with other companies free of
charge."

Today the picture is completely different. Google, Amazon, Facebook and Apple
are global - their infrastructure is virtual - not physical lines and cables.
If a government wants to cause harm to one of those companies, they can decide
to "move" somewhere else.

In addition GAFA is capturing dollars from everywhere in the world and
bringing it to the US. It's in the US government's interest to tread softly
here. No matter how the media spins things, GAFA is great for the US economy.

~~~
learnfromstory
Uh no. Google has hundreds of billions of dollars invested in real non-virtual
assets, majority of which are in the USA.

~~~
martin_drapeau
Sure, however they are not directly tied to their bread and butter - pay per
click. They could over time divest and move data centers and jobs elsewhere
without directly affecting revenues. Would be painful and risky but feasible.

------
vinayms
Can someone explain why it's fair or acceptable to force Google to make its
index public, the one it built with its own time and money, instead of having
some open source initiative building another index from scratch?

~~~
tynorf
It's posted a few other places in this thread, but it already exists:
[https://commoncrawl.org/](https://commoncrawl.org/)

------
qtplatypus
There are problems with this article.

First, Google's competitive edge was never from its index. It was from its
ability to give good matches to questions.

Second google being forced to share information like this might be considered
a government taking of google property. Which would require a massive amount
of compensation.

Third this has a condition that google isn’t allowed to remove anything from
its index. However this flies in the face of the right to be forgotten,
copyright law, child exploitation material law etc.

Fourth it says google must allow people to add to the index. Which is just
inviting the index to be spammed with garbage.

------
mhkool
Bloomberg talks about 'making the index public' and the way to do it is with
the API. But the API behaves like the website of Google and does not make the
index of the website content available. Google's new AI-based algorithm is
trained by humans with a strong bias. An example of this is the disappearance
of websites of respectable medical doctors who use non-mainstream methods. It
is one thing to agree or disagree with someone, but completely dropping a
website from search results is effectively censorship and who is Google to
determine who sees what?

------
caiocaiocaio
"Facebook is controlling people's minds and ripping apart the social fabric.
Also, you'll notice some 'share' buttons to the right of the article. Remember
to 'like' us!"

------
dumbfounder
"High-volume users (think: Microsoft Corp.’s Bing) should pay Google nominal
fees set by regulators. That gives Google another incentive for maintaining a
superior index."

How do these nominal fees hold a candle to the advertising dollars they rake
in? Their dominance in advertising fuels their R&D to keep their algorithms
superior. They have to maintain their superiority to keep the
advertising dollars flowing. If suddenly anyone can offer Google results then
a lot of that ad money dries up. Then the R&D money dries up.

------
whalabi
Similarly I've been considering lately that the only real way to break
Facebook's monopoly is to target its network effect, by forcing it to
interoperate on some standard protocol, like ActivityPub.

So, say, people could use Google+ (if it was sticking around) to see Facebook
content without giving data to Facebook.

Not sure how that would work with Twitter, I guess they'd just show media and
clips of text posts.

Honestly I think I'd love it - just choose your favourite social network and
only use one.

Up to each network to implement an interface that makes sense.

------
BluffFace
First, Google doesn't have a monopoly on 'Search'. Competitors provide
entirely comparable products and are not that far off from what Google
provides. Other industries? Maybe. Search? Not in the slightest.

Second, that is a terrible idea. It would be a fucking crisis if Google's
entire index was made public. I don't think the author understands the scope
and nature of what Google indexes, as well as the protections Google has
implemented.

------
IronWolve
Just "Google it" means to find an answer, if that answer is being changed by a
megacorp for unknown and sometimes nefarious reasons, I'd say we have a
problem.

Would you want to use a library that filters its books to push an ideology?
Would you want to listen to the radio if they only pushed top 40? Would you
want to have a doctoral thesis that needs corporate approval? Research books
that exclude facts based on corporate profits?

------
mattferderer
Is Google's Search quality that much better than Duck Duck Go or Bing? I get
about the same quality of content when using any one of the three.

All three are terrible at serving me up websites to buy crap when I typically
want to learn about something. I would love a filter that says "Don't try to
sell me anything"!

I believe Google's biggest advantage is in their marketing & that the word
"Google" now is a synonym for search.

~~~
exhaze
I don't know the answer to this - you could be right, but just outright
questioning that Google's search quality is no better than DDG or Bing and
basing it off of a single data point is pretty foolhardy. Your second
statement reinforces this point - I've run a lot of Google ads, and they're
quite effective. Google ads are very effective at selling things to people.

In the future, when coming up with an opinion about something, I'd encourage
you to look at statistics and combine that with your own personal experience.
You'll often learn something you didn't know, and come up with a more grounded
opinion.

~~~
mattferderer
I was looking for personal opinions to gather data from, hence the question. I
was not trying to cite stats & persuade with anything more than my personal
experience. It would have been very easy to link to articles such as this Yale
study - [https://www.networkworld.com/article/2225489/research-
buries...](https://www.networkworld.com/article/2225489/research-buries-
microsoft-s-bing-vs--google-claim.html) if I wanted to do that. FYI, this
argues the 2 are fairly close.

Also asking for personal opinions will give me much different results than
looking up a blind test research as brand loyalty will play a big part in ones
like Duck Duck Go.

------
ryacko
Why does everyone assume Google is automated? A single website containing a
small amount of knowledge or even a leaked archive of Diebold emails will show
lower on Google search results than a web forum or some listicle.

The real web crawlers are people who surf the web, Google likely takes
information from analytics and outbound link tracking to determine the most
popular (not the most informational) website.

------
wongarsu
Forcing them to give anyone access to their index at a fair price wouldn't be
unprecedented. In Europe that's a common method to deal with infrastructure:
anyone can lay a phone line, but you have to sell access to competitors at a
fair price.

Alternatively we could just publicly fund a crawler. A project like Common
Crawl, just bigger and better, with some simple indexes prebuilt.

------
sprash
There is already commoncrawl[1]. Nobody managed to build a useful search
engine with it. Apparently it takes much more to produce meaningful search
results. Making Google's index public won't solve anything. What we really need
is a viable competitor to google.

[1]: [https://commoncrawl.org/](https://commoncrawl.org/)

------
Animats
Wouldn't help.

A classic antitrust breakup might.

\- Google Search - no account required or offered.

\- Doubleclick - ads on third party sites, not including Google sites.

\- Alphabet Services - mail, docs, etc. - anything that requires a login or
pay.

\- Alphabet Heavy Industries - cloud, large scale business services, etc.

------
nurettin
Making Google's index public may not be as simple as sharing a file.

We don't know exactly how Google indexes or ranks pages. It might be a
completely different way of looking at indexing, or it might be so optimized
for Google's use cases that we don't even have the software or the hardware
to instantiate and search through it.

------
henvic
There is no barrier to entry for indexing the web. It is unethical to force
Google to make its index public through regulation. Google has its problems
and we should attack them, but not like this. It is not a monopoly. It is just
the largest player because it attends to what the public wants.

Let other indexes such as DuckDuckGo win through the market.

------
swalsh
The data that google has holed up that I want more than anything else would be
trends, broken down by page number. As interesting as the data of how often
people search for a term is, I find it far more interesting when they can't
find the thing they're searching for. That's a hole in the market.

------
bvrmn
Article on bloomberg wants to make something public and wants money from me
simultaneously.

------
lunulata
this guy doesn't understand search, google or reasonable policy. thanks for
the lawls

------
t4ndem
All my results on the first screen on mobile and web are just ads... The good
old days of discovering new websites and new ideas went away after the top
results ended up being just ADS or super branded sites that we already know
about...

------
retpirato
Google's dominance comes less from its search index, or anything else, than
from the large userbase it amassed in the early days of the internet, when no
other search engine came close.

------
paganel
It would be even more useful if they made all the GMaps data available, like
locations, points of interest, etc. Right now I think you need to have a
credit card attached to your account to get even a slice of that data for
free.

------
vgetr
This sounds eerily like what happened with the railroads, and I suspect it
would have some unintended and unsavory side effects. At the very least, it
would open up opportunities for the Wesley Mouches of the world.

------
40acres
The fundamental tension of technology monopolies is convenience v.
competition. In the same breath that Google and Amazon are criticized for
their dominance critics bemoan the fragmentation of online streaming.

------
matthewfelgate
I'd like to break Microsoft's monopoly on desktop OS. And GitHub. And Skype.
And LinkedIn. And Azure.

Search is virtually the only industry Microsoft operates in where it isn't
the monopoly or the biggest player.

------
akrymski
If only replacing Google were as simple as building another search engine.
It’s like trying to beat WhatsApp by launching another messaging app. Any kid
could do that. The tech is not the hard part.

------
yy77
The key to Google's monopoly is not only its indexing and algorithms, but
also the brand, the Chrome browser, Android, and integration with all the
other Google products.

------
_pmf_
The index is the vast majority of their infrastructure. Queries just do not make
sense from external infrastructure; the bandwidth just isn't there.

------
vectorEQ
And then, next thing, people will want an API into this index because it gets
updated frequently... tadaaaa GOOGLE SEARCH 2.0

------
ahallock
An API is not the same as making the index public. I don't think the author
has a grasp on all the technical details.

------
dharma1
Would definitely make Siri and Alexa more relevant if they had access to
Google's Knowledge Graph

------
eqtn
Another relevant option would be to ban targeted advertising, both online and
offline.

------
rayiner
“Information wants to be free!”

------
lonelappde
Bloomberg Opinion outside of Matt Levine doesn't have credibility to me.

------
jklinger410
If you take away the moat of any business it helps their competition.

------
ilaksh
The solution to technopolies is decentralized protocols.

------
mgamache
Yes please, a new Golden Age for Black Hat SEO!!!

------
estebarb
I don't see how this could reduce Google's "Monopoly".

Searching with an API is the same as searching from the browser, including how
the results are sorted: garbage will be garbage, and should be demoted in both
the API and HTML versions. And of course, downloading the index is out of the
question.

I think a better way to improve competition would be to reduce cloud
computing costs (bandwidth in particular). But of course, regulators in the
USA would say that it is communism or something... Sigh.

------
ElijahLynn
Is Duck Search's Index Public?

------
ejz
Boo

------
based2
add Bing to the Index

------
tempsolution
This suggestion is about as intelligible as saying back in 2006: "To break
Microsoft's monopoly in the OS market, make its kernel source code public".
(i.e. Absolutely useless)

------
auslander
It takes $Billions to get a new search engine to the same level as Google.

Candidates with $Bs to spare (!) are Apple, Facebook, Microsoft. The first two
never started on one, and Bing is quite stupid so far. My default SE is Bing.
I wish it was Apple :)

------
arch-ninja
Not sure if relevant, but I've hated Google's search for a unique reason: I
find it horribly slow, especially compared to what it used to be. It's so
slow, I've designed + partially implemented an alternative for my own use:
[https://github.com/Jeffrey-P-McAteer/dindex](https://github.com/Jeffrey-P-McAteer/dindex)

I've only tested with 1000 records, but the query times are all <200ms.

~~~
judge2020
Maybe it's your part of the world, but my TTFB to google search pages is less
than 150ms and the "generated in x seconds" is usually less than 1 second.
That's pretty good for an index searching effectively every public internet
page.

~~~
arch-ninja
I measure from when I hit enter to when my screen is full of results, and just
_rendering_ google.com takes a full second on my MacBook with 8 GB RAM and an
i5 processor. It's so bad I have a shell script which forces my processor into
the 3200 MHz range when I'm on my browser workspace. When I'm not looking at
my browser (+no downloading files, no audio) the same script sends a SIGSTOP
to it so it isn't eating CPU cycles while I'm writing code.
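
For anyone curious, the pause/resume half of that idea can be sketched in a
few lines. This is not the actual script, just a minimal Python illustration
assuming a POSIX system; the function names are made up for the example, and
finding the browser's PID and deciding when to pause are left out.

```python
import os
import signal


def pause_process(pid: int) -> None:
    """Suspend a process: it keeps its memory but is no longer scheduled."""
    # SIGSTOP cannot be caught or ignored by the target process.
    os.kill(pid, signal.SIGSTOP)


def resume_process(pid: int) -> None:
    """Let a previously stopped process run again."""
    os.kill(pid, signal.SIGCONT)
```

Call pause_process(pid) when leaving the browser workspace and
resume_process(pid) when coming back. A stopped browser also stops playing
audio and downloading, which is why those cases are excluded above.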

