This is what makes me wonder why we don't have a LOT of competing search engines. Perhaps i'm vastly under-estimating the technology and difficulty (I could well be - it's not my domain) but it surely it can't be THAT hard to spawn Google-like weighted crawl-based search results?
It's a long-since solved problem - heck, pageRank's first iteration recently came out of patent protection - it could just be copy'pastad. Why aren't all the big companies Doing Search?
The spammers won. Google gave up and settled for "we like the right kind of spam—the kind that took a little effort, and makes us money".
Searching for "north face glaciation" did help as the first page of search results did have one entry on the topic I was actually searching on!
Maybe they should have a "I'm not buying anything" flag!
Many of the top search queries are navigational searches for brands.
And so if tons of people are searching for your brand then if there is a potentially related query that contains the brand term & some other stuff then they'll likely return at least a result or two from the core brand just in case it was what you were looking for.
Google pays some large number of people to do search and grade the various results they get to see if the answers are good, which then helps feed back ML.
Heck, according to this article, google has been paying people to evaluate their search results since 2004.
Google for general search.
Duckduckgo fir general if you want something a bit more private but not extreme enough to run your own spiders.
Bing mostly for porn search - not being snarky some people do consider it to have better results.
It's easy to gather the necessary data, but it's hard to know which parts of that data are the most relevant for finding good content and avoiding bad content. Is it more relevant if key words show up in links or titles than in the body of the text? If so, SEO spam sites will include a bunch of keywords in links and titles. Is it more relevant if keywords show up in the first 200 visible words of the page? If so, spam pages will make tons of pages with relevant keywords at the top.
The hard part about building a search engine isn't indexing the internet, it's adapting to spam. Spammers are continually adapting to changes in the algorithm, so the algorithm needs to adapt as well. And the more popular your search engine is, the more money you make and the more able you are too adapt to spam (and the more spammers focus on your engine).
So, the problem isn't that Google has a better index (though I'm sure it does), the problem is that nobody else has the will to spend the money necessary to tune the search algorithm to stay on top of spammers. When Google started, companies didn't care as much about improving their index and instead focused on building their other content (Yahoo, MSN, etc). Google saw the value of search and got a lead on everyone else in terms of curating results, and now they have the momentum to stay in front and have shifted to building content to improve monetization. Nobody else has the monetization network for search that Google has, so they'll continue having the problem that other companies had (Microsoft wants to point you to their other services, DuckDuckGo is limited by their commitment to privacy, etc).
In short, Google wins because:
- it was better when it mattered
- it makes money directly from search
- its other services improve their ability to understand what users want, which improves search quality and ad relevance
You can't make a better algorithm by being clever, you make a better algorithm by having better data, and that's hard to come by these days. The only way I can think of a competitor stepping in is if they target an underserved demographic and focus data collection and monetization there, and DuckDuckGo is close by targeting privacy conscious power users.
The irony there is that DuckDuckGo can't collect much of that data precisely because of their privacy focus.
You didn't just hit the nail on the head; you drove it all the way in with a single blow. Bravo.
Outside of ad revenue, search has always been seen as something of a "charity" effort for the internet. It's "boring" infrastructure work that can be critically useful but doesn't really make money directly on its own. No one wants to pay a "search toll" and there's no government agency in the world that the internet would trust as a neutral index to run it as actual tax-basis infrastructure.
This is far more valuable than the general page rank algorithms that were initially developed and have already been duplicated many times in academia and business.
Google custom tailors results for each and every machine. Even if you're not signed in, Google uses your browser fingerprint, the OS it's reporting and location/IP data to custom fit results. There is no "stock" google result.
This is something DuckDuckGo et. al. can't do if they want to focus on a privacy model. DDG does offer location specific searches, which can be helpful.
Consider DuckDuckGo, which sells itself on privacy, but after more than a decade has only 0.18% market share. Without the power to make it the default in an OS or browser, you'd have to have a really strong value proposition to convince people to switch.