Search engines need money to operate (we, Blekko, pay a ton of cash every month to keep our portion of a data center in Santa Clara humming) and search advertising is an excellent product, but it is also a finite market. Let me explain.
So let's assume for the sake of argument that the amount of money everyone in the world is willing to spend on advertising is fixed. You know, like $32B/year. (I don't know the actual number; that is just made up for illustration.) These are "companies" (from single users to large multinationals) who are willing to pay money to a person who puts their advertisement in front of a potential customer.
So let's say Alice at BigCorp has an advertising budget of $1M/year. Maybe she is going to buy TV spots with most of that and spend $100K on "Internet" advertising. She can either talk to a bunch of "properties" (which is what she would have done in 1995) or she will buy ads on "Google", which means they might pop up on AdSense for Content pages, or via AdWords in various searches, perhaps on your Gmail window, or your News feed. And she'll only pay for them when they get clicked on, and she'll use her analytics to try to figure out how "impactful" that was. Or maybe she only has a $100 ad budget and she will blow it all on AdWords, putting her ad on search queries that people might make when they are looking for her business.
The small search engine is at a disadvantage, not from a technology perspective (the searches can be better than Google's pretty easily for highly contested searches) but from a revenue capture perspective.
What is worse, bad advertising networks are really bad: they can serve up malware, as a number of popular blogs have discovered. So people who are very brand conscious or who have been burned by a bad ad network will shy away from those networks, making non-Google networks less effective (fewer advertisers, so less competition for ad insertions), and search engines that use them get commensurately less revenue per thousand searches.
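To make the "fewer advertisers, so less competition" point concrete, here is a rough sketch. It assumes (my assumption, not something stated above) that each ad slot is sold in a second-price auction, and that bids are drawn uniformly between $0 and $1. The function names and numbers are made up for illustration only.

```python
import random


def second_price(bids):
    """Price paid by the winner of a second-price auction: the runner-up's bid."""
    if len(bids) < 2:
        return 0.0  # no competition, so no price pressure (reserve price omitted)
    return sorted(bids, reverse=True)[1]


def average_price(num_advertisers, trials=10_000, seed=0):
    """Average clearing price over many simulated auctions.

    Bids are uniform on [0, 1]; that distribution is an illustrative assumption.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        bids = [rng.uniform(0.0, 1.0) for _ in range(num_advertisers)]
        total += second_price(bids)
    return total / trials


small = average_price(2)    # a thin network with two advertisers per slot
large = average_price(20)   # a crowded network with twenty
print(f"2 advertisers: {small:.2f}, 20 advertisers: {large:.2f}")
```

With uniform bids the expected clearing price is (n-1)/(n+1) for n advertisers, so roughly a third of the top possible bid with two advertisers versus about nine tenths with twenty. Real ad auctions are more complicated (quality scores, reserve prices), so treat this as a toy model of the direction of the effect, not its size.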
The one redeeming factor is that when you have the ability to crawl and index enough of the web, that asset gives you the ability to do some very interesting things. Fortunately, these are things that others will pay for (because neither Google nor Microsoft/Bing will give you access to their index). The downside is that it's not as lucrative (on a $ per kilo-core-cluster millisecond level) as running the combination of the world's most used search engine feeding you the world's most used advertising network.
If you ever had any doubt, Google's advertising business funds them like Microsoft's Office business funds Microsoft. If you ever split Google in two, so that its ads business offered services to anyone on a non-discriminatory basis, the world would be a more interesting place (and there would be several really interesting search engines with their own editorial slant, not just a few).
This is largely true, although the "growth" in Internet advertising revenues has shown up as a decrease in other media advertising. From newspapers to radio, those ad dollars are shifting to the net, but the overall size of the pie is constant or shrinking slightly according to Advertising Age (http://adage.com/).
- Google takes a huge (also: normally not published/known - it was 69% the last time I saw a number, but they can adjust it as they like) chunk of the ad revenue, so advertising on the Google network will be very expensive in comparison
- Google has no/terrible customer service (e.g. you can get banned for their apparent misinterpretation of EMEA trademark laws, by using "iphone" in your ads that point to your page where iPhones are sold - good luck getting unbanned again)
The biggest problem is currently the market share of alternatives (including Bing e.g. in Europe) - and the fact that DDG does not have any ads.
When we switched to their ads our revenue increased 10x which made us super profitable (we were primarily a federated engine like DDG though).
At the end we sold it for little money, focused entirely on the enterprise (which had always been our main focus) and sold to IBM for real money.
The best way to take on Google is to render web search obsolete (the same way Microsoft is becoming irrelevant because the PC is becoming irrelevant), not by trying to match with limited resources and knowledge what they do very very well with enormous resources and know-how. Thinking otherwise is not being passionate, it's being presumptuous.
Depressing. Mozilla can and should take them on; Google Search is biased, given that its real customers are the advertisers, and it is extremely ad heavy. Amazon could make a dent too, but they'd be biased as well.
We've seen this work when people take on CraigsList and Ebay and other "unstoppable" tech companies. Don't attack head on.
For example, I can think of a few popular types of searches that Google doesn't do super well: code search, product search, local search, genealogical search, real-estate search.
Blekko is of course working on searching specific namespaces, but that's not what I mean. I mean taking on a single underserved domain and really making it perfect.
This is good. I used to work with an industry-specific portal that was perfected to work with said industry. Google would never be able to touch this space.
Despite it being a smallish industry, there were two large players and a few smaller players. The innovation even in such a small space was quite astounding.
There are definitely some areas that are too specific for Google to really work with. It is very good as a general search engine, but if your time is dependent on getting information fast in a specific industry, Google falls flat.
The main issue with this strategy is that you only have a small subset of the population and you have to be an expert in many domains to get it right. Thus, you'll never be as large or profitable as Google. Of course, you'd better have people to talk to on the phone. That'll kill this engine before it gets off the ground.
The only way I can think of toppling Google is if you created an engine that really focuses on productivity and gaining market share from the people who really need information and who are willing to pay for said information and offer them a free version that is better than their industry-specific tools, then after gaining penetration and toppling some of their industry players, branch into focusing on finding cute cat pictures and the like. There is more than one way to gain mind-share. Finding a better way to find links to movie reviews is definitely not it.
I think there's a market for search engine competitors, maybe not in the "general public" category, but certainly so for some verticals - I was recently asked to build something that requires either a crawler or access to a search engine API, and I don't know if Google is what I want (probably will start with Bing if we end up doing the project).
By the way, is it possible to get deep query results from Google or another search engine? Say I need to see top 1000 results for 100K keywords and use that as a seed for my own crawler. Is anyone offering that?
DDG also take a lot of other people's crawls.
DDG then add a layer on top. For example, 'official sites' are identified.
(I was reading about this just this morning, and frustratingly I cannot find the post again.)
I use Blekko. I like it, but it needs a lot more people creating topical lists of sites before it really offers something unique.
That already exists in a sense. You could start with a bunch of ASF projects... Nutch, Solr/Lucene, ManifoldCF, Droids, Hadoop, OpenNLP, Tika, Mahout, UIMA, etc. and build a reasonably good search engine. The problem isn't writing search code; it's scaling the darn thing up to "Internet Scale" and other things that get ya. Can you imagine how much hardware and how much bandwidth it takes to continually crawl the web, download, parse and index pages on the scale of a Google?
The other problems are things like preventing spammers from gaming the system, etc. Whether or not a search engine where all the algorithms were public would be easier to "game" is, I suppose, an open question. I think most of us intuitively feel that it would be, but maybe not.
2) SEO has made search for some items really lousy. Some way of filtering those results out might be handy.
3) Sometimes people just need a quick answer to a question. "What's the population density of Manhattan?" is reasonably easy to get an answer to, but "What's the name of that TV programme I used to watch in the 80s with a character called something like flimby or flombu and it was a steel egg thing" is much harder to search. (It's a demonstration of just how good search is that I can bash in a few keywords for such vague queries and get useful answers.) Some sites try to solve this by letting you ask a question and have humans answer it (Yahoo Answers; Stack Exchange), but a better search engine would still be better at finding these kinds of answers.
4) I make a post to Facebook. Or I see a post on FB. A few weeks, months, later I want to find that post again. But I have no hope.
5) I have about 3 million bookmarks. They are untagged and poorly named. I want something to crawl those and build an index, so that I can then find the URLs I want.
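For item 5, the core of "crawl those and build an index" is an inverted index from words to URLs. A minimal sketch, assuming the page text has already been fetched somehow; the `pages` dictionary, the example.com URLs and the function names are all hypothetical, and a real tool would need fetching, ranking and on-disk storage to cope with millions of bookmarks.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each token to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for token in tokenize(text):
            index[token].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every token in the query (AND search)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical bookmarks: URL -> page text already fetched by some crawler.
pages = {
    "http://example.com/a": "Deep dish pizza recipe with sausage",
    "http://example.com/b": "Thin crust pizza places in Manhattan",
    "http://example.com/c": "Sourdough bread recipe",
}
index = build_index(pages)
print(sorted(search(index, "pizza recipe")))  # only the first bookmark has both words
```

Because the index is built from page content rather than bookmark titles, it does not matter that the bookmarks themselves are untagged and poorly named.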
1. It is more vertical. Allows for deep search with filters.
2. Ditto. It uses hashtags, forward slashes and bangs. All with a different functionality.
3. I'm up to my neck in machine learning to improve this.
4. It allows you to search your own posts. Yes, you can post to Nuuton. No, it's not a social network.
5. It allows tagging with hashtags. Say #pizza #recipe.
Plus it's available as an API.
I disagree about the flat search engine being outdated. I hate all the social crap, videos, and local listings the big search engines shove down your throat. I've heard a lot of other people voice the same concerns.
I use DDG as my main search engine, but sometimes I just give up and have to fall back to Google, which I think is because of their reach or maybe filtering. It happens mostly with pages/searches specific to my country (IN).
Any new player cannot be just as good as Google. It must be much better than Google. If you manage to build something that is objectively better, you can expect investors to shower you with money. I would not be surprised if Google, Facebook and MS entered a bidding war to buy you. Don't know if Apple would be interested though.
The only way Google can improve search is to drastically change their search algorithms. Perhaps throw away whatever they are doing right now and approach it from a new direction. This is the opening I believe new players have. A new approach to search, like contextual search, will probably be enough to seriously threaten Google.
I think human powered search could be achieved by Google. All they need to do is track our clicks on the serps and interpret which sites are good. Rotate all sorts of sites in the serps and gradually build a database. The whole thing would be like a wiki, with everyone contributing a little, and benefitting from the whole. The only way to accurately beat spam and low quality is to use human feedback. They probably do use human feedback already, just in a different way.
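As a rough illustration of "track our clicks and interpret which sites are good", here is a sketch that re-ranks results by a smoothed click-through rate. Everything in it (the `ClickFeedback` class, the prior values, the example hostnames) is hypothetical, and it ignores real problems such as position bias and click fraud. It is not a claim about how Google actually does it.

```python
class ClickFeedback:
    """Accumulates impressions and clicks per (query, url) pair."""

    def __init__(self, prior_clicks=1.0, prior_impressions=10.0):
        # Smoothing prior (an assumption): an unseen URL starts at a 10% CTR,
        # so new sites still get rotated in above known-bad ones.
        self.prior_clicks = prior_clicks
        self.prior_impressions = prior_impressions
        self.impressions = {}
        self.clicks = {}

    def record(self, query, url, clicked):
        """Log that `url` was shown for `query`, and whether it was clicked."""
        key = (query, url)
        self.impressions[key] = self.impressions.get(key, 0) + 1
        if clicked:
            self.clicks[key] = self.clicks.get(key, 0) + 1

    def score(self, query, url):
        """Smoothed click-through rate for this (query, url) pair."""
        key = (query, url)
        clicks = self.clicks.get(key, 0) + self.prior_clicks
        shown = self.impressions.get(key, 0) + self.prior_impressions
        return clicks / shown

    def rerank(self, query, urls):
        """Order candidate URLs by smoothed click-through rate, best first."""
        return sorted(urls, key=lambda u: self.score(query, u), reverse=True)


fb = ClickFeedback()
for _ in range(20):
    fb.record("pizza", "good.example", clicked=True)    # always clicked
    fb.record("pizza", "spam.example", clicked=False)   # shown but never clicked
ranking = fb.rerank("pizza", ["spam.example", "new.example", "good.example"])
print(ranking)
```

The site people keep clicking rises to the top, the one nobody clicks sinks, and a never-seen site lands in between because of the prior, which is the "rotate all sorts of sites and gradually build a database" idea in miniature.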
I'm wondering if reddit and Facebook could use this approach to build search engines. They do have large databases of human preferences.
The machine learning (AI) approach:
Another idea would be to distill information from the web Watson-style and try to answer many questions directly instead of redirecting to external pages. So far Siri, Watson, Wolfram Alpha are ahead in this field.
Google is no longer about search.