Ask HN: Is there room for another search engine? - jmstfv
======
Animats
That's a good question, and something I've spent much time on. Cuil
(2008-2010) tried. I knew some of those people. It cost them about $30 million
to launch a full scale search engine. They had no revenue model. In
retrospect, they were hoping to be acquired by somebody. It was some ex-Google
people trying to replicate older Google technology. They had a great launch,
but the system wasn't very good and traffic rapidly fell off. Their big
selling point was that they could do the job on less hardware than Google
used.

Yahoo had a search engine from 1995 to 2009. Yahoo is now a Bing reseller.
There was a period around 2007 when Yahoo search was better than Google
search. They pioneered integrated vertical search: special cases for weather,
celebrities, and such. But Google copied that.

Blekko (2010-2015) had a scheme with "slashtags" which attracted a small
following but never caught on. They were trying to crowdsource part of the
problem. Eventually, Blekko was acquired by IBM's Watson unit, and ceased
offering public search.

Bing, Microsoft's entry, remains active. Microsoft seems to have given up on
trying to raise Bing's market share. Bing no longer has a CEO of its own; it's
just a miscellaneous online service Microsoft provides. It's still #2 in
search, but only has 7% market share.

There remain a few little search engines. Ask, formerly Ask Jeeves, continues
to operate, but has only 0.17% market share. Ask is from IAC, in Oakland, a
spinoff of Barry Diller's Home Shopping Network. Excite, formerly Excite@Home,
with 0.02% market share, continues to operate. Excite, in its day, was a hot
startup powered by too much venture capital.

Outside the US, there's Baidu (China) and Yandex (Russia). Neither has much
traction outside their home countries.

It's possible to do a better search engine than Google from the user
perspective. It's not clear how to get it to profitability. There are two
things Google does badly - business legitimacy and provenance. Google doesn't
background-check businesses online. (I do that with Sitetruth; it's not only
possible, it could be done better with a tie-in to costly business background
services such as Dun and Bradstreet.) This allows bogus and marginal
businesses to reach the top of search via the usual SEO techniques. Google is
also bad at provenance - figuring out that site A is using text derived from
site B, and thus B should be ranked higher. This is what allows scraper sites
to rank highly in Google.

Fix those two problems, and a new search engine could be better than Google.
Whether anyone would notice is questionable. Profitability would be tough. The
reward for success is high. Search ads are more relevant and more profitable
than any other form of advertising. When someone sees a search ad, they're
actively looking for the item of interest and may be ready to buy. Almost all
other ads are interruptions or annoyances. That's the basic reason for
Google's success.

~~~
thesmallestcat
There are way more than two things that Google does wrong. Remapping my search
terms into oblivion so it can pretend it's fast is the worst one. Especially
when this happens to a query I've modified to quote "every" "single"
"flipping" "term." I think Google is cheating, and that their usable index is
much shallower than they'd have you believe.

What's needed is a search engine with functional queries (as opposed to
Google, which now only operates in "the user is drunk" mode), that doesn't
give a damn about your robots.txt, and that can capture content in a way that
is more akin to archive.org than Google's shoddy and increasingly absent
cache.

Another issue is spam/false matches. Why does Google return illegitimate
results? Because, let me tell you, any search for "some nifty computer book
pdf" returns pages upon pages of bogus links leading to ad link mazes. A
crawler should be able to trivially crawl such a page, determine that no PDF
is linked, and blacklist the result, but this doesn't happen.
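The check being described is indeed simple. A minimal sketch (the helper name and HTML snippets are invented for illustration; a real crawler would fetch the result page first):

```python
import re

def links_to_pdf(html: str) -> bool:
    """Return True if the page's HTML contains at least one link that
    plausibly points at a PDF (href ending in .pdf, ignoring query strings)."""
    hrefs = re.findall(r'href=["\']([^"\']+)["\']', html, re.IGNORECASE)
    return any(h.split("?")[0].lower().endswith(".pdf") for h in hrefs)

# A result page that really links to a PDF passes the check...
assert links_to_pdf('<a href="/files/book.pdf?dl=1">Download</a>')
# ...while an ad-maze page promising a PDF but never linking one would be
# flagged for demotion.
assert not links_to_pdf('<a href="/click?offer=42">FREE PDF HERE</a>')
```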

Google is slow and preoccupied. Their business is ripe for disruption.

~~~
kjhughes
To disable remapping of terms in Google, use _verbatim search_ : On the search
results page, choose "Search tools -> All results -> Verbatim"

~~~
dredmorbius
There's no syntax I'm aware of that tells Google _not_ to "correct" or
expand the results.

When I'm seeking specific matches, I'm seeking _really_ fucking specific
matches. Google annoys me to no end.

~~~
user5994461
+word

It used to force the word to be searched exactly as written.

~~~
dredmorbius
"Used to" is the operative phrase. That was among the negative consequences of
Google+, as +<term> notation apparently was going to be reserved or repurposed
for that somehow, but never was.

I've done some large-scale searching where the most relevant detail is _how
many results are returned_, most particularly for a specific domain. (For
which, incidentally, there's no handy mechanism, so it's <array of terms> *
<array of domains>, a multiplicative explosion of searches, plus about a 45s
timeout per query to avoid triggering bot defenses by Google.)

Such as this:

[https://www.reddit.com/r/dredmorbius/comments/3hp41w/trackin...](https://www.reddit.com/r/dredmorbius/comments/3hp41w/tracking_the_conversation_fp_global_100_thinkers/)
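The <array of terms> * <array of domains> expansion is just a Cartesian product; a rough sketch of how the query list blows up (the `site:` query syntax is the only assumption):

```python
import itertools

def build_queries(terms, domains):
    """One query per (term, domain) pair. The count is multiplicative,
    which is why a ~45 s pause per query makes large runs take hours."""
    return [f'"{t}" site:{d}' for t, d in itertools.product(terms, domains)]

queries = build_queries(["net neutrality", "encryption"],
                        ["nytimes.com", "bbc.co.uk", "reddit.com"])
assert len(queries) == 2 * 3  # a real run would sleep ~45 s between each
assert queries[0] == '"net neutrality" site:nytimes.com'
```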

------
cocktailpeanuts
The short answer is YES, but the long answer is, if you're thinking about
building a startup, this should NEVER be the question you ask.

All successful companies that came out of nowhere and disrupted an already
stable industry never started out thinking "How do I build another X?" or
"How do I disrupt X?" They all built something they thought the world needed,
and it went on to somehow "disrupt X".

So if you're starting out thinking "I want to build a search engine, if
there's room for another," that will never work, because you don't even know
what you're solving; you'll be frantically searching for the question
throughout your "startup" life.

~~~
dredmorbius
Dunno. It's worked before.

[http://infolab.stanford.edu/~backrub/google.html](http://infolab.stanford.edu/~backrub/google.html)

~~~
cocktailpeanuts
good for you, go do it then!

------
sairamkunala
Companies like Algolia, which provide site-specific search engines, have been
doing really well, especially on speed and relevancy, areas where Google is
currently not concentrating.

[https://www.algolia.com/](https://www.algolia.com/)

~~~
sergiotapia
Algolia is a game-changer. They made it so incredibly simple to add search to
your website. I'm not talking about their widgets, I'm talking about their
server-side integrations and their javascript client-side lib.

It's like magic. See it in action here:
[https://stackshare.io/match](https://stackshare.io/match)

~~~
kapauldo
It's bizarre that algolia is a thing. Elasticsearch is good and free.

~~~
bsilvereagle
Yes, but Elasticsearch requires a dedicated server, you may have to use a
river, you may have to shard, etc. Some people (most people?) don't want to
think about shards & rivers and just want something to work. Algolia seems to
do that.

~~~
eridius
I know what a shard is, but what's a river? I've never heard anyone use that
term in a technical context before.

~~~
bsilvereagle
I used ES a long time ago and rivers have since been deprecated:
[https://www.elastic.co/blog/deprecating-rivers](https://www.elastic.co/blog/deprecating-rivers)

A river was just a mapping from a database to ES. For example, you could
search CouchDB with ES with the proper river set up.

------
cagenut
This is more a feature than a different search engine, but I so so so wish I
could de-prioritize blogspam: 300-1000-word text-heavy writeups of a couple
of facts where a few bullet points, an image, a graph, a map, or a data table
would be much, much better. Google has been SEO'd to death because it favors
the lowest common denominator: blocks of text.

~~~
throwayedidqo
I write blogspam. It's definitely a problem but I don't know what Google is
going to do about it.

Basically google can't trust backlinks anymore because people game them and
competitors try to destroy each other's sites by buying scummy links to their
stuff.

So they mainly attempt to measure quality in a vacuum. This is using their
machine learning stuff to look at the quality, confidence, and reading level
of the writing style.

They do the same quality checks for the site. Checking for EV certs, clean
markup, real email volume through Gmail, reputable DNS provider, physical
address in G maps. A lot of their hundreds of quality metrics don't measure
the site itself, but use Google's pervasive data trove from their other
services. Most scammers don't bother doing any of this right.

The problem becomes people like me. I set up sites with all measures of
quality for legitimate businesses, and have articles written by good writers
with knowledge of the subject. Sounds great, right?

The problem is that these articles are still done for money and quite biased
sometimes. Google is slowly running into a need for a strong AI because all
measures of quality can be emulated if enough money is on the line. It doesn't
matter if something seems truthful in every way except the fact that it isn't.

This is the same reason "fake news" is invading google and Facebook. Smart
spammers have upped their game to the point that it's impossible to know
what's real anymore.

Need a wikipedia article changed? Good reviews on Yelp? A nice piece on a
popular tech website? All of this can be openly bought with zero consequences.

~~~
tajen
> pervasive data trove from other services

I would believe that, although there have to be several alternate ways to
measure a site's popularity; Google Analytics is a huge part of a site's
ranking. Do users stay long on your website? Do they come back to the search
results after that? Do they click through to "Pricing", then back to
"Features"? If so, that must be the right answer.

Alternate example: I would bet that shops where Google Maps geolocates a lot
of customers have a higher-ranked website than similar businesses whose
physical venues are empty.

~~~
throwayedidqo
Even creepier... If Google knows all your physical locations they know how
many employees you have at work on a given day

------
rgovind
Yes. In today's search engines, I cannot supply a blacklist and say "filter
out these results." If I am looking for tutorials, I cannot say "no video
results." If I am looking for market research, I cannot filter news websites
out of the links. For personalization, I cannot give Google any suggestions
on what I absolutely do not want included, etc.
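The filtering itself is trivial once an engine exposes result metadata; a hypothetical sketch (the result tuples and media-type labels are invented):

```python
from urllib.parse import urlparse

def filter_results(results, blocked_domains=(), blocked_types=()):
    """Drop results whose host is on a user blacklist, or whose media
    type (e.g. 'video', 'news') the user has excluded."""
    kept = []
    for url, media_type in results:
        host = urlparse(url).netloc.lower()
        # Block a domain and all of its subdomains.
        if any(host == d or host.endswith("." + d) for d in blocked_domains):
            continue
        if media_type in blocked_types:
            continue
        kept.append((url, media_type))
    return kept

results = [
    ("https://example.com/tutorial", "page"),
    ("https://www.youtube.com/watch?v=x", "video"),
    ("https://news.site.example/story", "news"),
]
# "I'm looking for tutorials -- no video results, please."
assert filter_results(results, blocked_types={"video"}) == [
    ("https://example.com/tutorial", "page"),
    ("https://news.site.example/story", "news"),
]
```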

~~~
greglindahl
blekko had these features, and almost no one used them. The google guy who
teaches advanced Google searching says that almost no one uses Google's
advanced search, either. So if this is a viable niche, you'll have to figure
out how to find these users...

~~~
rocky1138
Who cares if a small subset of people use them? It doesn't make them
worthless.

~~~
greglindahl
Oh, I think it has value. I was just reporting that I tried it at blekko, and
think I completely failed. Maybe you'll be able to do better.

~~~
ianwalter
Very interesting! Is the main problem really finding the users or could it be
developing an interface that makes it easy enough for users of varying
technical abilities to occasionally use?

------
garysieling
I think there is definitely space for niche search engines - there are tons of
them already, if you include things like the DPLA, octopart, iconfinder.com,
Spotify or class-central.com.

Google is focused on getting you to a relevant result quickly, but having a
search engine that helps you discover new things is really useful. If you
focus on a niche, you can also make use of a lot of metadata Google doesn't
retain.

I'm exploring this on a small scale with
[https://www.findlectures.com](https://www.findlectures.com). Having the date
a video was made gives it a 'street view for history' feel, and lets me rank
historical content differently from conferences (where recency is more
important).

Building a graph of talks, conferences, speakers, books, and publishers could
provide the building blocks for a PageRank implementation, or for a different
type of book search. Alternately, I think it would be interesting if search
engines let you do LSA-style queries, like "Brian Goetz" - "Java" + "Python",
to help discover speakers.
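That kind of query is vector arithmetic over embeddings; a toy sketch with invented 3-dimensional "speaker vectors" (a real system would learn these from talk transcripts; the axes and numbers here are made up):

```python
import math

# Invented toy embeddings (axes: jvm-ness, python-ness, concurrency focus).
vectors = {
    "Brian Goetz":       (0.9, 0.1, 0.9),
    "Java":              (1.0, 0.0, 0.3),
    "Python":            (0.0, 1.0, 0.3),
    "Raymond Hettinger": (0.1, 0.9, 0.8),
}

def query(pos, neg, candidates):
    """Return the candidate closest (cosine similarity) to sum(pos) - sum(neg)."""
    target = [sum(vectors[p][i] for p in pos) - sum(vectors[n][i] for n in neg)
              for i in range(3)]
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a)) or 1.0
        nb = math.sqrt(sum(x * x for x in b)) or 1.0
        return dot / (na * nb)
    return max(candidates, key=lambda c: cos(target, vectors[c]))

# "Brian Goetz" - "Java" + "Python" should land near a Python speaker.
assert query(["Brian Goetz", "Python"], ["Java"],
             ["Raymond Hettinger", "Java"]) == "Raymond Hettinger"
```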

------
adamnemecek
I think so, definitely. Google is lacking in quite a few areas:

a.) storing more data about the sites (and doing something interesting with
said data)

b.) improving the UI/UX for power users. The best part is that I can imagine
quite a few people who would pay actual money to use a better search engine.
Note that the Bloomberg terminal is, among other things, a search engine. For
example, you could make the link graph explicit, so you would immediately see
what sites link to what sites.

E.g. symbol search really leaves something to be desired on Google. I also
wish I could use regular expressions. I get it, they are expensive, but even
a little expressiveness goes far.

c.) I would pay A LOT for a good search engine for code.

~~~
boto3
What kind of code search are you thinking of? In my experience, code search
could be useful when one works with a big and unfamiliar code base, but even
then good architecture documentation and a good IDE would help more. And when
one really needs string search, `git grep` is usually fast enough (for me on a
5GB code base).

~~~
adamnemecek
It's not just searching current code, but something like "auto-complete".
What if my IDE could recognize that I'm writing a bad implementation of
binary search and could suggest a better one from the internet? What if I had
an SQL-like language that I could use to query and transform code? E.g. find
all methods starting with "set" that take an Int as an argument, and do
something with them. I've been noticing that quite a bit of writing software
would be a lot simpler if I could do this.
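For Python at least, the standard `ast` module already allows a crude version of that "find all set* methods taking an Int" query; a minimal sketch (the helper name is invented):

```python
import ast

def find_setters_taking_int(source: str):
    """Return names of functions whose name starts with 'set' and which
    take exactly one non-self argument annotated as int."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name.startswith("set"):
            args = [a for a in node.args.args if a.arg != "self"]
            if (len(args) == 1
                    and isinstance(args[0].annotation, ast.Name)
                    and args[0].annotation.id == "int"):
                hits.append(node.name)
    return hits

code = """
class Counter:
    def set_count(self, n: int): ...
    def set_name(self, name: str): ...
    def get_count(self): ...
"""
assert find_setters_taking_int(code) == ["set_count"]
```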

------
dhimes
Here's what I want in a search engine:

Charge me $15-25 per year

Let _me_ decide what demographic information I wish to share. Make it easy
for me to control, and help me protect my information. Because you are
charging me money, you can afford it and I trust you.

Give me two search options: one, I'm only seeking information; two, I'm
looking to buy. Do this for me as an advertiser: help me qualify the clicks
I'm paying for.

Perhaps allow me to pay per 1000 impressions (CPM) instead of per click.

By the way, I would also subscribe to a facebook that did this.

~~~
boto3
Google is making $50 per user from ads, potentially an order of magnitude
more from a US user, so I am quite certain that your offer of $25 is a
lowball :)

~~~
ianwalter
Yea but Google also makes more money than they know what to do with. Surely
you could be sustainable at a lower price?

------
stupidcar
For a general search engine, no, there isn't.

The upfront capital investment, in terms of the data center capacity
necessary for a modern scraping and search infrastructure, is immense. And
since the ad-word business model does not scale linearly with market share –
i.e. the market leader collects a disproportionate share of the available
profit – you will be losing additional money for a long time.

Since the market leader is good enough that it isn't possible to disrupt the
market purely through result quality (as Google did), you will need to rely on
bigger and more effective marketing spend. Not only will you have to outspend
and outperform Google, but also Microsoft/Bing, who have tried to do the same
thing for years, with only limited success.

Even if you had the funding necessary to do all of this, you would be better
off either buying shares in an existing search engine company, or starting a
business in a different market, one with lower upfront costs and less
dominant incumbents.

~~~
dri_ft
> the ad-word business model does not scale linearly with market share – i.e.
> the market leader collects a disproportionate share of the available profit

Why is this so?

~~~
dredmorbius
Essentially: network effects, efficiencies of scale, and costs of managing
multiple small relationships (from the advertising buyer's PoV).

Google's real monopoly at this point is ad-side.

~~~
Animats
Along with Facebook, yes. About 65% of online ad spend is to Google and
Facebook.[1] The remainder is to a bunch of little guys, mostly selling to
bottom-feeder sites.

Except on mobile. Banner ads on mobile are growing.

[1] [http://fortune.com/2017/01/04/google-facebook-ad-industry/](http://fortune.com/2017/01/04/google-facebook-ad-industry/)

------
mark_l_watson
Although I also use Google search and Microsoft Bing, probably more than 80%
of my search is done with DuckDuckGo.

The fact that a lot of us use DuckDuckGo, and I hope they are profitable, is
evidence that there is room for other search engines.

I would like to find a good substitute for Facebook, but so many people I
know use it that I always need to check Facebook two or three times a week to
avoid missing out on stuff, since many friends and family don't use email
anymore.

Attending the Decentralized Web Conference last year got me excited about
using smaller and decentralized services. GNU Social is pretty good, but
requires work to find interesting people to follow.

~~~
cassowary
Instead of Facebook, just talk to your friends and family semi-often. If one
of them has a baby or goes to Europe a lot of them will know and someone will
mention it. On occasion you will hear about something three years after the
event but that's still okay. It worked perfectly well for thousands of years
and it still works today.

------
jasode
If by "search engine", you mean something similar to Google/Bing then probably
not.

However, if we expand the concept of "search" to something beyond text on
webpages and "engine" to something beyond a linear algebra pagerank problem
that weighs url links, there's room for many more competitors.

Let's say we want to search for "best restaurant":

Method #1 might be searching millions of web pages, Twitter posts, newspaper
archives, etc. where an n-gram such as "best restaurant" is mentioned. That's
what the Google/Bing engines already do.

Method #2 might rank restaurants by collecting crowd-sourced opinions. That's
what Yelp & TripAdvisor do. (Although Google also piggybacks on their data
and lists Yelp pages in its search results.)

Method #3 might be a company like Visa/Mastercard analyzing their billions of
transactions[1], and, based on the actual _spending amounts & frequency_ of a
billion cardholders, providing their own calculation of the "best
restaurant". (I know that Visa/MC already offer limited marketing data to
some entities, but they don't surface that data to everyday web surfers.)

The idea is that there's plenty of room for more imaginative scenarios like
#2 & #3. The common theme is that Google doesn't have the data (e.g.
credit-card transactions), and therefore the new "search engines" can give
fresh answers that Google's algorithms can't provide. To try to boil it down
to a simple question: _"What interesting answers can a new engine provide
that can't be extracted from the text of webpages?"_

Btw, I ran across some posts from a Microsoft employee (but not a Bing team
member) stating his opinions on building competing search engines.
[https://news.ycombinator.com/item?id=7011472](https://news.ycombinator.com/item?id=7011472)

[1] [http://marketrealist.com/2016/10/why-visas-processing-and-international-revenues-have-risen/](http://marketrealist.com/2016/10/why-visas-processing-and-international-revenues-have-risen/)

~~~
eridius
Method #3 would be for calculating popular restaurants, not best restaurants.
For example, I bet McDonald's would rate pretty highly with that approach, but
it's very nearly as far away as you can get from the idea of a "best
restaurant".

~~~
jasode
Popularity is but one input for "best" -- depending on one's definition of
best. There's enough metadata (price, location, cuisine, etc.) to correlate
with actual payment data to filter out fast food like McDonald's. It doesn't
have to be a totally naive statistical approach that gives "dumb" answers.

The idea is that what people _actually pay for_ is a different set of vector
inputs compared to what people submit reviews for (Yelp/TA) or what people
link to (blog with links to favorite city restaurants.) Google's search index
is over 100 petabytes but even that gigantic database is missing lots of data
that other entities can collect and convert into uncontested search results.

------
larrydag
I definitely think there is room for search improvement. I believe the next
area of search is contextual search
([https://en.wikipedia.org/wiki/Contextual_searching](https://en.wikipedia.org/wiki/Contextual_searching)).
If you can match what the user is looking for with actual website content,
then I think you might be onto something. The trick is finding that link
function. Traditionally, Google has relied on keywords and ranking by links.
There could be other ways to find that user/content relationship.

------
Skylled
I'd think there would have to be something fundamentally different. It would
have to be hardly recognizable as a "search engine."

There are too many clones on the market right now. Some have a good purpose,
like DuckDuckGo, which can be simplified to "Google but without privacy
invasion." Others, like Bing, could be just "Google but clunkier." (my
opinion)

If you've got an idea on your hands that can't be described as "Google but..."
then there's definitely room for another.

------
tarr11
Take a look at how DuckDuckGo built up their business around privacy first,
and leveraging Google when appropriate.

------
atemerev
Some ideas:

1) a good sitewide search engine. Google's offer is laughable, and Algolia is
too developer-centric (requires pushing the data through API). What I'd want
is a single input field where I can put my site's main page URL — and get a
working search in a few minutes.

2) subscriptions / monitoring. I want to monitor some event or topic, and I
want the updates to be delivered to e.g. my WhatsApp/Telegram/Slack/whatever,
with smart filtering, refining etc (in lieu of frantically Googling /
redditing / refreshing Twitter feed)

3) context-preserving interactive search, that can ask me questions/ refine
results.

4) A timeline search interface for news / events / company history etc. I
want to be able to put in the name of a person, company, or TV series, and
get a comprehensive timeline view of everything that happened.

I have a lot more ideas, and zero free time :(
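Idea 1) boils down to crawl-plus-inverted-index; a toy sketch with fetching stubbed out by an in-memory dict of page texts (a real service would crawl from the submitted root URL instead):

```python
import re
from collections import defaultdict

# Stand-in for crawled pages: URL path -> extracted text (invented content).
pages = {
    "/":        "Acme Widgets: home of the finest widgets",
    "/pricing": "Widget pricing starts at ten dollars",
    "/about":   "About the Acme team",
}

# Build the inverted index: word -> set of pages containing it.
index = defaultdict(set)
for url, text in pages.items():
    for word in re.findall(r"[a-z]+", text.lower()):
        index[word].add(url)

def search(query):
    """Return pages containing every query word (simple AND semantics)."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    result = index[words[0]].copy()
    for w in words[1:]:
        result &= index[w]
    return result

assert search("widget pricing") == {"/pricing"}
assert search("acme") == {"/", "/about"}
```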

~~~
LiamBoogar
1) Swiftype (or any of Algolia's integrations: WordPress, ZenDesk, Shopify,
Magento, ...). Here you talk about Algolia's developer focus, but the rest of
your arguments are about the consumer experience. All search engines are
built by developers/engineers, and Algolia delivers the end-user experience
on Twitch, Periscope, Medium, and even Hacker News (hn.algolia.com), which is
exactly what you're looking for. You could actually use Algolia to build all
the search engine experiences you have ideas for, and it takes less time
(which you don't have).

2) Mention ([http://mention.net](http://mention.net))

3) Jelly (didn't work. Maybe there's a reason?)

4) Google / Wikipedia.

Unless you can build something 10x better than what exists,

------
Razengan
I think search engine functionality will sooner or later need to be
incorporated into the core specifications for the internet, like DNS.

I mean, the modern idea of the internet is pretty much useless without a
search engine, and we've been spoiled by the power we get through Google — the
phrases on this very page get indexed within literally seconds; I just tried a
literal search for a sentence from a 3-minute-old comment here — but it's
really not a good idea for a single company to have so much authority.

This really isn't something to keep relying on a small handful of companies
for, especially once we have interplanetary internet. :)

------
yitchelle
Yes. Google is too generic and that is great for the internet.

I would look forward to search engines that are topic-specific. However, the
blocker is having the information available in the first place, so I doubt
this will ever happen.

------
tommynicholas
Absolutely - Giphy is a great example. There will be plenty of search engines
that grow to prominence around either a niche content type (gifs => Giphy) or
a niche feature (privacy => DuckDuckGo).

------
mrharrison
I feel like Amazon is my second search engine. So yes, there is room for
category-specific search engines. Reminds me of when people started making
specific apps from Craigslist sub-categories.

------
makecheck
It is possible to compete with Google by offering what they used to have:
simplicity and speed, and not screwing with results.

Google is beginning to show signs of accidental self-sabotage. Their AMP
approach was so aggravating for me on mobile that I literally switched search
engines to avoid it. And their insistence on scraping and summarizing things
and trying to prevent you from even visiting other sites is slowly ruining
even desktop searches. They are in danger of disruption.

------
alain94040
Absolutely. I'm fairly confident that 20 years from now, we'll laugh at the
notion that all Google could do was find pages.

I don't know what will replace it. Chat bots could be one. Or much better
understanding of context. Or providing answers based on knowledge that is
spread across many separate web pages. Or actually taking action (if you are
searching, it's to _do_ something, not to read a page).

But "finding a page" will sound really silly 20 years from now.

~~~
rocky1138
Finding pages is becoming more and more irrelevant as people create and host
their own websites less and less, like they used to when the web was young.
Web page creation has mostly been centralized onto things like Tumblr,
Twitter, and major news organizations.

------
kevando
Yes, and I predict Reddit will be the first challenger for finding web
content. Looking beyond, I don't see Google as the way I'll find amazing
AR/VR experiences.

------
jv22222
DuckDuckGo is doing pretty well, and it has a lot of room to grow (which
proves it can be done).

------
MichaelMoser123
There is the problem of relevance: you must order your search results by
relevance. Now, you can have one global model of "relevance" or several
models (I think the web is so big that one model is not good enough).

Relevance can be relative to language/origin; the number of links away from a
Wikipedia article; coolness (does it have a link from a known Twitter account
or news aggregator?); age group; "nerdiness" (a link from the HN front page
or Slashdot); news (was it referenced by a news source); etc.

I think a differentiator would be a non-intrusive and intuitive UI for
selecting an available relevance model (instead of trying to profile the user
based on their search history / browser history).

On the one hand, the user profile is of great value for advertising, but on
the other hand, the explicit choice of relevance model can be used to match
relevant ads.

~~~
matt4077
Google obviously already operates such profiles, at a rate of one profile per
user - with AI in the background.

Exposing options is rarely a good idea, as it only reaches a single-digit
percentage of users.

And they say profiles are oh-so-relevant, but as far as I can tell, Google's
main product (search ads) is still tied almost exclusively to keyword, region,
and language.

~~~
MichaelMoser123
> but as far as I can tell, Google's main product (search ads) is still tied
> almost exclusively to keyword, region, and language

Don't know: try to do the same Google search from different accounts. I think
the results will be quite different...

------
EmilStenstrom
Google has the best search results because it has the most people using its
service. Its models learn every time you click on a result. There's no way to
take that on directly. What you need is to find an angle that Google can't
easily follow, as DuckDuckGo did with privacy.

Are there areas where Google can't go?

~~~
greglindahl
Using human curation, at least partly, is an area that Google doesn't want to
go into... that was blekko's sustained competitive advantage. It wasn't big
enough for us, but maybe someone else could make a go of it.

------
dgudkov
I observe occasional referrals to our website from Qwant [1], a French search
engine with a focus on privacy. I don't know, though, if they're
cash-positive or still burning investors' money.

[1] [https://www.qwant.com/](https://www.qwant.com/)

~~~
sanatgersappa
Qwant is good. Fast and relevant. Don't know about their financials.

------
pygix
Build a local search engine for a very focused niche.

------
bobosha
I think there is room for a horizontal search engine, by making it mobile-
first. Even with Siri style conversation agents, mobile search still sucks
real bad. If you design bottom-up for a mobile form factor you could have a
winner.

------
tyingq
Google's market share is ~64%, while Bing's is around ~22%.

It's probably not great for consumers that the #2 offering is only about
1/3rd as popular as the leader.

Which says two things...

- Yes, there's room for a better number 2.

- But, if the best Microsoft can do is 1/3rd as popular, how well would a new
entrant fare?

It seems like you would need some new feature that makes you significantly
better than Google to stand a chance.

Also, the barrier to entry here is enormous. The spend required to have an
index as large, as fresh, and with results as relevant as Google's is big.

------
kapauldo
If you keep the search paradigm of entering text and getting back 10 links,
it's probably not likely to succeed. But come up with a new paradigm and you
can definitely shake things up.

~~~
bobosha
This ^. I would also posit that coming up with a completely new paradigm for a
mobile form-factor (cf. the mini-me approach of current mobile search), has
lots of potential.

------
Neliquat
A small additional "yes" in the pile. I am currently searching for a new
search engine, as Google mangles every query. I am white-hot with anger at
Google after every third search.

------
gwbas1c
Yes, you just have to figure out how to get a large enough percentage of the
market to switch to you so you can make a profit.

That's a much harder problem to solve today than when Google trounced
AltaVista in 2000. Now search engines are tightly integrated into browsers.

One hint: I switched to Google when they released a browser toolbar. I even
remember deciding to switch to whoever released a browser toolbar. What's
today's equivalent of a browser toolbar?

------
sixdimensional
How about some kind of machine learning algorithm that is regularly trained
on user feedback and ratings of search results? The system would have one
mode that starts out feeding you approximate matches for your search
criteria, ordered pseudo-randomly, and a secondary mode that orders results
based solely on user feedback. Yeah, it probably wouldn't work, but it would
be fun to see what it produced.
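The two modes could be sketched like this (the document names and feedback counts are invented for illustration):

```python
import random

# Net thumbs-up per result, accumulated from user ratings (invented data).
feedback = {"doc-a": 3, "doc-b": 11, "doc-c": 7}

def explore(matches, seed=0):
    """Mode 1: pseudo-random order, so every approximate match gets a
    chance to be seen and rated."""
    rng = random.Random(seed)  # seeded for reproducibility
    shuffled = list(matches)
    rng.shuffle(shuffled)
    return shuffled

def by_feedback(matches):
    """Mode 2: order solely by the user feedback collected so far."""
    return sorted(matches, key=lambda m: feedback.get(m, 0), reverse=True)

assert by_feedback(["doc-a", "doc-b", "doc-c"]) == ["doc-b", "doc-c", "doc-a"]
# Exploration only reorders; it never drops or invents results.
assert sorted(explore(["doc-a", "doc-b", "doc-c"])) == ["doc-a", "doc-b", "doc-c"]
```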

------
siddharthgdas
Very specific and concentrated search and discovery sites, e.g. Product Hunt,
Zomato, Yelp, Quora (if they execute really well), will outdo Google, I
think, leaving only discovery of these sites (and their content; almost every
Google search I make ends up at SO or Quora) and contextual information as
problems for Google to address.

------
mrfusion
I think a wiki model would be awesome. Of course, they'd need a smart way to
weed out SEO and other results-gaming. But somehow Wikipedia does it.

~~~
sebst
Remember Wikia Search?

------
ParameterOne
You can always test the theory and see who likes your search engine with a
browser extension that overrides Google as the search provider:
[https://developer.chrome.com/extensions/settings_override#se...](https://developer.chrome.com/extensions/settings_override#search_provider)
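
A minimal manifest sketch of that approach (the extension name, keyword, and
URLs are placeholders; the fields follow the search_provider schema in the
linked docs):

```json
{
  "name": "My Search Tester",
  "version": "1.0",
  "manifest_version": 2,
  "description": "Replace the default search provider for testing.",
  "chrome_settings_overrides": {
    "search_provider": {
      "name": "MySearch",
      "keyword": "mysearch",
      "search_url": "https://search.example.com/?q={searchTerms}",
      "favicon_url": "https://search.example.com/favicon.ico",
      "encoding": "UTF-8",
      "is_default": true
    }
  }
}
```

Since the user has to accept the changed default, uptake would be a reasonable
proxy for who actually likes the engine.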

------
credit_guy
I think there is, and it's glaringly obvious. "If you don't pay for a service,
you are the merchandise." With Google that is obviously true. If you can build
an engine and demonstrate that you are not snooping on the user, you will be
able to charge a user fee, and a lot of people will gladly pay for it.

------
skdotdan
I dream about a real time search engine. Also, there's room for improvement in
personalizing and curating content.

~~~
sebst
I think so, too. A week ago, the biggest effort at curating web content,
dmoz, closed its doors. What was a good idea in the 90s (a title and a short
description) is obviously not sufficient for the modern web. And it's far from
real time.

I'm trying to build a curation platform with
[https://curlz.org](https://curlz.org) - it's just at the planning stage at
the moment, but it has raised interest among some dmoz admins and editors.

------
h2hn
I don't see any problem here.

I've been using Startpage for the last 5 years and I'm not looking back. I
wouldn't have any problem using any other search engine; nowadays any search
engine works. The three or four times that I've googled in those years, it
felt pretty weird.

tl;dr: There's plenty of room, just not enough gray matter. :)

~~~
jplayer01
But...Startpage uses other search engines, primarily Google.

~~~
h2hn
Truuuue. :)

But my point is my searches usually end up in:

wikipedia, blogpost, *overflow, twitter, reddit, papers ...

So a search engine covering only those sites could indeed be usable, and I
would use it if I had to. :)

------
amitutk
There is room for a vertical search engine. E.g., searching for research
papers with Google is not very impressive.

------
Entangled
Absolutely, but search alone won't cut it no matter how smart it is. For
profitability you need a whole ecosystem where ads are a major revenue source.

DuckDuckGo is a fine search engine and I believe they'd benefit a lot from
services like ads, email, blogs, docs, shops, apps, social, etc.

------
siquick
If anyone is looking for an alternative search engine then please go for
[https://www.ecosia.org/](https://www.ecosia.org/) who plant trees with their
ad revenue.

------
cft
No. It also doesn't help that the whole "independent websites" scene is
disappearing: there's less web surfing on the phones, and also Facebook and
other platforms swallowed a lot of small sites.

------
dopeboy
Build a search interface on top of reddit. Make it way better than reddit's
search engine. Produce an index on each comment that takes credibility into
account on top of number of upvotes.
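
A toy sketch of such a credibility-weighted comment score (all weights and
signals here are invented for illustration; real inputs would come from
Reddit's API):

```python
import math

def comment_score(upvotes, author_karma, account_age_days):
    """Toy score: upvotes boosted by a crude credibility factor
    derived from the author's karma and account age.
    (All weights are made up for illustration.)"""
    credibility = math.log1p(author_karma) * min(account_age_days / 365, 3)
    return upvotes * (1 + 0.1 * credibility)

comments = [
    {"text": "low-karma hot take", "ups": 50, "karma": 10, "age": 30},
    {"text": "veteran answer", "ups": 40, "karma": 50_000, "age": 2000},
]
ranked = sorted(
    comments,
    key=lambda c: comment_score(c["ups"], c["karma"], c["age"]),
    reverse=True,
)
print([c["text"] for c in ranked])
# ['veteran answer', 'low-karma hot take']
```

Despite fewer raw upvotes, the established author's comment ranks first.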

~~~
aryamaan
What's so important about what Reddit provides?

And doesn't Google return better search results for Reddit already?

------
elvirs
I think most of the social media content created recently is stored inside
mobile apps and, for the most part, is not available to be indexed by search
engines. Apple and Google are in a unique position with access to all of that
data. Just as Facebook pushes its users to make more and more content public,
Apple and Google could put in similar efforts, and if the right balance is
found, there could be a new search engine for all the content created and
stored inside mobile apps.

------
petra
If you can build something significantly better, or at least differentiated
enough, and it won't be copied, I think you could find your niche or even
large-scale success.

------
killin_dan
I already use Bing full time. Inline searches are getting better, and all the
basic Google functionality is there. Fuck Google.

------
dsschnau
Hell yeah, I exclusively use DDG and it kicks ass

------
rocky1138
Build me a search engine that only returns results which use https.

------
sgt101
Nope, but search is 20thC stuff... when do we live again?

~~~
greglindahl
80% of Google's ad revenue is search advertising, and ads overall are almost
all of Google's revenue.

This may be the 21st century, but...

~~~
sgt101
I get your point - but to me it's a bit like "is there room for another voice
operator" in 1983... we could artificially make them, then watch as they
consolidate and then note that the real action is in something else which it
would have been good to think about first.

------
timthelion
I believe that there is a place in the world for a new search engine. People
say that Google is good enough, but I find it hard to use Google to find the
things that I search for. Here is how I would improve upon Google:

I often find that I search for "vegan pancake recipe" and end up at a page
with lots of images and it is very hard for me to find the ingredients list.
Google does a poor job here. They should give preference to simpler sites
where it is easier for me to find the information I'm looking for. Instead,
they seem to actually give preference to complex sites. If their job is to
help me search for the information, then they shouldn't give links to
haystacks. They have tried to improve upon this with their answer feature
where they quote websites. This is, IMO, the wrong way to do things.

Instead, the search engine should be a desktop application, which can be more
pervasive than a website. It needs to run natively, not cloud-based, for
privacy, for performance, and for the ability to integrate well with the
system. When I search for "vegan pancake recipes", if the search engine is
going to give me a result which contains 3-5 pages of text and images before
the ingredients list, it should automatically scroll the web browser window
down to the actual recipe.

This desktop application should also build a context profile based on what I
am doing on my computer. This context profile shouldn't be uploaded to the
internet, but it is still useful. For example, I should be able to select a
string in my terminal and press the search icon in the system tray. This
should bring up a Stack Exchange question containing the exact text of the
string I selected.

I should also be able to select a set of websites which I want to use as my
search "domain". I might give my search domain as "the documentation to
Python3, the Docker API reference, and stack exchange". This would make it so
that those "feeling lucky" links would work much better.
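
A minimal sketch of such a search-domain restriction, built on the standard
site: operator that the major engines already support (the domain list is just
an example):

```python
def domain_query(terms, domains):
    """Build a query restricted to a user-chosen 'search domain'
    using the site: operator (supported by Google, Bing, and DDG)."""
    site_filter = " OR ".join(f"site:{d}" for d in domains)
    return f"{terms} ({site_filter})"

search_domain = ["docs.python.org", "docs.docker.com", "stackexchange.com"]
print(domain_query("bind mount permissions", search_domain))
# bind mount permissions (site:docs.python.org OR site:docs.docker.com OR site:stackexchange.com)
```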

The search engine should also present image results which are NOT WATERMARKED
before ones which are!

I should be able to write a markup for things I need to search for, and then
enter a "search engine research wizard". The markup would look like so:

"We had a great time at [Park on that hill in prague???] park. It was so
sunny! The temperature was [Prague temperature on 27th of march ???] which is
[Average Prague temperatures in March ???] for this time of year."

The search engine, when shown this text, would allow you to right-click on the
bracketed areas, search for the text in them, and then fill in the blanks by
selecting parts of Wikipedia articles.

The search engine should use accessibility APIs to record the text of the
windows that I have open. I should then be able to use the search engine as a
kind of memory store which I can search. If I want to know what that awesome
new tiling window manager written in Rust was called, I should be able to
search the full text of my browsing history and open up the previous HN page
where the tiling window manager was presented.
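
A minimal sketch of that local memory store, using SQLite's FTS5 full-text
index (assuming your SQLite build includes FTS5; the captured titles and text
here are made up):

```python
import sqlite3

# Index captured window/page text locally, then query it later.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE history USING fts5(title, body)")
db.executemany("INSERT INTO history VALUES (?, ?)", [
    ("HN: new tiling window manager", "a tiling window manager written in Rust"),
    ("Some recipe page", "vegan pancake ingredients and steps"),
])

# Full-text query over everything ever captured.
rows = db.execute(
    "SELECT title FROM history WHERE history MATCH ?",
    ("tiling AND rust",),
).fetchall()
print(rows)  # [('HN: new tiling window manager',)]
```

Nothing leaves the machine: the index lives in a local database file (here,
in-memory for the demo), which addresses the privacy point above.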

------
throwayedidqo
I have a theory that Google was terrified of this very thing happening a few
years back. Kinda out there but hear me out.

Based on what I've heard from leaks and various news articles, the leadership
of the search division was once adamantly against using neural networks for
search.

Things took a sudden turn maybe 5 years ago. Something caused Google to do a
complete 180 and vastly increase their investment in AI research. They saw
something that scared them.

I think it was their unexpected success using AI for machine translation.
There was much PR about it at the time, and I think it really got the gears
turning at Google HQ. You see, the same language processing needed for machine
translation has obvious parallels to search.

The more curious employees began applying the word vectors used for
translation to search. After all, most of it had been trained on index data
from multilingual websites anyway. They found that, horrifically, rather
simple neural networks sometimes outperformed the search algorithms Google had
spent billions on.
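
A toy illustration of the mechanics being described: rank documents by vector
similarity instead of hand-tuned signals (one-hot stand-ins for trained word
vectors; a real system would load word2vec/GloVe-style embeddings):

```python
import numpy as np

words = ["vegan", "pancake", "recipe", "car", "engine", "repair"]
# One-hot stand-ins for real word vectors, just to show the ranking mechanics.
vocab = {w: np.eye(len(words))[i] for i, w in enumerate(words)}

def embed(text):
    """Average the word vectors and normalize to unit length."""
    v = sum(vocab[w] for w in text.split() if w in vocab)
    return v / np.linalg.norm(v)

def rank(query, docs):
    """Order documents by cosine similarity to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: float(embed(d) @ q), reverse=True)

print(rank("pancake recipe", ["car engine repair", "vegan pancake recipe"]))
# ['vegan pancake recipe', 'car engine repair']
```

With trained embeddings, "pancake" would also sit near "crepe" or "waffle",
which is the part hand-tuned keyword matching struggles with.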

When this reached upper management it set off a quiet panic. Google, once seen
as invincible, could have been beaten by a start-up using effective ML
techniques. Compute-power calculations showed that this could have been done
since a few years after the debut of CUDA, a window of vulnerability of maybe
7 years.

The timeframe of around 2005-2010 coincided with Google spinning off a bunch
of moonshot projects and doubling down on their core businesses. Coincidence?
Maybe, but I don't think so. I wish a Xoogler or two would come out of the
woodwork and tell me if I'm crazy or not.

Anyways, Google usually has a 5-7 year lag time when they release details of
their tech to the public. This dates Tensorflow and their heavy AI work to
around 2010. The window where somebody could have beaten them easily with ML
was probably 2005-2010.

------
PaulHoule
Google beat the competition not so much because it has a better search engine,
but because it has a much better ad platform and brings in much more revenue
per search than Yahoo, Bing, etc.

I would address the money issue first before thinking about how to make a
better search for some market.

~~~
webwright
This isn't true.

Google existed and grew (rapidly) for quite a while before they launched
AdWords, which they introduced in 2000. Their growth in '99 was pretty
meteoric (prompting a $25M investment from KP and Sequoia) _before_ it rolled
out PPC monetization. The AdWords model wasn't new at all-- Goto.com was the
first search engine to bet on that model.

Google won because it was a massively better search engine... Not just 10%
better-- it was "holy crap" better on a mess of fronts (notably: serving up
what you were looking for).

