Ask HN: Which websites actually have a useful search that Google can't beat?
42 points by photon_off on Sept 25, 2010 | 63 comments
Everybody uses Google, Bing, DDG, or some other search engine. But are the search giants as ubiquitous as they seem? I think there's a great value to individual sites' search features, mainly because they go beyond the grasp of GoogleBot.

There was a post recently about Google's main competition being Bing. I think that's entirely untrue. I think their main competition is the long-tail of search that they simply don't have the dataset to compete against.

Here are sites that I can recall using the search feature, and it actually being useful:

  - Wikipedia (half the time I use Google, though)
  - SearchYC 
  - Urban Dictionary 
  - BTJunkie 
  - StackOverflow
  - Delicious
  - Twitter
  - YouTube
  - eBay

What sites do you find have a useful search feature?

Bing's travel search is the best in the Biz as far as I'm concerned.

Wolfram Alpha is phenomenal for certain types of queries. I find their results pages for companies to be much cleaner and more intuitive than either Google's or Bing's.

I think there are huge openings for competing with Google and Bing when focused on a specific niche. Google's revenue per search query in the US is $0.12.[1] And I know I'm getting frustrated with their search results, and I find myself using multiple search engines for different types of queries.

There's tons of room for competition in this market, and I don't know why more startups aren't taking advantage of it.

<shameless plug>

This is the reason that we're turning http://Newsley.com into a search engine for economic and financial news. The news sections of Yahoo Finance and Google Finance suck IMHO. We're trying to make financial news search suck less.

We're focusing on building our index and results pages like crazy right now. We're currently indexing the Economist, NYT Business and BBC Business[2]. Bloomberg News should be online this week.

After getting Bloomberg online, we're going to focus on getting the alpha version of our search released.

We're making our results pages available as soon as possible, however, so we can start building a bit of organic search traffic to our site even before search is released. So far, it's worked great. Since we started releasing those pages 6 weeks ago, our traffic has tripled.

In the meantime, each one of the keywords or tags below each article leads to the results page for that specific term. Feel free to click around.

</shameless plug>


[1] http://dondodge.typepad.com/the_next_big_thing/2007/05/why_1...

[2] http://newsley.com/crawl_stats

Best of luck with Newsley.

In terms of competing with Google, Bing, etc, I'm on it. I'm incredibly excited about launching my latest service, which hopes to gain a decent market share of start page users. Basically, it's going to be a search portal. Not another one of those crappy multi-tabbed "search 10 engines at once" pages, but rather something much more... substantial. That's why I started this thread, to see which sites I should include in the index. I'm glad somebody else sees the same potential in the search market as I do.

What are the particular queries where you've been frustrated?

I'm finding Google less and less useful for queries about specific programming problems.

I'm finding myself going to DDG and Stack Overflow for answers to those problems first, just because I don't have to sift through results from crappy forums or Experts Exchange to get to answers for my specific query.

I do prefer Google's results to Bing's for those topics, because I can query Google Groups, which also tends to have a higher signal-to-noise ratio.

As I said, Bing has Travel search down pretty cold as far as I'm concerned, and I travel quite a bit. That may well change since you guys bought ITA, which provides Bing's data. Brilliant move, btw. I do use http://matrix.itasoftware.com a ton after a quick overview search on Bing because of the domain specific language that itasoftware lets you use on your searches.

Amazon does much better for shopping and product search. My first inclination these days is to go to Amazon and look for a product. If the price seems reasonable, I have Amazon prime and the product gets shipped to me for free in 2 days, or I can spend $4 and get it shipped next day.

I found myself living over at Edmunds.com when I was searching for a used car to buy two months ago.

I think that the results from Wolfram Alpha on company financials are much better laid out than Google Finance or Yahoo Finance. I'm much more likely to use Wolfram than GOOG or YHOO to get an overview of a company's financials.

I think there's a lot that can be added when it comes to financial news search, which is why I'm building http://Newsley.com

Looking for recipes on Google pretty much sucks. I'd much rather go to the Food Network.

The reason that I think you guys are hurting at times in search results is that your mission is to categorize "all the world's information". So, your approach is to pull in as much information as possible, and then sift through and sort through what's important.

If a site takes a semi-supervised approach like DDG or provides search over a specific silo of curated content like Stack Overflow, then I think you guys are going to be hard pressed to compete.

And on a certain level, it doesn't make sense for you guys to try and compete with sites like Edmunds.com or Stack Overflow. You guys are looking for niches that can provide the next $0.5 Billion to $4 Billion in revenue. Small verticals like programmer searches, recipe search, financial news search etc... aren't really going to be worth a huge investment for you guys to dominate.


Speaking for myself, I'm not really all that concerned if Amazon or Kayak is better at their specific vertical than Google. That's why they exist, they can focus much more tightly on a good user experience for that particular task than a generalist search engine can, and that's basically the economy working correctly.

I do care when I can't do a particular task because Google's search results suck. That seems to happen more and more with programming-related queries, and I dunno if it's because I'm searching for harder, more narrow topics or if it's because Google's results are worse. I recently was looking for info on Haskell's FFI, so I searched for [CUInt]. The top result was "an Irish slang term for female genitalia". Not what I had in mind, Google.

Again, the most useful info we can have is the exact queries that went wrong. It's like any debugging problem: if you can't reproduce the bug, it's maddeningly difficult to fix it.

umm..what's DDG? (yes, I googled it)

I think that Google's major shortcoming is that searches for specific subjects rarely lead to a canonical source of information on the subject. Instead the top results are full of forums, Usenet scrapes, and content-sparse SEO link farms.

What counts as "canonical"?

For some queries it's easy, like if I search for [amazon harry potter], I probably want either the Harry Potter Store or the Harry Potter books on Amazon.com, which happen to be the top two results. If I search for [llvm doxygen], I probably want the index page for LLVM's API docs, and that's what I get (even if it does have no snippet). If I search for [height of empire state building], I even get a little OneBox with the answer right on the results page, along with the sources it comes from.

But what if I'm searching for an obscure error message, and the answer is in a forum posting that was mirrored across a dozen mailing lists? Which is canonical? The biggest? The one on the site for the project I'm searching for? The one with the fewest ads?

Or what if I search for [can't read my poker face]? The #1 result is for metrolyrics.com, which sucks - it's a spammy lyrics aggregator with a popover ad. But at least it has the lyrics. Lady Gaga's official site doesn't.

It's those latter cases that interest me, but it's not exactly clear what the desired behavior is.

As I recall I run into this when searching for Linux kernel internals or library documentation. I find lots of error messages and forum posts, but no documentation targeted at programmers. For common library calls a manual page will often be in the top results, which is helpful, but it seems that documentation from the official project site (if it exists), the git commit that introduced a function, or source code containing the function would be more useful.

Next time I run into a search that would have a more appropriate best answer I'll try to post it here.

LinkedIn. Try asking Google for a list of all the Ruby developers who went to Princeton and live in New York City.

I did, and it led me straight to your post.

stackoverflow is useful? they have one of the worst search engines I've ever seen. If you are looking for anything more specific than "Javascript" you are going to have a hard time finding good results.

if I need to find something on stackoverflow, I just go to google and do a site:stackoverflow.com

Or, better yet, create a keyword search for it:


I personally hate typing out the whole "site:stackoverflow.com" before my query. But you're right, it is better.

If you do it often just make an alias for it in the browser.
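
For anyone who hasn't set one up, here is a rough sketch of what such an alias does under the hood. The alias table, the "so" keyword, and the expand_alias helper are made up for illustration; the only borrowed detail is the %s placeholder, which is what Firefox keyword bookmarks use for the typed query.

```python
from urllib.parse import quote_plus

# Hypothetical alias table: keyword -> URL template, with %s standing in
# for the typed query (the placeholder Firefox keyword bookmarks use).
ALIASES = {
    "so": "https://www.google.com/search?q=site:stackoverflow.com+%s",
}


def expand_alias(typed: str) -> str:
    """Expand 'keyword rest of query' into a full search URL."""
    keyword, _, query = typed.partition(" ")
    template = ALIASES[keyword]  # raises KeyError for an unknown keyword
    # URL-encode the query so spaces and punctuation survive the trip.
    return template.replace("%s", quote_plus(query))


print(expand_alias("so python list comprehension"))
```

Typing "so python list comprehension" in the address bar would then be equivalent to running the query with site:stackoverflow.com already prepended.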

True. But that won't solve the problem for everybody. I will solve the problem for everybody.

Standard search engines can only help you find results to a search query you can put into keywords. Often I want to ask the question "Am I missing something important about X?" and that cannot be translated into an effective search query.

We've built resourcey.com trying to tackle this problem. That is, http://resourcey.com/site_details/2/news.ycombinator.com/ is the answer to "Am I missing something important about Hacker News?".

This is a different form of search queries we couldn't find a way to get results for via standard search engines. If you think this functionality can be somehow produced by a standard search query, do tell.

This is great.

Right now, there are not too many metadata websites out there. Your site, AboutUs.org, Alexa and ilk, moreofit.com and ilk, and comment aggregation like BackType and UberVU, are all that come to mind when I think: "I have a URL, what can you tell me about it?"

Newegg. The Power Search, in particular. I find the ability to specify certain features for a product extremely useful. The guided search is also nice to quickly narrow down on more common criteria.

For example, here's the video card power search: http://www.newegg.com/Product/PowerSearch.aspx?N=100007709&#...

It allows you to specify manufacturer, port types, memory/memory type, chipset, etc. When I'm just starting to look for a new part or device being able to narrow down the list that way is very helpful.

This is interesting though, Newegg has to change the standard search box to something more guided to be useful. I think most sites have too little content for a visitor to find anything useful with a search box. In these cases a directory or tag system is a much better solution. Some exceptions to this are web applications where users are looking up content they created on their own.

Oh yea, I use the local equivalent (prisjakt.nu) all the time before buying just about anything. Yet for some reason I've never really thought of it as a search engine.

Often at database-like sites, where I can get a list of something in a selectable category that matches a particular term. For example,

IMDB can give me a list of all TV series / films / etc. with "Hobbit" in the title: http://www.imdb.com/find?s=tt&q=hobbit

Musicbrainz lets me search by track name, album name, artist name, etc.: http://musicbrainz.org/search/textsearch.html?type=track&...

MusicBrainz is nice. Thanks for sharing.

And is community built, so please contribute your CD listings.

I don't have any experience with their audio file (MP3, AAC, etc) fingerprinting, but I can tell you it's a great experience to pop in a freshly purchased CD and have MusicBrainz find it. I know that experience comes from some other kind soul having input the CD, so I try very hard to make sure I do the same.

>I don't have any experience with their audio file (MP3, AAC, etc) fingerprinting

I do, via the desktop program [Picard](http://musicbrainz.org/doc/PicardDownload). It really works well for properly tagging mp3 files (and renaming files to fit the tags) -- once they know the tracks. That often isn't the case with obscure, new releases. But you can't win them all. It's a program that's really worth using.

Maybe it would be possible to identify tracks not known to MusicBrainz by uploading them to Youtube and seeing who sends the automated takedown notice.

TinEye (http://www.tineye.com): a 'reverse image search'. You upload or link to an image and it finds you other copies of that image on the web. This is something you absolutely can't do via google, and is surprisingly useful, for example to find uncropped or pre-photoshopped copies of images.

Whaaat? Wikipedia's search sucks (at least it sucked last year, when I last used it). Now I use a Google search with the site specified to get Wikipedia pages.

To answer the question, I'd have to plug my own startup, http://historio.us. It's actually made bookmarking viable, for me.

I use Google for finding most pages with information on Wikipedia, and I am a Wikipedia editor who has read three whole books about how to use Wikipedia. Wikipedia's search usability is a disaster compared to using Google to search for Wikipedia pages.

amazon.com (their focus on products that they sell makes their suggestions almost perfect).

I think Yelp is a good example. When you're looking for a local business, you have a category/search term in mind, as well as a location. The single input field in Google search is awkward when trying to fulfill this need.

More structured queries than those that the main search engines offer. For example: Kayak, Travelocity, etc. Bing bought out Farecast for this reason. These kinds of queries can't be done easily if the only input a user can provide is in one text box. That isn't to say that web search engines couldn't tackle these areas, but they wouldn't have as much of a head start in the technology.

"flight to LA", "hotel room in New York next weekend", "metallica san jose tickets" etc. will return structured results, in both Google and Bing, soon I feel.

Google bought ITA and Bing bought Farecast for the purpose of getting direct access to this data. Since they can't web crawl this type of data, I would expect more acquisitions and/or licensing deals - such as the deal with Twitter.

Google has a decent foundation already with queries such as 'population of london' '<movie name>' (which shows local theatre times) and 'weather 94000' etc. Full list here:


Those queries don't even begin to give enough information.

Flight to LA from where, when, how long, economy/first, what times of the day, etc.

Hotel room in New York next weekend which days, smoking or not, upscale or not, which part of new york, which new york?

Metallica san jose tickets are closer and could generally get you the venue's page that will then ask which showing, which seats, how many tickets together, etc.

Google would use all the information it has to make assumptions about defaults to display. When you click-through to the site to make the booking or to drill in further you can specify the details.

eg. 'flight to LA' can show :

  SFO -> LAX  Tod  Tom  Wed  Thu  Fri
  Orbitz      $187 $210 $299 $330 $450
  Travelocity $200 $350  -   $89    -
etc. etc.

Very good point about the single textbox limitation.

I've been working a lot in this space, and I think there needs to be a protocol implemented beyond OpenSearch, which is essentially limited to 1 text box, with some nice additions (i.e. you can specify a query suggestion URL as well).

There should be some sort of superset to OpenSearch. I'm suggesting an AbstractForm protocol which would allow websites to export their basic query functionality into an XML document. Relevant providers (such as Ubiquity, and the thing I'm working on that will be launched soon), would be able to use this protocol to integrate the provider's AbstractForm in a convenient interface.

With Kayak, for example, AbstractForm might specify a "from" field of type "city", a "to" field of type "city", and some other optional fields (date, round trip, etc). You could then have a service which implements this however it chooses... perhaps a program with loads of AbstractForm interfaces loaded, and when you start typing "kayak" it knows to show you "To" and "From" fields, etc. I really believe this type of thing would be a major boon to the internet.
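
To make the idea a bit more concrete, here is a minimal sketch of what consuming such a document might look like. AbstractForm is only a proposal in this thread, so everything below is an assumption: the element and attribute names, the example.com action URL, and the load_form / missing_required helpers are invented for illustration and are not part of OpenSearch or any existing spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical AbstractForm document. No such standard exists; the element
# names, attributes, and action URL are placeholders for illustration.
KAYAK_FORM = """
<AbstractForm name="kayak" action="http://example.com/flights">
  <Field name="from" type="city" required="true"/>
  <Field name="to" type="city" required="true"/>
  <Field name="date" type="date" required="false"/>
  <Field name="round_trip" type="boolean" required="false"/>
</AbstractForm>
"""


def load_form(xml_text):
    """Parse an AbstractForm document into a plain dict."""
    root = ET.fromstring(xml_text.strip())
    fields = [
        {
            "name": field.get("name"),
            "type": field.get("type"),
            "required": field.get("required") == "true",
        }
        for field in root.findall("Field")
    ]
    return {"name": root.get("name"), "action": root.get("action"), "fields": fields}


def missing_required(form, values):
    """List required fields the user has not filled in yet."""
    return [f["name"] for f in form["fields"] if f["required"] and f["name"] not in values]


form = load_form(KAYAK_FORM)
print([f["name"] for f in form["fields"]])
print(missing_required(form, {"from": "SFO"}))
```

A client that has this form loaded could show "From" and "To" inputs as soon as the user types "kayak", and use something like missing_required to decide what to prompt for next.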

Newzbin. Bespoke C in-memory search engine with full substring matches and loads of metadata to play with, indexing data Google has no interest in.

I may be biased, since I wrote half of it. Funny seeing it turn into some kind of zombie :/

For audio software -- http://www.kvraudio.com/get.php

Google's search engine is good, but not necessarily the best. For small datasets, I am using Apache's solr. Google's custom search doesn't produce as great results. solr is also highly scalable. read about it here: http://lucene.apache.org/solr/

just wanted to point out that this is not a website but a tool to implement for your own website...

Mainly White Pages, when I need to find some phone numbers. For other stuff, I can only think of very specialized info like publications, where Google is not a good enough choice.

StackOverflow is a good example for this; I can usually reach what I was searching for in it with a good Google query.

Thanks for sharing. Do you have URLs for the aforementioned White Pages and specialized publications?

Is this a joke? ;-)

Well, in Italy we have "pagine bianche": http://www.paginebianche.it/index.html

Which is simply the translation of white pages: http://www.whitepages.com/

Regarding specialized pubs, it's been a long time since I used one, but this list seems fairly complete: http://en.wikipedia.org/wiki/List_of_academic_databases_and_...

The startup I work for will let you do queries using units of measure and price filters in the query string. For example:

http://www.shopwiki.com/LCD+TV+%3E43%22+%3C%241000?sb=1 (LCD TV >43" <$1000)
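
As a rough illustration of the idea (not ShopWiki's actual parser, which isn't described here), the path segment of that URL can be decoded and split into keywords and comparison filters. The FILTER pattern, the parse_query helper, and the "inch" / "USD" unit labels are assumptions made for this sketch.

```python
import re
from urllib.parse import unquote_plus

# Matches a comparison filter such as >43" or <$1000:
# operator, optional dollar sign, number, optional inch mark.
FILTER = re.compile(r'([<>])(\$?)(\d+(?:\.\d+)?)(")?')


def parse_query(path_segment):
    """Split a URL-encoded query into plain keywords and (op, value, unit) filters."""
    text = unquote_plus(path_segment)  # 'LCD TV >43" <$1000' for the URL above
    filters = []
    for op, dollar, number, inch in FILTER.findall(text):
        if dollar:
            unit = "USD"
        elif inch:
            unit = "inch"
        else:
            unit = None
        filters.append((op, float(number), unit))
    # Whatever is left after removing the filters is treated as keywords.
    keywords = FILTER.sub("", text).split()
    return keywords, filters


print(parse_query("LCD+TV+%3E43%22+%3C%241000"))
```

The filters would then be applied to structured product attributes (screen size, price) rather than matched as text, which is the part a single free-text box can't easily do.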

I looked at the results, and I found that Google has some problems parsing content. For example "50in TV $1200 after $500 discount" gets included because Google assumes $500 is the price.

Every single forum search engine, ever. Google is awful for that; the tools just aren't up to the job.

What, in particular, are you looking for when you search for forums? Example queries and the tasks that generated them would be most helpful.

There is a fair bit of data on forums that is not exposed in the UI, simply because no frontend engineer has had a chance to look at it. I'm a frontend engineer, I'm potentially looking for a new project or some 20% work, and so examples of where Google isn't quite working right are pretty helpful to me.

In addition to the travel search engines that have been mentioned, I'd include Hipmunk.

The best way to search StackOverflow is definitely using Google. site:stackoverflow.com

That's one of the first ways I will refine a search about a programming topic these days, but no need to use stackoverflow.com search.

lxrs are a good example of a kind of search that google can't replace yet: http://mxr.mozilla.org/search

i also find http://koders.com to be useful

bing video search has lots of advantages when it comes to showing and previewing results. (and google video search sucks)

for places photos i always use panoramio search. (compare their results to google images)

Google Blogs search has been completely replaced with twitter search.

and i use thepiratebay.org to search for torrents. (but only backups of things i own :))

eniro.se is much better for looking up people or companies in Sweden (basically yellow and white pages). I'm sure all countries have some sort of local equivalent.

Considered as a separate site, Youtube is actually #2 in search volume behind Google and ahead of both Yahoo and Bing.

Craigslist and Facebook are both in the top 10 search engines.

Where can I find a list of the sites with the highest search volume?

I'd say Torrentz for a torrent search engine.

Why are you being downvoted? Sites that have useful searches was exactly what I asked for. Thank you.

Hmm I don't know why. NP

I prefer BTJunkie


iStockPhoto - semantic, allows for moderately easy refining


