

How do users search your website? - martian

I'm working on some search tools to enhance my start-up's existing infrastructure. Users will be searching highly structured data, and I would like to provide them something better than SQL's LIKE operator. I've looked into some options, like the Google Appliance, but many commercial tools are currently too costly for our bootstrapped company. It may be better, in the end, to build my own.

Are there good tools or resources that the YC community would recommend?
======
Readmore
If you're looking to search an existing MySQL DB, I hear Sphinx is a good
option. I personally use Ferret, but everyone seems to think it sucks horribly,
so I'm sure you'll hear that from people. I have had no problems with Ferret
at all, however. Both can be used from Ruby: Ferret is a Ruby library, and
Sphinx (written in C++) has Ruby client bindings.

Solr is also an option but it tends to be slower than Ferret and require more
resources.

Here is a recent post of mine that gives a quick comparison of Ferret and
Solr: <http://www.embought.com/blog/show/10?t=SOLR-vs-Ferret-as-a-Search-Engine-Indexer>

~~~
martian
Thanks for the info, this is really useful. Currently I'm using mostly Python,
but a parallel Ruby install is not out of the question.

~~~
ra
+1 for sphinx.

It's fast and memory efficient, and in my experience rock solid. It comes with
excellent Python bindings, and if your Python is Django, there is also a
really nice django-sphinx project on Google Code. Solr is also good, but it
does require Java and behaves slightly differently. I would recommend you give
both of them a look.

------
rantfoil
I've built custom search engines using Lucene before --
<http://lucene.apache.org/java/docs/index.html> -- the problem is that you have
to build a lot of extra stuff to make it work for your scenarios.

Wikipedia lists a ton of sites that rely on Lucene:
<http://en.wikipedia.org/wiki/Lucene>

You might also want to look at Solr -- I haven't tried it, but it sounds like
you have to make less special sauce to get it running.
<http://en.wikipedia.org/wiki/Solr>

Ferret is actually a port of Lucene to Ruby.

If I were engineering something, I'd probably want to use Facebook Thrift to
make my searches a service that can be called from anywhere -- just use the
Java Thrift binding on top of a Lucene process, and then call into it from my
web app.
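
To make the service idea concrete, here is a minimal Python sketch (Python because that is what the original poster uses). It is not Thrift or Lucene: `TinyIndex` is a hypothetical in-memory stand-in for the Lucene process, and `handle_request` uses a made-up JSON wire format in place of a Thrift-generated interface. The point is only the shape: the web app sends a query to a separate search process and gets hits back.

```python
import json
from collections import defaultdict


class TinyIndex:
    """Hypothetical in-memory inverted index standing in for a real engine."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids
        self.docs = {}                    # document id -> original text

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        terms = query.lower().split()
        if not terms:
            return []
        # AND semantics: a document must contain every query term.
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)


def handle_request(index, raw_request):
    """Service boundary: JSON string in, JSON string out (assumed format)."""
    request = json.loads(raw_request)
    ids = index.search(request["query"])
    return json.dumps({"hits": [{"id": i, "text": index.docs[i]} for i in ids]})


if __name__ == "__main__":
    index = TinyIndex()
    index.add(1, "Red running shoes")
    index.add(2, "Blue running jacket")
    # The web app would send this string over the wire to the search process.
    print(handle_request(index, '{"query": "running shoes"}'))
```

Because the web app only ever sees the request and response strings, the index behind `handle_request` could later be swapped for Lucene, Solr, or Sphinx without touching the callers.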

------
okeumeni
I would say it depends on how much data you have to search and how tightly the
search is linked to your business requirements.

If you are after a casual search mechanism with a few thousand records to
index, go for open source; if not, you may have to implement your own search
engine. I am in the search engine business, though I can’t tell you much
without violating my non-compete agreement.

Another option is to look at our hosted search solution, which is not yet in
beta. What will be interesting for you is the data search service, a
completely customizable search solution. Check it out here:
<http://www.intelliverb.com/PESS/>

------
diego
Take a look at our open-source tool, <http://hounder.org>

Among other sites, it powers wordpress.com (over 3M weblogs indexed).

We are looking for feedback and feature suggestions; you can also join our
discussion group: <http://groups.google.com/group/hounder>

------
streety
I haven't played with it myself (though I intend to) but Lucene might be an
option. <http://lucene.apache.org/java/docs/index.html>

It has also been ported to the Zend Framework if PHP is your thing.

------
xirium
For a site search, see <http://news.ycombinator.com/item?id=184707>

