

Ask HN: Is there any one book or resource on search engine development & theory? - rneufeld

I'm working on a search engine for a web application I am developing and realized I really didn't know that much about making search engines. I've taken a bit of AI & Expert Systems in school but never really run into any books specifically on developing search engines. Do any such books exist? If so, recommendations?
======
rmobin
Greg Linden likes Introduction to Information Retrieval:
[http://www-csli.stanford.edu/~hinrich/information-retrieval-...](http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html)
(free online).

------
xinsight
This article gives a wonderful overview of the challenges:

"Why Writing Your Own Search Engine Is Hard"
<http://queue.acm.org/detail.cfm?id=988407>

(The site is currently down.) Google cache:
[http://74.125.95.132/search?q=cache:13tlOSQwtjAJ:queue.acm.o...](http://74.125.95.132/search?q=cache:13tlOSQwtjAJ:queue.acm.org/detail.cfm%3Fid%3D988407+writing+a+search+engine+is+hard&cd=1&hl=en&ct=clnk&client=safari)

------
michael_dorfman
There are some ACM/IEEE journals that have relevant papers, but you have to
ask yourself: is reinventing the wheel what you really want to be doing? Given
that there are lots of available COTS solutions, shouldn't you be focusing on
things that are unique to your app?

(Needless to say, if the search engine needs _are_ unique to your app, and a
COTS solution isn't viable, you might want to bring in someone with relevant
expertise.)

~~~
gtani
Spot on. OP: Are you asking how basic tf-idf works, or is there something you
can't get Lucene / Solr / Sphinx / tsearch to do easily?
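
For anyone who hasn't met tf-idf before, here is a minimal sketch of the basic idea: a term scores high in a document when it is frequent there and rare across the collection. The function name `tf_idf`, the whitespace tokenizer, and the plain `log(N / df)` weighting are assumptions chosen for illustration; real engines use smoothed or normalized variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (illustrative sketch)."""
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # tf = relative frequency in this document,
            # idf = log(N / df); zero for terms present in every document.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = ["the cat sat", "the dog barked", "the cat purred"]
for doc, w in zip(docs, tf_idf(docs)):
    print(doc, "->", max(w, key=w.get))
```

A term like "the" that appears in every document gets a weight of zero, which is why this kind of scoring pushes distinctive words to the top.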

Nevertheless, here are some good background materials (search Amazon for
"data mining"):

<http://www.amazon.com/gp/product/1584504609>

[http://www.amazon.com/Data-Mining-Practical-Techniques-Manag...](http://www.amazon.com/Data-Mining-Practical-Techniques-Management/dp/0120884070/ref=pd_sim_b_8)

Also, the Collective Intelligence book by Satnam Alag is quite good (a lot of
Java code to wade through, though).

~~~
rneufeld
To be honest, I hadn't even heard of tf-idf before you mentioned it. I am
definitely not stepping beyond the bounds of something like Sphinx.

I basically want to lay a bit of foundation before I start mucking around with
something I have no idea about.

I have a couple of e-books on data mining, but I didn't think they were
applicable. Are data mining and search closely intertwined?

