

Introducing Tapir: Simple search for static sites - jkreeftmeijer
http://jeffkreeftmeijer.com/2011/introducing-tapir-simple-search-for-static-sites/

======
PStamatiou
As for "The problem I ran into was that it takes you off the website you were
searching on and takes you to a Google results page (complete with slightly
irrelevant ads)."

You can set it up to load the ads on the same page. For example, running on my
jekyll blog: <http://paulstamatiou.com/search>

~~~
PStamatiou
Err I meant "load the results on the same page", not ads.

~~~
thomasdavis
Awesome share, will be using this.

------
helium
Look, this is cool and all from a technical perspective, but why are people
strapping themselves into static site generator straitjackets like this? At
some point, doesn't it become much simpler to just use a server-side
framework?

~~~
tptacek
They're significantly faster, they never break, and they're _way_ more secure.

Pretty much everyone already agrees they're less flexible.

------
corin_
Do you have any plans to monetise this ever, and/or might you ever release the
source so people can run it locally themselves?

I have a small enough amount of content on my Jekyll site that I don't want a
search function right now, but if I ever were to want one, the missing feature
I would like is the ability to add pages not from an RSS feed, as in standard
non-blog pages such as /index.html and /about.html (or whatever). Obviously I
could work around by adding them into an RSS feed... but that's a tiny bit
messy.

edit: Assuming I'm right in thinking that one of, if not the, most suited
system for Tapir is Jekyll, you could perhaps mention it on the actual Tapir
site so that search engines can pick it up. (Not that I'm in any way an SEO
expert, but in my experience with niche areas like this, very little or no
attention to SEO can still get you ranking highly for something like "jekyll
search").

~~~
matsimitsu
You can actually just push any content with a link through the API (see
<http://tapirgo.com/#docs> > Push API). So with a deploy script that takes the
content and pushes it, you can already do this right now!
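
A minimal sketch of the "deploy script that takes the content" half of that
idea, assuming the site is built into a local directory of .html files. The
function name collect_pages and the record keys (link, title, content) are
hypothetical, chosen for illustration; the actual Push API request format is
in the Tapir docs and is not reproduced here, so the sending step is left out.

```python
from html.parser import HTMLParser
from pathlib import Path


class _PageText(HTMLParser):
    """Collect the <title> and the visible text of one HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip and data.strip():
            self.chunks.append(data.strip())


def collect_pages(site_dir, base_url):
    """Return one record per .html file under site_dir.

    The keys below are hypothetical; map them to whatever the
    Push API documentation actually asks for before sending.
    """
    records = []
    for path in sorted(Path(site_dir).rglob("*.html")):
        parser = _PageText()
        parser.feed(path.read_text(encoding="utf-8"))
        parser.close()
        rel = path.relative_to(site_dir).as_posix()
        records.append({
            "link": base_url.rstrip("/") + "/" + rel,
            "title": parser.title.strip(),
            "content": " ".join(parser.chunks),
        })
    return records
```

Each record could then be pushed by the deploy script, which would cover
non-blog pages such as /index.html and /about.html without touching the feed.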

------
pencilcode
ideally it would skip the rss feed. most static sites are just plain html
files, they have no rss feeds.

~~~
jkreeftmeijer
We're definitely looking into this and hope to find a clean solution soon. We
started by indexing RSS since it was what we needed and since it was _way_
simpler to implement for a first version. Stay tuned! :)

~~~
_pdeschen
A sitemap.xml could be used to index a static site without the need for a
feed.

<http://www.sitemaps.org/>

Since a sitemap is a /standard/ (sic!) document, there's no need to reinvent
the wheel for truly static sites.
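
As a rough sketch of what reading such a file involves, the snippet below
pulls the page URLs out of a sitemap using only the Python standard library.
The namespace is the one defined by the sitemaps.org protocol; the function
name sitemap_urls and the example.com URLs are made up for illustration, and
this says nothing about how Tapir itself would do it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the <loc> value of every <url> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Illustrative sitemap; the URLs are placeholders.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/index.html</loc></url>
  <url><loc>http://example.com/about.html</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE):
        print(url)
```

An indexer could then fetch each listed page, so plain HTML pages get covered
without an RSS feed.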

------
techtalsky
Never used Elastic Search, but it's not too hard to use Nutch + Solr for
static sites. Nutch to spider your (truly static) site, and Solr to store and
serve search requests.

------
mikemoka
Wouldn't it be simple enough to use Google's own AJAX search API?
<http://code.google.com/apis/customsearch/v1/overview.html>

~~~
jkreeftmeijer
That would be the simpler approach, but Google wants you to put their logo on
your results page and limits you to 100 requests per day, unless you get a
paid plan. (<http://www.google.com/cse/docs/tos.html>)

If you're fine with that, Google Custom Search is a good option. If you're
not, maybe Tapir can help. :)

~~~
sebastianavina
> rate of $5 per 1000 queries

Damn, search is expensive

~~~
diego
Or you can try IndexTank. Free up to 100k documents, $49/mo up to 500k.
Unlimited queries. (disclaimer: it's my company)

~~~
blhack
Thank you for making reddit search...usable.

------
robert-boehnke
Can it search atom feeds as well?

~~~
jkreeftmeijer
Yes, it can handle atom feeds too. If you have a feed Tapir can't read, let me
know and I'll look into it. :)

------
mahmud
I use Montezuma, a tiny Lucene clone in CL.

