

Caponia - an in-memory, full-text search system in Clojure - brehaut
http://github.com/mattdw/caponia

======
jbooth
Why not just wrap Lucene? It's written in Java, has proven itself over the
course of a decade, and supports in-memory indexes, file-backed indexes,
whatever. That's probably a lot easier than writing your own tokenizers and
indexing algorithms.

~~~
mattdw
Fair question. I'm running CouchDB for a new project, and Lucene is the
'blessed' search backend for that.

However, (at least via CouchDB) Lucene needs a whole JVM of its own, and I'm
planning to run this on a VPS with pretty tight resources. A couple of MB in
the same JVM as my site is a whole lot more appealing, and has the additional
advantage of using native data structures directly.
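
For readers unfamiliar with the idea, here is a minimal sketch of a small
in-memory index built on ordinary JVM collections. It is written in Java purely
for illustration and is not Caponia's code or API: the class name TinyIndex,
its methods, and the simple lowercase-and-split tokenizer are all hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: not Caponia's implementation or API.
class TinyIndex {
    // Inverted index: term -> ids of the documents containing that term.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Lowercase the text and split on runs of non-alphanumeric characters.
    static Set<String> tokenize(String text) {
        Set<String> terms = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Record every distinct term of the text under the given document id.
    void add(String docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
        }
    }

    // Ids of documents containing every term in the query (AND semantics).
    // An empty query matches nothing.
    Set<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> docs =
                postings.getOrDefault(term, Collections.<String>emptySet());
            if (result == null) {
                result = new HashSet<>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new HashSet<String>() : result;
    }
}
```

Calling add("a", "some text") indexes a document, and search("some text")
returns the ids of the documents containing all of the query's terms. A sketch
like this leaves out stemming, ranking, and persistence, which is where a
mature toolkit earns its keep.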

~~~
jbooth
Lucene's just a toolkit. It'll run in the same JVM that you're running Clojure
in, and while it can be memory-intensive if you're running a lot of searches
over a big index, it's probably not any more memory-intensive than what you're
doing. They've been through a bunch of iterations already.

Maybe you were thinking of Solr? That's the REST service that wraps Lucene, but
you can use the Lucene API from within your app just as easily if that fits
your use case.

~~~
mattdw
couchdb-lucene, which is what I was looking at, does require its own JVM. I
didn't think to look at using Lucene directly.

On the other hand, I'm quite happy with how this turned out: it's one less
dependency, and it's been fun to write.

~~~
jbooth
Yeah, kudos for writing it. It's always good to write something like that and
learn. If you find further needs, I'd really recommend looking at the "thin
wrapper around Lucene" approach. They've been banging on this problem for a
decade and have a lot of lessons learned in their codebase.

