How-to: create a real-time search app with IndexTank and Heroku

barmstrong · on Dec 22, 2010

I played with IndexTank a bit this past weekend, and it seems like they are really hurting from the lack of a good gem that fits into Active Record. Compared to something like ThinkingSphinx, it feels very un-Rails-ish.

I noticed this project has cropped up to try and solve it, which is awesome: https://github.com/flaptor/thinkingtank. However, it's not Rails 3 compatible right now.

If I were IndexTank, I think getting a gem like that up to speed would be my top priority, to get real traction on Heroku or among Rails folks.

diego · on Dec 22, 2010

We also noticed that someone put out this gem:

https://rubygems.org/gems/tanker

gregwebs · on Dec 22, 2010

Why would one want to use IndexTank instead of Sphinx Search?

jhandl · on Dec 22, 2010

IndexTank is a hosted service, so you don't have to build, configure or manage your own search infrastructure.

samd · on Dec 22, 2010

When I first saw the contest I was only thinking about indexing all the existing data from some website, but clearly it's better to just start indexing the new data, and get a website up and running. Later I could write another program that fetches and indexes the old data in the background. Thanks for writing this, it gave me some new ideas.

Blankwood · on Dec 22, 2010

I see I can get a free 1-million-document account for the contest on Heroku. How big is that for this type of application, like the one you show in the tutorial?

diego · on Dec 22, 2010

My app is using 25k docs. What I described in the article is very basic; the app that you can try at http://plixitank.heroku.com keeps a window of the 25k most recent items from Plixi and erases older stuff. That's enough for several hours' worth of search history.
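
For anyone wondering what "keeps a window of the most recent items" could look like, here is a minimal Python sketch. It is not the code behind the Plixi app. The `add_document` and `delete_document` method names are assumptions made for illustration, not a confirmed IndexTank client API, and `InMemoryIndex` is a hypothetical stand-in for the hosted index.

```python
from collections import deque


class WindowedIndex:
    """Keep only the most recent `capacity` documents in an index.

    `index` is any object exposing add_document(doc_id, fields) and
    delete_document(doc_id). Those names are assumptions for this
    sketch, not a confirmed client API. Document ids are assumed unique.
    """

    def __init__(self, index, capacity=25_000):
        self.index = index
        self.capacity = capacity
        self.order = deque()  # doc ids, oldest on the left

    def add(self, doc_id, fields):
        self.index.add_document(doc_id, fields)
        self.order.append(doc_id)
        # Evict the oldest documents once the window is over capacity.
        while len(self.order) > self.capacity:
            self.index.delete_document(self.order.popleft())


class InMemoryIndex:
    """Hypothetical stand-in for a hosted index, only for trying the sketch."""

    def __init__(self):
        self.docs = {}

    def add_document(self, doc_id, fields):
        self.docs[doc_id] = fields

    def delete_document(self, doc_id):
        self.docs.pop(doc_id, None)
```

With a capacity of 25,000, every new item past the limit causes exactly one delete, so the index size stays flat no matter how long the app runs.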

Blankwood · on Dec 22, 2010

Just saw that you also built Trendistic. How big is that index? I'm trying to get a feel for the size of index needed to make a compelling contest entry.

santip · on Dec 22, 2010

Trendistic is a special-purpose app with several million documents. However, the contest accounts are limited to 1M documents, so don't worry much about the size. You could try an interesting approach to indexing tweets, and you should be more than fine choosing up to 1M tweets by some criterion, be it recency, popularity of the author or something else. You can contact us directly if you have other specific questions, at support [at] indextank or through the chat box on our site.

diego · on Dec 22, 2010

This is because many people asked us for ideas for the contest. It's a sample app built from the ground up.

eidorianu · on Dec 22, 2010

diego, what language was used to implement that search engine back in '98?

diego · on Dec 22, 2010

C. I probably still have that source code around. It was very rudimentary: an inverted index with no relevance, only AND queries (intersection of word vectors). I had the index mmap'ed because it was pretty small.
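
To make "an inverted index with no relevance, only AND queries" concrete, here is a rough Python sketch of the general technique. It is not the original C code and leaves out the mmap'ed storage. The function names (`build_index`, `intersect`, `search_and`) and the whitespace tokenization are assumptions chosen for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the sorted list of ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            postings[word].add(doc_id)
    return {word: sorted(ids) for word, ids in postings.items()}


def intersect(a, b):
    """Merge-intersect two sorted id lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search_and(index, query):
    """Return ids of documents containing every query word, unranked."""
    words = query.lower().split()
    if not words:
        return []
    # Start from the shortest posting list so intermediate results stay small.
    lists = sorted((index.get(w, []) for w in words), key=len)
    result = lists[0]
    for plist in lists[1:]:
        result = intersect(result, plist)
        if not result:
            break
    return result
```

Because there is no scoring, results come back in document-id order; a word that appears in no document makes the whole query return nothing.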