How-to: create a real-time search app with IndexTank and Heroku (indextank.com)
33 points by diego on Dec 22, 2010 | 12 comments



I played with IndexTank a bit this past weekend, and it seems to be really hurting from the lack of a good gem that integrates with Active Record. Compared to something like ThinkingSphinx, it feels very un-Rails-like.

I noticed that the project at https://github.com/flaptor/thinkingtank has cropped up to try to solve this, which is awesome. However, it's not Rails 3 compatible right now.

If I were IndexTank, I think getting a gem like that up to speed would be my top priority, to get real traction on Heroku or among Rails folks.


We also noticed that someone put out this gem:

https://rubygems.org/gems/tanker


Why would one want to use IndexTank instead of Sphinx Search?


IndexTank is a hosted service, so you don't have to build, configure, or manage your own search infrastructure.


When I first saw the contest I was only thinking about indexing all the existing data from some website, but clearly it's better to just start indexing the new data, and get a website up and running. Later I could write another program that fetches and indexes the old data in the background. Thanks for writing this, it gave me some new ideas.


I see I can get a free 1-million-document account for the contest at Heroku. How big is that for the type of application you show in the tutorial?


My app is using 25k docs. What I described in the article is very basic. The app that you can try at http://plixitank.heroku.com keeps a window of the 25k most recent items from Plixi and erases older ones. That's enough for several hours' worth of search history.
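A rough sketch of the bookkeeping behind a "most recent N items" window like the one described above, in Python. This is not the app's actual code: the `RecentWindow` class and its `add` method are hypothetical names, and the call that would delete evicted documents from the search index is left out entirely.

```python
from collections import deque


class RecentWindow:
    """Tracks the ids of the most recent `capacity` documents (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._ids = deque()  # oldest id on the left, newest on the right

    def add(self, doc_id):
        """Record a newly indexed id and return the ids that fell out of the window."""
        self._ids.append(doc_id)
        evicted = []
        while len(self._ids) > self.capacity:
            evicted.append(self._ids.popleft())
        return evicted

    def __len__(self):
        return len(self._ids)
```

Each id returned by `add` is one that the caller would then erase from the index, so the index never holds more than `capacity` documents.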


I just saw that you also built Trendistic. How big is that index? I'm trying to get a feel for the index size needed to make a compelling contest entry.


Trendistic is a special-purpose app with several million documents. However, the contest accounts are limited to 1M documents, so don't worry much about the size. You could try an interesting approach to indexing tweets, and you should be more than fine choosing up to 1M tweets by some criterion, be it recency, popularity of the author, or something else. If you have other specific questions, you can contact us directly at support [at] indextank or through the chat box on our site.


We wrote this because many people asked us for ideas for the contest. It's a sample app built from the ground up.


diego, what language did you use to implement that search engine back in '98?


C. I probably still have that source code around. It was very rudimentary: an inverted index with no relevance ranking, only AND queries (intersection of word vectors). I had the index mmap'ed because it was pretty small.
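For anyone unfamiliar with the idea, here is a minimal sketch of an inverted index with AND-only queries. It is in Python rather than C, keeps everything in memory instead of mmap'ing a file, and uses made-up names (`build_index`, `search_and`), so it illustrates the technique and is not a reconstruction of the original engine.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_and(index, query):
    """Return ids of documents containing every query word (no ranking)."""
    words = query.lower().split()
    if not words:
        return set()
    # Look up each word's posting set; unknown words yield an empty set.
    postings = [index.get(word, set()) for word in words]
    # Intersect starting from the smallest set to keep intermediate results small.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result
```

A query is split on whitespace and the result is the plain intersection of the per-word sets, so a single unknown word makes the whole query return nothing.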



