Instant Word Search : Something I wrote for fun

aston · on Aug 31, 2007

If you're actually into making this more awesome, you almost certainly want to put this together with some in-memory data structure. And that data structure should probably be a suffix tree/trie. Check it out: http://en.wikipedia.org/wiki/Suffix_tree. You'd need to modify that to deal with having more than one word, but that's not too bad (you just lay the trees on top of each other).
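For illustration, here is a minimal Python sketch of that idea, with some assumptions: it is a naive, uncompressed suffix trie rather than a true suffix tree, so memory grows roughly with the square of each word's length. The names `Node`, `SuffixTrieIndex`, and `search` are made up for this example and are not from the original project. Every suffix of every word is inserted into one shared trie ("laying the trees on top of each other"), and each node remembers which words pass through it.

```python
class Node:
    """One trie node: child links plus the ids of words containing this path."""
    __slots__ = ("children", "word_ids")

    def __init__(self):
        self.children = {}
        self.word_ids = set()


class SuffixTrieIndex:
    """Hypothetical in-memory index; all words share a single suffix trie."""

    def __init__(self, words):
        self.words = list(words)
        self.root = Node()
        for word_id, word in enumerate(self.words):
            # Insert every suffix of the word into the shared trie.
            for start in range(len(word)):
                node = self.root
                for ch in word[start:]:
                    node = node.children.setdefault(ch, Node())
                    node.word_ids.add(word_id)

    def search(self, query):
        """Return, in sorted order, every word that contains `query`."""
        if not query:
            return sorted(self.words)
        node = self.root
        for ch in query:
            node = node.children.get(ch)
            if node is None:
                return []
        return sorted(self.words[i] for i in node.word_ids)


index = SuffixTrieIndex(["banana", "bandana", "cabana", "apple"])
print(index.search("ana"))
print(index.search("ppl"))
```

A lookup only walks as many nodes as the query has characters, no matter how many words are indexed. A real suffix tree would compress the single-child chains to save memory.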

kirubakaran · on Aug 31, 2007

Thanks a lot!

kirubakaran · on Aug 31, 2007

It is silly, really. But I just thought I'd share it anyway.

ivankirigin · on Aug 31, 2007

Not so instant for me :-P

paulgb · on Aug 31, 2007

I found the same. You must be using LIKE across the whole dataset?

Why not build a hash table with every two-letter combination as the keys and the indexes of every word containing that combination as the values? That should narrow down the number of words you have to search through for each query, and it could all be done in MySQL with a many-to-many relation.
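A rough sketch of the two-letter index, assuming an in-memory Python dict rather than the MySQL tables mentioned above. The names `build_bigram_index` and `search` are hypothetical. The index only narrows the candidates, so each one still has to be checked with a real substring test.

```python
from collections import defaultdict


def build_bigram_index(words):
    """Map each two-letter combination to the ids of the words containing it."""
    index = defaultdict(set)
    for word_id, word in enumerate(words):
        for i in range(len(word) - 1):
            index[word[i:i + 2]].add(word_id)
    return index


def search(words, index, query):
    """Return, in sorted order, every word that contains `query`."""
    if len(query) < 2:
        # Too short to form a two-letter key: fall back to scanning everything.
        candidates = range(len(words))
    else:
        bigrams = [query[i:i + 2] for i in range(len(query) - 1)]
        # A matching word must contain every two-letter piece of the query.
        candidates = set.intersection(*(index.get(bg, set()) for bg in bigrams))
    # Containing the pieces is necessary but not sufficient, so verify each one.
    return sorted(words[i] for i in candidates if query in words[i])


words = ["banana", "bandana", "cabana", "apple"]
index = build_bigram_index(words)
print(search(words, index, "ana"))
print(search(words, index, "nab"))
```

The query "nab" shows why the final check matters: "cabana" contains both "na" and "ab", but not "nab" itself.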

kirubakaran · on Aug 31, 2007

I use a text file (not a db). I also thought that partitioning it would give better performance, but the profiler disagreed.

I use nearlyfreespeech.net, and they don't support FastCGI yet. That's the bottleneck.

I am planning to try this on EC2 to see what happens.

kirubakaran · on Sept 19, 2007

It is much faster now.