Whatever web app framework you favor, there should be plugins for Solr and Sphinx that make full-text indexing with reasonable defaults pretty easy, e.g. Thinking Sphinx for Rails. I used to use acts_as_solr (I think a lot of people use Sunspot now, and Xapian).
Play with a database or docs in a filesystem, do deltas of Solr and Sphinx, changing parameters like stopwords, token separators, stemmers, and UTF-8 and ISO-Latin to ASCII mappings. See if you can get decent precision/recall metrics. There are quite a few degrees of freedom, depending on the database.
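For the precision/recall part, something like this rough Python sketch is probably enough to compare runs. The doc ids, the hand-labeled relevant set, and the run names are all made up for illustration; in practice you'd pull the result ids from whatever Solr or Sphinx client you're using.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets so an empty result list scores 0 instead of crashing.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical doc ids judged relevant by hand for one test query.
relevant_ids = {1, 4, 7, 9}

# Hypothetical result ids from two index configurations (not real engine output).
runs = {
    "solr, default stopwords": [1, 4, 5, 7],
    "sphinx, stemming on": [1, 2, 4, 7, 9, 12],
}

scores = {name: precision_recall(ids, relevant_ids) for name, ids in runs.items()}
for name, (p, r) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Rerun it each time you change a stopword list or stemmer and you can see whether you traded precision for recall or the other way around.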
http://www.computationalmedicine.org/challenge/cmcChallengeD...
http://stackoverflow.com/questions/tagged/sphinx