

We relaunched similarity-search based on Y Combinator feedback. Thoughts? - eserorg
http://www.eser.org/usa/en

======
eserorg
We received a lot of valuable feedback on our similarity-search engine, which
we launched a few days ago.

Based on your feedback, we've made some major changes to improve recall.
Specifically, we've begun to include data from our web-crawler.

We've also started to prune many of the similarity-search results in order to
improve precision.

Finally, we cleaned up the UI to make it clearer what the website does. I
think that we still have some work to do in this area, however.

Unfortunately, many of the changes we've made to the algorithm have
_dramatically_ slowed searches down. Most searches now take over a minute
to complete!

We're hard at work on fixing that, though. Specifically, we're playing around
with implementing multi-level counting Bloom filters, count-min
Flajolet-Martin sketches, and quantile FM digests.
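
For anyone unfamiliar with these structures, here is a minimal Python sketch
of a plain count-min sketch, the simplest relative of the ones named above. It
is an illustration only, not ESer's implementation: the class name, the
width/depth defaults, and the use of salted BLAKE2b hashes are all assumptions
made for the example.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter using fixed memory (illustrative only)."""

    def __init__(self, width=1024, depth=4):
        # width and depth are assumed defaults, not tuned values.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One independent-looking hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.rows[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum across rows
        # is the tightest estimate and never undercounts.
        return min(
            self.rows[row][self._index(item, row)] for row in range(self.depth)
        )


sketch = CountMinSketch()
for _ in range(3):
    sketch.add("similarity")
print(sketch.estimate("similarity"))
```

The trade-off is the usual one for sketches: memory stays fixed no matter how
many distinct items arrive, at the cost of estimates that may overcount.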

We should have some major performance improvements up over the next few days.

We're also looking at launching a pre-alpha of a stand-alone software package
that implements the ESer algorithm so that people can run similarity-searches
on their own private data sets.

Please comment with your feedback.

Thanks again!

~~~
eserorg
BTW, I want to apologize for how slow searches are running! We're going to
work on this until it's fixed.

It should probably take about a week to see an improvement -- we don't want to
sacrifice recall or precision to improve performance.

So, we're going to try to have our cake and eat it, too.

------
dkasper
It starts out pretty good, but gets kind of bizarre by the last 20 or so
results. Pretty cool idea though!

~~~
eserorg
Right, it's a contextual ranking algorithm -- it ranks links based on how
"similar" they are to what you typed in.

So, the further down the list you go, the lower the rank is -- the less
"similar" the links get.

We actually compute a numerical "similarity score" for each link. Perhaps we
should show it?
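
To make the idea concrete, here is a minimal Python sketch of ranking links by
a numerical similarity score. The thread does not say how the ESer score is
computed, so cosine similarity over word counts is used purely as an assumed
stand-in, and the function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two strings, treated as bags of words."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def rank(query, candidates):
    """Return (score, candidate) pairs, most similar first."""
    scored = [(cosine_similarity(query, c), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


for score, text in rank(
    "similarity search",
    ["similarity search engine", "cooking recipes", "search"],
):
    print(f"{score:.3f}  {text}")
```

Showing the score next to each link, as the loop does here, would let a reader
see where the results start to drift away from the query.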

Thanks for trying it out!

------
jakewolf
Reminds me of the AdWords keyword tool. Fun. Good luck.

