Looking through HN posts over the last 30 days there are at least 15 posts for symantic search. I'm excited about the prospect but haven't seen good results.
Is there any info on query performance? And to builders, did you get better results by incorporating semantic search?
This engine used a neural network to train an autoencoder that crunches down the word counts for thousands of words to a moderate dimensional vector, say n=50. This captures correlations between words such that similar documents are more consistently close in the embedding space than they are in the very high dimensional word vector space.
This kind of system does not improve short queries (<10 words) but is great for "more like this" queries centered on a document and taking paragraph you wrote describing an invention and finding prior art.
We used the TREC evaluation methodology, public data, our proprietary data, and the opinions of users to conclude our product was much better than a simple baseline search engine and our competitors.