
Sub-millisecond latency sounds impressive, but isn't network latency going to overshadow these gains in most real-world scenarios?



When search is cheap and fast, you can improve result quality by post-processing the results and running additional queries when necessary.

I use Tantivy and add refinements like: if the top result is objectively a low-quality one, it's usually a query with a typo finding a document with the same typo, so I run the query again with fuzzy spelling. If all the top results share a tag that isn't in the query, I mix in results from another search with that most common tag excluded. If the query is a word with multiple meanings, I can ensure that each meaning is represented in the top results.
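The refinement logic described above can be sketched roughly like this. The `search_fn` interface here is hypothetical (it is not Tantivy's actual API); the point is only the control flow: retry fuzzily on a suspicious top hit, and blend in a tag-excluded query for diversity.

```python
def refined_search(query, search_fn, min_top_score=1.0, limit=10):
    """Post-process cheap searches with extra queries when warranted.

    search_fn(query, fuzzy=..., exclude_tag=..., limit=...) is a
    hypothetical backend returning a ranked list of (score, tag, doc).
    """
    results = search_fn(query, limit=limit)

    # A low-scoring top hit usually means a typo in the query matched
    # the same typo in a document; retry with fuzzy term matching.
    if not results or results[0][0] < min_top_score:
        results = search_fn(query, fuzzy=True, limit=limit)

    # If every top hit carries the same tag (and that tag isn't in the
    # query itself), mix in results with the tag excluded for diversity.
    tags = {tag for _score, tag, _doc in results}
    if len(tags) == 1 and (tag := tags.pop()) not in query:
        extra = search_fn(query, exclude_tag=tag, limit=limit // 2)
        results = results[: limit - len(extra)] + extra

    return results
```

Each refinement costs one extra round trip to the index, which is only acceptable because the underlying searches are sub-millisecond.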


It depends on the application.

When using SeekStorm as a server, keeping per-query latency low increases the throughput and the number of parallel queries the server can handle on given hardware. An efficient search server can reduce the required investment in server hardware.
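The latency-to-throughput link is just Little's law. A back-of-the-envelope illustration (made-up numbers, not SeekStorm benchmarks):

```python
def max_qps(concurrent_workers: int, latency_seconds: float) -> float:
    """Little's law: sustained throughput = concurrency / latency."""
    return concurrent_workers / latency_seconds

# Same 32-worker server, different per-query latencies:
baseline = max_qps(32, 0.010)    # 10 ms/query  -> 3,200 qps
improved = max_qps(32, 0.0005)   # 0.5 ms/query -> 64,000 qps
```

With the same hardware and concurrency, a 20x latency reduction yields 20x the query throughput, which is where the hardware savings come from.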

In other cases, only the local search performance matters, e.g., for data mining or RAG.

Also, it's not only about averages but also about tail latencies. While network latencies dominate the average search time, that is not the case for tail latencies, which in turn heavily influence user satisfaction and revenue in online shopping.
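A toy calculation (synthetic numbers) shows why the average and the tail behave so differently: a steady network cost dominates the mean when searches are fast, but a single slow query dominates the p99.

```python
import statistics

network_ms = 20.0                      # roughly constant per request
search_ms = [1.0] * 99 + [200.0]       # 99 fast searches, one slow outlier

total_ms = sorted(network_ms + s for s in search_ms)

mean = statistics.fmean(total_ms)              # ~23 ms: mostly network
p99 = total_ms[int(0.99 * len(total_ms))]      # 220 ms: mostly search
```

In the mean, the network contributes about 87% of the time; at the p99, it contributes under 10%. Shaving the search tail directly improves the worst experiences users actually notice.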


A typical server is serving more than one request at a time, hopefully.



