Hacker News | tkfoss's comments

Wouldn't the holy grail then be parallel channels for candidate generation:

  euclidean embedding
  hyperbolic embedding
  sparse BM25 / SPLADE lexical search
  optional multi-vector signatures

  ↓ merge & deduplicate candidates
followed by weighted scoring, expansion (graph) & rerank (LLM)?
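
For illustration, the "merge & deduplicate candidates" step could look roughly like the sketch below. This is only a guess at one reasonable shape, not anyone's actual pipeline: the function name `merge_candidates`, the channel names, the document IDs, and the scores are all made up for the example. It assumes each channel returns a list of (doc_id, score) pairs and keeps per-channel scores side by side so a later weighting or rerank step can use them.

```python
from collections import defaultdict


def merge_candidates(channels):
    """Merge hits from several retrieval channels, deduplicating by doc_id.

    channels: dict mapping a channel name to a list of (doc_id, score) pairs.
    Returns a dict: doc_id -> {channel name: best score seen in that channel}.
    """
    merged = defaultdict(dict)
    for name, hits in channels.items():
        for doc_id, score in hits:
            previous = merged[doc_id].get(name)
            # If a channel returns the same document twice, keep its best score.
            if previous is None or score > previous:
                merged[doc_id][name] = score
    return dict(merged)


# Toy data (hypothetical): scores are on different scales per channel.
channels = {
    "euclidean": [("d1", 0.91), ("d2", 0.80)],
    "hyperbolic": [("d2", 0.77), ("d3", 0.60)],
    "bm25": [("d1", 12.3), ("d4", 9.1)],
}
candidates = merge_candidates(channels)
```

Raw scores from different channels are not comparable here (cosine-like values next to BM25 values), so some normalization or rank-based fusion would still be needed before the scoring step.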

That is pretty much exactly what we do for our company-internal knowledge retrieval:

    embedding search (0.4)
    lexical/keyword search (0.4)
    fuzzy search (0.2)
It might indeed achieve the best of all worlds.
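
A minimal sketch of how weights like 0.4 / 0.4 / 0.2 might be applied, assuming they are per-channel weights in a weighted sum. The comment does not say how scores are combined, so the min-max normalization, the names `WEIGHTS`, `min_max`, and `fuse`, and the sample scores are all assumptions made for the example.

```python
# Channel weights taken from the comment; everything else is hypothetical.
WEIGHTS = {"embedding": 0.4, "lexical": 0.4, "fuzzy": 0.2}


def min_max(scores):
    """Rescale one channel's scores to [0, 1] so channels are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every hit as a full match for this channel.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(channel_scores, weights=WEIGHTS):
    """Weighted sum of normalized scores; a doc missing from a channel gets 0."""
    fused = {}
    for name, scores in channel_scores.items():
        if not scores:
            continue  # channel returned nothing
        for doc_id, s in min_max(scores).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + weights[name] * s
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Toy scores (made up) on deliberately different scales.
results = {
    "embedding": {"a": 0.9, "b": 0.5, "c": 0.1},
    "lexical": {"a": 2.0, "b": 10.0},
    "fuzzy": {"b": 0.8, "c": 0.8},
}
ranking = fuse(results)
```

With this toy data, document "b" ranks first because it scores reasonably in all three channels, which is the usual argument for combining them. Rank-based schemes such as reciprocal rank fusion are a common alternative when score scales are hard to normalize.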
