I’m wondering how to detect that there are no relevant matches in a specialized RAG. Could you add some “ringer” articles that are broadly relevant? If they come up first, there’s no good match.
I've advocated at work for a similar strategy using prompt injections and jailbreaks in the dataset, and to abort when those documents are matched. So far no traction. I think overall it is a mistake to build any such system with only positive examples or documents, but I'm a security person, and still learning machine learning.
Is the best match dot product (or other distance metric used) below some level? What the good level is would have to be examined for your particular use case, maybe make a histogram of the best (or several best) dot products from many good searches and pick some quantile.
Another way to do this could be to calculate the "average" of all of the embeddings of documents in the index, then try to figure out if the query embeds to a point that's a significant distance from that average. Maybe figure out that distance through experimentation?