If you use a few-shot technique (i.e., your prompt contains a couple of example questions and answers), you can mitigate this behavior by adding a question whose answer is "I don't know".
More generally, if you teach the model to reject nonsense questions and to admit when it doesn't know something, it's more likely to do that.
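For example, a prompt along these lines (the wording, model name, and openai call details below are just placeholders, not a recommendation):

    import openai  # assumes the legacy Completion API (openai-python < 1.0) for GPT-3

    openai.api_key = "sk-..."  # placeholder key

    # Few-shot prompt: two normal examples plus one deliberately unanswerable
    # question answered with "I don't know".
    few_shot = (
        'Answer from the context. If the answer is not in the context, say "I don\'t know".\n\n'
        "Context: Amundsen reached the South Pole in December 1911.\n"
        "Q: Who reached the South Pole first?\nA: Roald Amundsen\n\n"
        "Context: Amundsen reached the South Pole in December 1911.\n"
        "Q: What did Amundsen eat for breakfast that morning?\nA: I don't know\n\n"
    )

    context = "The warranty on the X200 lasts two years."   # made-up domain text
    question = "Does the X200 support Bluetooth?"

    resp = openai.Completion.create(
        model="text-davinci-003",        # placeholder model name
        prompt=few_shot + f"Context: {context}\nQ: {question}\nA:",
        max_tokens=64,
        temperature=0,
    )
    print(resp.choices[0].text.strip())  # should lean towards "I don't know"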
I agree with you in principle, and in general, but in rather small domains like this I would imagine that symptom management using negative examples (i.e., training pairs where the response is a refusal to answer) and adding to the corpus more explicit statements about what is not true, possible, or known would get you to a pretty good place.
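Something like this, assuming the old prompt/completion JSONL fine-tuning format (the questions and file name are made up):

    import json

    # Made-up examples: out-of-scope or nonsense questions paired with an
    # explicit refusal, mixed in with the normal in-domain pairs.
    negative_pairs = [
        {"prompt": "Q: What is the X200's warranty on the Moon?\nA:",
         "completion": " I don't know; that isn't covered by the documentation."},
        {"prompt": "Q: Can the X200 read my mind?\nA:",
         "completion": " No, that isn't something the X200 can do."},
    ]

    with open("train.jsonl", "a", encoding="utf-8") as f:
        for pair in negative_pairs:
            f.write(json.dumps(pair) + "\n")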
> I didn't know there are many techniques to mitigate this
A trivial idea: you can use GPT-3 to inject bullshit/hallucinations into real text, then train the model on the reverse task of detecting the bullshit in input text.
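Rough sketch of what I mean, with placeholder model name, prompts, and passages:

    import json
    import random

    import openai  # legacy Completion API again, as an assumption

    CORRUPT_PROMPT = (
        "Rewrite the passage below, silently changing one fact to something "
        "plausible but false. Keep everything else identical.\n\n"
        "Passage: {passage}\n\nRewritten passage:"
    )

    def corrupt(passage: str) -> str:
        # Ask GPT-3 to splice a hallucination into otherwise real text.
        resp = openai.Completion.create(
            model="text-davinci-003",    # placeholder model name
            prompt=CORRUPT_PROMPT.format(passage=passage),
            max_tokens=256,
            temperature=0.9,
        )
        return resp.choices[0].text.strip()

    # Trusted source passages; stand-ins for a real corpus.
    real_passages = [
        "The X200 ships with a two-year warranty and a USB-C charger.",
        "Support is available Monday to Friday, 9am to 5pm CET.",
    ]

    # Label 0 = clean, 1 = contains an injected hallucination; train any
    # classifier (or the same model) on this reverse "bullshit detection" task.
    dataset = []
    for passage in real_passages:
        dataset.append({"text": passage, "label": 0})
        dataset.append({"text": corrupt(passage), "label": 1})

    random.shuffle(dataset)
    with open("detector_train.jsonl", "w", encoding="utf-8") as f:
        for row in dataset:
            f.write(json.dumps(row) + "\n")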
I didn't know there are many techniques to mitigate this