Eh, I think this is pretty reasonable and not a "hack". It matches what we do as people. There probably needs to be research into how the model can tell when it doesn't know something, though.
I think if you remember that LLMs are not databases, but do contain a super lossy-compressed version of their training knowledge, this feels less like a hack. If you ask someone "who won the World Cup in 2002?", they may say "I think it was X, but let me google it first". That person isn't broken; using tools isn't a failure.
If the context is a work setting, or somewhere that is data-centric, it totally makes sense to fact-check. Think of a chatbot for a store, or for a company helping someone troubleshoot or research. Anything with really obvious answers that are easy to learn from volumes of data ("what company makes the Corolla?") probably doesn't need fact-checking as often, but why not have the system check its work anyway?
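To make that concrete, here's a rough sketch of that answer-then-check loop. Everything in it is hypothetical: call_llm() and search_web() are stand-ins for whatever model client and search API you'd actually wire up.

    # Hypothetical "answer, then check your work" loop for a data-centric bot.
    # call_llm() and search_web() are stubs; point them at your real model/search API.

    def call_llm(prompt: str) -> str:
        raise NotImplementedError("stand-in for your model client")

    def search_web(query: str) -> str:
        raise NotImplementedError("stand-in for your search API")

    def answer_with_check(question: str) -> str:
        draft = call_llm(f"Answer concisely: {question}")
        # Ask the model to flag claims it isn't sure about.
        shaky = call_llm(
            "List any factual claims in this answer you are unsure of, "
            f"one per line, or NONE:\n{draft}"
        )
        if shaky.strip() == "NONE":
            # Obvious stuff ("what company makes the Corolla?") skips the lookup.
            return draft
        evidence = "\n".join(search_web(claim) for claim in shaky.splitlines())
        # Revise the draft against what the search actually returned.
        return call_llm(
            f"Question: {question}\nDraft: {draft}\nEvidence:\n{evidence}\n"
            "Rewrite the draft so it only states what the evidence supports."
        )

The "I think it was X, but let me google it first" behavior above is exactly this: answer from compressed knowledge, escalate to a tool only when confidence is low.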
Meanwhile, programming, writing prose, etc. are not things you generally fact-check midway, and they can be "learned" well from statistical volume. Most programmers can get "pretty good" syntax on the first try, any dedicated syntax tool will get to basically 100%, and the same makes sense for an LLM.