I wonder if large language model architectures can be devised to have this capability.
Right now a model like GPT-4V will confidently identify the blue (incorrect) picture as Neptune. Similarly, if asked, it says that it's blue.
Even if the model is trained on this new information, which supersedes everything previously written about the colour, it is outweighed by the sheer volume of incorrect information. The loss function optimises for statistically more common answers, not more correct answers.
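To make that concrete, here is a toy sketch (nothing to do with GPT-4V's actual training) of why cross-entropy settles on the majority answer: a single "fact slot" with one logit per candidate answer, trained on a corpus where 90% of the text says "blue" and only 10% carries the corrected "blue-green". The names and counts are invented for illustration.

```python
import math

# Hypothetical training corpus: 90 documents say "blue", 10 say "blue-green".
counts = {"blue": 90, "blue-green": 10}
total = sum(counts.values())

# One logit per answer; minimise average cross-entropy by gradient descent.
logits = {"blue": 0.0, "blue-green": 0.0}
lr = 0.5
for _ in range(2000):
    exps = {a: math.exp(v) for a, v in logits.items()}
    norm = sum(exps.values())
    probs = {a: v / norm for a, v in exps.items()}
    for a in logits:
        # Gradient of average cross-entropy w.r.t. logit a is p(a) - q(a),
        # where q is the empirical answer frequency in the corpus.
        logits[a] -= lr * (probs[a] - counts[a] / total)

exps = {a: math.exp(v) for a, v in logits.items()}
norm = sum(exps.values())
probs = {a: v / norm for a, v in exps.items()}
print(probs)  # "blue" ends up near 0.9 -- the corpus frequency, not the truth
```

The optimum of cross-entropy is exactly the empirical frequency, so no amount of extra training on the same mixed corpus moves the model to the correct answer; the newer fact would have to outnumber or override the older one.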
We somehow need the model to recognise that new information supersedes older information, and to invalidate the outdated facts accordingly.