"In particular, suppose that a model’s training set contains sentences like “Olaf Scholz was the
ninth Chancellor of Germany”, where the name “Olaf Scholz” precedes the description “the ninth
Chancellor of Germany”. Then the model may learn to answer correctly to “Who was Olaf Scholz?
[A: The ninth Chancellor of Germany]”. But it will fail to answer “Who was the ninth Chancellor of
Germany?” and any other prompts where the description precedes the name."
The operator are, which happens to look similar to an english word, has a meaning that depends on context. Another example is 5/2 means a simple fraction in one context, or a class of polytopes in another.
It does not mean $cows === $mammals.
"A is a B" Or "A's are B's" means that A is included in B.
$a === $b ("A is B") implies that $b === $a (B is A), in all cases that I know of.