Llama2:7b makes the same mistakes. It's not until you use something like Mixtral or Llama2:13b that it actually gets the correct results (in my one example).
Interestingly, Llama2:13b objects that there is no discount until I clarify: "the discount you're getting [with 2 apples]"
It's not just math, though; it's any kind of complex reasoning and ambiguity. Comparing to humans is always tricky, but most humans wouldn't balk at being asked what discount they're getting without my specifying that it's the two apples that carry the discount in this example. A more advanced model often states its assumptions.
There are lots of nuances in this question as well. I'm still paying 80c more than if I bought one apple, so I should only buy two apples if I would actually use two apples.