But your claim might be stronger, though I'm not sure in what sense it is stronger.
In Pony, it holds for the entire program. So, if A1 sends a message to A2, and then A1 sends a message to A3, and in response to the second message A3 sends a message to A2... in Pony, A2 is guaranteed to get the message from A1 before the message from A3.
Locally, a message pass never fails and always delivers the message directly. It can't be in-flight, so m1 is placed in the mailbox before m2 happens, and thus m3 will always come later.
In the distributed setting, A1, A2 and A3 might be living on three different nodes, and then m1 might get delayed on the TCP socket while m2 and m3 goes through quickly. Thus we observe the reverse ordering, m3 followed by m1. This is probably why the causal ordering constraint is dropped in Erlang. Is pony distributed?
On the other hand I could imagine that this could also cause problems: If you have a large program, could then this causality cause some messages to be delayed for a (too) long and not directly visible time in order to achieve the causality guarantee? But the claim was that there is no runtime overhead.
You have an actor A13 that communicates with actor A2. The guarantee allows you to break up A13 into the two actors A1 and A3 that now both communicate with A2 and be done with it.
In erlang, you might not've done so, because you know nothing about message ordering in non-tree-shaped topologies.