I chose that example because it's easy to understand. Obviously, on modern processors with out-of-order execution and whatnot, you would need something a lot more elaborate.
Once you add the appropriate memory barriers, it looks a lot more "atomic".
Well, they force in-order memory access. That doesn't look terribly "atomic" to me, but I understand your point.
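
To make the distinction concrete, here is a rough C++11 sketch. It is my own illustration, not the example discussed above, and the function names `pass_message` and `count_atomically` are made up. The first function uses fences purely for ordering: they guarantee the plain write to `payload` is visible once the flag is seen. The second uses an atomic read-modify-write with relaxed ordering: no ordering guarantees at all, yet no increment is ever lost. Fences alone would not give you that second property.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Ordering only: the fences order the plain write to `payload`
// relative to the flag, so the consumer is guaranteed to read 42.
int pass_message() {
    int payload = 0;                    // ordinary, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();  // spin until the flag is set
        }
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;                 // safe: ordered after the producer's write
    });

    producer.join();
    consumer.join();
    return seen;
}

// Atomicity only: fetch_add is an indivisible read-modify-write,
// so no update is lost even though the ordering is relaxed.
long count_atomically(int threads, int increments) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return counter.load();
}
```

Calling `pass_message()` should return 42, and `count_atomically(4, 10000)` should return 40000. If you replaced the `fetch_add` with a separate load, add, and store, wrapping them in fences would still let two threads read the same old value and lose an update, which is why barriers by themselves don't look "atomic".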