
Apple Silicon implements tricky x86 behaviors in hardware for faster emulation - jsheard
https://twitter.com/never_released/status/1288660944885821442
======
jsheard
[https://github.com/saagarjha/TSOEnabler](https://github.com/saagarjha/TSOEnabler)

------
vardump
Shouldn't the ARMv8 LDAR (Load-Acquire) and STLR (Store-Release) instructions
be enough to emulate the x86 memory model? What did I miss?

Or would that entail more overhead compared to just having a TSO memory model?
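
For reference, a minimal sketch (in Rust, not from the thread) of the
message-passing ordering that x86 code gets from ordinary loads and stores
under TSO, and that an emulator on ARM would have to preserve. The function
name `message_passing` and the payload value are made up for illustration. On
AArch64, compilers typically lower the `Release` store and `Acquire` load
below to STLR and LDAR (or LDAPR).

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs one producer/consumer handoff and returns the value the consumer saw.
/// (Hypothetical helper, for illustration only.)
fn message_passing() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let (d, r) = (Arc::clone(&data), Arc::clone(&ready));
    let producer = thread::spawn(move || {
        d.store(42, Ordering::Relaxed); // write the payload
        r.store(true, Ordering::Release); // release store publishes the payload
    });

    let consumer = thread::spawn(move || {
        // The acquire load pairs with the release store above.
        while !ready.load(Ordering::Acquire) {
            std::hint::spin_loop();
        }
        // Once the flag is seen as true, the payload write is visible too.
        data.load(Ordering::Relaxed)
    });

    producer.join().unwrap();
    consumer.join().unwrap()
}
```

If both accesses to `ready` were `Relaxed`, a weakly ordered CPU would be
allowed to let the consumer see the flag before the payload. Under TSO,
ordinary loads and stores already rule that out.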

~~~
saagarjha
Too slow, probably. Either they'd need to do advanced dataflow analysis to not
have them everywhere (slow/complicated) or just insert them everywhere (slow).
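
A toy sketch of the "insert them everywhere" option, assuming a translator that
lowers each guest memory access one at a time. `GuestOp`, `lower`, and the
`hardware_tso` flag are hypothetical names for illustration; this is not a
description of how Rosetta actually works.

```rust
/// A guest (x86) memory access, reduced to the two cases that matter here.
/// (Hypothetical type, for illustration only.)
enum GuestOp {
    Load,
    Store,
}

/// Picks an AArch64 mnemonic for one guest memory access.
/// With hardware TSO enabled, plain loads and stores are already ordered
/// strongly enough; without it, every access conservatively becomes an
/// acquire load or a release store.
fn lower(op: GuestOp, hardware_tso: bool) -> &'static str {
    match (op, hardware_tso) {
        (GuestOp::Load, true) => "ldr",
        (GuestOp::Store, true) => "str",
        (GuestOp::Load, false) => "ldar",
        (GuestOp::Store, false) => "stlr",
    }
}
```

The conservative path is simple because it needs no analysis of which accesses
might be shared between threads, but every access pays for the stronger
instruction.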

~~~
vardump
I don't see how any kind of dataflow analysis could capture, for example,
one thread accessing another thread's stack. There are just too many ways to
get the pointer.

------
mcraiha
And with Apple it doesn't even have to be additional CPU instructions, since
Apple controls the whole SoC. If your upcoming code uses any Apple-specific
hardware (e.g. DSP, security chip, NN) through an Apple-only API, then it
basically becomes "very hard" to port. So goodbye to Hackintosh.

------
acoye
I knew it: the “Apple Silicon” branding allows them to de-normalize ARM at will.

~~~
rrss
How so?

TSO is a valid implementation of the ARM memory consistency model.

