There's some exciting preliminary experimentation with an x32 ABI variant for GHC (Haskell). Benchmarks of the hacky first attempt showed a 15% perf boost. I'm hoping that an x32 variant with largish heap support will be in the ghc 7.10 release (which won't be for another 8-12 months).
Btw: ghc always welcomes new contributors! Getting started can be as simple as trying to build ghc HEAD on your favorite platform and reporting any test suite bugs or bugs in your own experimentation. If you get stuck or confused, the ghc irc channel on freenode is full of folks happy to help out too!
With a potential 32% boost in integer performance, I've been excited to try out x32 for years now. I was hoping this post meant I could finally use it. After re-reading the Wikipedia article, I was left wondering: can my Ubuntu 12.04.3 run it? Where can I get the x32 software? According to Phoronix, no tier-one Linux distribution is shipping any official x32 images yet.[1]
With all the talk of a 64-bit CPU in the new iPhone 5s, which has only 1GB of RAM, I wonder whether it will be subject to the 64-bit memory address penalty, or whether iOS is using something similar to x32.
Note that the boost is only that dramatic for programs that make very heavy use of pointers. For instance, the benchmark that gets a 32% boost [1] requires "about 100 or 190 MB" of memory on 32/64 bit, respectively. If we call the memory used for pointers A and the memory for other data B (both in MB, as measured in the 32-bit build, with pointers doubling in size on 64 bit), then this means that
A + B = 100
2A + B = 190
=> A = 90, B = 10
So if 90% of your memory usage is pointers, this is great. But I don't think that this is a typical workload. Otherwise, the performance advantage is not as clear.
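To make the arithmetic above easy to redo for other programs, here is a minimal sketch in Python. The function name `split_pointer_memory` is made up for illustration, and the model assumes that pointers are the only data that doubles in size between the 32-bit and 64-bit builds, which ignores alignment padding and allocator overhead.

```python
def split_pointer_memory(mem_32bit_mb, mem_64bit_mb):
    """Solve A + B = mem_32bit_mb and 2A + B = mem_64bit_mb for (A, B).

    A is the memory taken by pointers in the 32-bit build, B is everything else.
    Assumption: only pointers grow when moving to 64 bit.
    """
    pointer_mb = mem_64bit_mb - mem_32bit_mb  # subtracting the equations gives A
    other_mb = mem_32bit_mb - pointer_mb      # B = (A + B) - A
    return pointer_mb, other_mb


pointer_mb, other_mb = split_pointer_memory(100, 190)
pointer_share = pointer_mb / (pointer_mb + other_mb)
print(pointer_mb, other_mb, pointer_share)  # 90 10 0.9
```

Plugging in the memory numbers from your own 32-bit and 64-bit builds gives a rough idea of how pointer-heavy a workload is before you go to the trouble of setting up x32.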
Wikipedia notes that "On average x32 is 5–8% faster on the SPEC CPU integer benchmarks compared to x86-64 but it can as likely be much slower. There is no speed advantage over x86-64 in the SPEC CPU floating point benchmarks."
Perhaps the reason that no tier-one Linux distro ships with x32 images is that the additional complexity of supporting an additional ABI is not seen to be worth the modest performance increases.
Another relevant benchmark is x32 vs. x86. In a memory constrained environment you may be forced to use x86. x32 gives you the ability to get the performance benefits of x86-64 (possibly plus some) while staying in your memory constraints.
Does this break vectorization? Specifically, saxpy (Y=M*X+B) requires extra operations to dereference every element and won't let you insert an FMA instruction.
ARM64 has 64 bit pointers, so even though only 33 bits of those are being used for addressing memory, iOS still suffers the increased memory penalty[1].
On the other hand, it's not like the bits of the pointer which are not used for addressing memory are necessarily wasted.
You can use them as tagged pointers [1], and I think that Apple's Obj-C runtime actually does this. (I don't know for sure, as it's not really something I'm interested in, but I think I saw an article about it. I could be wrong.)
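For anyone who hasn't seen tagged pointers before, here is a rough sketch of the idea, using plain Python integers to stand in for 64-bit machine words. It assumes only the low 48 bits carry the address (the constant `ADDRESS_BITS` and the function names are hypothetical), and it is not a description of what Apple's runtime actually does.

```python
ADDRESS_BITS = 48                       # assumption: bits actually used for addressing
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1  # selects the address part of the word
MAX_TAG = (1 << (64 - ADDRESS_BITS)) - 1


def tag_pointer(address, tag):
    """Store a small tag in the unused high bits of a 64-bit pointer word."""
    if address & ~ADDRESS_MASK:
        raise ValueError("address does not fit in the addressable bits")
    if not 0 <= tag <= MAX_TAG:
        raise ValueError("tag does not fit in the spare bits")
    return (tag << ADDRESS_BITS) | address


def untag_pointer(word):
    """Split a tagged word back into (address, tag)."""
    return word & ADDRESS_MASK, word >> ADDRESS_BITS


word = tag_pointer(0x1_0000_1000, 0x2A)
address, tag = untag_pointer(word)
print(hex(address), hex(tag))
```

The catch is that the word has to be masked before it is dereferenced, so the spare bits are not entirely free.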
Yes, those bits are put to very good use! In ARM64, some bits are actually used to store the reference count for the object, making -retain/-release calls super fast.
The wackiest thing about that is... if there are multiple pointers to an object (kind of the point of reference counting) and the reference count is part of the pointer... well... I don't really see that working well at all. OTOH I could see there being some optimization that means some callers might not need to modify the global reference count, only the one in their reference.
If I'm not mistaken, those extra bits are pulled from the "isa" pointer, present in all objects and similar to a C++ vtable pointer. That is, they're in the object itself, not the pointers to it.
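A toy model of that arrangement, again in Python with integers standing in for machine words. The field layout here (48 bits of class address, the rest used as a counter) and all the names are invented for illustration and are not Apple's real layout. The point it tries to show is that the count lives in the object's own `isa` word, so every reference to the object sees the same count.

```python
CLASS_BITS = 48                      # hypothetical: low bits hold the class address
CLASS_MASK = (1 << CLASS_BITS) - 1
COUNT_ONE = 1 << CLASS_BITS          # adding this bumps the packed counter by one
MAX_COUNT = (1 << (64 - CLASS_BITS)) - 1


class Obj:
    """An object whose single 'isa' word packs a class address and a count."""

    def __init__(self, class_address):
        self.isa = class_address & CLASS_MASK  # extra-retain count starts at 0


def retain_count(obj):
    return obj.isa >> CLASS_BITS


def class_address(obj):
    return obj.isa & CLASS_MASK


def retain(obj):
    if retain_count(obj) == MAX_COUNT:
        raise OverflowError("inline count is full")
    obj.isa += COUNT_ONE


def release(obj):
    if retain_count(obj) == 0:
        raise ValueError("release without a matching retain")
    obj.isa -= COUNT_ONE


a = Obj(0x1000_2000)
b = a                      # a second reference to the same object
retain(a)
retain(b)
print(retain_count(a), retain_count(b), hex(class_address(a)))  # 2 2 0x10002000
```

Since both references go through the same object, there is no per-pointer count to keep in sync, which is the problem the parent comment was worried about.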
FWIW I've been running x32 in production for over a year now, on Gentoo. The system has its quirks, such as busted busybox/iptables support. Compiling these two packages against amd64 allows them to run, which is totally fine. With x32 you get the best of both worlds.