
Is x86 at the end of the road? I don’t follow closely, but my impression is it’s been getting faster and faster ever since AMD stepped up and provided some competition with Ryzen.


Both AMD and Intel have released some truly impressive CPUs lately, and this month both of them are releasing their next-gen products (Ryzen 7000 and Raptor Lake respectively, I believe). The new Ryzen chips look very strong, and if Intel’s announced numbers are to be believed, Raptor Lake is going to be pretty great.

Where they fall down is on perf/watt, not raw performance. However, there are a lot of things that go into that difference, not just the ISA, so I’m not sure anyone has really settled whether x86 is fundamentally less efficient, or just less efficient at the moment due to current design choices and constraints.


> Where they fall down is on perf/watt, not raw performance

Raptor Lake is a pretty big jump in perf/watt. In just one generation, Intel is claiming that Raptor Lake at 65W delivers similar performance to Alder Lake at 250W. AMD made a similarly big jump in efficiency last year with Ryzen 6000 for mobile.


Intel x86 will make even greater leaps in the future. The E-cores (of which they added more this generation, but no (?) additional P-cores) are about as small as ARM cores, and the saved die space can be used for caches (Intel already increased cache sizes again on Raptor Lake) for both performance and power efficiency (as Apple shows). Then there will be 5nm, which is a big reason for Apple’s perf/watt lead.


> Is x86 at the end of the road?

No. AMD and Intel are the oligopolistic providers for an unbelievably vast software ecosystem that practically rules all computing outside embedded (and some legacy mainframes here and there). Are they going to throw away that market position just because the cleaner encoding of ARM or RISC-V would save an estimated low single-digit percentage of decoding power [1]?

[1]: https://www.usenix.org/system/files/conference/cooldc16/cool...


Top-of-the-line x86 decoders are 6 wide (with a limit of 64 aligned bytes on input, making average throughput around 4 per cycle, as admitted by Intel in their docs). AArch64 can be decoded as wide as you wish thanks to its fixed instruction size. There is no solution to this other than a fixed-length encoding.
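To illustrate why wide variable-length decode is hard (a toy C sketch with made-up instruction lengths, not a real decoder): instruction N+1’s start offset depends on instruction N’s length, so boundary finding forms a serial dependency chain, whereas a fixed 4-byte encoding lets every decode lane compute its start address independently. Real x86 front ends break the chain with predecode/length-marking hardware, which is where the extra cost comes from.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        /* hypothetical instruction lengths; x86 instructions are 1-15 bytes */
        const uint8_t lengths[] = {3, 5, 2, 7, 4, 1, 6, 3};
        size_t offset = 0;

        /* variable-length: each start offset depends on the previous length */
        for (size_t i = 0; i < sizeof lengths; i++) {
            printf("variable: insn %zu starts at byte %zu\n", i, offset);
            offset += lengths[i];
        }

        /* fixed 4-byte encoding (AArch64): every lane knows its start up front */
        for (size_t i = 0; i < sizeof lengths; i++)
            printf("fixed:    insn %zu starts at byte %zu\n", i, i * 4);

        return 0;
    }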


> There is no solution to this other than a fixed-length encoding.

This problem does not apply to RISC-V, where with the C extension every instruction is either 32-bit or 16-bit (two compressed instructions per 32-bit slot). The added complexity is negligible, to the point where if a chip has any cache or ROM in it, implementing C becomes a net benefit in area and power.
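For concreteness, here is roughly what length determination looks like per the RISC-V spec (a minimal C sketch covering only the 16-bit and 32-bit cases; longer encodings are reserved): only the low bits of the first 16-bit parcel are needed, which is why the extra decode logic stays tiny.

    #include <stdint.h>

    /* Length of a RISC-V instruction, from its first 16-bit parcel. */
    int rv_insn_length_bytes(uint16_t first_parcel) {
        if ((first_parcel & 0x3) != 0x3)
            return 2;   /* low two bits != 11: compressed (C extension) */
        if ((first_parcel & 0x1c) != 0x1c)
            return 4;   /* bits [4:2] != 111: standard 32-bit instruction */
        return 0;       /* >=48-bit / reserved encodings: not handled here */
    }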

ARMv8 AArch64 made a critical mistake in adopting a fixed 32-bit instruction size, a mistake you can see in practice in the L1 cache size Apple's M1 needed to compensate for the poorer code density.

L1 is never free. It is always very costly: its size dictates the area the cache takes, the latency of the cache, the clock it can achieve (which in turn caps the speed of the CPU), and how much power it draws.

As you mentioned decoder width: there's Ascalon [0], a RISC-V microarchitecture from Jim Keller's team at Tenstorrent that's 8-decode (like M1) and 10-issue. It isn't on the market yet, but it is bound to be among the first RISC-V chips targeting very high performance.

Note that at that size (8-decode implies lots of execution units, i.e. a relatively large design), the already-negligible overhead of the C extension is invisible. There are only gains to be had.

The C extension's decode overhead would only matter in the comically impractical scenario of a core with neither an L1 cache nor any ROM on the chip. Such a specialized chip would simply not implement C. Otherwise, it is a net win.

0. https://youtu.be/yHrdEcsr9V0?t=346


> There is no solution to this other than a fixed-length encoding.

Micro-op caching enters the room.

I mean, seriously, micro-op caching has been extensively used for over 20 years, including on ARM64 designs, although Apple's M1 doesn't have one.
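The idea, very roughly (a conceptual C sketch with made-up structure sizes and a stubbed decode step, not any vendor's actual design): decoded micro-ops are cached keyed by fetch address, so on a hit the front end bypasses the expensive variable-length decoders entirely.

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define UOP_CACHE_LINES 64
    #define UOPS_PER_LINE    6

    typedef struct { uint32_t op; } uop_t;    /* placeholder micro-op */

    typedef struct {
        bool     valid;
        uint64_t tag;                          /* fetch-block address */
        int      count;
        uop_t    uops[UOPS_PER_LINE];
    } uop_line_t;

    static uop_line_t uop_cache[UOP_CACHE_LINES];

    /* Return micro-ops for a fetch block: from the uop cache on a hit,
       otherwise decode (stubbed here) and fill the line for next time. */
    int fetch_uops(uint64_t fetch_addr, uop_t *out) {
        uop_line_t *line = &uop_cache[(fetch_addr >> 5) % UOP_CACHE_LINES];

        if (line->valid && line->tag == fetch_addr) {      /* hit: skip decode */
            memcpy(out, line->uops, line->count * sizeof(uop_t));
            return line->count;
        }

        int n = 0;
        out[n++] = (uop_t){ .op = (uint32_t)fetch_addr };  /* pretend decode */
        line->valid = true;
        line->tag   = fetch_addr;
        line->count = n;
        memcpy(line->uops, out, n * sizeof(uop_t));
        return n;
    }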


Almost like AMD is saving Intel from itself...

In any case, if Intel needs to diversify, an open ISA (RISC-V) does seem better than a proprietary mortal enemy (ARM).


Any Intel server comes with an ARM chip (at least any that I know of): the BMC.


And most modern AMD chips ship with ARM cores in them... https://en.m.wikipedia.org/wiki/AMD_Platform_Security_Proces...


Competition works wonders... having both of them (what happened to Cyrix?) keeps them honest to some extent.


No. It's not a nice ISA, but at the 20-billion-transistor scale that's not the pressing issue.



