
The Future of Core, Intel GPUs, 10nm, and Hybrid x86 - MikusR
https://www.anandtech.com/show/13699/intel-architecture-day-2018-core-future-hybrid-x86
======
Symmetry
>Q: In the demo of FOVEROS, the chip combined both big x86 cores built on the
Core microarchitecture and the small x86 cores built on the Atom
microarchitecture. Can we look forward to a future where the big and little
cores have the same ISA?

>R: We are working on that. Do they have to have the same ISA? Ronak and the
team are looking at that. However I think our goal here is to keep the
software as simple as possible for developers and customers. It's a challenge
that our architects have taken up to ensure products like this enter the
market smoothly. We’ll also have a packaging discussion next year on products
like this. The chip you see today, while it was designed primarily for a
particular customer to begin with, it’s not a custom product, and in that
sense will be available to other OEMs.

This seems like a big issue to me going forward. The fact that Intel fuses off
various ISA features on various architectures for market-segmentation reasons,
or simply doesn't include them in the case of Atom, means they'll have a harder
time of it if they want to pursue big.LITTLE.

~~~
monocasa
Ironically, Atom also gets ISA extensions that don't make their way into the
bigger cores, like the Intel SHA extensions. I think the idea was that the
bigger cores were already smart enough to not really benefit from dedicated
SHA instructions, but they did help the simpler Atom cores.

Of course, that now means the big cores aren't a superset of the small cores,
feature-wise.

~~~
blattimwind
Ryzen shipped those extensions and they improve SHA performance quite a lot.

------
robbyt
Security s/fixes/features/g coming in 2020. Does this mean HW fixes for
spectre/meltdown?

~~~
Chyzwar
Whiskey Lake already has some fixes for meltdown/foreshadow.

[https://www.anandtech.com/show/13301/spectre-and-meltdown-in...](https://www.anandtech.com/show/13301/spectre-and-meltdown-in-hardware-intel-clarifies-whiskey-lake-and-amber-lake)

~~~
rurban
Those "some fixes" are not enough. It needs a cache redesign, which will come
too late. I also don't see a second C3 dedicated to the kernel.

------
ksec
I think this settles the debate about whether Apple will be moving to its own
ARM Mac.

AMD will need to execute its plan to perfection. Hopefully they gain enough
market share to make a difference and make enough profit for long-term
survival. Fingers crossed.

~~~
npunt
My read is that Apple is most definitely on track for an ARM Mac and this is
too little, too late (I assume that's your reaction as well). There doesn't
seem to be much IPC increase, mostly just catching up on both features and
process.

I'm sure these Intel chips will make their way into Macs whenever they ship
(2019 or beyond), but it's likely * Cove is the last new Intel arch Apple uses
if they can release some A-series Macs by ~2020. It seems like the _only_
interesting IP Intel may have for Apple is Optane (theoretically), and the
only value Intel offers Apple beyond the next 12 months is Xeons slated for
Pro machines.

When you consider that the trend is integrating everything into SoCs, and
their features are more tightly designed according to actual usage up the
stack in OS and apps and use cases, and they're shipped on a strict yearly
cadence, and their R&D benefits from a massive market of iDevice sales, and
their value is inherently better captured via device sales and direct customer
relationship... it's kind of nuts that Apple is still with Intel.

That's not even accounting for the 10nm delay, Intel's weird market
segmentation of features, Intel's margins, or Intel's clear organizational
issue (per the article) of decoupling IP & process, which adds risk to the
roadmap.

Even if Intel gets to EUV before TSMC (they won't), it's just not structured
right. This can't last much longer.

~~~
ksec
Apple would still need SoCs spanning 5W (MacBook), 7-15W (MacBook Air),
25-45W (MacBook Pro), 65-95W (iMac), and 95-200W (iMac Pro and Mac Pro), all
of which combined total _only_ ~20M units per year.

I hope I am wrong, but I'm not sure it makes any financial sense.

~~~
npunt
I might simplify that further into three segments with different silicon
strategies:

Low (~15m units) - MacBooks except 15" MBP

Med (~2-4m units) - 15" MBP, iMac, and Mac mini

High (<1m units) - iMac Pro and Mac Pro

In LOW, silicon-wise it's essentially free since they already have a SoC
powerful enough with an A13X+. They can always bump up the wattage/frequency
and/or decrease the chassis size to define a slightly new market segment where
compute expectations are lower (e.g. the Air approach).

Between LOW & MED is the mobile workstation use case of the higher spec 15"
(or larger) MBP. Cheapest solution: add a discrete GPU.

In MED, they could still use A13X+ and add discrete GPU at higher SKUs.
Alternately, they could create a beefier chip ("A13XL"?) that would just bolt
more cores onto the CPU/GPU.

Between MED & HIGH would be the semi-pro use case of the higher-spec 27" iMac
(or even certain use cases of the Mac mini). Hard to say what the cheap
solution is here beyond a better GPU.

In HIGH, it feels like Intel is here to stay for a while. These use cases
still really need the beefy Xeons and involve lots of pro-level edge cases
which favor staying with x86, ECC, etc., and cost-wise it's very low volume.

On these Pro machines where price is less of an object, and to keep up with
the rapid feature development roadmap they get from A-series, it feels like
they'd do the 'why not both' strategy and throw in an A13X+ as the primary
processor and a Xeon as a co-processor with its own RAM, just like GPUs are
co-processors today. The OS, everyday apps, and specific tasks that favor the
A-series would run on the A13X+, while certain instructions and legacy x86
programs/VMs/etc would be routed to the Xeon.

\---

The more I think about it the more this co-processor approach really makes
sense for the next 5 years of Pro-level Macs, since it avoids the cost of
creating new low-volume, high complexity silicon.

It's fun to speculate because it represents quite an interesting business +
technical + market-segmentation challenge. They don't want to siphon off or
alienate Mac users any further than they already have, but they also need to
start reaping the benefit of their highly competitive, high-volume chips.

~~~
ksec
1\. You ignore the tape-out cost on a leading-edge node, with yields
unforeseeable due to higher voltage and frequency usage.

2\. Adding a discrete GPU still requires a new set of drivers that has not
previously been written for an ARM Mac.

The 5M-unit Mac will require additional hundreds of millions in investment,
time, and testing.

The co-processor approach also requires additional OS and software support,
plus testing cost.

The only way I see is for Apple to employ a strategy similar to AMD's chiplet
design, where a small CPU core die could be used for lots of different
configurations.

We will have to see how EPYC 2 performs to judge whether it is a good solution.

~~~
npunt
I considered chiplets but probably dismissed them prematurely, based on my
understanding that they introduce some efficiency loss, and it would be
un-Apple-like to make their iDevice cash cow less efficient in order to serve
the Mac niche. They do open up a ton of flexibility in using different nodes,
configs, improved yields, etc. It seems they're actively researching various
methods [1], and the whole semi field is moving that way, so I agree with you:
if chiplets work well, that's probably the way they'd build chips for MED and
HIGH use cases. I revise my assessment after further thought :)

There's definitely software work needed to make Intel chips co-processors,
but I'm not sure it's a monumental task. Some of this is already happening
today with T2 chips handling video encoding/decoding in certain apps, and of
course discrete GPUs already act as co-processors with their own resources for
many tasks, so this approach isn't unprecedented. In the co-pro idea I'm
proposing, the OS runs on arm64, and depending on the binary the system routes
it to either x86 or arm64. Developers would have some control over that, and
legacy x86 binaries would automatically run on x86 on machines that had Intel
chips.

Discrete GPU wouldn't require significant driver work if discrete GPUs were
only available on machines that had Intel chips. That might be a good
practical point to differentiate if the chiplet approach lets them design even
higher powered GPUs for LOW and MED products.

Again, I don't see this as much of a cost issue as a strategic issue. They can
spread costs over 5 million Macs in a multi-year transition, and this would be
partially or fully offset by cheaper hardware and the ability to ship
differentiated hardware that better suits markets Apple wants to pursue.
Unless costs are massive, it seems like not a hard thing to justify.
Personally I just can't see the x86 instruction set becoming _more_ important
in the next decade-plus when it's already been eroded on the mobile and server
fronts, matters less because of improved software tooling, and Intel is
perpetually floundering. Feels like a bad bet.

The simplest path / null hypothesis to disprove is that LOW/MED get A-series
chips, HIGH gets Intel + dGPU + T-series, and multiple versions of macOS and
multiple binaries of apps are shipped. That's only acceptable for a few years,
I think, due to the higher maintenance cost. If the Intel co-processor
approach is not difficult software-wise, it seems beneficial because it might
simplify OS/kernel/system maintenance, as that would now just run on A-series
chips. Putting Intel in an easily removable VM-like corner would give Apple
more flexibility to ship future products that serve different market segments
(e.g. the oddball Mac mini), and potentially extend the decision of if/when to
drop Intel entirely.

The point about frequency/voltage is an interesting one. I figured there'd be
at least a 10-30% possible difference, accepting efficiency loss at higher
frequencies (less of an issue in a larger chassis). At least in the past, the
A9 ran at 1.85GHz and the A9X at 2.26GHz (a ~22% uplift), but A10 vs A10X and
A12 vs A12X [2] are equivalent, so perhaps you're right that on leading-edge
nodes they're just going to optimize for the max efficient frequency. Anyway,
that's all single-thread CPU, and as A-series chips get closer to desktop IPC,
the differentiator for larger machines is primarily more threads/RAM/GPU.

Good chatting, thank you for insights.

[1] [https://www.macrumors.com/2018/06/12/apple-3d-chip-packaging...](https://www.macrumors.com/2018/06/12/apple-3d-chip-packaging-patents/)

[2] [https://www.anandtech.com/show/13661/the-2018-apple-ipad-pro...](https://www.anandtech.com/show/13661/the-2018-apple-ipad-pro-11-inch-review/4)

------
rurban
So we'll have to wait until 2020(?) to get a usable, secure CPU with a
redesigned cache? Sorry, too late. By then AMD will already have taken over.

    
    
    Willow Cove | 2020? | 10 nm? | Cache Redesign, New Transistor Optimization, Security Features

