

The multi-core debate in mobile - alexvoica
http://www.alexvoica.com/all-your-cores-are-belong-to-us/

======
alexvoica
There have been a few articles recently about whether SoCs for mobile devices
need to scale beyond a certain core count.

One side says that 8 cores and above is overkill and that building highly
scalable dual- and quad-core processors is the way forward:

http://www.forbes.com/sites/patrickmoorhead/2015/05/11/why-8-cpu-cores-in-smartphones-are-a-bad-idea-an-auto-industry-lesson/

Another side says that more cores offer more granularity for workloads and
that we shouldn't be afraid to move to 10- or 12-core processors:

http://www.androidauthority.com/why-8-and-10-cpu-cores-in-smartphones-are-a-good-idea-607894/

I tend to be in the first camp so I've linked to an article I wrote on the
topic.

It would be interesting to see what others here think.

~~~
yaantc
I note that companies who design their own cores (Apple, Qualcomm) tend to
stick to a limited number of intermediate cores. Only companies tied to ARM
implementations went the big.LITTLE way, the one that leads to more cores. And
most struggle with the power consumption of the big cores.

The way I look at b.L is that ARM leveraged their dominant position in mobile
to prepare the ground for their entry into servers. The A15, then the A57 and
A72, are a bit too big for mobile. To make them acceptable, there is b.L and more
cores. Apple and QCOM (with the Scorpion and Krait, not for chips where they
reuse ARM cores as is) made cores not as fast as the big ARM implementations,
but still very good. They're closer to the A12 and A17 from ARM, which arrived
after the A15 but are 32-bit, so no longer attractive. I would guess that once
the A57/A72 are paid for, a middle-of-the-road ARM implementation in the
spirit of the A17, but 64-bit, will come. That's what I'd like in my phone,
and no need for a zillion cores for me.

------
PaulHoule
The Forbes article is atrocious; the guy really doesn't know what he is
talking about. The analogy with cars is completely bogus, particularly when
the "8 cylinder engine" of CPUs was the Pentium 4.

The directly linked article is confused IMHO.

One kind of "12-core" mobile CPU extends the big.LITTLE concept to three
clusters: a high, a medium, and a tiny one. You are only running one cluster
at a time. The benefit is the same as big.LITTLE but more so.

Another possibility is to have 12 (or some other large number of) very little
CPUs and really run them at the same time. The advantage there is you can
lower the clock and voltage a lot, and with CMOS that means a huge drop in
power consumption. Now for the "embarrassingly parallel" jobs that I do a lot
of, it is pretty easy to get a near 8x speedup if you are using
ExecutorService in Java (and not so easy if you are playing with Fork/Join,
Actors or some other paradigm). When you get to 16+ cores, even jobs that are
easy to parallelize start to hit the wall.
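
For readers who have not used it, here is a minimal sketch of the
ExecutorService pattern referred to above. It is an illustration only: the
class name ParallelSum, the method sumOfSquares and the toy workload (summing
squares) are assumptions made up for this example, not code from any of the
linked articles. The work is split into one independent chunk per worker
thread, which is what makes the job "embarrassingly parallel".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; the name and workload are illustrative only.
class ParallelSum {

    // Stand-in for an independent unit of work (no shared state).
    static long square(long i) {
        return i * i;
    }

    // Sums square(i) for i in 1..n using a fixed pool of `workers` threads.
    static long sumOfSquares(long n, int workers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Ceiling division so the chunks cover the whole range 1..n.
            long chunk = (n + workers - 1) / workers;
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final long lo = w * chunk + 1;
                final long hi = Math.min(n, (w + 1) * chunk);
                // Each task touches only its own range, so no locking is needed.
                tasks.add(() -> {
                    long partial = 0;
                    for (long i = lo; i <= hi; i++) {
                        partial += square(i);
                    }
                    return partial;
                });
            }
            // invokeAll blocks until every task has completed.
            long total = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(sumOfSquares(1_000_000L, workers));
    }
}
```

How close this gets to a linear speedup depends on the workload and the
hardware. A loop this small is likely dominated by thread start-up cost, so
treat it as a shape to copy rather than a benchmark.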

You might be able to get video encoding and decoding to go very wide, but for
common tasks like web browsing and word processing it is hard to translate a
large number of cores into more responsiveness. It is an active research area
and Mozilla would like to do it, but it's hard.

The SMT technology you are advocating is a different animal; the big advantage
is that it is a relatively easy way to "eat" the latency that comes from the
memory hierarchy. If you have a large number of instructions in flight from
different threads, high latency doesn't block throughput so much. Funnily
enough, the most interesting SMT systems are on the server side, such as Sun's
Niagara and some of the chips that Cray ships these days.

~~~
alexvoica
If this many-core, multiple cluster architecture implements HMP then you can
theoretically run all cores across every cluster.

I think we need a broader discussion about which mobile apps will use
many-core/multi-core processors. If we are talking video processing, then why
do it on the CPU if you have a dedicated video encoder/decoder already there?

One situation where you might want to use a mobile CPU for video decoding is
for small window, low-res preview modes (some browsers and mobile apps use it)
but even then, do you really need eight or ten CPUs to do the job?

