
Arm announces its new premium CPU and GPU designs - ChuckMcM
https://techcrunch.com/2019/05/26/arm-announces-its-new-premium-cpu-and-gpu-designs/
======
ChuckMcM
This is interesting to me: lots of noise around "machine learning" in the GPU
rather than graphics, which kind of validates Google's TPU, Apple's AI co-
processors, and Nvidia's Jetson. Saw a system at Maker Faire that does voice
recognition without the cloud (MOVI) and I keep thinking having non-privacy-
invading computers you could talk to would be a useful thing. Perhaps ARM will
make that possible.

~~~
dmix
How much space would a real-world voice "model" that lives locally on a
device require? I don't know much about machine learning's actual
implementations within software; I've always been curious how you go from
(say) TensorFlow -> embedded within real-world software / APIs.

~~~
cpeterso
Earlier this year, Google introduced an Android "Live Transcribe" feature that
runs locally without the need for an internet connection:

[https://www.theverge.com/2019/2/4/18209546/google-live-
trans...](https://www.theverge.com/2019/2/4/18209546/google-live-transcribe-
sound-amplifier-accessibility-android-deaf-hard-hearing)

~~~
kyriakos
And it's very accurate. I tested it with English and with Greek, which
dictation systems usually handle very badly, and was surprised how well it
recognised every word.

------
hyperpallium
The chart has 4 data points, but only 3 CPUs. The Cortex-A76 is in the middle.
Can anyone explain?

Also, in the wider socioeconomic picture, everyone once got the same
computational technology eventually, because better was also cheaper.

Now, computation is stratifying, like any normal industry, where you get what
you pay for. This ending of egalitarianism is bad.

~~~
clouddrover
> _where you get what you pay for_

I think that's always been true for CPUs and GPUs. The faster ones were always
the more expensive ones.

~~~
onion2k
To me "you get what you pay for" implies a linear relationship between cost
and speed, and that has never really been the case. Sometimes you pay _a lot_
more money for a little more speed.

~~~
ska
There has always been a nonlinear ramp up for the latest and greatest -
typically targeted to enterprise applications that are more time sensitive
than cost sensitive.

If you are cringing at the idea of a $3k+ CPU (or $9k+ GPU), it quite
literally wasn't built for you.

------
Causality1
>the company argues that 85 percent of smartphones today run machine learning
workloads with only a CPU or a CPU+GPU combo.

Gotta admit I'm not real clear on what my phone does that needs on-device
machine learning.

~~~
antpls
If you need actual examples running right now:

\- Recent Snapchat and Instagram selfie filters

\- Google Keyboard's translation and prediction

\- Google Translate's Lens

\- Google Assistant's voice recognition

\- Google Messages' instant replies

Almost all of them are inference workloads. I believe only Google Keyboard
does on-device training in the background when the phone is charging.

~~~
vitorgrs
A few more:

\- Photos and a few other things on iOS are all local. Same on Windows 10
(the Photos app, and keyboard suggestions too, I believe)

\- Microsoft Edge's page suggestions (really, it's an ONNX model)

\- OCR Text Recognition on Windows

------
sigmonsays
can someone explain to me why I want ML specific processor features or chips
in my phone?

I thought ML required massive amounts of data to be taught, most of which
makes more sense in the cloud.

Am I way off here?

~~~
btown
Training a model does require massive data and compute, but evaluating/using
an already-trained model (e.g. running a Hot Dog/Not Hot Dog classifier) can
be done on mobile hardware. Accelerating this could, for instance, allow it to
run in real time on a video feed.
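
As a rough sketch of that distinction (plain NumPy with made-up weights, not
any real mobile framework): once training is done, a classifier is just a
fixed set of numbers, and inference is a handful of multiply-adds that a
phone can run cheaply:

```python
import numpy as np

# Hypothetical pre-trained weights for a tiny 2-layer classifier.
# On a phone these would be loaded from a model file, not trained on-device.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(64, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 2)), np.zeros(2)

def infer(x):
    """Forward pass only: a couple of matrix multiplies and a ReLU."""
    h = np.maximum(x @ W1 + b1, 0)   # hidden layer
    logits = h @ W2 + b2
    return int(np.argmax(logits))    # e.g. 0 = "not hot dog", 1 = "hot dog"

frame = rng.normal(size=64)          # stand-in for extracted image features
label = infer(frame)
```

The expensive part (finding W1/W2) happened once, elsewhere; the on-device
part is just this forward pass, which is what the new silicon accelerates.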

------
fxj
How does this NPU compare to the other vendors (Google TPU (3 TOPS?), Intel
NCS2 (1 TOPS), Kendryte RISC-V (0.5 TOPS), Nvidia Jetson (4 TOPS))? Can you
use TensorFlow networks out of the box like the others provide?

------
KirinDave
Does someone have a mirror of this article that doesn't seize up if you refuse
to display ads?

~~~
jakeogh
Disable JS. I leave it off by default, the web is drastically nicer without
it.

~~~
KirinDave
I tried this and the full article didn't render.

~~~
jakeogh
I'm on surf: [http://surf.suckless.org/](http://surf.suckless.org/) (webkit2)
and it rendered nicely, inline images and all.

Surf has a keybinding Ctrl-Shift-S to toggle JS. Launch it with -s to disable
by default.

------
ksec
I think it is worth pointing out that this new CPU, possibly landing in
flagship Android phones in late 2019 or early 2020, would still only be equal
to or slower than the Apple A10 used in the iPhone 8.

Assuming Apple continues as it has in the past, dropping the iPhone 7 and
moving the iPhone 8 down to its price range, they would have an entry-level
iPhone that is faster than 95% of all Android smartphones on the market.

~~~
Liquid_Fire
Isn't the entry level iPhone also significantly more expensive than most
Android phones?

~~~
ksec
Depends on the brand, although that will most likely be true since the top 5
Android phone makers are all Chinese except Samsung.

------
narnianal
What even is an "AI chip"? What's the difference from a GPU? As long as
nobody can explain that, I have big doubts that it would be more than a
GPU + marketing. So no big deal if they don't provide one.

~~~
fxj
This link gives a nice explanation of the google nn-processor aka TPU:

[https://cloud.google.com/blog/products/gcp/an-in-depth-
look-...](https://cloud.google.com/blog/products/gcp/an-in-depth-look-at-
googles-first-tensor-processing-unit-tpu)

It boils down to:

CPUs: 10s of cores

GPUs: 1000s of cores

NNs: 100000s of cores

NNs have very simple cores (fused multiply-add and look-up table functions)
but can run many of them in one cycle.

FTA: Because general-purpose processors such as CPUs and GPUs must provide
good performance across a wide range of applications, they have evolved myriad
sophisticated, performance-oriented mechanisms. As a side effect, the behavior
of those processors can be difficult to predict, which makes it hard to
guarantee a certain latency limit on neural network inference. In contrast,
TPU design is strictly minimal and deterministic as it has to run only one
task at a time: neural network prediction. You can see its simplicity in the
floor plan of the TPU die.
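
A rough illustration of what those simple cores do: a matrix multiply built
purely from multiply-accumulate steps (sketched sequentially in NumPy here; a
real TPU runs enormous numbers of these in parallel in a systolic array):

```python
import numpy as np

def mac_matmul(A, B):
    """Matrix multiply expressed only as multiply-accumulate steps,
    the single operation an NN-accelerator core implements in hardware."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    acc = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for p in range(k):
                acc[i, j] += A[i, p] * B[p, j]  # one fused multiply-add
    return acc

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
assert np.allclose(mac_matmul(A, B), A @ B)
```

Since inference is dominated by exactly this operation, hardware can drop
everything else (branch prediction, caches, out-of-order machinery) and spend
the die area on MAC units, which is why the latency is so predictable.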

------
lone_haxx0r
Unreadable. As soon as I click on the (X) button, the article closes and the
browser goes back to the root page of
[https://techcrunch.com/](https://techcrunch.com/).

There's no way to avoid clicking on it, as it follows you around and grabs
your attention, not letting you read.

~~~
founderling
Even when you fight that annoyance with a content blocker, the page itself is
aggressive to no end.

I scrolled down to see how long the article is. That somehow _also_ triggered
a redirect to the root.

How is it possible that the most user-hostile news sites often get the most
visibility? Even here on HN, which is so user friendly. What are the
mechanics behind this?

The url should be changed to a more user friendly news source. How about one
of these?

[https://liliputing.com/2019/05/arm-launches-
cortex-a77-cpu-m...](https://liliputing.com/2019/05/arm-launches-
cortex-a77-cpu-mali-g77-gpu-and-arm-ml-npu.html)

[https://hexus.net/tech/news/cpu/130757-arm-releases-
cortex-a...](https://hexus.net/tech/news/cpu/130757-arm-releases-
cortex-a77-cpu-next-gen-mobile-devices/)

[https://www.xda-developers.com/arm-cortex-a77-cpu-
announceme...](https://www.xda-developers.com/arm-cortex-a77-cpu-
announcement/)

[https://www.theregister.co.uk/2019/05/27/arm_cortex_a77/](https://www.theregister.co.uk/2019/05/27/arm_cortex_a77/)

[https://venturebeat.com/2019/05/26/arm-reveals-new-cpu-
gpu-a...](https://venturebeat.com/2019/05/26/arm-reveals-new-cpu-gpu-and-
machine-learning-processor-for-5g-world/)

~~~
looeee
> Even here on HN which is so user friendly.

Except on mobile. The UI elements are tiny, try closing ten comments without
clicking the timestamp by mistake at least once.

~~~
wongarsu
And to vote on mobile I have to zoom in because the buttons are way too close
to each other and give no feedback about which of the two was pressed (and no
way to find out after the fact).

~~~
kgermino
One tip: when you upvote or downvote, a link is added to the comment's
header. It will either be “unvote” or “undown” depending on whether you up or
downvoted the comment.

------
amirmasoudabdol
Sad autocorrection of "ARM" into "Arm" in the title and the text!

~~~
saagarjha
Arm is the company name: [https://www.arm.com](https://www.arm.com)

~~~
amirmasoudabdol
It was always the company name; the logo is lowercase _now_ , but the
abbreviation should still be uppercase.

~~~
saagarjha
But I don't think the article is using the abbreviation?

------
bufferoverflow
del

~~~
MzxgckZtNqX5i
Wrong thread perhaps?

------
holstvoogd
Hmm, I feel ARM performance is a bit like nuclear fusion: it's always the
next generation that will deliver an order-of-magnitude performance increase.
Yet somehow ARM single-core performance is still shit compared to x86. (No
matter how much I hope and pray for that to change, cause x86 needs to die)

~~~
pzo
I benchmarked a few image processing algorithms (OpenCV) on my iPhone XS. I
was surprised processing was faster than on my quad-core i7 MacBook Pro 2012.
Sure, the CPU is 6 generations behind, but there haven't been many
performance improvements in the last 7 years on x64. Probably around +~50%
single thread.

~~~
runeks
That’s six years of difference, though. That’s not a fair comparison.

How much has x86 performance increased over the past six years?

~~~
pzo
Not that much. Each generation there was maybe a 7% perf improvement, so in 6
years: 1.07^6 ≈ 1.50. Improvement came mostly from adding more cores.

This ~1.5x single-thread performance improvement matches the stats from the
geekbench.com benchmark:

My MacBook Pro 2012 CPU, an i7-3615QM => 3093 single thread [1]

The equivalent current-gen CPU with the same TDP (45W) and similar clock
(2.3GHz), an i7-8750H => 4617 single thread [2]

[1]
[https://browser.geekbench.com/processors/741](https://browser.geekbench.com/processors/741)
[2]
[https://browser.geekbench.com/processors/2144](https://browser.geekbench.com/processors/2144)
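
As a quick sanity check, the compound-growth arithmetic above and the
Geekbench ratio between the two cited CPUs do line up:

```python
# ~7% per-generation improvement compounded over 6 generations.
six_gen_gain = 1.07 ** 6
print(round(six_gen_gain, 2))   # ~1.5

# Geekbench single-thread ratio for the two CPUs cited above.
ratio = 4617 / 3093
print(round(ratio, 2))          # ~1.49
```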

------
The_rationalist
When will deep learning frameworks get a reality check and decide to support
OpenCL/SYCL? All this hardware is useless silicon until then.

~~~
Joky
Tensorflow has the XLA compiler for targeting accelerators. In practice the
integration isn't necessarily amazing right now, but the plan is to make it
much better and able to generalize more easily.

Disclaimer: I work on the MLIR/XLA team.

~~~
The_rationalist
The OpenCL TensorFlow issue is still open and no Google dev has shown
interest. Still, I'm curious about your progress on this and when it could
target AMD / Intel / ARM.

------
thomasfl
The day Apple releases their first ARM-powered laptop will be a turning
point. This comment was written on an iPad, the best product to come out of
Apple to this day.

~~~
imperialdrive
While I don't agree with the first part, I do agree that the iPad is unique
and worthy of being the only Apple device my hard-earned money has ever gone
towards.

~~~
intricatedetail
Which one do you have? I have iPad 2 and it still works great. Is it worth
upgrading?

~~~
leadingthenet
iPad Air 2 or iPad 2? Because you should most definitely upgrade if you have
an old iPad 2. The new ones are truly amazing, especially the Pro models.

~~~
intricatedetail
I have the old one. It serves me well so I am intrigued.

~~~
xvector
I have heard nothing but universal praise for the Pro model. But no point in
upgrading just to upgrade - the beautiful thing about Apple products is that
they seem to work forever.

