
New Graphcore IPU Benchmarks - ingve
https://www.graphcore.ai/posts/new-graphcore-ipu-benchmarks
======
teilo
What does IPU stand for in this context? The only use I know of is Image
Processing Unit — silicon that is custom developed for cameras.
[https://en.wikipedia.org/wiki/Image_processor](https://en.wikipedia.org/wiki/Image_processor)

This is clearly not that.

Edit: Intelligence Processing Unit.
[https://www.graphcore.ai/products/](https://www.graphcore.ai/products/)

I assume that the Bionic neural engine in Apple's A11 and later is an example
of this?

~~~
rrss
"IPU" was made up by Graphcore to describe their processor, so you can choose
to use the term for as many or as few processors as you want. IMO, any
processor can be an "intelligence" processor.

------
rrss
Given the impressive inference results shown on this page, I find it
interesting that Graphcore did not participate in MLPerf Inference v0.5.

~~~
m0zg
It's probably because these are the best hand-picked results they could come
up with (and also, crucially, the _worst_ for the competition), whereas MLPerf
has a fixed set of tasks, so you don't get to do that. I've done this shit
professionally: if you're given full freedom wrt the choice of the task and
configuration, you can easily make it seem like your competition sucks beyond
belief. You can "prove" that a CPU is faster than a GPU without much trouble.
:-) And yet NVIDIA still dominates most rankings, even though in absolute
terms it's nowhere near the fastest or the most efficient option (that'd be
the TPU, on some tasks, at the moment). With so many performance cliffs in the
hardware, tooling is super important, and NVIDIA is the only viable
acceleration option that has any meaningful tooling at all.

Caveat emptor: the published numbers (_any_ numbers, not just Graphcore's) are
mostly bullshit unless code is also published and hardware is available for
independent measurement. There's no way you're getting the claimed 13 TFLOPS
out of your shiny new NVIDIA GPU. Take an off-the-shelf ResNet-50 (about 4
GFLOPs per sample) and witness it run at about 700 samples per second, which
pencils out to 2.8 TFLOPS, not 13. Still amazing (and still easily 10x the
high-end CPU throughput), but perf claims are often exaggerated by as much as
10x even by big names, let alone startups.
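
For anyone who wants to redo that back-of-the-envelope math, here is a minimal Python sketch. The function name and constants are illustrative, and the inputs are just the rough figures quoted above (about 4 GFLOPs per sample, about 700 samples per second, a claimed 13 TFLOPS peak), not measurements.

```python
def achieved_tflops(gflop_per_sample: float, samples_per_second: float) -> float:
    """Achieved throughput in TFLOP/s from per-sample cost and sample rate."""
    # GFLOP/sample * samples/s = GFLOP/s; divide by 1000 to get TFLOP/s.
    return gflop_per_sample * samples_per_second / 1000.0


# Rough figures from the comment above (assumptions, not benchmarks).
RESNET50_GFLOP_PER_SAMPLE = 4.0
SAMPLES_PER_SECOND = 700.0
CLAIMED_PEAK_TFLOPS = 13.0

achieved = achieved_tflops(RESNET50_GFLOP_PER_SAMPLE, SAMPLES_PER_SECOND)
utilization = achieved / CLAIMED_PEAK_TFLOPS

print(f"achieved: {achieved:.1f} TFLOP/s")
print(f"fraction of claimed peak: {utilization:.0%}")
```

Plug in your own model cost and measured samples per second to see how far a real workload sits from the number on the spec sheet.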

------
GuyOnMySpace
They have been describing this as shipping (in the Dell systems mentioned) for
a while, I would guess 2 years. It isn't clear to me where the hardware is
going or why the company hasn't made more of a splash if performance is so
good.

~~~
wmf
The obvious answer is that it wasn't working until very recently.

------
The_rationalist
Fascinating.

