Hacker News
New Graphcore IPU Benchmarks (graphcore.ai)
17 points by ingve 29 days ago | 9 comments

What does IPU stand for in this context? The only use I know of is Image Processing Unit — silicon that is custom developed for cameras. https://en.wikipedia.org/wiki/Image_processor

This is clearly not that.

Edit: Intelligence Processing Unit. https://www.graphcore.ai/products/

I assume that the Bionic neural engine in Apple's A11 and later is an example of this?

"IPU" was made up by Graphcore to describe their processor, so you can choose to use the term for as many or as few processors as you want. IMO, any processor can be an "intelligence" processor.

Graphcore are coining the initialism, therefore I imagine they’re not keen on it being applied to anything other than their design and its novel architectural approach.

It’ll likely end up just like GPU though, as part of the industry’s nomenclature.

Given the impressive inference results shown on this page, I find it interesting that Graphcore did not participate in MLPerf Inference v0.5.

It's probably because these are the best hand-picked results they could come up with (and also, crucially, the _worst_ for the competition), and MLPerf has a fixed set of tasks, so you don't get to do that. I've done this shit professionally: if you're given full freedom in the choice of task and configuration, you can easily make it seem like your competition sucks beyond belief. You can "prove" that a CPU is faster than a GPU without much trouble. :-) And yet NVIDIA still dominates most rankings, even though in absolute terms it's nowhere near the fastest or the most efficient option (that'd be the TPU, on some tasks, at the moment). With so many performance cliffs in the hardware, tooling is super important, and NVIDIA is the only viable acceleration option that has any meaningful tooling at all.

Caveat emptor: the published numbers (_any_ numbers, not just Graphcore's) are mostly bullshit unless code is also published and hardware is available for independent measurement. There's no way you're getting the claimed 13 TFLOPS out of your shiny new NVIDIA GPU. Take an off-the-shelf ResNet-50 (about 4 GFLOPs per sample) and witness it run at about 700 samples per second, which pencils out to 2.8 TFLOPS, not 13. Still amazing (and still easily 10x the high-end CPU throughput), but perf claims are often exaggerated by as much as 10x even by big names, let alone startups.
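For anyone who wants to redo the arithmetic above, here is a minimal sketch in Python. The inputs are just the rough figures from the comment (about 4 GFLOPs per ResNet-50 sample, about 700 samples per second, 13 TFLOPS claimed peak), not measurements, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the comment above.
# All inputs are the commenter's rough numbers, not measured results.

def achieved_tflops(samples_per_sec: float, gflops_per_sample: float) -> float:
    """Sustained TFLOPS implied by a sample rate and a per-sample cost."""
    # GFLOPs per second divided by 1000 gives TFLOPS.
    return samples_per_sec * gflops_per_sample / 1000.0

def utilization(achieved: float, claimed_peak: float) -> float:
    """Fraction of the claimed peak that is actually being sustained."""
    return achieved / claimed_peak

achieved = achieved_tflops(samples_per_sec=700, gflops_per_sample=4.0)
util = utilization(achieved, claimed_peak=13.0)
print(f"{achieved:.1f} TFLOPS achieved, {util:.0%} of claimed peak")
```

Plug in your own measured samples per second and the vendor's peak figure to see how far apart they are on your workload.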

This is probably why they didn't publish any Resnet50 numbers in this post.

They have been describing this as shipping (in the Dell systems mentioned) for a while, I would guess 2 years. It isn't clear to me where the hardware is going or why the company hasn't made more of a splash if performance is so good.

The obvious answer is that it wasn't working until very recently.

