
"Tensor hardware" is a very vague term that's more marketing than an actual hardware type, I guarantee you that these are really SIMD or matrix units like the Google tpu that they just devised to call "Tensor", because, you know, it sells.



They're matrix units just like in the Google TPU, but TPU stands for "Tensor Processing Unit", so the naming is consistent. There's no reason to add special SIMD units when the entire core is already running in SIMT mode. And by establishing a dataflow for NxNxN matrix multiplies you can reduce your register read bandwidth by a factor of N. That isn't as huge for NVIDIA's N=4 as for Google's N=256, but it's still a big deal, and diminishing returns might mean that NVIDIA gets most of the possible benefit by stopping at 4 while preserving more flexibility for other workloads.
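
A rough way to see the factor-of-N claim is a back-of-the-envelope count of register operand reads. This is only a sketch under simplifying assumptions that are mine, not a description of any real hardware: the scalar path reads a, b and the accumulator from registers on every fused multiply-add, while the matrix unit reads each element of A, B and C exactly once and reuses it internally. The function names are made up for illustration.

```python
def scalar_fma_reads(n):
    """Register reads for an n x n x n matrix multiply done as scalar FMAs.

    Assumption: each of the n**3 FMAs reads three operands (a, b and the
    accumulator c) from the register file.
    """
    return 3 * n ** 3


def matrix_unit_reads(n):
    """Register reads when a matrix unit handles the whole n x n x n multiply.

    Assumption: A, B and C (n*n elements each) are each read once, and all
    further reuse happens inside the unit's dataflow.
    """
    return 3 * n ** 2


def read_reduction(n):
    """Ratio of scalar reads to matrix-unit reads; works out to n."""
    return scalar_fma_reads(n) // matrix_unit_reads(n)


if __name__ == "__main__":
    for n in (4, 256):
        print(n, scalar_fma_reads(n), matrix_unit_reads(n), read_reduction(n))
```

Under these assumptions the ratio is exactly N: 192 reads versus 48 for N=4, and a 256x reduction for N=256.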


As a layman, that's what the matrix multiply stuff sounded like to me as well, given my understanding of SIMD and such, especially when they mentioned BLAS. But I am no expert.


Yup, the TPU too: it was just a systolic matrix multiplier, but hey, it's Google, and they called it a "Tensor processor", so let's get a hard-on...



