
The Nvidia Titan V does 110 TFLOPS with 12 GB of 1.7 Gb/s memory [1] and sells for $3,000. The TPU v2 does 180 TFLOPS with 64 GB of 19.2 Gb/s memory [2].

That's a heck of a performance boost from a chip that likely costs Google far less than Nvidia's flagship.

[1] http://www.tomshardware.com/news/nvidia-titan-v-110-teraflop...
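Rough ratio math on the quoted figures (a sketch only; the TPU v2's cost to Google isn't public, so only the Titan V gets a per-dollar number):

    titan_tflops, titan_price = 110, 3000   # quoted Titan V figures
    tpu_tflops = 180                        # quoted TPU v2 figure

    print(tpu_tflops / titan_tflops)    # ~1.64x the Titan V's raw TFLOPS
    print(titan_tflops / titan_price)   # ~0.037 TFLOPS per dollar (Titan V)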




It'd be really interesting to know the per-unit math on that.

Designing and taping out a new ASIC isn't cheap.

Presumably Google needs to use a fairly recent process (22 nm or better?), which means GlobalFoundries, TSMC, or Samsung (do any of the native Chinese fabs have 22 nm yet?). I wonder who is building them?

So many questions...


The TFLOPS numbers are not directly comparable. The TPUs use reduced precision in some areas, whereas I am guessing the Titan V numbers are based on single precision operations.


The Titan V numbers are also reduced precision (16-bit), using its tensor cores.


If all you need is reduced precision, then it's fair to compare the two. I'd also assume memory bandwidth matters just as much as TFLOPS for ML workloads.
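A back-of-the-envelope roofline check makes that concrete. The peak TFLOPS below is the quoted Titan V figure; the ~650 GB/s HBM2 bandwidth is my assumption:

    # An op is bandwidth-bound when its arithmetic intensity
    # (FLOPs per byte moved) falls below peak_flops / peak_bandwidth.
    peak_flops = 110e12    # quoted Titan V tensor-core peak
    peak_bw    = 650e9     # assumed HBM2 bandwidth, bytes/s

    machine_balance = peak_flops / peak_bw    # ~169 FLOPs per byte

    # fp16 matmul of two n x n matrices: 2*n^3 FLOPs,
    # ~3 matrices * n^2 elements * 2 bytes each moved
    n = 1024
    intensity = (2 * n**3) / (3 * n**2 * 2)   # ~341 FLOPs per byte
    print(intensity > machine_balance)        # True -> compute-bound here

Smaller or skinnier matmuls drop below the balance point quickly, and that's where bandwidth, not TFLOPS, becomes the ceiling.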


It's not clear to me how programmable the TPU is. I'm sure it's great at convolutions and matrix multiplies. Can it do anything else?


Without speaking to the capabilities of TPUs, note that most ML models today are mostly convolutions and matrix multiplies.
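As a minimal sketch of why a matmul-centric accelerator covers convolutions too: a 2-D convolution (in the ML sense, i.e. cross-correlation) can be lowered to a matrix multiply via im2col. Shapes and names here are made up for illustration:

    import numpy as np

    def conv2d_as_matmul(x, k):
        h, w = x.shape
        kh, kw = k.shape
        # "im2col": gather every kh x kw patch of x into a row
        patches = np.array([x[i:i+kh, j:j+kw].ravel()
                            for i in range(h - kh + 1)
                            for j in range(w - kw + 1)])
        out = patches @ k.ravel()   # the convolution is now a matmul
        return out.reshape(h - kh + 1, w - kw + 1)

    x, k = np.random.rand(5, 5), np.random.rand(3, 3)
    ref = np.array([[(x[i:i+3, j:j+3] * k).sum() for j in range(3)]
                    for i in range(3)])
    assert np.allclose(conv2d_as_matmul(x, k), ref)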


Neither can tensor cores.


The tensor core is one part of the GPU. It has plenty of other capabilities.


What else should it be doing?

It's an accelerator for running Tensorflow graphs, and TF graphs are essentially converted to matrix operations and convolutions.
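For instance, here's what "a graph is mostly matrix ops" looks like for a small dense network. This is a hand-rolled NumPy sketch with made-up shapes, not TF's actual lowering:

    import numpy as np

    x  = np.random.rand(32, 784)    # a batch of 32 inputs
    w1 = np.random.rand(784, 256)   # layer weights (made-up sizes)
    w2 = np.random.rand(256, 10)

    h = np.maximum(x @ w1, 0)       # matmul + elementwise ReLU
    y = h @ w2                      # matmul
    print(y.shape)                  # (32, 10)
    # Bias adds, softmax, etc. are cheap elementwise/reduction ops
    # wrapped around the big matmuls the accelerator targets.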



