
My Experience and Advice for Using GPUs in Deep Learning: Which GPU to get - michaeln
http://timdettmers.com/2018/08/21/which-gpu-for-deep-learning/
======
lern_too_spel
The "I have almost no money" recommendation should include Colab.
[https://medium.com/deep-learning-turkey/google-colab-free-gp...](https://medium.com/deep-learning-turkey/google-colab-free-gpu-tutorial-e113627b9f5d)

Somebody who has almost no money isn't going to be able to equip a desktop
with a GTX 1050 Ti ($175), fast disk ($50), and RAM ($50) on an entry-level
CPU/motherboard/power supply/case/monitor/peripherals ($300) and pay for the
electricity used during training. Colab can be accessed from a free public
computer or a cheap Chromebook ($200).
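
If you want to confirm the free GPU is actually attached, a one-liner from a
notebook cell works (a minimal sketch, assuming the TensorFlow that Colab
preinstalls):

    import tensorflow as tf

    # Prints something like '/device:GPU:0' on a GPU runtime,
    # or an empty string if no GPU is attached.
    print(tf.test.gpu_device_name())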

~~~
andy_ppp
What are the rules about datasets I upload to this free service? Do Google now
own them?

~~~
ColanR
My guess is it's the standard caveat: if you don't pay for the service, you
and your stuff are the commodity.

------
sabalaba
The 2080Ti numbers are likely going to be a lot lower than that.

We’ve benched the 1080Ti vs the Titan V and the Titan V is nowhere near 2x
faster at training than the 1080Ti as suggested in that graph. We observed a
30% to 40% speedup during our benchmarking:

[https://deeptalk.lambdalabs.com/t/benchmarking-the-titan-v-v...](https://deeptalk.lambdalabs.com/t/benchmarking-the-titan-v-volta-gpu-with-tensorflow/108)

This is consistent with the 32% increase in FP32 FLOPS, from 11.3 TFLOPS for
the 1080Ti to 15 TFLOPS for the Titan V. The additional speedup can be
explained by the Titan V's higher HBM2 memory bandwidth and the
mixed-precision fused multiply-adds provided by the Tensor Cores.

Thus, given the quoted 13 TFLOPS for the 2080Ti, I would expect it to offer
something more like a 15-20% speedup over the 1080Ti. So the 2080Ti is less
bang for your buck. But benchmarking is the only way to tell what’s better on
a FLOPS/$ basis.
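
For a back-of-the-envelope view, here is the same arithmetic in a few lines
of Python (the prices are my rough street-price assumptions, not quotes):

    # Rough FLOPS-per-dollar comparison from spec-sheet FP32 TFLOPS.
    # Prices are assumed street prices, not official quotes.
    cards = {
        "1080Ti":  {"tflops": 11.3, "price": 700},
        "Titan V": {"tflops": 15.0, "price": 3000},
        "2080Ti":  {"tflops": 13.0, "price": 1200},
    }

    base = cards["1080Ti"]
    for name, c in cards.items():
        speedup = c["tflops"] / base["tflops"] - 1
        flops_per_dollar = c["tflops"] / c["price"]
        print(f"{name}: +{speedup:.0%} vs 1080Ti, "
              f"{flops_per_dollar:.4f} TFLOPS/$")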

~~~
timdettmers
Your data are inconsistent with the benchmarks that I mention in the blog
post: [https://github.com/u39kun/deep-learning-benchmark](https://github.com/u39kun/deep-learning-benchmark)

You also do not benchmark LSTMs: [https://www.xcelerit.com/computing-benchmarks/insights/bench...](https://www.xcelerit.com/computing-benchmarks/insights/benchmarks-deep-learning-nvidia-p100-vs-v100-gpu/)

If you put both of those benchmarks together, my conclusion is quite
reasonable. But I see that you could also come to your conclusion with your
benchmarks. It is just a question of which benchmarks are less biased, and
that is too difficult to evaluate.

I guess we have to wait for real data, but thanks for putting your data out
there to get a discussion going.

------
ageitgey
This is a great article and I highly respect his opinions.

However, since you are probably eagerly reading this to see how fast the new
RTX cards are, you should know upfront that the numbers he has so far are
just estimates based on specs:

> Note that the numbers for the RTX 2080 and RTX 2080 Ti should be taken with
> a grain of salt since no hard performance numbers existed. I estimated
> performance according to a roofline model of matrix multiplication and
> convolution under this hardware together with Tensor Core benchmarks from
> the V100 and Titan V.
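
For intuition about what a roofline estimate means: it caps attainable
throughput at min(peak compute, memory bandwidth x arithmetic intensity). A
toy sketch for a square FP32 matmul, with illustrative spec-sheet numbers
(not the article's actual model):

    # Toy roofline bound for an n x n x n FP32 matmul.
    # peak_tflops and bandwidth_gbs are illustrative assumptions.
    def roofline_tflops(n, peak_tflops, bandwidth_gbs):
        flops = 2 * n**3                 # multiply-adds
        bytes_moved = 3 * n**2 * 4       # read A, B; write C (FP32)
        intensity = flops / bytes_moved  # FLOPs per byte
        # Attainable = min(compute ceiling, bandwidth * intensity)
        return min(peak_tflops, bandwidth_gbs * intensity / 1000)

    # Large matmuls are compute-bound, so this returns the peak: 13.4
    print(roofline_tflops(4096, peak_tflops=13.4, bandwidth_gbs=616))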

~~~
nolok
A great way to turn a listing you can trust enough to use as one of your
comparison bases into a listing made up of imaginary marketing numbers.

I guess the clickbait is needed / the best option, but I hate that it's what
most web resources are like now.

~~~
r1nkgrl
But the article isn't hiding the fact that the numbers are estimates. People
are curious how the new cards will stack up, and this article provides the
best evaluation of that given the information they have available.

------
pirocks
Seems down for me:

[https://web.archive.org/web/20180821173206/http://timdettmer...](https://web.archive.org/web/20180821173206/http://timdettmers.com/2018/08/21/which-gpu-for-deep-learning/)

------
scottlegrand2
The biggest advance here is that Nvidia has produced a consumer card that has
all the high-end deep learning features. This was missing in both the Pascal
and Volta generations, even though Pascal had full-power FP32. I think the
TPU scared them, and that's a good thing.

------
syntaxing
Hacker News hug of death? Anyone here have any experience using AMD cards
with something like PlaidML? I have a 1050Ti SSC, but I'm starting to feel
its limitations as my model complexity grows. Getting a 1080 is a bit out of
my budget right now, so I'm tempted by the recently released Vega 56.
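
From what I've read, the PlaidML setup is just a Keras backend swap (a
minimal sketch, assuming `pip install plaidml-keras` and that `plaidml-setup`
has been run to select the AMD card):

    # Swap Keras onto the PlaidML backend before importing keras.
    import plaidml.keras
    plaidml.keras.install_backend()

    import keras  # now backed by PlaidML instead of TensorFlow
    print(keras.backend.backend())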

~~~
steve_musk
You could wait and see how Pascal prices fall after Turing comes out.

------
fermienrico
The cost/performance plot - shouldn't it be "Lower is better"? It says "Higher
is better".

Lower value would indicate lower cost per unit level of performance.

It should be "Lower is better" or the plot needs to say "Performance/Cost". Am
I missing something?

~~~
timdettmers
Thanks for your feedback! Someone mentioned this on Twitter as well, and I
thought it was a good point, so I implemented that change.

~~~
fermienrico
Thanks for being receptive. I wouldn’t call it a “good point” if it was a
mistake that was corrected.

------
dostres
An open question for me is the performance of two 2080 Tis using NVLink as
one virtual GPU. I imagine it’ll be close to linear, but I’ll be interested
to know for sure.

~~~
shaklee3
It won't be linear for memory-bound applications. The V100 was able to get
close to linear with large enough transfer sizes, but it has 50% more memory
bandwidth than these cards.
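
Until someone with two cards reports numbers, the quick way to measure
scaling yourself is plain data parallelism (not the "one virtual GPU"
pooling); a minimal PyTorch sketch, with placeholder layer and batch sizes:

    import time
    import torch
    import torch.nn as nn

    # Placeholder model and batch; the point is timing 1 vs. 2 GPUs.
    model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(),
                          nn.Linear(4096, 4096))
    x = torch.randn(512, 4096).cuda()

    for n_gpus in (1, 2):
        m = nn.DataParallel(model, device_ids=list(range(n_gpus))).cuda()
        torch.cuda.synchronize()
        t0 = time.time()
        for _ in range(50):
            m(x).sum().backward()
        torch.cuda.synchronize()
        print(f"{n_gpus} GPU(s): {time.time() - t0:.2f}s for 50 steps")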

------
KayL
Good article, but as a new learner, I'm interested in (your experiences on)
how much time taken for the common task to train a model? 1min vs 2mins,
probably I will get a cheaper GPU but if there's 5h vs 10h or 1 day vs 2 days,
I'd save more money for one with good performance
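
One way to think about the trade-off is cost per hour saved; a toy
back-of-the-envelope in Python (all the numbers below are made-up examples,
not benchmarks):

    # Is the faster card worth its price premium?
    runs_per_month = 20
    hours_saved_per_run = 5        # e.g. a 10h run dropping to 5h
    months_of_use = 18
    extra_cost = 500               # assumed price gap between the cards

    hours_saved = runs_per_month * hours_saved_per_run * months_of_use
    print(f"${extra_cost / hours_saved:.2f} per hour of training saved")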

