
TensorFlow-DirectML - t4h4
https://github.com/microsoft/tensorflow-directml
======
lostmsu
Just tried it.

On integrated graphics (Intel HD 620) (batch size 1000):

\- was able to train a simple dense network, but saw no speedup over plain
CPU training on the same processor (i3-7100U)

\- a ResNet-style architecture failed on the same HD 620 with "LLVM ERROR: SPIRV
internal error: Invalid magic number"

On a machine with NVidia GPU (batch size 1000):

\- unlike on the Intel GPU, ResNet trained without any errors (so the failure
might have been an Intel driver issue)

\- DirectML came out about 3 times faster than the machine's CPU
(i7-8700K)

\- DirectML came out about 12 times slower than regular
tensorflow-gpu with CUDA
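
To put the two NVidia figures side by side: if both ratios hold for the same workload, they multiply. This is only back-of-the-envelope arithmetic on the rough numbers above, not a separate measurement, and the variable names are just for illustration.

```python
# Ratios as reported above; both are rough, single-run figures.
directml_vs_cpu = 3.0     # DirectML about 3x faster than the i7-8700K
cuda_vs_directml = 12.0   # tensorflow-gpu (CUDA) about 12x faster than DirectML

# Speedups measured on the same workload compose multiplicatively.
cuda_vs_cpu = directml_vs_cpu * cuda_vs_directml
print(f"CUDA vs CPU: about {cuda_vs_cpu:.0f}x")
```

So on that machine CUDA would land somewhere around 36 times faster than the CPU, assuming the same model and batch size in all three runs.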

So far mixed feelings, but I am excited to see how it runs on AMD GPUs and on
Windows on ARM64 (e.g. Surface Pro X).

P.S. I ran
https://github.com/losttech/Gradient-Samples/tree/master/FashionMnistClassification
and
https://github.com/losttech/Gradient-Samples/tree/master/ResNetBlock

I had to add "batch_size: 1000" to the fit call to see speedups over the CPU.
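
A rough sketch of why the batch size probably matters here. It assumes the sample trains on the full 60,000-image Fashion-MNIST training set and that, without an explicit value, Keras' default batch size of 32 applies. Both are assumptions about the setup, and `steps_per_epoch` is a helper made up for this illustration.

```python
import math

TRAIN_SAMPLES = 60_000     # Fashion-MNIST training set size
KERAS_DEFAULT_BATCH = 32   # Keras fit() default when batch_size is omitted


def steps_per_epoch(samples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(samples / batch_size)


default_steps = steps_per_epoch(TRAIN_SAMPLES, KERAS_DEFAULT_BATCH)
large_steps = steps_per_epoch(TRAIN_SAMPLES, 1000)

print(default_steps, large_steps)
```

Under those assumptions an epoch drops from 1875 steps to 60. Each step carries some fixed cost for dispatching work to the GPU, so with a small dense network and tiny batches that overhead can plausibly eat whatever the GPU gains on the math itself.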

