If Docker is an option, the official Dockerfile works well; you just need to change the FROM line to "nvidia/cuda:8.0-cudnn5-devel-ubuntu16.04" or "nvidia/cuda:8.0-cudnn5-devel-ubuntu14.04", depending on which version of Ubuntu you want.
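For anyone who wants to script that change, here is a minimal sketch in Python. The helper name retarget_base_image is made up for illustration, and it assumes the Dockerfile has a single-stage FROM line written on one line; it is not part of Docker or TensorFlow.

```python
import re


def retarget_base_image(dockerfile_text, new_base):
    """Return dockerfile_text with its first FROM line pointing at new_base.

    Hypothetical helper: it only rewrites the first FROM instruction and
    leaves every other line untouched.
    """
    pattern = re.compile(r"^FROM[ \t]+\S+.*$", re.MULTILINE | re.IGNORECASE)
    # A callable replacement avoids any backslash escaping issues in new_base.
    new_text, count = pattern.subn(lambda _m: "FROM " + new_base,
                                   dockerfile_text, count=1)
    if count == 0:
        raise ValueError("no FROM line found in Dockerfile text")
    return new_text
```

You would read the Dockerfile into a string, pass it through with one of the two image names above, and write the result back before running the build.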
This seems to be a Frankenstein of CUDA 7.5 and CUDA 8.0 instructions, and likewise of Ubuntu 14.04 and 16.04. As far as I know, these instructions will fail with gcc 5.4 errors, among other issues.
Still doesn't work for me, though. Even on a new box, I get:
ubuntu@somewhere:~/tensorflow$ python3 -c 'import tensorflow'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/ubuntu/tensorflow/tensorflow/__init__.py", line 23, in <module>
from tensorflow.python import *
File "/home/ubuntu/tensorflow/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
ImportError: cannot import name 'pywrap_tensorflow'
- using branch r0.10, as suggested by https://news.ycombinator.com/item?id=12464835
- making sure to install the new r0.10 wheel, which has a different name than the r0 wheel built by master :-D
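One thing worth ruling out with this particular traceback: the paths show tensorflow being imported from /home/ubuntu/tensorflow/tensorflow, which is the source checkout, and the command was run from ~/tensorflow. When Python is started from the top of the source tree, the checkout can shadow the installed wheel, and the compiled pywrap_tensorflow module is not there to import. That may or may not be what happened here; the sketch below is just a way to check a path copied from a traceback. The function name is made up for illustration.

```python
import os


def imported_from_cwd(module_file, cwd):
    """Return True if module_file lives somewhere under cwd.

    module_file is the path shown in the traceback (for example the
    package's __init__.py); cwd is the directory Python was started from.
    """
    module_dir = os.path.abspath(os.path.dirname(module_file))
    cwd = os.path.abspath(cwd)
    # If the common ancestor of both paths is cwd itself, the module
    # was picked up from the working directory, not from site-packages.
    return os.path.commonpath([module_dir, cwd]) == cwd
```

If it returns True for your traceback, try running the same import from a different directory (your home directory, say) before rebuilding anything.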
As stated elsewhere, this can actually be a very frustrating process. I lost a good chunk of my long weekend trying to build TF from source for CUDA 8.0 / cuDNN 5.1. Generally speaking, the culprit is that the CUDA installers for Linux are highly dependent on your kernel and gcc versions. This is a huge headache for people who want to stay up to date on their distro packages. CentOS has no problem because hardly anything changes, but you're essentially handcuffed to whatever versions of Ubuntu or Fedora were out when NVIDIA decided to start packaging up the next release. Bumping gcc to 5.4 in Ubuntu 16.04.1 broke the 16.04 installer, which relied on gcc 5.3.
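A quick way to see whether you are in that situation before starting a long build is to compare your gcc against the version the installer was built around. This is only a sketch: the function names are made up, the 5.3 figure is just the one mentioned above, and it assumes gcc is on your PATH.

```python
import subprocess


def version_tuple(text):
    """Turn a dotted version string such as '5.4.0' into (5, 4, 0)."""
    return tuple(int(part) for part in text.strip().split(".") if part.isdigit())


def gcc_is_newer_than(installed, expected):
    """Compare major.minor only; patch releases are ignored."""
    return version_tuple(installed)[:2] > version_tuple(expected)[:2]


def installed_gcc_version():
    """Ask the local gcc for its version (requires gcc on PATH)."""
    return subprocess.check_output(["gcc", "-dumpversion"]).decode().strip()
```

For example, gcc_is_newer_than(installed_gcc_version(), "5.3") would tell you whether your compiler has moved past what the 16.04 installer expected.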
Because GPU-accelerated learning is exciting, and most of the directions you find for setting it up don't work. (Judging from other replies, this post may be no different.)
This probably has something to do with the fact that GPUs are flaky and idiosyncratic, and all the software that uses them depends on black-box libraries handed down by Nvidia, who is completely shit at maintaining software.
I was able to complete a build for the first time last night: Ubuntu 16.04, CUDA 8.0 RC + compiler patch, cuDNN 5.1, nvidia-driver-370, python-2.7, and compute capability 6.1 (for a Pascal GPU). But only when I switched to the r0.10 branch.
With r0.10 I see none of the multiple failure modes that I always see with master. It just went straight ahead and compiled the whole thing.
FWIW: twice now, I've successfully built TensorFlow from source and gotten a pip package linked with CUDA 8, once for Python 2 and once for Python 3, both on an Ubuntu 14.04 system.
I'm not sure what TensorFlow source you're compiling, but I've been trying many times recently and it fails in many, many different ways. It's a never-ending maze of fail, basically. I haven't seen the end of it yet. It failed today, too, so the code base is not getting better.
I'm using Ubuntu 16.04, CUDA 8.0 RC + the gcc patch, cuDNN 5.1, nvidia-driver-[367|370], tensorflow-master, python-2.7. My process is basically identical to yours.
"You must also have the 361.42 NVidia drivers installed"
No, that would not work with Pascal GPUs.
The only way I've seen it work is if you install CUDA 7.5 and cuDNN 4, and install TensorFlow from the binary package. But then you get weird errors if you run complex models on Pascal GPUs, because CUDA 7.5 doesn't work well with Pascal.
Seriously, if you made it work on Ubuntu 16.04 with CUDA 8 and it's GPU enabled, please upload the pip package somewhere. I'd love to give it a try.
I may go ahead and do a literal clone of your instructions. However, as far as I can tell from reading through them without actually running them, your process is what I already do, step by step.
It's also the fact that it fails in so many different ways. Bazel bombs out after ./configure; the master branch today does not even begin to build, and the old Bazel workaround is not working anymore. Then there are the gcc issues.
You may have gotten lucky once, who knows why.
Again, do you still have the pip package you claim you've built using this procedure? If so, can you upload it somewhere? I would very much like to test it. Thank you.
https://github.com/tensorflow/tensorflow/blob/master/tensorf...