Hacker News
Hello, Tensorflow (oreilly.com)
596 points by lobsterdog on June 20, 2016 | 41 comments

Wow! I was going to post this but here it is already! I wrote (with a lot of help) the article there. I also jotted down some notes on the process of writing it with O'Reilly in case anybody's interested in that side of things: http://planspace.org/20160619-writing_with_oreilly/

Just a CSS critique: could you add "monospace" to the end of your list of monospaced fonts?

I have perfectly good monospaced fonts on this computer, but none of them are Consolas, Menlo, Monaco, or Lucida Console, so I end up with a default proportional serif font.

This might seem like an innocent change, but it really is another HTML/CSS browser hell hole. If I remember correctly, it was Jeff Atwood or jzy who taught me this lesson back when Stack Overflow was just out of beta:

If the keyword monospace is somewhere in your font stack, some browsers use 13px as the default font size instead of the usual 16px. The workaround used to be to include both serif and monospace in your font stack. That worked some time ago; I don't know whether it still does in contemporary browsers.
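A concrete sketch of the stack being asked for (the font names are just the ones mentioned above; the doubled-keyword trick is one commonly cited form of the workaround):

```css
/* Ending the stack with the generic "monospace" keyword guarantees a
   monospaced fallback on systems lacking any of the named fonts. */
code, pre {
  font-family: Consolas, Menlo, Monaco, "Lucida Console", monospace;
  /* One common workaround for the shrunken 13px default is to repeat
     the keyword ("monospace, monospace") or set font-size explicitly. */
  font-size: 1em;
}
```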

O'Reilly tells me that this issue has been resolved as of last Friday. Does it seem better now?

Thanks for the feedback! I'll pass along to the O'Reilly team and see if there's anything they can do.

Great article! You did miss a GoT plug though: "An object has no name." Tsk! But seriously, thanks for the background post to go with it. Cheers.

Thanks! May objects not bring you any unwanted gifts! :)

It was nice to hear a little behind the scenes. I'd be curious to hear any analytics about unique visitors, etc so far from the O'Reilly post if you can share.

I'm curious too! I don't know yet whether they'll pass that info to me, and then I don't know whether I would be able to pass on any further. I imagine at least some of the analytics are considered proprietary/competitive info. But we can see some public things like votes here, number of tweets, etc.

Good post man. Hope DC is treating you well these days.

Thank you! Things are going great in DC - I happen to be in NYC for ICML this week though, where things are also going pretty great! :)

As I've been reading about TensorFlow lately, I feel like I'm missing something regarding distributed processing. How can TensorFlow "scale up" easily if you are outside of Google? We have big datasets that I want to run learning on, but it seems awkward to do with TensorFlow. We're big enough that the team managing our cluster is separate from development, and it is a huge pain if we need them to go install tools on each node. Even with Spark support, it seems like the TensorFlow Python libraries need to be set up on each machine in the cluster ahead of time.

Am I missing something?

No, you're not. Google did this with their build engine (Blaze, internally; Bazel is the open-source release, which lacks the distributed build platform). Google is doing the same with Apache Beam (the API to Google Dataflow): releasing an API for local testing but not releasing the distributed engine.

If you have your data in a Hadoop cluster and are doing image recognition, Yahoo's CaffeOnSpark is the only truly distributed engine out there. It uses MPI to share model state between executors.

Keep in mind there are different kinds of parallelism, though. If you mean model parallelism, a lot of shops are doing that via RDMA as well as MPI. It depends on how you handle state.

There's also data parallelism with parameter averaging, which we've been doing in Deeplearning4j for the last few years. We also support a lot more than just images. We have the ETL pipelines (Kafka, etc.) to go with it. Watch for a blog post from us on Parallel Forall (NVIDIA's blog) where we explain some of this.
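As a rough illustration of what parameter averaging means (a toy sketch in plain Python, not Deeplearning4j's actual implementation): each worker trains a replica of the model on its own data shard, and a coordinator averages the resulting parameters.

```python
# Toy sketch of data parallelism with parameter averaging.
# Each "worker" fits the same linear model y = w * x on its own shard;
# the coordinator then averages the learned parameters.

def train_on_shard(shard, w=0.0, lr=0.01, epochs=200):
    """One worker: plain SGD on its local (x, y) pairs."""
    for _ in range(epochs):
        for x, y in shard:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w

def parameter_average(shards):
    """Coordinator: train replicas independently, then average weights."""
    weights = [train_on_shard(shard) for shard in shards]
    return sum(weights) / len(weights)

# Data generated from y = 3x, split across two workers.
shard_a = [(1.0, 3.0), (2.0, 6.0)]
shard_b = [(3.0, 9.0), (4.0, 12.0)]
w = parameter_average([shard_a, shard_b])
print(round(w, 2))  # each replica converges near the true slope 3.0
```

In a real system the averaging happens periodically during training over the network, which is where the state-handling questions above come in.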

I gave a framework-agnostic view of the concepts you should consider when looking at distributed deep learning as well:


Have you read through https://www.tensorflow.org/versions/master/how_tos/distribut... ? Since version 0.8 of TensorFlow they've had a way to do distributed processing.

Their blog post in April mentioned it - https://research.googleblog.com/2016/04/announcing-tensorflo...

That said, I haven't actually attempted any distributed processing, but it looks possible. If anyone has actually tried it, I'd be curious to hear what people with experience have to say about it.

I have read that. I even re-read it before making my post.

That implementation requires starting individual tasks on each node in your cluster.

> To create a cluster, you start one TensorFlow server per task in the cluster. Each task typically runs on a different machine, but you can run multiple tasks on the same machine (e.g. to control different GPU devices).

I'm used to using tools that can roll out to a cluster with more finesse than that. The Spark wrapper seems to provide some capability to do this automatically, but even the Spark wrapper requires installing python libraries on each node.
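For concreteness, the per-task startup from the 0.8 how-to quoted above looks roughly like this (hypothetical host names; every node has to run its own copy of a script like this, which is exactly the manual rollout at issue):

```python
# Sketch of per-task setup per the TensorFlow 0.8 distributed how-to.
# Host names are made up; each machine runs its own server process.
import tensorflow as tf

cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222",
               "worker1.example.com:2222"],
})

# On worker0 you would start:
server = tf.train.Server(cluster, job_name="worker", task_index=0)
server.join()  # blocks, serving graph computations for the cluster
```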

Yeah, I'm trying to figure this out too. TensorFlow needs YARN support. Ideally, YARN would allocate resources and inform the processes of the various workers in the graph, etc. I see that as the harder part. If you use Mesos, there is some preliminary support for that: https://github.com/tensorflow/tensorflow/issues/1996

Since TensorFlow has native dependencies on CUDA for GPU support, I don't think there's much of a way to get around installing things on every machine. You might be able to package a Python env without CUDA for Spark to run using conda. Here's an interesting blog post about that: https://www.continuum.io/blog/developer-blog/conda-spark
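The approach in that post, roughly (all paths and names here are illustrative, not verbatim from the post): build a self-contained conda env, zip it, and ship the archive with the job so executors unpack it themselves.

```shell
# Sketch of shipping a conda-packaged Python env to Spark executors
# (illustrative names; see the Continuum blog post above for details).
conda create -n tf-env -y python=2.7 numpy

# Install the CPU-only TensorFlow wheel into that env:
~/miniconda2/envs/tf-env/bin/pip install \
    https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl

# Zip the env and submit, pointing the executors' Python at the archive:
cd ~/miniconda2/envs && zip -r tf-env.zip tf-env
spark-submit \
    --archives tf-env.zip#TFENV \
    --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./TFENV/tf-env/bin/python \
    my_job.py
```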

But I'm not sure I see the point in running TensorFlow without GPU support. And if you're hoping to run GPU machines on an existing Spark cluster and intelligently allocate the GPU work to the right machines... that's going to be tough. Here's an interesting talk on that from the last Spark Summit: https://www.youtube.com/watch?v=k6IOWblLQK8&feature=youtu.be

Ultimately, you're probably better off just running your own gpu cluster strictly for your TensorFlow model on ephemeral AWS spot instances.

Or just use Google Cloud Machine Learning. That's what Google wants and expects you to do anyway. Borg is the Borg. You will be assimilated.

TensorFlow without GPU support is very useful for inference.

I think what is going on here is that what we see as complications are actually features, but that doesn't become clear until you are operating with your NNs in production, at scale.

What you want to be able to do is control which devices (CPUs, GPUs, or co-processors[1]) execute which part of your model (e.g., GPU for training, co-processors for inference, who knows what else).

Yahoo released some code to deal with similar issues, but with Caffe on YARN[2].

[1] https://cloudplatform.googleblog.com/2016/05/Google-supercha...

[2] http://yahoohadoop.tumblr.com/post/129872361846/large-scale-...

    TensorFlow is admirably easier to install than some other frameworks
I thought most frameworks were fairly easy to install in Python, usually with a single call to pip. NLTK takes one "pip install nltk", then "python", "import nltk", "nltk.download()" to download all the corpora and miscellaneous data. Installing TensorFlow seems complicated compared to that.

    # Ubuntu/Linux 64-bit, CPU only:
    $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl

    # Ubuntu/Linux 64-bit, GPU enabled. Requires CUDA toolkit 7.5 and CuDNN v4.  For
    # other versions, see "Install from sources" below.
    $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
Not that either is particularly complicated, but saying other frameworks (assuming they're referring to Python frameworks) are "a lot harder to install" seems disingenuous.

That said, I haven't played around with AI frameworks too much, so I might just be missing a real stinker.

Once you have to install a GPU device driver, CUDA, TensorFlow itself, and related Python packages like NumPy, with a choice of native installers, Linux packages, pip, or conda at each step, there's real potential to find yourself in dependency hell. All of those layers are evolving rapidly.

I found these helpful (on AWS)

- http://ramhiser.com/2016/01/05/installing-tensorflow-on-an-a...

- http://tleyden.github.io/blog/2015/11/22/cuda-7-dot-5-on-aws...

Of course, you don't need to install CUDA just to learn; you can run TensorFlow on CPU only. But part of the point of the graph paradigm is to design a computation and offload it to the GPU.
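That build-then-run idea can be sketched in plain Python (a toy stand-in, not TensorFlow's API): you first construct a description of the computation, and only later hand the whole graph to whatever executes it.

```python
# Toy sketch of the build-then-run graph paradigm (not TensorFlow's API).
# Nodes only describe a computation; nothing is evaluated until run() is
# called, which is what lets a real framework hand the graph to a GPU.

class Node:
    def __init__(self, op, *inputs):
        self.op, self.inputs = op, inputs

    def run(self):
        """Evaluate the graph rooted at this node (the 'session' step)."""
        args = [n.run() for n in self.inputs]
        return self.op(*args)

def constant(value):
    return Node(lambda: value)

def add(a, b):
    return Node(lambda x, y: x + y, a, b)

def mul(a, b):
    return Node(lambda x, y: x * y, a, b)

# Build first: (2 + 3) * 4 exists only as a graph of Nodes...
graph = mul(add(constant(2.0), constant(3.0)), constant(4.0))
# ...then execute it.
print(graph.run())  # 20.0
```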

Compared to other frameworks that allow for deep learning, it's much easier to install.

I see. The article was a bit vague for me about what it means by frameworks (does NLTK count? Django? DL frameworks in other languages?), and since I don't know the area too well, that struck me as odd.

It honestly looks like pretty cool stuff, looking forward to having time to play around with it some day.

Thanks! I didn't want to name names in the comparison to other frameworks, but I was thinking mostly of Caffe, which doesn't have any install path as simple as just a `pip install`. Other DL frameworks would be things like Torch, Theano, etc.

Fwiw, Deeplearning4j is also quite easy to install:



Deep learning frameworks aren't your normal machine learning frameworks. The real problem is CUDA, which isn't allowed to be redistributed separately.

By comparison, here's what you need to install (manually!) for Torch on OSX:

  CUDA (and there's a whole other thread trying to get that to work..)

Clearly you haven't tried to install Caffe.

Like I said, I have not played around with a lot of DL frameworks, only more general Python things. I'll take note of that; I'm really interested in how difficult different DL frameworks are to set up.

I like the clarity of thought and structure of the article. I have used TensorFlow and had to explain it to a friend. So many times, I end up assuming things that are obvious to me but not to someone getting started. As said in the article, TensorFlow stands out for its ease of use and is, to the best of my knowledge, the first distributed deep learning framework. Theano, Torch, et al. are faster but do not come with goodies like TensorBoard.

There are different notions of "fast" as well. Theano might be faster on a single GPU (not actually sure if this is still true), but TensorFlow can be distributed to multiple GPUs in one machine or even across several machines. Several groups have reported speedups that are near-linear in the number of GPUs you distribute across. This can make TensorFlow orders of magnitude faster than Theano.

I presented a version of this at a meetup! Presentation and supporting materials here: http://planspace.org/20160629-presenting_hello_tensorflow/

I love these short tutorials that give you an introduction to anything in an hour. They help you get interested in stuff you wouldn't have gotten interested in otherwise.

> For more on basic techniques and coding your own machine learning algorithms, check out our O'Reilly Learning Path, "Machine Learning."

This learning path is also available free for Safari Books Online subscribers.


This is fantastic - thank you for doing this. When paired with the browser tool it makes a lot more sense: http://playground.tensorflow.org/

Is this planned to be released as an intro in a book about tensorflow?

Thanks! There aren't any firm plans, but we are thinking about what might make sense.

Wonder when we will see the day when an O'Reilly book is written by AI.

It could be a nice little loop if the creators of the AI that accomplished this had learned part of their craft from O'Reilly books.

Hey Aaron! Hello from one of the people you taught in your first GADS class in DC.

Hi! Good to hear from you here! :)
