Hacker News
Jeff Dean explains TensorFlow [video] (youtube.com)
383 points by quantisan on Nov 10, 2015 | 62 comments

Just wanted to repost this from the other thread on TensorFlow, since I joined the party a bit late:

I think some of the raving that's going on is unwarranted. This is a very nice, very well put together library with a great landing page. It might eventually displace Torch and Theano as the standard toolkits for deep learning. It looks like it might offer performance / portability improvements. But it does not do anything fundamentally different from what has already been done for many years with Theano and Torch (which are standard toolkits for expressing computations, usually for building neural nets) and other libraries. It is not a game-changer or a spectacular moment in history as some people seem to believe.

Well, it looks way more scalable than Theano or Torch while being as easy to use as Theano. I'd say that's pretty exciting considering the number of groups working on way lower-level scalable neural nets.

This is "not a game-changer" in the same way map-reduce isn't a game-changer wrt for loops.
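To make the analogy concrete, here is a minimal sketch of the same word count written as a plain for loop and as a map step plus a reduce step. The function names and sample documents are made up for illustration, and the map here still runs sequentially in one process; the point is only that each map call is independent and the merge is associative, which is what lets a framework spread the work across machines.

```python
from collections import Counter
from functools import reduce

# Hypothetical sample data, for illustration only.
docs = ["deep learning", "deep nets", "learning rate"]


def count_with_loop(documents):
    """Plain for loop: one process walks every document in order."""
    counts = Counter()
    for doc in documents:
        for word in doc.split():
            counts[word] += 1
    return counts


def map_phase(doc):
    """Map: count words in one document, independently of all the others."""
    return Counter(doc.split())


def reduce_phase(left, right):
    """Reduce: merge two partial counts; associative, so merge order is free."""
    return left + right


def count_with_map_reduce(documents):
    # map() is sequential here, but each call touches only its own document,
    # so a scheduler could hand the calls to different workers.
    return reduce(reduce_phase, map(map_phase, documents), Counter())
```

Both functions return the same counts. The difference is in what the second form allows a runtime to do, not in the result.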

Also check out TensorBoard, their visualization tool (animation halfway down the page):


Only the single machine version is open sourced.

At the moment. They are working on making the distributed version available too.


They hope to make the distributed version available. That's not a promise. Google open-sourced Blaze, their distributed build system, as Bazel earlier this year. However, there is no sign of a distributed version there yet. As a result, Bazel has had almost no adoption.

I think the fundamental differentiator might be how "production ready" TensorFlow is. The promise of simply declaring an ML pipeline and having it run in very heterogeneous compute environments, from mobile phones to GPU blades to plain-old clusters, can indeed be a huge game changer if fulfilled. The promise is that you literally do not have to write any new code when you are done with a research project or a series of experiments and are ready to deploy your pipeline at a large scale. Neither Theano nor Torch makes that promise.

That's not really a fundamental differentiator. Torch/Theano are definitely production ready. I think the portability is definitely an advantage, though.

I don't think you would use Theano in an end-user product. It's made for developers to run on developer machines. It's very fragile. It has a very long start-up time, possibly on the order of several minutes at the first start.

Maybe it would work well enough in a service backend. But even there it would not scale that well. For example, it doesn't support multi-threading (running a theano.function from multiple threads at the same time).
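A common workaround for a callable that must not run concurrently is to serialize calls through a lock. The sketch below is only an illustration of that pattern: `predict` is a hypothetical stand-in, not a Theano function, and `SerializedFunction` is a name invented here. It also shows why this does not scale: only one thread is ever inside the function, so extra threads add no throughput.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SerializedFunction:
    """Wrap a callable so that only one thread can execute it at a time."""

    def __init__(self, fn):
        self._fn = fn
        self._lock = threading.Lock()

    def __call__(self, *args, **kwargs):
        # Every caller waits here, so calls never overlap.
        with self._lock:
            return self._fn(*args, **kwargs)


def predict(x):
    """Hypothetical stand-in for a compiled function that is not thread-safe."""
    return 2 * x + 1


safe_predict = SerializedFunction(predict)

# Four worker threads share the wrapped function; the lock keeps calls serial.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(safe_predict, range(8)))
```

In a backend that needs real parallelism, the usual alternative is one process per worker, each with its own copy of the compiled function.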

Good point. Torch, then.

Torch is used in production by Facebook, DeepMind and possibly Baidu, amongst others. Facebook especially has released code to make Torch run much faster on AWS with GPU cards. There is also no startup time. The design is done in a high-level language (Lua) while the computation is done mostly in C. I'd be very surprised if TensorFlow is actually faster than Torch on a single machine.

Torch has an extremely difficult learning curve due to Lua. With TensorFlow's underlying engine written in C++, it is likely as efficient as Torch. Special extensions such as NVIDIA's cuDNN could also be used with TensorFlow.

What do you mean by "difficult learning curve due to Lua"? Lua isn't difficult. Lua is easier than JavaScript!

Or perhaps you mean to say that Torch itself is difficult to learn because of design choices that were made in order to use Lua?

Learning an entirely new language for a project, no matter how simple that language is, is certainly a barrier to entry.

If you have already used another dynamic, imperative language, you can probably learn enough Lua to use Torch effectively in 30 minutes.

Seriously, there are three features in the Lua language which are not trivial: metatables, coroutines and _ENV. None of those are needed to use Torch.

It will take more time to learn Torch-specific APIs, but the same problem exists with the other ML frameworks.

If somebody finds learning Lua to be difficult, then learning C++, learning basic statistics, or learning machine learning will be impossible for them.

I'm curious how the performance and scalability compares with Theano and Torch. I'm thinking the reason Google built this is that they wanted to scale computations to really large clusters (thousands, tens of thousands of machines) and the other options didn't really cut it.

Here's a page with various benchmarks: https://github.com/soumith/convnet-benchmarks

An issue has been created to add TensorFlow to this shortly.

This looks to be a single machine test, where this video and the poster above specifically talked about running against compute clusters. I don't think a single machine benchmark is going to be nearly as interesting.

They didn't release the multi-machine version of TensorFlow. They said they're still working on it and will release it when it's ready.

As someone who went through the pain of using Caffe, struggled with the steep learning curve of Lua/Torch, and was frustrated by Theano's lack of simple features (train on GPU, test on CPU), I think you could not be more wrong. Having a well-written, consistent and portable system is a huge plus. TensorFlow is likely to do to deep learning what Hadoop did to Big Data.

It's probably another efficient library, but it's good to have another baseline to compare things against.

FYI, Jeff Dean is the inventor of most of Google's distributed computing infrastructure including MapReduce. Definitely up there with the likes of John Carmack and Fabrice Bellard as one of the great software engineers of all time.

I'd wager that way more people know who Jeff Dean is than know who Fabrice Bellard is.

Maybe some people do, personally I'm more familiar with the works of Fabrice Bellard, and that's probably fair enough since I expect I have more pieces of software derived from Bellard projects installed on my machine than I do software derived from Jeff Dean projects (not to criticise Jeff Dean, who I'm sure is very inspirational).

I just wanted to do a +1 here, but as that is not very HN-worthy, I'll do a little more work and just say that a search in HNSearch for "Fabrice Bellard" produced 3 pages of results for the last year, whereas "Jeff Dean" only produced 2.

Personally, I feel I've been more exposed to Fabrice Bellard's work, which might not be true, but I first learned of Jeff Dean's existence yesterday.

This is the MapReduce white paper, from Jeffrey Dean and Sanjay Ghemawat, 2004, if people are interested. http://research.google.com/archive/mapreduce.html

The Jeff Dean Facts are worth reading: https://www.quora.com/What-are-all-the-Jeff-Dean-facts

Personal favorite: "Jeff Dean once shifted a bit so hard, it ended up on another computer."

"Jeff Dean's resume lists the things he hasn't done; it's shorter that way."

thanks for the link, it's really funny

Is there anything an "early" programmer like myself can do to play around with this stuff without a background in the related math?

I'm dying for this stuff to be dumbed down enough where Joe WebUser can feed in arbitrary data in a csv or point an app at a data source and get some sort of meaningful results.

It truly seems like an area where once the barrier to entry is greatly reduced, the creativity of laymen will lead to some truly amazing executions.

I recently asked the same question [1], and got several helpful replies.

I found this [2] very approachable, and you can find the material [3] and code [4] from the talks on github.

[1] https://news.ycombinator.com/item?id=10457439 [2] https://www.youtube.com/watch?v=r4bRUvvlaBw [3] https://github.com/scipy-conference/scipy2013_talks [4] https://github.com/jakevdp/sklearn_scipy2013

You can get away without math now, but it's still a pretty steep learning curve.

I'd suggest http://karpathy.github.io/2015/05/21/rnn-effectiveness/ is a good place to start.

The other option is using NVIDIA's DIGITS toolkit.

    I'm dying for this stuff to be dumbed down enough[...]
It kind of already is. Have you read the docs/examples? I don't think your mentality is fruitful. Having argued with people who shared your point of view, it seems there will always be something too difficult that prevents them from being good at X.

There's no substitute for sweat. Have fun with the code they gave you and see where you end up!

I think we have different definitions of what "dumbed down" means.

I never said I could/would never put in the work to learn it. I'm saying that the place it is in right now is still too advanced for someone with my background to pick up and play around with without sitting down to seriously study the underlying concepts that are objectively fairly dense subject matter that can require advanced math and CS backgrounds.

To be clear, I'm not advocating that everything should be dumbed down for the sake of it. My point was largely that when the barrier to creation gets low enough, more creative types that don't have the heavy technical backgrounds can pick it up and create things that more technical users may never have imagined.

Not everything is best served as remaining elusively complex for the layman.

Also, for the record, I will probably read up on some of this stuff because I find it interesting and enjoy learning. I just wish it was a step more accessible than it is today, even with this development.

I agree with Shostack. He is talking about removing friction, easing out the learning curve.

The command-line make command on GNU/Linux is an example of something that "dumbs down" a quick start and makes it easy, as opposed to editing and configuring the Makefile yourself. Similarly, yum/apt-get take this "dumbing down" one step further.

Nothing wrong with removing friction. In fact, there is an idea for a startup right here: remove friction from machine learning/NLP/API.

That is why I responded to shostack in the first place. The response was specific to his question and I got plenty of downvotes on my karma. No worries there :)

Hi Shostack,

I wasn't trying to bring you down. Maybe what you're looking for is a visual programming environment, where you can drag and drop functions, data, etc?

Not sure I even need a visual programming environment as I'm comfortable using a command line and hacking stuff together in Sublime.

However I am highly visual and visualizing the impact on the results would be really helpful. I deal with a lot of analytics and data as part of my day-to-day managing digital media. I often find that I can easily spot trends just by glancing at data visualizations, and infer insights from them.

Further, being able to visualize the nature of the functions/data/etc. would also be very helpful. I tend to need to visualize something to fully grok it.

If you have any suggestions for a more visual take on machine learning that is beginner friendly, I'd love a link.

It's probably best to understand what you're doing if you use something like this. I'd recommend at least going through an intro ML book/course to learn the basics first.

Any suggestions on good intro ML books?

Hi shostack, you can try out the ThatNeedle API (http://www.thatneedle.com/nlp-api.html). It only caters to retail for now, but the underlying capability is broader and more generic. If you have specific needs, let's get in touch and discuss. You should at least take a look at the gif demo on the site.

That's pretty cool, I'll play around with it. Thanks for sharing!

You are welcome!

TensorFlow looks amazing, and the Google team deserves huge kudos for open-sourcing it. It may well become one of the best supported OS DL frameworks out there.

People have been asking about its fundamental differentiators. I'm not sure there are any. Theano and Torch already set a pretty high standard.

We know what good tools look like, and those tools exist even if they're getting incremental improvements.

Now it's just a matter of building really cool things with them.

This is from ~22nd of October, and he is being asked whether it is an internal thing or going to be released - his answer is that it is internal and there are no plans to release it. Did they change their mind in a couple of days (not likely)? Was he not in the loop (also unlikely)? What else?

Maybe the gesture of touching his lips is a body-language sign of not telling the exact truth:


He says "I don't have anything to announce" so technically not lying.

It would be great if one could automatically dispatch this to a commercial cluster. So you could say: I want this network to be trained in 1 day, and the system would say: that would cost $X, and it would instantiate some AWS/Azure/Google instances, and run the task.
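The arithmetic behind such a quote is simple if you assume the job scales perfectly across machines, which real training jobs rarely do. A rough sketch under that assumption follows; the function name, the GPU-hour estimate and the price are all hypothetical inputs, not figures from any provider.

```python
import math


def plan_training_run(total_gpu_hours, deadline_hours,
                      price_per_instance_hour, gpus_per_instance=1):
    """Return (instances needed, estimated cost) to finish by the deadline.

    Assumes perfect linear scaling: doubling the GPUs halves the wall time.
    """
    gpu_hours_per_instance = deadline_hours * gpus_per_instance
    instances = math.ceil(total_gpu_hours / gpu_hours_per_instance)
    # Wall-clock hours actually used once the work is spread over all GPUs.
    wall_hours = total_gpu_hours / (instances * gpus_per_instance)
    cost = instances * wall_hours * price_per_instance_hour
    return instances, cost


# Hypothetical job: 480 GPU-hours of work, due in 24 hours,
# on 4-GPU instances priced at 2.0 per instance-hour.
instances, cost = plan_training_run(480, 24, 2.0, gpus_per_instance=4)
```

With these made-up numbers the plan is 5 instances for 24 hours at a cost of 240.0. Communication overhead between machines would push the real instance count and cost higher.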

> TensorFlow™ is an open source software library for numerical computation using data flow graphs.

> ...

> Gradient based machine learning algorithms will benefit from TensorFlow's automatic differentiation capabilities. As a TensorFlow user, you define the computational architecture of your predictive model, combine that with your objective function, and just add data -- TensorFlow handles computing the derivatives for you.
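For anyone wondering what "handles computing the derivatives for you" means in practice, here is a toy sketch of reverse-mode automatic differentiation over a small data flow graph in plain Python. It does not use TensorFlow and is not how TensorFlow is implemented; the `Node` class and `backward` function are invented for illustration. You define the model and the objective, and the gradients fall out of walking the graph backwards.

```python
class Node:
    """One value in a data flow graph, plus the local derivatives to its inputs."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (input node, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __sub__(self, other):
        return Node(self.value - other.value, ((self, 1.0), (other, -1.0)))

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))


def backward(output):
    """Reverse-mode differentiation: apply the chain rule from output to inputs."""
    order, seen = [], set()

    def visit(node):
        # Post-order traversal, so every node appears after its inputs.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed order guarantees a node's gradient is complete before it is used.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


# Model: prediction = w * x + b. Objective: squared error against target y.
w, b = Node(2.0), Node(1.0)
x, y = Node(3.0), Node(10.0)
error = w * x + b - y
loss = error * error
backward(loss)
```

After `backward(loss)`, `w.grad` and `b.grad` hold the derivatives of the loss with respect to the parameters, which is exactly what a gradient descent step needs.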

Interesting, it kind of looks like the machine-learning-focused version of NASA's OpenMDAO (also a graph-based analysis and optimization framework with derivatives, but for engineering design).

Jeff Dean has an amazing resume. He designed and implemented MapReduce, BigTable and much more.

OT but how much does a super engineer like him get paid at Google?

Largely it'll come down to how much money he wants.

His salary will probably be in the 6-figures, but he'll be a millionaire many times over. He joined Google in 1999 (IPO was in 2004), so his stock will have made him a very rich man.

Way off. He's a Senior Google Fellow and his bio used to be on the executive leadership page of Google (I can't find that page any more). He is being paid on the order of 10MM per year, base, easily.

Perhaps. I doubt his salary is that high, but he'll have significant amounts of stock grants, options, etc.

Executive compensation is a very odd area, as I said, it pretty much depends how much money he wants.

Yeah I should have been more clear. But I'm pretty sure his equity alone is more than $30M / year.

This was a great watch. It made me want to apply to the residency program. Edit: wasn't sarcasm.

Yay!! An open sourced voice engine in the future? That would really shake things up.


a) Stuff similar to this has been available for ages and there are no (good) open source voice recognition packages.

b) It requires absolute mountains of training data which we don't have.

c) It requires designing a suitable network, which I'm not sure if we have, but I would doubt it.

d) It requires training a network on the mountains of training data using an immense computing cluster, which requires money that we don't have.

Don't hold your breath.

Sometimes creativity is the only thing holding people back from exploiting the natural insects of the web.

Case in point, ever wonder why those captchas include street addresses or 'pick the shape with a hole in it?' Spoiler: you're building training data and validating training data.

How else can we silently retrieve training data?

Can anyone explain why this was downvoted?

I agree there. Plenty to be done still.

Who cares? Gradient descent is bandwidth limited not software limited. (edit: for ANNs)

Only for some applications, I think.
