Jeff Dean on Large-Scale Deep Learning at Google (highscalability.com)
140 points by charlieegan3 on Mar 16, 2016 | 30 comments

It seems to work fine for me. Here's a link to the actual talk on YouTube: https://www.youtube.com/watch?v=QSaZGT4-6EY

Jeff Dean - Chuck Norris for us nerds - fact as a bonus: "The rate at which Jeff Dean produces code jumped by a factor of 40 in late 2000 when he upgraded his keyboard to USB2.0."

Tangentially, the pace of papers coming out in machine learning is insane. It's so fast that people may literally cite PowerPoint slides when the paper doesn't exist yet. The culture of openness seems to have fostered this insane pace. Contrasting that with the reclusive culture of the life sciences explains why progress there is slow.

I think what you're saying is broadly true, but I'd moderate it a bit by pointing out that reproducing and extending research in software is a heck of a lot simpler and faster than working in topic areas that require physical experiments and processes.

If someone with technical expertise wanted to keep up on this field, but it wasn't their profession - i.e., they don't need to know every detail and don't have time to read a lot - what would be a good source?

Follow Yann LeCun's posts on Facebook.

Aren't the systems being studied in life sciences much more complex than in machine learning?

Consider the problem of protein folding, which has taken up many a processor cycle over the past decade or so. And that's just a tiny sliver of life sciences.

ML and life sciences operate on different abstraction levels. It's like asking which is more complex: quantum mechanics or algebraic topology.

If you like this talk, come see him talk about what's even beyond that at GCP Next: https://cloudplatformonline.com/NEXT2016.html

Disclaimer: I will be there freaking out because I work at Google on Cloud and Jeff Dean is rad.

>> If you’re not considering how to use deep neural nets to solve your data understanding problems, you almost certainly should be. This line is taken directly from the talk

And this is exactly why Google's hype of their tech is getting dangerous for everyone else, who is not Google. Because they advocate, nay, they preach, that everyone should abandon what they're doing and do what Google tells them works. And, oh, look, we just released those nice, free tools you can use to do it like we do!

Which is insane. Google is a corporate entity. It has financial interests. The purpose of its existence is to sell you its stuff, it doesn't give a dime if you'll solve your problems or not.

This piece of advice is like Bayer, back in the day, selling its Aspirin as the cure of all ills: "If you're not considering how to take Aspirin to solve your health problems, you almost certainly should be".

Although Google is a corp and has financial interests, I think it's in Google's interest to share these ideas in workable form with the world. It can (and I hope it will) contribute a lot to improving a number of things that are wrong with the world.

When I was an academic scientist in the mid 2000s, I ended up with more data than I could deal with, and none of the computing systems in academia at the time dealt well with that (they were tuned for HPC/supercomputers). The Bigtable, MapReduce, and GFS papers were huge to me, because they provided a nicer framework for data processing. Although Google made those tools for Search and Ads (and profited greatly from them), they also published them, and Doug Cutting and others incorporated them into Hadoop. A similar thing is happening now, but Google has gotten better at releasing its code as open source, which reduces the time between publication of a good idea and replication of that work by others outside the corp.

(Eventually, I went to Google to get direct access to its infrastructure: I built Exacycle and gave away an enormous amount of free computing time that cost Google rather than profiting it. The leadership loved it even though it cost money, and I even managed to get Googlers to apply machine learning to academic problems I cared about.)

So I don't think Google solely acts in its own short term financial interests.

Also, aspirin has turned out to be amazing at solving a wide range of health problems, so I think Bayer was probably right (if not for the right reasons) on that one.

>> So I don't think Google solely acts in its own short term financial interests.

I think what your experience shows is that on the one hand individuals within Google (or any big corp) can and do align their own personal interest with that of the corp and on the other hand that the corp can benefit the community as long as it is making profit and serving its own purposes. Nothing surprising there.

As to releasing its tools, here's my Thought for the Day: There's no such thing as a free lunch and the only people who pretend there is are the ones who want to steal your lunch money. Google releases its tools when it is in the interest of Google to do so, not when it's in the interest of anyone else. Yes, they're doing better now than in the past at open-sourcing stuff, and I can't know what's on their mind. But I can tell that it doesn't hurt them to get people adopting their tech even as Google itself develops it further and further into something that can only be used by a corp with Google's resources. In short, I'm pretty sure that their friendly offer of, for example, TensorFlow is just some trick to get people roped in to their technology, in the same way that other corps have tried to do before, except that they also made you pay for the privilege.


Did you really say that making TensorFlow open source is a trick to get people roped into Google technology?

That doesn't make any sense to me.

Another big point I think you missed is that those individuals within Google influence the decisions about what gets open sourced. We have an entire team that facilitates taking Google-written code and open-sourcing it.

OK, with the hindsight of a good night's sleep I admit that the bit about giving away TensorFlow does sound a bit tinfoil-hats on.

Let me rephrase that then: I can't possibly hope to know why Google is giving away free stuff. I can certainly know that they don't do it out of the kindness of their hearts though.

That said, I am indeed very concerned that Google is trying to shape, not only the market, but the science also, to suit its own interests. That could be really bad for everyone, including Google; if research stagnates, they too will find themselves unable to deliver on their big promises about ever speeding progress.

>> Did you really say that making TensorFlow open source is a trick to get people roped into Google technology?

Uh, yes? It's a market entity handing out free stuff.

I know, you're saying they do it out of the joy of handing out free stuff. But, well, that is what really makes no sense.

Google is made of thousands of engineers (and other employees). Most of these engineers love to solve cool stuff (well, while getting paid for it, of course). And, being engineers, when they think they found a particularly cool solution to a cool problem, they would love to talk about it to everyone else. Google has to actively police its own engineers to keep their mouth shut (just like any other company would).

Engineers want to talk about their ingenious solutions. Google wants to keep them happy (as long as it doesn't cost too much), because otherwise they will just leave and join Facebook. No need to imagine a conflict of interest where none exists.

(Yes, yes, I know they are not legally engineers according to some laws in some part of the world.)

The concept of keeping secrets does not apply to companies who are their own best customer. Witness Amazon putting AWS out there for dirt cheap. Witness GitHub and all the open sourcing. Witness Microsoft and its open sourcing. Witness Apple open sourcing part of its OS. The publication of 'secrets' is no longer revealing the company jewels. Secrets are so complex and the infrastructure is so large that the people who best know their value, flaws and shortcomings are the owners! It would have been like AT&T revealing the 'secret' of their network switch. Who else could possibly use it? Or Western Electric revealing the 'secret' of their wire.

Absolutely, and even more, you cannot do modern research unless you are part of Google. Only big corporations have the computing power and data access to do large-scale data research with neural networks. This basically rules out a lot of teams who have much less funding.

He gave a similar talk at Stanford a few days later: https://www.youtube.com/watch?v=T7YkPWpwFD4

Nice. Forbidden. Did we manage to crash the site? highscalability.com is supposed to be a pretty high-volume site.

Sorry about this. It means Squarespace has blacklisted your IP for some reason. Unfortunately I can't do anything about it. If you can try from another address it will probably work.

Wow :-) I am working from a corp office. But thanks!

I am blocked on a (Verizon) cellphone.

Blocked for me too... working from a govt. office.

I am very interested in AI that can teach itself (sounds too great). Where can I learn about such AI (related concepts and the whole nine yards) so I can start reading papers in the field? I am just looking for comprehensive sources (preferably textbooks).

AI by Russell and Norvig, Machine Learning by Murphy, Elements of Statistical Learning by Hastie et al. Just a few good ones out of many!

AI by Russell and Norvig is one of my favorite textbooks of all time.

I wish I had more than one upvote for this article. Read the article. If you have the time, just watch the video.

You need to understand the data before it can be put to good 'use'.


From the article

"...it seems like an excellent time to gloss Jeff’s talk..."

"gloss" a talk? WTF?

To gloss is to annotate some text (or talk)[1], the word glossary comes from that. That meaning is overshadowed by the more modern association with shininess but the annotation meaning seems appropriate here.

[1] https://en.wikipedia.org/wiki/Gloss_(annotation)
