
I find it fascinating that deep learning research is No. 1 on HN. Not even six months ago it was nearly unheard of on the top 3-4 pages. Is this an indicator that machine intelligence is the new normal?



Nah. Ever since neural networks started being called deep learning, the topic has become very fashionable.

There's some well-funded startup (SkyMind maybe?) that declares on its homepage that deep learning is sooo much better than machine learning (first wtf). Then they explain that neural nets with fewer than 3 hidden layers are machine learning, and with more than 3 layers it's deep learning (second wtf).

I didn't know if I should laugh or cry. (Not that what they actually seem to be doing is bad, but this newfound hype is just wrong.)


Adam (founder of Skymind) here. I'd like to say that we don't encourage treating neural networks as a black box like a lot of the startups out there.

I also mainly advocate neural networks for unstructured data, where the results are proven to be a significant improvement over other techniques. Many startups in the space would have you believe that a simple GUI is all you need and you're somehow set to go.

In my upcoming O'Reilly book, Deep Learning: A Practitioner's Approach, I go through a practical, applications-oriented view of neural networks that I think will help change things in the coming years.

I'd also add that I've implemented every neural network architecture and allow people to mix and match them. A significant result that many people are familiar with is the image scene description generator by Karpathy et al. [1].

Either way, unlike most, our code is open source :). I don't claim things are ideal or perfect, but it's out there for you to play with. I focus on integrations, packaging, and providing real value rather than pretending that some algorithm is going to be my edge. Many startups in the space will tell you they have awesome cutting-edge algorithms, when in reality you should be providing a solution for people.

[1]: https://github.com/karpathy/neuraltalk/tree/master/imagernn


Thanks for commenting here, Adam. I was wondering if you could address the issues that sz4kerto brought up. Do you think that deep learning is better than machine learning? And do you define deep learning as a neural network with >= 3 layers as opposed to machine learning being < 3 layers?

I don't see those claims anywhere on your website and I have no idea where the grandparent comment got those statements from. Frankly, they don't make any sense. But I do share his or her sentiment that the phrase "deep learning" is overused, poorly understood and is rapidly becoming a meaningless buzzword akin to "Big Data" or "Web 2.0".

I don't expect you to explain statements that you don't appear to have made, but I would like to hear an expert's view on exactly what deep learning is and how it compares to other machine learning techniques. I understand if you've already addressed this issue in your book and we can read your thoughts on this when it's published. But just a few sentences here might do a lot to clear up some misconceptions for fellow HN readers.


I think there are trade-offs. I apologize if our marketing copy seems excessive; I'd love to fix that, and I take feedback seriously. Please do email me.

Deep learning, done right and applied to unstructured data, works great either as part of an ensemble of methods or for working with hard-to-engineer features.

As for normal machine learning, where you are typically doing feature engineering, you need to understand what's going on to make recommendations and actionable insights. Everything has its place.

I'd like to echo Andrew Ng here: the hype around deep learning is overblown. While there are great results, it's not magic.

Neural networks still need feature scaling, among other things, to work well. Much of the hype comes from the giant marketing machine that is the PR firms for research labs that need new data scientists.
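To make that concrete, here's a rough standardization sketch of the kind of feature scaling I mean (plain Scala, purely illustrative, not Skymind/DL4J code):

    // Scale each feature column to zero mean and unit variance before
    // feeding it to a neural network (illustrative sketch only).
    def standardize(data: Array[Array[Double]]): Array[Array[Double]] = {
      val nFeatures = data.head.length
      val means = (0 until nFeatures).map(j => data.map(_(j)).sum / data.length)
      val stds = (0 until nFeatures).map { j =>
        math.sqrt(data.map(row => math.pow(row(j) - means(j), 2)).sum / data.length)
      }
      data.map(row => row.indices.map(j =>
        if (stds(j) == 0) 0.0 else (row(j) - means(j)) / stds(j)).toArray)
    }

Without scaling like this, features on very different ranges can make training needlessly hard.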

While great work is being done in these labs, much of it isn't going to be applicable in the day-to-day work of a data scientist just trying to do some basic A/B testing. Hope that makes sense!


http://www.skymind.io/contact.html

Quote:" On a technical level, deep-learning networks are distinguished from the more commonplace single-hidden-layer neural networks by their depth; that is, the number of node layers through which data is passed in a multistep process of pattern recognition. More than three layers, including input and output, is deep learning. Anything less is machine learning. The number of layers affects the complexity of the features that can be identified."

Again, it's great what you are doing in general; I just brought this up as an example of the buzzwords and hype around deep learning.


I see. We can modify that ;). Thanks for the shout out. Criticisms are how we learn.


This is fixed, jfyi.


Well,

May I ask what makes a neural network "not a black box"?

You do mention "proven results". It seems to me that an experiment where one approach does better than another is compatible with one or both approaches being black boxes, i.e., there not being more of an explanation than "it works".

But if there's something more going on here, I would love to hear more details.


Sure. Neural networks don't have feature introspection; you use random forests for that. Hinton and others, as well as what's reflected in my software, encourage you to debug neural networks visually.

This could mean debugging neural networks with histograms to make sure the magnitude of your gradients isn't too large; rendering the first layer's filters if you're doing vision work, to see whether features are being learned properly; using t-SNE to look at groupings of neural word embeddings to ensure they make sense; or, as Mikolov et al. do with word2vec, measuring accuracy on predicted words with nearest-neighbor approaches.
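As a toy illustration of the histogram idea (plain Scala, a hypothetical helper rather than anything in our actual tooling):

    // Bucket gradient magnitudes into a crude text histogram so you can
    // eyeball whether updates look reasonable, exploding, or vanishing.
    def gradientHistogram(grads: Array[Double], bins: Int = 10): Unit = {
      val mags = grads.map(math.abs)
      val maxMag = mags.max
      val width = if (maxMag == 0) 1.0 else maxMag / bins
      val counts = new Array[Int](bins)
      mags.foreach { m =>
        val idx = math.min((m / width).toInt, bins - 1)
        counts(idx) += 1
      }
      counts.zipWithIndex.foreach { case (c, i) =>
        println(f"[${i * width}%.4f, ${(i + 1) * width}%.4f): " + "#" * c)
      }
    }

If most of the mass sits in the top bucket, your learning rate or initialization probably needs another look.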

For sound, one thing I was thinking of building was a playback mechanism. With Canova as our vectorization lib that converts sound files to arrays, you could feed those into a neural network and then listen to what it reconstructs.

The takeaway is that while you can't rank features with information gain like you can with random forests, you at least don't have to go in completely blind.

Remember, one of the key takeaways with deep learning is that it works well on unstructured data, i.e., things where manual feature engineering is brittle to begin with.

Edit: re proven results

Audio: https://gigaom.com/2014/12/18/baidu-claims-deep-learning-bre...

Vision: http://www.33rdsquare.com/2015/02/microsoft-achieves-substan...

Text: http://nlp.stanford.edu/sentiment/

To further clarify what I mean by black box: I don't like typing:

model = new DeepLearning()

What does that mean?

How do I instantiate a combination architecture? Can you create a generic neural net that mixes convolutional and recursive layers? What about recurrent? How do I use different optimization algos?

Not to pick on MLlib, but another example:

val logistic = new LogisticRegressionwithSomeOptimizationAlgorithm()

Why can't it be: val logistic = new Logistic.Builder().withOptimization(..)
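As a sketch of what I'm picturing (Scala, with every name here hypothetical rather than any real library's API):

    // Hypothetical builder-style configuration: the optimizer, learning rate,
    // and iteration count are explicit calls instead of being baked into the
    // class name.
    class LogisticConfig(val optimizer: String, val learningRate: Double, val iterations: Int)

    object Logistic {
      class Builder {
        private var optimizer = "sgd"
        private var learningRate = 0.01
        private var iterations = 100

        def withOptimization(name: String): Builder = { optimizer = name; this }
        def withLearningRate(lr: Double): Builder = { learningRate = lr; this }
        def withIterations(n: Int): Builder = { iterations = n; this }
        def build(): LogisticConfig = new LogisticConfig(optimizer, learningRate, iterations)
      }
    }

    // Usage: the configuration reads as a series of explicit choices.
    val logistic = new Logistic.Builder().withOptimization("lbfgs").withLearningRate(0.05).build()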

The key here is visualization and configuration.

Hope that makes sense!


Hmm,

Perhaps a thing with lots of parameters and some tools/rules of thumb for debugging might be called a "gray box," whereas a "real" statistical model with a defined distribution, tests of hypothesis validity, and so forth could be called a "white box".


Right. Being able to troubleshoot models with some level of certainty is better than nothing. Tuning neural nets is still more of an art than a science in some respects; I'm not aiming for anything magical here. My first step with neural nets is to make them usable.


I work / used to work in ML/AI. IMO he is saying that a "black box" approach is one that just shoves the methodology onto everything without regard for first structuring and analyzing the problem, which is VERY common in the field. I respect his work greatly.


I'd like to add that one thing that makes Deeplearning4j's neural networks "not a black box" is the fact that they are open source. Many deep-learning startups don't have their entire code base available on GitHub.


First of all, not all neural nets are deep, and many people discussing "deep" learning actually know the difference.

Secondly, if you'd like to dispute the series of records broken by deep learning across many benchmark datasets, feel free to take that up with Geoff Hinton and Andrew Ng. Skymind is hardly saying anything new when it points to the advances deep learning has made in unsupervised data.


Meh, people just like simple solutions (or at least those which sound simple).


The research project this is part of was discussed almost 3 years ago:

https://news.ycombinator.com/item?id=4779647

There are lots of other results for 'deep networks' in the search too (sorting by date, they appear quite consistently, though many of them don't get many votes).


I think it went 'mainstream' around 2007, with Hinton's Tech Talk at Google (http://www.youtube.com/watch?v=AyzOUbkUf3M). Even being pretty far from the ML/AI community at the time, I remember playing with Bengio's GPU-based Theano deep learning tutorial around 2008/09. What has happened recently is just that it has finally started beating SVMs consistently and is fast enough to be used for practical purposes.


Hinton's Coursera class a couple years ago pulled me in. I wish there were more Coursera classes at that level.


Deep learning has hit the front page regularly for over a year now.



