
The problem is that the true, hidden function being modeled is non-differentiable. We can try to approximate that hidden function with a differentiable one, yes, but depending on the nature of the problem, no good approximation is guaranteed to exist. An example from the OP:

> What is true is that if you need to make a hard decision in one part of the network, for example deciding that the image is a cat, in order to further process that decision, this would not be differentiable.

CNNs can work for image classification because a smooth gradient exists for incrementally "stepping" color values during backprop. But asking "is this an image of a cat?" is very different from asking "what do I do if this is an image of a cat?".
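
To make that concrete, here is a minimal PyTorch sketch (my own toy example, not from the article): the soft "probability of cat" keeps a gradient, while the hard argmax-style decision gives autograd nothing to work with.

    import torch

    logits = torch.tensor([2.0, 0.5], requires_grad=True)  # scores for [cat, not-cat]

    # Soft decision: differentiable, the gradient flows back to the logits.
    p_cat = torch.softmax(logits, dim=0)[0]
    p_cat.backward()
    print(logits.grad)  # non-zero

    # Hard decision: the comparison is piecewise constant, so autograd does not
    # track it at all; the "gradient" of the decision with respect to the logits
    # is zero almost everywhere.
    is_cat = (logits[0] > logits[1]).float()
    print(is_cat.requires_grad)  # False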

Also, very deep neural networks (i.e. many, many time steps) essentially lose their gradient, or context, even with attention and transformers. That is partly why we haven't seen AIs that can write lengthy programs, books, etc. that are coherent and grammatically correct, and we probably never will by relying only on current "curve fitting" techniques.
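
As a rough illustration of the vanishing-gradient point (again a toy setup of my own, not a claim about any particular architecture): push an input through many plain tanh layers and the gradient reaching it all but disappears.

    import torch

    depth, width = 100, 16
    x = torch.randn(1, width, requires_grad=True)

    h = x
    for _ in range(depth):
        # Plain tanh layers with small random weights, no skip connections or
        # normalization, so the gradient signal decays multiplicatively with depth.
        h = torch.tanh(h @ (0.1 * torch.randn(width, width)))

    h.sum().backward()
    print(x.grad.norm())  # typically vanishingly small for depth this large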




"what do I do if this is an image of a cat"

You can train another NN to predict the best course of action for each decision of the first net, or even train a single net to choose actions based on the initial input. Not sure what the problem is here.
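
Something like this PyTorch sketch, say (layer sizes and names are mine, purely illustrative): the first net's soft class probabilities feed a second "what do I do" net, and the whole pipeline stays differentiable because nothing takes a hard argmax in between.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    classifier = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))  # cat vs. not-cat
    policy     = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 4))   # 4 possible actions

    x = torch.randn(8, 64)                             # stand-in for a batch of image features
    class_probs = torch.softmax(classifier(x), dim=1)  # soft decision, stays differentiable
    action_scores = policy(class_probs)                # second net consumes the decision

    target_actions = torch.randint(0, 4, (8,))         # dummy labels just to drive backprop
    loss = F.cross_entropy(action_scores, target_actions)
    loss.backward()  # gradients flow through the policy AND back into the classifier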

> we haven't seen AIs that can write lengthy programs, books, etc. that are coherent and grammatically correct, and we probably never will

Have you completely missed the recent NLP breakthroughs (BERT, GPT-2, XLNet, etc.)? OpenAI even declined to release its full GPT-2 model at first because it could generate long, coherent, grammatically correct text.
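
For what it's worth, you can try the publicly released smaller GPT-2 checkpoint yourself; a minimal sketch using the Hugging Face transformers library (my addition, not something mentioned above, and the prompt is made up):

    from transformers import pipeline

    # Publicly released (smaller) GPT-2 checkpoint; quality varies with model size.
    generator = pipeline("text-generation", model="gpt2")
    out = generator("The neural network was trained to", max_length=80, num_return_sequences=1)
    print(out[0]["generated_text"])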


What do you mean by "good approximation"? Arbitrarily close?

Can you give an example of such a function? They very well may exist, but I think we can agree that non-differentiability of the target function is not, by itself, enough to rule out a good approximation.
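
For instance, a smooth MLP fits a non-differentiable target such as f(x) = |x| without trouble; a minimal sklearn sketch (my own, hyperparameters arbitrary):

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Non-differentiable target: f(x) = |x|, with a kink at 0.
    x = np.linspace(-1, 1, 2000).reshape(-1, 1)
    y = np.abs(x).ravel()

    net = MLPRegressor(hidden_layer_sizes=(32, 32), activation="tanh",
                       max_iter=5000, tol=1e-7, random_state=0)
    net.fit(x, y)
    print(np.max(np.abs(net.predict(x) - y)))  # max error, typically small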

The function that describes the neural network itself has to be differentiable. Whether we can create a differentiable function/NN for any kind of input remains to be shown.


> Can you give an example of such a function?

Brownian motion. Try using a NN to model stock prices (not Brownian exactly, but same concept).
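
If you want to try exactly that, here is a rough sketch (entirely my own setup, all parameters invented): fit a small net to a simulated geometric-Brownian-motion "price" series and compare it with the naive "tomorrow equals today" baseline.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    # Geometric Brownian motion as a stand-in for a price series.
    n, window = 5000, 20
    prices = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=n)))

    # (window of past prices) -> (next price) pairs.
    X = np.stack([prices[i:i + window] for i in range(n - window)])
    y = prices[window:]

    split = int(0.8 * len(X))
    net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=1000, random_state=0)
    net.fit(X[:split], y[:split])

    nn_mse    = np.mean((net.predict(X[split:]) - y[split:]) ** 2)
    naive_mse = np.mean((X[split:, -1] - y[split:]) ** 2)  # "tomorrow = today"
    print(nn_mse, naive_mse)  # for a (near-)martingale the naive baseline is hard to beat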


> Try using a NN to model stock prices

IIRC, the first neural network book I read in the 1990s had that as the big illustrative application in the latter part of the book.


Yes, but it doesn't work well.

There are plenty of other time series problems where NNs don't outperform classical methods such as ARIMA.


Once again I ask: what do you mean by "works well"? Just because some other ML method is better in practice doesn't mean that a NN cannot achieve the same degree of approximation in theory.


> What do you mean by "works well"?

There is no empirical data supporting that NNs are more effective here than other methods.

Theory is not that valuable when evaluating papers in AI. It is all about the empirical results.


I'm wondering whether ARIMA models aren't just a very simple case of a NN (time convolution + regression)?
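
For the AR part that intuition is basically right: an AR(p) forecast is just a linear map over the last p values, i.e. a single linear layer (or 1-D convolution) with no activation; the MA and differencing parts of ARIMA don't reduce to this as directly. A quick sketch (my own, with made-up coefficients):

    import numpy as np

    rng = np.random.default_rng(0)

    # AR(2) process: y_t = 0.6*y_{t-1} - 0.2*y_{t-2} + noise
    y = np.zeros(3000)
    for t in range(2, len(y)):
        y[t] = 0.6 * y[t - 1] - 0.2 * y[t - 2] + rng.normal(scale=0.1)

    # A "one-layer NN with no activation": regress the next value on the last two lags.
    X = np.stack([y[1:-1], y[:-2]], axis=1)
    coeffs, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
    print(coeffs)  # roughly [0.6, -0.2]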



