
It's not that we need to understand our neural networks better, it's that we need to understand our problem domain better.

How 'bout "creating models that can work with more dimensions of the problem domain than are conveyed by standard data labeling"?

I mean, we don't simply want AI but actually "need" it, in the sense that problems like biological systems are too complex to understand without artificial enhancements to our comprehension processes - thus to "understand the problem domain better" we need AI. If it's also true that "to build AI, we need to understand the problem domain better", that leaves us stuck in a chicken-and-egg problem. That might be the case, but if we're going to find a way out, we're going to need to bootstrap our way there by building tools, the way humans have solved such problems many times before.




It will probably play out like a conversation. A data scientist trains an ML model, and in analyzing the results discovers some intrinsic property or invariant of the problem domain. The scientist can then encode that information into the model and retrain. And that goes on and on, each time providing more accurate results.
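Something like that loop, as a rough sketch (the CSV, column names, and the "density" invariant are all made up for illustration):

    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    df = pd.read_csv("measurements.csv")                # hypothetical data set
    X, y = df[["mass", "volume"]].copy(), df["target"]

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    print(cross_val_score(model, X, y).mean())          # pass 1: baseline score

    # Suppose inspecting the results suggests the target really depends on
    # density - an invariant of the domain. Encode it explicitly and retrain.
    X["density"] = X["mass"] / X["volume"]
    print(cross_val_score(model, X, y).mean())          # pass 2: usually better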

As an aside, I think it's important that we find a way to examine and inspect how an ML model "works". If you have some neural network that does really well at the problem, it would be nice if you could somehow peer into it and explain, in human terms, what insight the model has gained into the problem. That might not be feasible with neural networks, as they're really just a bunch of weights in matrices, but it is practical for something like decision trees. Just food for thought.
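For the decision-tree case, that inspection is essentially one function call. A toy sketch using scikit-learn's bundled iris data, just for illustration:

    # Train a small tree and print its decision rules in human-readable form.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    data = load_iris()
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

    # Prints nested if/else rules over the feature names (thresholds on petal
    # width/length, etc.) - the "insight" the model has encoded about the problem.
    print(export_text(tree, feature_names=list(data.feature_names)))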


This is somewhat practical for neural networks. For example, instead of minimizing the loss function, why not tweak the input to maximize a neuron’s activation? Or, with a CNN, maximize the sum over one channel of a kernel’s output? That would tell us what the neuron corresponds to. This is what Google did with DeepDream.
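A rough sketch of that gradient-ascent-on-the-input idea (using PyTorch/torchvision here for brevity; the layer and channel are arbitrary picks, and real DeepDream adds jitter, multi-scale processing, and smoothing to keep the images clean):

    import torch
    from torchvision.models import googlenet, GoogLeNet_Weights

    model = googlenet(weights=GoogLeNet_Weights.DEFAULT).eval()
    for p in model.parameters():
        p.requires_grad_(False)            # we only optimize the input image

    activation = {}
    model.inception4b.register_forward_hook(
        lambda _m, _i, out: activation.update(feat=out))

    img = torch.rand(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([img], lr=0.05)

    for _ in range(100):
        opt.zero_grad()
        model(img)                                  # hook stores the layer's output
        loss = -activation["feat"][0, 42].mean()    # maximize one channel's activation
        loss.backward()
        opt.step()
    # `img` now shows (roughly) what channel 42 of that layer responds to;
    # input normalization and regularization are skipped to keep the sketch short.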

An explanation/tutorial, with clean images of the process: https://github.com/tensorflow/tensorflow/blob/r0.10/tensorfl...

Google’s investigation of its GoogLeNet architecture: http://storage.googleapis.com/deepdream/visualz/tensorflow_i...

Now, I say somewhat because the results can be visually confusing - see Google’s analysis above. Even then, we can see the progression of layer complexity as we go deeper into the network. Plus, we can see that mixed4b_5x5_bottleneck_pre_relu has kernels that seem to correspond to noses and eyes, and mixed_4d_5x5_pre_relu has a kernel that seems to correspond to cat faces.


> A data scientist trains an ML model, and in analyzing the results discovers some intrinsic property or invariant of the problem domain. The scientist can then encode that information into the model and retrain. And that goes on and on, each time providing more accurate results.

Mmmaybe,

It's tricky to articulate what pattern the data scientist could see that an automated system couldn't. Or, alternatively, perhaps the whole "loop" could be automated. Or possibly the original neural network already finds all the patterns available, and what's left can't be interpreted.


The human participant may consider multiple distinct machine results, each a point in the space of algorithm, data set, and bias applied to the problem domain. Human intuition is injected into the process, and the result will be greater than the sum of the machines and a lone human mind.

What is interesting to note, now that the above idea is on the table, is that this process model itself belongs to the set of human-machine coordinations. Another process model is one where low-level human cognition is used to perform recognition tasks too hard (or too slow) for a machine, for example using porn surfers to perform computation tasks via CAPTCHA-like puzzles.

The long-term social ramifications of all this are also interesting to consider, as it motivates machines to breed distinct types of humans ;)


I imagine you need the data scientist to discern semantically relevant signals from irrelevant ones. How else do you “tell” your model what to look for? You could easily end up training a model that fits well but on something irrelevant.
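A toy illustration of that failure mode (made-up data with a deliberately leaked feature, just to make the point):

    # The label leaks into a spurious training feature; the model fits it
    # near-perfectly, then collapses on data where the shortcut is absent.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    y_train = rng.integers(0, 2, 1000)
    signal   = y_train + rng.normal(0, 2.0, 1000)    # weak but genuine signal
    shortcut = y_train + rng.normal(0, 0.01, 1000)   # irrelevant shortcut
    clf = LogisticRegression(max_iter=1000).fit(np.column_stack([signal, shortcut]), y_train)
    print(clf.score(np.column_stack([signal, shortcut]), y_train))   # ~1.0, looks great

    y_test = rng.integers(0, 2, 1000)
    X_test = np.column_stack([y_test + rng.normal(0, 2.0, 1000),
                              rng.normal(0, 0.01, 1000)])            # shortcut gone
    print(clf.score(X_test, y_test))                                 # close to chance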



