> do you think it would be possible to train a DNN to learn to visualize the "most important" neuron activations / interactions of another DNN?
That sounds like a really hard problem. I'm not entirely sure what it would even mean, but it would not surprise me at all if there were some refinement of it that could turn into an interesting research direction! :)
Thanks. I asked the question in such an open-ended way just to see if you had any crazy ideas. It does sound like a hard problem.
In terms of what it could mean, one idea I just had is to take a trained model, randomly remove (e.g., zero out) neurons, and then train a second model to predict how well the trained model continues to work without those removed neurons. The goal would not be to 'thin out' the first model to reduce computation, but to teach the second model to identify important neurons/interactions. Perhaps the second model can learn to predict which neurons/interactions are most important to the first model, as a stepping stone for further analysis?
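Something like the following rough sketch, maybe (in PyTorch, with a toy MLP and synthetic data standing in for a real trained model; the model, names, and sizes are just placeholders for the idea, not a worked-out method):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# --- First model: pretend it has already been trained on some task ---
class SmallMLP(nn.Module):
    def __init__(self, n_in=20, n_hidden=64, n_out=2):
        super().__init__()
        self.fc1 = nn.Linear(n_in, n_hidden)
        self.fc2 = nn.Linear(n_hidden, n_out)

    def forward(self, x, mask=None):
        h = torch.relu(self.fc1(x))
        if mask is not None:          # zero out ("ablate") selected hidden neurons
            h = h * mask
        return self.fc2(h)

model = SmallMLP()
model.eval()

# Synthetic evaluation data standing in for the first model's real task.
X_eval = torch.randn(512, 20)
y_eval = (X_eval[:, 0] > 0).long()
loss_fn = nn.CrossEntropyLoss()

def performance_drop(mask):
    """Loss increase caused by ablating the masked-out hidden neurons."""
    with torch.no_grad():
        base = loss_fn(model(X_eval), y_eval)
        ablated = loss_fn(model(X_eval, mask=mask), y_eval)
    return (ablated - base).item()

# --- Build a dataset of (ablation mask, performance drop) pairs ---
masks, drops = [], []
for _ in range(2000):
    keep_prob = torch.rand(1).item() * 0.5 + 0.5   # ablate up to half the neurons
    mask = (torch.rand(64) < keep_prob).float()
    masks.append(mask)
    drops.append(performance_drop(mask))
masks = torch.stack(masks)
drops = torch.tensor(drops).unsqueeze(1)

# --- Second model: predicts the drop from the ablation mask alone ---
predictor = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))
opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(predictor(masks), drops)
    loss.backward()
    opt.step()

# A crude importance score per neuron: the predicted damage when only that
# one neuron is removed. Larger = more important (under this proxy).
with torch.no_grad():
    single_ablations = 1.0 - torch.eye(64)   # row i ablates neuron i
    importance = predictor(single_ablations).squeeze()
print(importance.topk(5).indices)
```

The interesting part (if it works at all) would be that the second model sees only masks, so whatever it learns about which neurons matter, and how they interact when removed together, is something you could then probe directly.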