Branch Specialization

yorwba · on April 6, 2021

I wonder whether the split into high-frequency black-and-white and low-frequency color is an artifact of training on images that were compressed using chroma subsampling, which discards high-frequency color variations. That's a pretty common trick for getting better compression ratios without visibly affecting quality, because humans aren't as sensitive to changes in color as to high-frequency lightness changes.
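For readers unfamiliar with chroma subsampling, here is a rough NumPy sketch of the idea, not the code of any particular codec. It assumes JPEG-style full-range BT.601 weights and a simple 2x2 box average for the chroma planes (a 4:2:0-like scheme); the function names and the red/green checkerboard test image are made up for illustration.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert float RGB in [0, 1] to YCbCr (full-range BT.601 weights, as in JPEG)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma (lightness-like channel)
    cb = 0.5 + (b - y) / 1.772              # blue-difference chroma
    cr = 0.5 + (r - y) / 1.402              # red-difference chroma
    return np.stack([y, cb, cr], axis=-1)

def subsample_chroma_420(ycbcr):
    """Average Cb and Cr over 2x2 blocks and repeat them back to full size.

    Y is left untouched. Assumes even height and width.
    """
    h, w, _ = ycbcr.shape
    out = ycbcr.copy()
    for c in (1, 2):
        blocks = ycbcr[..., c].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        out[..., c] = np.repeat(np.repeat(blocks, 2, axis=0), 2, axis=1)
    return out

def red_green_checkerboard(size=8):
    """Hypothetical test image: a one-pixel checkerboard of pure red and pure green."""
    yy, xx = np.indices((size, size))
    mask = ((yy + xx) % 2 == 0)[..., None]
    red = np.array([1.0, 0.0, 0.0])
    green = np.array([0.0, 1.0, 0.0])
    return np.where(mask, red, green)

ycc = rgb_to_ycbcr(red_green_checkerboard())
sub = subsample_chroma_420(ycc)

# The luma plane still alternates pixel to pixel; the chroma planes go flat.
print("luma std before/after:", ycc[..., 0].std(), sub[..., 0].std())
print("Cb std before/after:  ", ycc[..., 1].std(), sub[..., 1].std())
```

In this toy case the highest-frequency color pattern disappears entirely while the lightness pattern survives, which is the asymmetry the hypothesis relies on. Real encoders also quantize frequency coefficients and may use different chroma filters, so this only shows the subsampling step.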

PaulHoule · on April 6, 2021

No.

What is striking is that the animal visual system works the same in terms of horizontal and vertical splits. For each kind of feature neuron they find in AlexNet somebody found a neuron in an animal that fires like that back in the 1970s.

(It is structural that animal vision privileges value over hue: you have little trouble recognizing something in moonlight to be the same thing you saw in sunlight despite the fact that one uses rods and the other cones.)

All the time somebody shows me a picture and I tell them that I saw that in Scientific American magazine when I was a kid and they say... "no no, you are not allowed to make an analogy with animals!"

That is one reason why research in neural networks proceeds so slowly.

colah3 · on April 6, 2021

It's certainly true that there are strong biological analogies. The analogy between first layer conv features and neuroscience is pretty widely accepted -- a lot of theoretical neuroscience models produce the same features. (It's less clear for later layers whether they're biologically analogous. Several papers have found that the aggregate of neurons in those layers is able to predict biological neurons quite well, but I don't think we have a detailed and agreed-upon characterization of the features that exist on the biological side to make a strong feature-level case.)

The color vs black and white split also has biological analogies.

With that said, I'd hesitate to dismiss the GP comment. Separate from the color vs grayscale split, why do we observe low-frequency features preferring to group with color? It seems very plausible to me that if there's a systematic artifact from how the training data was compressed, that could play a role. Either way, it makes the argument that this is emerging from purely natural data and the network less clean. (One caveat is that these models are trained on very downscaled versions of larger images. Even if high-frequency data was discarded in the original, that wouldn't necessarily mean that high-frequency data was discarded in the downsampled version the network sees. It would depend on details of the data processing pipeline.)
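The downscaling caveat can be illustrated with a toy example. This is a sketch under strong assumptions, not a model of any real data pipeline: the chroma plane is random noise, the subsampling is a 2x2 box average, and the downscale is an aligned 4x4 box average. The helper names are hypothetical.

```python
import numpy as np

def box_downscale(channel, factor):
    """Downscale a 2D array by averaging aligned factor x factor blocks."""
    h, w = channel.shape
    return channel.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def subsample_then_upsample(channel, factor=2):
    """Simulate chroma subsampling: box-average, then repeat back to full size."""
    low = box_downscale(channel, factor)
    return np.repeat(np.repeat(low, factor, axis=0), factor, axis=1)

rng = np.random.default_rng(0)
chroma = rng.random((64, 64))                  # stand-in for a full-resolution Cb or Cr plane
degraded = subsample_then_upsample(chroma, 2)  # what a 4:2:0-like scheme would keep

# At full resolution the subsampled plane clearly differs from the original.
full_res_error = np.abs(chroma - degraded).max()

# After a 4x downscale, both versions give the same small image (up to float rounding).
small_error = np.abs(box_downscale(chroma, 4) - box_downscale(degraded, 4)).max()

print("max error at full resolution:", full_res_error)
print("max error after 4x downscale:", small_error)
```

With aligned box filters the match is exact, because every 4x4 block is a union of the 2x2 blocks that were averaged. Real pipelines use other resampling filters, random crops, and lossy quantization, so some trace of the compression could still survive; that is why the answer depends on the details of the processing.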

To be clear, I'm not a neuroscientist and this is all just my understanding from the ML side!

colah3 · on April 6, 2021

That's an interesting hypothesis which hadn't been on my radar. (I'm one of the authors.)

liuliu · on April 6, 2021

It can be quickly validated / disproved by doing unsupervised learning on RAW images. I believe there are a few large RAW image datasets available nowadays.