I'm not the only one[1] within the research community to think this.
If you find any insights from this, I'd honestly be surprised first, and then interested to know what insights you gleaned from it.
Background: researcher who publishes papers in deep learning.
[1]: https://twitter.com/jackclarkSF/status/834461913262157824 (thread containing a member of OpenAI who specializes in communicating complex machine learning topics to the media and a primary developer of PyTorch / member of Facebook's AI Research lab)
- the images and false colors need to show some semblance of stability for a given network between epochs, and they need to be robust against changes in the data or input structure.
- requiring visual inspection doesn't give you something you can automate, unlike an evaluation score.
- if there is indeed a significant deviation in "MRI"-like scans between batches, its diagnostic utility ends there - it tells you nothing about what caused a change.
Without any explanation of the questions you raise, this page is 99% marketing speak, and to me, next to useless.
