More information is available from the web app.
It seems to me, though, that high-dimensional data, while not a black box, are also very hard to interpret. Could you give an illustration of the interpretation process you are using?
I am also curious about how you constructed your training dataset. Did you use existing pre-labelled reference data, or did you build it by hand?
A nice side effect is that, because there are only a few latent variables, you can manipulate each of them individually and observe the effects on the inputs. In the case of facial expressions it's quite graphic.
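To make the idea of moving one latent variable at a time concrete, here is a rough sketch. It is not the actual model discussed here: it assumes a simple linear (PCA-style) decoder fitted on synthetic data, and every name in it (`fit_linear_latent_model`, `traverse_latent`, and so on) is made up for illustration. With a nonlinear decoder the same sweep applies, only the effect is no longer a single fixed direction.

```python
import numpy as np

def fit_linear_latent_model(X, n_latent):
    """Fit a PCA-style linear model (a stand-in, not the thread's model)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions in input space.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_latent]

def encode(x, mean, components):
    """Project an input onto the latent variables."""
    return (x - mean) @ components.T

def decode(z, mean, components):
    """Map latent variables back to input space."""
    return z @ components + mean

def traverse_latent(z, index, values, mean, components):
    """Decode copies of z in which only latent `index` is changed."""
    outputs = []
    for v in values:
        z_mod = z.copy()
        z_mod[index] = v  # all other latents stay fixed
        outputs.append(decode(z_mod, mean, components))
    return np.stack(outputs)

# Synthetic data (assumption): 10 input features driven by 2 hidden factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = factors @ mixing + 0.01 * rng.normal(size=(200, 10))

mean, components = fit_linear_latent_model(X, n_latent=2)
z = encode(X[0], mean, components)

# Sweep latent 0 from -2 to 2 and watch how the decoded inputs move.
sweep = traverse_latent(z, 0, np.linspace(-2.0, 2.0, 5), mean, components)

# Change in each input feature per unit change of latent 0.
effect = (sweep[-1] - sweep[0]) / 4.0
print(np.round(effect, 3))
```

Each row of `sweep` is one decoded input. Plotting the rows side by side (or rendering them as images, for faces) shows what that single latent variable controls.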
The data set is custom-made.