
Information Geometry - dil8
http://math.ucr.edu/home/baez/information/
======
abarachant
Information geometry is a very interesting research topic. In essence, it
allows one to define a metric (and therefore a measure of distance) between
probability distributions.

As a result, it has many practical applications in machine learning and
has been used successfully for classification in neuroscience, radar signal
processing, and computer vision.

We can also note that information geometry can be seen as a sub-field of
Riemannian geometry, with some equivalences between metrics. For example, the
canonical metric for symmetric positive definite (SPD) matrices in
Riemannian geometry is actually equivalent to the metric for multivariate
normal distributions obtained with information geometry.
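As a minimal sketch of that equivalence (function name hypothetical): the affine-invariant Riemannian distance between SPD covariance matrices agrees, up to a constant factor, with the Fisher-Rao distance between the corresponding zero-mean Gaussians, and it can be computed from generalized eigenvalues.

```python
import numpy as np
from scipy.linalg import eigh

def airm_distance(A, B):
    """Affine-invariant Riemannian distance between SPD matrices A and B.

    d(A, B) = ||log(A^{-1/2} B A^{-1/2})||_F, computed from the
    generalized eigenvalues of the pencil (B, A).  For zero-mean
    Gaussians with covariances A and B this agrees, up to a constant
    factor, with the Fisher-Rao distance from information geometry.
    """
    lam = eigh(B, A, eigvals_only=True)  # eigenvalues of A^{-1} B
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))
```

One reason this metric works well for covariance-based EEG classification is its affine invariance: d(W A Wᵀ, W B Wᵀ) = d(A, B) for any invertible W, so the distance is unchanged by invertible linear mixing of the sensors.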

For some applications, IG is very efficient. It has been used for
classification of multivariate time series such as EEG signals, and was at the
center of the winning solutions of 3 Kaggle challenges:
[https://github.com/alexandrebarachant](https://github.com/alexandrebarachant)

~~~
tMcGrath
There's also an interesting set of applications in statistical physics: many
systems can be modelled by Markov chains whose rate parameters can change
through time. Changing these parameters non-quasistatically (i.e. so that the
system does not relax to the appropriate steady state) gives rise to distances
on parameter space that are related to the amount of energy dissipated along
the path through parameter space [0,1].

What's nice about this is that the derivation of a suitable metric allows us
to compute trajectories that minimise quantities we care about (e.g. minimise
energy dissipated), so this has clear potential to be useful. Some cool
examples are in spin systems [2] and a harmonic trap [3].

For any differential geometers reading this: it seems to me that a good
geometric way to think about this is as a fibre bundle, with the parameter
space as the base space and the simplex as the fibre, making it a vector
bundle (see [4] on the simplex being a vector space).

[0]
[https://arxiv.org/pdf/0706.0559v2.pdf](https://arxiv.org/pdf/0706.0559v2.pdf)
[1] [https://arxiv.org/pdf/1201.4166.pdf](https://arxiv.org/pdf/1201.4166.pdf)
[2]
[https://arxiv.org/pdf/1607.07425v1.pdf](https://arxiv.org/pdf/1607.07425v1.pdf)
[3]
[http://journals.aps.org/pre/abstract/10.1103/PhysRevE.86.041...](http://journals.aps.org/pre/abstract/10.1103/PhysRevE.86.041148)
[4]
[https://golem.ph.utexas.edu/category/2016/06/how_the_simplex...](https://golem.ph.utexas.edu/category/2016/06/how_the_simplex_is_a_vector_sp.html)

------
jwmerrill
Skilling wrote a nice critique of information geometry that's worth reading:

[http://djafari.free.fr/MaxEnt2014/papers/Tutorial4_paper.pdf](http://djafari.free.fr/MaxEnt2014/papers/Tutorial4_paper.pdf)

The main point is that KL divergence is not a metric, so imagining it as a
distance in a space may give you some wrong intuitions. Its matrix of 2nd
derivatives, the Fisher information, works as a local metric, but many
people then want to draw global pictures that try to extend this back to a
global metric, which doesn't actually work.
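To make that concrete, here is a small sketch (names hypothetical): KL divergence fails symmetry, so it cannot be a distance, yet its second-order expansion around a point recovers the Fisher information as a local metric.

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) between discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.9, 0.1])
q = np.array([0.5, 0.5])
# Asymmetric, hence not a metric: D(p||q) and D(q||p) differ.

# Locally, though, D(theta || theta + eps) ~ 0.5 * F(theta) * eps^2,
# where F(theta) = 1 / (theta * (1 - theta)) is the Fisher information
# of a Bernoulli(theta) distribution.
theta, eps = 0.3, 1e-4
local = kl([theta, 1 - theta], [theta + eps, 1 - theta - eps])
fisher_approx = 0.5 * eps ** 2 / (theta * (1 - theta))
```

The local quadratic agreement is exactly what makes the Fisher information a Riemannian metric; Skilling's point is that this is a statement about infinitesimal neighbourhoods, not about global distances.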

~~~
otoburb
As a complete layman reading that paper, Skilling's arguments seem quite
damning of IG. In particular, Amari's fundamental assumption seems to be
refuted:

" _For inference, the only acceptable value for the Rényi-Tsallis parameter is
α = 1, which is the correct information. That negates the generalisation to α
!= 1 which underlies Amari’s “α-divergences” in information geometry._ "

Have any IG proponents responded to or refuted Skilling's critiques? This is
interesting because Shun'ichi Amari is credited, amongst others, with
advancing the field in the '80s [1].

[1]
[https://en.m.wikipedia.org/wiki/Information_geometry](https://en.m.wikipedia.org/wiki/Information_geometry)

------
fithisux
Not an expert in the field, but I liked Chentsov's theorem (which
characterises when a metric comes from IG in the discrete case). I have not
found a similar theorem for the general case. Amari's book is hard to follow.
There is a serious lack of a pedagogical intro to the subject, something like
a starter: the main ideas + achievements of the theory + how to use it. There
is something very deep in the theory, but Amari just scratches the surface.

Can you imagine if GR metrics came from IG?

~~~
miles7
Have you heard about recent proposals that spacetime may emerge from quantum
information measures?
[https://www.quantamagazine.org/20150428-how-quantum-pairs-st...](https://www.quantamagazine.org/20150428-how-quantum-pairs-stitch-space-time/)

In the article, entanglement is a precise mathematical notion, something like
an information-based metric measuring the distance of a quantum state tensor
(wavefunction) from the manifold of rank-1 tensors.
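That "distance from the rank-1 manifold" picture can be sketched for a two-qubit pure state, assuming plain Euclidean (Frobenius) distance rather than any particular entanglement monotone (function name hypothetical):

```python
import numpy as np

def distance_to_product_states(psi):
    """Euclidean distance from a normalized bipartite pure state
    (reshaped into a matrix) to the set of product states, i.e. the
    manifold of rank-1 tensors.  By the Eckart-Young theorem the
    nearest rank-1 tensor is built from the top singular vector pair,
    so the distance is sqrt(1 - s_max^2)."""
    s = np.linalg.svd(np.asarray(psi, dtype=float), compute_uv=False)
    return float(np.sqrt(max(0.0, 1.0 - s[0] ** 2)))

# A product state |00> lies on the manifold (distance 0), while the
# Bell state (|00> + |11>)/sqrt(2) is maximally entangled, i.e. as far
# from the product manifold as a two-qubit state can be.
product = np.array([[1.0, 0.0], [0.0, 0.0]])
bell = np.eye(2) / np.sqrt(2)
```

The singular values here are exactly the Schmidt coefficients of the state, so "distance from rank 1" and "amount of entanglement" are measuring the same structure.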

~~~
fithisux
Nope, thank you miles7.

If you have more material, please share it here for everyone.

------
kevinalexbrown
Much of this is due to Amari, who got really into merging theoretical
neuroscience and IG, e.g.
[http://www.mitpressjournals.org/doi/abs/10.1162/089976602602...](http://www.mitpressjournals.org/doi/abs/10.1162/08997660260293238#.WF4QE7YrKAw)

------
enthdegree
Some interesting discussion here, particularly the refs in (6):
[http://mathoverflow.net/questions/215984/research-
situation-...](http://mathoverflow.net/questions/215984/research-situation-in-
the-field-of-information-geometry)

------
mitchtbaum
From out of left field, or thereabouts, see also pseudometric space and
"Geometry of Logic":
[http://finitegeometry.org/sc/16/logic.html](http://finitegeometry.org/sc/16/logic.html)
...

While there seems to be a potential well-balanced middle ground between these
complementing/contrasting philosophies and layers of viewpoints, I feel
partly well-suited to vent that, from what I've seen, statistical data
analysis seems to zealously pursue understanding by brute force, whereas
self-ordering "shapes" simply want to flow, which reveals their nature.

