information geometry



Information geometry aims to apply the techniques of differential geometry to statistics. Often it is useful to think of a family of probability distributions as a statistical manifold. For example, the normal (Gaussian) distributions form a 2-dimensional manifold, parameterised by $(\mu, \sigma)$, the mean and standard deviation. On such manifolds there are notions of Riemannian metric, connection, curvature, and so on, of statistical relevance.

More precisely,

Kullback-Leibler information, or relative entropy, features as a measure of divergence (not quite a metric, because it’s asymmetric), and Fisher information takes the role of curvature. One useful aspect of information geometry is that it gives a means to prove results about statistical models, simply by considering them as well-behaved geometrical objects. For instance, it’s basically a tautology to say that a manifold is not changing much in the vicinity of points of low curvature, and changing greatly near points of high curvature. Stated more precisely, and then translated back into probabilistic language, this becomes the Cramer-Rao inequality, that the variance of a parameter estimator is at least the reciprocal of the Fisher information. (Shalizi)
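Two of the claims in this passage can be checked numerically on the Gaussian family. The following is a minimal sketch, not taken from the cited sources: it uses the standard closed form for the relative entropy between two univariate Gaussians to show the asymmetry, and it simulates the sample mean of $n$ i.i.d. draws with known $\sigma$, an unbiased estimator of $\mu$ whose variance should sit at the Cramér-Rao bound $\sigma^2 / n$. The function names, parameter values, and seed are illustrative assumptions.

```python
import math
import random
import statistics


def kl_gaussian(mu1, sigma1, mu2, sigma2):
    """Closed-form relative entropy KL(N(mu1, sigma1^2) || N(mu2, sigma2^2))."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)


def sample_mean_variance(mu, sigma, n, trials, seed=0):
    """Empirical variance of the sample mean over repeated experiments."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    means = []
    for _ in range(trials):
        draws = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(draws))
    return statistics.pvariance(means)


# Asymmetry: swapping the two arguments changes the value.
forward = kl_gaussian(0.0, 1.0, 1.0, 2.0)
backward = kl_gaussian(1.0, 2.0, 0.0, 1.0)

# Cramer-Rao for the mean of N(mu, sigma^2) with sigma known:
# the Fisher information of n i.i.d. samples is n / sigma^2.
mu, sigma, n = 0.0, 2.0, 25
fisher_information = n / sigma ** 2
cramer_rao_bound = 1.0 / fisher_information
empirical_variance = sample_mean_variance(mu, sigma, n, trials=4000)

print(f"KL(p||q) = {forward:.4f}, KL(q||p) = {backward:.4f}")
print(f"Cramer-Rao bound = {cramer_rao_bound:.4f}, "
      f"empirical variance = {empirical_variance:.4f}")
```

The two relative entropies differ, which is why the divergence is not a metric. The empirical variance of the sample mean should land close to the bound, up to Monte Carlo noise, since the sample mean attains the bound in this model.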

The founders of the systematic theory are N. N. Chentsov and Shun-ichi Amari.


For $X$ a measurable space, let $S$ be (a subspace of) the space of probability measures on $X$, equipped with the structure of a smooth manifold.

The Fisher metric on $S$ is the Riemannian metric given on two vector fields $v, w \in T S$ by

$$ g(v,w)_s := E_s\big( v(\log s) \, w(\log s) \big) \,, $$

where $E_s(\cdots)$ denotes the expectation value under the measure $s \in S$ of the function $x \mapsto v(\log s)_x \, w(\log s)_x$ on $X$.
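As a concrete check of this definition, the following sketch evaluates the Fisher metric on the Gaussian family in the coordinates $(\mu, \sigma)$. It is an illustration under stated assumptions rather than a construction from the references: the coordinate vector fields $\partial_\mu$ and $\partial_\sigma$ act on $\log s$ by central finite differences, and the expectation is approximated by a midpoint rule on a truncated interval. The step size `h`, the grid resolution `steps`, and the truncation `width` are arbitrary choices. The result can be compared with the known closed form $g = \mathrm{diag}(1/\sigma^2, \, 2/\sigma^2)$ for this family.

```python
import math


def log_density(x, mu, sigma):
    """log of the N(mu, sigma^2) density at x."""
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - 0.5 * ((x - mu) / sigma) ** 2)


def score(x, mu, sigma, h=1e-5):
    """Derivatives of log s along d/dmu and d/dsigma (central differences)."""
    d_mu = (log_density(x, mu + h, sigma)
            - log_density(x, mu - h, sigma)) / (2 * h)
    d_sigma = (log_density(x, mu, sigma + h)
               - log_density(x, mu, sigma - h)) / (2 * h)
    return (d_mu, d_sigma)


def fisher_metric(mu, sigma, steps=20000, width=12.0):
    """Approximate g_ij = E_s[ d_i(log s) d_j(log s) ] by a midpoint rule."""
    lo = mu - width * sigma
    dx = 2 * width * sigma / steps
    g = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        weight = math.exp(log_density(x, mu, sigma)) * dx  # s(x) dx
        s = score(x, mu, sigma)
        for i in range(2):
            for j in range(2):
                g[i][j] += weight * s[i] * s[j]
    return g


g = fisher_metric(mu=1.0, sigma=2.0)
for row in g:
    print([round(entry, 4) for entry in row])
```

With $\sigma = 2$ the diagonal entries should come out near $1/4$ and $1/2$, with off-diagonal entries near zero, so in these coordinates the metric is diagonal and depends only on $\sigma$.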

For instance (Amari, Section 2.1).


See also Fisher metric, which treats the Fisher metric in other contexts as well as its quantum generalizations. See also quantum information.

Textbooks providing the big picture

For a series of articles, see

Lecture notes include

See also

A brief introduction with more references is

Several people have noted an equivalence between statistical inference as parametric model selection and statistical mechanics on a statistical manifold, e.g.,

The interpretation of a quantum field theory as a probability distribution on the space of field configurations so as to allow the conversion of techniques from information geometry to analogous measures of proximity between QFTs is in

A treatment of collective statistical inference as resulting in the partition function of a non-linear sigma model is in

More in the context of quantum field theory: