relative entropy



The notion of relative entropy of states is a generalization of the notion of entropy to a situation where the entropy of one state is measured “relative to” another state.

Relative entropy is also called Kullback-Leibler divergence, information divergence, or information gain.


For states on finite probability spaces

For two finite probability distributions $(p_i)$ and $(q_i)$, their relative entropy is

$$ S(p/q) := \sum_{k = 1}^n p_k (\log p_k - \log q_k) \,. $$
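
As a minimal sketch of the finite formula, the following Python function evaluates $S(p/q)$ with the natural logarithm. The function name `relative_entropy`, the convention $0 \log 0 = 0$, and the choice to return infinity when some $q_k = 0$ while $p_k \gt 0$ are assumptions made for illustration; they are not fixed by the text above.

```python
import math


def relative_entropy(p, q):
    """Return S(p/q) = sum_k p_k (log p_k - log q_k), using the natural log."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pk, qk in zip(p, q):
        if pk == 0.0:
            continue  # assumed convention: 0 * log 0 = 0
        if qk == 0.0:
            return math.inf  # assumed convention: p has mass where q has none
        total += pk * (math.log(pk) - math.log(qk))
    return total


p = [0.5, 0.5]
q = [0.25, 0.75]
s_pq = relative_entropy(p, q)  # entropy of p relative to q
s_qp = relative_entropy(q, p)  # arguments swapped
```

In this example `s_pq` and `s_qp` differ, which illustrates that relative entropy is not symmetric in its two arguments, while `relative_entropy(p, p)` vanishes.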

Alternatively, for $\rho$, $\phi$ two density matrices, their relative entropy is

$$ S(\rho/\phi) := \mathrm{tr}\, \rho (\log \rho - \log \phi) \,. $$
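
The density-matrix formula can be sketched numerically as follows, assuming NumPy is available and that both matrices are Hermitian and strictly positive (full rank), so that their logarithms exist. The helper names `log_hermitian` and `quantum_relative_entropy` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def log_hermitian(a):
    """Logarithm of a positive definite Hermitian matrix via its eigendecomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    # Scale each eigenvector column by log(eigenvalue), then rotate back.
    return (eigenvectors * np.log(eigenvalues)) @ eigenvectors.conj().T


def quantum_relative_entropy(rho, phi):
    """Return tr rho (log rho - log phi); assumes rho and phi are full rank."""
    difference = log_hermitian(rho) - log_hermitian(phi)
    return float(np.real(np.trace(rho @ difference)))


# Commuting (diagonal) case: agrees with the finite probability formula.
rho_diag = np.diag([0.5, 0.5])
phi_diag = np.diag([0.25, 0.75])
s_diag = quantum_relative_entropy(rho_diag, phi_diag)

# Non-commuting case: rho has eigenvalues 0.75 and 0.25.
rho = np.array([[0.5, 0.25], [0.25, 0.5]])
s_general = quantum_relative_entropy(rho, phi_diag)
```

For diagonal density matrices the result coincides with the relative entropy of the probability distributions on the diagonals, which is the sense in which the matrix formula generalizes the finite one.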

For states on classical probability spaces


For $X$ a measurable space and $P$ and $Q$ two probability measures on $X$, such that $Q$ is absolutely continuous with respect to $P$, their relative entropy is the integral

$$ S(Q|P) = \int_X \log \frac{d Q}{d P} \, d Q = \int_X \frac{d Q}{d P} \log \frac{d Q}{d P} \, d P \,, $$

where $d Q / d P$ is the Radon-Nikodym derivative of $Q$ with respect to $P$.
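
As a hedged numerical illustration, take $X = \mathbb{R}$ and let $Q$ and $P$ be Gaussian measures with densities $q$ and $p$ with respect to Lebesgue measure, so that $d Q / d P = q / p$ and the relative entropy is $\int \log(q/p) \, q \, d x$. The sketch below approximates this integral with a midpoint rule and compares it with the standard closed form for two Gaussians. The function names, the integration window and the step count are assumptions made for this example.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def relative_entropy_numeric(q_pdf, p_pdf, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of log(q/p) * q over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        qx, px = q_pdf(x), p_pdf(x)
        if qx > 0.0:  # points where q vanishes contribute nothing
            total += qx * math.log(qx / px) * h
    return total


def relative_entropy_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """Closed form for S(Q|P) with Q = N(mu_q, sigma_q^2), P = N(mu_p, sigma_p^2)."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)


numeric = relative_entropy_numeric(
    lambda x: gaussian_pdf(x, 0.0, 1.0),  # density of Q = N(0, 1)
    lambda x: gaussian_pdf(x, 1.0, 2.0),  # density of P = N(1, 4)
    lo=-12.0, hi=12.0,
)
closed_form = relative_entropy_gaussians(0.0, 1.0, 1.0, 2.0)
```

The two values agree up to the discretization and truncation error of the quadrature.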

For states on quantum probability spaces (von Neumann algebras)

Let $A$ be a von Neumann algebra and let $\phi$, $\psi : A \to \mathbb{C}$ be two states on it (faithful, positive linear functionals).


The relative entropy $S(\phi/\psi)$ of $\psi$ relative to $\phi$ is

$$ S(\phi/\psi) := - (\Psi, (\log \Delta_{\Phi,\Psi}) \Psi) \,, $$

where $\Delta_{\Phi,\Psi}$ is the relative modular operator of any cyclic and separating vector representatives $\Phi$ and $\Psi$ of $\phi$ and $\psi$.

This is due to (Araki).

  • This definition is independent of the choice of these representatives.

  • In the case that $A$ is finite dimensional and $\rho_\phi$ and $\rho_\psi$ are density matrices of $\phi$ and $\psi$, respectively, this reduces to the above definition.
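
The finite-dimensional reduction can be sketched as follows. This assumes $A = M_n(\mathbb{C})$ acting on itself by left multiplication, with the Hilbert-Schmidt inner product $(a, b) = \mathrm{tr}(a^* b)$ and vector representatives $\Phi = \rho_\phi^{1/2}$, $\Psi = \rho_\psi^{1/2}$. It also assumes the convention that the relative modular operator is $\Delta_{\Phi,\Psi} = S^* \overline{S}$ for the antilinear map $S \colon a \Psi \mapsto a^* \Phi$; conventions differ between authors, so the order of the arguments should be checked against the source in use.

```latex
% Relative modular operator: left multiplication by \rho_\phi
% composed with right multiplication by \rho_\psi^{-1}.
\Delta_{\Phi,\Psi}\, \xi = \rho_\phi \, \xi \, \rho_\psi^{-1}

% Left and right multiplication commute, so the logarithm splits:
(\log \Delta_{\Phi,\Psi})\, \xi = (\log \rho_\phi)\, \xi - \xi\, (\log \rho_\psi)

% Evaluate on \Psi = \rho_\psi^{1/2}, using cyclicity of the trace:
S(\phi/\psi)
  = - \mathrm{tr}\!\left( \rho_\psi^{1/2}
      \left[ (\log \rho_\phi)\, \rho_\psi^{1/2}
             - \rho_\psi^{1/2} \log \rho_\psi \right] \right)
  = \mathrm{tr}\, \rho_\psi \left( \log \rho_\psi - \log \rho_\phi \right)
```

Under these assumptions $S(\phi/\psi)$ equals the density-matrix expression with $\rho_\psi$ in the first slot, consistent with reading $S(\phi/\psi)$ as the entropy of $\psi$ relative to $\phi$.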

Relation to machine learning

The machine learning process has been characterized as a minimization of relative entropy (Ackley, Hinton and Sejnowski 1985).


Relative entropy of states on von Neumann algebras was introduced in:

A characterization of relative entropy on finite-dimensional $C^*$-algebras is given in

A survey of entropy in operator algebras is in

A characterization of machine learning as a process minimizing relative entropy is proposed in