Statistical Mechanics ↔ Information Theory

Boltzmann's entropy S = k_B ln W and Shannon's entropy H = −Σ p_i log p_i are formally identical up to units: thermodynamic entropy IS the Shannon entropy of the probability distribution over microstates consistent with the macrostate, multiplied by k_B.

ESTABLISHED
statistical-mechanics information-theory thermodynamics

🔭 Overview

Boltzmann's entropy S = k_B ln W (W = number of equally probable microstates) and Shannon's entropy H = −Σ p_i log p_i (over a probability distribution of messages) are the same mathematical object up to the Boltzmann constant k_B and a factor of ln 2 (bits vs. nats). The identity S = k_B ln 2 · H (with H in bits) holds exactly when the microstate distribution is uniform, which is precisely Boltzmann's assumption of equal a priori probabilities.

Jaynes (1957) showed the connection is not coincidental: the correct way to do statistical mechanics is to apply the principle of maximum entropy (MaxEnt), selecting the probability distribution over microstates that maximizes Shannon entropy subject to known macroscopic constraints (energy, volume, particle number). This derivation recovers all of equilibrium statistical mechanics from information theory alone, without invoking ergodicity or time-averaging.

The bridge implies that statistical mechanics IS Bayesian inference about physical systems: the macroscopic thermodynamic state is the least-biased inference given the available macroscopic information. The second law of thermodynamics becomes the statement that our information about a system's microstate degrades over time, or equivalently that the entropy of our probability distribution increases as correlations spread beyond our observational resolution.
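A minimal numerical sketch of both claims, assuming NumPy and SciPy are available (the energy levels, the target mean energy U, and helper names like `mean_energy` are illustrative choices, not from the sources above): part 1 checks the unit-conversion identity S = k_B ln 2 · H on a uniform distribution; part 2 recovers the Boltzmann distribution p_i ∝ exp(−β E_i) by directly maximizing Shannon entropy under a mean-energy constraint, as Jaynes's MaxEnt prescribes.

```python
import numpy as np
from scipy.optimize import brentq, minimize

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact since the 2019 SI)

# --- 1. S = k_B ln2 * H for a uniform distribution over W microstates ---
W = 1024
p_uniform = np.full(W, 1.0 / W)
H_bits = -np.sum(p_uniform * np.log2(p_uniform))  # Shannon entropy in bits
S = K_B * np.log(W)                               # Boltzmann entropy in J/K
assert np.isclose(S, K_B * np.log(2) * H_bits)

# --- 2. MaxEnt under a mean-energy constraint = Boltzmann distribution ---
E = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical energy levels (dimensionless units)
U = 1.2                             # imposed mean energy <E> (hypothetical)

# Analytic route: Lagrange multipliers give p_i = exp(-beta*E_i)/Z, with beta
# fixed by the constraint <E> = U. Solve for that beta numerically.
def mean_energy(beta):
    w = np.exp(-beta * E)
    return np.dot(E, w) / w.sum()

beta = brentq(lambda b: mean_energy(b) - U, -50.0, 50.0)
p_boltzmann = np.exp(-beta * E) / np.exp(-beta * E).sum()

# Direct route: numerically maximize H subject to normalization and <E> = U.
def neg_entropy(p):
    p = np.clip(p, 1e-12, None)     # avoid log(0) at the boundary
    return np.sum(p * np.log(p))

constraints = [
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},
    {"type": "eq", "fun": lambda p: np.dot(p, E) - U},
]
result = minimize(neg_entropy, np.full(E.size, 1.0 / E.size),
                  bounds=[(0.0, 1.0)] * E.size,
                  constraints=constraints, method="SLSQP")

# The two routes agree: maximum entropy + energy constraint = Boltzmann.
assert np.allclose(result.x, p_boltzmann, atol=1e-4)
```

That the two routes agree is exactly Jaynes's point: the Boltzmann distribution appears here not as a dynamical result but as the least-biased distribution compatible with the macroscopic constraint.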

⚙️ The Mathematical Bridge

This bridge connects Statistical Mechanics and Information Theory through shared mathematical structure. Status: Established connection.

↔️ Translation Table

| Statistical Mechanics | Information Theory | Note |
| --- | --- | --- |
| Boltzmann entropy S = k_B ln W | Shannon entropy H = −Σ p_i log p_i (uniform distribution) | S = k_B ln 2 · H; differ only by units (joules/kelvin vs bits) |
| number of microstates W consistent with macrostate | number of distinguishable messages of a given probability | Same combinatorial object: Boltzmann's W is Shannon's code-length exponent |
| thermodynamic equilibrium (maximum-entropy state) | maximum entropy distribution (MaxEnt principle) | Jaynes showed equilibrium statistical mechanics = MaxEnt inference |
| partition function Z = Σ exp(−E_i / k_B T) | moment generating function of the energy distribution | Z encodes all thermodynamic information as a Laplace transform |
| free energy F = −k_B T ln Z | log partition function (cumulant generating function) | Variational free energy equals a KL divergence (up to the log evidence) in variational Bayes |
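A short sketch of the generating-function rows, assuming NumPy and working in units where k_B T = 1/β (the four energy levels are hypothetical): derivatives of ln Z(β) reproduce the mean and variance of the energy, and F = −(1/β) ln Z satisfies F = ⟨E⟩ − TS.

```python
import numpy as np

E = np.array([0.0, 0.5, 1.3, 2.0])  # hypothetical energy levels (dimensionless units)
beta = 0.7                          # inverse temperature 1/(k_B T), matching units

def log_Z(b):
    """Log partition function: the cumulant generating function of E at -b."""
    return np.log(np.exp(-b * E).sum())

p = np.exp(-beta * E - log_Z(beta))  # Boltzmann distribution

# Cumulant identities, checked with central finite differences:
h = 1e-4
d1 = (log_Z(beta + h) - log_Z(beta - h)) / (2 * h)
d2 = (log_Z(beta + h) - 2 * log_Z(beta) + log_Z(beta - h)) / h**2

mean_E = np.dot(p, E)
var_E = np.dot(p, E**2) - mean_E**2
assert np.isclose(-d1, mean_E, atol=1e-6)  # <E>    = -d ln Z / d beta
assert np.isclose(d2, var_E, atol=1e-6)    # Var(E) =  d^2 ln Z / d beta^2

# Free energy F = -k_B T ln Z, here -(1/beta) ln Z, satisfies F = <E> - T S:
S_nats = -np.sum(p * np.log(p))            # entropy in nats (units of k_B)
F = -log_Z(beta) / beta
assert np.isclose(F, mean_E - S_nats / beta)
```

This is the precise sense in which Z "encodes all thermodynamic information": every moment of the energy, and hence every equilibrium response function, falls out of derivatives of ln Z.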

🗺️ Why Hasn't This Been Unified?

Physics and information theory developed the entropy concept independently: Boltzmann's combinatorial definition (1877) predates Shannon's (1948) by 71 years. Many physicists learn Boltzmann entropy as a physical fact rather than as an inference framework, and many information theorists do not realize that Shannon's H becomes a thermodynamic quantity, measured in joules per kelvin, when multiplied by k_B. Jaynes's unification remains underappreciated outside statistical physics.

🌱 Cross-Pollination Opportunities

Open Questions

📚 References
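
Boltzmann, L. (1877). "Über die Beziehung zwischen dem zweiten Hauptsatze der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung." Wiener Berichte 76, 373–435.

Shannon, C. E. (1948). "A Mathematical Theory of Communication." Bell System Technical Journal 27, 379–423, 623–656.

Jaynes, E. T. (1957). "Information Theory and Statistical Mechanics." Physical Review 106(4), 620–630.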