Statistical Mechanics ↔ Information Theory

Boltzmann's entropy S = k_B ln W and Shannon's entropy H = −Σ p_i log p_i are formally identical up to units: thermodynamic entropy IS the Shannon entropy of the probability distribution over microstates consistent with the macrostate, multiplied by k_B.

ESTABLISHED
statistical-mechanics information-theory thermodynamics

🔭 Overview

Boltzmann's entropy S = k_B ln W (W = number of equally probable microstates) and Shannon's entropy H = −Σ p_i log p_i (over a probability distribution of messages) are the same mathematical object up to the Boltzmann constant k_B and a factor of ln 2 (bits vs. nats). The identity S = k_B ln 2 · H (with H in bits) holds exactly when the microstate distribution is uniform, which is precisely Boltzmann's assumption of equal a priori probabilities.

Jaynes (1957) showed the connection is not coincidental: the correct way to do statistical mechanics is to apply the principle of maximum entropy (MaxEnt), selecting the probability distribution over microstates that maximizes Shannon entropy subject to known macroscopic constraints (energy, volume, particle number). This derivation recovers all of equilibrium statistical mechanics from information theory alone, without invoking ergodicity or time-averaging.

The bridge implies that statistical mechanics IS Bayesian inference about physical systems: the macroscopic thermodynamic state is the least-biased inference given the available macroscopic information. The second law of thermodynamics becomes the statement that our information about a system's microstate degrades over time, or equivalently that the entropy of our probability distribution increases as correlations spread beyond our observational resolution.
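A minimal numerical sketch of both claims, assuming NumPy and SciPy are available (the energy levels, the target mean energy U, and helper names like `mean_energy` are illustrative choices, not from the sources above): part 1 checks the unit-conversion identity S = k_B ln 2 · H on a uniform distribution; part 2 recovers the Boltzmann distribution p_i ∝ exp(−β E_i) by directly maximizing Shannon entropy under a mean-energy constraint, as Jaynes's MaxEnt prescribes.

```python
import numpy as np
from scipy.optimize import brentq, minimize

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact since the 2019 SI)

# --- 1. S = k_B ln2 * H for a uniform distribution over W microstates ---
W = 1024
p_uniform = np.full(W, 1.0 / W)
H_bits = -np.sum(p_uniform * np.log2(p_uniform))  # Shannon entropy in bits
S = K_B * np.log(W)                               # Boltzmann entropy in J/K
assert np.isclose(S, K_B * np.log(2) * H_bits)

# --- 2. MaxEnt under a mean-energy constraint = Boltzmann distribution ---
E = np.array([0.0, 1.0, 2.0, 3.0])  # hypothetical energy levels (dimensionless units)
U = 1.2                             # imposed mean energy <E> (hypothetical)

# Analytic route: Lagrange multipliers give p_i = exp(-beta*E_i)/Z, with beta
# fixed by the constraint <E> = U. Solve for that beta numerically.
def mean_energy(beta):
    w = np.exp(-beta * E)
    return np.dot(E, w) / w.sum()

beta = brentq(lambda b: mean_energy(b) - U, -50.0, 50.0)
p_boltzmann = np.exp(-beta * E) / np.exp(-beta * E).sum()

# Direct route: numerically maximize H subject to normalization and <E> = U.
def neg_entropy(p):
    p = np.clip(p, 1e-12, None)     # avoid log(0) at the boundary
    return np.sum(p * np.log(p))

constraints = [
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},
    {"type": "eq", "fun": lambda p: np.dot(p, E) - U},
]
result = minimize(neg_entropy, np.full(E.size, 1.0 / E.size),
                  bounds=[(0.0, 1.0)] * E.size,
                  constraints=constraints, method="SLSQP")

# The two routes agree: maximum entropy + energy constraint = Boltzmann.
assert np.allclose(result.x, p_boltzmann, atol=1e-4)
```

That the two routes agree is exactly Jaynes's point: the Boltzmann distribution appears here not as a dynamical result but as the least-biased distribution compatible with the macroscopic constraint.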

⚙️ The Mathematical Bridge

This bridge connects Statistical Mechanics and Information Theory through shared mathematical structure. Status: Established connection.

↔️ Translation Table

| Statistical Mechanics | Information Theory | Note |
| --- | --- | --- |
| Boltzmann entropy S = k_B ln W | Shannon entropy H = −Σ p_i log p_i (uniform distribution) | S = k_B ln 2 · H; differ only by units (joules/kelvin vs bits) |
| number of microstates W consistent with macrostate | number of distinguishable messages of a given probability | Same combinatorial object: Boltzmann's W is Shannon's code-length exponent |
| thermodynamic equilibrium (maximum-entropy state) | maximum entropy distribution (MaxEnt principle) | Jaynes showed equilibrium statistical mechanics = MaxEnt inference |
| partition function Z = Σ exp(−E_i / k_B T) | moment generating function of the energy distribution | Z encodes all thermodynamic information as a Laplace transform |
| free energy F = −k_B T ln Z | log partition function (cumulant generating function) | Variational free energy equals a KL divergence (up to the log evidence) in variational Bayes |
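A short sketch of the generating-function rows, assuming NumPy and working in units where k_B T = 1/β (the four energy levels are hypothetical): derivatives of ln Z(β) reproduce the mean and variance of the energy, and F = −(1/β) ln Z satisfies F = ⟨E⟩ − TS.

```python
import numpy as np

E = np.array([0.0, 0.5, 1.3, 2.0])  # hypothetical energy levels (dimensionless units)
beta = 0.7                          # inverse temperature 1/(k_B T), matching units

def log_Z(b):
    """Log partition function: the cumulant generating function of E at -b."""
    return np.log(np.exp(-b * E).sum())

p = np.exp(-beta * E - log_Z(beta))  # Boltzmann distribution

# Cumulant identities, checked with central finite differences:
h = 1e-4
d1 = (log_Z(beta + h) - log_Z(beta - h)) / (2 * h)
d2 = (log_Z(beta + h) - 2 * log_Z(beta) + log_Z(beta - h)) / h**2

mean_E = np.dot(p, E)
var_E = np.dot(p, E**2) - mean_E**2
assert np.isclose(-d1, mean_E, atol=1e-6)  # <E>    = -d ln Z / d beta
assert np.isclose(d2, var_E, atol=1e-6)    # Var(E) =  d^2 ln Z / d beta^2

# Free energy F = -k_B T ln Z, here -(1/beta) ln Z, satisfies F = <E> - T S:
S_nats = -np.sum(p * np.log(p))            # entropy in nats (units of k_B)
F = -log_Z(beta) / beta
assert np.isclose(F, mean_E - S_nats / beta)
```

This is the precise sense in which Z "encodes all thermodynamic information": every moment of the energy, and hence every equilibrium response function, falls out of derivatives of ln Z.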

🗺️ Why Hasn't This Been Unified?

Physics and information theory developed the entropy concept independently: Boltzmann's combinatorial definition (1877) predates Shannon's (1948) by 71 years. Many physicists learn Boltzmann entropy as a physical fact rather than as an inference framework, and many information theorists do not realize that Shannon's H becomes a thermodynamic quantity, measured in joules per kelvin, when multiplied by k_B. Jaynes's unification remains underappreciated outside statistical physics.

🌱 Cross-Pollination Opportunities

Open Questions

📚 References
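
Boltzmann, L. (1877). "Über die Beziehung zwischen dem zweiten Hauptsatze der mechanischen Wärmetheorie und der Wahrscheinlichkeitsrechnung." Wiener Berichte 76, 373–435.

Shannon, C. E. (1948). "A Mathematical Theory of Communication." Bell System Technical Journal 27, 379–423, 623–656.

Jaynes, E. T. (1957). "Information Theory and Statistical Mechanics." Physical Review 106(4), 620–630.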