THE MAXWELL-BOLTZMANN DISTRIBUTION
The first probability distribution in the history of physics.
Maxwell's symmetry argument
Aberdeen, 1859. James Clerk Maxwell is twenty-eight, a year from his work on the rings of Saturn and a decade from the equations of electromagnetism. He sets himself a question no one had posed quantitatively: in a gas at equilibrium, the molecules cannot all share one speed, because collisions constantly reshuffle them, so what fraction of them moves at each speed?
His argument is almost pure geometry. Resolve a molecule's velocity into components $(v_x, v_y, v_z)$. Space has no preferred direction, so, Maxwell argued, the distribution of $v_x$ cannot depend on $v_y$ or $v_z$: the three components are statistically independent. Maxwell showed that the only distribution that is both independent in each component and isotropic (depending only on the total speed) is the Gaussian. Each component is therefore normally distributed about zero, with a spread set by temperature. The speed is then the length of a three-dimensional Gaussian vector, and that length has a distribution all its own. It was the first time a probability distribution was derived from physical principles, the birth of statistical physics.
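The construction can be tried numerically. The sketch below is illustrative only: it assumes the standard per-component spread $\sqrt{k_B T / m}$, picks nitrogen at 300 K as an example gas, and uses a hypothetical helper name, `sample_speeds`. It draws three independent Gaussian components per molecule and takes the length of the resulting vector.

```python
import math
import random

K_B = 1.380649e-23                 # Boltzmann constant, J/K (exact SI value)
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def sample_speeds(mass_kg, temperature_k, n, seed=0):
    """Draw n molecular speeds from three independent Gaussian components."""
    rng = random.Random(seed)
    # Assumed spread of each velocity component: sqrt(k_B T / m).
    sigma = math.sqrt(K_B * temperature_k / mass_kg)
    speeds = []
    for _ in range(n):
        vx = rng.gauss(0.0, sigma)
        vy = rng.gauss(0.0, sigma)
        vz = rng.gauss(0.0, sigma)
        # Speed is the length of the velocity vector.
        speeds.append(math.sqrt(vx * vx + vy * vy + vz * vz))
    return speeds


# Illustrative choice, not taken from the text: nitrogen (N2) at 300 K.
M_N2 = 28.0134 * ATOMIC_MASS_UNIT
speeds = sample_speeds(M_N2, 300.0, 100_000)

mean_speed = sum(speeds) / len(speeds)
rms_speed = math.sqrt(sum(v * v for v in speeds) / len(speeds))
print(f"mean speed ~ {mean_speed:.0f} m/s, rms speed ~ {rms_speed:.0f} m/s")
```

A histogram of `speeds` should trace out the skewed curve described in the following section, even though each individual component is a symmetric bell curve centred on zero.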
The formula and its three speeds
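Combining three Gaussian components and accounting for all the directions a given speed can point gives the Maxwell–Boltzmann distribution:
$$f(v) = 4\pi \left(\frac{m}{2\pi k_B T}\right)^{3/2} v^2 \exp\!\left(-\frac{m v^2}{2 k_B T}\right)$$
Here $m$ is the molecular mass, $T$ the absolute temperature and $k_B$ Boltzmann's constant.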
In words: the probability of finding a molecule near speed $v$ rises as $v^2$ at low speed, then is cut down by the exponential at high speed; the peak sits where the two effects balance. The curve has a long high-speed tail and a hard floor at $v = 0$.
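As a quick check on that description, here is a minimal sketch that evaluates the density and confirms two of its properties numerically. It assumes the standard form $f(v) = 4\pi\,(m / 2\pi k_B T)^{3/2}\, v^2 e^{-m v^2 / 2 k_B T}$; the gas (nitrogen at 300 K), the integration range and the function names are illustrative choices, not part of the original text.

```python
import math

K_B = 1.380649e-23                    # Boltzmann constant, J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def maxwell_boltzmann_pdf(v, mass_kg, temperature_k):
    """Probability density (per m/s) of molecular speed v."""
    a = mass_kg / (2.0 * K_B * temperature_k)
    # (a / pi)^(3/2) equals (m / (2 pi k_B T))^(3/2).
    return 4.0 * math.pi * (a / math.pi) ** 1.5 * v * v * math.exp(-a * v * v)


def integrate(func, lo, hi, steps=20_000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h


M_N2 = 28.0134 * ATOMIC_MASS_UNIT  # illustrative gas: nitrogen
T = 300.0                          # illustrative temperature, K

# The density should integrate to 1; 5000 m/s is far out in the tail.
total_probability = integrate(
    lambda v: maxwell_boltzmann_pdf(v, M_N2, T), 0.0, 5000.0
)

# Locate the peak on a 1 m/s grid.
grid = [float(v) for v in range(0, 2001)]
peak_speed = max(grid, key=lambda v: maxwell_boltzmann_pdf(v, M_N2, T))

print(f"integral = {total_probability:.6f}, peak near {peak_speed:.0f} m/s")
```

The density is exactly zero at $v = 0$ and the grid search finds a single interior maximum, matching the "rise, then cut-off" picture.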
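Three characteristic speeds drop out, and they always appear in the same order. The most probable speed sits at the peak; the mean is a little higher; the root-mean-square speed, which sets the mean kinetic energy, is higher still:
$$v_p = \sqrt{\frac{2 k_B T}{m}}, \qquad \langle v \rangle = \sqrt{\frac{8 k_B T}{\pi m}}, \qquad v_{\mathrm{rms}} = \sqrt{\frac{3 k_B T}{m}}$$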
In words: the most probable, mean, and rms speeds stand in the fixed ratio $\sqrt{2} : \sqrt{8/\pi} : \sqrt{3} \approx 1 : 1.128 : 1.225$, no matter the gas or the temperature. Heat the gas and all three slide right together; pick a heavier molecule and they all shrink.
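The fixed ratio is easy to confirm for a few cases. The sketch below assumes the standard closed forms $v_p = \sqrt{2 k_B T / m}$, $\langle v \rangle = \sqrt{8 k_B T / \pi m}$ and $v_{\mathrm{rms}} = \sqrt{3 k_B T / m}$; the gases, temperatures and the helper name `characteristic_speeds` are illustrative assumptions.

```python
import math

K_B = 1.380649e-23                    # Boltzmann constant, J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def characteristic_speeds(mass_kg, temperature_k):
    """Return (most probable, mean, rms) speed in m/s."""
    kt_over_m = K_B * temperature_k / mass_kg
    v_p = math.sqrt(2.0 * kt_over_m)
    v_mean = math.sqrt(8.0 * kt_over_m / math.pi)
    v_rms = math.sqrt(3.0 * kt_over_m)
    return v_p, v_mean, v_rms


# Illustrative gases (standard atomic weights, in atomic mass units).
GASES_U = {"He": 4.002602, "N2": 28.0134, "CO2": 44.0095}
M_N2 = GASES_U["N2"] * ATOMIC_MASS_UNIT

for name, mass_u in GASES_U.items():
    for temperature_k in (200.0, 300.0, 600.0):
        v_p, v_mean, v_rms = characteristic_speeds(
            mass_u * ATOMIC_MASS_UNIT, temperature_k
        )
        print(
            f"{name:>3} at {temperature_k:5.0f} K: "
            f"v_p = {v_p:7.1f} m/s, "
            f"ratios 1 : {v_mean / v_p:.3f} : {v_rms / v_p:.3f}"
        )
```

The absolute speeds change from row to row, rising with temperature and falling with mass, while the two printed ratios stay the same in every row.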
Why the v-squared factor
Why does $f(v)$ vanish at $v = 0$ when the exponential there is largest? Because speed is a magnitude, not a coordinate. There is exactly one way to have zero speed (stand still) but an enormous number of ways to have a large speed, since the velocity can point anywhere on a sphere of radius $v$. The surface area of that sphere grows as $v^2$, and that is the $v^2$ in the formula: the phase-space volume of states with a given speed.
So $f(v)$ is a tug-of-war. The geometric factor $v^2$ pulls the curve up at higher speeds; the Boltzmann exponential, the cost of kinetic energy, pulls it down. Their product peaks at the most probable speed and decays into the tail beyond it.
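The balance point can be worked out in two lines. This is a sketch under the assumption that the speed-dependent part of the distribution is $v^2 e^{-m v^2 / 2 k_B T}$, with $m$ the molecular mass, $T$ the temperature and $k_B$ Boltzmann's constant; constant prefactors do not move the peak, so they are dropped.

```latex
\frac{d}{dv}\left[ v^2 \, e^{-m v^2 / 2 k_B T} \right]
  = \left( 2v - \frac{m v^3}{k_B T} \right) e^{-m v^2 / 2 k_B T} = 0
\quad\Longrightarrow\quad
v_p^2 = \frac{2 k_B T}{m},
\qquad
v_p = \sqrt{\frac{2 k_B T}{m}} .
```

The first term in the bracket comes from the growing geometric factor and the second from the falling exponential; the peak is where they cancel. The root $v = 0$ is the floor of the curve, not a maximum.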
The tail does the work
Most of the physics that depends on this distribution depends on its tail: the rare, fast molecules far to the right of the peak. Whether a molecule can escape a planet's gravity, evaporate from a liquid surface, or carry enough energy to react chemically is decided not by the average molecule but by the fraction above some threshold.
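To see how sensitive the tail is, the fraction of molecules above a threshold speed can be computed in closed form. The sketch below assumes the standard result for the Maxwell–Boltzmann speed distribution, $P(v > v_0) = \operatorname{erfc}(x) + \tfrac{2}{\sqrt{\pi}}\, x\, e^{-x^2}$ with $x = v_0 / v_p$ and $v_p = \sqrt{2 k_B T / m}$. The gas, the 1500 m/s threshold and the function name are arbitrary illustrative choices.

```python
import math

K_B = 1.380649e-23                    # Boltzmann constant, J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def fraction_faster_than(v0, mass_kg, temperature_k):
    """Fraction of molecules with speed above v0 (m/s)."""
    v_p = math.sqrt(2.0 * K_B * temperature_k / mass_kg)
    x = v0 / v_p
    # Survival function of the speed distribution in units of v_p.
    return math.erfc(x) + (2.0 / math.sqrt(math.pi)) * x * math.exp(-x * x)


M_N2 = 28.0134 * ATOMIC_MASS_UNIT  # illustrative gas: nitrogen
threshold = 1500.0                 # m/s, arbitrary threshold well into the tail

cool = fraction_faster_than(threshold, M_N2, 300.0)
warm = fraction_faster_than(threshold, M_N2, 330.0)

print(f"fraction above {threshold:.0f} m/s at 300 K: {cool:.2e}")
print(f"fraction above {threshold:.0f} m/s at 330 K: {warm:.2e}")
print(f"a 10% rise in temperature multiplies the tail by {warm / cool:.1f}")
```

The bulk of the curve barely shifts between the two temperatures, yet the population beyond the threshold changes by a large factor. That disproportion is what the rest of this section builds on.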
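That fraction is governed by the Boltzmann exponential, and when the threshold is an activation energy it gives the Arrhenius equation, which chemists established empirically in the 1880s:
$$k = A \exp\!\left(-\frac{E_a}{k_B T}\right)$$
Here $k$ is the rate constant, $A$ a prefactor that depends only weakly on temperature, and $E_a$ the activation energy per molecule.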
In words: a reaction's rate is set by the small fraction of collisions energetic enough to clear the barrier $E_a$, and that fraction grows explosively with temperature. This is why a ten-degree rise can roughly double a reaction rate, and why food keeps longer in a refrigerator: you are not stopping the fast molecules, only making them rarer.
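The "ten degrees doubles the rate" rule of thumb can be checked directly. The sketch below assumes the Arrhenius form with a temperature-independent prefactor, written in molar units (activation energy in J/mol and the gas constant $R = N_A k_B$), which is equivalent to the per-molecule form. The 50 kJ/mol barrier is a hypothetical but typical value, not one taken from the text.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J/(mol K)


def arrhenius_ratio(ea_j_per_mol, t1_k, t2_k):
    """Ratio of rate constants k(T2) / k(T1).

    Assumes a temperature-independent prefactor, so it cancels.
    """
    return math.exp(-(ea_j_per_mol / R_GAS) * (1.0 / t2_k - 1.0 / t1_k))


EA = 50_000.0  # J/mol, hypothetical activation energy

# Room temperature to ten degrees warmer.
ten_degree_ratio = arrhenius_ratio(EA, 298.15, 308.15)
# Room temperature down to a refrigerator at about 4 degrees Celsius.
fridge_ratio = arrhenius_ratio(EA, 298.15, 277.15)

print(f"rate change for +10 K: x{ten_degree_ratio:.2f}")
print(f"rate change, room to fridge: x{fridge_ratio:.2f}")
```

With a different barrier the factor changes: higher activation energies make the rate more temperature-sensitive, lower ones less so, which is why the doubling rule is only a rough guide.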
Boltzmann's generalisation
In 1871 Ludwig Boltzmann lifted Maxwell's result out of the special case of molecular speeds and made it universal. For any system in thermal contact with a reservoir at temperature $T$, the probability of finding it in a microstate of energy $E$ is proportional to the Boltzmann factor:
$$P(E) \propto \exp\!\left(-\frac{E}{k_B T}\right)$$
In words: high-energy states are exponentially less likely than low-energy ones, and the temperature sets how steep that penalty is. It does not matter whether the energy is the kinetic energy of a gas molecule, the height of a colloidal particle in gravity, or the alignment of a magnetic spin — the same exponential governs them all. This single factor is the engine of statistical mechanics; the partition function and the free energies of the next module are built entirely on top of it.
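A small numerical illustration, assuming the Boltzmann factor $e^{-E / k_B T}$ and a made-up system with three evenly spaced energy levels. The level spacing, the temperatures and the function name `boltzmann_weights` are hypothetical. The weights are normalised by dividing by their sum so that they can be read as probabilities.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def boltzmann_weights(energies_j, temperature_k):
    """Normalised probabilities proportional to exp(-E / (k_B T))."""
    # Subtracting the lowest energy leaves the probabilities unchanged
    # and keeps the exponentials well-behaved numerically.
    e_min = min(energies_j)
    weights = [
        math.exp(-(e - e_min) / (K_B * temperature_k)) for e in energies_j
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical levels spaced by k_B * (300 K).
spacing = K_B * 300.0
levels = [0.0, spacing, 2.0 * spacing]

probs = boltzmann_weights(levels, 300.0)  # spacing equals k_B T
hot = boltzmann_weights(levels, 600.0)    # spacing equals k_B T / 2

print("T = 300 K:", [f"{p:.3f}" for p in probs])
print("T = 600 K:", [f"{p:.3f}" for p in hot])
```

Doubling the temperature halves the exponent for every level, so the upper states gain population at the expense of the ground state. Nothing in the function refers to what kind of energy is involved.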
Stern weighs the molecules
A distribution is only physics if it can be measured. In the 1920s Otto Stern, master of the molecular beam, did exactly that. Molecules streamed from a hot oven through a slit and toward a rapidly spinning disc with a narrow slot. A molecule slipped through only if its flight time matched the rotation; faster molecules arrived sooner and were deposited at a different angle than slower ones. The film behind the chopper recorded a smear whose density, read off against position, was the speed distribution itself.
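The sorting principle is simple kinematics: while a molecule of speed $v$ covers a flight path of length $L$, a disc spinning at angular velocity $\omega$ turns through the angle $\omega L / v$. The sketch below illustrates only that relation. The flight length and rotation rate are hypothetical round numbers, not the dimensions of Stern's apparatus, and the function name is made up for the example.

```python
import math


def rotation_angle(speed_m_s, flight_length_m, revolutions_per_s):
    """Angle (radians) the disc turns while a molecule is in flight."""
    flight_time = flight_length_m / speed_m_s
    angular_velocity = 2.0 * math.pi * revolutions_per_s
    return angular_velocity * flight_time


FLIGHT_LENGTH = 0.2  # m, hypothetical
ROTATION_RATE = 100.0  # revolutions per second, hypothetical

for speed in (200.0, 400.0, 800.0):
    angle_deg = math.degrees(rotation_angle(speed, FLIGHT_LENGTH, ROTATION_RATE))
    print(f"{speed:5.0f} m/s -> disc turns {angle_deg:5.1f} degrees in flight")
```

Because the angle scales as $1/v$, each position on the detector corresponds to one speed, and the deposit density along it maps out the distribution.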
The measured profile matched Maxwell's — derived sixty years earlier from nothing but symmetry — and it has been confirmed countless times since. Stern's molecular-beam methods earned him the 1943 Nobel Prize.
From the crowd to the single particle
Maxwell's distribution describes the crowd — the statistical spread of an unimaginable number of molecules we never see individually. The natural next question is whether that invisible crowd leaves any visible trace. It does. A single particle large enough to watch under a microscope, suspended in a fluid, is ceaselessly battered by the molecules around it, and its resulting jitter — Brownian motion — turns the statistics of the unseen into something you can measure with a ruler and a stopwatch.
That same battering is what we treated, in bulk, as pressure on the container walls: the distribution of speeds we have just drawn is the distribution of those impacts. Maxwell gave us the shape of the crowd; Einstein and Perrin, a generation later, used a single jittering grain to count its members.