Kinetic Theory: Pressure and Temperature

In short

Kinetic theory explains the macroscopic behaviour of a gas — its pressure, temperature, and specific heat — as the statistical result of an enormous number of tiny molecules bouncing elastically off the container walls. The central derivation takes a cubical box of side L containing N molecules of mass m each, follows one molecule bouncing along the x-axis, computes the impulse it delivers to the wall per bounce and the frequency of bounces, adds up the contributions of all N molecules, and arrives at

\boxed{\;P = \frac{1}{3}\,\rho\,\langle v^2 \rangle = \frac{N m \langle v^2\rangle}{3V}\;}

where \rho = Nm/V is the mass density and \langle v^2 \rangle is the mean-square speed of the molecules. Comparing with the empirical ideal gas law PV = N k_B T produces the kinetic interpretation of temperature:

\boxed{\;\tfrac{1}{2} m \langle v^2 \rangle = \tfrac{3}{2} k_B T.\;}

Temperature is (up to a factor of \frac{2}{3 k_B}) the average translational kinetic energy of a single molecule. From this follow the root-mean-square speed

v_{\text{rms}} = \sqrt{\frac{3 k_B T}{m}} = \sqrt{\frac{3RT}{M}}

and its cousins the mean speed \langle v \rangle = \sqrt{8 k_B T/(\pi m)} and the most-probable speed v_p = \sqrt{2 k_B T/m}, all lying in the ratio v_p : \langle v \rangle : v_{\text{rms}} = 1 : 1.128 : 1.225.

For nitrogen (the dominant component of air) at 300 K, v_{\text{rms}} \approx 517 m/s, about the speed of a bullet. The room you are sitting in is filled with molecules hurtling at supersonic speeds, slamming into the walls roughly 10^{27} times per square metre per second; the steady push you never feel is atmospheric pressure.

Open an LPG cylinder at a kitchen in Surat. It contains 14.2 kg of liquefied petroleum gas, mostly butane, stored at about 2 bar (twice atmospheric pressure) at room temperature. The steel wall of the cylinder is a few millimetres thick. What is it holding back? Nothing you can see — no flames, no piston, no compressed spring. Just a cloud of invisible molecules, each about 10^{-9} metres across, each drifting in empty space. And yet those invisible molecules collectively push hard enough on the inside of the cylinder that the steel has to be heavy and carefully engineered, or it would burst.

Here is the question the nineteenth century put to physics: where does that push come from? Pressure is a macroscopic thing — force per unit area, a number you read off a gauge. But a gas has no muscle. Its molecules don't know about each other's existence most of the time. How do small, free particles, each doing its own thing, add up to a steady, measurable shove on a steel wall?

The answer, worked out by Bernoulli and Maxwell and Clausius and Boltzmann, is that pressure is the time-averaged drumming of countless tiny collisions. Every molecule that strikes a wall bounces off it and imparts a small impulse, a tiny Newton's-third-law kick. There are so many molecules (\sim 10^{19} per cubic centimetre at STP) that the individual kicks average, within any measurable time, into a perfectly steady force per unit area. And at a given density, the rate of those collisions, along with the momentum they carry, depends only on how fast the molecules are moving, which is to say, on the temperature. That is kinetic theory's central claim: pressure is microscopic momentum transfer; temperature is microscopic kinetic energy. Both of these were, before 1850, purely macroscopic quantities measured by gauges and thermometers. Kinetic theory reveals their molecular machinery.

This article does three things. First, it lays out the assumptions of the kinetic model — what simplifications you make about the molecules to get a tractable calculation. Second, it derives the pressure formula P = \frac{1}{3}\rho\langle v^2 \rangle from those assumptions, following a single molecule around a box and adding up its bouncing. Third, it compares the derived pressure with the empirical ideal gas law to extract the kinetic interpretation of temperature and the root-mean-square speed.

The kinetic model: what you assume

Every derivation in physics is a derivation from some set of assumptions. Writing them down in plain language first — before any algebra — is the honest way to start.

1. A gas consists of a very large number N of identical molecules, each of mass m. In one mole (6.022 \times 10^{23} molecules) you have a number so large that statistical averages are effectively exact. For a cubic-centimetre sample at STP, N \approx 2.7 \times 10^{19}; for a roomful of air, N \approx 10^{27}. The "very large" assumption is what lets you talk about averages without worrying about fluctuations.
2. The molecules are point particles with negligible size compared to the spacing between them. In air at STP, the average molecule-to-neighbour distance is about 3 nm, while a molecule's diameter is about 0.3 nm, a factor of ten. The molecules occupy about 0.1% of the total volume, so for most purposes they can be treated as points.
3. Molecules exert no forces on each other except during brief collisions. Between collisions they move in straight lines at constant speed (Newton's first law in action). This is the "ideal gas" simplification: it neglects intermolecular attractions and repulsions. It fails at very high pressures or very low temperatures, where van der Waals forces and finite molecular size matter. For ordinary atmospheric conditions, it is an excellent approximation.
4. Collisions (both between molecules and between a molecule and the walls) are perfectly elastic. Kinetic energy is conserved in every collision. A molecule that hits a wall head-on at 500 m/s bounces off at 500 m/s in the opposite direction.
5. Between collisions, the molecules obey Newton's laws of motion. No quantum mechanics, no relativity: ordinary mechanics applies to each molecule as if it were a small, hard ball.
6. The motion of the molecules is completely random. At any instant, equal numbers of molecules are moving in every direction. The average velocity \langle \vec{v} \rangle is zero; the gas as a whole is not drifting. But the mean-square speed \langle v^2 \rangle is large and nonzero.

These assumptions are the ideal-gas model. Every equation in kinetic theory is a consequence of applying Newton's laws and statistical averaging to a system obeying (1)–(6).
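The numbers quoted for the point-particle assumption can be checked with a few lines of arithmetic. The sketch below is only an estimate: it assumes STP means 1 atm and 0 °C, uses the ideal gas law in the form P = n k_B T, and treats each molecule as a small cube of side 0.3 nm when estimating the occupied volume fraction.

```python
K_B = 1.381e-23      # Boltzmann constant, J/K (value used in the text)
P_STP = 101325.0     # assumed STP pressure: 1 atm in Pa
T_STP = 273.15       # assumed STP temperature: 0 degrees C in K
DIAMETER = 0.3e-9    # molecular diameter quoted in the text, m


def number_density(pressure, temperature):
    """Molecules per cubic metre from P = n k_B T."""
    return pressure / (K_B * temperature)


n = number_density(P_STP, T_STP)
per_cm3 = n * 1.0e-6                   # molecules per cubic centimetre
spacing = n ** (-1.0 / 3.0)            # typical neighbour distance, m
# Crude estimate: each molecule occupies a cube of side DIAMETER.
volume_fraction = n * DIAMETER ** 3

print(f"molecules per cm^3: {per_cm3:.2e}")
print(f"typical spacing:    {spacing * 1e9:.1f} nm")
print(f"volume fraction:    {volume_fraction * 100:.2f} %")
```

The spacing comes out roughly ten molecular diameters, which is why treating the molecules as points is reasonable at ordinary densities.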

The kinetic model: $N$ point-like molecules in a cube of side $L$, moving in random directions at random speeds, colliding elastically with the walls and with each other. Between collisions they fly in straight lines. There are no forces between them except during brief collisions.

A live molecular picture

Six molecules bouncing inside a square container. Each flies in a straight line and reflects when it reaches a wall (the trick `abs((x % 2L) - L)` produces the sawtooth of specular reflection). In a real gas, $N \sim 10^{23}$ and the picture becomes so dense that what you notice is only the time-averaged push, the pressure, on the walls.
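The folding trick quoted above can be sketched in Python. This is an illustration, not the animation's actual code: the helper names `fold` and `position` are hypothetical, and the sketch uses the mirrored form `L - abs((x % 2L) - L)` so that the folded coordinate equals the free coordinate while the particle is still inside the box.

```python
def fold(x_free, side):
    """Map an unbounded coordinate onto [0, side] by mirror reflections.

    x_free is where the particle would be with no walls; the result is
    where it actually is after bouncing elastically between 0 and side.
    """
    return side - abs((x_free % (2.0 * side)) - side)


def position(x0, vx, t, side):
    """Coordinate at time t of a particle bouncing between 0 and side."""
    return fold(x0 + vx * t, side)


# A particle starting at 0.5 with unit speed in a unit box returns to its
# starting point after one round trip of length 2 * side.
print(position(0.5, 1.0, 2.0, 1.0))
```

Because the motion along each axis is independent, applying `position` separately to x and y gives the two-dimensional bouncing path without ever testing for wall collisions.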

Deriving the pressure formula

Here is the core calculation of kinetic theory — one of the cleanest in all of physics. You take the six assumptions above and, by tracking a single molecule's bouncing and summing, obtain P = \frac{1}{3}\rho\langle v^2 \rangle in about fifteen lines. Every step below is justified.

Setup. Consider a cubical container of side L (volume V = L^3) containing N molecules, each of mass m. Label the walls: the right wall has area A = L^2 and sits perpendicular to the x-axis. Pick one molecule with velocity components (v_x, v_y, v_z) and follow its bouncing.

Step 1. The molecule bounces off the right wall. What is the impulse delivered to the wall?

Before the bounce, the molecule's x-velocity is +v_x (moving right, toward the wall). After an elastic bounce, v_x reverses: now the x-velocity is -v_x. The y- and z-components are unchanged (the wall is perpendicular to x).

Change in the molecule's momentum (in the x-direction):

\Delta p_{\text{molecule}} = m(-v_x) - m(+v_x) = -2 m v_x.

By Newton's third law, the impulse delivered to the wall is +2 m v_x (opposite sign, same magnitude).

Why: when a ball bounces off a wall, the wall exerts a force pushing the ball away (flipping its velocity); the ball exerts an equal and opposite force pushing the wall outward. Integrating that force over the duration of contact gives the impulse 2 m v_x transferred to the wall.

Step 2. How often does this particular molecule hit the right wall?

After bouncing off the right wall at x = L, the molecule travels to the left, eventually bouncing off the left wall at x = 0, then returns. The round trip is 2L. Since the molecule travels with x-velocity of magnitude v_x (between collisions with other molecules, which average out and don't change the statistics), the time for one round trip is

\Delta t = \frac{2L}{v_x}.

So the frequency of collisions with the right wall is 1 / \Delta t = v_x / (2L).

Why: in the time 2L/v_x for a round trip, the molecule strikes the right wall exactly once. The assumption that other molecules don't change the statistics is an important one — in reality collisions between molecules redistribute velocities, but on average each molecule bounces back and forth at the same rate it did without collisions, because the total velocity distribution is preserved.

Step 3. Average force exerted on the wall by this one molecule.

Force = rate of change of momentum = impulse per bounce × bounces per second:

F_1 = 2 m v_x \cdot \frac{v_x}{2L} = \frac{m v_x^2}{L}.

Why: force is impulse per unit time. Each bounce contributes 2mv_x of impulse, and bounces happen v_x/(2L) times per second. Multiplying gives force. Note the cancellation: the 2 in the impulse cancels with the 2 in the round-trip time, leaving mv_x^2/L.
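Steps 1 to 3 can be checked by brute-force counting. The sketch below follows one molecule for a fixed time, counts how often it hits the right wall, and compares the accumulated impulse per unit time with m v_x^2 / L. The molecular mass, speed, box size, and duration are illustrative assumptions, and the function names are hypothetical.

```python
def average_force_by_counting(mass, vx, side, duration):
    """Average force on the right wall from one molecule, by counting bounces.

    The molecule starts at the left wall (x = 0) moving right, so it first
    hits the right wall at t = side / vx and then once every 2 * side / vx.
    """
    first_hit = side / vx
    period = 2.0 * side / vx
    if duration < first_hit:
        hits = 0
    else:
        hits = 1 + int((duration - first_hit) // period)
    impulse_per_hit = 2.0 * mass * vx      # Step 1
    return hits * impulse_per_hit / duration


def average_force_formula(mass, vx, side):
    """Closed-form result of Step 3."""
    return mass * vx ** 2 / side


M_N2 = 4.65e-26   # approximate mass of one N2 molecule, kg (assumed)
force_counted = average_force_by_counting(M_N2, 300.0, 0.1, 10.0)
force_formula = average_force_formula(M_N2, 300.0, 0.1)
print(force_counted, force_formula)
```

Over many round trips the two numbers agree; over a time shorter than one round trip the counted force is zero or lumpy, which is the single-molecule version of why a large N is needed for a steady pressure.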

Step 4. Add contributions from all N molecules. Each molecule has its own x-velocity; call them v_{x,1}, v_{x,2}, \ldots, v_{x,N}. The total force on the right wall is

F = \sum_{i=1}^N \frac{m v_{x,i}^2}{L} = \frac{m}{L} \sum_{i=1}^N v_{x,i}^2 = \frac{N m}{L} \cdot \langle v_x^2 \rangle,

where \langle v_x^2 \rangle = \frac{1}{N}\sum_i v_{x,i}^2 is the mean of the squared x-velocities.

Why: the total force is the sum of individual molecular contributions. Pulling out the common factors m/L leaves the sum \sum v_{x,i}^2, which by definition is N times the average \langle v_x^2 \rangle.

Step 5. Pressure is force per unit area. Area of the wall is A = L^2, so

P = \frac{F}{A} = \frac{Nm \langle v_x^2 \rangle}{L \cdot L^2} = \frac{Nm \langle v_x^2 \rangle}{L^3} = \frac{Nm \langle v_x^2 \rangle}{V}.

Step 6. Replace \langle v_x^2 \rangle with a quantity involving the full speed v^2 = v_x^2 + v_y^2 + v_z^2.

By symmetry — no direction is special in a gas at rest — the mean-square velocity components are equal:

\langle v_x^2 \rangle = \langle v_y^2 \rangle = \langle v_z^2 \rangle.

Taking the average of v^2 = v_x^2 + v_y^2 + v_z^2:

\langle v^2 \rangle = \langle v_x^2 \rangle + \langle v_y^2 \rangle + \langle v_z^2 \rangle = 3 \langle v_x^2 \rangle.

Why: the gas is isotropic; it has no preferred direction. A molecule is as likely to be moving in the +x direction as in the +y or +z. By the same symmetry, the typical magnitude of the x-component equals the typical magnitude of the y- or z-component, and in particular the means of the squares are equal. The sum of three equal things is three times one of them.

Therefore \langle v_x^2 \rangle = \frac{1}{3}\langle v^2 \rangle. Substituting into Step 5:

\boxed{\;P = \frac{1}{3} \cdot \frac{N m \langle v^2 \rangle}{V} = \frac{1}{3}\rho \langle v^2 \rangle,\;}

where \rho = Nm/V is the mass density of the gas. That is the fundamental result of kinetic theory: pressure equals one-third of the mass density times the mean-square speed.
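The isotropy step can be tested numerically. The sketch below draws random velocities, then evaluates the pressure both from Step 5 (using \langle v_x^2 \rangle) and from the boxed formula (using \langle v^2 \rangle / 3). The Gaussian sampling with a 300 m/s width per component, the N_2 molecular mass, and the 1 cm³ sample are assumptions chosen for illustration; any isotropic distribution would do for this check.

```python
import random


def sample_velocities(count, sigma, seed=0):
    """Draw velocity vectors with independent Gaussian components (assumed)."""
    rng = random.Random(seed)
    return [
        (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        for _ in range(count)
    ]


def mean_squares(velocities):
    """Return (<v_x^2>, <v^2>) for a list of velocity vectors."""
    count = len(velocities)
    mean_vx2 = sum(vx * vx for vx, _, _ in velocities) / count
    mean_v2 = sum(vx * vx + vy * vy + vz * vz for vx, vy, vz in velocities) / count
    return mean_vx2, mean_v2


def pressure_step5(n_molecules, mass, volume, mean_vx2):
    """P = N m <v_x^2> / V, the result of Step 5."""
    return n_molecules * mass * mean_vx2 / volume


def pressure_boxed(n_molecules, mass, volume, mean_v2):
    """P = N m <v^2> / (3 V), the boxed result."""
    return n_molecules * mass * mean_v2 / (3.0 * volume)


velocities = sample_velocities(100_000, sigma=300.0)
mean_vx2, mean_v2 = mean_squares(velocities)
# Illustrative sample: 2.7e19 N2 molecules in one cubic centimetre.
p_wall = pressure_step5(2.7e19, 4.65e-26, 1.0e-6, mean_vx2)
p_boxed = pressure_boxed(2.7e19, 4.65e-26, 1.0e-6, mean_v2)
print(f"{p_wall:.3e} Pa  vs  {p_boxed:.3e} Pa")
```

The two values differ only by sampling noise, and with these assumed inputs both land near atmospheric pressure.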

The factor of \frac{1}{3} comes from the geometric fact that only one-third of the molecular kinetic energy is associated with motion in any one direction — the other two-thirds is in the two perpendicular directions.

Extracting temperature

The formula P = \frac{1}{3}\rho\langle v^2 \rangle is purely mechanical — it contains Newton's laws and statistics, nothing more. To connect it with temperature, you compare it with the empirical ideal gas law, which is a macroscopic experimental fact:

P V = N k_B T,

where k_B = 1.381 \times 10^{-23} J/K is Boltzmann's constant.

Equating the two expressions for P:

\frac{N m \langle v^2 \rangle}{3 V} = \frac{N k_B T}{V}.

Cancel N/V from both sides:

\frac{m \langle v^2 \rangle}{3} = k_B T.

Multiply both sides by \frac{3}{2} to rearrange into a recognisable form:

\boxed{\;\tfrac{1}{2} m \langle v^2 \rangle = \tfrac{3}{2} k_B T.\;}

Why: \frac{1}{2} m \langle v^2 \rangle is the average translational kinetic energy per molecule. The equation says that this average kinetic energy is proportional to the absolute temperature, with the constant of proportionality \frac{3}{2} k_B.

This is the kinetic interpretation of temperature. Temperature is not some abstract thermometer-reading — it is, up to a factor, the average kinetic energy of a single molecule.

A corollary: at T = 0 K (absolute zero), \langle v^2 \rangle = 0 — all molecular motion ceases. (Quantum mechanics modifies this at very low temperatures — there is always a small zero-point motion — but classically, absolute zero is the temperature at which a gas has no kinetic energy.)

Another corollary: because the right side depends only on T, the left side does too. For any gas — hydrogen, nitrogen, helium, LPG — at the same temperature, the average kinetic energy per molecule is the same. Hydrogen molecules at 300 K have exactly the same \langle \frac{1}{2} m v^2 \rangle as nitrogen molecules at 300 K. Since hydrogen is lighter (smaller m), hydrogen molecules move faster.
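The two corollaries can be made concrete with a short calculation. The sketch below evaluates the boxed relation in both directions: from temperature to mean kinetic energy, and from mean-square speed back to temperature. The function names are hypothetical, and the molecular masses are obtained from the molar masses 2 g/mol and 28 g/mol.

```python
K_B = 1.381e-23   # Boltzmann constant, J/K
N_A = 6.022e23    # Avogadro's number, 1/mol


def mean_kinetic_energy(temperature):
    """Average translational kinetic energy per molecule, (3/2) k_B T."""
    return 1.5 * K_B * temperature


def mean_square_speed(mass, temperature):
    """<v^2> = 3 k_B T / m."""
    return 3.0 * K_B * temperature / mass


def temperature_from_mean_square_speed(mass, mean_v2):
    """Invert (1/2) m <v^2> = (3/2) k_B T for T."""
    return mass * mean_v2 / (3.0 * K_B)


m_h2 = 2.0e-3 / N_A    # mass of one H2 molecule, kg
m_n2 = 28.0e-3 / N_A   # mass of one N2 molecule, kg

ke_h2 = 0.5 * m_h2 * mean_square_speed(m_h2, 300.0)
ke_n2 = 0.5 * m_n2 * mean_square_speed(m_n2, 300.0)
print(ke_h2, ke_n2, mean_kinetic_energy(300.0))
```

Both gases carry the same energy per molecule at 300 K; only the mean-square speed differs, by the inverse ratio of the masses.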

Root-mean-square speed and its cousins

From \frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} k_B T, solving for \langle v^2 \rangle:

\langle v^2 \rangle = \frac{3 k_B T}{m}.

The square root of the mean of the squared speeds is called the root-mean-square speed:

\boxed{\;v_{\text{rms}} = \sqrt{\langle v^2 \rangle} = \sqrt{\frac{3 k_B T}{m}}.\;}

For one mole, it is more convenient to use the molar mass M = N_A m and the universal gas constant R = N_A k_B = 8.314 J/(mol·K):

v_{\text{rms}} = \sqrt{\frac{3 R T}{M}}.

Why: dividing numerator and denominator of 3 k_B T / m by N_A (Avogadro's number) turns k_B into R and m into M. Both forms are useful; use whichever matches the data you have.

Concrete numbers

For nitrogen (N_2, molar mass M = 28 \times 10^{-3} kg/mol) at T = 300 K:

v_{\text{rms}} = \sqrt{\frac{3 \cdot 8.314 \cdot 300}{28 \times 10^{-3}}} = \sqrt{\frac{7482.6}{0.028}} = \sqrt{267{,}236} \approx 517 \text{ m/s}.

For hydrogen (H_2, M = 2 \times 10^{-3} kg/mol) at 300 K:

v_{\text{rms}} = \sqrt{\frac{3 \cdot 8.314 \cdot 300}{2 \times 10^{-3}}} = \sqrt{3{,}741{,}300} \approx 1934 \text{ m/s}.

Hydrogen molecules move \sqrt{14} \approx 3.74 times faster than nitrogen because hydrogen is 14 times lighter. Both speeds exceed the speed of sound in air (\sim 340 m/s) — this is no coincidence; the speed of sound in a gas is of the same order as v_{\text{rms}} because sound is the propagation of density fluctuations by molecular collisions. (A precise relation: v_{\text{sound}} = \sqrt{\gamma k_B T / m} with \gamma \approx 1.4 for diatomic gases, compared to \sqrt{3 k_B T / m} for v_{\text{rms}}. The ratio is \sqrt{\gamma/3} \approx 0.68.)
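The two numerical results above follow from a one-line function. This is a minimal sketch using the molar form of the formula; the function name `v_rms` is chosen here for illustration.

```python
import math

R = 8.314   # universal gas constant, J/(mol K)


def v_rms(temperature, molar_mass):
    """Root-mean-square speed sqrt(3 R T / M), with M in kg/mol."""
    return math.sqrt(3.0 * R * temperature / molar_mass)


v_n2 = v_rms(300.0, 28.0e-3)
v_h2 = v_rms(300.0, 2.0e-3)
speed_ratio = v_h2 / v_n2   # should equal sqrt(28 / 2) = sqrt(14)

print(f"N2: {v_n2:.0f} m/s, H2: {v_h2:.0f} m/s, ratio: {speed_ratio:.2f}")
```

The most common slip when using this formula is leaving the molar mass in g/mol, which makes every speed too small by a factor of \sqrt{1000}.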

Mean speed and most-probable speed

The full distribution of molecular speeds (first derived by Maxwell in 1860) is the Maxwell–Boltzmann distribution. It has a peak at some speed and a long tail to higher speeds. Three characteristic speeds are commonly quoted:

Most-probable speed v_p, the peak of the distribution:

v_p = \sqrt{\frac{2 k_B T}{m}}

Mean speed \langle v \rangle, the arithmetic average of the speeds themselves (no squaring and no square root):

\langle v \rangle = \sqrt{\frac{8 k_B T}{\pi m}}

Root-mean-square speed v_{\text{rms}}, the one you derive directly from kinetic theory:

v_{\text{rms}} = \sqrt{\frac{3 k_B T}{m}}

Their ratios are fixed, independent of the gas or the temperature:

v_p : \langle v \rangle : v_{\text{rms}} = \sqrt{2} : \sqrt{8/\pi} : \sqrt{3} \approx 1.414 : 1.596 : 1.732,

or dividing through by \sqrt{2}: 1 : 1.128 : 1.225. The most-probable is the lowest, the mean is in the middle, and v_{\text{rms}} is the highest because it weights the high-speed tail more (it is the square root of an average of squares).

The Maxwell–Boltzmann distribution of molecular speeds in a gas. Three characteristic speeds pick out different features: the most-probable $v_p$ (peak), the mean $\langle v \rangle$ (arithmetic average), and the root-mean-square $v_{\text{rms}}$ (square root of the mean square). They stand in a fixed ratio $1 : 1.128 : 1.225$, regardless of gas or temperature.
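The fixed ratio of the three characteristic speeds is easy to confirm from the closed-form expressions. The sketch below evaluates them for an assumed N_2 molecular mass at 300 K; the ratios do not depend on that choice.

```python
import math

K_B = 1.381e-23   # Boltzmann constant, J/K


def most_probable_speed(mass, temperature):
    return math.sqrt(2.0 * K_B * temperature / mass)


def mean_speed(mass, temperature):
    return math.sqrt(8.0 * K_B * temperature / (math.pi * mass))


def rms_speed(mass, temperature):
    return math.sqrt(3.0 * K_B * temperature / mass)


M_N2 = 4.65e-26   # approximate mass of one N2 molecule, kg (assumed)
v_p = most_probable_speed(M_N2, 300.0)
v_mean = mean_speed(M_N2, 300.0)
v_rms = rms_speed(M_N2, 300.0)

print(f"v_p = {v_p:.0f} m/s, <v> = {v_mean:.0f} m/s, v_rms = {v_rms:.0f} m/s")
print(f"ratios 1 : {v_mean / v_p:.3f} : {v_rms / v_p:.3f}")
```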

Worked examples

Example 1: How fast are the molecules in your room?

The air in your room is roughly 78% nitrogen and 21% oxygen at temperature 27 °C = 300 K. Find the rms speed of a nitrogen molecule and an oxygen molecule at this temperature.

The two main components of air at room temperature, with their rms speeds. Every molecule in the room is moving at these breakneck speeds — you just can't see them individually.

Step 1. Use v_{\text{rms}} = \sqrt{3RT/M} with R = 8.314 J/(mol·K), T = 300 K.

For nitrogen, M_{N_2} = 28 \times 10^{-3} kg/mol (convert from g/mol to kg/mol for SI units).

v_{\text{rms,N}_2} = \sqrt{\frac{3 \cdot 8.314 \cdot 300}{28 \times 10^{-3}}} = \sqrt{\frac{7482.6}{0.028}} = \sqrt{267{,}236} \approx 517 \text{ m/s}.

Why: plugging numbers into the formula. The units check out: [R T / M] = \frac{\text{J/(mol·K)} \cdot \text{K}}{\text{kg/mol}} = \frac{\text{J}}{\text{kg}} = \frac{\text{kg·m}^2/\text{s}^2}{\text{kg}} = \text{m}^2/\text{s}^2. Taking the square root gives m/s.

Step 2. For oxygen, M_{O_2} = 32 \times 10^{-3} kg/mol.

v_{\text{rms,O}_2} = \sqrt{\frac{3 \cdot 8.314 \cdot 300}{32 \times 10^{-3}}} = \sqrt{\frac{7482.6}{0.032}} = \sqrt{233{,}831} \approx 484 \text{ m/s}.

Why: same formula, heavier molecule. Because v_{\text{rms}} \propto 1/\sqrt{M}, oxygen (32/28 \approx 1.14 times the mass of nitrogen) moves \sqrt{28/32} = \sqrt{7/8} \approx 0.935 times as fast, about 6.5% slower.

Step 3. Quick consistency check: compare to the speed of sound.

The speed of sound in air at 300 K is about 347 m/s. The ratio v_{\text{sound}}/v_{\text{rms}} for air (taking a weighted-average molar mass of 29 g/mol, which gives v_{\text{rms}} \approx 508 m/s) should be close to \sqrt{\gamma/3} = \sqrt{1.4/3} \approx 0.683. Checking: 347/508 \approx 0.683.

Why: cross-checking kinetic-theory predictions against a quantity you can actually measure (the speed of sound) is a good habit. If the two don't agree to within a few percent, something is wrong with your assumptions or your arithmetic.

Result: Nitrogen molecules move at about 517 m/s, oxygen at about 484 m/s, both faster than a passenger jet. Every cubic centimetre of your room contains \sim 2.5 \times 10^{19} molecules moving at these speeds in every direction.

What this shows: Molecular speeds are not modest. Air is a supersonic crowd. The only reason you don't notice is that equal numbers move in opposite directions, so there is no net wind; and the inter-molecular collisions are so frequent (once every \sim 10^{-10} s) that any individual molecule zigzags almost instantly.
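The whole of Example 1 fits in a short script. The sketch below recomputes the rms speeds and the ideal-gas sound speed \sqrt{\gamma R T / M} for air, assuming \gamma = 1.4 and an average molar mass of 29 g/mol as in the example; the dictionary layout is just one convenient way to organise the inputs.

```python
import math

R = 8.314       # universal gas constant, J/(mol K)
T = 300.0       # room temperature, K
GAMMA = 1.4     # adiabatic index assumed for diatomic gases

MOLAR_MASS = {"N2": 28.0e-3, "O2": 32.0e-3, "air": 29.0e-3}   # kg/mol

# rms speed of each species from sqrt(3 R T / M)
rms = {name: math.sqrt(3.0 * R * T / m) for name, m in MOLAR_MASS.items()}

# ideal-gas speed of sound in air, and its ratio to the rms speed
sound_air = math.sqrt(GAMMA * R * T / MOLAR_MASS["air"])
ratio = sound_air / rms["air"]

for name, speed in rms.items():
    print(f"{name}: v_rms = {speed:.0f} m/s")
print(f"sound in air: {sound_air:.0f} m/s, ratio to v_rms: {ratio:.3f}")
```

Since both speeds share the factor \sqrt{RT/M}, the ratio reduces to \sqrt{\gamma/3} exactly; a measured sound speed that disagreed noticeably with the printed one would point to a wrong temperature or molar mass.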

Example 2: An LPG cylinder on the kitchen counter

A 14.2 kg LPG cylinder contains liquefied butane (C_4H_{10}, M = 58 \times 10^{-3} kg/mol). On a summer afternoon the cylinder sits at 45 °C = 318 K. The cylinder's internal volume is about 28 litres (V = 0.028 m³). Treating the gaseous phase as an ideal gas at saturation, and given that butane's vapour pressure at 318 K is about 4.2 bar = 4.2 \times 10^5 Pa, find (a) the rms speed of the butane molecules, (b) the number of molecules in the gaseous phase, (c) the average kinetic energy per molecule, and (d) the total translational kinetic energy of the vapour.

An LPG cylinder in a hot kitchen. Inside, liquid butane is in equilibrium with its vapour at 4.2 bar. The vapour molecules are flying at supersonic speeds; the pressure is the time-averaged push they exert on the steel walls.

Step 1. Compute v_{\text{rms}}.

v_{\text{rms}} = \sqrt{\frac{3 R T}{M}} = \sqrt{\frac{3 \cdot 8.314 \cdot 318}{58 \times 10^{-3}}} = \sqrt{\frac{7931.6}{0.058}} = \sqrt{136{,}752} \approx 370 \text{ m/s}.

Why: butane is much heavier than nitrogen (58 versus 28 g/mol), so at the same temperature its molecules move more slowly. Still Mach-1-and-change, but comfortably sub-nitrogen.

Step 2. Number of vapour molecules. Use the ideal gas law PV = Nk_BT. The vapour occupies only a fraction of the 28 L: the liquid takes most of the volume and the vapour sits above it, with the exact split depending on how full the cylinder is. For simplicity, estimate the vapour volume as \sim 5 L = 0.005 m³, as for a nearly full cylinder.

N = \frac{PV}{k_B T} = \frac{(4.2 \times 10^5)(0.005)}{(1.381 \times 10^{-23})(318)} = \frac{2100}{4.39 \times 10^{-21}} \approx 4.8 \times 10^{23} \text{ molecules}.

Why: pressure times volume divided by k_B T counts molecules; this is what PV = Nk_B T says. That is about 0.8 mol of molecules in the vapour phase alone.

Step 3. Average kinetic energy per molecule.

\langle KE \rangle = \tfrac{3}{2} k_B T = \tfrac{3}{2} \cdot (1.381 \times 10^{-23}) \cdot 318 \approx 6.59 \times 10^{-21} \text{ J}.

Why: this is the kinetic interpretation of temperature. At 318 K, every molecule — butane, nitrogen, helium, anything — has this same average translational kinetic energy.

Step 4. Total translational kinetic energy of the vapour.

KE_{\text{total}} = N \cdot \langle KE \rangle = (4.8 \times 10^{23})(6.59 \times 10^{-21}) \approx 3160 \text{ J} \approx 3.2 \text{ kJ}.

Why: total kinetic energy is the per-molecule average multiplied by the number of molecules.

Result: The butane vapour molecules move at about 370 m/s, there are about 4.8 \times 10^{23} of them in the gaseous phase, each carrying about 6.6 \times 10^{-21} J of translational kinetic energy, for a total of about 3.2 kJ. To picture that total, it is roughly the energy it would take to lift a 1 kg weight by 320 m — a surprising amount of energy in a few litres of seemingly quiescent vapour.

What this shows: An LPG cylinder is not a static pressurised container — it is a kinetic one. The steel walls are continuously absorbing 10^{27} molecular impacts per second, each elastic, each transferring about 2mv_x of momentum. The aggregate shows up on a pressure gauge as 4.2 bar. When the cylinder heats up on a summer afternoon, the molecules move faster, the impacts carry more momentum, and the pressure rises — which is why LPG cylinders have safety-release valves and should never be stored in direct sunlight.
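The four steps of Example 2 can be reproduced in a few lines. The sketch below uses the same inputs as the example, including the rough 5 L estimate of the vapour volume, which is an assumption rather than a measured value.

```python
import math

R = 8.314          # universal gas constant, J/(mol K)
K_B = 1.381e-23    # Boltzmann constant, J/K
G = 9.81           # gravitational acceleration, m/s^2

T = 318.0          # cylinder temperature, K
M_BUTANE = 58.0e-3 # molar mass of butane, kg/mol
P = 4.2e5          # vapour pressure quoted in the example, Pa
V_VAPOUR = 0.005   # assumed vapour volume, m^3

v_rms = math.sqrt(3.0 * R * T / M_BUTANE)      # Step 1
n_molecules = P * V_VAPOUR / (K_B * T)         # Step 2
ke_per_molecule = 1.5 * K_B * T                # Step 3
ke_total = n_molecules * ke_per_molecule       # Step 4, equals (3/2) P V
lift_height = ke_total / (1.0 * G)             # height a 1 kg mass could be raised

print(f"v_rms = {v_rms:.0f} m/s")
print(f"N = {n_molecules:.2e} molecules")
print(f"KE per molecule = {ke_per_molecule:.2e} J")
print(f"total KE = {ke_total:.0f} J, lift height = {lift_height:.0f} m")
```

Note that the total kinetic energy does not depend on the temperature or on k_B once P and V are fixed: N \cdot \tfrac{3}{2} k_B T collapses to \tfrac{3}{2} P V.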

Common confusions

"All molecules move at v_{\text{rms}}." No — v_{\text{rms}} is a single summary statistic of a wide distribution. At 300 K, individual nitrogen molecules range from near zero up to several thousand m/s; the Maxwell–Boltzmann distribution describes the full spread. v_{\text{rms}} is a useful average because it's directly tied to the average kinetic energy (\frac{1}{2} m v_{\text{rms}}^2 = \frac{3}{2} k_B T), not because every molecule is at that speed.
"Mean speed equals v_{\text{rms}}." Close, but not equal. \langle v \rangle = \sqrt{8 k_B T / (\pi m)} while v_{\text{rms}} = \sqrt{3 k_B T/m}. Their ratio is \sqrt{8/(3\pi)} \approx 0.921. Because v_{\text{rms}} averages squares (which weights large values more), it is slightly larger than \langle v \rangle.
"The molecules push on the walls in all directions equally." They do, but the force per unit area (pressure) is the same on every wall by symmetry — an isotropic gas at rest has no preferred direction. The pressure derivation picked the right wall for concreteness, but the answer is the same for the top or the left wall or any other.
"Kinetic theory requires molecules to really be point particles with zero size." No — real molecules have a finite size, about 0.3 nm for simple gases. The derivation assumes that the molecule-to-wall collision times are brief compared to the round-trip time, which is true whenever the mean free path is much larger than the molecular diameter. For ordinary gases this is always true. At high densities it fails, and the van der Waals correction shows up.
"Temperature is what a thermometer reads; kinetic theory is just an analogy." The arrow of explanation runs the other way. The thermometer is a proxy for something that is physically the average molecular kinetic energy. When a thermometer bulb equilibrates with a gas, the mercury expands because its atoms, on average, are being jostled harder — Newton's third law at the surface. Temperature-as-kinetic-energy is what the thermometer is detecting.
"A gas at 0 °C has no energy." Wrong — 0 °C is 273 K, and \frac{3}{2} k_B \cdot 273 \approx 5.7 \times 10^{-21} J per molecule of translational kinetic energy. Absolute zero (0 K = −273.15 °C) is where classical kinetic energy vanishes, not the Celsius zero.

If you are comfortable with the derivation of P = \frac{1}{3}\rho\langle v^2\rangle and the interpretation of temperature as kinetic energy, you have the heart of kinetic theory. What follows deepens the derivation and traces where the assumptions come from and where they break down.

Why the cubical-box assumption does not limit generality

The derivation above used a cube of side L, but the answer P = \frac{1}{3}\rho\langle v^2 \rangle is independent of the container's shape. Here is why. The result is local: each small patch of wall feels a pressure that depends only on the local molecular density and mean-square velocity, neither of which knows about the distant walls. You could derive the same formula by considering the flux of momentum crossing any infinitesimal surface inside the gas (real or imaginary), with the same answer. The cube is just computational scaffolding.

Formally: the pressure (momentum-flux) tensor of an ideal gas at rest is isotropic, P_{ij} = n m \langle v_i v_j \rangle = P \delta_{ij}, where P is the scalar pressure and n = N/V is the number density. Kinetic theory gives P = \frac{1}{3} n m \langle v^2 \rangle regardless of the container's shape. The cubical box was a pedagogical device to make the bookkeeping of the derivation easy.

Why exactly one-third?

The mysterious-looking factor of \frac{1}{3} in P = \frac{1}{3}\rho\langle v^2\rangle has a clean geometric origin. In three dimensions, the kinetic energy of a molecule decomposes into three equal parts — one for motion along each axis:

\tfrac{1}{2} m v^2 = \tfrac{1}{2} m v_x^2 + \tfrac{1}{2} m v_y^2 + \tfrac{1}{2} m v_z^2,

and by isotropy the three terms have equal averages. Pressure on the right wall comes only from v_x-motion, so it picks up one of the three equal pieces — hence the \frac{1}{3}.

If you lived in a two-dimensional world (a membrane of molecules), the factor would be \frac{1}{2}, and for a hypothetical d-dimensional world it would be \frac{1}{d}. The formula P = \frac{1}{d} n m \langle v^2 \rangle is dimensional in a very literal sense.
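The 1/d claim can be checked by sampling isotropic velocities in d dimensions and measuring what share of \langle v^2 \rangle belongs to one component. This is a sketch under an assumed Gaussian distribution with unit width; the function name `component_share` is hypothetical.

```python
import random


def component_share(dimensions, samples=50_000, seed=1):
    """Estimate <v_1^2> / <v^2> for isotropic velocities in d dimensions."""
    rng = random.Random(seed)
    sum_first = 0.0
    sum_total = 0.0
    for _ in range(samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(dimensions)]
        sum_first += v[0] * v[0]
        sum_total += sum(c * c for c in v)
    return sum_first / sum_total


for d in (1, 2, 3, 5):
    print(d, round(component_share(d), 3), round(1.0 / d, 3))
```

The estimated share tracks 1/d to within sampling noise, so the prefactor in the pressure formula is simply the reciprocal of the number of spatial dimensions.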

Relation to kinetic energy density

Rewrite the pressure formula slightly:

P V = \tfrac{1}{3} N m \langle v^2 \rangle = \tfrac{2}{3} \cdot N \cdot \tfrac{1}{2} m \langle v^2 \rangle = \tfrac{2}{3} E_{\text{trans}},

where E_{\text{trans}} is the total translational kinetic energy of the gas. So PV equals two-thirds of the total translational kinetic energy. For a monatomic gas, this is also the total internal energy U (single atoms have no rotational or vibrational modes), giving U = \frac{3}{2} N k_B T = \frac{3}{2} PV. For diatomic gases like nitrogen or oxygen, U also includes rotational kinetic energy, and the prefactor changes (see Degrees of Freedom and Equipartition).

Maxwell's derivation of the speed distribution

Maxwell showed in 1860 that in thermal equilibrium, the probability that a molecule has velocity components in the infinitesimal ranges (v_x, v_x + dv_x), (v_y, v_y + dv_y), (v_z, v_z + dv_z) is

f(v_x, v_y, v_z) \, dv_x \, dv_y \, dv_z = \left(\frac{m}{2\pi k_B T}\right)^{3/2} e^{-m(v_x^2+v_y^2+v_z^2)/(2 k_B T)} \, dv_x \, dv_y \, dv_z.

Integrating over directions (with d^3v = 4\pi v^2 \, dv in spherical coordinates) gives the distribution of speeds:

F(v) \, dv = 4\pi \left(\frac{m}{2\pi k_B T}\right)^{3/2} v^2 e^{-mv^2/(2 k_B T)} \, dv.

The three characteristic speeds (v_p, \langle v \rangle, v_{\text{rms}}) come from, respectively, maximising F(v), computing \int v F(v) dv, and computing \sqrt{\int v^2 F(v) dv}. The ratios 1 : 1.128 : 1.225 drop out of the Gaussian integrals.

Maxwell's derivation rested on a clever symmetry argument: requiring that the velocity distribution be isotropic and that the different velocity components be statistically independent forces the distribution to be Gaussian. This is now a standard result in statistical mechanics.
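The three characteristic speeds can also be recovered numerically from F(v), without doing the Gaussian integrals by hand. The sketch below integrates the speed distribution with a simple midpoint rule up to ten times the most-probable speed; the cutoff, the step count, and the N_2 mass are assumptions made for illustration.

```python
import math

K_B = 1.381e-23   # Boltzmann constant, J/K


def speed_pdf(v, mass, temperature):
    """Maxwell-Boltzmann speed distribution F(v)."""
    a = mass / (2.0 * K_B * temperature)
    return 4.0 * math.pi * (a / math.pi) ** 1.5 * v * v * math.exp(-a * v * v)


def characteristic_speeds(mass, temperature, steps=20_000):
    """Return (normalisation, v_p, <v>, v_rms) by midpoint integration."""
    v_max = 10.0 * math.sqrt(2.0 * K_B * temperature / mass)   # assumed cutoff
    dv = v_max / steps
    norm = mean = mean_sq = 0.0
    peak_v = peak_f = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        f = speed_pdf(v, mass, temperature)
        norm += f * dv
        mean += v * f * dv
        mean_sq += v * v * f * dv
        if f > peak_f:               # track the location of the maximum
            peak_f, peak_v = f, v
    return norm, peak_v, mean, math.sqrt(mean_sq)


norm, v_p, v_mean, v_rms = characteristic_speeds(4.65e-26, 300.0)
print(f"norm = {norm:.6f}")
print(f"ratios 1 : {v_mean / v_p:.3f} : {v_rms / v_p:.3f}")
```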

Where the "elastic" assumption comes from

The kinetic model assumes molecule-wall collisions are perfectly elastic. In reality, a gas molecule hitting a solid wall can temporarily stick, equilibrate with the wall's atoms, then depart in a random direction with whatever thermal velocity is characteristic of the wall temperature. This process is called accommodation, and the relevant parameter is the accommodation coefficient — the fraction of incoming kinetic energy actually exchanged with the wall.

For most gas-wall combinations at ordinary temperatures, the accommodation coefficient is close to 1 (nearly complete thermal accommodation), and the wall is itself at the same temperature as the gas — in which case the incoming and outgoing velocity distributions look identical in a statistical sense, exactly mimicking perfectly elastic bounces with the same distribution of v_x magnitudes. The macroscopic pressure comes out the same. The elastic assumption is a useful shortcut; the real mechanism is more subtle but yields the same answer for a gas in thermal equilibrium with its walls.

Failure modes: high pressure and low temperature

The ideal-gas model fails when either of two of its assumptions breaks down.

High pressure / high density. Molecular volume becomes non-negligible (assumption 2 fails). The van der Waals correction replaces V with (V - Nb), where b is the excluded volume per molecule.

Low temperature / strong attractions. Intermolecular forces become non-negligible (assumption 3 fails). The van der Waals correction replaces P with (P + aN^2/V^2), accounting for the inward pull of molecules on each other that effectively reduces their pressure on the walls.

The combined van der Waals equation is

(P + a N^2/V^2)(V - Nb) = N k_B T,

and its study is the subject of Real Gases and van der Waals Equation.
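A short calculation shows how large the corrections are in practice. The sketch below solves the van der Waals equation for P and compares it with the ideal-gas value at two densities. The constants are commonly tabulated approximate molar values for nitrogen, converted to per-molecule form to match the equation as written here; treat them as assumptions rather than precise data.

```python
K_B = 1.381e-23   # Boltzmann constant, J/K
N_A = 6.022e23    # Avogadro's number, 1/mol

# Assumed approximate molar van der Waals constants for nitrogen.
A_MOLAR = 0.137     # Pa m^6 / mol^2
B_MOLAR = 3.87e-5   # m^3 / mol

a = A_MOLAR / N_A ** 2   # per-molecule attraction constant
b = B_MOLAR / N_A        # excluded volume per molecule


def ideal_pressure(n_molecules, volume, temperature):
    return n_molecules * K_B * temperature / volume


def vdw_pressure(n_molecules, volume, temperature):
    """Solve (P + a N^2/V^2)(V - N b) = N k_B T for P."""
    repulsion = n_molecules * K_B * temperature / (volume - n_molecules * b)
    attraction = a * n_molecules ** 2 / volume ** 2
    return repulsion - attraction


for volume in (0.0224, 0.0002):   # one mole at roughly 1 atm, then compressed
    p_ideal = ideal_pressure(N_A, volume, 273.15)
    p_vdw = vdw_pressure(N_A, volume, 273.15)
    print(f"V = {volume} m^3: ideal {p_ideal:.3e} Pa, van der Waals {p_vdw:.3e} Pa")
```

At atmospheric density the two pressures agree to a fraction of a percent, while at the compressed volume they differ by several percent, which is the quantitative sense in which the ideal-gas model is an excellent approximation under ordinary conditions.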

Connection to Brownian motion

A tiny dust particle suspended in a gas is continuously bombarded by molecular impacts from all sides. Though the impacts average to zero force, statistical fluctuations give the particle a random walk — the Brownian motion observed by botanist Robert Brown in 1827. Einstein in 1905 showed that the mean-square displacement of the Brownian particle after time t is \langle x^2 \rangle = 2 D t, where the diffusion constant D = k_B T / (6\pi \eta R) depends on the viscosity \eta, particle radius R, and temperature T. Measuring Brownian motion in a microscope and comparing to this formula gave Perrin (1908) the first direct experimental determination of Avogadro's number — and therefore of the molecular existence that kinetic theory assumes.
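Einstein's two formulas are simple enough to evaluate directly. The sketch below assumes a particle of radius 0.5 µm in a fluid with the viscosity of water (about 10^{-3} Pa·s), a setting in which the Stokes drag behind D = k_B T/(6\pi\eta R) is a good description; these inputs are illustrative and not taken from Perrin's data.

```python
import math

K_B = 1.381e-23   # Boltzmann constant, J/K


def diffusion_constant(temperature, viscosity, radius):
    """Stokes-Einstein diffusion constant D = k_B T / (6 pi eta R)."""
    return K_B * temperature / (6.0 * math.pi * viscosity * radius)


def rms_displacement(diffusion, time):
    """Root-mean-square displacement along one axis, sqrt(2 D t)."""
    return math.sqrt(2.0 * diffusion * time)


# Assumed illustrative values: 300 K, water-like viscosity, 0.5 micron radius.
D = diffusion_constant(300.0, 1.0e-3, 0.5e-6)
print(f"D = {D:.2e} m^2/s")
for t in (1.0, 4.0, 100.0):
    print(f"t = {t:5.0f} s: rms displacement = {rms_displacement(D, t) * 1e6:.2f} micron")
```

The displacement grows as \sqrt{t} rather than t, the signature of a random walk; turning the first formula around, a measured D together with \eta, R, and T yields k_B, and hence Avogadro's number through N_A = R_{\text{gas}}/k_B.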

Pressure, temperature, and the equipartition theorem

The result \frac{1}{2} m \langle v_x^2 \rangle = \frac{1}{2} k_B T (each component of velocity contributes \frac{1}{2} k_B T of average kinetic energy) is a special case of the equipartition theorem: in thermal equilibrium, every quadratic term in the energy (each translational velocity component, each rotational angular velocity component, each vibrational mode — potential and kinetic) contributes \frac{1}{2} k_B T to the average energy. This is the structural principle that lets kinetic theory extend from monatomic gases to diatomic, polyatomic, and even to solids. See Degrees of Freedom and Equipartition for the development.

A cautionary note on "temperature of a single molecule"

The phrase "temperature is the average kinetic energy per molecule" is sometimes over-interpreted to mean that a single molecule has a temperature. It does not. Temperature is a statistical property of a large ensemble; a single molecule has a kinetic energy at any instant, not a temperature. Kinetic theory makes this clear: T enters only through the average of v^2 over many molecules and long times. For a gas containing ten molecules, the instantaneous average kinetic energy would fluctuate by roughly a quarter of its mean value (the relative spread is \sqrt{2/(3N)} \approx 0.26 for N = 10), so the gas would not have a well-defined temperature in the thermodynamic sense. You need the large-N limit to make the statistical average meaningful.
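How quickly the average settles down with N can be seen in a small Monte Carlo experiment. The sketch below repeatedly draws N molecules with Gaussian velocity components (an assumption corresponding to thermal equilibrium), records the sample-average kinetic energy each time, and compares its relative spread with the value \sqrt{2/(3N)} expected for such a distribution. Energies are in arbitrary units, since only the ratio matters.

```python
import math
import random


def relative_fluctuation(n_molecules, trials=1000, seed=2):
    """Relative standard deviation of the per-molecule average kinetic energy."""
    rng = random.Random(seed)
    averages = []
    for _ in range(trials):
        total = 0.0
        for _ in range(3 * n_molecules):        # three velocity components each
            total += rng.gauss(0.0, 1.0) ** 2
        averages.append(total / n_molecules)
    mean = sum(averages) / trials
    variance = sum((x - mean) ** 2 for x in averages) / trials
    return math.sqrt(variance) / mean


def predicted_fluctuation(n_molecules):
    """Expected relative spread sqrt(2 / (3 N)) for Gaussian components."""
    return math.sqrt(2.0 / (3.0 * n_molecules))


fluct_10 = relative_fluctuation(10)
fluct_100 = relative_fluctuation(100)
print(f"N = 10:  simulated {fluct_10:.3f}, predicted {predicted_fluctuation(10):.3f}")
print(f"N = 100: simulated {fluct_100:.3f}, predicted {predicted_fluctuation(100):.3f}")
```

Extrapolating the same scaling to N \sim 10^{23} gives a relative spread of order 10^{-12}, which is why the temperature of a macroscopic gas is sharply defined.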

Where this leads next

Ideal Gas Laws — the empirical relations PV = Nk_BT and its historical precursors (Boyle, Charles, Gay-Lussac) that kinetic theory explains from the molecular picture.
Degrees of Freedom and Equipartition — how each mode of molecular motion picks up \frac{1}{2}k_BT, extending the pressure-temperature connection to internal energy and specific heat.
Mean Free Path — the typical distance a molecule travels between collisions, the geometric quantity that controls diffusion, viscosity, and thermal conduction.
Real Gases and van der Waals Equation — how finite molecular size and intermolecular attraction modify the ideal-gas picture at high density and low temperature.
Internal Energy — the total microscopic energy of a gas (U = \frac{3}{2} N k_B T for a monatomic gas, more generally U = \frac{f}{2} N k_B T), the thermodynamic quantity that tracks what kinetic theory computes.