In short

Entropy is a state function, denoted S, defined for a system in any equilibrium state. Its change between two states is

\Delta S \;=\; \int \frac{dQ_{\text{rev}}}{T},

where the integral is taken along any reversible path connecting the two states, and T is the absolute temperature at which the reversible heat dQ_{\text{rev}} is exchanged. Because S is a state function, \Delta S depends only on the endpoints — even though Q and T along the actual (possibly irreversible) path may look nothing like the reversible path used to compute \Delta S. The second law of thermodynamics in entropy language is

\boxed{\;\Delta S_{\text{universe}} \;=\; \Delta S_{\text{system}} \;+\; \Delta S_{\text{surroundings}} \;\ge\; 0\;}

with equality only for reversible (idealised, infinitely slow, frictionless) processes, and strict inequality for every real process. This is the arrow of time: an isolated system's entropy can never decrease. Statistically (Boltzmann, 1877), entropy counts microstates:

S \;=\; k_B \ln W,

where W is the number of microscopic arrangements consistent with the macroscopic state and k_B = 1.38 \times 10^{-23} J/K is Boltzmann's constant. Macroscopic states with more available microstates have higher entropy; natural processes drift toward states with more microstates because there are simply more of them — a statistical bias so overwhelming it looks like a law of physics.

A tall glass of nimbu pani sits on a sun-warmed table in Chennai. Inside it, a cube of ice floats near the top. You glance at the glass, look away to check a WhatsApp message, and look back thirty minutes later. The ice is gone. The drink is cool. A thin film of condensation has beaded on the glass's outside. Nothing surprising has happened — this is the most ordinary of experiences, repeated in kitchens across India every afternoon of April and May. But you have just watched one of the deepest laws of physics work.

Why is it deep? Because the first law of thermodynamics — energy conservation — would equally permit the reverse process. Imagine the video of your glass played backwards: beads of condensation evaporate back into the air, the drink warms itself by a few degrees, and out of the uniform liquid a cube of solid ice spontaneously forms and rises to the surface. Every frame of this reverse video conserves energy perfectly. No joule has been created or destroyed; heat has simply moved from the drink to the ice instead of from the ice to the drink. The first law has no objection.

And yet, in every recorded second of every glass of nimbu pani in the history of Indian summers, no such reversal has ever been observed. The physics that forbids this — the physics that gives time its direction, that distinguishes "before" from "after" in a universe whose fundamental laws are time-symmetric — is summarised in one thing: entropy. An isolated system's entropy can never decrease. Hot things cool. Ordered things disperse. Information dissipates. Broken glasses never reassemble.

The goal of this article is to make entropy calculable and intuitive at the same time. Calculable in the sense that, given a process, you can compute \Delta S in joules per kelvin and stick it into the second law. Intuitive in the sense that, when you see sugar dissolve in your chai, you should be able to see why the second law forbids the reverse without reaching for a textbook. The construction begins from the Carnot cycle you already know — that is where Clausius discovered entropy in 1865 — and ends with the Boltzmann microstate formula, which tells you where entropy comes from in the first place.

The hint from the Carnot cycle

Return to the Carnot cycle derived in heat engines and Carnot cycle. A reversible engine between reservoirs at T_H and T_C absorbs Q_H from the hot reservoir, rejects Q_C to the cold one, and has efficiency

\eta_{\text{Carnot}} \;=\; 1 \;-\; \frac{Q_C}{Q_H} \;=\; 1 \;-\; \frac{T_C}{T_H}.

Cancel the 1s:

\frac{Q_C}{Q_H} \;=\; \frac{T_C}{T_H},

rearrange:

\frac{Q_H}{T_H} \;=\; \frac{Q_C}{T_C},

or, if you adopt the sign convention that Q_C is rejected by the system (so it is negative when counted as heat into the system):

\frac{Q_H}{T_H} \;+\; \frac{-Q_C}{T_C} \;=\; 0 \qquad \text{(signed, reversible Carnot)}.

Two heats, divided by their reservoir temperatures, sum to zero around one cycle. This is suspicious. A sum that is zero around every closed loop is the hallmark of a state function (the same way \oint \vec F \cdot d\vec r = 0 around every closed loop is the signature of a conservative force with a potential energy).
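The loop sum can be checked numerically. The reservoir temperatures and Q_H below are illustrative values, not taken from the text:

```python
import math

# A reversible Carnot engine between illustrative reservoirs.
T_H, T_C = 500.0, 300.0           # K
Q_H = 1000.0                      # heat absorbed from the hot reservoir, J
Q_C = Q_H * T_C / T_H             # heat rejected, fixed by Q_C/Q_H = T_C/T_H

# Signed heats into the system: +Q_H at T_H, -Q_C at T_C.
loop_sum = Q_H / T_H + (-Q_C) / T_C

print(loop_sum)   # 0.0, the hallmark of a state function
```

Whatever temperatures you pick, the signed sum around the cycle comes out zero.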

Clausius showed in 1865 that this observation extends to arbitrary reversible cycles — not just the Carnot cycle. Any reversible cycle can be approximated as a sum of infinitesimal Carnot cycles (imagine tiling the interior of any closed loop on a PV diagram with a fine grid of tiny Carnot cycles, whose mutual boundaries cancel). The general statement is the Clausius equality:

\oint_{\text{reversible}} \frac{dQ_{\text{rev}}}{T} \;=\; 0.

Why: a differential whose integral around every closed loop vanishes is exact, meaning it is the differential of some state function. The integral between two states then depends only on the endpoints, not on the path, just like the potential energy associated with a conservative force.

That function is the entropy S, defined so that

dS \;=\; \frac{dQ_{\text{rev}}}{T},

and over a finite reversible process from state A to state B,

\Delta S \;=\; S_B \;-\; S_A \;=\; \int_A^B \frac{dQ_{\text{rev}}}{T}.

Units: joules per kelvin (J/K). Entropy is extensive — double the amount of stuff, double the entropy — just like volume or internal energy.

The key subtlety — what "dQ_{\text{rev}}" means

A reader running into this definition for the first time can easily be tripped up. The integrand is dQ_{\text{rev}} — the heat exchanged along a reversible path. But S is a state function: \Delta S depends only on the endpoints, not on the path. So how do you use it in an irreversible process, where the heat exchanged is not dQ_{\text{rev}}?

The answer is the following rule, which will be used repeatedly in the worked examples: to compute \Delta S for any process (reversible or irreversible), find a reversible path between the same two endpoints and integrate dQ_{\text{rev}}/T along that path. The actual Q and T along the real (irreversible) path are irrelevant to computing \Delta S, because S is a state function.

This is the mental move that makes entropy calculations possible. In the worked examples below, you will apply it explicitly — one example involves a finite temperature difference (ice melting in lemonade), an irreversible process, but \Delta S is computed by pretending the process happened reversibly.

Why \Delta S_{\text{universe}} \ge 0 — the Clausius inequality

Consider now an irreversible cycle — a real engine, not a Carnot idealisation. Such an engine has efficiency \eta < \eta_{\text{Carnot}} (by Carnot's theorem), which means

1 - \frac{Q_C}{Q_H} \;<\; 1 - \frac{T_C}{T_H} \;\Longrightarrow\; \frac{Q_C}{T_C} \;>\; \frac{Q_H}{T_H}.

Moving both terms to the same side (with Q_C counted as outflow, so -Q_C is the signed heat into the system at T_C):

\frac{Q_H}{T_H} \;+\; \frac{-Q_C}{T_C} \;<\; 0 \qquad \text{(irreversible cycle)}.

The sum of Q/T taken around the cycle is negative for an irreversible engine. The generalisation to any cycle, reversible or not, is the Clausius inequality:

\boxed{\;\oint \frac{dQ}{T} \;\le\; 0,\;}

with equality if and only if the cycle is reversible.

Now use this to bound \Delta S for any real process going from state A to state B. Construct a cycle: the real (possibly irreversible) process from A to B, followed by a reversible return from B to A. Around this cycle the Clausius inequality says

\int_A^B \frac{dQ_{\text{real}}}{T} \;+\; \int_B^A \frac{dQ_{\text{rev}}}{T} \;\le\; 0.

The second integral is -\Delta S = S_A - S_B. Rearrange:

\int_A^B \frac{dQ_{\text{real}}}{T} \;\le\; S_B - S_A \;=\; \Delta S_{\text{system}}.

Why: the left side is the actual heat divided by temperature integrated along the real path; the right side is the state-function entropy change. The inequality says real heat flow produces less \int dQ/T than the entropy change demands — the difference is entropy generated internally by irreversibilities.

For an isolated system, there is no heat exchange with anything, so dQ = 0 everywhere along the real path. The left side is zero. Therefore

\Delta S_{\text{isolated}} \;\ge\; 0.

Why: an isolated system cannot dump entropy to its surroundings because it has no surroundings. Its entropy can only increase (irreversible process) or stay the same (reversible idealisation).

For a non-isolated system, the bookkeeping is done by combining system and surroundings (which is isolated, by definition — there is nothing outside "the universe"). Whatever heat leaves the system enters the surroundings; the surroundings' temperature T_{\text{surr}} is what sets the relevant T for the surroundings' entropy change. So:

\Delta S_{\text{universe}} \;=\; \Delta S_{\text{system}} \;+\; \Delta S_{\text{surroundings}} \;\ge\; 0.

This is the second law in the entropy language, and it is the cleanest form — one line, one inequality, everything else is consequence.

The arrow of time

Newton's equations, Maxwell's equations, and Schrödinger's equation are all time-reversal symmetric: if you take a solution, reverse the sign of every velocity (and of every magnetic field), and take the complex conjugate of every wave-function, the reversed evolution is also a valid solution. So why does the world we see have an arrow of time? Why do we remember the past and not the future? Why can eggs break but not un-break, and why can perfume diffuse across a room but never un-diffuse?

The answer is statistical, and it is the content of the second law. A microstate in which the perfume molecules are all clustered in one corner is a valid microstate; time-reversed, the microstate in which uniformly distributed perfume molecules all rush back to one corner is also valid. But the first is one of vanishingly few microstates; the second is one of astronomically many. An isolated system, evolving under time-symmetric laws, wanders randomly through the space of microstates accessible to it, and overwhelmingly often finds itself in a high-entropy (many-microstate) macrostate, not a low-entropy (few-microstate) one.

The second law is not a new physical force. It is a statistical statement about counting. And the arrow of time — the feeling that "before" and "after" are different — is the macroscopic face of that counting.

The statistical picture — entropy as k_B \ln W

Suppose a system can be in any one of W distinct microstates, each equally likely. Boltzmann (1877) proved (with a calculation outlined in the going-deeper section) that the thermodynamic entropy of this system is

\boxed{\;S \;=\; k_B \ln W,\;}

where k_B = 1.38 \times 10^{-23} J/K is Boltzmann's constant and \ln is the natural logarithm. This formula is engraved on Boltzmann's tombstone in Vienna; it is one of the deepest equations in physics.

Two features make this formula powerful.

Additivity. If system 1 has W_1 microstates and system 2 has W_2, the combined system (treated as independent) has W_1 W_2 microstates. Entropies add:

S_{1+2} \;=\; k_B \ln(W_1 W_2) \;=\; k_B \ln W_1 \;+\; k_B \ln W_2 \;=\; S_1 \;+\; S_2.

Why: logarithms turn multiplication into addition, which is exactly what the additivity of an extensive quantity requires. If entropy were defined as W itself, it would multiply instead of adding; the \ln is what makes it an extensive thermodynamic quantity.

Low entropy = few microstates = improbable macrostate. A macrostate with W = 1 (only one microstate realises it) has S = 0 — an ordered crystal at zero temperature. A macrostate with W = 10^{25} (astronomically many microstates) has a large S — an ideal gas at room temperature. Processes that move a system from low-W to high-W macrostates are overwhelmingly probable because there are simply more high-W microstates to wander into. Processes in the reverse direction are not forbidden — they are merely, in the language of exponents, astronomically improbable.

A numerical feel for "astronomically improbable": a mole of gas has roughly 6 \times 10^{23} molecules. The number of microstates consistent with the gas being uniformly distributed in a 1 L container is of order W_{\text{uniform}} \sim e^{10^{23}}. The number of microstates in which every molecule has drifted into the left half of the container is W_{\text{left}} = W_{\text{uniform}}/2^{N} where N = 6 \times 10^{23}. So the ratio is

\frac{W_{\text{left}}}{W_{\text{uniform}}} \;=\; 2^{-6 \times 10^{23}} \;\approx\; 10^{-1.8 \times 10^{23}}.

This is so small that even if a Hubble volume's worth of gas were observed for the age of the universe, the spontaneous migration of all molecules to one half would never be seen. The second law is not a prohibition; it is a probability so biased that it behaves as one.
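The exponent can be computed directly; working in base-10 logarithms avoids underflowing any floating-point type:

```python
import math

N = 6e23                          # molecules in roughly one mole
# Probability that every molecule is in the left half: 2^(-N).
# 2^(-N) itself underflows, so compute its base-10 exponent instead.
log10_prob = -N * math.log10(2)

print(log10_prob)   # about -1.8e23
```

A probability of 10^(-1.8×10^23) is what "astronomically improbable" means in this context.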

Worked examples

Example 1: Entropy change when ice melts in lemonade

You drop 50 g of ice at 0 °C into 250 g of lemonade at 27 °C in a thermally insulated glass. (Treat the lemonade's specific heat as that of water, c = 4186 J/(kg·K), and the latent heat of fusion of ice as L_f = 3.34 \times 10^{5} J/kg.) After a few minutes, everything is at a single final temperature T_f. Compute (a) the final temperature, (b) the entropy change of the ice, (c) the entropy change of the lemonade, and (d) the entropy change of the universe.

[Figure: an insulated glass, before and after. Before: a 50 g ice cube at 0 °C floats in 250 g of lemonade at 27 °C. After equilibrium: no ice, 300 g of lemonade at a uniform final temperature T_f. An arrow labelled "heat" points from the lemonade into the ice.]
Ice melts inside a thermally insulated glass of lemonade. Heat flows from the (warmer) lemonade into the (colder) ice, first melting the ice at 273 K and then warming the resulting water to the final temperature $T_f$.

Step 1. Find the final temperature.

Since the glass is insulated, energy is conserved inside it: heat lost by lemonade = heat gained by ice (in melting plus warming the meltwater from 0 °C to T_f).

Let m_i = 0.050 kg (ice), m_l = 0.250 kg (lemonade), T_i = 273 K (ice), T_l = 300 K (lemonade).

m_l c\,(T_l - T_f) \;=\; m_i L_f \;+\; m_i c\,(T_f - T_i).
(0.250)(4186)(300 - T_f) \;=\; (0.050)(3.34 \times 10^5) \;+\; (0.050)(4186)(T_f - 273).
1046.5 (300 - T_f) \;=\; 16700 \;+\; 209.3 (T_f - 273).
313950 - 1046.5\, T_f \;=\; 16700 + 209.3\, T_f - 57140,
313950 - 16700 + 57140 \;=\; 1046.5\, T_f + 209.3\, T_f,
354390 \;=\; 1255.8 \, T_f,
T_f \;=\; 282.2 \text{ K} \;\approx\; 9.2 \text{ °C}.

Why: straight calorimetry — the lemonade cools, giving up Q = m_l c (T_l - T_f), and that heat melts the ice (m_i L_f) and then warms the resulting water to T_f.
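Step 1 can be checked by solving the energy balance directly, using the same numbers as above:

```python
# Calorimetry for Example 1: heat lost by the lemonade = heat to melt
# the ice + heat to warm the meltwater. The balance is linear in T_f.
c   = 4186.0                      # J/(kg K), specific heat of water
L_f = 3.34e5                      # J/kg, latent heat of fusion of ice
m_i, m_l = 0.050, 0.250           # kg: ice, lemonade
T_i, T_l = 273.0, 300.0           # K: initial temperatures

# m_l c (T_l - T_f) = m_i L_f + m_i c (T_f - T_i), solved for T_f:
T_f = (m_l * c * T_l + m_i * c * T_i - m_i * L_f) / ((m_l + m_i) * c)

print(round(T_f, 1))   # 282.2 K
```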

Step 2. Entropy change of the ice.

For the purpose of computing \Delta S, replace the ice's actual history with two reversible sub-processes between the same endpoints (remember, any reversible path connecting the same two states will do):

  • (i) Melt at constant T = 273 K.
  • (ii) Warm from 273 K to 282.2 K.

For (i): \Delta S_i^{(1)} = \dfrac{Q_{\text{melt}}}{T_{\text{melt}}} = \dfrac{m_i L_f}{273} = \dfrac{(0.050)(3.34 \times 10^5)}{273} = \dfrac{16700}{273} = 61.2 \text{ J/K}.

For (ii): \Delta S_i^{(2)} = \int_{273}^{282.2} \dfrac{m_i c\, dT}{T} = m_i c \ln\!\dfrac{282.2}{273} = (0.050)(4186)\,\ln(1.0337) = 209.3 \times 0.03316 = 6.94 \text{ J/K}.

Total: \Delta S_{\text{ice}} = 61.2 + 6.94 = 68.1 \text{ J/K} (positive — the ice melted and warmed, both of which absorbed heat).

Why: for a phase change at constant T, \Delta S = Q/T exactly. For a temperature change at constant mass, dQ_{\text{rev}} = mc\,dT and dS = mc\,dT/T integrates to mc \ln(T_f/T_i).

Step 3. Entropy change of the lemonade.

The lemonade cools reversibly (for the entropy calculation) from 300 K to 282.2 K:

\Delta S_{\text{lem}} \;=\; m_l c \ln\!\frac{T_f}{T_l} \;=\; (0.250)(4186) \ln\!\frac{282.2}{300} \;=\; 1046.5 \times \ln(0.9407) \;=\; 1046.5 \times (-0.06114) \;=\; -64.0 \text{ J/K}.

Why: the lemonade lost heat, so \Delta S_{\text{lem}} is negative. Its magnitude is slightly less than the ice's gain — that difference is the entropy generated by the irreversible heat flow across the finite temperature gap.

Step 4. Entropy change of the universe.

The glass is insulated, so the universe is just "system = ice + lemonade":

\Delta S_{\text{univ}} \;=\; \Delta S_{\text{ice}} \;+\; \Delta S_{\text{lem}} \;=\; 68.1 \;+\; (-64.0) \;=\; +4.1 \text{ J/K}.

Result: T_f = 282.2 K, \Delta S_{\text{ice}} = +68.1 J/K, \Delta S_{\text{lem}} = -64.0 J/K, \Delta S_{\text{universe}} = +4.1 J/K.

What this shows: Energy is conserved inside the insulated glass (first law), but entropy increases (second law). The positive \Delta S_{\text{universe}} is a quantitative measure of the irreversibility: heat flowing from the warm lemonade to the cold ice across a finite temperature gap is a one-way process, and the 4.1 J/K is how one-way. A perfectly reversible version of this process — for instance, using a Carnot engine to extract work from the temperature difference while equalising the temperatures — would leave \Delta S_{\text{universe}} = 0 exactly, but of course nobody does this with their lemonade.
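The entropy bookkeeping of Steps 2 to 4 can be verified in a few lines, reusing the numbers above:

```python
import math

# Entropy changes for Example 1, computed along reversible substitute paths.
c, L_f = 4186.0, 3.34e5           # J/(kg K) and J/kg
m_i, m_l = 0.050, 0.250           # kg: ice, lemonade
T_i, T_l, T_f = 273.0, 300.0, 282.2   # K (T_f from Step 1)

dS_melt = m_i * L_f / T_i                 # phase change at constant T
dS_warm = m_i * c * math.log(T_f / T_i)   # warming the meltwater
dS_ice  = dS_melt + dS_warm
dS_lem  = m_l * c * math.log(T_f / T_l)   # lemonade cooling (negative)
dS_univ = dS_ice + dS_lem

print(round(dS_ice, 1), round(dS_lem, 1), round(dS_univ, 1))   # 68.1 -64.0 4.1
```

The positive total is the entropy generated by heat crossing the finite temperature gap.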

Example 2: Free expansion of an ideal gas

A sealed thermally insulated container is divided into two halves by a partition. One half (volume V) contains n moles of ideal gas at temperature T; the other half (volume V) is evacuated. The partition is suddenly removed, and the gas expands to fill the full volume 2V. Compute \Delta S for the gas, the surroundings, and the universe.

[Figure: two panels. Before: an insulated box divided by a partition, with n moles of gas at temperature T in volume V on the left and vacuum (volume V) on the right. After: the partition removed, the gas evenly filling the full volume 2V at the same temperature T.]
Joule's free-expansion experiment. An ideal gas doubles its volume into an empty half-container. Because no external work is done and no heat is exchanged (insulated box), the gas temperature stays fixed — but its entropy increases sharply.

Step 1. Identify the state change.

Before: volume V, temperature T, n moles. After: volume 2V, temperature T (ideal gas, U depends only on T, and U cannot have changed because the gas did no work and absorbed no heat — Q = 0, W = 0, \Delta U = 0).

Why: an ideal gas rushing into vacuum does no work because it pushes against nothing. With no heat exchange (insulated) and no work, \Delta U = 0; for an ideal gas, that means \Delta T = 0.

Step 2. Construct a reversible path between the same endpoints.

The actual process is violently irreversible: the gas races down a pressure gradient into vacuum. But for computing \Delta S, replace it with an isothermal reversible expansion from V to 2V at temperature T (in contact with a thermostat). This is a valid substitute path because S depends only on endpoints.

Along the reversible isothermal expansion, \Delta U = 0, so Q_{\text{rev}} = W_{\text{rev}} = \int_V^{2V} (nRT/V')\,dV' = nRT \ln 2.

Step 3. Apply \Delta S = \int dQ_{\text{rev}}/T.

Since T is constant during the reversible substitute process,

\Delta S_{\text{gas}} \;=\; \frac{Q_{\text{rev}}}{T} \;=\; \frac{nRT \ln 2}{T} \;=\; nR \ln 2.

Why: the gas now has access to twice the volume per molecule, and logarithmically, the number of microstates scales as V^N, so \ln W scales as N \ln V and the change per mole is R \ln 2. The factor of \ln 2 is the statistical signature of "each molecule now has 2× the space".

Step 4. Entropy of surroundings.

The container is insulated. No heat was exchanged with the surroundings during the actual process. The surroundings' entropy is unchanged:

\Delta S_{\text{surr}} \;=\; 0.

Why: for the actual process (not the substitute reversible one), Q_{\text{surr}} = 0. The substitute path involved heat exchange with an imaginary thermostat, but that was only used to compute \Delta S_{\text{gas}} — it is not a real part of the actual process.

Step 5. Entropy of universe.

\Delta S_{\text{univ}} \;=\; \Delta S_{\text{gas}} \;+\; \Delta S_{\text{surr}} \;=\; nR \ln 2 \;+\; 0 \;=\; nR \ln 2 \;>\; 0.

For 1 mole (n = 1, R = 8.314 J/(mol·K)): \Delta S_{\text{univ}} = 8.314 \times 0.693 = 5.76 J/K.

Result: \Delta S_{\text{gas}} = nR\ln 2, \Delta S_{\text{surr}} = 0, \Delta S_{\text{univ}} = nR\ln 2 > 0. For one mole of gas at any temperature, 5.76 J/K is created from nothing.

What this shows: The free expansion is irreversible: it produced positive \Delta S_{\text{universe}} with no heat exchange at all. The entropy was generated internally when the gas raced into vacuum. To put the gas back into half the container, you would have to compress it isothermally, which costs at least W = nRT \ln 2 of work, all of it dumped as heat. That unrecoverable work is T times exactly this 5.76 J/K per mole of "lost availability", a connection formalised in the concept of free energy (a topic for a later article).
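The substitute-path arithmetic of Steps 2 and 3 can be sketched numerically; n = 1 mole, and the temperature is an arbitrary choice because it cancels:

```python
import math

# Free expansion of one mole: dS via the reversible isothermal
# substitute path. T drops out of Q_rev / T.
n, R = 1.0, 8.314                 # mol, J/(mol K)
T = 300.0                         # K, arbitrary

Q_rev  = n * R * T * math.log(2)  # heat absorbed on the substitute path
dS_gas = Q_rev / T                # = n R ln 2, independent of T

print(round(dS_gas, 2))   # 5.76 J/K
```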

Explore entropy change with a slider

The two examples above are snapshots. Drag the slider below to see how \Delta S = nR \ln(V_f/V_i) depends on the volume ratio during an isothermal expansion of one mole of ideal gas. The natural logarithm means every doubling of the volume adds the same R \ln 2; per litre of added volume, each successive litre buys less entropy than the last, so the curve flattens as the ratio grows.

[Interactive: a plot of ΔS = R ln(V_f/V_i), in J/K, for one mole of ideal gas. Dragging the volume ratio V_f/V_i from 1 to 10 raises ΔS logarithmically, reaching about 19.1 J/K at a ratio of 10.]
For one mole of ideal gas expanding isothermally from $V_i$ to $V_f$, $\Delta S = R \ln(V_f/V_i)$. The curve is logarithmic: doubling the volume adds $R \ln 2 \approx 5.76$ J/K, and doubling again (4× the original volume) adds exactly the same increment. The entropy gained per unit of added volume shrinks as the volume grows, a statistical consequence of logarithms.
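The slider's values can be reproduced in a few lines for one mole:

```python
import math

R = 8.314  # J/(mol K)

# dS = R ln(Vf/Vi) at the ratios marked on the slider axis.
dS = {ratio: R * math.log(ratio) for ratio in (2, 4, 8, 10)}
for ratio, value in dS.items():
    print(ratio, round(value, 2))

# Each doubling adds the same increment: dS[4]-dS[2] == dS[8]-dS[4] == R ln 2.
```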

Why entropy "looks like disorder"

The phrase "entropy is a measure of disorder" appears in most Indian class-12 textbooks. It is a useful shorthand, but it can mislead. A precise version of the statement is: entropy measures the number of microstates consistent with a given macrostate. Macrostates that look ordered (every molecule in one corner; every spin aligned; every sugar molecule in the crystal) tend to have few microstates. Macrostates that look disordered (molecules spread out; spins random; sugar dissolved) tend to have many. The connection is statistical, not aesthetic.

Consider three concrete cases:

Sugar dissolving in chai. Crystalline sugar has a rigid, ordered lattice — each sugar molecule sits at a fixed site in the crystal. Its microstate count is small; its entropy is low. When the sugar dissolves, each molecule can now be anywhere in the cup, subject only to the constraints of being surrounded by water molecules. The microstate count explodes by a factor equivalent to V_{\text{cup}}/V_{\text{crystal}} per sugar molecule — roughly 10^4 or so. With 10^{22} sugar molecules in a spoonful, the total microstate count goes up by an absurd factor, and the entropy increase dominates everything. The reverse process — sugar molecules spontaneously finding each other and recrystallising — is not impossible, just impossibly improbable.

Ice melting in nimbu pani. Ice is an ordered crystal; liquid water is a tumbling, hydrogen-bond-shifting fluid. A liquid water molecule has many more rotational and translational microstates available than a crystal water molecule. Melting increases entropy; freezing decreases it. The second law does not forbid freezing — freezers exist! — but it insists that somewhere else in the universe, the entropy increases by more than the freezer's ice gained in order.

A broken glass. Before the fall: the atoms of the glass form one particular arrangement (or a small family of them; the glass did not care exactly how its atoms were placed, as long as they held a drinking shape). After the fall: shards scattered over the tile, each with a specific position and orientation, and essentially all such arrangements are macroscopically indistinguishable from "broken glass". The microstate count went up by an enormous factor. The reverse, a scatter of shards leaping back to form a whole glass, is not forbidden by any microscopic equation, just crushingly improbable.

Going deeper

Stop here if the point was to compute \Delta S and use the second law. What follows is the derivation of Boltzmann's S = k_B \ln W formula, the connection to information theory (Shannon's entropy), and a sharper form of the second law for open systems.

Derivation of S = k_B \ln W

Start from two systems, A and B, which can exchange energy. Let W_A(U_A) and W_B(U_B) be the number of microstates of each system as a function of its internal energy. The total system has total energy U = U_A + U_B fixed (isolated), and the number of microstates of the combined system when A has energy U_A is

W_{\text{tot}}(U_A) \;=\; W_A(U_A) \cdot W_B(U - U_A).

The combined system is overwhelmingly likely to be in the macrostate (value of U_A) that maximises W_{\text{tot}}. Take the derivative with respect to U_A and set it to zero:

\frac{d W_{\text{tot}}}{d U_A} \;=\; \frac{d W_A}{d U_A} W_B \;-\; W_A \frac{d W_B}{d U_B} \;=\; 0.

Divide by W_A W_B:

\frac{1}{W_A}\frac{d W_A}{d U_A} \;=\; \frac{1}{W_B}\frac{d W_B}{d U_B},

which is

\frac{d \ln W_A}{d U_A} \;=\; \frac{d \ln W_B}{d U_B}.

Why: thermal equilibrium is characterised by the equality of some quantity between A and B; the maximisation shows that quantity is d\ln W/dU, whatever physical meaning it turns out to have.

Now invoke the thermodynamic equilibrium condition: two systems in thermal contact reach equilibrium when their temperatures are equal. And from the thermodynamic definition of temperature (derived in the article on temperature and thermometers):

\frac{1}{T} \;=\; \frac{\partial S}{\partial U}.

Comparing with the \ln W condition:

\frac{\partial S}{\partial U} \;=\; k \cdot \frac{\partial \ln W}{\partial U},

where k is some constant. Integrating,

S \;=\; k \ln W \;+\; \text{const}.

Fix the constant by the third law (at T = 0, a perfect crystal has W = 1 and S = 0): the additive constant is zero. Fix the proportionality k by matching the known thermodynamic entropy of an ideal gas to the microstate count of an ideal gas (a separate calculation, done in any graduate statistical mechanics text): k = k_B, Boltzmann's constant. Result:

S \;=\; k_B \ln W. \qquad \blacksquare

Shannon entropy and the information connection

In 1948, Claude Shannon defined the information entropy of a probability distribution \{p_i\} as

H \;=\; -\sum_i p_i \ln p_i.

For a uniform distribution over W outcomes, p_i = 1/W, and H = \ln W. Multiplying by k_B recovers the thermodynamic entropy. The two entropies — thermodynamic and informational — are the same object up to units: both measure "how many distinguishable configurations does this system have access to, on average".
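The uniform-distribution case is easy to check numerically (W = 1024 is an arbitrary choice):

```python
import math

# Shannon entropy H = -sum p ln p of a uniform distribution over W
# outcomes equals ln W -- the bridge to S = k_B ln W.
W = 1024
p = [1.0 / W] * W
H = -sum(pi * math.log(pi) for pi in p)

print(H, math.log(W))   # both ~ 6.931
```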

This connection is not a metaphor. Erasing one bit of information requires dumping at least k_B T \ln 2 of energy as heat into the environment (Landauer's principle, 1961), which is a statement about the thermodynamic cost of reducing the information entropy of a physical system. Every bit your smartphone erases in Bangalore warms the room, very slightly, as a tax the second law imposes on the act of forgetting.
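Plugging in numbers gives a feel for the size of the Landauer tax; T = 300 K is an assumed room temperature for illustration:

```python
import math

# Landauer bound: minimum heat dumped per erased bit, k_B T ln 2.
k_B = 1.380649e-23   # J/K, Boltzmann's constant
T = 300.0            # K, roughly room temperature

E_bit = k_B * T * math.log(2)
print(E_bit)         # ~ 2.87e-21 J per bit
```

About 3 zeptojoules per bit: tiny per erasure, but a hard floor no computer can beat.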

The second law for open systems — entropy flux and production

For a system that exchanges heat with its surroundings at temperature T, write the second law as

\frac{dS}{dt} \;=\; \frac{1}{T} \frac{dQ}{dt} \;+\; \sigma,

where \sigma \ge 0 is the entropy production rate (due to irreversibilities inside the system). The first term is the entropy flux — how fast entropy enters across the boundary. The total dS/dt can be positive, negative, or zero, depending on the balance of flux and production. A living cell has \sigma > 0 (irreversible chemical reactions) but dumps so much dQ/dt to the cooler surroundings that its total dS/dt is often slightly negative — it keeps itself in a low-entropy state by exporting more entropy than it produces. The whole of biology runs on this accounting.

The Gibbs entropy — for general probability distributions

For a system whose microstates are not equally probable, with probability p_i of being in microstate i, the Boltzmann formula generalises to the Gibbs entropy:

S \;=\; -k_B \sum_i p_i \ln p_i.

When all p_i are equal (the microcanonical ensemble, each accessible microstate equally likely), this reduces to k_B \ln W. When the probabilities follow the Boltzmann distribution p_i \propto e^{-E_i/k_B T} (the canonical ensemble, system in contact with a thermostat), the sum can be done and one recovers all the thermodynamic relations of a system at fixed T. This is the starting point of equilibrium statistical mechanics proper.
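A minimal sketch of both limits: the Gibbs formula applied to a uniform distribution recovers k_B ln W, and applied to a two-level canonical distribution it gives a value strictly between 0 and k_B ln 2 (the energy gap E = 10^{-21} J is an arbitrary illustrative value):

```python
import math

k_B = 1.380649e-23  # J/K

def gibbs_entropy(p):
    """S = -k_B sum p_i ln p_i (terms with p_i = 0 contribute nothing)."""
    return -k_B * sum(pi * math.log(pi) for pi in p if pi > 0)

# Microcanonical case: uniform probabilities recover k_B ln W.
W = 8
S_uniform = gibbs_entropy([1.0 / W] * W)

# Canonical case: two levels at energies 0 and E, p_i ~ exp(-E_i / k_B T).
E, T = 1e-21, 300.0
weights = [1.0, math.exp(-E / (k_B * T))]
Z = sum(weights)                       # partition function
S_canonical = gibbs_entropy([w / Z for w in weights])

print(S_uniform / k_B, math.log(W))    # both ln 8 ~ 2.079
```

Unequal probabilities always lower the entropy below the uniform value, which is why k_B ln W is the maximum over distributions on W microstates.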

Why broken glasses never reassemble — a time-reversal argument

Take the microscopic trajectory of a glass being dropped. Every atom's position and velocity evolves under Newton's laws as the glass shatters into, say, 10^3 shards. Now imagine the time-reversed trajectory: every atom's velocity sign is flipped. By time-reversal symmetry of Newton's equations, this reversed trajectory is also a valid solution — it describes the shards spontaneously jumping up, fusing together, and reassembling into the glass. Why doesn't this happen?

Because the reversed initial condition — specifying the exact velocity of every atom in every shard with perfect precision — is a single microstate in a sea of countless similar-looking microstates. The shards plus surrounding air molecules have \sim e^{10^{25}} microstates all of which look like "a mess of shards on the floor". Only one of those has the time-reversed velocities that would lead to reassembly. A random initial condition drawn from the mess-of-shards macrostate overwhelmingly leads to the shards staying scattered, because the vast majority of microstates evolve to other mess-of-shards microstates. The assembly trajectory is one specific needle in the cosmic-scale haystack of disassembly trajectories.

This is the microscopic content of the arrow of time. It is not that Nature has a time-asymmetric law. It is that a random starting point in a high-entropy macrostate almost never leads, in the time-reversed direction, to a low-entropy one — because there is only so much "low entropy" to return to, and it is statistically invisible.

Where this leads next