Intensity and Loudness of Sound

In short

The intensity of a sound wave is the power it carries per unit area, I = P/A, measured in \text{W/m}^2. For a point source radiating uniformly into open space, the intensity at distance r follows the inverse-square law

I(r) = \frac{P}{4\pi r^2}.

Intensity is proportional to the square of the pressure amplitude: I = p_0^2/(2\rho v). Because human hearing spans twelve orders of magnitude in intensity — from 10^{-12}\ \text{W/m}^2 at the threshold of audibility to \sim 1\ \text{W/m}^2 at the threshold of pain — sounds are almost always described on the logarithmic decibel scale

\beta = 10\,\log_{10}\!\left(\frac{I}{I_0}\right)\quad\text{dB},\qquad I_0 = 10^{-12}\ \text{W/m}^2.

A factor of 10 in intensity is 10 dB. A factor of 100 is 20 dB. Doubling the intensity is 3 dB. The ear perceives "loudness" approximately logarithmically, which is why the decibel scale matches subjective experience so well.

On a Diwali night in Delhi, a laxmi bomb detonates in the alley outside your window. The sound pressure at your ear is of order 200 pascals: only about two-thousandths of atmospheric pressure, but enormous as sound goes. Ten minutes later your mother speaks to you from across the room; her voice delivers to your ear a pressure of about 0.02 pascals. The firecracker's pressure was roughly ten thousand times larger than the conversation's. Its intensity (the energy delivered to your ear per second per square metre) was larger by a factor of a hundred million.

Yet the firecracker did not sound a hundred million times louder. If you had to guess, you might say ten times. Or fifty. Some very large number, but not eight orders of magnitude. Your ear compresses the dynamic range of the real world into a manageable subjective scale, and the compression is almost exactly logarithmic. The decibel scale is the physicist's way of writing intensities in the same logarithmic coordinates the ear uses naturally — turning an eight-order-of-magnitude intensity ratio into a civilised 80 decibels.

This article derives the intensity formula from the energy density of a sound wave, proves the inverse-square law for a point source, builds the decibel scale from scratch, and connects the formulae to real Indian sounds: the 20 dB hush of a pre-dawn temple, the 85 dB of Mumbai traffic on Marine Drive, the 100 dB of a Delhi Metro announcement, the 140 dB of a Diwali laxmi bomb at close range. By the end you will see why "loudness" is not just the intensity of the sound: it tracks the logarithm of the intensity, for reasons rooted in the mechanical response of the inner ear.

What intensity means

A sound wave carries energy. When the wave reaches your eardrum, the eardrum vibrates; work has been done on it. When the wave reaches a microphone, it drives a diaphragm; electrical energy emerges. The rate at which energy is delivered, per unit of area through which the wave passes, is the intensity of the wave:

I = \frac{\text{energy delivered per second}}{\text{area receiving the energy}} = \frac{P}{A}

where P is the power carried by the wave (joules per second, watts) and A is the area of the receiving surface (square metres). The units of I are therefore \text{W/m}^2.

Assumptions: The area A is oriented perpendicular to the direction of wave propagation. The wave amplitude is small enough that the medium behaves linearly. Any energy reflected or absorbed by the receiver is accounted for; P is the incident power.

This is the operational definition — it tells you how to measure intensity. But you would like a formula that expresses I in terms of the wave's own parameters (amplitude, frequency, medium properties). That comes next.

Intensity in terms of the wave amplitude

Consider a plane sound wave in a gas of density \rho travelling at speed v. Let the pressure variation at a fixed point be p(t) = p_0\sin(\omega t) — a sinusoidal oscillation of amplitude p_0 about the ambient pressure.

Each small volume element of gas oscillates back and forth. Its energy alternates between kinetic (when the element is moving fast through its equilibrium position) and potential (when the element is squeezed or expanded at the extremes). The total energy density u (energy per unit volume) averaged over a full cycle is

\langle u \rangle = \frac{p_0^2}{2\rho v^2}

Why: the derivation, deferred to the going-deeper section at the end of this article, shows that the kinetic energy density averages to \tfrac{1}{4}\rho u_0^2 where u_0 is the particle velocity amplitude, and the potential energy density averages to the same value; these are equal because a simple harmonic oscillator divides its energy equally between kinetic and potential on average. Converting u_0 to p_0 using the relation p_0 = \rho v u_0 for a plane wave gives the result above.

A plane wave travels a distance v in one second. The energy flowing per second through unit area perpendicular to the propagation direction is therefore the energy density times the speed:

I = \langle u \rangle \cdot v = \frac{p_0^2}{2\rho v^2} \cdot v = \frac{p_0^2}{2\rho v}

\boxed{I = \frac{p_0^2}{2\rho v}}

Why: this is the physically meaningful form. Intensity goes as the square of the pressure amplitude, scaled by the medium's impedance \rho v. Doubling the pressure amplitude quadruples the intensity. Halving the pressure amplitude quarters it. Because I \propto p_0^2, a factor of two in amplitude corresponds to a factor of four in intensity, not a factor of two.

Numerical sanity check: for air at sea level, \rho = 1.2\ \text{kg/m}^3 and v = 343\ \text{m/s}, so \rho v = 412\ \text{kg/(m}^2\text{s)}. A conversational speech level has p_0 \approx 0.02 Pa, giving

I_{\text{speech}} = \frac{(0.02)^2}{2 \times 412} = \frac{4 \times 10^{-4}}{824} \approx 4.9 \times 10^{-7}\ \text{W/m}^2

About half a microwatt per square metre, consistent with everyday experience of quiet conversation.
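The sanity check above can be reproduced numerically. The following is a minimal sketch; the function name `intensity_from_pressure` and its default arguments are illustrative choices, with the air values taken from the text.

```python
def intensity_from_pressure(p0, rho=1.2, v=343.0):
    """Plane-wave intensity I = p0^2 / (2 rho v), in W/m^2.

    p0  : pressure amplitude in Pa
    rho : density of the medium in kg/m^3 (default: air at sea level)
    v   : speed of sound in m/s (default: the air value used in the text)
    """
    return p0 ** 2 / (2.0 * rho * v)


speech = intensity_from_pressure(0.02)
print(f"speech: {speech:.2e} W/m^2")   # roughly 4.9e-07

# Doubling the pressure amplitude should quadruple the intensity.
print("ratio for doubled amplitude:", intensity_from_pressure(0.04) / speech)
```

Passing different `rho` and `v` values gives the intensity in another medium for the same pressure amplitude.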

The inverse-square law for a point source

A point source radiates sound energy uniformly in all directions (or close enough, for most small loudspeakers, balloons bursting, or hand-held firecrackers). Let the source's total acoustic power output be P (watts).

At a distance r from the source, the energy per second is distributed over a sphere of area 4\pi r^2. By conservation of energy (no air absorption, no obstacles):

I(r) = \frac{P}{4\pi r^2}

Why: the surface area of a sphere at radius r is 4\pi r^2. A steady source emitting power P watts spreads its energy uniformly over this growing surface as the wave expands outward. So the intensity — watts per square metre — falls as 1/r^2. Doubling your distance from a point source quarters the intensity; halving your distance quadruples it.

This is the inverse-square law — the same law that governs the intensity of starlight, the brightness of a point light source, and the strength of an electrostatic field from a point charge. It is a geometric fact, not a material one: it comes from the geometry of spheres in flat three-dimensional space.

A point source emits a fixed total power $P$ in all directions. The energy spreads over expanding spherical shells; at radius $r$ the area is $4\pi r^2$, at radius $2r$ the area is $16\pi r^2$ — four times larger. Intensity (power per unit area) drops by the same factor.
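As a quick illustration of the inverse-square law, the sketch below evaluates I = P/(4\pi r^2) at a few distances. The function name `point_source_intensity` is an illustrative choice, and the source is assumed to be ideal (uniform radiation, no absorption, no reflections).

```python
import math


def point_source_intensity(power, r):
    """Intensity in W/m^2 at distance r (m) from an ideal point source of given power (W)."""
    if r <= 0:
        raise ValueError("distance must be positive")
    return power / (4.0 * math.pi * r ** 2)


# Each doubling of the distance should cut the intensity to a quarter.
for r in (1, 2, 4, 8):
    print(f"r = {r} m: I = {point_source_intensity(1.0, r):.4e} W/m^2")
```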

Caveats to the inverse-square law

In practice, the law is modified by three real-world effects.

Absorption by air. At high frequencies and long distances, air absorbs sound energy — the attenuation is roughly exponential with distance. A 1 kHz tone decays by about 1 dB per 100 metres in dry air; a 10 kHz tone decays by 20 dB per 100 metres. High-frequency sounds (like the hiss of a cymbal) die quickly at distance, while low-frequency sounds (a temple drum, a monsoon thunderclap) carry for kilometres.
Ground reflection. If the source is near the ground, the reflected wave interferes with the direct wave, and the intensity pattern is not a simple 1/r^2.
Directional sources. A loudspeaker is not omnidirectional — it radiates more sound forward than to the rear. The formula I = P/(4\pi r^2) is for a true point source; a real speaker has a directivity pattern that concentrates the sound into some fraction of 4\pi.

For physics problems you will usually assume the idealised case: a point source, no absorption, no reflections, uniform radiation. The result is accurate to a factor of two or three for most real situations outdoors, which is plenty for the intensity levels we care about.

The decibel scale

Human hearing spans an enormous range of intensities. The quietest sound a young adult with healthy ears can just barely detect (the threshold of hearing) corresponds to I_0 \approx 10^{-12}\ \text{W/m}^2. The loudest sound the ear can tolerate before pain sets in (the threshold of pain) is about 1\ \text{W/m}^2. The ratio is a trillion.

Expressing intensities directly in \text{W/m}^2 is cumbersome: you would be writing numbers like 10^{-10}, 10^{-8}, 10^{-6}, 10^{-4} all the time. The logarithmic decibel scale replaces this by a more manageable measure:

\boxed{\beta = 10\,\log_{10}\!\left(\frac{I}{I_0}\right) \text{ dB}}

where I_0 = 10^{-12}\ \text{W/m}^2 is the reference intensity (the threshold of hearing). The quantity \beta is the sound level in decibels (dB).

Why "deci"-bel? The bel is \log_{10}(I/I_0) directly — a unit named after Alexander Graham Bell, inventor of the telephone. One bel is a tenfold ratio. The bel proved too coarse for practical use, so the decibel (one-tenth of a bel) is the standard unit; hence the factor of 10 in the definition.

Reading the decibel formula

\beta = 10\,\log_{10}\!\left(\frac{I}{I_0}\right)

Evaluating at two reference points:

At I = I_0: \beta = 10\log_{10}(1) = 0 dB. The threshold of hearing is 0 dB.
At I = 1\ \text{W/m}^2 = 10^{12}\,I_0: \beta = 10\log_{10}(10^{12}) = 120 dB. The threshold of pain is 120 dB.

So the twelve-order-of-magnitude range of human hearing maps to 0–120 dB. A scale that ranges over twelve orders of magnitude physically, but over only a hundred-and-twenty on the dB scale, is exactly the compression your ear needs to describe what it hears.
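The two reference points can be checked with a few lines of code. This is a minimal sketch; the names `I0` and `sound_level_db` are illustrative, and the conversation intensity of 10^{-6} W/m^2 is simply the value that corresponds to 60 dB.

```python
import math

I0 = 1e-12  # reference intensity in W/m^2 (threshold of hearing)


def sound_level_db(intensity, reference=I0):
    """Sound level beta = 10 log10(I / I0) in dB, for an intensity in W/m^2."""
    if intensity <= 0:
        raise ValueError("intensity must be positive; silence has no finite dB level")
    return 10.0 * math.log10(intensity / reference)


for label, intensity in [("threshold of hearing", 1e-12),
                         ("conversation", 1e-6),
                         ("threshold of pain", 1.0)]:
    print(f"{label:22s} {sound_level_db(intensity):6.1f} dB")
```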

Rules of thumb for dB arithmetic

Because \log_{10} has a few values that come up constantly, you can do most decibel arithmetic in your head:

Intensity ratio	dB difference
\times 10	+10 dB
\times 100	+20 dB
\times 1000	+30 dB
\times 2	+3 dB (since \log_{10}(2) \approx 0.301)
\times 1/2	-3 dB
\times 1/10	-10 dB

Doubling the intensity adds 3 dB. This is the most useful rule to memorise. If two equally loud sources combine (incoherently, so their intensities just add), the combined level is 3 dB above either one alone. If one source is 10 dB louder than another, the combined level is essentially the louder source's level plus a negligible 0.4 dB.

A ladder of familiar Indian sounds

Source	Typical level (dB)
Threshold of hearing (young healthy ear)	0
Pre-dawn inside a temple in the Himalayas	20
Library	30–40
Normal conversation at 1 m	60
Autorickshaw on a Bangalore road	75–85
Mumbai traffic on Marine Drive, peak hour	80–90
Delhi Metro platform announcement	95–105
Pneumatic drill at 1 m	100
Temple conch (shankh) blown at close range	110
Threshold of pain	120–130
Diwali firework (aerial burst) at 30 m	130
Laxmi bomb at arm's length	140+ (painful, risk of hearing damage)

India's standards for permissible ambient noise (from the Central Pollution Control Board) set 55 dB as the daytime limit in residential areas and 45 dB at night. Most Indian cities exceed this routinely by 15–30 dB — Mumbai and Delhi often sit in the 80 dB range at street level, which is why urban noise pollution is now a recognised public-health issue.

Adding two sources — the decibel calculation

Suppose two sources individually produce 80 dB at your position. What is the combined level?

Step 1. Convert dB back to intensities (relative to I_0).

80 = 10\log_{10}(I/I_0) \implies I = 10^8\,I_0

Each source contributes I_1 = I_2 = 10^8\,I_0.

Step 2. Add the intensities (assuming incoherent sources — otherwise add amplitudes first).

I_{\text{total}} = I_1 + I_2 = 2 \times 10^8\,I_0

Step 3. Convert back to dB.

\beta_{\text{total}} = 10\log_{10}(2 \times 10^8) = 10\log_{10}(2) + 10\log_{10}(10^8) = 3 + 80 = 83\ \text{dB}

Why: two incoherent sources of equal intensity double the total intensity, adding almost exactly 3 dB (10\log_{10}2 \approx 3.01), not 80 + 80 = 160 dB. The dB scale is logarithmic: you do not add dB values directly; you add the underlying intensities, and then take 10\log_{10} of the sum.
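The three steps generalise to any number of incoherent sources. A minimal sketch follows; the function name `combine_levels` is illustrative and the sources are assumed to be incoherent, so that their intensities add.

```python
import math


def combine_levels(levels_db):
    """Combined level (dB) of incoherent sources, given their individual levels in dB."""
    if not levels_db:
        raise ValueError("need at least one source")
    # Step 1 and 2: convert each level to I / I0 and add the intensities.
    total_ratio = sum(10.0 ** (level / 10.0) for level in levels_db)
    # Step 3: convert the summed intensity back to dB.
    return 10.0 * math.log10(total_ratio)


print(round(combine_levels([80, 80]), 1))   # 83.0
```

Because the reference intensity cancels, the function works directly on dB values without needing I_0.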

Explore the dB scale

On a plot of $\log_{10}(I)$ against the sound level $\beta$ (dB), the straight line $\log_{10}(I) = \beta/10 - 12$ maps each dB level to the intensity in $\text{W/m}^2$. The intensity changes by a factor of ten for every $10$ dB step; this is what "logarithmic" means quantitatively.
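The same mapping can be tabulated without the plot. The sketch below inverts the decibel formula; the function name `intensity_from_level` is an illustrative choice.

```python
def intensity_from_level(level_db):
    """Intensity in W/m^2 for a sound level in dB, using log10(I) = beta/10 - 12."""
    return 10.0 ** (level_db / 10.0 - 12.0)


# Every 10 dB step multiplies the intensity by ten.
for level in range(0, 130, 10):
    print(f"{level:4d} dB -> {intensity_from_level(level):.0e} W/m^2")
```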

Intensity vs loudness — a distinction that matters

Intensity is a physical quantity measured in \text{W/m}^2. Given a microphone and an oscilloscope you can measure it with arbitrary precision. It is objective.

Loudness is a perception — it is what a listener reports about how loud a sound seems to them. It depends on the intensity, but also on the frequency of the sound, on the listener's age and hearing health, on the presence of other sounds, and on the duration of exposure. It is subjective.

For most purposes these track each other well: a louder-sounding sound is (almost always) a higher-intensity sound. But the relationship is not linear. The Weber-Fechner law, derived from a century of psychophysical measurements, says that the subjective sensation scales approximately as the logarithm of the physical stimulus:

\text{perceived loudness} \propto \log(I)

That proportionality is why the decibel scale, defined as 10\log_{10}(I/I_0), corresponds so well to subjective experience. A sound that measures 20 dB higher than another feels roughly four times as loud, not 10^{20/10} = 100 times as loud. This is why professional audio mixers, architectural acousticians, and noise-ordinance writers all think in decibels.

The human ear's frequency dependence

There is one complication: the ear's sensitivity depends on the frequency. Sounds at 3000–4000 Hz (roughly the frequency of a baby's cry or an alarm bell) are perceived as louder than equally intense sounds at 100 Hz or at 15000 Hz. The equal-loudness contours (Fletcher-Munson curves) quantify this: at low intensities, low-frequency sounds need more physical intensity to sound equally loud. At high intensities, this effect diminishes.

Modern noise-level meters incorporate a frequency weighting called A-weighting to match the ear's sensitivity at moderate loudness, and quote levels in "dBA" (A-weighted decibels). A street noise measurement of 80 dBA is a physical intensity profile at several frequencies, weighted and summed to reflect how the human ear would integrate those frequencies into a sensation of loudness. For a pure tone this weighting can change the dB reading by tens of dB depending on frequency (about -50 dB at 20 Hz); for broadband urban noise the correction is typically 0–5 dB.

Worked examples

Example 1: Safe distance from a Diwali firework

A Diwali aerial firework produces an acoustic power of 50 W at the moment of its burst. Treating it as a point source and ignoring atmospheric absorption, find: (a) the intensity 100 m away, (b) the sound level in dB at that distance, (c) the minimum safe distance if prolonged exposure requires the level to stay below 85 dB (the Indian occupational safe-exposure threshold).

A point-source firework of acoustic power $50$ W. Intensity falls as $1/r^2$; a given sound level corresponds to a specific radius from the burst.

Step 1. Intensity at 100 m.

I = \frac{P}{4\pi r^2} = \frac{50}{4\pi \times 100^2} = \frac{50}{125\,664} \approx 3.98 \times 10^{-4}\ \text{W/m}^2

Why: the surface area of a 100-metre sphere is 4\pi(100)^2 = 4\pi \times 10^4 \approx 1.26 \times 10^5\ \text{m}^2. Power 50 W spread over this area gives about 0.4 milliwatts per square metre.

Step 2. Convert to dB.

\beta = 10\log_{10}\!\left(\frac{3.98 \times 10^{-4}}{10^{-12}}\right) = 10\log_{10}(3.98 \times 10^{8}) = 10\,(0.60 + 8) = 86\ \text{dB}

Why: \log_{10}(3.98) \approx 0.60, and \log_{10}(10^8) = 8. Sum, times 10: about 86 dB. That is just above the safe-exposure threshold.

Step 3. Find the safe distance.

At the safe distance r_s, the sound level is exactly 85 dB, corresponding to I_s = 10^{85/10 - 12} = 10^{-3.5} = 3.16 \times 10^{-4}\ \text{W/m}^2.

Set P/(4\pi r_s^2) = I_s and solve for r_s:

r_s = \sqrt{\frac{P}{4\pi I_s}} = \sqrt{\frac{50}{4\pi \times 3.16 \times 10^{-4}}} = \sqrt{\frac{50}{3.97 \times 10^{-3}}} = \sqrt{1.26 \times 10^4} \approx 112\ \text{m}

Step 4. Reality check.

The answer is just slightly beyond 100 m, consistent with the fact that 86 dB at 100 m is only 1 dB above the safe threshold. Each 6 dB reduction requires doubling the distance (since \log_{10}(4) \approx 0.6 and 6 dB corresponds to a factor-of-4 intensity reduction, which corresponds to doubling the radius). A 1 dB reduction is one-sixth of that step, so moving from 86 dB to 85 dB requires about \sqrt[6]{2} - 1 \approx 12\% further distance, and indeed 112/100 = 1.12. The arithmetic is self-consistent.

Result: At 100 m the level is about 86 dB. A safe distance for continuous exposure is 112 m.

What this shows: The inverse-square law means modest additional distance buys substantial protection from point-source noise. This is exactly why Diwali firecracker regulations (in cities like Delhi and Chennai) recommend minimum open-area distances — and why the laxmi bomb type, which puts out much higher acoustic power, is dangerous even at 10 m.
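The numbers in this example can be reproduced with a short script. This is a sketch under the same idealisations as the worked solution (point source, no absorption); the function names `level_at` and `distance_for_level` are illustrative.

```python
import math

I0 = 1e-12  # reference intensity in W/m^2


def level_at(power, r):
    """Sound level in dB at distance r (m) from an ideal point source of given power (W)."""
    intensity = power / (4.0 * math.pi * r ** 2)
    return 10.0 * math.log10(intensity / I0)


def distance_for_level(power, level_db):
    """Distance (m) at which an ideal point source of given power produces level_db."""
    intensity = I0 * 10.0 ** (level_db / 10.0)
    return math.sqrt(power / (4.0 * math.pi * intensity))


print(f"level at 100 m: {level_at(50.0, 100.0):.1f} dB")
print(f"distance for 85 dB: {distance_for_level(50.0, 85.0):.0f} m")
```

The two functions are inverses of each other, so feeding the computed distance back into `level_at` should return 85 dB.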

Example 2: Two autorickshaws

Two autorickshaws idle side by side in Bangalore traffic, each producing 85 dB of noise at your position. The signal turns green and one of them revs up, jumping to 91 dB (while the other stays at 85 dB). What is the combined sound level (a) before, and (b) after the rev-up?

Two equal $85$ dB sources combine to $88$ dB — a $3$ dB increase. When one source dominates by $6$ dB, the combination is only about $1$ dB above the dominant one.

Step 1. Express the intensities as multiples of I_0.

At 85 dB: I_1 = 10^{85/10}\,I_0 = 10^{8.5}\,I_0 \approx 3.16 \times 10^8\,I_0.

At 91 dB: I_1' = 10^{91/10}\,I_0 = 10^{9.1}\,I_0 \approx 1.26 \times 10^9\,I_0.

Step 2. Combine the intensities (incoherent sources — the phase relation of two separate engines is random, so their intensities add, not their amplitudes).

Before: I_{\text{total}} = 2 \times 10^{8.5}\,I_0 = 6.32 \times 10^8\,I_0.

After: I_{\text{total}} = 10^{9.1}\,I_0 + 10^{8.5}\,I_0 = (1.26 + 0.316) \times 10^9\,I_0 = 1.58 \times 10^9\,I_0.

Step 3. Convert back to dB.

Before:

\beta = 10\log_{10}(6.32 \times 10^8) = 10\,(0.80 + 8) = 88\ \text{dB}

Why: \log_{10}(6.32) \approx 0.80, so \log_{10}(6.32 \times 10^8) \approx 8.80. Times 10 gives 88 dB — exactly 3 dB above either one alone, as the doubling rule predicts.

After:

\beta = 10\log_{10}(1.58 \times 10^9) = 10\,(0.20 + 9) = 92\ \text{dB}

Step 4. Interpret.

Before the rev-up: two equal sources give 85 + 3 = 88 dB.

After: the 91 dB source dominates (it puts out four times the intensity of the 85 dB one). Adding 25\% more intensity to the louder one raises the dB level by 10\log_{10}(1.25) \approx 1 dB.

Result: 88 dB before, 92 dB after.

What this shows: Decibels do not add directly. When one source is 6 dB louder than another, adding the quieter source adds only about 1 dB to the total. This is why one loud machine on a shop floor can overwhelm many quieter ones — the intensity adds, but the louder source dominates.
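How little a quieter source contributes can be tabulated as a function of the level gap. The sketch below uses the identity 10\log_{10}(1 + 10^{-\Delta/10}) for the increase above the louder source, which follows from adding the two intensities; the function name `added_by_quieter` is illustrative, and incoherent sources are assumed.

```python
import math


def added_by_quieter(gap_db):
    """dB by which the total exceeds the louder source when a source gap_db quieter is added."""
    if gap_db < 0:
        raise ValueError("gap is measured downward from the louder source")
    return 10.0 * math.log10(1.0 + 10.0 ** (-gap_db / 10.0))


for gap in (0, 3, 6, 10, 20):
    print(f"gap {gap:2d} dB -> total is {added_by_quieter(gap):.2f} dB above the louder source")
```

For the autorickshaws, a gap of 0 dB gives the 3 dB rise to 88 dB, and a gap of 6 dB gives the roughly 1 dB rise from 91 dB to 92 dB.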

Example 3: Temple conch and a microphone threshold

A temple priest blows a shankh (conch shell) producing 110 dB at 1 metre. A microphone has a threshold sensitivity of 20 dB (below which it produces no usable signal). How far from the conch can the microphone be placed before the sound just becomes inaudible to it?

The $110$ dB conch at $1$ m falls by $20\log_{10}(r)$ dB at distance $r$. The microphone's $20$ dB threshold dictates the maximum usable distance.

Step 1. Relate the dB drop to the distance.

For a point source, I(r) \propto 1/r^2, so

10\log_{10}\!\frac{I(r)}{I(r_0)} = 10\log_{10}\!\frac{r_0^2}{r^2} = -20\log_{10}\!\frac{r}{r_0}

In other words: the sound level drops by 20\log_{10}(r/r_0) dB as you move from distance r_0 to distance r. Every doubling of the distance drops the level by 6 dB (since 20\log_{10}(2) \approx 6.02).

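Why: the inverse-square law in intensity becomes a straight line in dB when plotted against \log_{10} r, with a slope of 20 dB per decade of distance (not 10, because the distance enters squared, and the logarithm turns that exponent into a factor of 2). This is why the level drops fast close to a source and slowly at large distances: each doubling of distance costs the same 6 dB, whether from 1 m to 2 m or from 1 km to 2 km.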

Step 2. Set up the equation.

We want \beta(r) = 20 dB, starting from \beta(1\text{ m}) = 110 dB. The drop required is 110 - 20 = 90 dB. Therefore

20\log_{10}(r/1) = 90 \implies \log_{10}(r) = 4.5 \implies r = 10^{4.5} \approx 31\,623\ \text{m} \approx 31.6\ \text{km}

Step 3. Reality check.

That is very far — more than the distance across most Indian cities. In practice air absorption and atmospheric mixing will make the conch inaudible at much shorter ranges; the calculation above is the geometric limit, ignoring absorption. For realistic conditions, atmospheric losses at 1 kHz give you another 1 dB per 100 metres, or about 10 dB per kilometre. Over 30 km that adds 300 dB of extra loss — completely dominating the geometric falloff. So in reality the conch is inaudible beyond perhaps 3–5 km, where geometric loss is \sim 70 dB and atmospheric loss is another \sim 30–50 dB.

Result: Geometric answer: r \approx 31.6 km. Realistic answer (with atmospheric absorption): 3–5 km.

What this shows: The inverse-square law is exact for ideal geometry but is not the whole story over long distances. Atmospheric absorption modifies the falloff, especially at higher frequencies. The conch blown at dawn in a hill temple is audible for kilometres — farther than human voice or a bell — because its lower pitch suffers less atmospheric absorption. The 20 dB drop per decade of distance is the quickest way to estimate how far sound will travel in the simple geometric model.
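The purely geometric part of this example reduces to two one-line formulas. The sketch below covers only the 20 dB-per-decade falloff and deliberately ignores atmospheric absorption; the function names are illustrative.

```python
import math


def level_at_distance(level_ref_db, r, r_ref=1.0):
    """Level (dB) at distance r, given the level at r_ref, for pure inverse-square spreading."""
    return level_ref_db - 20.0 * math.log10(r / r_ref)


def distance_for_drop(drop_db, r_ref=1.0):
    """Distance at which the level has fallen by drop_db relative to r_ref (geometric loss only)."""
    return r_ref * 10.0 ** (drop_db / 20.0)


print(f"conch at 2 m: {level_at_distance(110.0, 2.0):.1f} dB")
print(f"distance for a 90 dB drop: {distance_for_drop(90.0) / 1000.0:.1f} km")
```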

Common confusions

"Decibels are a unit of loudness." They are a unit of logarithmic intensity (or pressure, depending on the reference). They describe a physical quantity. Loudness is a subjective sensation, only approximately proportional to the dB level. The two are often close, but not the same — A-weighted decibels (dBA) are a closer proxy for loudness because they incorporate the ear's frequency response.
"Doubling a sound's dB value doubles its intensity." No — doubling the dB value squares the intensity ratio (approximately, for typical values). Going from 60 dB to 120 dB is a factor of 10^6 in intensity, not a factor of 2. The dB scale is logarithmic; arithmetic on dB values corresponds to exponential arithmetic on intensities.
"Adding two 60 dB sounds gives 120 dB." Absolutely not. Adding two equal intensities is 3 dB above either one. Two 60 dB sources combine to 63 dB. Two 100 dB sources combine to 103 dB. The dB scale does not permit simple addition; you must always convert back to intensity, add, and convert forward again.
"Negative dB values are impossible." They are perfectly fine. -10 dB means an intensity ten times lower than I_0 — i.e. 10^{-13} W/m². Physiologically this is below the threshold of hearing for adults, but it is a meaningful physical quantity. A healthy infant's ear can sometimes detect sounds at -5 to -10 dB; sensitive microphones routinely quote -10 dB noise floors.
"Sound intensity is proportional to pressure amplitude." It is proportional to the square of the pressure amplitude. Doubling the pressure amplitude quadruples the intensity (and adds 6 dB to the sound level). This is why a conversation held "twice as loud" in acoustic terms is actually a modest perceptual increase — about 3 dB, since doubling the intensity (not the pressure) is 3 dB.
"The inverse-square law applies in a room." Only in a room with perfectly absorbing walls. In a reflective room (the usual case), the direct sound falls off as 1/r^2 but the reverberant component is roughly constant with distance. Close to a source the direct sound dominates and you see 1/r^2. Far from the source, or in a highly reflective space, the reverberant field wins and the sound level is nearly uniform. This is why you cannot escape a loud speaker by moving to the far corner of a tiled bathroom.
"0 dB means no sound." No — it means "sound at the reference intensity I_0 = 10^{-12}\ \text{W/m}^2", the conventional threshold of human hearing. A sound at 0 dB is present but barely audible. Silence — no sound at all — is minus infinity on the dB scale.

If you came here to understand what intensity is, how it falls with distance, and how the decibel scale works, you have it. What follows is for readers who want the derivation of the energy density of a sound wave, the pressure-level scale, the A-weighting machinery, and the physiology of how the ear actually encodes loudness.

Deriving the intensity formula from the wave equation

For a plane sinusoidal sound wave in a gas, the displacement of a fluid element is

s(x, t) = s_0\sin(kx - \omega t)

The particle velocity (the velocity of the gas element) is

u(x, t) = \frac{\partial s}{\partial t} = -s_0\omega\cos(kx - \omega t)

with amplitude u_0 = s_0\omega. The pressure variation (derived in Sound Waves from the continuity and momentum equations for a fluid) is

p(x, t) = -B\,\frac{\partial s}{\partial x} = -p_0\cos(kx - \omega t), \qquad p_0 = B k s_0 = \rho v u_0 = \rho v s_0\omega

Step 1. Kinetic energy density.

u_K(x, t) = \tfrac{1}{2}\rho u^2(x, t) = \tfrac{1}{2}\rho u_0^2\cos^2(kx - \omega t)

Time-averaged over one cycle (\langle\cos^2\rangle = 1/2):

\langle u_K \rangle = \tfrac{1}{4}\rho u_0^2

Step 2. Potential energy density.

Compressing a gas element by a small strain \partial s/\partial x stores elastic energy \tfrac{1}{2}B(\partial s/\partial x)^2 where B is the bulk modulus. Using v^2 = B/\rho and the wave's \partial s/\partial x = s_0 k\cos(kx - \omega t):

u_P(x, t) = \tfrac{1}{2}B s_0^2 k^2\cos^2(kx - \omega t) = \tfrac{1}{2}\rho v^2 s_0^2 k^2\cos^2(kx - \omega t) = \tfrac{1}{2}\rho s_0^2 \omega^2\cos^2(kx - \omega t)

(using \omega = vk). Time-averaged:

\langle u_P \rangle = \tfrac{1}{4}\rho s_0^2\omega^2 = \tfrac{1}{4}\rho u_0^2

Why: kinetic and potential energies are equal on average — this is a universal feature of SHM (see Energy in SHM for the particle-on-spring version). The physical reason: a plane sound wave is a 1-D SHM at every location, with energy sloshing between kinetic (moving fluid) and potential (compressed fluid) as the wave passes.

Step 3. Total energy density.

\langle u \rangle = \langle u_K \rangle + \langle u_P \rangle = \tfrac{1}{2}\rho u_0^2 = \frac{p_0^2}{2\rho v^2}

where in the last step I substituted u_0 = p_0/(\rho v).

Step 4. Intensity.

Intensity is the energy density times the speed at which the wave carries energy — and the wave carries energy at the phase velocity v:

I = \langle u \rangle v = \frac{p_0^2}{2\rho v^2} \cdot v = \frac{p_0^2}{2\rho v}

which recovers the formula used above.
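The derivation can be checked numerically by averaging the two energy densities over one period. This is a sketch only: the function name, the 1000 Hz frequency, and the sample count are arbitrary illustrative choices, and the air values are those used earlier in the article.

```python
import math


def averaged_energy_densities(p0, rho=1.2, v=343.0, freq=1000.0, samples=10000):
    """Time-average the kinetic and potential energy densities at x = 0.

    Returns (mean_kinetic, mean_potential) in J/m^3 for the plane wave
    s(x, t) = s0 sin(kx - wt). The frequency is an arbitrary choice.
    """
    omega = 2.0 * math.pi * freq
    k = omega / v
    u0 = p0 / (rho * v)      # particle-velocity amplitude
    s0 = u0 / omega          # displacement amplitude
    bulk = rho * v ** 2      # bulk modulus from v^2 = B / rho
    period = 2.0 * math.pi / omega

    kinetic = potential = 0.0
    for n in range(samples):
        t = (n + 0.5) * period / samples
        phase = -omega * t                      # kx - wt evaluated at x = 0
        velocity = -s0 * omega * math.cos(phase)
        strain = s0 * k * math.cos(phase)
        kinetic += 0.5 * rho * velocity ** 2
        potential += 0.5 * bulk * strain ** 2
    return kinetic / samples, potential / samples


p0, rho, v = 0.02, 1.2, 343.0
mean_k, mean_p = averaged_energy_densities(p0, rho, v)
print("numerical intensity:", (mean_k + mean_p) * v)
print("p0^2 / (2 rho v)   :", p0 ** 2 / (2.0 * rho * v))
```

The two printed values should agree to within rounding, and the kinetic and potential averages should match each other.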

The sound pressure level

Because pressures are easier to measure directly than intensities (a microphone is a pressure transducer), sound levels are often expressed in terms of pressure. The sound pressure level is

\beta_p = 20\log_{10}\!\left(\frac{p_0}{p_{\text{ref}}}\right)

where p_{\text{ref}} = 20\ \mu\text{Pa} is the reference pressure corresponding to the threshold of hearing at 1000 Hz. The factor of 20 (rather than 10) reflects the fact that intensity goes as the square of pressure — so a pressure ratio of 10 corresponds to an intensity ratio of 100, which is 20 dB.

Some numerical correspondences:

p_0 = 20\ \mu\text{Pa}: \beta_p = 0 dB (threshold of hearing at 1 kHz).
p_0 = 0.02\ \text{Pa} (10^3 times reference): \beta_p = 60 dB (speech).
p_0 = 20\ \text{Pa} (10^6 times reference): \beta_p = 120 dB (threshold of pain).
p_0 = 2000\ \text{Pa} (10^8 times reference): \beta_p = 160 dB (jet engine at 1 m — immediate hearing damage).

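In air at sea level, where \rho v \approx 412\ \text{kg/(m}^2\text{s)}, the intensity level \beta_I = 10\log_{10}(I/I_0) and the pressure level \beta_p are numerically equal to within a fraction of a decibel, provided the pressure entering \beta_p is the rms value p_0/\sqrt{2}, as the standard definition specifies (inserting the amplitude p_0 directly reads 3 dB higher). In other media (water, for example) they differ, because I_0 and p_{\text{ref}} are conventionally chosen to match in air but not in water.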
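The numerical correspondences listed above follow directly from the definition. A minimal sketch, with the illustrative names `P_REF` and `sound_pressure_level`:

```python
import math

P_REF = 20e-6  # reference pressure in Pa (20 micropascals)


def sound_pressure_level(pressure, reference=P_REF):
    """Sound pressure level 20 log10(p / p_ref) in dB, for a pressure in Pa."""
    if pressure <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure / reference)


for p in (20e-6, 0.02, 20.0, 2000.0):
    print(f"{p:12.6f} Pa -> {sound_pressure_level(p):6.1f} dB")
```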

The A-weighting correction

The human ear is not equally sensitive at all frequencies. The A-weighting filter, defined in international standards (IEC 61672-1), applies a frequency-dependent gain to a sound-level measurement to mimic the ear's response at moderate loudness (roughly 40 phon). Its gain in dB as a function of frequency f (Hz) is approximately

A(f) = 2.00 + 20\log_{10}\!\left(\frac{12200^2 f^4}{(f^2 + 20.6^2)(f^2 + 12200^2)\sqrt{(f^2 + 107.7^2)(f^2 + 737.9^2)}}\right)

This is the curve built into every noise meter's "dBA" setting. The gain is roughly -50 dB at 20 Hz (very low bass is not perceived as loud as its physical intensity would suggest), 0 dB at 1000 Hz (the reference), rises to a peak of about +1 dB near 2000–3000 Hz (close to the region of maximum ear sensitivity, which reflects the resonance of the ear canal and matches the pitch of a crying baby), and declines again at the upper end of the audible range.

For broad-spectrum noise (traffic, industrial machinery), A-weighted and unweighted measurements usually differ by 0–5 dB. For pure tones they can differ by tens of dB. A street-corner measurement of "75 dBA" and "75 dB" refer to physically different total acoustic energies; the first is what your ear reports, the second is the raw physical quantity.
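The approximate A-weighting formula quoted above is easy to evaluate. The sketch below transcribes it directly, using the rounded constants given in this article rather than the exact values in the standard; the function name `a_weighting_db` is illustrative.

```python
import math


def a_weighting_db(f):
    """Approximate A-weighting gain in dB at frequency f (Hz), per the formula in the text."""
    if f <= 0:
        raise ValueError("frequency must be positive")
    f2 = f * f
    numerator = 12200.0 ** 2 * f2 ** 2
    denominator = ((f2 + 20.6 ** 2) * (f2 + 12200.0 ** 2)
                   * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2)))
    return 2.00 + 20.0 * math.log10(numerator / denominator)


for f in (20, 100, 1000, 2500, 10000):
    print(f"{f:6d} Hz: {a_weighting_db(f):+6.1f} dB")
```

Subtracting nothing and adding this gain to an unweighted pure-tone level in dB gives the corresponding dBA reading for that tone.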

Psychoacoustic loudness — the phon and the sone

Phon is the unit of equal loudness: a sound is N phons loud if it sounds as loud as a pure 1000 Hz tone at N dB. Phons thus reset the dB scale to match perceived loudness at each frequency. A 100 Hz tone at 60 dB is only about 40 phons loud — the bass is under-represented by the ear at moderate levels.

Sone is the unit of absolute loudness, designed so that a doubling of the perceived loudness corresponds to a doubling of the sone number. By convention, 1 sone = 40 phons. Two sones is about 50 phons; four sones, about 60 phons. The empirical relation is approximately L_{\text{sone}} = 2^{(L_{\text{phon}} - 40)/10}.
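The empirical phon-to-sone relation can be written as a one-line function. This is a sketch of the formula quoted above; the function name `phon_to_sone` is illustrative, and the relation is assumed here to be used only at moderate levels of about 40 phons and above.

```python
def phon_to_sone(phon):
    """Loudness in sones from loudness level in phons: 2 ** ((phon - 40) / 10)."""
    return 2.0 ** ((phon - 40.0) / 10.0)


# Each 10-phon step should double the perceived loudness.
for phon in (40, 50, 60, 70, 80):
    print(f"{phon} phons -> {phon_to_sone(phon):.0f} sones")
```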

The sone scale is what radio and TV loudness normalisation ("loudness wars" in broadcast audio) is chasing — the goal is to keep the subjective sone level constant across programmes even when the physical dB level varies with content.

Beyond linearity — the ear's dynamic range compression

The inner ear's cochlea is not a linear detector. It performs a nonlinear compression of very loud sounds: as the physical intensity rises past about 60 dB, the electrical signal the auditory nerve produces grows more slowly than the stimulus. This compression is implemented by the outer hair cells, which amplify low-intensity sounds and resist high-intensity ones, giving the ear a usable dynamic range of \sim 120 dB in a system whose sensors can only report about 20 dB of linear range.

The consequence is that the "loudness" scale is not a simple logarithm but rather a piecewise function: approximately logarithmic below 60 dB, more compressive above. The phon/sone relations above are the empirical fit to this compression.

This is why prolonged exposure to sounds above 85 dB damages hearing: the ear is kept operating in the compressed region for long enough that the outer hair cells' chemistry is depleted and the cells die. The Indian occupational safety limit of 85 dBA for 8-hour shifts, and 90 dBA for 2-hour shifts, is set by the integrated physical intensity that hair cells can survive.

The inverse-square law in flat space — why 4\pi r^2

The factor 4\pi in I = P/(4\pi r^2) comes from the fact that the surface area of a sphere of radius r in flat three-dimensional space is 4\pi r^2. In curved space the formula is different — in a hypersphere of positive curvature, the area of a sphere of radius r is 4\pi\sin^2(r/R)R^2 where R is the radius of curvature. For ordinary sound in ordinary space, this is a completely negligible correction; but in cosmology, where gravitational-wave intensities are computed over distances where spacetime curvature matters, the analogous inverse-square law is modified.

In two-dimensional spreading (say, waves on a drum head), the "sphere" is a circle, whose circumference grows as 2\pi r — so intensity falls as 1/r rather than 1/r^2. In one dimension (sound in a long thin tube), the intensity does not fall at all with distance (ignoring absorption), which is why tubular speaking-tubes (a staple of old passenger ships) carried voices over long distances without a microphone.

The geometry of the law reflects the geometry of space.

Where this leads next

Sound Waves — Nature and Propagation — the physical source of the intensity formula: compressions, rarefactions, and the wave equation for air.
Principle of Superposition — why two incoherent sound sources add intensities, while two coherent ones add amplitudes.
Beats — what happens when two close-frequency sounds superpose; a tool for tuning by ear that uses the intensity modulation principle.
Standing Waves and Normal Modes — the spatial distribution of intensity in a resonating system, where antinodes carry maximum energy flux locally but the global average flux is zero.
Doppler Effect — how the frequency and intensity of sound change when source or observer move, with applications from traffic radar to astrophysics.