In short

When two sound waves of nearly equal frequencies f_1 and f_2 superpose, the result at a fixed point is a tone at the average frequency \bar{f} = (f_1 + f_2)/2 whose amplitude varies slowly as a cosine envelope at the half-difference frequency (f_1 - f_2)/2. Because loudness responds to the square of the amplitude, the ear perceives a throb — a waxing and waning of intensity — every time the envelope reaches a peak, whether the cosine is +1 or -1. This happens twice per envelope cycle, so the audible beat frequency is

\boxed{f_{\text{beat}} = |f_1 - f_2|}.

Beats are how you can hear whether two instruments are in tune without knowing their absolute pitches: when the throbbing slows to zero, the two notes match. Every tabla player tuning their bayan against a tanpura drone, every piano tuner locking a string against a fork, and every sitar student pulling two chikari strings into unison is using this one formula.

A tanpura drones steadily in a corner of the rehearsal room: the tonic sa, a steady hum filling the space. A sitar student turns the peg of the sitar's main string, trying to bring it into tune with the drone. At first the two sounds clash: you hear a rough, throbbing sound, loud-quiet-loud-quiet, three or four times a second. The student tightens the peg a fraction. The throbbing slows to twice a second. Another hair's-width, and it is once a second. One more tiny turn and the throb disappears entirely. The two notes have locked into unison; the rehearsal can begin.

You did not need to know what frequency either instrument was playing. You did not need a tuning machine or a reference fork or any measuring device at all. The sound itself told you, with precision of better than half a hertz, how far out of tune the sitar was. That is the phenomenon of beats: a ready-made tuning device that the superposition of two waves hands to you for free.

This article derives where that throbbing comes from, proves that its rate equals |f_1 - f_2|, and shows how to read the phenomenon in the algebra and in the ear. By the end you will understand why stringed instruments in Hindustani and Carnatic music are so often tuned by listening for beats, and why the same idea underlies laser interferometry, radio heterodyning, and the way the human ear itself distinguishes nearly identical pitches.

Two SHMs at a fixed point

Start with two sound waves of equal amplitude A but slightly different frequencies, both reaching your eardrum at the same location. At your ear, each wave is an SHM in time (see Principle of Superposition for why the wave at a single point is always an SHM). Take the two to be

y_1(t) = A\cos(2\pi f_1 t), \qquad y_2(t) = A\cos(2\pi f_2 t)

with f_1 and f_2 close to each other — say both near 440 Hz, differing by a few hertz. By the principle of superposition, the net displacement of your eardrum is simply the sum:

y(t) = y_1(t) + y_2(t) = A\cos(2\pi f_1 t) + A\cos(2\pi f_2 t)

The question is: what does this sum sound like?

Deriving the beat formula

Step 1. Apply the sum-to-product identity for cosines:

\cos\alpha + \cos\beta = 2\cos\!\left(\frac{\alpha + \beta}{2}\right)\cos\!\left(\frac{\alpha - \beta}{2}\right)

Set \alpha = 2\pi f_1 t and \beta = 2\pi f_2 t:

y(t) = 2A\cos\!\left(2\pi\,\frac{f_1 + f_2}{2}\,t\right)\cos\!\left(2\pi\,\frac{f_1 - f_2}{2}\,t\right)

Why: the identity is a standard trigonometric result. Write \alpha = u + v and \beta = u - v, with u = (\alpha + \beta)/2 and v = (\alpha - \beta)/2, and expand \cos(u + v) + \cos(u - v) using the angle-sum formulas: the \sin u\,\sin v terms cancel and the \cos u\,\cos v terms add to give 2\cos u\,\cos v. What matters here is that it converts the sum of two cosines at different frequencies into a product of two cosines: one at the average frequency, one at the half-difference frequency.
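The step above can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: it compares the direct sum with the product form at many sample times. The function names and the test frequencies (442 Hz and 438 Hz) are illustrative assumptions.

```python
import math

def two_tone_sum(f1, f2, t, amplitude=1.0):
    """Direct sum A*cos(2*pi*f1*t) + A*cos(2*pi*f2*t)."""
    return amplitude * (math.cos(2 * math.pi * f1 * t) + math.cos(2 * math.pi * f2 * t))

def two_tone_product(f1, f2, t, amplitude=1.0):
    """Product form 2A*cos(2*pi*f_avg*t)*cos(2*pi*f_half_diff*t)."""
    f_avg = (f1 + f2) / 2
    f_half_diff = (f1 - f2) / 2
    return (2 * amplitude
            * math.cos(2 * math.pi * f_avg * t)
            * math.cos(2 * math.pi * f_half_diff * t))

# Illustrative values (assumed, not from the article): 442 Hz and 438 Hz,
# sampled 8000 times per second over one second.
max_error = max(
    abs(two_tone_sum(442.0, 438.0, n / 8000) - two_tone_product(442.0, 438.0, n / 8000))
    for n in range(8001)
)
print(f"largest difference between the two forms: {max_error:.2e}")
```

The difference should sit at the level of floating-point rounding, which is what an exact identity looks like in numerical form.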

Step 2. Name the two timescales.

Define

\bar{f} = \frac{f_1 + f_2}{2} \quad\text{(average frequency, "carrier")}
\Delta f = f_1 - f_2 \quad\text{(frequency difference)}

Then

\boxed{y(t) = \underbrace{2A\cos\!\left(\pi\,\Delta f\,t\right)}_{\text{slow envelope}}\,\cdot\,\underbrace{\cos(2\pi\,\bar{f}\,t)}_{\text{fast oscillation}}}

Why: if f_1 and f_2 are both around 440 Hz and differ by \Delta f = 4 Hz, then the "carrier" \bar{f} \approx 440 Hz is the rapid oscillation you hear as a tone, while the "envelope" \cos(\pi \Delta f\, t) varies at angular frequency \pi \Delta f = 4\pi rad/s, that is 2 Hz, with a period of 0.5 s. The two timescales (440 Hz against 2 Hz) are cleanly separated by more than two orders of magnitude.

Step 3. Interpret the structure.

The sound you hear is a sinusoid at the average frequency \bar{f}, but whose amplitude is not constant — it is multiplied by the slowly varying factor 2A\cos(\pi \Delta f\,t). At moments when the cosine is +1, the amplitude is 2A (full constructive interference). At moments when the cosine is 0, the amplitude is 0 (full destructive interference — momentary silence). At moments when the cosine is -1, the amplitude is -2A — but the sinusoid is still at full magnitude, just with a flipped phase, which the ear does not distinguish from the +2A case.

[Figure: Beat pattern, a fast oscillation inside a slow envelope. Axes: t (horizontal), y (vertical). Labels: carrier \cos(2\pi\bar{f}\,t); envelope \pm 2A\cos(\pi\Delta f\,t); beat period T_beat = 1/|f_1 - f_2|. The envelope reaches zero twice per envelope cycle, which is when the sound goes momentarily silent.]
The solid red curve is the full signal $y(t) = 2A\cos(\pi\Delta f\,t)\cos(2\pi\bar{f}\,t)$. The fast wiggle at $\bar{f}$ is the perceived tone; the slow dashed envelope at frequency $\Delta f / 2$ is its amplitude modulation. The ear hears a throb every time the magnitude of the envelope peaks, and that happens *twice* per envelope cycle, once at $+2A$ and once at $-2A$, with a momentary silence at each zero-crossing in between.

Step 4. The key point — the ear tracks loudness, not amplitude.

The loudness of a sound at any instant is proportional to the square of the amplitude (intensity — see Intensity and Loudness of Sound). The envelope factor 2A\cos(\pi\Delta f\,t) is squared, giving 4A^2\cos^2(\pi\Delta f\,t) = 2A^2[1 + \cos(2\pi\Delta f\,t)].

That is the essential observation: the perceived loudness oscillates at \Delta f, not at \Delta f/2. The envelope swings positive, negative, positive, negative — but the magnitude of the envelope, which is what the ear registers, cycles at twice that rate.

\boxed{f_{\text{beat}} = |f_1 - f_2|}

Why: \cos^2\theta = \tfrac{1}{2}(1 + \cos 2\theta). Squaring a cosine doubles its frequency. The envelope of y(t) cycles at \Delta f /2, but the squared envelope — which is what the ear detects — cycles at \Delta f. So the number of loud-silent-loud transitions you hear per second is \Delta f, exactly the frequency difference.
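The frequency doubling can be seen directly in numbers. This is a small sketch under stated assumptions: the squared envelope stands in for loudness, the mismatch is the 4 Hz used in Step 2, and the instant `t0` is arbitrary. It shows that one beat period later the envelope has flipped sign while the loudness has returned to the same value.

```python
import math

def envelope(delta_f, t, amplitude=1.0):
    """Slow envelope 2A*cos(pi*delta_f*t) from the product form."""
    return 2 * amplitude * math.cos(math.pi * delta_f * t)

def loudness(delta_f, t, amplitude=1.0):
    """Squared envelope, used here as a stand-in for perceived intensity."""
    return envelope(delta_f, t, amplitude) ** 2

delta_f = 4.0               # Hz, the mismatch used in Step 2
beat_period = 1 / delta_f   # 0.25 s
t0 = 0.03                   # arbitrary instant, chosen only for illustration

# One beat period later the envelope has the opposite sign...
env_now = envelope(delta_f, t0)
env_later = envelope(delta_f, t0 + beat_period)

# ...but the loudness is back where it started.
loud_now = loudness(delta_f, t0)
loud_later = loudness(delta_f, t0 + beat_period)

# Half-angle form 2A^2 [1 + cos(2*pi*delta_f*t)] with A = 1, same instant.
half_angle_form = 2 * (1 + math.cos(2 * math.pi * delta_f * t0))

print(f"envelope: {env_now:+.4f} then {env_later:+.4f}")
print(f"loudness: {loud_now:.4f} then {loud_later:.4f}")
print(f"half-angle form: {half_angle_form:.4f}")
```

The envelope needs two beat periods to repeat; the loudness needs only one, which is why the audible rate is \Delta f rather than \Delta f/2.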

Watch the beats

[Animation: beat pattern of two cosines at 4 Hz and 5 Hz. Axes: t (s), y. A red point traces the sum \cos(2\pi \cdot 4t) + \cos(2\pi \cdot 5t); the amplitude waxes and wanes with a beat period of 1 second.]
Superposition of two equal-amplitude cosines at $4$ Hz and $5$ Hz. The red trail is the sum; the pale trails are the envelope $\pm 2\cos(\pi t)$. The amplitude peaks at $t = 0, 1, 2, 3, 4$ s (every integer second) and drops to zero halfway between. Beat frequency: $|5 - 4| = 1$ Hz, which means one throb per second.

Count the throbs in the animation. Between t = 0 and t = 4 s the amplitude completes exactly four beat cycles, peaking once per second, which matches f_{\text{beat}} = 1 Hz. If you made f_1 = 4 Hz and f_2 = 6 Hz (difference 2 Hz), you would see two throbs per second. The formula is exact.
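The same count can be reproduced without the animation. A minimal sketch, with the helper name `two_tone` assumed for illustration: it evaluates the sum of two unit-amplitude cosines at the instants where peaks and nulls are expected.

```python
import math

def two_tone(f1, f2, t):
    """Sum of two unit-amplitude cosines at frequencies f1 and f2 (Hz)."""
    return math.cos(2 * math.pi * f1 * t) + math.cos(2 * math.pi * f2 * t)

# 4 Hz and 5 Hz: peaks expected at whole seconds, nulls halfway between.
peaks_4_5 = [two_tone(4.0, 5.0, float(n)) for n in range(5)]
nulls_4_5 = [two_tone(4.0, 5.0, n + 0.5) for n in range(4)]

# 4 Hz and 6 Hz: the difference is 2 Hz, so peaks arrive every half second.
peaks_4_6 = [two_tone(4.0, 6.0, n * 0.5) for n in range(9)]

print("4 and 5 Hz, t = 0..4 s:        ", [round(y, 6) for y in peaks_4_5])
print("4 and 5 Hz, t = 0.5..3.5 s:    ", [round(y, 6) for y in nulls_4_5])
print("4 and 6 Hz, t = 0..4 s by 0.5: ", [round(y, 6) for y in peaks_4_6])
```

Full-height values mark the throbs and near-zero values mark the silences, so the spacing of the peaks gives the beat period directly.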

Why the tuning trick works

The strategy for tuning one instrument against another is now transparent. The moment the two instruments are in exact tune, f_1 = f_2, so \Delta f = 0 and f_{\text{beat}} = 0 — the throbbing stops entirely. Conversely, if you hear beats at all, the instruments are out of tune, and the number of beats per second tells you by how much.

The musician listens to the rate, adjusts the tuning peg, listens again, iterates. The rate slows as the tuning converges, and snaps to zero when it is exact. The sense that a particular tuning is "right" is not ear-magic; it is the disappearance of f_{\text{beat}}.

How to tell which way is sharp

The beat tells you |f_1 - f_2| but not the sign. Two tricks:

  1. Over-tighten and listen. Tighten the string a tiny bit past the tuning point. If beats slow (you were below), slacken back to the null. If beats speed up (you were above), slacken further past. A few iterations pin the sign.
  2. Use the tanpura as reference. A tanpura's drone is a thick chord of harmonics. Beats against a single pure tone produce a characteristic throb; beats against the rich harmonics are subtly coloured, and an experienced ear reads the colour as "flat" or "sharp."

In Indian classical tuning, the usual practice is to tune upward: start with the string slightly slack, bring it up slowly while listening to the beats against the tanpura, and stop the moment the beats disappear. The practical logic is that the peg is easier to control when tightening than when loosening.

Worked examples

Example 1: Tabla against tanpura

A tanpura is droning a steady sa (tonic) at exactly 220.0 Hz. A tabla player tightens the skin of the bayan (left drum) and strikes it while the drone sounds. The player hears three beats per second. What are the two possible frequencies of the bayan? If a further half-turn of the tightening ring produces a beat rate of one per second, and a full turn eliminates the beats, what was the direction of the initial mistuning?

[Figure: Number line of candidate bayan frequencies relative to the 220 Hz tanpura drone, with 217 Hz and 223 Hz marked symmetrically, each \Delta f = 3 Hz from the drone.]
Three beats per second means the bayan is exactly $3$ Hz off from the $220$ Hz drone — but could be at either $217$ Hz or $223$ Hz.

Step 1. Apply the beat formula.

f_{\text{beat}} = |f_1 - f_2| = 3\ \text{Hz}

Why: the perceived throb frequency equals the magnitude of the frequency difference. It does not tell you which of the two tones is higher — only how far apart they are.

Step 2. Identify the two candidate bayan frequencies.

f_{\text{bayan}} = 220 \pm 3 = 217\ \text{Hz}\ \text{or}\ 223\ \text{Hz}

Step 3. Use the tightening experiment to pick between them.

Tightening the bayan's skin raises its pitch: higher tension means a higher wave speed in the membrane, and since the wavelength of a given mode is fixed by the drum's geometry, the frequency rises. After a half-turn of further tightening, the beat rate drops from 3 Hz to 1 Hz; after a full turn it drops to 0. The pitch is moving toward 220 Hz.

If the bayan had started at 217 Hz (below the drone), tightening would raise it toward 220, the beats would slow and finally disappear — consistent with what was observed.

If the bayan had started at 223 Hz (above the drone), tightening would raise it further, the beats would speed up — inconsistent with what was observed.

Why: the throb rate moves in the same direction as |f_{\text{bayan}} - f_{\text{drone}}|. When tightening decreases the rate, the bayan was below the drone.

Result: The bayan was initially at f_{\text{bayan}} = 217 Hz, i.e. 3 Hz below the tanpura drone. After one full turn of the tightening ring, it reached 220 Hz and was in tune.

What this shows: The beat rate is a magnitude. Whether you are above or below the reference is disambiguated by the direction of change when you deliberately shift your pitch. This is the universal procedure for tuning by ear — by watching how the beats respond to a known adjustment.
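The reasoning in this example can be written as a short decision procedure. This is a sketch under one labelled assumption: the pitch is raised by a small enough amount that it does not overshoot the reference. The function names are hypothetical.

```python
def candidate_frequencies(reference_hz, beat_hz):
    """The two frequencies that give the observed beat rate against the reference."""
    return (reference_hz - beat_hz, reference_hz + beat_hz)

def side_before_raising(beat_before_hz, beat_after_hz):
    """Infer where the instrument started from how the beats respond to raising its pitch.

    Assumption: the pitch was raised only slightly, without overshooting the reference.
    """
    if beat_after_hz < beat_before_hz:
        return "below"   # raising the pitch closed the gap
    if beat_after_hz > beat_before_hz:
        return "above"   # raising the pitch widened the gap
    return "undetermined"

def resolve(reference_hz, beat_before_hz, beat_after_hz):
    """Pick the candidate consistent with the observed change, or None if unclear."""
    low, high = candidate_frequencies(reference_hz, beat_before_hz)
    side = side_before_raising(beat_before_hz, beat_after_hz)
    return {"below": low, "above": high}.get(side)

# Numbers from Example 1: 220 Hz drone, 3 beats/s, then 1 beat/s after tightening.
bayan_hz = resolve(220.0, 3.0, 1.0)
print(candidate_frequencies(220.0, 3.0), "->", bayan_hz)
```

Returning `None` when the beat rate does not change is a deliberate choice: with no response to the adjustment, the sign really is undetermined.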

Example 2: Two sitar strings and a counted beat rate

A sitar has two unison drones called chikari, nominally tuned to the same high pitch (440 Hz). The student plucks both and counts 20 beats in 8 seconds, then tightens the peg of one string. Now they count 20 beats in 16 seconds. Find: (a) the frequency difference in each case, (b) which string was adjusted and by how many hertz.

[Figure: Two chikari strings, A at f_A and B at f_B, with beat timelines. Before: 20 beats in 8 s ⇒ 2.5 Hz. After: 20 beats in 16 s ⇒ 1.25 Hz.]
Two chikari strings played together. The student measures the beat count over a timed interval. Halving the beat rate corresponds to halving the frequency mismatch.

Step 1. Compute the beat frequencies.

Before tightening: f_{\text{beat},1} = 20/8 = 2.5 Hz.

After tightening: f_{\text{beat},2} = 20/16 = 1.25 Hz.

Why: beat frequency is the number of throbs per second — just count and divide by elapsed time.

Step 2. Translate into frequency differences.

|f_A - f_B|_1 = 2.5\ \text{Hz}, \qquad |f_A - f_B|_2 = 1.25\ \text{Hz}

The pitch gap halved. Assuming the small adjustment did not carry the string past unison, the tightened string moved 2.5 - 1.25 = 1.25 Hz closer to the other.

Step 3. Which string was tightened, and what is the new frequency?

If string B was originally at f_B = 440 Hz (reference), and string A was 2.5 Hz off, then f_A = 437.5 Hz or 442.5 Hz. Tightening raises the pitch by 1.25 Hz.

  • If f_A = 437.5 Hz (below B), tightening gives f_A = 438.75 Hz. Gap is |438.75 - 440| = 1.25 Hz. ✓
  • If f_A = 442.5 Hz (above B), tightening gives f_A = 443.75 Hz. Gap is |443.75 - 440| = 3.75 Hz. ✗

Only the first case is consistent with the measurement. So f_A was initially below 440 Hz, and tightening has moved it upward toward f_B.

Why: to pick between the two algebraically possible initial frequencies, use the observation that tightening made the beats slow down. If the adjusted string was below the other, tightening brings them closer (slower beats); if it was above, tightening moves it further (faster beats).

Result: String A was adjusted. Its frequency went from 437.5 Hz to 438.75 Hz, i.e. was raised by 1.25 Hz. Another similar adjustment would bring the two strings into unison.

What this shows: The beat method measures frequency differences with remarkable precision: one beat per ten seconds corresponds to an error of 0.1 Hz out of 440 Hz, or about 2 parts in 10^4. Beyond that, the ear can sometimes detect absence of beats more sensitively than a frequency counter could, because the tuning converges exactly in the limit of no throb.
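The arithmetic of this example can be checked in a few lines. A minimal sketch, assuming as the worked solution does that string B sits at the 440 Hz reference and that tightening raised string A by the change in beat rate without overshooting; the variable names are illustrative.

```python
import math

def beat_rate(beat_count, seconds):
    """Beats per second from a timed count."""
    return beat_count / seconds

reference_hz = 440.0               # string B, taken as the reference
gap_before = beat_rate(20, 8)      # Hz, before tightening
gap_after = beat_rate(20, 16)      # Hz, after tightening
raise_hz = gap_before - gap_after  # assumed pitch rise of string A (no overshoot)

# Try both algebraically possible starting points for string A.
consistent_starts = []
for start_hz in (reference_hz - gap_before, reference_hz + gap_before):
    new_gap = abs(start_hz + raise_hz - reference_hz)
    if math.isclose(new_gap, gap_after):
        consistent_starts.append(start_hz)

print("beat rates:", gap_before, "Hz then", gap_after, "Hz")
print("starting frequencies consistent with the slower beats:", consistent_starts)
```

Only one of the two candidates survives the check, which is the same elimination carried out by hand in Step 3.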

Example 3: Stroboscopic beats — the spinning fan

A ceiling fan in a Mumbai apartment spins at an unknown speed close to 900 RPM. Under the flicker of a 50 Hz tubelight (which pulses at 100 Hz because each AC cycle produces two brightness peaks), the blades appear to creep backward at 1 revolution per minute. Find the true rotational frequency of the fan. (This is the same physics as acoustic beats: the superposition of the light's pulse rate with the fan's rotational rate produces an apparent slow motion at the difference frequency.)

[Figure: Three-blade fan rotating at f_fan RPM under a tubelight strobing at 100 Hz; the apparent motion is 1 RPM backward.]
The tubelight strobes at $100$ Hz (i.e. $6000$ flashes per minute). If the fan rotates at exactly $6000/k$ RPM for integer $k$, the blades look frozen. The small mismatch between the fan rate and the nearest synchronous rate appears as a slow apparent motion at the "beat" rate.

Step 1. Convert strobe rate to RPM.

f_{\text{strobe}} = 100\ \text{flashes/s} \times 60\ \text{s/min} = 6000\ \text{flashes/min}

Step 2. Find the nearest synchronous fan rate near 900 RPM.

If the fan rotated at exactly 6000/k RPM for some positive integer k, a single marked blade would return to the same angular position every k flashes, and the pattern would appear frozen. Candidates:

  • k = 6: 6000/6 = 1000 RPM
  • k = 7: 6000/7 \approx 857 RPM
  • k = 8: 6000/8 = 750 RPM

Numerically, 857 RPM is the closest of these to 900 RPM, but the list ignores the fact that the fan has three identical blades.

Why: when the fan completes exactly one full revolution per k flashes, the flashes keep catching the blade at the same k positions, so the pattern looks frozen. Near-sync shows as slow drift rather than full rotation.

With three identical blades, the apparent blade pattern repeats when the fan has turned one-third of a revolution. The simplest "sync" therefore happens when the fan turns exactly 120° in the inter-flash interval, at 6000/3 = 2000 RPM, and the low-order frozen patterns sit at 2000/k RPM:

  • k = 2: 1000 RPM
  • k = 3: 667 RPM

At 857 RPM the three blades would be spread over 21 overlapping positions, which reads as a blur rather than a pattern. So 1000 RPM is the nearest clearly visible sync near 900 RPM.

Step 3. Apply the beat formula.

The blades appear to drift backward at 1 RPM. In the stroboscopic analogy, this is the difference between the fan's true rate and the nearest sync rate:

|f_{\text{fan}} - 1000\ \text{RPM}| = 1\ \text{RPM}

Therefore f_{\text{fan}} = 999 RPM or f_{\text{fan}} = 1001 RPM.

Why: the "beat" between a rotating object and a strobing light follows the same difference-frequency rule as an acoustic beat, with one advantage: the apparent motion reveals the sign of f_{\text{fan}} - f_{\text{sync}} as well as its magnitude, because you can see which way the blades appear to move.

Step 4. The direction of apparent motion resolves the sign.

If the fan is slightly slower than 1000 RPM (say 999), each strobe catches each blade slightly behind where it was one flash ago — the blades appear to drift backward.

If the fan is slightly faster than 1000 RPM (say 1001), the blades appear to drift forward.

The problem states that the blades drift backward — so f_{\text{fan}} = 999 RPM.

Result: The fan's true rotational rate is 999 RPM.

What this shows: The same superposition mathematics that produces audible beats in acoustics produces visible stroboscopic beats in optics. This is the principle of the automobile timing light, the ultrasonic diagnostic velocimeter, and the heterodyne receiver in every radio: mix two close frequencies and read the slow beat to measure their difference with far greater precision than you could measure either one directly.
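The steps of this example can be traced numerically. This is a sketch under the example's own assumptions: a three-blade fan, a 100 Hz flash rate, frozen patterns at 2000/k RPM, and backward drift meaning the fan is slower than the sync rate. The variable names are illustrative.

```python
import math

flashes_per_min = 100 * 60   # 100 Hz flicker expressed per minute
blades = 3

# Frozen-pattern rates for a three-blade fan: 2000/k RPM, as in Step 2.
fundamental_rpm = flashes_per_min / blades
pattern_rates = [fundamental_rpm / k for k in range(1, 6)]
nearest_sync_rpm = min(pattern_rates, key=lambda rpm: abs(rpm - 900.0))

# Backward drift of 1 RPM: the fan lags the sync rate by 1 RPM.
apparent_drift_rpm = -1.0
fan_rpm = nearest_sync_rpm + apparent_drift_rpm

# Cross-check: angle turned per flash, compared with the frozen-pattern step.
degrees_per_flash = 360.0 * fan_rpm / flashes_per_min
frozen_step_degrees = 360.0 * nearest_sync_rpm / flashes_per_min
drift_check_rpm = (degrees_per_flash - frozen_step_degrees) / 360.0 * flashes_per_min

print("nearest sync rate:", nearest_sync_rpm, "RPM")
print("fan rate:", fan_rpm, "RPM")
print(f"per-flash shortfall: {frozen_step_degrees - degrees_per_flash:.4f} degrees")
print(f"apparent drift recovered from the per-flash angles: {drift_check_rpm:.6f} RPM")
```

The cross-check makes the mechanism concrete: each flash finds the blades a small fraction of a degree short of the frozen position, and those shortfalls accumulate to one apparent backward revolution per minute.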

Common confusions

If you came here to understand why two close-frequency tones throb, what sets the beat rate, and how to tune by ear, you have it. What follows is for readers who want the energy accounting, the generalisation to unequal amplitudes, the connection to group velocity, and a look at the physiology of hearing that makes beats possible.

Unequal amplitudes — beats with partial cancellation

Repeat the derivation with y_1 = A_1\cos(2\pi f_1 t) and y_2 = A_2\cos(2\pi f_2 t), unequal amplitudes. The sum is

y(t) = A_1\cos(2\pi f_1 t) + A_2\cos(2\pi f_2 t)

The sum-to-product identity does not apply directly to unequal amplitudes. Instead, write the sum using phasors: at each instant, add two vectors of lengths A_1 and A_2 with a phase difference \varphi(t) = 2\pi\Delta f\, t that drifts slowly. By the cosine rule, the resultant amplitude is

A(t) = \sqrt{A_1^2 + A_2^2 + 2A_1 A_2\cos(2\pi\Delta f\,t)}

Step 1. Maximum: \cos = +1, amplitude A_1 + A_2.

Step 2. Minimum: \cos = -1, amplitude |A_1 - A_2|.

The envelope oscillates between A_1 + A_2 and |A_1 - A_2|, never reaching zero unless A_1 = A_2. The beats are still periodic at |f_1 - f_2|, but they are partial — the sound never completely dies away in the troughs.

This is why real-world beats (two instruments, never perfectly matched in volume) sound more like a gentle throb than a full on-off pulsing. In a perfect tuning demonstration with two electronic oscillators of equal amplitude, the silence at the trough is crisp; in the real world it is usually just a noticeable dip.
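The envelope formula is easy to evaluate at its extremes. A minimal sketch, with the amplitudes (1.0 and 0.6) and the 2 Hz mismatch chosen purely for illustration:

```python
import math

def resultant_amplitude(a1, a2, delta_f, t):
    """Envelope sqrt(A1^2 + A2^2 + 2*A1*A2*cos(2*pi*delta_f*t)) from the cosine rule."""
    squared = a1 ** 2 + a2 ** 2 + 2 * a1 * a2 * math.cos(2 * math.pi * delta_f * t)
    # Guard against tiny negative values from rounding when a1 and a2 are nearly equal.
    return math.sqrt(max(0.0, squared))

a1, a2, delta_f = 1.0, 0.6, 2.0   # illustrative values, not from the article

peak = resultant_amplitude(a1, a2, delta_f, 0.0)                  # phasors aligned
trough = resultant_amplitude(a1, a2, delta_f, 1 / (2 * delta_f))  # phasors opposed

print(f"peak   = {peak:.3f}  (A1 + A2   = {a1 + a2:.3f})")
print(f"trough = {trough:.3f}  (|A1 - A2| = {abs(a1 - a2):.3f})")
```

Setting `a2` equal to `a1` brings the trough down to zero, which recovers the equal-amplitude case derived earlier.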

Energy accounting — where does the energy go at the null?

At the instant of maximum destructive interference, the eardrum is motionless — the sound has vanished. But the two sources are still pumping out energy. Where does it go?

The answer: nowhere. There is no contradiction, because conservation of energy applies to the entire spatial field, not to one point.

Consider two speakers pumping out pure tones at close frequencies. At any moment, the pattern of constructive and destructive interference is spatially distributed in the room. When the envelope is at zero at your ear, it is at a maximum at some other place. As time passes, the zones of constructive and destructive interference swirl through space, always rearranging. The total energy delivered to the room integrated over time, and over all positions, equals the sum of the two sources' outputs. Beats are a local phenomenon in time and space; the global energy balance is untouched.

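More precisely: if you average the intensity (y_1 + y_2)^2 over one full beat period, the cross-term 2 y_1 y_2 averages to zero. It splits into a part oscillating at the beat frequency, which averages to zero over exactly one beat period, and a part oscillating at f_1 + f_2, whose average over the many cycles in a beat period is negligible. What is left is y_1^2 + y_2^2, the sum of the two intensities, exactly what each source would contribute if the other weren't present.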
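The averaging argument can be tested by brute force. This sketch assumes unit amplitudes and the illustrative pair 442 Hz and 438 Hz, chosen so that every oscillating term completes a whole number of cycles in one beat period and the averages come out clean.

```python
import math

def averages_over_beat_period(f1, f2, samples=20000):
    """Time-average (y1 + y2)^2, 2*y1*y2 and y1^2 + y2^2 over one beat period."""
    beat_period = 1 / abs(f1 - f2)
    total = cross = separate = 0.0
    for n in range(samples):
        t = beat_period * n / samples
        y1 = math.cos(2 * math.pi * f1 * t)
        y2 = math.cos(2 * math.pi * f2 * t)
        total += (y1 + y2) ** 2
        cross += 2 * y1 * y2
        separate += y1 ** 2 + y2 ** 2
    return total / samples, cross / samples, separate / samples

mean_total, mean_cross, mean_separate = averages_over_beat_period(442.0, 438.0)

print(f"mean of (y1 + y2)^2 : {mean_total:.6f}")
print(f"mean of 2*y1*y2     : {mean_cross:.6f}")
print(f"mean of y1^2 + y2^2 : {mean_separate:.6f}")
```

For other frequency pairs the sum-frequency part of the cross term may not vanish exactly over one beat period, but it stays small whenever the carrier is much faster than the beat.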

Beats and group velocity

A wave packet — a pulse that is a superposition of waves at slightly different frequencies — travels through space at the group velocity v_g = d\omega/dk. The beat formula is the one-location special case of the general envelope-dynamics problem. At a single point, the envelope oscillates in time at the difference frequency \Delta\omega. As a function of position and time, the envelope moves at v_g.

For a packet whose frequencies are centred near \omega_0 with width \Delta\omega, the envelope has spatial extent \sim 1/\Delta k and moves at v_g. Here \Delta\omega and \Delta k are linked by the wave's dispersion relation \omega(k), and v_g = d\omega/dk is in general different from the phase velocity v_\phi = \omega/k. For non-dispersive waves (like sound in air), v_g = v_\phi, so the packet moves at the same speed as the individual wave crests. For dispersive waves (like deep-water waves or quantum-mechanical wave packets), they differ, and a packet can move faster or slower than its carrier wavelets.

The two-tone beat pattern is the simplest visualisable example of this group-velocity behaviour: the slow envelope is the "packet", and its oscillation in time at a fixed point (the beat) is what you hear as the throb. This is the link between the prosaic phenomenon of tabla tuning and the abstract formal machinery of wave packets and group velocity used throughout modern physics and communications engineering.
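The connection can be made explicit with the same sum-to-product step used for the beat formula. The following is a sketch for two travelling waves of equal amplitude; the symbols k_1, k_2, \omega_1, \omega_2, \bar{k} and \bar{\omega} are introduced here for illustration and do not appear elsewhere in the article.

```latex
\begin{align*}
% Two travelling waves with nearby wavenumbers and angular frequencies
y(x,t) &= A\cos(k_1 x - \omega_1 t) + A\cos(k_2 x - \omega_2 t) \\
       &= 2A\cos\!\left(\frac{\Delta k}{2}\,x - \frac{\Delta\omega}{2}\,t\right)
          \cos\!\left(\bar{k}\,x - \bar{\omega}\,t\right),
\qquad \Delta k = k_1 - k_2,\quad \Delta\omega = \omega_1 - \omega_2 \\[4pt]
% A point of constant envelope phase moves at the ratio of the differences
\frac{\Delta k}{2}\,x - \frac{\Delta\omega}{2}\,t = \text{const}
\quad&\Rightarrow\quad
v_{\text{envelope}} = \frac{\Delta\omega}{\Delta k}
\;\longrightarrow\; \frac{d\omega}{dk} = v_g
\quad (\Delta k \to 0) \\[4pt]
% At the fixed point x = 0 this reduces to the beat formula, with \Delta\omega = 2\pi\,\Delta f
y(0,t) &= 2A\cos(\pi\,\Delta f\,t)\cos(2\pi\,\bar{f}\,t)
\end{align*}
```

Read this way, the beat heard at a fixed point is the travelling envelope sweeping past the listener.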

Multi-tone beats and the cochlea's place principle

Three or more close-frequency tones produce more complex beat patterns. For three tones at f_0 - \Delta, f_0, f_0 + \Delta with equal amplitude, the sum has a carrier at f_0 and an envelope (1 + 2\cos(2\pi\Delta t)) — the amplitude varies between -1 and 3. The pattern still has period 1/\Delta, but within one period it has a subtle substructure: a sharp main peak, a small secondary peak at the negative excursion. The ear hears a throb with a "skip" character rather than a simple pulse.
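The three-tone envelope can be checked the same way as the two-tone one. A minimal sketch with unit amplitudes, using f_0 = 440 Hz and \Delta = 2 Hz as illustrative values:

```python
import math

def three_tone_sum(f0, delta, t):
    """Direct sum of unit-amplitude cosines at f0 - delta, f0 and f0 + delta."""
    return sum(math.cos(2 * math.pi * f * t) for f in (f0 - delta, f0, f0 + delta))

def three_tone_envelope(delta, t):
    """Envelope factor 1 + 2*cos(2*pi*delta*t)."""
    return 1 + 2 * math.cos(2 * math.pi * delta * t)

def three_tone_product(f0, delta, t):
    """Envelope times carrier: (1 + 2*cos(2*pi*delta*t)) * cos(2*pi*f0*t)."""
    return three_tone_envelope(delta, t) * math.cos(2 * math.pi * f0 * t)

f0, delta = 440.0, 2.0                      # illustrative values
times = [n / 1000 for n in range(1001)]     # one second in 1 ms steps

max_error = max(abs(three_tone_sum(f0, delta, t) - three_tone_product(f0, delta, t))
                for t in times)
envelope_values = [three_tone_envelope(delta, t) for t in times]

print(f"largest difference between sum and product forms: {max_error:.2e}")
print(f"envelope range: {min(envelope_values):.3f} to {max(envelope_values):.3f}")
```

The asymmetric range of the envelope is what gives the pattern its large main peak and small secondary peak within each period.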

The human ear discriminates nearby pitches through the cochlea, a spiral-shaped fluid-filled organ in the inner ear whose basilar membrane has a position-dependent resonant frequency: high frequencies excite the narrow, stiff end near the entrance, low frequencies excite the wide, flexible far end. Two nearby frequencies excite nearby regions of the basilar membrane, and their excitation patterns overlap. The local hair cells respond to the sum of the two oscillations, and what they send to the brain is, essentially, the beat envelope. The perception of "out-of-tuneness" and "roughness" in music, far from being mystical, is the local beat patterns in the cochlea being read out by the auditory nerve.

The just-noticeable frequency difference at 440 Hz is about 0.5 Hz for most listeners — which corresponds to a beat rate of 0.5 per second, i.e. one beat per two seconds, perfectly consistent with the ear's own beat-sensing machinery. Musicians trained in Indian classical music, where microtonal intonation is essential, can often discriminate 0.2 Hz differences at 440 Hz — the limit of practical biological precision.

Beats in two dimensions — the moiré pattern

If you draw two sets of parallel lines at slightly different spacings on two transparent sheets and overlay them, the interference of the two patterns produces a moiré pattern: visible dark and light bands at a much larger spacing than either of the line patterns. This is the two-dimensional beat: the spatial frequencies of the two line sets beat against each other to produce a coarse spatial modulation. Moiré is used in security printing (to expose copies and fakes) and in strain analysis (to visualise microscopic displacement of a surface); in displays and scanned images it is an unwanted artefact that designers work to suppress. It is the exact spatial analogue of the temporal beat you hear when tuning a tabla.

Where this leads next