In short
A measure of central tendency is a single number that represents the "centre" of a dataset. The three standard measures are the mean (the arithmetic average), the median (the middle value when the data is sorted), and the mode (the most frequently occurring value). Each answers a slightly different question, and the right choice depends on the shape of the data.
Suppose a company has 5 employees with the following monthly salaries (in thousands of rupees): 20, 22, 25, 23, 210.
What is the "typical" salary at this company?
If you average all five numbers, you get (20 + 22 + 25 + 23 + 210)/5 = 300/5 = 60 thousand. The mean salary is ₹60,000.
But look at the actual salaries. Four of the five employees earn between ₹20,000 and ₹25,000. Only one person — perhaps the owner — earns ₹2,10,000. Nobody earns anything close to ₹60,000. The "average" is a number that describes none of the employees.
If instead you sorted the data and picked the middle value — 23 — you would get a much more representative picture of what a typical employee earns.
This is a central tension in descriptive statistics: there is no single "best" way to summarize the centre of a dataset. The mean, the median, and the mode each tell you something different. Knowing when to use each one is less a computational skill than a judgment call, and it matters more than the computation itself.
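The contrast between the two summaries can be checked in a few lines. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

# Monthly salaries in thousands of rupees, taken from the example above.
salaries = [20, 22, 25, 23, 210]

mean_salary = mean(salaries)      # (20 + 22 + 25 + 23 + 210) / 5
median_salary = median(salaries)  # middle value of the sorted list

print(f"mean   = {mean_salary}")    # 60
print(f"median = {median_salary}")  # 23
```

The single large salary moves the mean far from the other four values, while the median stays among them.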
The arithmetic mean
The most familiar measure. Add up all the values and divide by how many there are.
Arithmetic mean
For n observations x_1, x_2, \ldots, x_n, the arithmetic mean is
\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i
The symbol \bar{x} is read "x-bar."
What the mean does. The mean is the balance point of the data. If you placed the data values along a number line and put a physical weight at each value, the mean is where you would place the fulcrum so the line balances perfectly. Every value contributes to the mean, pulling it in proportion to that value's distance from the centre.
This is both a strength and a weakness. It is a strength because the mean uses all the information in the data — no observation is ignored. It is a weakness because extreme values (outliers) pull the mean toward them disproportionately. The salary example above is a textbook case: one outlier (₹2,10,000) yanks the mean far from where most of the data sits.
Properties of the mean.
- The sum of deviations from the mean is always zero: \sum (x_i - \bar{x}) = 0. This is not a coincidence: it follows directly from the definition. Expanding gives \sum (x_i - \bar{x}) = \sum x_i - n\bar{x} = n\bar{x} - n\bar{x} = 0.
- If every observation is increased by a constant k, the mean increases by k: \overline{x + k} = \bar{x} + k.
- If every observation is multiplied by a constant k, the mean is multiplied by k: \overline{kx} = k\bar{x}.
These properties make the mean algebraically convenient — it behaves well under the standard arithmetic operations.
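The three properties can be verified numerically. The sketch below reuses the salary data from the opening example; the constant `k = 3` is an arbitrary choice made only for illustration.

```python
from statistics import mean

data = [20, 22, 25, 23, 210]  # salary example, in thousands of rupees
k = 3                          # arbitrary constant, chosen only for illustration
x_bar = mean(data)             # 60

# Property 1: deviations from the mean sum to zero.
deviation_sum = sum(x - x_bar for x in data)

# Property 2: adding k to every observation adds k to the mean.
shifted_mean = mean([x + k for x in data])

# Property 3: multiplying every observation by k multiplies the mean by k.
scaled_mean = mean([k * x for x in data])

print(deviation_sum)               # 0
print(shifted_mean == x_bar + k)   # True
print(scaled_mean == k * x_bar)    # True
```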
Mean from a frequency table
When data comes in a frequency table, where value x_i appears f_i times, the formula adjusts:
\bar{x} = \frac{\sum f_i x_i}{\sum f_i}
This is the same idea: each value is weighted by how many times it appears.
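As a rough sketch, the frequency-table mean can be computed from a dictionary that maps each value to its frequency. The function name `mean_from_frequencies` and the dictionary layout are assumptions made for this example.

```python
def mean_from_frequencies(table):
    """Mean of a frequency table given as {value: frequency}."""
    count = sum(table.values())
    if count == 0:
        raise ValueError("frequency table is empty")
    total = sum(value * freq for value, freq in table.items())
    return total / count

# Frequency table for the list 2, 3, 3, 4, 5, 5, 5, 6, 7.
table = {2: 1, 3: 2, 4: 1, 5: 3, 6: 1, 7: 1}
raw = [2, 3, 3, 4, 5, 5, 5, 6, 7]

print(mean_from_frequencies(table))  # same value as sum(raw) / len(raw)
```

The result matches the plain average of the expanded list, because the table is only a compressed way of writing the same data.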
Mean from grouped data
When data is grouped into class intervals, you do not know the individual values — only that some number of observations fell in each interval. The standard approximation: replace each interval with its class mark (midpoint), then compute the weighted mean using the class marks as values.
If the class intervals are [l_i, u_i) with class marks m_i = (l_i + u_i)/2 and frequencies f_i:
\bar{x} \approx \frac{\sum f_i m_i}{\sum f_i}
This is an approximation because within each interval, the actual values might not be centred at the midpoint. The approximation is usually close, and it tends to improve as the class intervals get narrower.
The weighted mean
Sometimes different observations carry different importance. A student's final grade might weight the exam at 60% and assignments at 40%. If the exam score is 72 and the assignment score is 88:
\text{final grade} = 0.6 \times 72 + 0.4 \times 88 = 43.2 + 35.2 = 78.4
Weighted mean
For observations x_1, x_2, \ldots, x_n with corresponding weights w_1, w_2, \ldots, w_n:
\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}
The arithmetic mean is the special case where all weights are equal.
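A minimal sketch of the weighted mean, applied to the exam and assignment scores above. The function name `weighted_mean` is an assumption made for this example.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Exam score 72 weighted 60%, assignment score 88 weighted 40%.
final_grade = weighted_mean([72, 88], [0.6, 0.4])
print(round(final_grade, 2))  # 78.4

# Equal weights reduce to the ordinary arithmetic mean.
equal_weight_grade = weighted_mean([72, 88], [1, 1])
print(equal_weight_grade)  # 80.0
```

Dividing by the sum of the weights means the weights do not have to add up to 1; weights of 60 and 40 or 3 and 2 would give the same result.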
The median
Sort the data. Pick the middle value. That is the median.
Median
For n observations arranged in ascending order:
- If n is odd, the median is the value at position (n+1)/2.
- If n is even, the median is the average of the values at positions n/2 and n/2 + 1.
For the salary data 20, 22, 23, 25, 210 (already sorted), n = 5 (odd), so the median is at position (5+1)/2 = 3. The third value is 23. The median salary is ₹23,000 — a much better summary of the "typical" salary than the mean of ₹60,000.
What the median does. The median splits the sorted data exactly in half: at least 50% of the observations are at or below it, and at least 50% are at or above it. The median is insensitive to outliers. You could change the ₹2,10,000 salary to ₹2,00,00,000 and the median would still be 23 — because the median only cares about which value sits in the middle position, not how far the extreme values are.
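The positional definition translates directly into code. The sketch below is one possible implementation (the name `median_by_position` is hypothetical); it also shows that inflating the largest salary leaves the median unchanged.

```python
def median_by_position(values):
    """Median via the odd/even position rule on the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: average the values at 1-based positions n/2 and n/2 + 1.
    return (ordered[mid - 1] + ordered[mid]) / 2

salaries = [20, 22, 25, 23, 210]
inflated = [20, 22, 25, 23, 20000]  # top salary changed to 2,00,00,000 rupees

print(median_by_position(salaries))      # 23
print(median_by_position(inflated))      # 23
print(median_by_position([2, 4, 6, 8]))  # 5.0
```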
Median from grouped data. When data is grouped, you cannot pick the middle value directly. You use the median class, which is the class interval where the cumulative frequency first reaches or exceeds n/2, and then interpolate within it:
\text{Median} = l + \frac{\frac{n}{2} - F}{f} \times h
where l is the lower boundary of the median class, F is the cumulative frequency of the class before the median class, f is the frequency of the median class, and h is the class width. This formula assumes the observations are uniformly distributed within the median class.
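The interpolation can be sketched as follows. The function name and the `(lower, upper, frequency)` tuple layout are assumptions for illustration, and the data is the commute-time table used in Example 2 later in this article.

```python
def grouped_median(classes):
    """Interpolated median for classes given as (lower, upper, frequency).

    Assumes the classes are sorted and contiguous.
    """
    n = sum(freq for _, _, freq in classes)
    half = n / 2
    cumulative = 0  # F: cumulative frequency before the current class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= half:
            # l + ((n/2 - F) / f) * h, with h taken as the class width
            return lower + (half - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("no observations")

commute = [(10, 20, 4), (20, 30, 8), (30, 40, 14), (40, 50, 10), (50, 60, 4)]
print(round(grouped_median(commute), 2))  # 35.71
```

Here n/2 = 20, the cumulative frequency first reaches 20 in the 30–40 class (F = 12, f = 14, h = 10), so the estimate is 30 + (8/14) × 10.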
The mode
The mode is the value that appears most often.
Mode
The mode of a dataset is the observation with the highest frequency. A dataset can have one mode (unimodal), two modes (bimodal), or more. If all values appear equally often, the dataset has no mode.
For the data 2, 3, 3, 4, 5, 5, 5, 6, 7, the mode is 5 — it appears three times, more than any other value.
The mode has a natural interpretation that the mean and median lack: it is the value you are most likely to encounter. A shoe store deciding how many pairs of each size to stock cares about the mode — the most popular size — not the mean size (which might be 8.3, a size that doesn't exist).
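A short sketch of finding modes by counting, following the convention stated above that a dataset whose values all occur equally often has no mode. The function name `modes` is an assumption for this example.

```python
from collections import Counter

def modes(values):
    """Return the list of most frequent values (empty if there is no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Several distinct values that all tie: treated here as "no mode".
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return [value for value, c in counts.items() if c == top]

print(modes([2, 3, 3, 4, 5, 5, 5, 6, 7]))         # [5]
print(modes([2, 3, 3, 5, 5, 7]))                  # [3, 5]
print(modes([1, 2, 3, 4, 5]))                     # []
print(modes(["red", "blue", "blue", "green"]))    # ['blue']
```

Counting works for categorical labels as well as numbers, which is why the mode applies to qualitative data.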
Mode from grouped data. For grouped data, the modal class is the class with the highest frequency. If you want a single number instead of an interval, the standard formula is:
\text{Mode} = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h
where l is the lower boundary of the modal class, f_1 is the frequency of the modal class, f_0 is the frequency of the class before it, f_2 is the frequency of the class after it, and h is the class width.
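A sketch of the grouped-mode formula, again using the commute-time table from Example 2 later in this article. The function name and tuple layout are hypothetical. Two choices here are assumptions rather than part of the formula: a missing neighbouring class is treated as having frequency 0, and ties are resolved by taking the first modal class.

```python
def grouped_mode(classes):
    """Mode estimate for classes given as (lower, upper, frequency)."""
    freqs = [freq for _, _, freq in classes]
    i = freqs.index(max(freqs))                      # first modal class if tied
    lower, upper, f1 = classes[i]
    f0 = freqs[i - 1] if i > 0 else 0                # class before (assumed 0 if absent)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0   # class after (assumed 0 if absent)
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 is zero")
    return lower + (f1 - f0) / denominator * (upper - lower)

commute = [(10, 20, 4), (20, 30, 8), (30, 40, 14), (40, 50, 10), (50, 60, 4)]
print(round(grouped_mode(commute), 2))  # 36.0
```

For this table the modal class is 30–40 with f_1 = 14, f_0 = 8, f_2 = 10, so the estimate is 30 + (6/10) × 10 = 36.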
Choosing the right measure
This is the part that textbooks often skip, and it matters more than the formulas.
Use the mean when the data is roughly symmetric and has no extreme outliers. In this case, the mean, median, and mode are all close to each other, and the mean has the best mathematical properties (it uses all the data, it is unique, and it participates in further formulas like variance).
Use the median when the data is skewed or has outliers. Income data is the classic case: a small number of very high incomes pulls the mean up, making it unrepresentative. The median income is almost always a better summary of "what a typical person earns" than the mean income.
Use the mode when the data is categorical (favourite colour, most common shirt size) or when you want to know the most popular value. The mode is the only measure of central tendency that works for qualitative data — you cannot compute the mean of "red, blue, blue, green."
The empirical relationship. For moderately skewed distributions, there is a rough relationship:
\text{Mode} \approx 3 \times \text{Median} - 2 \times \text{Mean}
This is an approximation, not an exact formula, but it gives you a quick sanity check: if you know two of the three measures, you can estimate the third.
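As a quick illustration, the commonly quoted form of this relationship, Mode ≈ 3 × Median − 2 × Mean, can be wrapped in a one-line helper. The function name `estimate_mode` is hypothetical, and the inputs are the mean (4.83) and median (5) from Example 1 later in this article.

```python
def estimate_mode(mean_value, median_value):
    """Rough mode estimate from the empirical relationship (not exact)."""
    return 3 * median_value - 2 * mean_value

estimated = estimate_mode(mean_value=4.83, median_value=5)
print(round(estimated, 2))  # 5.34
```

The estimate of about 5.34 is near the actual mode of 5 in that example, which is the kind of sanity check the relationship is meant for.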
Worked examples
Example 1: Mean, median, and mode of ungrouped data
The number of books read by 12 students during a summer break is:
Find the mean, median, and mode.
Step 1. Compute the mean.
Why: add all values and divide by the count. No value is weighted more than another.
Step 2. Find the median. Sort the data first:
There are 12 values (even), so the median is the average of the 6th and 7th values: (5 + 5)/2 = 5.
Why: with an even number of observations, the "middle" falls between two values. Averaging them is the standard convention.
Step 3. Find the mode. The value 5 appears 3 times; all other values appear once or twice. The mode is 5.
Why: the mode is simply the most frequent value. Here it is unambiguous — 5 occurs more often than any other number.
Step 4. Compare the three measures.
| Measure | Value |
|---|---|
| Mean | 4.83 |
| Median | 5 |
| Mode | 5 |
Result: Mean = 4.83, Median = 5, Mode = 5. The three measures are close, which suggests the data is roughly symmetric, with no extreme outliers dragging the mean away from the middle.
Closeness of the three measures is typical of a symmetric distribution. When you see the mean and median diverge, as in the salary example, that is a signal of skewness, and the median becomes the more reliable summary.
Example 2: Mean from grouped data (using class marks)
The daily commute times (in minutes) for 40 office workers are grouped below.
| Commute time (min) | Frequency (f_i) |
|---|---|
| 10–20 | 4 |
| 20–30 | 8 |
| 30–40 | 14 |
| 40–50 | 10 |
| 50–60 | 4 |
Find the mean commute time.
Step 1. Compute the class mark m_i for each interval.
| Interval | Class mark m_i | Frequency f_i |
|---|---|---|
| 10–20 | 15 | 4 |
| 20–30 | 25 | 8 |
| 30–40 | 35 | 14 |
| 40–50 | 45 | 10 |
| 50–60 | 55 | 4 |
Why: the class mark is the midpoint of the interval — it is the best single-number representative of all values in that interval.
Step 2. Compute f_i \times m_i for each row.
| m_i | f_i | f_i \times m_i |
|---|---|---|
| 15 | 4 | 60 |
| 25 | 8 | 200 |
| 35 | 14 | 490 |
| 45 | 10 | 450 |
| 55 | 4 | 220 |
| Total | 40 | 1420 |
Why: each product f_i \times m_i represents the total contribution of that class to the overall sum. Fourteen workers commuting 35 minutes each contribute 14 \times 35 = 490 total minutes.
Step 3. Divide to get the mean.
\bar{x} = \frac{\sum f_i m_i}{\sum f_i} = \frac{1420}{40} = 35.5
Why: this is just the weighted-mean formula, with frequencies as weights.
Result: The mean commute time is 35.5 minutes.
The mean of 35.5 minutes sits in the modal class (30–40 minutes), where you would expect it for a roughly symmetric distribution. It lies slightly above the class midpoint of 35 because the 40–50 and 50–60 groups are slightly heavier (combined frequency 14) than the 10–20 and 20–30 groups (combined frequency 12).
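The three steps of this example can be reproduced in code. This is a minimal sketch; the function name `grouped_mean` and the `(lower, upper, frequency)` tuple layout are assumptions made for illustration.

```python
def grouped_mean(classes):
    """Approximate mean for classes given as (lower, upper, frequency)."""
    total_frequency = sum(freq for _, _, freq in classes)
    if total_frequency == 0:
        raise ValueError("no observations")
    # Each class contributes frequency * class mark, where the class mark
    # is the midpoint (lower + upper) / 2.
    weighted_sum = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    return weighted_sum / total_frequency

commute = [(10, 20, 4), (20, 30, 8), (30, 40, 14), (40, 50, 10), (50, 60, 4)]
print(grouped_mean(commute))  # 35.5
```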
Common confusions
- "The mean is always the best measure." It is not. The mean is the best measure when the data is symmetric and free of outliers. When data is skewed (as income data almost always is), the median is a better summary. When data is categorical, only the mode makes sense.
- "The median is always one of the data values." Only when n is odd. When n is even, the median is the average of two middle values, which might not be a value that appears in the data. For the data 2, 4, 6, 8, the median is (4 + 6)/2 = 5, and 5 does not appear in the dataset.
- "A dataset always has exactly one mode." Not true. The data 2, 3, 3, 5, 5, 7 is bimodal: it has two modes (3 and 5). The data 1, 2, 3, 4, 5 has no mode at all, because every value appears exactly once.
- "Changing one extreme value doesn't affect the mean much." It can affect it enormously if the change is large. Replace 9 with 900 in the books data and the sum rises from 58 to 949, so the mean jumps from 4.83 to about 79.08, roughly a sixteenfold increase, while the median stays at 5.
- "Mean from grouped data is exact." It is an approximation. By using the class mark as a stand-in for all values in the interval, you are assuming the data is evenly spread within each class. The approximation is usually good, but it is not exact.
Going deeper
If you came here to understand the three measures of central tendency and when to use each one, you have it — you can stop here. The rest is for readers who want the mathematical foundations and the connections to more advanced ideas.
The mean minimises the sum of squared deviations
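There is a deep reason the mean is so important. Among all possible summary values, the mean is the unique number that makes the sum of squared deviations as small as possible:
\sum_{i=1}^{n} (x_i - \bar{x})^2 \leq \sum_{i=1}^{n} (x_i - a)^2 \quad \text{for every real number } a
To see this, expand the right side using a = \bar{x} + (a - \bar{x}):
\sum (x_i - a)^2 = \sum \big[(x_i - \bar{x}) - (a - \bar{x})\big]^2 = \sum (x_i - \bar{x})^2 - 2(a - \bar{x}) \sum (x_i - \bar{x}) + n(a - \bar{x})^2 = \sum (x_i - \bar{x})^2 + n(a - \bar{x})^2
The middle term vanishes because \sum (x_i - \bar{x}) = 0.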
The extra term n(a - \bar{x})^2 is always non-negative and equals zero only when a = \bar{x}. So moving away from the mean in any direction increases the total squared deviation. This property is why the mean is the natural starting point for measures of dispersion: variance is the average of the squared deviations from the mean.
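The identity can be checked numerically. In this sketch, using the salary data from the opening example, the sum of squared deviations about a candidate centre a exceeds the sum about the mean by n(a - \bar{x})^2. The candidate offsets are arbitrary choices for illustration.

```python
from statistics import mean

def sum_squared_deviations(data, a):
    """Sum of (x - a)^2 over the data."""
    return sum((x - a) ** 2 for x in data)

data = [20, 22, 25, 23, 210]  # salary example, in thousands of rupees
n = len(data)
x_bar = mean(data)            # 60
at_mean = sum_squared_deviations(data, x_bar)

candidates = [x_bar - 10, x_bar - 1, x_bar, x_bar + 1, x_bar + 10]
for a in candidates:
    total = sum_squared_deviations(data, a)
    # Columns: candidate, total, excess over the minimum, n * (a - x_bar)^2.
    print(a, total, total - at_mean, n * (a - x_bar) ** 2)
```

The last two columns agree on every row, and the excess is zero only at a = 60.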
The median minimises the sum of absolute deviations
Similarly, the median minimises the sum of absolute deviations \sum |x_i - a|. (When n is even, every value between the two middle observations achieves the same minimum, and the median is the conventional choice.) Squared deviations penalise large errors heavily; absolute deviations penalise each error only in proportion to its size. The mean and median are each "optimal", but for different definitions of what "close to the data" means.
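A brute-force check of this claim on the salary data, as a sketch: the sum of absolute deviations is evaluated on a grid of integer candidate centres (the range 0 to 250 is an arbitrary choice). The second dataset shows what happens with an even number of observations, where the minimum is shared by a whole interval of values.

```python
from statistics import mean, median

def sum_abs_deviations(data, a):
    """Sum of |x - a| over the data."""
    return sum(abs(x - a) for x in data)

data = [20, 22, 25, 23, 210]  # salary example, in thousands of rupees
at_median = sum_abs_deviations(data, median(data))  # a = 23
at_mean = sum_abs_deviations(data, mean(data))      # a = 60

# Search integer candidates from 0 to 250 for the smallest total.
best = min(range(0, 251), key=lambda a: sum_abs_deviations(data, a))

# Even n: every centre between the two middle values gives the same total.
even_data = [2, 4, 6, 8]
even_totals = [sum_abs_deviations(even_data, a) for a in (4, 5, 6)]

print(at_median, at_mean, best)  # 193 300 23
print(even_totals)               # [8, 8, 8]
```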
Relationship between mean, median, and mode in skewed distributions
For a unimodal distribution with moderate skew:
- If the distribution is right-skewed (long tail to the right), then mode < median < mean.
- If the distribution is left-skewed (long tail to the left), then mean < median < mode.
- If the distribution is symmetric, all three coincide.
This ordering is not a theorem (it can fail for unusual distributions) but it holds remarkably often in practice. Income distributions are right-skewed: the mean income is pulled right by high earners, the mode is the most common income (usually lower), and the median sits between them.
The geometric and harmonic means
The arithmetic mean is not the only kind of mean. The geometric mean of positive numbers x_1, x_2, \ldots, x_n is
G = \sqrt[n]{x_1 x_2 \cdots x_n} = (x_1 x_2 \cdots x_n)^{1/n}
It is the right tool when quantities multiply together — for instance, compound growth rates. If an investment grows by 10% one year and 20% the next, the average annual growth rate is not (10 + 20)/2 = 15\% but rather \sqrt{1.10 \times 1.20} - 1 \approx 14.9\%.
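The growth-rate calculation can be sketched as below. The function computes the geometric mean through logarithms, which is one common way to do it; the function and variable names are illustrative only.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Growth of 10% and then 20% corresponds to factors 1.10 and 1.20.
growth_factors = [1.10, 1.20]
average_rate = geometric_mean(growth_factors) - 1

print(f"{average_rate:.2%}")  # 14.89%
```

Applying this average rate for two years reproduces the actual two-year growth factor of 1.10 × 1.20 = 1.32, which the arithmetic average of 15% would overstate.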
The harmonic mean is
H = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}
It is the right tool when you are averaging rates. If you drive 60 km at 40 km/h and then 60 km at 60 km/h, the average speed for the whole journey is not 50 km/h but the harmonic mean 2/(1/40 + 1/60) = 48 km/h.
For any set of positive numbers, H \leq G \leq A, with equality exactly when all the numbers are equal (the AM-GM-HM inequality).
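The driving example and the ordering of the three means can be checked with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). This is a small sketch; the direct distance-over-time calculation is included as an independent check on the harmonic mean.

```python
from statistics import geometric_mean, harmonic_mean, mean

speeds = [40, 60]            # km/h over two equal 60 km legs
H = harmonic_mean(speeds)    # harmonic mean
G = geometric_mean(speeds)   # geometric mean
A = mean(speeds)             # arithmetic mean

# Independent check: total distance divided by total time.
total_distance = 60 + 60           # km
total_time = 60 / 40 + 60 / 60     # hours: 1.5 + 1.0
direct_average_speed = total_distance / total_time

print(round(H, 2), round(G, 2), A)      # 48.0 48.99 50
print(round(direct_average_speed, 2))   # 48.0
```

The harmonic mean matches the true average speed because the two legs cover equal distances; with unequal distances a weighted version would be needed.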
Where this leads next
Once you can summarise the centre of a dataset, the next question is: how spread out is the data around that centre? Two datasets can have the same mean but very different shapes.
- Measures of Dispersion — range, variance, and standard deviation: quantifying how far the data spreads from its centre.
- Quartiles and Percentiles — dividing the data into quarters and hundredths for a finer summary than just the median.
- Correlation — when you have two variables and want to know whether they are related.
- Data Organization — the prerequisite: how to turn raw data into frequency tables and graphs before computing any summary.