Descriptive Statistics: Mean, Median, Mode, Standard Deviation, and More
Descriptive statistics summarise the key features of a dataset so you can understand its centre, spread, and shape without examining every single data point. Whether you are grading student scores, analysing survey responses, or tracking manufacturing tolerances, a handful of measures tell most of the story. This guide walks through the most important statistics, their formulas, and how to interpret them in practice.
Measures of Central Tendency
Central tendency describes the "middle" of your data. The three most common measures are the mean, median, and mode.
| Measure | Formula | Best used when |
|---|---|---|
| Mean | x̄ = (x₁ + x₂ + … + xₙ) / n | Data are symmetric with no extreme outliers |
| Median | Middle value when sorted; average of two middle values for even n | Data are skewed or contain outliers |
| Mode | Most frequently occurring value | Categorical data or identifying peaks in a distribution |
Use the Mean, Median, and Mode Calculator to find all three at once for any dataset, and the Average of Numbers Calculator for a quick arithmetic mean.
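As a rough illustration of the table above, the three measures can be computed with Python's standard `statistics` module. This is a minimal sketch: the dataset and variable names are hypothetical and chosen so that one extreme value pulls the mean away from the median. It assumes Python 3.8 or later, where `multimode` is available.

```python
from statistics import mean, median, multimode

# Hypothetical dataset with one extreme value (not taken from the guide).
values = [2, 3, 3, 4, 100]

centre_mean = mean(values)        # sum of values divided by their count
centre_median = median(values)    # middle value after sorting
centre_modes = multimode(values)  # list of every most-frequent value

print(centre_mean)    # 22.4
print(centre_median)  # 3
print(centre_modes)   # [3]
```

The single value of 100 lifts the mean to 22.4, while the median and mode stay at 3, which is why the median is preferred for skewed data.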
Measures of Spread: Variance and Standard Deviation
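Knowing where the centre is is not enough; you also need to know how spread out the values are. Variance and standard deviation both measure dispersion: variance is the average squared deviation from the mean, and standard deviation is its square root, expressed in the same units as the data.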
| Statistic | Population formula | Sample formula |
|---|---|---|
| Variance | σ² = Σ(xᵢ − μ)² / N | s² = Σ(xᵢ − x̄)² / (n − 1) |
| Standard deviation | σ = √σ² | s = √s² |
The sample formulas divide by n − 1 (Bessel's correction) to produce an unbiased estimate of the population variance when you are working with a sample rather than an entire population. The Standard Deviation Calculator and the Variance Calculator handle both population and sample cases.
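The two columns of the table differ only in the divisor, which a short from-scratch sketch makes visible. The function name `variance`, its `sample` flag, and the dataset are assumptions made for illustration, not part of any calculator mentioned here.

```python
import math


def variance(data, sample=True):
    """Return the sample (divide by n - 1) or population (divide by N) variance."""
    n = len(data)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough values to compute a variance")
    centre = sum(data) / n
    squared_deviations = sum((x - centre) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)


# Hypothetical dataset: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = variance(data, sample=False)  # 32 / 8 = 4.0
sample_variance = variance(data, sample=True)       # 32 / 7, about 4.57
population_sd = math.sqrt(population_variance)      # 2.0
sample_sd = math.sqrt(sample_variance)

print(population_variance, sample_variance)
print(population_sd, sample_sd)
```

The sample figures are always slightly larger than the population figures for the same data, because the same sum is divided by a smaller number.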
Z-Scores and Percentile Ranks
A z-score tells you how many standard deviations a single observation sits from the mean:
z = (x − μ) / σ
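A z-score of 2 means the value is two standard deviations above the mean: unusual but not extreme (in a normal distribution, roughly 2.3 % of values lie above z = 2). Negative z-scores indicate values below the mean. When you only have a sample, substitute the sample mean x̄ and sample standard deviation s for μ and σ. The Z-Score Calculator converts any raw value to a z-score instantly.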
A percentile rank answers a complementary question: what percentage of the dataset falls at or below a given value? For example, scoring in the 85th percentile means your score is equal to or higher than 85 % of the scores in the group. Use the Percentile Rank Calculator to find this without sorting data by hand.
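Both ideas fit in a few lines of Python. The sketch below uses the "at or below" definition of percentile rank given above; other definitions exist, so results can differ from a particular calculator. The function names and the data are hypothetical.

```python
def z_score(x, centre, sd):
    """Number of standard deviations that x lies from the centre."""
    return (x - centre) / sd


def percentile_rank(data, value):
    """Percentage of values in data that are at or below value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)


# Hypothetical inputs: a value of 130 on a scale with mean 100 and SD 15.
z = z_score(130, 100, 15)
print(z)  # 2.0

# Hypothetical heights in centimetres; 4 of the 5 values are at or below 170.
heights = [150, 160, 165, 170, 180]
rank = percentile_rank(heights, 170)
print(rank)  # 80.0
```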
Worked Example
Suppose a class of five students scored: 72, 85, 90, 90, 68.
- Mean: (72 + 85 + 90 + 90 + 68) / 5 = 405 / 5 = 81
- Median: Sorted: 68, 72, 85, 90, 90 → middle value = 85
- Mode: 90 (appears twice; every other score appears once)
- Sample variance: Deviations from 81: −9, 4, 9, 9, −13. Squared: 81, 16, 81, 81, 169. Sum = 428. Divided by n − 1 = 4: 428 / 4 = 107
- Sample SD: √107 ≈ 10.34
- Z-score for 68: (68 − 81) / 10.34 ≈ −1.26
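The hand calculations above can be cross-checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample (n − 1) formulas. The variable names are illustrative only.

```python
from statistics import mean, median, mode, stdev, variance

# Scores from the worked example.
scores = [72, 85, 90, 90, 68]

score_mean = mean(scores)          # 405 / 5
score_median = median(scores)      # middle of the sorted list
score_mode = mode(scores)          # most frequent value
score_variance = variance(scores)  # sample variance, divides by n - 1
score_sd = stdev(scores)           # square root of the sample variance
z_for_68 = (68 - score_mean) / score_sd

print(score_mean, score_median, score_mode)  # 81 85 90
print(score_variance)                        # 107
print(round(score_sd, 2))                    # 10.34
print(round(z_for_68, 2))                    # -1.26
```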
Inferential Statistics: Confidence Intervals and Margin of Error
Descriptive statistics summarise what you observed. Inferential statistics generalise to the broader population. A confidence interval gives a range of plausible values for a population parameter. At 95 % confidence:
CI = x̄ ± z* × (s / √n)
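where z* is the critical value for the chosen confidence level (z* = 1.96 for 95 % confidence). The half-width of the interval, z* × (s / √n), is the margin of error. For small samples, a critical value from the t-distribution is commonly used in place of z*. The Confidence Interval Calculator and the Margin of Error Calculator apply these formulas automatically. To decide how many participants to study before collecting data, use the Sample Size Calculator.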
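The interval formula translates directly into code. The sketch below is one possible reading of it: the function name, its parameters, and the summary figures (mean 81, standard deviation 10, n = 100) are hypothetical, and the z-based critical value is assumed to be appropriate for the sample size.

```python
import math


def confidence_interval(sample_mean, sample_sd, n, z_star=1.96):
    """Return (lower, upper, margin) for x̄ ± z* × (s / √n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    margin = z_star * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin


# Hypothetical summary statistics: mean 81, SD 10, 100 observations.
lower, upper, margin = confidence_interval(81, 10, 100)

# Standard error is 10 / 10 = 1, so the margin is about 1.96
# and the interval is about 79.04 to 82.96.
print(round(lower, 2), round(upper, 2), round(margin, 2))
```

Because the margin shrinks with √n, quadrupling the sample size only halves the width of the interval.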
Correlation
The Pearson correlation coefficient r measures the strength and direction of a linear relationship between two variables, ranging from −1 (perfect negative) to +1 (perfect positive). Values near 0 indicate little or no linear relationship, although a strong non-linear relationship can still be present. Correlation does not imply causation: two variables can move together for incidental reasons. Find r quickly with the Correlation Coefficient Calculator.
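The guide does not spell out the formula for r, so the sketch below assumes the standard definition: the sum of products of paired deviations, divided by the square root of the product of the two sums of squared deviations. The function name and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return co_deviation / math.sqrt(ss_x * ss_y)


# Hypothetical data: y is exactly 2x, so the relationship is perfectly linear.
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 6, 8, 10]

r_positive = pearson_r(hours, marks)        # perfect positive relationship
r_negative = pearson_r(hours, marks[::-1])  # same values, reversed order

print(r_positive, r_negative)  # 1.0 -1.0
```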
Counting Techniques: Permutations and Combinations
Statistics often involves counting arrangements or selections.
- Permutations (nPr): ordered arrangements, where nPr = n! / (n − r)!. Use the Permutation Calculator (nPr).
- Combinations (nCr): unordered selections, where nCr = n! / [r! × (n − r)!]. Use the Combination Calculator (nCr).
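Both counts are available in Python's standard `math` module as `math.perm` and `math.comb` (Python 3.8 or later). The sketch below also evaluates the factorial formulas from the list above to show that they agree; the values n = 5 and r = 2 are arbitrary.

```python
import math

n, r = 5, 2  # arbitrary example: choose 2 items from 5

# Built-in helpers.
ordered = math.perm(n, r)    # arrangements where order matters
unordered = math.comb(n, r)  # selections where order does not matter

# The same counts from the factorial formulas.
ordered_by_formula = math.factorial(n) // math.factorial(n - r)
unordered_by_formula = math.factorial(n) // (
    math.factorial(r) * math.factorial(n - r)
)

print(ordered, unordered)  # 20 10
```

Each unordered pair can be arranged in r! = 2 ways, which is why the permutation count is twice the combination count here.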
Common Mistakes
- Confusing population and sample formulas. Dividing by N instead of n − 1 tends to underestimate the population variance when you are working with a sample, and the bias is largest for small samples.
- Using the mean with skewed data. Income and home prices are heavily right-skewed; the median is usually the better summary.
- Ignoring outliers entirely. One extreme value can drag the mean far from the bulk of the data. Always check the range and z-scores.
- Mistaking correlation for causation. A strong r value alone cannot establish that one variable causes changes in another.
What is the difference between descriptive and inferential statistics?
Descriptive statistics (mean, median, standard deviation) summarise the data you actually have. Inferential statistics (confidence intervals, hypothesis tests) use that data to draw conclusions about a larger population.
When should I use the median instead of the mean?
Use the median whenever the data are skewed or contain outliers. In symmetric distributions without extreme values, the mean and median are close and either works well.
What does a standard deviation of zero mean?
Every value in the dataset is identical. There is no spread at all.