Measures of Central Tendency
Central tendency identifies a single value that represents the centre or typical value of a dataset. It answers the question: "What is the average or most representative value?" The three main measures are mean, median, and mode, each with specific uses and limitations.
Arithmetic Mean
The arithmetic mean (average) is the sum of all values divided by the number of values: x̄ = Σx/n. For grouped data (frequency distribution): x̄ = Σfx/Σf, where f is the class frequency and x is the class midpoint. Properties: it uses all data values, is affected by extreme values (outliers), is unique for a dataset, and the sum of deviations from the mean equals zero. The mean is the most widely used measure in business: average salary, average sales, average cost. Shortcut method: x̄ = A + Σfd/n (where A is an assumed mean, d = x − A, and n = Σf). Step-deviation method: x̄ = A + (Σfu/n) × h (where u = d/h and h is the common class width).
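The direct, shortcut, and step-deviation formulas are algebraically equivalent, and a short Python sketch can confirm this. The midpoints and frequencies are taken from the marks table in the grouped-data worked example in these notes; the assumed mean A = 50 is an arbitrary choice made only for this illustration.

```python
# Class midpoints and frequencies (marks table from the grouped-data example).
midpoints = [10, 30, 50, 70, 90]
freqs = [5, 10, 18, 12, 5]

n = sum(freqs)  # n = Σf

# Direct method: x̄ = Σfx / Σf
direct = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Shortcut method: x̄ = A + Σfd / n, with d = x − A
A = 50  # assumed mean (hypothetical choice; any convenient value works)
shortcut = A + sum(f * (x - A) for f, x in zip(freqs, midpoints)) / n

# Step-deviation method: x̄ = A + (Σfu / n) × h, with u = (x − A) / h
h = 20  # common class width
step_dev = A + (sum(f * ((x - A) / h) for f, x in zip(freqs, midpoints)) / n) * h

print(direct, shortcut, step_dev)  # all three agree up to floating-point rounding
```

The shortcut and step-deviation forms exist mainly to keep hand calculation small; in code the direct method is usually the simplest choice.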
Weighted Mean
The weighted mean assigns different weights (importance) to values: x̄w = Σwx/Σw. Used when items have unequal importance — calculating GPA (credits as weights), price indices (quantities as weights), and composite scores (marks with different maximum scores). Example: if a student scores 80 in a 3-credit course and 90 in a 2-credit course, the weighted mean is (80×3 + 90×2)/(3+2) = 84, not the simple average of 85.
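The GPA example can be reproduced with a small helper. This is a minimal sketch; the function name `weighted_mean` is introduced here for illustration only.

```python
def weighted_mean(values, weights):
    """Return Σwx / Σw for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


scores = [80, 90]   # marks in the two courses
credits = [3, 2]    # credit hours used as weights

wm = weighted_mean(scores, credits)   # (80×3 + 90×2) / (3 + 2)
simple = sum(scores) / len(scores)    # unweighted average for comparison

print(wm, simple)  # 84.0 85.0
```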
Median
The median is the middle value when data is arranged in ascending or descending order. For ungrouped data: if n is odd, median = middle value; if n is even, median = average of two middle values. Position = (n+1)/2. For grouped data: Median = L + [(n/2 − cf)/f] × h, where L is lower boundary of the median class, cf is cumulative frequency before the median class, f is frequency of the median class, and h is class width. The median is not affected by extreme values, making it better than the mean for skewed data (e.g., income distribution).
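For ungrouped data, the odd/even rule can be sketched as follows. The datasets are made up for illustration, and the last one shows how a single extreme value moves the mean but barely moves the median.

```python
import statistics


def median_ungrouped(data):
    """Median of raw data: middle value (odd n) or mean of the two middle values (even n)."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("data must not be empty")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # position (n + 1) / 2
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middle values


odd = [7, 3, 9, 1, 5]             # hypothetical data, n = 5
even = [7, 3, 9, 1]               # hypothetical data, n = 4
with_outlier = [7, 3, 9, 1, 500]  # one extreme value

print(median_ungrouped(odd), median_ungrouped(even))
print(median_ungrouped(with_outlier), statistics.mean(with_outlier))
```

Python's standard library offers `statistics.median`, which follows the same rule and can be used to cross-check the helper.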
Mode
The mode is the most frequently occurring value. A dataset can be unimodal (one mode), bimodal (two modes), multimodal (more than two), or have no mode. For grouped data: Mode = L + [(f₁ − f₀)/(2f₁ − f₀ − f₂)] × h, where L is the lower boundary of the modal class, f₁ is the frequency of the modal class, f₀ is the frequency of the class preceding it, f₂ is the frequency of the class following it, and h is the class width. The mode is used for categorical data (most popular product, most common shoe size) and is not affected by extreme values.
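The unimodal, bimodal, and no-mode cases for ungrouped data can be illustrated with a frequency count. This sketch assumes the common textbook convention that a dataset in which no value repeats has no mode; the helper name `modes` and the sample data are hypothetical.

```python
from collections import Counter


def modes(data):
    """Return the list of modes; an empty list means no value repeats."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # assumed convention: no repeated value means "no mode"
    return [value for value, count in counts.items() if count == top]


shoe_sizes = [7, 8, 8, 9, 8, 10]     # unimodal
two_peaks = [1, 1, 2, 3, 3]          # bimodal
all_distinct = [1, 2, 3]             # no mode
colours = ["red", "blue", "red"]     # categorical data works too

print(modes(shoe_sizes), modes(two_peaks), modes(all_distinct), modes(colours))
```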
Geometric Mean
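The geometric mean (GM) is the nth root of the product of n values: GM = (x₁ × x₂ × ... × xₙ)^(1/n). Using logarithms: log GM = Σlog x / n. The GM is used for averaging rates of change, growth rates, and ratios. Example: if an investment grows 10%, 20%, and 30% in three successive years, the average annual growth rate comes from the geometric mean of the growth factors 1.10, 1.20, and 1.30: (1.10 × 1.20 × 1.30)^(1/3) ≈ 1.1972, i.e. about 19.72% per year, slightly below the arithmetic mean of 20%. For positive values GM ≤ AM always, with equality only when all values are equal. The GM is not meaningful when any value is zero or negative: a single zero forces the GM to zero, and the logarithmic form is undefined for non-positive values.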
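The growth-rate example can be worked through in Python using the logarithmic form. Note that the geometric mean is applied to growth factors (1 + rate), not to the percentages themselves; the helper name `geometric_mean` is defined here for illustration.

```python
import math


def geometric_mean(values):
    """GM via logarithms: exp(Σ log x / n). Requires strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


growth_rates = [0.10, 0.20, 0.30]          # 10%, 20%, 30%
factors = [1 + r for r in growth_rates]    # 1.10, 1.20, 1.30

avg_growth = geometric_mean(factors) - 1   # average annual growth rate
arithmetic = sum(growth_rates) / len(growth_rates)

# Compounding the average rate for three years reproduces the actual total growth.
total_actual = math.prod(factors)
total_from_gm = (1 + avg_growth) ** 3

print(round(avg_growth * 100, 2), round(arithmetic * 100, 2))
```

The compounding check is the reason the geometric mean is preferred here: three years at the arithmetic mean of 20% would overstate the final value.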
Harmonic Mean
The harmonic mean (HM) is the reciprocal of the arithmetic mean of reciprocals: HM = n / Σ(1/x). It is used for averaging rates when the quantity in the numerator is held fixed: average speed (when equal distances are covered at each speed) and average price per unit (when equal amounts of money are spent at each price). For positive values HM ≤ GM ≤ AM always. Relationship: AM × HM = GM² (for two values).
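A short sketch of the average-speed case, with made-up speeds of 40 and 60 km/h over two equal legs of 120 km each. It also checks the ordering HM ≤ GM ≤ AM and the two-value identity AM × HM = GM².

```python
import math


def harmonic_mean(values):
    """HM = n / Σ(1/x). Requires strictly positive values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive values")
    return len(values) / sum(1 / v for v in values)


speeds = [40, 60]   # km/h on two legs of equal length (hypothetical trip)
leg = 120           # km per leg (hypothetical)

hm = harmonic_mean(speeds)
am = sum(speeds) / len(speeds)
gm = math.sqrt(speeds[0] * speeds[1])

# True average speed = total distance / total time.
total_time = sum(leg / s for s in speeds)
true_average = (leg * len(speeds)) / total_time

print(hm, true_average, am)
```

The harmonic mean matches the true average speed, while the arithmetic mean of 50 km/h overstates it because more time is spent on the slower leg.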
Relationship Between Mean, Median, and Mode
In a symmetric distribution: Mean = Median = Mode. In a positively skewed distribution (tail to right): Mode < Median < Mean. In a negatively skewed distribution (tail to left): Mean < Median < Mode. Empirical relationship: Mode ≈ 3 Median − 2 Mean (approximate, for moderately skewed data). This relationship helps estimate one measure from the other two.
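The ordering of the three measures under skewness can be seen on two small made-up datasets, one with a long right tail and one with a long left tail. This is only an illustration: the orderings are typical for unimodal skewed data, not guaranteed for every dataset.

```python
import statistics


def summarise(data):
    """Return (mean, median, mode) for a dataset with a single mode."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)


right_skewed = [1, 2, 2, 3, 4, 5, 20]       # hypothetical data, tail to the right
left_skewed = [1, 16, 17, 18, 19, 19, 20]   # hypothetical data, tail to the left

r_mean, r_median, r_mode = summarise(right_skewed)
l_mean, l_median, l_mode = summarise(left_skewed)

print(r_mode < r_median < r_mean)   # positive skew: Mode < Median < Mean
print(l_mean < l_median < l_mode)   # negative skew: Mean < Median < Mode
```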
Summary
Central tendency — mean, median, mode, geometric mean, and harmonic mean — summarises data with a single representative value. Choosing the right measure depends on data type, distribution shape, and the business question being answered. The mean is most common but the median is better for skewed data, and the mode for categorical data.
Worked Example: Arithmetic Mean
The monthly salaries (in Rs) of 8 employees are: 25000, 30000, 28000, 35000, 40000, 32000, 27000, 33000. Calculate the arithmetic mean.
Solution: x̄ = Σx/n = (25000 + 30000 + 28000 + 35000 + 40000 + 32000 + 27000 + 33000) / 8 = 250000 / 8 = Rs 31,250.
Interpretation: The average monthly salary is Rs 31,250. This single value represents the central tendency of the salary distribution and can be used for budgeting, comparison with industry averages, and salary benchmarking.
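The same calculation in Python, as a quick check of the arithmetic. The standard-library `statistics.mean` gives the same result as the formula Σx/n.

```python
import statistics

salaries = [25000, 30000, 28000, 35000, 40000, 32000, 27000, 33000]  # Rs per month

total = sum(salaries)              # Σx
n = len(salaries)                  # number of employees
mean_salary = total / n            # x̄ = Σx / n

print(total, n, mean_salary)       # 250000 8 31250.0
print(statistics.mean(salaries) == mean_salary)
```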
Worked Example: Grouped Data Mean
The marks obtained by 50 BBS students in statistics are given below:
| Marks | Frequency (f) | Midpoint (x) | fx |
|---|---|---|---|
| 0–20 | 5 | 10 | 50 |
| 20–40 | 10 | 30 | 300 |
| 40–60 | 18 | 50 | 900 |
| 60–80 | 12 | 70 | 840 |
| 80–100 | 5 | 90 | 450 |
| Total | Σf = 50 | | Σfx = 2540 |
Solution: x̄ = Σfx / Σf = 2540 / 50 = 50.8 marks.
Interpretation: The average mark scored by BBS students in this exam is 50.8 out of 100. This indicates the overall performance level and helps faculty evaluate whether the exam was appropriately challenging.
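The table calculation can be sketched in Python, starting from the class limits so that the midpoints are derived rather than typed in. Variable names are chosen for this illustration.

```python
# Class limits and frequencies from the marks table.
classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [5, 10, 18, 12, 5]

# Midpoint of each class: (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

sum_f = sum(freqs)                                       # Σf
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))    # Σfx
grouped_mean = sum_fx / sum_f                            # x̄ = Σfx / Σf

print(midpoints)
print(sum_f, sum_fx, grouped_mean)
```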
Worked Example: Median for Grouped Data
Using the same marks data above, find the median.
Solution: n/2 = 50/2 = 25. The cumulative frequencies are: 5, 15, 33, 45, 50. The median class is 40–60, the first class whose cumulative frequency (33) exceeds 25. So L = 40, cf = 15, f = 18, and h = 20. Using the formula:
Median = L + [(n/2 − cf) / f] × h = 40 + [(25 − 15) / 18] × 20 = 40 + [10/18] × 20 = 40 + 11.11 = 51.11 marks
The median (51.11) is close to the mean (50.8), suggesting the distribution is approximately symmetric. If they differed significantly, it would indicate skewness.
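The steps of the grouped-median calculation (find n/2, locate the median class from cumulative frequencies, then interpolate) can be sketched as a function. The name `grouped_median` is hypothetical, and the sketch assumes continuous classes given as (lower boundary, upper boundary) pairs.

```python
def grouped_median(classes, freqs):
    """Median = L + [(n/2 − cf) / f] × h for a frequency distribution."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequencies must not all be zero")
    half = n / 2
    cf = 0  # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, freqs):
        if cf + f >= half:            # first class whose cumulative frequency reaches n/2
            h = upper - lower         # class width
            return lower + (half - cf) / f * h
        cf += f
    raise ValueError("median class not found")


classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [5, 10, 18, 12, 5]

median_marks = grouped_median(classes, freqs)
print(round(median_marks, 2))  # 51.11
```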
Worked Example: Mode for Grouped Data
Using the same data: The modal class is 40–60 (highest frequency = 18). f₁ = 18, f₀ = 10, f₂ = 12.
Mode = L + [(f₁ − f₀) / (2f₁ − f₀ − f₂)] × h = 40 + [(18−10) / (36−10−12)] × 20 = 40 + [8/14] × 20 = 40 + 11.43 = 51.43 marks
All three measures (Mean = 50.8, Median = 51.11, Mode = 51.43) are close together, so the distribution is nearly symmetric. The ordering Mean < Median < Mode points to a very slight negative skew.
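The grouped-mode formula can be sketched in the same way, followed by a check of the empirical relationship Mode ≈ 3 Median − 2 Mean using the values computed in these examples. The function name `grouped_mode` is hypothetical. Two assumptions are made for edge cases the text does not cover: a missing neighbouring class is treated as having frequency 0, and if several classes tie for the highest frequency the first one is used.

```python
def grouped_mode(classes, freqs):
    """Mode = L + [(f1 − f0) / (2·f1 − f0 − f2)] × h for a frequency distribution."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])   # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                    # assumed 0 if no preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0       # assumed 0 if no following class
    lower, upper = classes[i]
    h = upper - lower                                    # class width
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: modal class equals both neighbours")
    return lower + (f1 - f0) / denominator * h


classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
freqs = [5, 10, 18, 12, 5]

mode_marks = grouped_mode(classes, freqs)

# Empirical estimate from the mean (50.8) and median (51.11) found earlier.
empirical_mode = 3 * 51.11 - 2 * 50.8

print(round(mode_marks, 2), round(empirical_mode, 2))  # 51.43 51.73
```

The empirical estimate differs from the formula value by about 0.3 marks, which is consistent with the relationship being only approximate.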
Comparison Table: When to Use Each Measure
| Measure | Best Used When | Advantages | Disadvantages |
|---|---|---|---|
| Mean | Data is symmetric, no outliers | Uses all values, algebraically tractable, basis for further statistics | Affected by extreme values |
| Median | Data is skewed or has outliers (e.g., income data) | Not affected by extremes, easy to understand | Depends only on the middle position, not on the size of other values; not suited to further algebraic treatment |
| Mode | Categorical data, finding most popular item | Only measure for nominal data, not affected by extremes | May not exist or may be multiple, ignores most values |
| Geometric Mean | Growth rates, ratios, percentages | Accounts for compounding, less affected by extremes | Cannot handle zero or negative values |
| Harmonic Mean | Averaging rates (speed, price per unit) | Gives proper weight to rates | Strongly influenced by small values, cannot handle zero |
Exam Tips for Central Tendency
Tip 1: Always check whether the question calls for the ungrouped or the grouped data formula, because they are different.
Tip 2: For grouped data, compute class midpoints correctly: midpoint = (lower limit + upper limit) / 2.
Tip 3: When comparing mean, median, and mode, comment on skewness.
Tip 4: If the question gives growth rates or percentage changes, use the geometric mean of the growth factors (not the arithmetic mean).
Tip 5: Show all steps clearly with formulas, because partial marks are given for correct working.
Tip 6: The empirical relationship Mode ≈ 3 Median − 2 Mean is commonly asked.
Tip 7: In short-answer questions, always state which measure is most appropriate and why.