Sampling and Hypothesis Testing
Sampling selects a subset from a population to draw conclusions about the whole. Hypothesis testing uses sample data to test claims about population parameters. Together, they form the core of inferential statistics — making decisions under uncertainty.
Why Sample
A census is often impractical because of cost, time, destructive testing (testing a light bulb's lifespan destroys the bulb), or an effectively infinite population. Sampling is faster and cheaper, and it can even be more accurate, because a smaller, well-managed data set invites fewer non-sampling errors such as recording mistakes. The key requirement: the sample must be representative of the population.
Sampling Methods
Probability sampling gives every unit a known, non-zero chance of selection. The main methods are simple random (lottery, random number tables), systematic (every kth item after a random start), stratified (divide the population into groups and sample from each, which ensures representation), and cluster (randomly select entire clusters and include everyone in them, which is geographically convenient). Non-probability sampling includes convenience, judgement, quota, and snowball methods. Probability sampling allows the sampling error to be calculated; non-probability sampling does not.
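The four probability methods can be sketched in a few lines of Python. This is only an illustration: the population of IDs 1 to 100, the "urban"/"rural" strata, the "ward" clusters, and all function names are hypothetical, chosen to make the mechanics visible.

```python
import random

def simple_random_sample(population, n, rng):
    """Every item has an equal chance; drawn without replacement."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th item."""
    k = len(population) // n          # sampling interval
    start = rng.randrange(k)          # random start inside the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Sample the same fraction from every stratum (proportional allocation)."""
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)                      # fixed seed so the run is repeatable
population = list(range(1, 101))             # hypothetical unit IDs 1..100
strata = {"urban": list(range(1, 61)), "rural": list(range(61, 101))}
clusters = {f"ward_{i}": list(range(i * 10 + 1, i * 10 + 11)) for i in range(10)}

srs = simple_random_sample(population, 10, rng)
sys_s = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 0.10, rng)   # 6 urban + 4 rural
clus = cluster_sample(clusters, 2, rng)        # 2 wards of 10 members each
```

Note how the stratified sample always contains units from both strata, while the cluster sample contains only members of the two chosen wards.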
Sampling Distribution
If we draw many samples of size n from a population, their means form a sampling distribution. The mean of the sample means equals the population mean (μ). The standard deviation of the sample means is σ/√n, called the standard error. Central Limit Theorem: whatever the shape of the population (provided its variance is finite), the sampling distribution of the mean approaches normal as n increases; n ≥ 30 is the usual rule of thumb, though heavily skewed populations may need more. This is why the normal distribution is so important.
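A small simulation can make these properties concrete. The sketch below assumes a deliberately skewed, hypothetical population (exponential with mean 10) and checks that the sample means centre on μ with a spread close to σ/√n. The population, sample size, and number of repetitions are illustrative choices, not values from the text.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population: exponential with mean 10.
population = [rng.expovariate(1 / 10) for _ in range(50_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 36
sample_means = [
    statistics.fmean(rng.choices(population, k=n))  # one sample of size n, with replacement
    for _ in range(3_000)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to mu
observed_se = statistics.stdev(sample_means)     # spread of the sample means
theoretical_se = sigma / n ** 0.5                # sigma / sqrt(n)

print(f"mu = {mu:.2f}, mean of sample means = {mean_of_means:.2f}")
print(f"observed SE = {observed_se:.3f}, sigma/sqrt(n) = {theoretical_se:.3f}")
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the population itself is strongly right-skewed.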
Estimation
Point estimation: a single value (the sample mean x̄ estimates μ). Interval estimation: a range with a stated confidence level. Confidence interval for the mean: x̄ ± Z(σ/√n), where Z depends on the confidence level (Z=1.96 for 95%, Z=2.576 for 99%). When σ is unknown: x̄ ± t(s/√n), with t taken at n−1 degrees of freedom. Wider intervals give more confidence but less precision, so there is always a trade-off.
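The Z-based interval can be computed with Python's standard library alone. The function name and the inputs (x̄ = 50, σ = 10, n = 100) below are hypothetical. The t-based interval is omitted because the standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(mean, sd, n, confidence=0.95):
    """Return (lower, upper) for mean ± Z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin

# Hypothetical inputs: sample mean 50, sigma 10, n = 100 (standard error = 1).
low95, high95 = z_confidence_interval(50, 10, 100, 0.95)
low99, high99 = z_confidence_interval(50, 10, 100, 0.99)
print(f"95% CI: {low95:.2f} to {high95:.2f}")  # 95% CI: 48.04 to 51.96
print(f"99% CI: {low99:.2f} to {high99:.2f}")  # 99% CI: 47.42 to 52.58
```

The 99% interval is wider than the 95% interval for the same data, which is the confidence versus precision trade-off in numbers.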
Hypothesis Testing Procedure
Step 1: State H₀ (null: no effect, e.g., μ = 50) and H₁ (alternative: μ ≠ 50, μ > 50, or μ < 50).
Step 2: Choose the significance level α (usually 0.05 or 0.01).
Step 3: Calculate the test statistic: Z = (x̄ − μ₀)/(σ/√n), or t = (x̄ − μ₀)/(s/√n) when σ is unknown.
Step 4: Find the critical value or the p-value.
Step 5: Reject H₀ if the test statistic falls in the rejection region (for a two-tail test, |test statistic| > critical value), or equivalently if p-value < α.
Step 6: State the conclusion in context.
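Steps 3 to 5 can be sketched as a single function for the Z case. The function name `z_test`, its `tail` argument, and the example data (H₀: μ = 50, x̄ = 52, σ = 8, n = 64) are hypothetical. Steps 1, 2 and 6 remain the analyst's job.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sd, n, alpha=0.05, tail="two"):
    """One-sample Z test; tail is 'two', 'right' (H1: mu > mu0) or 'left' (H1: mu < mu0)."""
    std_normal = NormalDist()
    z = (sample_mean - mu0) / (sd / sqrt(n))        # Step 3: test statistic
    if tail == "two":                               # Step 4: p-value for the chosen H1
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z, p_value, p_value < alpha              # Step 5: reject H0 when p < alpha

# Hypothetical data: H0: mu = 50, sample mean 52, sigma 8, n = 64.
z, p, reject = z_test(52, 50, 8, 64, alpha=0.05, tail="two")
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```

Here the standard error is 8/√64 = 1, so z = 2.0, which exceeds 1.96 and leads to rejection at the 5% level.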
Types of Errors
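Type I error: rejecting H₀ when it is true (false positive), e.g., concluding a drug works when it doesn't; its probability is α. Type II error: not rejecting H₀ when it is false (false negative), e.g., concluding a drug doesn't work when it does; its probability is β. Power = 1−β. For a fixed sample size, decreasing α increases β. Increasing the sample size reduces β at a given α, which makes it possible to lower both error risks together.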
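The trade-off can be illustrated numerically. For a right-tailed Z test with true mean μ₁ > μ₀, a standard result is power = 1 − Φ(z_α − (μ₁ − μ₀)/(σ/√n)). The sketch below applies it to a hypothetical scenario (μ₀ = 60, true mean 63, σ = 12); the function name and all numbers are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_right_tail(mu0, mu1, sigma, n, alpha):
    """Power of a right-tailed Z test when the true mean is mu1 (> mu0)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    shift = (mu1 - mu0) / (sigma / sqrt(n))   # true effect in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)

# Hypothetical scenario: mu0 = 60, true mean 63, sigma = 12.
power_05 = power_right_tail(60, 63, 12, 36, alpha=0.05)      # baseline
power_01 = power_right_tail(60, 63, 12, 36, alpha=0.01)      # stricter alpha, same n
power_big_n = power_right_tail(60, 63, 12, 100, alpha=0.05)  # same alpha, larger n

for label, pw in [("alpha=0.05, n=36", power_05),
                  ("alpha=0.01, n=36", power_01),
                  ("alpha=0.05, n=100", power_big_n)]:
    print(f"{label}: power = {pw:.3f}, beta = {1 - pw:.3f}")
```

Tightening α from 0.05 to 0.01 lowers power (raises β), while raising n from 36 to 100 raises power at the same α.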
One-Tail and Two-Tail Tests
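Two-tail (H₁: μ≠μ₀): rejection region in both tails. Critical values: Z = ±1.96 at 5%. Use when a difference in either direction matters. One-tail (H₁: μ>μ₀ or μ<μ₀): rejection region in one tail. Critical value: Z = 1.645 (or −1.645 for a left-tail test) at 5%. A one-tail test is more powerful for detecting an effect in the specified direction, but it cannot detect an effect in the opposite direction, so the direction must be chosen before looking at the data.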
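The two critical values come from the same standard normal curve; the only difference is whether α is split across two tails or placed in one. A brief check using Python's standard library (variable names are illustrative):

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05

two_tail_crit = std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
one_tail_crit = std_normal.inv_cdf(1 - alpha)      # all of alpha in one tail

print(f"two-tail: ±{two_tail_crit:.3f}")  # two-tail: ±1.960
print(f"one-tail: {one_tail_crit:.3f}")   # one-tail: 1.645
```

Because 1.645 < 1.96, a result in the hypothesised direction clears the one-tail threshold more easily, which is where the extra power comes from.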
Summary
Sampling methods determine data collection. Estimation provides intervals for parameters. Hypothesis testing enables formal statistical decisions. Understanding Type I/II errors, significance levels, and test procedures equips students to critically evaluate research and make evidence-based business decisions.
Worked Example: Confidence Interval
A sample of 64 BBS graduates shows an average starting salary of Rs 28,000 with a standard deviation of Rs 4,000. Construct a 95% confidence interval for the population mean starting salary.
Solution: n = 64, x̄ = 28000, s = 4000, Z = 1.96 (for 95%; Z is used with s because n = 64 is a large sample)
Standard Error = s/√n = 4000/√64 = 4000/8 = 500
Confidence Interval = x̄ ± Z(SE) = 28000 ± 1.96(500) = 28000 ± 980
95% CI: Rs 27,020 to Rs 28,980
Interpretation: We are 95% confident that the true average starting salary of all BBS graduates lies between Rs 27,020 and Rs 28,980. This means if we took many samples and computed intervals, 95% of them would contain the true population mean. HR departments can use this range for salary benchmarking.
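The "95% of intervals" reading can be checked by simulation. The sketch below assumes, purely for illustration, that salaries are normally distributed with a true mean of Rs 28,000 and standard deviation of Rs 4,000. It then draws many samples of 64 and counts how often the computed interval covers the true mean.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD, N, Z = 28_000, 4_000, 64, 1.96  # assumed population values
TRIALS = 2_000

covered = 0
for _ in range(TRIALS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    x_bar = statistics.fmean(sample)
    se = statistics.stdev(sample) / N ** 0.5         # s / sqrt(n)
    if x_bar - Z * se <= TRUE_MEAN <= x_bar + Z * se:
        covered += 1

coverage = covered / TRIALS
print(f"Intervals containing the true mean: {coverage:.1%}")
```

The proportion should land near 95%. Any single interval either contains the true mean or it does not; the 95% describes the long-run success rate of the method.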
Worked Example: Hypothesis Testing
A coaching centre claims its students score more than 60 marks on average in BBS statistics exams. A sample of 36 students shows a mean score of 63 with SD of 12. Test the claim at 5% significance level.
Solution:
Step 1: H₀: μ = 60 (no effect) vs H₁: μ > 60 (claim — one-tail test)
Step 2: α = 0.05, one-tail critical value Z = 1.645
Step 3: Z_test = (x̄ − μ₀)/(s/√n) = (63 − 60)/(12/√36) = 3/(12/6) = 3/2 = 1.5
Step 4: Compare: Z_test (1.5) < Z_table (1.645)
Step 5 (Decision): Do not reject H₀. The test statistic falls in the non-rejection region.
Step 6 (Conclusion): At the 5% significance level, there is insufficient evidence to support the coaching centre’s claim that average scores exceed 60. Although the sample mean (63) is above 60, the difference is not statistically significant; it could be due to sampling variation.
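The same decision can be reached through the p-value instead of the critical value. The sketch below reuses the numbers from this example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, s, n, alpha = 63, 60, 12, 36, 0.05

z = (x_bar - mu0) / (s / sqrt(n))
p_value = 1 - NormalDist().cdf(z)   # right-tail area beyond z

print(f"z = {z:.2f}, p-value = {p_value:.4f}")  # z = 1.50, p-value = 0.0668
print("Reject H0" if p_value < alpha else "Do not reject H0")  # Do not reject H0
```

Since 0.0668 > 0.05, H₀ is not rejected, which matches the critical-value comparison.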
Decision Table for Hypothesis Testing
| Type of Test | H₁ | Critical Value (5%) | Reject H₀ if |
|---|---|---|---|
| Two-tail | μ ≠ μ₀ | ±1.96 | Z < −1.96 or Z > 1.96 |
| Right one-tail | μ > μ₀ | 1.645 | Z > 1.645 |
| Left one-tail | μ < μ₀ | −1.645 | Z < −1.645 |
Exam Tips
Tip 1: Always write all 6 steps clearly. Examiners award marks for each step even if the final answer is wrong.
Tip 2: Identify whether the test is one-tail or two-tail from the wording: "more than," "less than," "exceeds" = one-tail; "different from," "changed," "not equal" = two-tail.
Tip 3: Use the Z-test when n ≥ 30 or σ is known; use the t-test when n < 30 and σ is unknown.
Tip 4: "Fail to reject H₀" is the correct language. Never say "accept H₀" (absence of evidence is not evidence of absence).
Tip 5: For confidence intervals, a wider interval means more confidence but less precision. A 99% CI is wider than a 95% CI.