Estimation
Estimation uses sample data to assign plausible values to population parameters. Point estimation gives a single best guess, while interval estimation provides a range capturing the parameter with specified confidence. Both are central to statistical practice.
Point Estimation
A point estimator is a statistic whose value approximates an unknown parameter. For example, the sample mean X̅ estimates the population mean μ, and the sample proportion p̂ estimates p. Several estimators can be built from the same data (the sample mean and the sample median both estimate the center of a symmetric distribution), so quality criteria are needed to choose among them.
Unbiasedness
An estimator θ̂ is unbiased if E[θ̂] = θ. The sample mean is an unbiased estimator of the population mean. The sample variance S² with divisor n − 1 is unbiased for σ²; dividing by n instead gives an estimator whose expectation is (n − 1)σ²/n, so it is biased downward. Unbiasedness is desirable but not sufficient on its own, since an unbiased estimator can still have a large variance.
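The bias of the divisor-n estimator can be checked exactly on a toy case. The sketch below is illustrative only: the population (the faces of a fair die) and the helper names `sample_variance` and `expected_estimate` are assumptions, not part of the text. It averages each estimator over every equally likely sample of size 3 drawn with replacement, using exact fractions.

```python
from fractions import Fraction
from itertools import product

def sample_variance(xs, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / divisor

def expected_estimate(population, n, divisor):
    """Exact expectation of the estimator: average over every ordered
    sample of size n drawn with replacement (all equally likely)."""
    samples = list(product(population, repeat=n))
    return sum(sample_variance(s, divisor) for s in samples) / len(samples)

# Hypothetical population: faces of a fair die, stored as exact fractions.
population = [Fraction(v) for v in range(1, 7)]
n = 3

sigma2 = sample_variance(population, len(population))  # true variance
unbiased = expected_estimate(population, n, n - 1)      # divisor n - 1
biased = expected_estimate(population, n, n)            # divisor n

print(sigma2, unbiased, biased)
```

In this setup the divisor n − 1 average equals the true variance, while the divisor n average equals (n − 1)/n times it.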
Consistency
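An estimator is consistent if it converges in probability to the true parameter as n → ∞, so that estimates concentrate around the true value as the sample grows. Most estimators in common use, including the sample mean and S², are consistent under mild conditions. Consistency, combined with bias and variance properties, guides estimator choice.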
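A small simulation can make the idea of consistency concrete. This is a rough sketch under assumed settings (a Uniform(0, 1) population with true mean 0.5, 200 replications, a fixed seed); the function name `mean_abs_error` is hypothetical. It tracks how far the sample mean typically lands from the true mean as n increases.

```python
import random

def mean_abs_error(n, reps, rng):
    """Average |sample mean - 0.5| over `reps` samples of size n from Uniform(0, 1)."""
    total = 0.0
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        total += abs(xbar - 0.5)
    return total / reps

rng = random.Random(0)  # fixed seed so the run is repeatable
errors = {n: mean_abs_error(n, reps=200, rng=rng) for n in (10, 100, 1000)}

for n, err in errors.items():
    print(f"n={n:5d}  mean |error| = {err:.4f}")
```

The typical error should shrink roughly in proportion to 1/√n, which is what convergence of the sample mean looks like in practice.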
Efficiency
Among unbiased estimators, one with smaller variance is more efficient, and an unbiased estimator whose variance attains the Cramér–Rao lower bound is called efficient. The bound gives a theoretical floor on the variance of any unbiased estimator, under regularity conditions.
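For reference, the bound can be written out for an i.i.d. sample of size n. The symbol I(θ), the Fisher information of a single observation, is notation assumed here rather than introduced in the text. The normal-mean case with known σ is included as a standard worked instance in which the sample mean attains the bound.

```latex
\operatorname{Var}(\hat{\theta}) \;\ge\; \frac{1}{n\,I(\theta)},
\qquad
I(\theta) = \mathbb{E}\!\left[\left(\frac{\partial}{\partial\theta}\log f(X;\theta)\right)^{2}\right].

% Worked instance: X_1,\dots,X_n \sim N(\mu,\sigma^2) with \sigma known.
\frac{\partial}{\partial\mu}\log f(X;\mu) = \frac{X-\mu}{\sigma^{2}}
\;\Longrightarrow\;
I(\mu) = \frac{1}{\sigma^{2}}
\;\Longrightarrow\;
\operatorname{Var}(\hat{\mu}) \;\ge\; \frac{\sigma^{2}}{n} = \operatorname{Var}(\bar{X}).
```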
Methods of Estimation
The method of moments sets sample moments equal to the corresponding population moments and solves for the parameters. Maximum likelihood estimation (MLE) chooses the parameter values that maximize the likelihood of the observed data; under regularity conditions the MLE is consistent and asymptotically efficient. Bayesian estimation treats parameters as random variables with prior distributions and combines the prior with the likelihood through Bayes' theorem to obtain a posterior distribution.
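The three approaches can be compared on small made-up inputs. In this sketch the data values, the Uniform(0, θ) model, and the Beta(1, 1) prior are all assumptions chosen for illustration. The uniform model is used because its method-of-moments and maximum likelihood estimates differ; note that it falls outside the usual regularity conditions, since its support depends on θ.

```python
from statistics import fmean

# Hypothetical observations, assumed drawn from Uniform(0, theta).
data = [0.8, 2.1, 3.9, 1.4, 2.8]

# Method of moments: E[X] = theta / 2, so equate the sample mean to theta / 2.
theta_mom = 2 * fmean(data)

# Maximum likelihood: the likelihood theta**(-n), valid for theta >= max(data),
# is largest at the smallest admissible theta, which is the sample maximum.
theta_mle = max(data)

# Bayesian estimate of a proportion p with a Beta(a, b) prior:
# after k successes in m trials the posterior is Beta(a + k, b + m - k).
a, b = 1, 1      # uniform prior (an assumption of this sketch)
k, m = 7, 10     # hypothetical data
posterior_mean = (a + k) / (a + b + m)

print(theta_mom, theta_mle, posterior_mean)
```

The posterior mean (a + k)/(a + b + m) sits between the prior mean a/(a + b) and the sample proportion k/m, and moves toward the latter as m grows.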
Interval Estimation
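A confidence interval (CI) is a range of values, computed from the data by a procedure that captures the true parameter with a stated long-run frequency. For a 95% CI, the procedure yields intervals that contain the parameter in 95% of hypothetical repeated samples; any single computed interval either contains the parameter or does not. For the same data, higher confidence means a wider interval. CIs convey uncertainty more fully than point estimates alone.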
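The repeated-sampling interpretation can be illustrated by simulation. The following is a minimal sketch with assumed settings (normal data with μ = 10 and known σ = 2, n = 30, 2000 replications, fixed seed); `z_interval` and `coverage` are hypothetical helper names. It counts how often a known-σ 95% interval contains the true mean.

```python
import random
from statistics import NormalDist, fmean

def z_interval(sample, sigma, confidence=0.95):
    """Known-sigma interval: xbar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / len(sample) ** 0.5
    xbar = fmean(sample)
    return xbar - half_width, xbar + half_width

def coverage(mu, sigma, n, reps, rng):
    """Fraction of simulated intervals that contain the true mean mu."""
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= mu <= high
    return hits / reps

rng = random.Random(1)  # fixed seed for repeatability
rate = coverage(mu=10.0, sigma=2.0, n=30, reps=2000, rng=rng)
print(f"empirical coverage: {rate:.3f}")
```

The empirical rate should land near 0.95, with some simulation noise; it describes the procedure across many samples, not any one interval.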
CI for the Mean
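When σ is known, a 100(1 − α)% CI for μ is X̅ ± z_{α/2} σ/√n. When σ is unknown (the typical case), replace σ by S and z_{α/2} by t_{n−1, α/2}, the corresponding quantile of the t distribution with n − 1 degrees of freedom. These intervals are exact for normally distributed data and approximate for large n otherwise. Because the width shrinks in proportion to 1/√n, larger samples yield tighter intervals.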
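As a worked instance of the known-σ formula, the sketch below uses made-up summary statistics (X̅ = 50, σ = 10, n = 25); the function name `mean_ci_known_sigma` is hypothetical. It relies only on `statistics.NormalDist` from the Python standard library for the normal quantile.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_sigma(xbar, sigma, n, confidence=0.95):
    """CI for mu when sigma is known: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical summary statistics.
low, high = mean_ci_known_sigma(xbar=50.0, sigma=10.0, n=25)
print(f"95% CI: ({low:.2f}, {high:.2f})")  # 95% CI: (46.08, 53.92)

# Quadrupling the sample size halves the interval width.
low4, high4 = mean_ci_known_sigma(xbar=50.0, sigma=10.0, n=100)
print(f"width n=25: {high - low:.2f}, width n=100: {high4 - low4:.2f}")
```

For unknown σ the same shape applies with S in place of σ and a t quantile in place of z. The standard library has no t quantile, so that case would need an external routine such as `scipy.stats.t.ppf`, which this sketch leaves out.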
CI for Proportions and Variances
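For a proportion, the normal-approximation (Wald) interval p̂ ± z_{α/2} √(p̂(1 − p̂)/n) works for large n, provided p̂ is not close to 0 or 1. Wilson and Clopper–Pearson intervals have more reliable coverage for small samples. For a normal population, CIs for the variance use the chi-square distribution with n − 1 degrees of freedom, since (n − 1)S²/σ² follows that distribution.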
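The difference between the normal-approximation interval and the Wilson interval shows up clearly on a small sample. This sketch assumes hypothetical data of 2 successes in 20 trials; the function names are illustrative, and the Wilson formula is the standard score-interval form.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(k, n, confidence=0.95):
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = k / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def wilson_interval(k, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = k / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    margin = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - margin, center + margin

k, n = 2, 20  # hypothetical: 2 successes in 20 trials
wald = wald_interval(k, n)
wilson = wilson_interval(k, n)
print("Wald:  ", wald)
print("Wilson:", wilson)
```

With these inputs the Wald lower limit falls below zero, which is impossible for a proportion, while the Wilson limits stay inside (0, 1).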
Sample Size Determination
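Required sample size can be computed by inverting the margin-of-error formula. For a mean with margin E, n ≈ (z_{α/2} σ / E)², rounded up to the next integer; σ must be known or estimated in advance, for example from a pilot study. Planning the sample size before collecting data helps avoid underpowered studies.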
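The inversion is a one-line calculation. The sketch below assumes illustrative planning values (σ = 15, target margin E = 2, 95% confidence); `sample_size_for_mean` is a hypothetical helper name. It rounds up so that the achieved margin does not exceed the target.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)

# Hypothetical planning values.
n_required = sample_size_for_mean(sigma=15.0, margin=2.0)
print(n_required)  # 217

# Halving the margin roughly quadruples the required sample size.
n_tighter = sample_size_for_mean(sigma=15.0, margin=1.0)
print(n_tighter)  # 865
```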
Summary
Estimation links sample data to population parameters. Point estimators give a single summary value; interval estimators quantify the uncertainty around it. Understanding unbiasedness, consistency, efficiency, and the construction of confidence intervals is a basic competence in data analysis.