Random Variables and Distributions
A random variable assigns a numerical value to each outcome of a random experiment. Random variables translate events into arithmetic, allowing probability questions to be analysed using calculus and algebra. They are the bridge from probability spaces to data.
Definition
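A random variable X is a real-valued (measurable) function on the sample space S. It is discrete if it takes at most countably many values, and continuous if its distribution is described by a density, so that it ranges over a continuum and no single value carries positive probability. A mixed random variable combines the two, with point masses at some values and a density elsewhere.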
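As a concrete illustration of the definition, the sketch below treats a random variable literally as a function on a sample space. The experiment (two tosses of a fair coin) and the names sample_space, X and pmf are assumptions chosen for the example, not part of the text above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin: HH, HT, TH, TT (equally likely).
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]

def X(outcome):
    """Random variable: the number of heads in the outcome."""
    return outcome.count("H")

# Push the uniform probability on the sample space through X to obtain
# the probability of each value that X can take.
counts = Counter(X(s) for s in sample_space)
pmf = {value: Fraction(n, len(sample_space)) for value, n in counts.items()}
```

Here X takes only the values 0, 1 and 2, so it is discrete; the resulting probabilities are 1/4, 1/2 and 1/4.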
Probability Mass Function
For a discrete random variable, the probability mass function (PMF) is p(x) = P(X = x). It satisfies p(x) ≥ 0 and Σ_x p(x) = 1, where the sum runs over all values X can take. The probability of an event is the sum of the PMF over the values in that event: P(X ∈ A) = Σ_{x ∈ A} p(x).
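A minimal sketch of these properties, assuming a fair six-sided die as the example distribution (the die and the helper name prob are illustrative choices). Exact fractions are used so that the normalisation check is not blurred by floating-point rounding.

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def prob(event, pmf):
    """P(X in event): sum the PMF over the values that belong to the event."""
    return sum((p for x, p in pmf.items() if x in event), Fraction(0))

total = prob(set(pmf), pmf)      # normalisation: sums to 1
p_even = prob({2, 4, 6}, pmf)    # P(X is even) = 1/2
```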
Probability Density Function
For a continuous random variable, probabilities are described by a probability density function (PDF) f(x) satisfying f(x) ≥ 0 and ∫_{−∞}^{∞} f(x) dx = 1. The probability P(a ≤ X ≤ b) is ∫_a^b f(x) dx. The probability of any single value is zero for continuous variables.
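The following sketch checks these statements numerically for one illustrative density, f(x) = 2x on [0, 1]. Both the density and the simple midpoint-rule integrator are assumptions made for the example; any valid density and any quadrature routine would serve equally well.

```python
def pdf(x):
    """Illustrative density f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

total = integrate(pdf, 0.0, 1.0)        # close to 1 (normalisation)
p_interval = integrate(pdf, 0.25, 0.5)  # P(0.25 <= X <= 0.5), close to 0.1875
p_point = integrate(pdf, 0.5, 0.5)      # zero-width interval, so exactly 0.0
```

Note that pdf(0.5) equals 1.0 while P(X = 0.5) is 0: a density value is not a probability, only its integral over an interval is.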
Cumulative Distribution Function
The cumulative distribution function (CDF) F(x) = P(X ≤ x) is defined for all random variables, whether discrete, continuous, or mixed. It is nondecreasing and right-continuous, and it tends to 0 as x → −∞ and to 1 as x → +∞. For continuous distributions the PDF is the derivative of F wherever F is differentiable: f(x) = F′(x).
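For a discrete variable the CDF is a step function, which the sketch below makes visible. It again assumes a fair six-sided die as the example; the function name cdf is an illustrative choice.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative example distribution).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x, pmf):
    """F(x) = P(X <= x): accumulate the PMF over all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

below = cdf(0, pmf)                        # 0: no mass at or below 0
at_three = cdf(3, pmf)                     # 1/2
between = cdf(3.5, pmf)                    # still 1/2: flat between jumps
jump = cdf(3, pmf) - cdf(2.999, pmf)       # 1/6: the jump at 3 equals p(3)
top = cdf(6, pmf)                          # 1
```

The value at a jump point already includes the mass at that point, which is what right-continuity means for a step function.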
Expectation
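The expected value or mean E[X] is Σ_x x p(x) (discrete) or ∫ x f(x) dx (continuous), provided the sum or integral converges absolutely. It can be read as the long-run average of X over many independent repetitions of the experiment. Expectation is linear: E[aX + b] = aE[X] + b, and E[X + Y] = E[X] + E[Y] for any X and Y with finite means, whether or not they are independent.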
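Linearity can be checked exactly on a small example. The sketch below assumes a fair die X and takes Y = X^2, which is clearly dependent on X, to show that additivity of expectation does not need independence. The helper expect and the constants 2 and 3 are illustrative.

```python
from fractions import Fraction

# PMF of a fair six-sided die (illustrative example distribution).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def expect(g, pmf):
    """E[g(X)] for a discrete X: sum of g(x) * p(x) over all values x."""
    return sum((g(x) * p for x, p in pmf.items()), Fraction(0))

mean_x = expect(lambda x: x, pmf)               # E[X] = 7/2
mean_affine = expect(lambda x: 2 * x + 3, pmf)  # E[2X + 3] = 2 * 7/2 + 3 = 10
mean_y = expect(lambda x: x * x, pmf)           # E[Y] with Y = X^2, equals 91/6
mean_sum = expect(lambda x: x + x * x, pmf)     # E[X + Y] = 7/2 + 91/6 = 56/3
```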
Variance and Standard Deviation
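The variance Var(X) = E[(X − μ)^2] = E[X^2] − μ^2, where μ = E[X], measures spread. The standard deviation σ is its square root and has the same units as X. Under an affine transformation, Var(aX + b) = a^2 Var(X): shifting by b leaves the spread unchanged, while scaling by a multiplies it by a^2.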
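The two variance formulas and the affine rule can be verified exactly on a fair die, as in the sketch below. The die, the helper expect, and the constants a = 2 and b = 3 are assumptions made for illustration.

```python
import math
from fractions import Fraction

# PMF of a fair six-sided die (illustrative example distribution).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def expect(g, pmf):
    """E[g(X)] for a discrete X: sum of g(x) * p(x) over all values x."""
    return sum((g(x) * p for x, p in pmf.items()), Fraction(0))

mu = expect(lambda x: x, pmf)                         # 7/2
var_definition = expect(lambda x: (x - mu) ** 2, pmf) # E[(X - mu)^2] = 35/12
var_shortcut = expect(lambda x: x * x, pmf) - mu ** 2 # E[X^2] - mu^2 = 35/12
sigma = math.sqrt(var_definition)                     # about 1.708

# Affine transformation Z = aX + b, with variance computed from scratch.
a, b = 2, 3
mu_affine = expect(lambda x: a * x + b, pmf)
var_affine = expect(lambda x: (a * x + b - mu_affine) ** 2, pmf)  # 35/3
```

The shift b drops out of var_affine entirely, and the result equals a^2 times the original variance.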
Moments and Moment Generating Functions
The kth moment is E[X^k]; the kth central moment is E[(X − μ)^k]. The moment generating function M(t) = E[e^{tX}], when it is finite for all t in an open interval around 0, encodes all moments through its derivatives at zero, M^(k)(0) = E[X^k], and uniquely determines the distribution.
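The way an MGF encodes moments is that its kth derivative at t = 0 equals E[X^k]. The sketch below checks this numerically for a Bernoulli variable, whose MGF is (1 − p) + p e^t. The choice of distribution, the value p = 0.3, and the finite-difference step h are all assumptions made for the example.

```python
import math

p = 0.3  # illustrative success probability (an assumption of this example)

def mgf(t):
    """MGF of a Bernoulli(p) variable: E[e^{tX}] = (1 - p) + p * e^t."""
    return (1.0 - p) + p * math.exp(t)

h = 1e-4  # step size for central finite differences

# First derivative at 0 approximates E[X]; second derivative approximates E[X^2].
first_moment = (mgf(h) - mgf(-h)) / (2 * h)
second_moment = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2
```

Because a Bernoulli variable only takes the values 0 and 1, X^k = X for every k ≥ 1, so both derivatives should come out close to p.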
Joint and Marginal Distributions
For two or more random variables, the joint PMF or PDF describes their simultaneous behaviour. Marginal distributions are obtained by summing (discrete) or integrating (continuous) the joint distribution over the other variables. X and Y are independent when the joint factorises as the product of the marginals at every pair of values: p(x, y) = p_X(x) p_Y(y).
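A small discrete example makes marginalisation and the independence test concrete. In the sketch below, X is the result of the first of two fair coin tosses (1 for heads) and Y is the total number of heads; this setup and the helper names marginal and is_independent are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction

quarter = Fraction(1, 4)

# Joint PMF of (X, Y): X = first toss (1 = heads), Y = heads in two tosses.
joint = {(0, 0): quarter, (0, 1): quarter, (1, 1): quarter, (1, 2): quarter}

def marginal(joint, axis):
    """Sum the joint PMF over the other variable (axis 0 gives X, axis 1 gives Y)."""
    out = defaultdict(Fraction)
    for pair, p in joint.items():
        out[pair[axis]] += p
    return dict(out)

def is_independent(joint):
    """True when p(x, y) = p_X(x) * p_Y(y) for every pair of values."""
    px, py = marginal(joint, 0), marginal(joint, 1)
    return all(joint.get((x, y), Fraction(0)) == px[x] * py[y]
               for x in px for y in py)

p_x = marginal(joint, 0)          # {0: 1/2, 1: 1/2}
p_y = marginal(joint, 1)          # {0: 1/4, 1: 1/2, 2: 1/4}
dependent = is_independent(joint) # False: Y depends on the first toss

# Two separate fair tosses, by contrast, do factorise.
two_tosses = {(first, second): quarter for first in (0, 1) for second in (0, 1)}
independent = is_independent(two_tosses)  # True
```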
Covariance and Correlation
The covariance Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[XY] − μ_X μ_Y measures linear association. The correlation ρ = Cov(X, Y) / (σ_X σ_Y), defined when both standard deviations are positive, is dimensionless and lies in [−1, 1].
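Covariance and correlation can be computed directly from a joint PMF. The sketch below assumes a small illustrative example: X is the result of the first of two fair coin tosses (1 for heads) and Y is the total number of heads. The helper expect is a hypothetical name introduced for the example.

```python
import math
from fractions import Fraction

quarter = Fraction(1, 4)

# Joint PMF of (X, Y): X = first toss (1 = heads), Y = heads in two tosses.
joint = {(0, 0): quarter, (0, 1): quarter, (1, 1): quarter, (1, 2): quarter}

def expect(g, joint):
    """E[g(X, Y)]: sum of g(x, y) * p(x, y) over the joint PMF."""
    return sum((g(x, y) * p for (x, y), p in joint.items()), Fraction(0))

mu_x = expect(lambda x, y: x, joint)                        # 1/2
mu_y = expect(lambda x, y: y, joint)                        # 1
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y), joint)   # 1/4
var_x = expect(lambda x, y: (x - mu_x) ** 2, joint)         # 1/4
var_y = expect(lambda x, y: (y - mu_y) ** 2, joint)         # 1/2

# Correlation: covariance divided by the product of standard deviations.
rho = float(cov) / math.sqrt(var_x * var_y)                 # about 0.707
```

The correlation is positive but below 1, reflecting that the first toss contributes to the total without determining it.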
Summary
Random variables let us transport probability problems into analysis. PMFs, PDFs, CDFs, expectations, variances, and moments supply the primary numerical summaries. These ideas underpin every statistical method studied in the rest of the course.