Chapter 3

Exploratory Data Analysis

Data Science and Analytics · BCA · Updated Apr 23, 2026

EDA uses statistical and visual methods to understand data before modelling. It reveals patterns, relationships, anomalies, and insights.

Descriptive Statistics

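Central tendency is measured by the mean, median, and mode. Dispersion is measured by the range, variance, standard deviation, and interquartile range (IQR). Shape is described by skewness (asymmetry) and kurtosis (tail heaviness).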
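
As a rough illustration, these measures can be computed with Python's standard library alone. The `describe` helper and the sample values are made up for this sketch; skewness and excess kurtosis use the simple population (moment-based) formulas, which is only one of several conventions.

```python
import statistics as st

def describe(values):
    """Return basic descriptive statistics for a list of numbers (hypothetical helper)."""
    n = len(values)
    mean = st.mean(values)
    sd = st.pstdev(values)  # population standard deviation
    q1, _, q3 = st.quantiles(values, n=4, method="inclusive")
    # Moment-based (population) skewness and excess kurtosis
    skew = sum((x - mean) ** 3 for x in values) / (n * sd ** 3)
    kurt = sum((x - mean) ** 4 for x in values) / (n * sd ** 4) - 3
    return {
        "mean": mean,
        "median": st.median(values),
        "mode": st.mode(values),
        "range": max(values) - min(values),
        "variance": st.pvariance(values),
        "std": sd,
        "iqr": q3 - q1,
        "skewness": skew,
        "excess_kurtosis": kurt,
    }

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data
stats = describe(sample)
for name, value in stats.items():
    print(f"{name}: {value}")
```

A positive skewness here reflects the longer right tail of the sample. Libraries such as pandas or SciPy may use slightly different (sample-corrected) formulas, so their numbers can differ from this sketch.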

Data Visualization

Common chart types and what they show: histograms (distribution), box plots (quartiles and outliers), scatter plots (relationships between two variables), bar charts (category comparisons), line charts (trends), and heatmaps (correlations). Common Python libraries: matplotlib, seaborn, plotly.
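
The following sketch draws four of these chart types side by side with matplotlib. The data are synthetic, and the variable names and the output file name `eda_overview.png` are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data, for illustration only
rng = np.random.default_rng(42)
x = rng.normal(50, 10, 300)
y = 0.8 * x + rng.normal(0, 5, 300)
groups = {"A": 12, "B": 30, "C": 21}

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].hist(x, bins=20)
axes[0, 0].set_title("Histogram: distribution")

axes[0, 1].boxplot(x)
axes[0, 1].set_title("Box plot: quartiles and outliers")

axes[1, 0].scatter(x, y, s=10)
axes[1, 0].set_title("Scatter plot: relationship")

axes[1, 1].bar(list(groups.keys()), list(groups.values()))
axes[1, 1].set_title("Bar chart: categories")

fig.tight_layout()
fig.savefig("eda_overview.png")  # hypothetical output file name
```

seaborn and plotly offer higher-level equivalents of the same charts; the choice between them is mostly a matter of styling and interactivity needs.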

Correlation Analysis

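Pearson correlation measures linear association; Spearman correlation is rank-based and measures monotonic association. Both range from -1 to +1. Correlation does not imply causation. For multiple variables, use correlation matrices and pair plots.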
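
To make the difference between the two coefficients concrete, here is a minimal pure-Python sketch. The helper names `pearson`, `ranks`, and `spearman` are chosen for this example; in practice a library routine would normally be used instead.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# y = x squared is perfectly monotonic but not linear
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

Because the relationship is curved but always increasing, Spearman reports a perfect monotonic association while Pearson falls slightly short of 1.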

Distribution Analysis

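Common distributions: normal, exponential, uniform, Poisson. Q-Q plots and the Shapiro-Wilk test check whether data are approximately normal. Knowing the distribution guides model selection; for example, count data often suit a Poisson model rather than one that assumes normality.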
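
A Q-Q plot pairs each sorted observation with the quantile a normal distribution would produce at the same position. The sketch below builds those pairs with the standard library and summarises how straight the resulting line is with a correlation coefficient. The helper names, the plotting positions (i - 0.5) / n, and the synthetic samples are assumptions of this example; it is not a substitute for a formal test such as Shapiro-Wilk (available as `scipy.stats.shapiro`).

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def qq_points(sample):
    """Pair each sorted observation with the matching standard-normal quantile."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n keep probabilities strictly inside (0, 1)
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, ordered

def qq_correlation(sample):
    """Correlation of the Q-Q points; values near 1 suggest approximate normality."""
    t, s = qq_points(sample)
    mt, ms = mean(t), mean(s)
    cov = sum((a - mt) * (b - ms) for a, b in zip(t, s))
    return cov / sqrt(sum((a - mt) ** 2 for a in t) * sum((b - ms) ** 2 for b in s))

# Synthetic samples, for illustration only
rng = random.Random(0)
normal_sample = [rng.gauss(0, 1) for _ in range(200)]
skewed_sample = [rng.expovariate(1.0) for _ in range(200)]

print("normal sample:", round(qq_correlation(normal_sample), 3))
print("skewed sample:", round(qq_correlation(skewed_sample), 3))
```

Plotting the two lists returned by `qq_points` against each other gives the Q-Q plot itself: points close to a straight line suggest normality, while a curved pattern (as for the exponential sample) suggests skew.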

Univariate and Multivariate

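Univariate analysis examines one variable at a time. Bivariate analysis examines two variables together, for example with scatter plots. Multivariate analysis examines three or more variables, using dimensionality reduction (e.g., PCA) and clustering.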
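
As one possible illustration of the multivariate case, PCA can be sketched with NumPy by centring the data and taking a singular value decomposition. The function name `pca` and the synthetic three-column dataset are assumptions of this example; library implementations such as scikit-learn's add scaling options and other conveniences.

```python
import numpy as np

def pca(X, n_components=2):
    """Project X onto its top principal components (minimal SVD-based sketch)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # centre each column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # principal directions (rows)
    scores = Xc @ components.T                   # data in the reduced space
    explained = (S ** 2) / np.sum(S ** 2)        # share of variance per component
    return scores, components, explained[:n_components]

# Synthetic data: two strongly related columns plus one low-variance noise column
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + rng.normal(scale=0.1, size=200),
    rng.normal(scale=0.1, size=200),
])

scores, components, ratio = pca(X, n_components=2)
print("explained variance ratio:", np.round(ratio, 3))
```

Since the first two columns move together, nearly all of the variance lands on the first component, which is the kind of redundancy PCA is meant to expose.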

Best Practices

Start with the dataset's shape and column types. Check for missing values. Visualise distributions. Examine correlations. Look for outliers. Document findings. EDA is iterative, so revisit earlier steps as new questions arise.
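
A first pass through this checklist might look like the pandas sketch below. The small DataFrame, its column names, and the `iqr_outliers` helper are invented for illustration, and the 1.5 × IQR rule is one common outlier convention rather than the only option.

```python
import numpy as np
import pandas as pd

# Illustrative dataset with two missing incomes and one implausible age
df = pd.DataFrame({
    "age": [23, 25, 31, 35, 29, 41, 38, 27, 33, 120],
    "income": [32.0, 35.5, np.nan, 52.0, 44.1, 61.3, 58.0, np.nan, 47.9, 50.2],
    "city": ["A", "B", "A", "C", "B", "A", "C", "B", "A", "C"],
})

# 1. Shape and types
print(df.shape)
print(df.dtypes)

# 2. Missing values per column
missing = df.isna().sum()
print(missing)

# 3. Summary of numeric distributions
print(df.describe())

# 4. Correlations between numeric columns
print(df[["age", "income"]].corr())

# 5. Outliers by the 1.5 * IQR rule (hypothetical helper)
def iqr_outliers(series, k=1.5):
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series[mask]

outliers = iqr_outliers(df["age"])
print(outliers)
```

Each printed result is a candidate for the findings log: here, the missing incomes and the age of 120 would both need a decision before modelling.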

Summary

EDA reveals data characteristics and relationships through statistics and visualisation, informing all subsequent analysis decisions.
