Linear Regression — the bedrock of supervised learning
A real-estate firm wants to predict house prices from size, location, and age. The relationship looks roughly linear — can we find the best straight line through the data?
Imagine stretching a rubber band through a cloud of points. The band settles where it's least "pulled" by all the points simultaneously. That settling position is the regression line — it minimises the sum of squared vertical distances to every point.
[Interactive demo: click the canvas to add points, or drag any point, and the regression line and R² update in real time. The widget annotates each residual ε = y − ŷ and the minimised sum Σ(yᵢ − ŷᵢ)².]
ŷ = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ
J(β) = (1/2m) Σᵢ (ŷᵢ − yᵢ)²
β = (XᵀX)⁻¹ Xᵀy
R² = 1 − SSres/SStot | SSres = Σ(yᵢ−ŷᵢ)² | SStot = Σ(yᵢ−ȳ)²
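The formulas above can be put together in a few lines of NumPy. This is a minimal sketch on synthetic house data (the sizes, ages, and coefficients below are invented for illustration, not from the article): build the design matrix, solve the normal equation, and compute R².

```python
import numpy as np

# Hypothetical data: house size (m²) and age (years) vs price.
rng = np.random.default_rng(0)
n = 50
X_raw = np.column_stack([rng.uniform(50, 200, n),   # size
                         rng.uniform(1, 40, n)])    # age
y = 3000 * X_raw[:, 0] - 800 * X_raw[:, 1] + 50_000 + rng.normal(0, 10_000, n)

# Design matrix with a column of ones for the intercept β₀.
X = np.column_stack([np.ones(n), X_raw])

# Normal equation β = (XᵀX)⁻¹ Xᵀy, solved via lstsq rather than an
# explicit matrix inverse for numerical stability.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Goodness of fit: R² = 1 − SSres/SStot.
y_hat = X @ beta
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(beta, r2)
```

In practice you would reach for `sklearn.linear_model.LinearRegression` or `np.polyfit`, but the point here is that the closed-form solution is only a few matrix operations.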
Provided XᵀX is invertible, the normal equation gives the unique coefficient vector minimising the sum of squared errors in closed form; no iteration is needed for small datasets.
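When the dataset is too large to solve the normal equation directly, the same cost J(β) can be minimised iteratively. A sketch of batch gradient descent, using the gradient ∇J(β) = (1/m) Xᵀ(Xβ − y) that follows from the cost formula above (the toy data y = 1 + 2x is an assumption for the demo, not from the article):

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, iters=2000):
    """Minimise J(β) = (1/2m) Σ (ŷᵢ − yᵢ)² by batch gradient descent.
    X must already include an intercept column; features should be scaled."""
    m, d = X.shape
    beta = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (X @ beta - y) / m   # ∇J(β)
        beta -= lr * grad
    return beta

# Toy check on y ≈ 1 + 2x with small noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
X = np.column_stack([np.ones_like(x), x])
y = 1 + 2 * x + rng.normal(0, 0.05, 100)
beta = gradient_descent(X, y)
print(beta)  # should be close to [1, 2]
```

The learning rate and iteration count here suit features scaled to roughly [−1, 1]; unscaled features (like house sizes in the hundreds) would need scaling first or the updates diverge.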
R² = 0.85 means your features explain 85% of the variance in the target; the remaining 15% is unexplained variance: noise, missing features, or nonlinearity the model cannot capture.
Always plot residuals vs fitted values. Random scatter = good. Patterns = your linear assumption is violated.
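"Patterns in the residuals" can be made concrete: fit a straight line to data that is actually quadratic (a made-up example for this sketch) and the residuals correlate strongly with x², whereas a healthy linear fit leaves residuals uncorrelated with any function of x.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 200)
y = x**2 + rng.normal(0, 0.3, 200)        # truly quadratic relationship

# Fit a straight line anyway.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Random scatter would give near-zero correlation; here the residuals
# track x² closely, exposing the violated linearity assumption.
corr = np.corrcoef(resid, x**2)[0, 1]
print(corr)
```

A residuals-vs-fitted plot of `resid` would show the telltale U shape; the correlation is just a numeric stand-in for eyeballing it.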