Advanced Regression — polynomial features, interactions, and non-linear fits
A dataset of salary vs years of experience doesn't follow a straight line — early-career growth is steep, senior growth plateaus. A linear model underfits. We need a fit that can curve.
Polynomial regression is still linear regression — we just engineer new features from the existing ones. Add x² and x³ as features, then fit an ordinary linear model in that higher-dimensional feature space.
ŷ = β₀ + β₁x
ŷ = β₀ + β₁x + β₂x²
ŷ = β₀ + β₁x + … + β₁₀x¹⁰ ← danger zone
[x] → [1, x, x², x³, …, xᵈ] then standard OLS
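A minimal sketch of that expand-then-OLS pipeline — the data values are invented for illustration, and sklearn's PolynomialFeatures handles the [1, x, x², …, xᵈ] mapping:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Toy salary-vs-experience data (illustrative values, not a real dataset)
years = np.array([1, 2, 3, 5, 8, 12, 15, 20]).reshape(-1, 1)
salary = np.array([45, 55, 65, 80, 95, 105, 110, 112])  # in $k

# [x] -> [1, x, x^2, x^3]: the feature map from above
poly = PolynomialFeatures(degree=3, include_bias=True)
X_poly = poly.fit_transform(years)

# Standard OLS on the expanded features -- still linear regression
model = LinearRegression(fit_intercept=False)  # bias column already included
model.fit(X_poly, salary)

print(model.coef_)                            # [beta_0, beta_1, beta_2, beta_3]
print(model.predict(poly.transform([[10]])))  # predicted salary at 10 years
```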
ŷ = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂
The x₁x₂ term captures how the effect of x₁ depends on x₂.
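A sketch of fitting that interaction model on synthetic data (the coefficients and features below are made up for demonstration); with interaction_only=True, PolynomialFeatures generates exactly the 1, x₁, x₂, x₁x₂ columns and no squared terms:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))                        # two toy features x1, x2
y = 3 + 2 * X[:, 0] + 1 * X[:, 1] + 0.5 * X[:, 0] * X[:, 1]  # true interaction

# interaction_only=True yields [1, x1, x2, x1*x2] -- no x1^2 or x2^2 terms
inter = PolynomialFeatures(degree=2, interaction_only=True, include_bias=True)
X_inter = inter.fit_transform(X)

model = LinearRegression(fit_intercept=False).fit(X_inter, y)
print(inter.get_feature_names_out())  # ['1', 'x0', 'x1', 'x0 x1']
print(model.coef_)                    # recovers [3, 2, 1, 0.5] on this noiseless toy
```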
As degree ↑: bias ↓, variance ↑. The optimal degree balances both — find it with cross-validation.
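One way to run that search, as a sketch: score each candidate degree with k-fold cross-validation and keep the best. The quadratic data-generating process below is invented for illustration, so degree 2 should win:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(42)
x = rng.uniform(0, 20, size=(150, 1))  # toy "years of experience" feature
y = 40 + 9 * x.ravel() - 0.25 * x.ravel() ** 2 + rng.normal(0, 3, 150)

# Higher degree lowers bias but raises variance; CV scores reveal the balance
for degree in range(1, 8):
    pipe = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores = cross_val_score(pipe, x, y, cv=5, scoring="r2")
    print(f"degree {degree}: mean R² = {scores.mean():.3f}")
```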
Polynomial regression is linear regression with engineered features. The learning algorithm stays the same.
Tune the degree via cross-validation. Pair with Ridge regularisation to avoid overfitting at degree 3 and above (see the sketch below).
Sometimes x₁ only matters when x₂ is high. Interaction terms like x₁×x₂ encode this dependency explicitly.
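As the takeaways suggest, here is a hedged sketch of pairing polynomial features with Ridge — the alpha value and data are placeholders, and StandardScaler is used so the penalty weighs each power of x comparably:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(7)
x = rng.uniform(0, 20, size=(150, 1))
y = 40 + 9 * x.ravel() - 0.25 * x.ravel() ** 2 + rng.normal(0, 3, 150)

# Scaling matters: x^10 dwarfs x in magnitude, so the L2 penalty
# would otherwise hit the high-degree coefficients unevenly
model = make_pipeline(
    PolynomialFeatures(degree=10),  # deliberately in the "danger zone"
    StandardScaler(),
    Ridge(alpha=1.0),               # alpha is a placeholder -- tune it via CV
)
model.fit(x, y)
print(model.predict([[10.0]]))      # prediction stays sane despite degree 10
```

Swapping Ridge for RidgeCV would pick alpha by cross-validation automatically, in the same spirit as tuning the degree.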