Multiple Regression

Bedrooms alone is a limited model. House price is shaped by location, square footage, proximity to good schools, whether there's a pool, and dozens of other factors. We need more than one predictor.

Multiple linear regression extends the single-predictor model to as many features as you have. For our house-price problem, suppose we add location (distance from city center in miles) and whether the house has a pool:

price^=β0+β1(bedrooms)+β2(distance)+β3(pool)\hat{\text{price}} = \beta_0 + \beta_1(\text{bedrooms}) + \beta_2(\text{distance}) + \beta_3(\text{pool})


Price is a weighted sum of predictors plus a baseline. OLS still minimizes SSE — it now finds the hyperplane (rather than a line) that does so. Python handles this identically: just pass a multi-column feature matrix.

A Fitted Multiple Regression

Suppose OLS returns:

price^=100,000+60,000(bedrooms)8,000(distance)+45,000(pool)\hat{\text{price}} = 100{,}000 + 60{,}000(\text{bedrooms}) - 8{,}000(\text{distance}) + 45{,}000(\text{pool})

Reading each coefficient:

  • β1=60,000\beta_1 = 60{,}000: Holding distance and pool constant, each additional bedroom adds $60,000 to predicted price.
  • β2=8,000\beta_2 = -8{,}000: Holding bedrooms and pool constant, each additional mile from city center reduces predicted price by $8,000.
  • β3=45,000\beta_3 = 45{,}000: Holding bedrooms and distance constant, having a pool adds $45,000 to predicted price.

Prediction: A 3-bedroom house, 5 miles from center, with a pool:
price^=100,000+60,000(3)8,000(5)+45,000(1)=325,000\hat{\text{price}} = 100{,}000 + 60{,}000(3) - 8{,}000(5) + 45{,}000(1) = 325{,}000

The Critical Clause: Holding All Else Constant

Every coefficient in a multiple regression carries a hidden clause: holding all other predictors constant.


Example: the coefficient on bedrooms was 65,000 in the simple model (Section 3) and 60,000 in the multiple model above. The difference is because the multiple model has already "absorbed" some of the bedroom effect via distance and pool. Coefficients shift when you add or remove predictors.


A dramatic version of this is sign reversal. In a dataset of houses, more bedrooms tends to mean higher price — a positive raw correlation. But once you control for square footage, an extra bedroom sometimes means smaller rooms, which can flip the coefficient negative. Interpreting a coefficient in isolation, without knowing what else is in the model, is unreliable.

Three Uses of Regression
Machine Learning
Prediction
Feed in predictor values, get an estimated output.
You don't need to interpret the coefficients — you just want accurate ŷ. The goal is minimizing prediction error on new data, not understanding what drives the outcome.
Look at
Accurate ŷ on new data
Causal Inference
Estimation
Quantify the effect of one specific predictor.
"How much does a pool add to price, after controlling for location and size?" You care about the value and uncertainty of a specific coefficient β, not overall predictive accuracy.
Look at
Coefficient value & uncertainty
Inferential Statistics
Hypothesis Testing
Test whether a coefficient is significantly non-zero.
Is the pool coefficient significantly different from zero? Could the relationship be explained by chance? OLS gives you a t-statistic and p-value for each coefficient.
Look at
t-statistic & p-value

The same model serves all three purposes. The question you're asking determines what you look at in the output.

Checkpoint

You build a multiple regression predicting house price from: bedrooms, square footage, and distance from city center. The coefficient on bedrooms is −$12,000. What is the most likely explanation?