17.6. Multiple Regression — Computational and Inferential Thinking
https://inferentialthinking.com/chapters/17/6/Multiple_Regression.html
Finally, we can inspect whether our prediction is close to the true sale price for our one test
example. Looks reasonable!
17.6.3.1. Evaluation
To evaluate the performance of this approach for the whole test set, we apply `predict_nn` to each test example, then compute the root mean squared error of the predictions. Computation of the predictions may take several minutes.
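The `predict_nn` function is defined earlier in the chapter using the book's `Table` objects. As a rough sketch of the same idea on plain numeric arrays (the function name, the arrays, and the `k` parameter below are illustrative, not the book's), a nearest neighbor prediction averages the outcomes of the `k` training rows closest to the example:

```python
import numpy as np

def predict_nn_sketch(train_X, train_y, example, k=5):
    """Average the outcomes of the k training rows closest to example
    in Euclidean distance (plain k-nearest-neighbor regression)."""
    dists = np.sqrt(((train_X - example) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]
    return train_y[nearest].mean()

# Tiny synthetic check: the three nearest neighbors of 0.0 are the
# first three rows, so the prediction averages 10, 12, and 14.
X = np.array([[0.0], [0.1], [0.2], [5.0], [6.0]])
y = np.array([10.0, 12.0, 14.0, 100.0, 120.0])
predict_nn_sketch(X, y, np.array([0.0]), k=3)  # → 12.0
```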
For these data, the errors of the two techniques are quite similar! For different data sets, one technique might outperform another. By computing the RMSE of both techniques on the same data, we can compare methods fairly. One note of caution: the difference in performance might not be due to the technique at all; it might be due to random variation arising from how the training and test sets were sampled in the first place.
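One way to gauge how much of a performance gap could come from the split alone is to repeat the random train/test division several times and watch the RMSE move. A minimal sketch on synthetic data (all names and values below are illustrative, not from the book):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
x = rng.uniform(0, 10, n)
y = 3 * x + rng.normal(0, 5, n)  # synthetic linear data with noise sd 5

rmses = []
for seed in range(20):
    # A fresh random 80/20 train/test split each iteration.
    idx = np.random.default_rng(seed).permutation(n)
    train, test = idx[:800], idx[800:]
    slope, intercept = np.polyfit(x[train], y[train], 1)
    pred = slope * x[test] + intercept
    rmses.append(np.mean((y[test] - pred) ** 2) ** 0.5)

# The spread of RMSE across splits is variation due purely to resampling.
print('mean RMSE:', np.mean(rmses), ' std:', np.std(rmses))
```

If the standard deviation across splits is comparable to the gap between two methods' RMSEs, the gap should not be read as a real difference between the methods.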
Finally, we can draw a residual plot for these predictions. We still underestimate the prices of
the most expensive houses, but the bias does not appear to be as systematic. However, fewer
residuals are very close to zero, indicating that fewer prices were predicted with very high
accuracy.
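The book draws its residual plots with the `datascience` library's plotting methods; an equivalent picture can be sketched directly with matplotlib, here on synthetic stand-ins for the actual and predicted prices (all arrays below are illustrative):

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the script runs anywhere
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the test-set sale prices and predictions.
rng = np.random.default_rng(0)
actual = rng.uniform(50_000, 500_000, size=200)
predicted = actual + rng.normal(0, 30_000, size=200)

# A residual plot: residual (actual minus predicted) against the prediction.
residuals = actual - predicted
plt.scatter(predicted, residuals, s=10)
plt.axhline(0, color='red')
plt.xlabel('Predicted sale price')
plt.ylabel('Residual (actual - predicted)')
plt.savefig('residual_plot.png')
```

Systematic bias shows up as residuals drifting away from the red zero line in one region of the plot, e.g. consistently positive residuals for the most expensive houses when prices are underestimated.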
```
143415.0
```

```python
print('Actual sale price:', test_nn.column('SalePrice').item(0))
print('Predicted sale price using nearest neighbors:', predict_nn(example_nn_row))
```

```
Actual sale price: 147000
Predicted sale price using nearest neighbors: 143415.0
```
```python
nn_test_predictions = test_nn.drop('SalePrice').apply(predict_nn)
rmse_nn = np.mean((test_prices - nn_test_predictions) ** 2) ** 0.5
```
```python
print('Test set RMSE for multiple linear regression: ', rmse_linear)
print('Test set RMSE for nearest neighbor regression:', rmse_nn)
```

```
Test set RMSE for multiple linear regression:  33025.064938240575
Test set RMSE for nearest neighbor regression: 36067.116043510105
```