Tutorial 9: Regression Continued
Lecture and Tutorial Learning Goals:
By the end of the week, you will be able to:
Recognize situations where a simple regression analysis would be appropriate for making predictions.
Explain the K-nearest neighbour (K-nn) regression algorithm and describe how it differs from K-nn classification.
Interpret the output of a K-nn regression.
In a dataset with two variables, perform K-nearest neighbour regression in R using tidymodels to predict the values for a test dataset.
Execute cross-validation in R to choose the number of neighbours.
Using R, evaluate K-nn regression prediction accuracy using a test data set and an appropriate metric (e.g., root mean square prediction error).
In a dataset with > 2 variables, perform K-nn regression in R using tidymodels to predict the values for a test dataset.
In the context of K-nn regression, compare and contrast goodness of fit and prediction properties (namely RMSE vs RMSPE).
Describe advantages and disadvantages of the K-nearest neighbour regression approach.
Perform ordinary least squares regression in R using tidymodels to predict the values for a test dataset.
Compare and contrast predictions obtained from K-nearest neighbour regression to those obtained using simple ordinary least squares regression from the same dataset.
This tutorial covers parts of the Regression II chapter of the online textbook. You should read this chapter before attempting the worksheet.
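Throughout this tutorial we use root mean square error (RMSE) and root mean square prediction error (RMSPE). As a reminder from the textbook, both are computed with the same formula; the difference is only which data the predictions are made for (the training data for RMSE, the held-out test data for RMSPE):

$\text{RMSPE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$

where $n$ is the number of observations, $y_i$ is the observed value for the $i$-th observation, and $\hat{y}_i$ is the predicted value for the $i$-th observation.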
### Run this cell before continuing.
library(tidyverse)
library(repr)
library(tidymodels)
library(GGally)
library(ISLR)
options(repr.matrix.max.rows = 6)
source("tests.R")
source("cleanup.R")
Predicting credit card balance
Source: https://media.giphy.com/media/LCdPNT81vlv3y/giphy-downsized-large.gif
In this worksheet we will work with a simulated data set that contains information we can use to create a model to predict customer credit card balance. A bank might use such information to predict which customers might be the most profitable to lend to (customers who carry a balance, but do not default, for example).
Specifically, we wish to build a model to predict credit card balance (Balance column) based on income (Income column) and credit rating (Rating column).
We access this data set by reading it from an R data package that we loaded at the beginning of the worksheet, ISLR. Loading that package gives access to a variety of data sets, including the Credit data set that we will be working with. We will rename this data set credit_original to avoid confusion later in the worksheet.
credit_original <- Credit
credit_original
Question 1.1
{points: 1}
Select only the columns of data we are interested in using for our prediction (both the predictors and the response variable) and use the as_tibble function to convert it to a tibble (it is currently a base R data frame). Name the modified data frame credit (using a lowercase c).
Note: We could alternatively just leave these variables in and use our recipe formula below to specify our predictors and response. But for this worksheet, let's select the relevant columns first.
### BEGIN SOLUTION
credit <- credit_original |>
    select(Balance, Income, Rating) |>
    as_tibble()
### END SOLUTION
credit

test_1.1()
Question 1.2
{points: 1}
Before we perform exploratory data analysis, we should create our training and testing data sets. First, split the credit data set. Use 60% of the data and set the variable we want to predict as the strata argument. Assign your answer to an object called credit_split.
Assign your training data set to an object called credit_training and your testing data set to an object called credit_testing.
set.seed(2000)
### BEGIN SOLUTION
credit_split <- initial_split(credit, prop = 0.6, strata = Balance)
credit_training <- training(credit_split)
credit_testing <- testing(credit_split)
### END SOLUTION

test_1.2()
Question 1.3
{points: 1}
Using only the observations in the training data set, use the ggpairs function from the GGally package to create a pairplot (also called a "scatter plot matrix") of all the columns we are interested in including in our model. Since we have not covered how to create these in the textbook, we have provided you with most of the code below; you just need to provide suitable options for the size of the plot.
The pairplot contains a scatter plot of each pair of columns in the lower left corner, the diagonal contains smoothed histograms of each individual column, and the upper right corner contains the correlation coefficient (a quantitative measure of the relation between two variables).
Name the plot object credit_pairplot.
# options(...)
# credit_pairplot <- credit_training |>
#     ggpairs(
#         lower = list(continuous = wrap('points', alpha = 0.4)),
#         diag = list(continuous = "barDiag")
#     ) +
#     theme(text = element_text(size = 20))
### BEGIN SOLUTION
options(repr.plot.height = 8, repr.plot.width = 9)
credit_pairplot <- credit_training |>
    ggpairs(mapping = aes(alpha = 0.4)) +
    theme(text = element_text(size = 20))
### END SOLUTION
credit_pairplot

test_1.3()
Question 1.4 Multiple Choice:
{points: 1}
Looking at the ggpairs plot above, which of the following statements is incorrect?
A. There is a strong positive relationship between the response variable (Balance) and the Rating predictor
B. There is a strong positive relationship between the two predictors (Income and Rating)
C. There is a strong positive relationship between the response variable (Balance) and the Income predictor
D. None of the above statements are incorrect
Assign your answer to an object called answer1.4. Make sure your answer is an uppercase letter and is surrounded by quotation marks (e.g. "F").
### BEGIN SOLUTION
answer1.4 <- "C"
### END SOLUTION
answer1.4

test_1.4()
Question 1.5
{points: 1}
Now that we have our training data, we will fit a linear regression model.
Create and assign your linear regression model specification to an object called lm_spec.
Create a recipe for the model. Assign your answer to an object called credit_recipe.
set.seed(2020) # DO NOT REMOVE
### BEGIN SOLUTION
lm_spec <- linear_reg() |>
    set_engine("lm") |>
    set_mode("regression")
credit_recipe <- recipe(Balance ~ ., data = credit_training)
### END SOLUTION
print(lm_spec)
print(credit_recipe)

test_1.5()
Question 1.6
{points: 1}
Now that we have our model specification and recipe, let's put them together in a workflow, and fit our simple linear regression model. Assign the fit to an object called credit_fit.
set.seed(2020) # DO NOT REMOVE
### BEGIN SOLUTION
credit_fit <- workflow() |>
    add_recipe(credit_recipe) |>
    add_model(lm_spec) |>
    fit(data = credit_training)
### END SOLUTION
credit_fit

test_1.6()
Question 1.7 Multiple Choice:
{points: 1}
Looking at the slopes/coefficients above from each of the predictors, which of the following mathematical equations is correct for your prediction model?
A.
B.
C.
D.
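Whichever option is correct, the fitted model has the general form below; the options differ only in the intercept and slope values (the $\beta$s here are placeholders, not the fitted numbers):

$\widehat{\text{Balance}} = \beta_0 + \beta_1 \cdot \text{Income} + \beta_2 \cdot \text{Rating}$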
Assign your answer to an object called answer1.7. Make sure your answer is an uppercase letter and is surrounded by quotation marks (e.g. "F").
### BEGIN SOLUTION
answer1.7 <- "A"
### END SOLUTION
answer1.7

test_1.7()
Question 1.8
{points: 1}
Calculate the RMSE to assess goodness of fit on credit_fit (remember this is how well it predicts on the training data used to fit the model). Return a single numerical value named lm_rmse.
set.seed(2020) # DO NOT REMOVE
# ... <- credit_fit |>
#     predict(...) |>
#     bind_cols(...) |>
#     ...(truth = ..., estimate = ...) |>
#     filter(.metric == ...) |>
#     select(...) |>
#     pull()
### BEGIN SOLUTION
lm_rmse <- credit_fit |>
    predict(credit_training) |>
    bind_cols(credit_training) |>
    metrics(truth = Balance, estimate = .pred) |>
    filter(.metric == 'rmse') |>
    select(.estimate) |>
    pull()
### END SOLUTION
lm_rmse

test_1.8()
Question 1.9
{points: 1}
Calculate the RMSPE using the test data. Return a single numerical value named lm_rmspe.
set.seed(2020) # DO NOT REMOVE
### BEGIN SOLUTION
lm_rmspe <- credit_fit |>
    predict(credit_testing) |>
    bind_cols(credit_testing) |>
    metrics(truth = Balance, estimate = .pred) |>
    filter(.metric == 'rmse') |>
    select(.estimate) |>
    pull()
### END SOLUTION
lm_rmspe

test_1.9()
Question 1.9.1
{points: 3}
Redo this analysis using K-nn regression instead of linear regression. Use set.seed(2000) at the beginning of this code cell to make it reproducible. Use the same predictors and train-test data splits as you used for linear regression, and use 5-fold cross-validation to choose K from the range 1-10. Remember to scale and shift your predictors on your training data, and to apply that same standardization to your test data! Assign a single numeric value for the RMSPE of your K-nn model as your answer, and name it knn_rmspe.
set.seed(2000) # DO NOT REMOVE
### BEGIN SOLUTION
# NOTE TO TAs: the recipe step uses randomness, so if the student
# does recipe AFTER vfoldcv, they'll get a different answer (184.26) than
# if they do recipe BEFORE vfoldcv (179.88). They should get
# **full marks in either case**
credit_knn_recipe <- recipe(Balance ~ ., data = credit_training) |>
    step_center(all_predictors()) |>
    step_scale(all_predictors())
credit_knn_spec <- nearest_neighbor(weight_func = "rectangular", neighbors = tune()) |>
    set_engine("kknn") |>
    set_mode("regression")
credit_vfold <- vfold_cv(credit_training, v = 5, strata = Balance)
credit_knn_workflow <- workflow() |>
    add_recipe(credit_knn_recipe) |>
    add_model(credit_knn_spec)
gridvals <- tibble(neighbors = seq(1, 10))
credit_knn_results <- credit_knn_workflow |>
    tune_grid(resamples = credit_vfold, grid = gridvals) |>
    collect_metrics()
# select the value of k resulting in the best RMSE
kmin <- credit_knn_results |>
    filter(.metric == 'rmse') |>
    filter(mean == min(mean)) |>
    pull(neighbors)
# retrain the model using that final k, predict on held-out data
credit_spec <- nearest_neighbor(weight_func = "rectangular", neighbors = kmin) |>
    set_engine("kknn") |>
    set_mode("regression")
credit_fit <- workflow() |>
    add_recipe(credit_knn_recipe) |>
    add_model(credit_spec) |>
    fit(data = credit_training)
knn_rmspe <- credit_fit |>
    predict(credit_testing) |>
    bind_cols(credit_testing) |>
    metrics(truth = Balance, estimate = .pred) |>
    filter(.metric == 'rmse') |>
    pull(.estimate)
### END SOLUTION
knn_rmspe
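To make the comparison in the next question concrete, it can help to put the two test-set errors side by side. A minimal sketch, assuming lm_rmspe from Question 1.9 and knn_rmspe from above are still in your environment:

# compare the two models' prediction error on the same test set
tibble(model = c("linear regression", "K-nn regression"),
       RMSPE = c(lm_rmspe, knn_rmspe)) |>
    arrange(RMSPE)  # smaller RMSPE = better predictions on unseen data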
Question 1.9.2
{points: 3}
Discuss which model, linear regression versus K-nn regression, gives better predictions and why you think that might be happening.
BEGIN SOLUTION
Linear regression is giving better predictions as measured by RMSPE. The RMSPE for linear regression is ~155 and the RMSPE for K-nn regression is ~175. This is likely happening because of one or two of the following reasons:
Even with the best $k$ we can pick, K-nn regression could have slightly overfit the training data, which would explain why it doesn't generalize as well to data that wasn't used to train it.
There is a fairly linear relationship between most/all of the predictors and the target/outcome variable, so linear regression is an appropriate model and fits well.
END SOLUTION
2. Ames Housing Prices
Source: https://media.giphy.com/media/xUPGGuzpmG3jfeYWIg/giphy.gif
If we take a look at the Business Insider report What do millennials want in a home?, we can see that millennials like newer houses that have their own defined spaces. Today we are going to be looking at housing data to understand how the sale price of a house is determined. Finding highly detailed housing data with the final sale prices is very hard; however, researchers from Truman State University have studied and made available a dataset containing multiple variables for the city of Ames, Iowa. The data set describes the sale of individual residential properties in Ames, Iowa from 2006 to 2010. You can read more about the data set here. Today we will be looking at 5 different variables to predict the sale price of a house. These variables are:
Lot Area: lot_area
Year Built: year_built
Basement Square Footage: bsmt_sf
First Floor Square Footage: first_sf
Second Floor Square Footage: second_sf
First, load the data with the script given below.
# run this cell
ames_data <- read_csv('data/ames.csv', col_types = cols()) |>
    select(lot_area = Lot.Area,
           year_built = Year.Built,
           bsmt_sf = Total.Bsmt.SF,
           first_sf = `X1st.Flr.SF`,
           second_sf = `X2nd.Flr.SF`,
           sale_price = SalePrice) |>
    filter(!is.na(bsmt_sf))
ames_data
Question 2.1
{points: 3}
Split the data into a train dataset and a test dataset, based on a 70%-30% train-test split. Use set.seed(2019). Remember that we want to predict the sale_price based on all of the other variables.
Assign the objects to ames_split, ames_training, and ames_testing, respectively.
set.seed(2019) # DO NOT CHANGE!
### BEGIN SOLUTION
ames_split <- initial_split(ames_data, prop = 0.7, strata = sale_price)
ames_training <- training(ames_split)
ames_testing <- testing(ames_split)
### END SOLUTION

# We check that you've created objects with the right names below
# But all other tests were intentionally hidden so that you can practice deciding
# when you have the correct answer.
test_that('Did not create objects named ames_split, ames_training and ames_testing', {
    expect_true(exists("ames_split"))
    expect_true(exists("ames_training"))
    expect_true(exists("ames_testing"))
})
### BEGIN HIDDEN TESTS
test_that('ames_split should be a rsplit object.', {
    expect_true('rsplit' %in% class(ames_split))
})
test_that('ames_training is not a tibble.', {
    expect_true('tbl' %in% class(ames_training))
})
test_that('ames_training does not contain the correct number of rows and/or columns.', {
    expect_equal(dim(ames_training), c(2048, 6))
    expect_equal(digest(int_round(sum(ames_training$lot_area), 2)), '0f473284653')
    expect_equal(digest(int_round(sum(ames_training$first_sf), 2)), '46b1007aee0')
})
test_that('ames_testing is not a tibble.', {
    expect_true('tbl' %in% class(ames_testing))
})
test_that('ames_testing does not contain the correct number of rows and/or columns.', {
    expect_equal(dim(ames_testing), c(881, 6))
    expect_equal(digest(int_round(sum(ames_testing$lot_area), 2)), 'ef74702fa3ef')
    expect_equal(digest(int_round(sum(ames_testing$first_sf), 2)), '2b626f5c6c11')
})
print("Success!")
### END HIDDEN TESTS
Question 2.2
{points: 3}
Let's start by exploring the training data. Use the ggpairs() function from the GGally package to explore the relationships between the different variables.
Assign your plot object to a variable named answer2.2.
set.seed(2020) # DO NOT REMOVE
### BEGIN SOLUTION
options(repr.plot.height = 10, repr.plot.width = 20)
answer2.2 <- ames_training |>
    ggpairs(mapping = aes(alpha = 0.05)) +
    theme(text = element_text(size = 17))
### END SOLUTION
answer2.2

# We check that you've created objects with the right names below
# But all other tests were intentionally hidden so that you can practice deciding
# when you have the correct answer.
test_that('Did not create a plot named answer2.2', {
    expect_true(exists("answer2.2"))
})
### BEGIN HIDDEN TESTS
test_that('answer2.2 should be using data from ames_training', {
    expect_equal(int_round(nrow(answer2.2$data), 0), 2048)
    expect_equal(int_round(ncol(answer2.2$data), 0), 6)
})
test_that('answer2.2 should be a pairwise plot matrix.', {
    expect_true('ggmatrix' %in% c(class(answer2.2)))
})
print("Success!")
### END HIDDEN TESTS
Question 2.3 Multiple Choice:
{points: 1}
Now that we have seen all the relationships between the variables, which of the following variables would not be a strong predictor for sale_price?
A. bsmt_sf
B. year_built
C. first_sf
D. lot_area
E. second_sf
F. It isn't clear from these plots
Assign your answer to an object called answer2.3. Make sure your answer is an uppercase letter and is surrounded by quotation marks (e.g. "F").
### BEGIN SOLUTION
answer2.3 <- "F"
### END SOLUTION
answer2.3

# We check that you've created objects with the right names below
# But all other tests were intentionally hidden so that you can practice deciding
# when you have the correct answer.
test_that('Did not create an object called answer2.3', {
    expect_true(exists('answer2.3'))
})
### BEGIN HIDDEN TESTS
test_that('Solution is incorrect', {
    expect_equal(digest(answer2.3), 'f76b651ab8fcb8d470f79550bf2af53a')
})
print("Success!")
### END HIDDEN TESTS
Question 2.4 - Linear Regression
{points: 3}
Fit a linear regression model using tidymodels with ames_training using all the variables in the data set.
create a model specification called lm_spec
create a recipe called ames_recipe
create a workflow with your model spec and recipe, and then create the model fit and name it ames_fit
set.seed(2020) # DO NOT REMOVE
### BEGIN SOLUTION
lm_spec <- linear_reg() |>
    set_engine("lm") |>
    set_mode("regression")
ames_recipe <- recipe(sale_price ~ ., data = ames_training)
ames_fit <- workflow() |>
    add_recipe(ames_recipe) |>
    add_model(lm_spec) |>
    fit(data = ames_training)
### END SOLUTION
ames_fit

# We check that you've created objects with the right names below
# But all other tests were intentionally hidden so that you can practice deciding
# when you have the correct answer.
test_that('Did not create an object named lm_spec', {
    expect_true(exists("lm_spec"))
})
test_that('Did not create an object named ames_recipe', {
    expect_true(exists("ames_recipe"))
})
test_that('Did not create an object named ames_fit', {
    expect_true(exists("ames_fit"))
})
### BEGIN HIDDEN TESTS
test_that('lm_spec is not a linear regression model', {
    expect_true('linear_reg' %in% class(lm_spec))
})
test_that('lm_spec does not contain the correct specifications', {
    expect_equal(digest(as.character(lm_spec$mode)), 'b8bdd7015e0d1c6037512fd139')
    expect_equal(digest(as.character(lm_spec$engine)), '0995419f6f003f701c545d05')
})
test_that('ames_recipe is not a recipe', {
    expect_true('recipe' %in% class(ames_recipe))
})
test_that('ames_recipe does not contain the correct variables', {
    expect_equal(digest(int_round(sum(ames_recipe$template$lot_area), 2)), '0f47')
    expect_equal(digest(int_round(sum(ames_recipe$template$first_sf), 2)), '46b1')
})
test_that('ames_fit is not a workflow', {
    expect_true('workflow' %in% class(ames_fit))
})
test_that('ames_fit does not contain the correct data', {
    expect_equal(digest(int_round(sum(ames_fit$pre$actions$recipe$recipe$templat
    expect_equal(digest(int_round(sum(ames_fit$pre$actions$recipe$recipe$templat
})
test_that('ames_fit coefficients are incorrect', {
    expect_equal(digest(int_round(sum(ames_fit$fit$fit$fit$coefficients), 2)), '
})
print("Success!")
### END HIDDEN TESTS
Question 2.5 True or False:
{points: 1}
Aside from the intercept, all the variables have a positive relationship with the sale_price. This can be interpreted as: as the value of the variables decreases, the prices of the houses increase.
Assign your answer to an object called answer2.5. Make sure your answer is in lowercase letters and is surrounded by quotation marks (e.g. "true" or "false").
# run this cell
ames_fit$fit$fit$fit$coefficients

### BEGIN SOLUTION
answer2.5 <- "false"
### END SOLUTION
answer2.5

# We check that you've created objects with the right names below
# But all other tests were intentionally hidden so that you can practice deciding
# when you have the correct answer.
test_that('Did not create an object named answer2.5', {
    expect_true(exists("answer2.5"))
})
### BEGIN HIDDEN TESTS
test_that('Solution is incorrect', {
    expect_equal(digest(answer2.5), 'd2a90307aac5ae8d0ef58e2fe730d38b')
})
print("Success!")
### END HIDDEN TESTS
Question 2.6
{points: 3}
Looking at the coefficients and intercept produced from the cell block above, write down the equation for the linear model.
Make sure to use correct math typesetting syntax (surround your answer with dollar signs, e.g. $0.5 * a$).
BEGIN SOLUTION
As long as the coefficients are right and the variable names are right, don't be too picky about math syntax.
END SOLUTION
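For reference, the answer should look like the template below, with the $\beta$ placeholders replaced by the intercept and coefficients printed from ames_fit (a sketch of the expected shape only; the actual numbers come from the fitted model):

$\widehat{\text{sale\_price}} = \beta_0 + \beta_1 \cdot \text{lot\_area} + \beta_2 \cdot \text{year\_built} + \beta_3 \cdot \text{bsmt\_sf} + \beta_4 \cdot \text{first\_sf} + \beta_5 \cdot \text{second\_sf}$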
Question 2.7 Multiple Choice:
{points: 1}
Why can we not easily visualize the model above as a line or a plane in a single plot?
A. This is not true, we can actually easily visualize the model
B. The intercept is much larger (6 digits) than the coefficients (single/double digits)
C. There are more than 2 predictors
D. None of the above
Assign your answer to an object called answer2.7. Make sure your answer is an uppercase letter and is surrounded by quotation marks (e.g. "F").
### BEGIN SOLUTION
answer2.7 <- "C"
### END SOLUTION
answer2.7

# We check that you've created objects with the right names below
# But all other tests were intentionally hidden so that you can practice deciding
# when you have the correct answer.
test_that('Did not create an object named answer2.7', {
    expect_true(exists("answer2.7"))
})
### BEGIN HIDDEN TESTS
test_that('Solution is incorrect', {
    expect_equal(digest(answer2.7), '475bf9280aab63a82af60791302736f6')
})
print("Success!")
### END HIDDEN TESTS
Question 2.8
{points: 3}
We need to evaluate how well our model is doing. For this question, calculate the RMSPE (a single numerical value) of the linear regression model using the test data set and assign it to an object named ames_rmspe.
set.seed(2020) # DO NOT REMOVE
### BEGIN SOLUTION
ames_rmspe <- ames_fit |>
    predict(ames_testing) |>
    bind_cols(ames_testing) |>
    metrics(truth = sale_price, estimate = .pred) |>
    filter(.metric == 'rmse') |>
    select(.estimate) |>
    pull()
### END SOLUTION
ames_rmspe

# We check that you've created objects with the right names below
# But all other tests were intentionally hidden so that you can practice deciding
# when you have the correct answer.
test_that('Did not create an object named ames_rmspe', {
    expect_true(exists("ames_rmspe"))
})
### BEGIN HIDDEN TESTS
test_that('ames_rmspe is incorrect', {
    expect_equal(digest(int_round(ames_rmspe, 2)), '449c6dc6cc4df30b73051b58cabe')
})
print("Success!")
### END HIDDEN TESTS
Question 2.9 Multiple Choice:
{points: 1}
Which of the following statements is incorrect?
A. RMSE is a measure of goodness of fit
B. RMSE measures how well the model predicts on data it was trained with
C. RMSPE measures how well the model predicts on data it was not trained with
D. RMSPE measures how well the model predicts on data it was trained with
Assign your answer to an object called answer2.9. Make sure your answer is an uppercase letter and is surrounded by quotation marks (e.g. "F").
### BEGIN SOLUTION
answer2.9 <- "D"
### END SOLUTION
answer2.9

# We check that you've created objects with the right names below
# But all other tests were intentionally hidden so that you can practice deciding
# when you have the correct answer.
test_that('Did not create an object named answer2.9', {
    expect_true(exists("answer2.9"))
})
### BEGIN HIDDEN TESTS
test_that('Solution is incorrect', {
    expect_equal(digest(answer2.9), 'c1f86f7430df7ddb256980ea6a3b57a4')
})
print("Success!")
### END HIDDEN TESTS
source("cleanup.R")