Let us return to the Titanic data set. We have now learned several models and want to choose the
best one. We used three different methods to validate these models: the training error rate (apparent error rate),
the error rate on an external test set, and the error rate estimated by 10-fold cross-validation.
Which of the following statements are correct?
a) 1-Nearest-Neighbour has a perfect training error and hence it should be used here.
b) Random Forests outperforms both 1-Nearest-Neighbour and the Decision Tree in terms of prediction error.
c) Not just in this case, but in general, Cross Validation is the better validation strategy and should always be
preferred over the error on a single test set.
d) Not just in this case, but in general, Decision Trees always perform worse than Random Forests.
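The three validation methods named in the question can be illustrated with a minimal sketch. It assumes synthetic two-class data with label noise (not the Titanic data) and a hand-written 1-nearest-neighbour classifier; the helper names `predict_1nn`, `error_rate`, `cv_error` and `make_data` are hypothetical and chosen only for this illustration. It shows why the training error of 1-nearest-neighbour is zero by construction (every training point is its own nearest neighbour), while the test-set and cross-validation estimates are computed on points the model has not seen.

```python
import random


def predict_1nn(train, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return nearest[1]


def error_rate(train, evaluation):
    """Fraction of points in `evaluation` misclassified by 1-NN fitted on `train`."""
    wrong = sum(predict_1nn(train, x) != y for x, y in evaluation)
    return wrong / len(evaluation)


def cv_error(data, k=10):
    """k-fold cross-validation estimate of the 1-NN error rate."""
    folds = [data[i::k] for i in range(k)]  # k disjoint folds
    errors = []
    for i in range(k):
        held_out = folds[i]
        rest = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(error_rate(rest, held_out))
    return sum(errors) / k


def make_data(n, rng):
    """Synthetic two-class data with 20% label noise (an assumption, NOT the Titanic data)."""
    data = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = int(x[0] + x[1] > 1.0)
        if rng.random() < 0.2:  # flip the label to simulate noise
            y = 1 - y
        data.append((x, y))
    return data


rng = random.Random(0)
train = make_data(200, rng)
test = make_data(100, rng)

train_err = error_rate(train, train)  # apparent error: evaluated on the training data itself
test_err = error_rate(train, test)    # error on an external test set
cv_err = cv_error(train, k=10)        # 10-fold cross-validation estimate

print(f"training error:   {train_err:.3f}")
print(f"test-set error:   {test_err:.3f}")
print(f"10-fold CV error: {cv_err:.3f}")
```

In this sketch the training error says nothing about how the model generalises, because each training point is matched to itself; only the test-set and cross-validation figures estimate the prediction error on new data.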