Consider the graph below analyzing the size of tree vs. accuracy for a decision tree which has been pruned back to the red line.

[Figure 2: Pruned decision tree. x-axis: size of tree (number of nodes); y-axis: accuracy (0.5 to 0.9). Curves: on training data; on validation data; on validation data (during pruning).]

Refer to Figure 2. Let's say that we have a third dataset Dnew (from the same data distribution), which is not used for training or pruning. If we evaluate this new dataset, approximately what is the accuracy when the size of the tree is at 25 nodes, and why? Select one.
(a) Around 0.76 (slightly higher than the accuracy for validation data at 25 nodes)
(b) Around 0.73 (the same as the accuracy for validation data at 25 nodes)
(c) Around 0.70 (slightly lower than the accuracy for validation data at 25 nodes)
(d) None of the above

Which of the following gives us the best approximation of the true error?
(a) Line corresponding to training data
(b) Line corresponding to validation data
(c) Line corresponding to new dataset Dnew

Which of the following are valid ways to avoid overfitting? Select all that apply.
(a) Decrease the training set size.
(b) Set a threshold for a minimum number of samples required to split at an internal node.
(c) Prune the tree so that cross-validation error is minimal.
(d) Maximize the tree depth.
(e) None of the above.
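The first two questions turn on whether accuracy measured on data that was used to choose the pruned tree can be trusted as an estimate for unseen data. The sketch below is a simplified simulation of that selection effect, not a reconstruction of the tree in Figure 2. It assumes, purely for illustration, that every candidate pruned tree has the same true accuracy of 0.70; the function names and all parameter values are hypothetical.

```python
import random


def simulated_accuracy(true_acc, n_examples, rng):
    """Observed accuracy on n_examples independent examples for a model
    whose true (population) accuracy is true_acc."""
    correct = sum(1 for _ in range(n_examples) if rng.random() < true_acc)
    return correct / n_examples


def selection_bias_demo(true_acc=0.70, n_candidates=20, n_examples=200,
                        n_trials=200, seed=0):
    """Return (mean validation accuracy of the selected candidate,
    mean accuracy of that candidate on fresh data)."""
    rng = random.Random(seed)
    val_total = 0.0
    new_total = 0.0
    for _ in range(n_trials):
        # Score every candidate pruned tree on the validation set and keep
        # the best score, as pruning against validation accuracy would.
        val_scores = [simulated_accuracy(true_acc, n_examples, rng)
                      for _ in range(n_candidates)]
        val_total += max(val_scores)
        # The chosen candidate still has accuracy true_acc on data that
        # played no part in training or pruning (the role of Dnew).
        new_total += simulated_accuracy(true_acc, n_examples, rng)
    return val_total / n_trials, new_total / n_trials


if __name__ == "__main__":
    val_mean, new_mean = selection_bias_demo()
    print(f"validation accuracy of selected tree: {val_mean:.3f}")
    print(f"accuracy on fresh data:               {new_mean:.3f}")
```

Under these assumptions the validation score of the selected candidate comes out above its score on fresh data, because the selection step picks whichever candidate happened to get lucky on the validation set. This is the usual argument for treating a dataset that was not used for training or pruning as the less biased estimate of true error.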