Assignment 3: Linear/Quadratic Discriminant Analysis and Comparing Classification Methods
SDS293 - Machine Learning
Due: 11 Oct 2017 by 11:59pm

Conceptual Exercises

4.5 (p. 169 ISLR) This question examines the differences between LDA and QDA.

(a) If the Bayes decision boundary is linear, do we expect LDA or QDA to perform better on the training set? On the test set?

Solution: We would expect QDA to perform better on the training set because its increased flexibility will result in a closer fit. If the Bayes decision boundary is linear, we expect LDA to perform better than QDA on the test set, as QDA could be subject to overfitting.

(b) If the Bayes decision boundary is non-linear, do we expect LDA or QDA to perform better on the training set? On the test set?

Solution: If the Bayes decision boundary is non-linear, we expect QDA to perform better on both the training and test sets.

(c) In general, as the sample size n increases, do we expect the test prediction accuracy of QDA relative to LDA to improve, decline, or be unchanged? Why?

Solution: We expect the test prediction accuracy of QDA relative to LDA to improve as n increases. In general, as the sample size grows, a more flexible method will yield a better fit, because the larger sample offsets the method's higher variance.

(d) True or False: Even if the Bayes decision boundary for a given problem is linear, we will probably achieve a superior test error rate using QDA rather than LDA because QDA is flexible enough to model a linear decision boundary. Justify your answer.

Solution: False. With fewer sample points, the variance from using a more flexible method such as QDA would likely result in overfitting, yielding a higher test error rate than LDA.
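The LDA/QDA distinction in 4.5 can be illustrated with a tiny one-dimensional sketch. The example below is not from the assignment (the course uses R; this is plain Python with made-up class parameters): QDA assigns a point to the class with the larger Gaussian log posterior, allowing a per-class variance. When the variances are equal, the quadratic terms cancel and the rule reduces to LDA's linear boundary at the midpoint of the means; when they differ, the boundary shifts.

```python
# Hypothetical 1-D illustration of the QDA classification rule.
# All class parameters below are invented for demonstration.
import math

def gaussian_log_density(x, mu, sigma2):
    """Log density of N(mu, sigma2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)

def qda_classify(x, params):
    """Assign x to the class with the largest log posterior.
    params maps label -> (prior, mean, variance); per-class variance = QDA."""
    return max(params, key=lambda k: math.log(params[k][0])
               + gaussian_log_density(x, params[k][1], params[k][2]))

# Equal variances: the decision boundary is linear (midpoint of the means, 1.0).
equal = {"A": (0.5, 0.0, 1.0), "B": (0.5, 2.0, 1.0)}
print(qda_classify(0.9, equal), qda_classify(1.1, equal))   # A B

# Unequal variances: the quadratic term moves the boundary, so a point just
# past the midpoint can still be assigned to the tighter class A.
unequal = {"A": (0.5, 0.0, 1.0), "B": (0.5, 2.0, 9.0)}
print(qda_classify(1.1, unequal))   # A
```

This is the sense in which QDA "contains" LDA as a special case while carrying extra variance: the per-class variance parameters help when the true boundary is non-linear but add noise when it is not.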
Applied Exercises

4.10 (p. 171 ISLR) This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data from this chapter's lab, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010.

(a) Produce some numerical and graphical summaries of the Weekly data. Do there appear to be any patterns?

Solution: Year and Volume appear to have a relationship. No other patterns are discernible.

(b) Use the full data set to perform a logistic regression with Direction as the response and the five lag variables plus Volume as predictors, and use the summary() function to print the results. Do any of the predictors appear to be statistically significant? If so, which ones?

Solution: Lag2 appears to have some statistical significance, with Pr(>|z|) = 3%.

(c) Compute the confusion matrix and overall fraction of correct predictions. What is the confusion matrix telling you about the types of mistakes made by your logistic model?

Solution: Fraction of correct predictions: (54 + 557) / (54 + 557 + 48 + 430) = 56.1%. On weeks when the market goes up, the logistic regression is right most of the time: 557 / (557 + 48) = 92.1%. However, on weeks when the market goes down, it is right only 54 / (430 + 54) = 11.2% of the time.

(d) Now fit the logistic regression model using a training data period from 1990 to 2008, with Lag2 as the only predictor. Report the confusion matrix and the overall fraction of correct predictions for the test data (that is, the data from 2009 and 2010).

Solution:

    glm.pred   Down   Up
      Down        9    5
      Up         34   56

    Overall fraction correct: 0.625

(e) Repeat (d) using LDA.

Solution: Same as logistic regression.
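The arithmetic in part (c) can be checked directly from the confusion matrix (rows are the model's predictions, columns the actual Direction). The sketch below is plain Python rather than the R used in the assignment, using the counts reported above:

```python
# Counts from the full-data logistic fit in part (c):
# (predicted, actual) -> count
cm = {("Down", "Down"): 54, ("Down", "Up"): 48,
      ("Up", "Down"): 430, ("Up", "Up"): 557}

total = sum(cm.values())                              # 1089 weeks
correct = cm[("Down", "Down")] + cm[("Up", "Up")]     # 54 + 557
accuracy = correct / total

# Fraction of actual-Up weeks predicted correctly:
up_rate = cm[("Up", "Up")] / (cm[("Up", "Up")] + cm[("Down", "Up")])
# Fraction of actual-Down weeks predicted correctly:
down_rate = cm[("Down", "Down")] / (cm[("Down", "Down")] + cm[("Up", "Down")])

print(round(accuracy, 3), round(up_rate, 3), round(down_rate, 3))
# 0.561 0.921 0.112
```

The asymmetry (92.1% vs 11.2%) reflects that the model predicts Up on the large majority of weeks, so it is right whenever the market rises and wrong whenever it falls.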
(f) Repeat (d) using QDA.

Solution:

               Down   Up
      Down        0    0
      Up         43   61

    (rows: predicted, columns: actual)

    Overall fraction correct: 0.587

A correctness of 58.7% even though it picked Up the whole time!

(g) Repeat (d) using KNN with K = 1.

Solution:

               Down   Up
      Down       21   30
      Up         22   31

    (rows: predicted, columns: actual)

    Overall fraction correct: 0.5

(h) Which of these methods appears to provide the best results on this data?

Solution: Logistic regression and LDA both provide the same, lowest test error rate.

(i) Experiment with different combinations of predictors, including possible transformations and interactions, for each of the methods. You should also experiment with values for K in the KNN classifier. Report the predictors, method, and associated confusion matrix that appears to provide the best results on the held-out data. Why do you think this one performed best?

Solution: This problem will have different solutions depending on which combinations you tried.

Variation of 4.13 (p. 173 ISLR) Using the Boston data set from ISLR, fit a classification model in order to predict whether a given suburb has a crime rate above or below the median. You may want to explore logistic regression, LDA, and KNN models using various subsets of the predictors. Once you're satisfied with your results, describe your model and findings: Why did you choose that type of model? How did you choose your predictors? What does your model tell you about the data?
Where does it break down? Is there additional information that you would need to know to be able to make a better model?

Solution: This problem will have different solutions depending on which combinations you tried. Interesting solutions will be anonymized and made available after grading is complete.
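The Boston exercise's first step is always the same regardless of which classifier you choose: convert the continuous crime rate into a binary response that is 1 above the median and 0 below it. A minimal sketch in plain Python (the course uses R; the crime rates below are made up, not from the Boston data):

```python
# Hypothetical first step for the 4.13 variation: binarize the response
# at the sample median. The crim values here are invented for illustration.
from statistics import median

crim = [0.02, 0.09, 0.26, 0.54, 3.67, 8.98]   # made-up per-suburb crime rates
m = median(crim)                               # (0.26 + 0.54) / 2 = 0.40
crim01 = [1 if c > m else 0 for c in crim]     # binary response for the classifier
print(crim01)                                  # [0, 0, 0, 1, 1, 1]
```

By construction this split is balanced (half the suburbs on each side of the median), so a classifier's accuracy can be compared directly against the 50% baseline of guessing.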