An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics)
13th Edition
ISBN: 9781461471370
Author: Gareth James
Publisher: SPRINGER NATURE CUSTOMER SERVICE
Expert Solution & Answer
Chapter 2, Problem 7E
a.
Explanation of Solution
Euclidean distance
X1 | X2 | X3 | Y | Distance from origin |
0 | 3 | 0 | Red | 3 |
2 | ... |
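The distance column can be recomputed with a short sketch. Only the first row of the table is visible above, so the remaining rows here are an assumption: they are taken from the data table of Exercise 7 in Chapter 2 of the textbook and should be checked against it. The names `observations` and `distance_from_origin` are illustrative.

```python
import math

# ASSUMPTION: rows 2-6 are taken from the textbook's Exercise 7 table,
# not from the truncated table on this page. Columns: X1, X2, X3, Y.
observations = [
    (0, 3, 0, "Red"),
    (2, 0, 0, "Red"),
    (0, 1, 3, "Red"),
    (0, 1, 2, "Green"),
    (-1, 0, 1, "Green"),
    (1, 1, 1, "Red"),
]


def distance_from_origin(x1, x2, x3):
    """Euclidean distance between (x1, x2, x3) and the test point (0, 0, 0)."""
    return math.sqrt(x1 ** 2 + x2 ** 2 + x3 ** 2)


# Distances rounded to two decimals, in the same order as the observations.
distances = [
    round(distance_from_origin(x1, x2, x3), 2)
    for x1, x2, x3, _ in observations
]

for (x1, x2, x3, label), d in zip(observations, distances):
    print(f"{x1:>2} | {x2} | {x3} | {label:<5} | {d}")
```

The first printed row reproduces the visible table entry: the point (0, 3, 0) lies at distance 3 from the origin.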
b.
Explanation of Solution
Prediction of value k
- Prediction with K=1 is Green...
c.
Explanation of Solution
Prediction of value k
- Prediction with K=3 is Red.
- This is because t...
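The K-nearest-neighbours predictions for the test point (0, 0, 0) can be checked with a minimal sketch. It assumes the six observations from the textbook's Exercise 7 table (only the first row is visible on this page) and uses a plain majority vote. The function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter

# ASSUMPTION: data from the textbook's Exercise 7 table; verify against the book.
observations = [
    ((0, 3, 0), "Red"),
    ((2, 0, 0), "Red"),
    ((0, 1, 3), "Red"),
    ((0, 1, 2), "Green"),
    ((-1, 0, 1), "Green"),
    ((1, 1, 1), "Red"),
]


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    def euclidean(point):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(point, query)))

    # Sort training points by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: euclidean(item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


origin = (0, 0, 0)
prediction_k1 = knn_predict(observations, origin, k=1)
prediction_k3 = knn_predict(observations, origin, k=3)
print("K=1:", prediction_k1)
print("K=3:", prediction_k3)
```

Under these assumptions the single nearest neighbour is the Green point (-1, 0, 1), while the three nearest neighbours are one Green and two Red points, so the majority vote differs between K=1 and K=3.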
d.
Explanation of Solution
Bayes decision boundary
- When K becomes larger, we get a smoother boundary...
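The smoothing effect of a larger K can be illustrated with a toy one-dimensional example. Everything in this sketch is hypothetical and is not the exercise's data: twenty points on a line, class 0 on the left and class 1 on the right, with two deliberately mislabelled points. Counting how often the prediction changes along a grid gives a rough measure of how wiggly the decision boundary is.

```python
from collections import Counter

# HYPOTHETICAL toy data: x = 0..19, class 0 for x < 10 and class 1 otherwise,
# with two labels flipped (x = 3 and x = 15) to act as noise.
labels = [0] * 10 + [1] * 10
labels[3] = 1
labels[15] = 0
train = list(enumerate(labels))  # list of (x, label) pairs


def knn_predict_1d(train, query, k):
    """Majority label among the k training points closest to query on the line."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def count_flips(train, k):
    """Number of times the predicted class changes along an evenly spaced grid."""
    # Grid points are offset by 0.25 so no two training points are equidistant.
    grid = [i * 0.5 + 0.25 for i in range(39)]
    preds = [knn_predict_1d(train, q, k) for q in grid]
    return sum(1 for a, b in zip(preds, preds[1:]) if a != b)


flips_k1 = count_flips(train, k=1)
flips_k5 = count_flips(train, k=5)
print("class changes with K=1:", flips_k1)
print("class changes with K=5:", flips_k5)
```

With K=1 the prediction follows every noisy label, whereas with K=5 the two mislabelled points are outvoted and only the single change near x = 10 remains. A highly nonlinear Bayes decision boundary therefore calls for a more flexible fit, which a smaller K provides.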