Hw12P8

pdf

School

San Francisco State University

Course

25

Subject

Computer Science

Date

Nov 24, 2024

Type

pdf

Pages

1

Uploaded by MinisterSteelSheep33

In [16]: k = 47
         k
Out[16]: 47

In [17]: grader.check("q1_6")
Out[17]: q1_6 passed!

Question 1.7. Why do we divide our data into a training and test set? What is the point of a test set, and why do we only want to use the test set once? Explain your answer in 3 sentences or less. (10 points)

Hint: Check out this section in the textbook.

We divide the data so that the classifier is fit on the training set and then evaluated on data it has not seen, which lets us detect overfitting instead of mistaking it for good performance. The test set exists to evaluate the classifier, so we should not use it to find the best number of neighbors: choosing k on the test set would skew the classifier toward that particular test set, and the resulting accuracy would be biased rather than objective. This is why we only want to use the test set once: it stands in for out-of-sample data, and every additional use skews the performance estimate.

Question 1.8. Why do we use an odd-numbered k in k-NN? Explain. (10 points)

We use an odd-numbered k in k-NN to avoid ties in the majority vote between the two classes. For example, if you chose k = 4 and two of the nearest neighbors were "Berkeley" and two were "Stanford," the vote would be inconclusive and we could not classify the data point.

Question 1.9.0. Setup

Thomas has devised a scheme for splitting up the test and training set. For each row from coordinates:

Rows for Stanford students have a 50% chance of being placed in the training set and a 50% chance of being placed in the test set.
Rows for Berkeley students have an 80% chance of being placed in the training set and a 20% chance of being placed in the test set.

Hint 1: Remember that there are 77 Berkeley students and 23 Stanford students in coordinates.

Hint 2: Thomas' last name is Bayes. (So 18.1 from the textbook may be helpful here!)
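The questions that follow this setup are not shown here, so the sketch below is only an illustration of how Bayes' rule (Hint 2) could be applied to Thomas's scheme. It uses the counts from Hint 1 as prior probabilities and the placement chances from the setup as likelihoods. The helper `posterior` and all variable names are hypothetical and are not part of the assignment.

```python
# Priors from Hint 1: 23 Stanford and 77 Berkeley rows out of 100.
p_stanford = 23 / 100
p_berkeley = 77 / 100

# Placement chances from the setup (probability of landing in the test set).
p_test_given_stanford = 0.5
p_test_given_berkeley = 0.2


def posterior(prior_a, likelihood_a, prior_b, likelihood_b):
    """Bayes' rule for two classes: P(A | evidence).

    Hypothetical helper for illustration; not defined in the assignment.
    """
    joint_a = prior_a * likelihood_a
    joint_b = prior_b * likelihood_b
    return joint_a / (joint_a + joint_b)


# Chance that a row in the test set belongs to a Stanford student.
p_stanford_given_test = posterior(
    p_stanford, p_test_given_stanford,
    p_berkeley, p_test_given_berkeley,
)

# Chance that a row in the training set belongs to a Berkeley student.
p_berkeley_given_train = posterior(
    p_berkeley, 1 - p_test_given_berkeley,
    p_stanford, 1 - p_test_given_stanford,
)

print(round(p_stanford_given_test, 3))   # 0.428
print(round(p_berkeley_given_train, 3))  # 0.843
```

Under these assumptions, Stanford rows make up about 43% of the test set even though they are only 23% of coordinates, because they are placed in the test set more often than Berkeley rows.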