Katie Carte BIFS613 Homework 5

1. Follow the steps to produce predictions (50 points)

Introduction: The iris data set gives the measurements, in centimeters, of sepal length, sepal width, petal length, and petal width for 50 flowers from each of three species of iris. The species are _Iris setosa_, _versicolor_, and _virginica_.

In R:

> library(class)
> tidx <- sample(nrow(iris), round(.6 * nrow(iris)))
> train <- iris[tidx, -5]
> test <- iris[-tidx, -5]
> cl <- iris[tidx, 5]
> orig <- iris[-tidx, 5]
> pred <- knn(train, test, cl, k = 3, prob = TRUE)
> # Use the table below to build your confusion matrix
> table(orig, pred)

2. Answer the following questions (50 points)

2.1 What method was used to provide the predictions?

The k-Nearest Neighbors (k-NN) classification method was used in this code to make the predictions.

2.2 What is the role of the variable cl?

The variable "cl" is short for "class," and it holds the class labels associated with the training data points. The k-NN algorithm uses "cl" to relate the feature data in the training set (stored in "train") to the corresponding class labels, which is what allows it to classify new, unseen data. The line cl <- iris[tidx, 5] extracts those labels, i.e., the species, for the training rows.
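As a quick illustration of how cl pairs with the training rows (a minimal sketch; these checks are not part of the assignment code):

> head(train, 3)               # first three rows of training features
> head(cl, 3)                  # the species labels for those same rows
> length(cl) == nrow(train)    # knn() needs one label per training row
[1] TRUE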
2.3 Use the table generated above to produce your measures of performance and provide them below.

Accuracy: 0.6666667
Precision: 0.6666667
Recall: 0.6666667
F1 Score: 0.6666667

2.4 What did you predict? How did the method perform? Discuss your findings.

With all performance measures (accuracy, precision, recall, and F1 score) consistently at about 67%, the classification model is making predictions at a similar level of correctness across all measures. My model strikes a balance between making accurate positive predictions and capturing true positive cases. The original and predicted species for each test flower are listed below.

   Original     Predicted
 1 setosa       setosa
 2 setosa       setosa
 3 setosa       setosa
 4 setosa       setosa
 5 setosa       setosa
 6 setosa       setosa
 7 setosa       setosa
 8 setosa       setosa
 9 setosa       setosa
10 setosa       setosa
11 setosa       setosa
12 setosa       setosa
13 setosa       setosa
14 setosa       setosa
15 setosa       setosa
16 setosa       setosa
17 setosa       setosa
18 setosa       setosa
19 setosa       setosa
20 setosa       setosa
21 setosa       setosa
22 setosa       setosa
23 setosa       setosa
24 setosa       setosa
25 versicolor   versicolor
26 versicolor   versicolor
27 versicolor   versicolor
28 versicolor   versicolor
29 versicolor   versicolor
30 versicolor   versicolor
31 versicolor   versicolor
32 versicolor   versicolor
33 versicolor   versicolor
34 versicolor   versicolor
35 versicolor   versicolor
36 versicolor   virginica
37 versicolor   versicolor
38 versicolor   versicolor
39 versicolor   versicolor
40 versicolor   versicolor
41 versicolor   versicolor
42 virginica    virginica
43 virginica    virginica
44 virginica    virginica
45 virginica    versicolor
46 virginica    virginica
47 virginica    virginica
48 virginica    virginica
49 virginica    virginica
50 virginica    virginica
51 virginica    virginica
52 virginica    virginica
53 virginica    virginica
54 virginica    virginica
55 virginica    virginica
56 virginica    virginica
57 virginica    virginica
58 virginica    virginica
59 virginica    virginica
60 virginica    virginica
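As a cross-check, the listed predictions can be summarized into a confusion matrix directly in R; a minimal sketch reusing the assignment's orig and pred from part 1 (output omitted here):

> cm <- table(orig, pred)    # rows: true species, columns: predicted species
> sum(diag(cm)) / sum(cm)    # overall accuracy: correct predictions / total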
The measures reported in 2.3 were produced with the helper functions below, demonstrated on a small sample of original and predicted labels:

> orig <- c("setosa", "versicolor", "versicolor", "virginica", "setosa", "versicolor")
> pred <- c("setosa", "versicolor", "setosa", "virginica", "versicolor", "versicolor")
> accuracy <- function(orig, pred) {
+   mean(orig == pred)
+ }
> precision <- function(orig, pred, positive_class) {
+   true_positive <- sum(orig == positive_class & pred == positive_class)
+   false_positive <- sum(orig != positive_class & pred == positive_class)
+   true_positive / (true_positive + false_positive)
+ }
> recall <- function(orig, pred, positive_class) {
+   true_positive <- sum(orig == positive_class & pred == positive_class)
+   false_negative <- sum(orig == positive_class & pred != positive_class)
+   true_positive / (true_positive + false_negative)
+ }
> f1_score <- function(orig, pred, positive_class) {
+   prec <- precision(orig, pred, positive_class)
+   rec <- recall(orig, pred, positive_class)
+   2 * (prec * rec) / (prec + rec)
+ }
> View(accuracy)
> function(orig, pred) {
+   mean(orig == pred)
+ }
function(orig, pred) {
  mean(orig == pred)
}
> cat("Accuracy:", accuracy(orig, pred), "\n")
Accuracy: 0.6666667
> cat("Precision (for 'versicolor'):", precision(orig, pred, positive_class = "versicolor"), "\n")
Precision (for 'versicolor'): 0.6666667
> cat("Recall (for 'versicolor'):", recall(orig, pred, positive_class = "versicolor"), "\n")
Recall (for 'versicolor'): 0.6666667
> cat("F1 Score (for 'versicolor'):", f1_score(orig, pred, positive_class = "versicolor"), "\n")
F1 Score (for 'versicolor'): 0.6666667
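As a follow-up sketch (not part of the graded output), the same helper can be applied per class to see how performance varies across species; on the six-label sample above this gives:

> sapply(c("setosa", "versicolor", "virginica"),
+        function(cls) f1_score(orig, pred, positive_class = cls))
    setosa versicolor  virginica 
 0.5000000  0.6666667  1.0000000 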