HW_1 (Rmd) · Georgia Institute of Technology · Course 6501 · Statistics · Feb 20, 2024
---
title: "HW 1"
author: ""
date: "2023-08-24"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Question 2.1

### Describe a situation or problem from your job, everyday life, current events, etc., for which a classification model would be appropriate. List some (up to 5) predictors that you might use.

### Answer:

One thing I deal with daily for which a classification model would be appropriate is deciding which jacket to wear outside. Living in Portland, Oregon, a common question is whether it will rain today; if so, a rain jacket is a good idea. If it is cold, a hoodie or sweatshirt is a good idea. If it is cold and rainy, a larger rain jacket over a sweatshirt works best. If it is summer and not rainy, no jacket is ideal. If it is snowing, a ski jacket is a good idea.

## Question 2.2

### 1.

```{r, echo=FALSE}
library(kernlab)

# Columns V1..V10 are predictors, V11 is the 0/1 response
data <- read.table("credit_card_data.txt", header = FALSE)

# Linear-kernel C-classification SVM with scaled predictors
model <- ksvm(as.matrix(data[, 1:10]), as.factor(data[, 11]),
              type = "C-svc", kernel = "vanilladot", C = 100, scaled = TRUE)

# Coefficients a1..a10 of the separating hyperplane (in scaled units)
a <- colSums(model@xmatrix[[1]] * model@coef[[1]])
a

# Intercept a0: kernlab stores the negative of the intercept in model@b
a0 <- -model@b
a0

# Predict on the same data the model was fitted to
pred <- predict(model, data[, 1:10])
pred

# Fraction of points classified correctly
sum(pred == data[, 11]) / nrow(data)
```

#### With C = 100, my classifier is about 86% accurate on the data it was fitted to.

### 3.

```{r pressure, echo=FALSE}
library(kknn)

k_neighbors <- seq(1, 10, 1)
avg_classification_accuracy <- numeric(length(k_neighbors))

for (idx in seq_along(k_neighbors)) {
  j <- k_neighbors[idx]
  knn_pred <- rep(0, nrow(data))  # reset predictions for each k

  for (i in 1:nrow(data)) {
    # Leave point i out of the training set, then predict it
    model_knn <- kknn(V11 ~ V1 + V2 + V3 + V4 + V5 + V6 + V7 + V8 + V9 + V10,
                      data[-i, ], data[i, ],
                      k = j, distance = 2, kernel = "optimal", scale = TRUE)
    knn_pred[i] <- round(fitted.values(model_knn))
  }

  # Accuracy for this k, computed once every point has been predicted
  avg_classification_accuracy[idx] <- sum(knn_pred == data[, 11]) / nrow(data)
}

df <- data.frame(number_of_neighbors_considered = k_neighbors,
                 avg_classification_accuracy = avg_classification_accuracy)
df  # compare accuracy across all tested k
```

#### In my run, the best value for k was 9 because it had the highest classification accuracy of the values I tested (k = 1 to 10).
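The leave-one-out idea used in part 3 can be sketched in base R without `kknn`. This is only an illustration under stated assumptions: `loocv_knn_accuracy` is a hypothetical helper, it uses a plain unweighted majority vote rather than the `"optimal"` kernel weighting in `kknn`, it scales using all rows (including the held-out one), and the data are made up rather than taken from `credit_card_data.txt`.

```r
# Hypothetical helper (not part of kknn): leave-one-out accuracy of a plain,
# unweighted k-nearest-neighbor classifier.
# x: numeric matrix of predictors, y: 0/1 response vector, k: number of neighbors.
loocv_knn_accuracy <- function(x, y, k) {
  x <- scale(x)                      # standardize columns (uses all rows; a simplification)
  d <- as.matrix(dist(x))            # pairwise Euclidean distances
  pred <- numeric(nrow(x))
  for (i in seq_len(nrow(x))) {
    d_i <- d[i, ]
    d_i[i] <- Inf                    # point i may not be its own neighbor
    nn <- order(d_i)[seq_len(k)]     # indices of the k nearest other points
    pred[i] <- round(mean(y[nn]))    # majority vote; an exact tie (0.5) rounds to 0
  }
  mean(pred == y)                    # one accuracy per k, after all points are predicted
}

# Made-up two-cluster data, for illustration only
set.seed(1)
n <- 60
x <- cbind(c(rnorm(n / 2, 0), rnorm(n / 2, 3)),
           c(rnorm(n / 2, 0), rnorm(n / 2, 3)))
y <- rep(c(0, 1), each = n / 2)

ks <- 1:10
acc <- sapply(ks, function(k) loocv_knn_accuracy(x, y, k))
data.frame(k = ks, accuracy = acc)
ks[which.max(acc)]                   # first k with the highest accuracy
```

The point of the sketch is the ordering of steps: every point is predicted while held out, and only then is a single accuracy computed for that k, so the values compared across k all rest on the same n held-out predictions.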