Advanced Data Mining and Predictive Analytics-1
Rahul Chakravarthy k
2023-10-22
Question A1: What is the main purpose of regularization when training predictive
models? - Regularization is the process of selecting or tuning the preferred level of
model complexity. By penalizing overly complex models during training, regularization
helps the model fit the training data properly without overfitting, so that it
generalizes better to new, unseen data.
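As a minimal sketch of this effect (simulated data; the variable names and lambda values here are illustrative assumptions, not part of the assignment), a ridge penalty in glmnet pulls spurious coefficients toward zero:
# Sketch: regularization shrinks spurious coefficients (simulated data)
library(glmnet)
set.seed(1)
X <- matrix(rnorm(50 * 20), 50, 20)   # 50 rows, 20 mostly irrelevant predictors
y <- 2 * X[, 1] + rnorm(50)           # only the first predictor matters
unreg <- glmnet(X, y, alpha = 0, lambda = 0)  # no penalty
ridge <- glmnet(X, y, alpha = 0, lambda = 1)  # ridge penalty
max(abs(coef(unreg)[-1]))  # irrelevant coefficients stay relatively large
max(abs(coef(ridge)[-1]))  # all coefficients shrunk toward zero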
Question A2: What is the role of a loss function in a predictive model? And name two
common loss functions for regression models and two common loss functions for
classification models. - A loss function, also known as a cost function or error
function, quantifies the distance between the model's current output and the expected
output. In general, it is a method to evaluate how well the model is performing its
task of predicting the output. Regression algorithms predict a continuous value
(such as age or price) from the input variables, while classification algorithms
approximate a mapping from the input variables to a discrete output variable, such
as labels or categories. For a regression problem, the most common loss functions
are mean squared error and mean absolute error; mean bias error and relative
absolute error are other options. For a classification problem, the most common loss
functions are binary and categorical cross-entropy; hinge loss, used by support
vector machines, is another. It is important to choose the correct loss function,
as it directly affects how the model learns.
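As a quick hand-rolled sketch (these helper functions are illustrative, not from any package), the losses named above can be written directly in R:
# Illustrative loss functions written by hand (a sketch, not a library API)
mse <- function(y, yhat) mean((y - yhat)^2)        # regression: mean squared error
mae <- function(y, yhat) mean(abs(y - yhat))       # regression: mean absolute error
# classification: binary cross-entropy, y in {0, 1}, p = predicted probability
bce <- function(y, p) -mean(y * log(p) + (1 - y) * log(1 - p))
# classification: hinge loss (used by SVMs), y in {-1, +1}, f = raw score
hinge <- function(y, f) mean(pmax(0, 1 - y * f))
mse(c(1, 2, 3), c(1.1, 1.9, 3.2))  # close predictions -> small loss
bce(c(1, 0), c(0.9, 0.2))          # confident, correct probabilities -> small loss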
Question A3: Consider the following scenario. You are building a classification model
with many hyperparameters on a relatively small data set. You will see that the
training error is extremely small. Can you fully trust this model? Discuss the reason.
- A classification model built with many hyperparameters on a relatively small
dataset cannot be completely trusted. The main issue is overfitting, the most common
problem when many hyperparameters are tuned on a small dataset. Small datasets also
tend to contain a higher level of noise, and their samples may not reflect the true
population. Using many parameters to build a classification model on a small dataset
therefore leads the model to chase every data point, learning the noise along with
the signal. Such models cannot be trusted because they are likely to perform poorly
on unseen data despite their very small training error.
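A small simulated sketch of that failure mode (the data, the 15-point sample size, and the degree-10 polynomial are all illustrative assumptions): a very flexible model can reach near-zero training error yet do far worse on held-out data.
# Sketch: a flexible model overfits a tiny sample (simulated data)
set.seed(2)
x_tr <- runif(15); y_tr <- sin(2 * pi * x_tr) + rnorm(15, sd = 0.3)
x_te <- runif(200); y_te <- sin(2 * pi * x_te) + rnorm(200, sd = 0.3)
fit <- lm(y_tr ~ poly(x_tr, 10))  # degree-10 polynomial: many parameters, few points
rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))
rmse(y_tr, fitted(fit))                            # training error: very small
rmse(y_te, predict(fit, data.frame(x_tr = x_te)))  # test error: much larger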
Question A4: What is the role of the lambda parameter in regularized linear models
such as Lasso or Ridge regression models? - The lambda parameter is the
regularization hyperparameter: it controls the amount of regularization applied to
the model. A large lambda value results in stronger regularization, which shrinks
the coefficients of the features; the model becomes less accurate on the training
data and, if lambda is too large, it underfits. Conversely, a smaller lambda means
weaker regularization and larger coefficients, which improves performance on the
training set, but that performance may not hold up on unseen data. Choosing lambda
is therefore crucial, and cross-validation can be used to determine its ideal value.
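A brief sketch of that trade-off (simulated data; the two lambda values are arbitrary illustrations, and the cross-validated version appears in the assignment code below): as lambda grows, lasso coefficients shrink and some are driven exactly to zero.
# Sketch: larger lambda -> stronger shrinkage, fewer non-zero lasso coefficients
library(glmnet)
set.seed(3)
X <- matrix(rnorm(100 * 6), 100, 6)
y <- drop(X %*% c(3, -2, 1, 0, 0, 0)) + rnorm(100)  # only 3 true predictors
weak   <- glmnet(X, y, alpha = 1, lambda = 0.01)  # weak penalty
strong <- glmnet(X, y, alpha = 1, lambda = 1)     # strong penalty
sum(coef(weak) != 0)    # most coefficients survive
sum(coef(strong) != 0)  # several coefficients forced to exactly zero
# cv.glmnet(X, y, alpha = 1)$lambda.min picks lambda by cross-validation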
#Question A1
# Load the required libraries and data
library(ISLR)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(glmnet)
## Loading required package: Matrix
## Loaded glmnet 4.1-8
library(caret)
## Loading required package: ggplot2
## Loading required package: lattice
data(Carseats)
set.seed(64046)
# Select the required attributes
Carseats_Filtered <- Carseats %>%
  select(Sales, Price, Advertising, Population, Age, Income, Education)
# Split the data into training and testing sets
set.seed(123)
trainIndex <- createDataPartition(Carseats_Filtered$Sales, p = 0.7, list = FALSE)
train <- Carseats_Filtered[trainIndex, ]
test <- Carseats_Filtered[-trainIndex, ]
# The input attributes are scaled and centered
prepValues <- preProcess(train[, -1], method = c("center", "scale"))
train[, -1] <- predict(prepValues, train[, -1])
test[, -1] <- predict(prepValues, test[, -1])
# Create a matrix of input attributes and the output variable
x <- as.matrix(train[, -1])
y <- train$Sales
# Fit a Lasso regression model using glmnet
lassofit <- glmnet(x, y, alpha = 1)
# Use cross-validation to find the best value of lambda
cv.out <- cv.glmnet(x, y, alpha = 1)
best.lambda <- cv.out$lambda.min
# Plot the coefficient paths for different values of lambda
plot(lassofit, xvar = "lambda", label = TRUE)
# Plot the mean squared error (MSE) vs. log(lambda) for the cross-validation results
plot(cv.out)
# Print the best value of lambda
cat("Best lambda value:", best.lambda, "\n")
## Best lambda value: 0.08344626
#Question A2
# Using glmnet, fit a lasso regression model with the best lambda value
lassofit <- glmnet(x, y, alpha = 1, lambda = best.lambda)
# Get the coefficients of the best model
coeflasso <- predict(lassofit, type = "coefficients", s = best.lambda)
# Print the coefficient for the "Price" attribute
cat("Coefficient for Price:", coeflasso["Price", ], "\n")
## Coefficient for Price: -1.175399
#Question A3
# Fit a Lasso regression model using glmnet with lambda = 0.01
lasso.fit.01 <- glmnet(x, y, alpha = 1, lambda = 0.01)
# Count the number of non-zero coefficients (the count includes the intercept)
num.vars.01 <- sum(coef(lasso.fit.01) != 0)
# Print the number of variables with non-zero coefficients
cat("Number of variables with non-zero coefficients (lambda = 0.01):", num.vars.01, "\n")
## Number of variables with non-zero coefficients (lambda = 0.01): 7
# Fit a Lasso regression model using glmnet with lambda = 0.1
lasso.fit.1 <- glmnet(x, y, alpha = 1, lambda = 0.1)
# Count the number of non-zero coefficients (again including the intercept)
num.vars.1 <- sum(coef(lasso.fit.1) != 0)
# Print the number of variables with non-zero coefficients
cat("Number of variables with non-zero coefficients (lambda = 0.1):", num.vars.1, "\n")
## Number of variables with non-zero coefficients (lambda = 0.1): 5
#Question A4
# Fit an elastic-net model using glmnet with alpha = 0.6
enet.fit <- glmnet(x, y, alpha = 0.6)
# Use cross-validation to find the best value of lambda for the elastic-net model
cv.enet <- cv.glmnet(x, y, alpha = 0.6)
# Plot the cross-validation results
plot(cv.enet)
# Extract the optimal value of lambda from the cross-validation results
lambda.opt <- cv.enet$lambda.min
# Print the optimal value of lambda
cat("Optimal value of lambda for elastic-net model (alpha = 0.6):", lambda.opt, "\n")
## Optimal value of lambda for elastic-net model (alpha = 0.6): 0.09586045