19BCE1567_LAB1


19BCE1567
13/01/2022
SARA KULKARNI
L21+L22: PROF LAKSHMI PATHI

ESSENTIALS OF DATA ANALYTICS
LAB 1

Tasks for Week-1: Regression

Understand the following operations/functions on a random dataset and perform similar operations on the mtcars and ‘data.csv’ datasets based on the given instructions.

Aim: To develop a linear regression model for the given data using R programming and to test the null hypothesis.

1. MTCARS DATASET

ALGORITHM:
1. Start
2. Load the mtcars data
3. Split the data into training and testing data
4. Use the lm command to generate a linear model for the target variable with respect to the independent variable
5. Print the lmModel
6. Print the summary of the model
7. Predict the target variable using the linear model and compare with the actual values in the test dataset

STATISTICS:

lm(formula = mpg ~ wt, data = train1)

Coefficients:
(Intercept)           wt
     37.490       -5.422

lm(formula = mpg ~ wt, data = train1)

Residuals:
    Min      1Q  Median      3Q     Max
-4.5012 -2.3686 -0.2967  1.3515  6.8391

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  37.4900     2.4144  15.528 2.44e-13 ***
wt           -5.4224     0.7193  -7.538 1.56e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.272 on 22 degrees of freedom
Multiple R-squared:  0.7209,  Adjusted R-squared:  0.7082
F-statistic: 56.83 on 1 and 22 DF,  p-value: 1.561e-07

INFERENCE:
The p-value (1.561e-07) is less than 0.05, so the null hypothesis that the slope of wt is zero is rejected: there is a statistically significant linear relationship between mpg and wt. The multiple R-squared of 0.7209 shows that wt explains about 72% of the variance in mpg in the training data. Hence this model is selected. [ACCEPTED]

Comparing the predicted values with actual values in the test dataset

PROGRAM:
data1 <- mtcars
library(caTools)   # provides sample.split()
set.seed(123)      # make the random split repeatable
# Pass the response vector so there is one TRUE/FALSE flag per row;
# passing the whole data frame would give one flag per column, recycled across rows.
# The exact figures under STATISTICS depend on which rows land in train1.
is_train <- sample.split(data1$mpg, SplitRatio = 0.75)
train1 <- subset(data1, is_train == TRUE)    # training dataset: rows marked TRUE
test1 <- subset(data1, is_train == FALSE)    # testing dataset: rows marked FALSE
dim(test1)
lmmodel <- lm(mpg ~ wt, data = train1)
print(lmmodel)
summary(lmmodel)


test1$PredictedMpg <- predict(lmmodel, test1)
# Print the top 6 rows of actual and predicted mpg
head(test1[, c("mpg", "PredictedMpg")])

2. SAMPLE DATASET

ALGORITHM:
1. Start
2. Read data.csv from files
3. Split the data into training and testing data
4. Use the lm command to generate a linear model for the target variable with respect to the independent variable
5. Print the lmModel
6. Print the summary of the model
7. Check the p-value; since it is greater than 0.05, reject the model

STATISTICS:

Call:
lm(formula = Weight ~ Height, data = train)

Coefficients:
(Intercept)       Height
    17.0547       0.5347

lm(formula = Weight ~ Height, data = train)

Residuals:
    Min      1Q  Median      3Q     Max
-64.904 -25.591  -0.766  29.361  56.206

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  17.0547    49.5133   0.344   0.7320
Height        0.5347     0.2911   1.837   0.0722 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 31.59 on 50 degrees of freedom
Multiple R-squared:  0.0632,  Adjusted R-squared:  0.04446
F-statistic: 3.373 on 1 and 50 DF,  p-value: 0.07222

INFERENCE:
Since the p-value (0.07222) is greater than 0.05, the null hypothesis that the slope of Height is zero cannot be rejected at the 5% level: there is no statistically significant linear relationship between Weight and Height in our dataset. The multiple R-squared of 0.0632 shows that Height explains only about 6% of the variance in Weight. Hence this model is rejected. [REJECTED]
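The figures in the summary output are linked by a few simple formulas, which can be useful for checking an inference by hand. The following is a minimal sketch, not part of the original lab program, that recomputes the t value, p-value, F-statistic and adjusted R-squared for the Weight ~ Height model from the numbers reported above. The variable names are illustrative assumptions; only base R functions are used.

```r
# Values copied from the summary() output reported above for Weight ~ Height.
estimate  <- 0.5347   # slope estimate for Height
std_error <- 0.2911   # standard error of the slope
df_resid  <- 50       # residual degrees of freedom
r_squared <- 0.0632   # multiple R-squared

# The t value is the estimate divided by its standard error.
t_value <- estimate / std_error

# Two-sided p-value from the t distribution with the residual degrees of freedom.
p_value <- 2 * pt(-abs(t_value), df = df_resid)

# With a single predictor, the F-statistic equals the squared t value.
f_stat <- t_value^2

# Adjusted R-squared penalises R-squared for the number of predictors (here 1).
n <- df_resid + 2   # observations = residual df + 2 estimated coefficients
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / df_resid

print(round(c(t = t_value, p = p_value, F = f_stat, adj_R2 = adj_r_squared), 4))

# Decision at the 5% level: reject H0 (slope = 0) only if p < 0.05.
reject_h0 <- p_value < 0.05
print(reject_h0)
```

The same check applies to the mtcars model by substituting its estimate (-5.4224), standard error (0.7193), residual degrees of freedom (22) and R-squared (0.7209).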

PROGRAM:
library(caret)   # provides createDataPartition()
data1 <- read.csv("data.csv", header = TRUE, sep = ",")
head(data1)
# Split data into train and test; p = 0.10 puts about 10% of the rows in train
index <- createDataPartition(data1$Weight, p = 0.10, list = FALSE)
train <- data1[index, ]
test <- data1[-index, ]
# Check the dimensions of train
dim(train)
lmModel <- lm(Weight ~ Height, data = train)
# Print the model object
print(lmModel)
summary(lmModel)
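Step 7 of the mtcars algorithm compares predicted and actual values row by row. One common way to condense that comparison into a single number is an error metric on the test rows. The sketch below is an illustration under stated assumptions rather than part of the lab program: it uses mtcars because the dataset ships with R, splits with base R's sample() instead of caTools, and the seed and the names rmse and mae are hypothetical choices.

```r
# Built-in dataset, so no file or extra package is needed.
data1 <- mtcars

set.seed(123)                             # assumed seed, for a repeatable split
n_train <- floor(0.75 * nrow(data1))      # 75% of the rows go to training
train_idx <- sample(seq_len(nrow(data1)), size = n_train)
train1 <- data1[train_idx, ]
test1 <- data1[-train_idx, ]

# Fit the same simple linear model as in the lab: mpg explained by wt.
lmmodel <- lm(mpg ~ wt, data = train1)

# Predict on the held-out rows and compute the prediction errors.
predicted <- predict(lmmodel, newdata = test1)
errors <- test1$mpg - predicted

rmse <- sqrt(mean(errors^2))   # root mean squared error, in mpg
mae <- mean(abs(errors))       # mean absolute error, in mpg

print(c(RMSE = rmse, MAE = mae))
```

Both metrics are in the units of the target variable, so a smaller value means the predictions sit closer to the actual mpg. The same lines could be applied to the test rows of the data.csv split by swapping in Weight and Height.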
