Correlation
Correlation describes the relationship between two variables. It measures the degree to which the variables move in relation to each other. When two sets of data tend to vary together, there is a correlation between them.
Linear Correlation
Linear correlation measures the strength and direction of the straight-line relationship between two numerical variables. It is summarized by the Pearson correlation coefficient r, which ranges from -1 to 1. Correlation analysis is the study of how variables are related.
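As a minimal sketch of how a linear correlation coefficient is computed, the R snippet below evaluates Pearson's r from its definition and compares it with base R's `cor()`. The `x` and `y` values are invented for illustration only and are not the patients data.

```r
# Toy data, invented purely for illustration (not the patients data).
x <- c(10, 20, 30, 40, 50, 60)
y <- c(12, 25, 29, 47, 52, 58)

# Pearson's r from its definition: the sum of cross-products of deviations,
# scaled by the square root of the product of the sums of squared deviations.
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Base R's cor() computes the same quantity (method = "pearson" is the default).
r_builtin <- cor(x, y)

print(c(manual = r_manual, builtin = r_builtin))
```

A value near 1 or -1 indicates a strong straight-line relationship; a value near 0 indicates a weak one.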
Regression Analysis
Regression analysis is a statistical method that estimates the relationship between a dependent variable and one or more independent variables. The dependent variable is also called the outcome (or response) variable, and the independent variables are called predictors. Regression analysis is one way to identify trends in data: the fitted model describes how the outcome changes, on average, as the predictors change.
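The idea can be sketched in R with `lm()`. The data frame `sim` below is simulated as a stand-in (the column names mirror the question, but the values and the generating coefficients are assumptions). The sketch also shows that, with a single predictor, R-squared equals the squared Pearson correlation.

```r
set.seed(1)  # make the simulated data reproducible

# Simulated stand-in data; the variable names mirror the question, the values do not.
sim <- data.frame(enz = runif(54, min = 20, max = 120))
sim$time <- 50 + 2 * sim$enz + rnorm(54, sd = 40)

fit <- lm(time ~ enz, data = sim)   # ordinary least squares fit
print(coef(fit))                    # estimated intercept and slope

# With a single predictor, R-squared equals the squared Pearson correlation.
r2_model <- summary(fit)$r.squared
r2_cor   <- cor(sim$enz, sim$time)^2
print(c(r2_model = r2_model, r2_cor = r2_cor))
```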
- Compare the simple linear model in Question 2 and the exponential model in Question 3. Do you observe any improvement from the exponential model relative to the linear model? Support your conclusion with details (such as R-Square, homogeneity of variance and normality test).
This is Q2 and my answer to it:
- Fit the simple linear regression model TIME = β0 + β1ENZ + ε. Write down the estimated regression function and examine the residual plot and normality test. Describe what you observed and make brief comments. Hint: you need to check the ANOVA table (that is, the F-Statistic and its p-value on the last line of the summary(yourmodel) output), parameter estimates tables, R-Square, residual plot and normality test.
Coefficients:
(Intercept)          enz
   -108.716        3.967
Estimated regression function: TIME = -108.716 + 3.967 ENZ
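The Shapiro-Wilk test gives a p-value below 0.0001 (printed as 0.0000), which is less than 0.05. The null hypothesis of normality is rejected, so the residuals are not normally distributed. The Cramer-von Mises and Anderson-Darling tests agree; only the Kolmogorov-Smirnov test (p = 0.2190) fails to reject.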
-----------------------------------------------
       Test             Statistic       pvalue
-----------------------------------------------
Shapiro-Wilk              0.866         0.0000
Kolmogorov-Smirnov        0.14          0.2190
Cramer-von Mises          5.1667        0.0000
Anderson-Darling          1.9562        0.0000
-----------------------------------------------
Residuals:
    Min      1Q  Median      3Q     Max
-243.33  -64.74  -25.19   49.45  486.50
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -108.7161    61.7191  -1.761    0.084 .
enz            3.9668     0.7721   5.137 4.25e-06 ***
Residual standard error: 119.5 on 52 degrees of freedom
Multiple R-squared: 0.3367, Adjusted R-squared: 0.3239
F-statistic: 26.39 on 1 and 52 DF, p-value: 4.25e-06
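As a quick consistency check on the Q2 output, the sketch below uses only the numbers printed above. It relies on two standard facts for simple linear regression: the overall F-statistic equals the square of the slope's t value, and adjusted R-squared follows from R-squared, the sample size and the number of predictors. The sample size n = 54 is inferred from the 52 residual degrees of freedom.

```r
# Values copied from the summary() output above.
t_enz  <- 5.137    # t value for enz
f_stat <- 26.39    # F-statistic
r2     <- 0.3367   # Multiple R-squared
df_res <- 52       # residual degrees of freedom
p      <- 1        # number of predictors
n      <- df_res + p + 1   # inferred sample size: 54

# In simple linear regression the overall F-statistic equals the slope's t squared.
f_from_t <- t_enz^2

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(round(c(f_from_t = f_from_t, adj_r2 = adj_r2), 4))
```

Both values should agree with the printed F-statistic and adjusted R-squared up to rounding.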
This is Q3 and my answer to it:
- Fit the exponential model logTIME = β0 + β1ENZ + ε. Write down the estimated regression function. Does the model fit well? Why? Hint: you need to check the ANOVA table (that is, the F-Statistic and its p-value on the last line of the summary(yourmodel) output), parameter estimates tables, R-Square, residual plot and normality test.
lm(formula = log(time) ~ enz, data = patients)
Model: logTIME = β0 + β1ENZ + ε
Estimated regression function: logTIME = 3.558633 + 0.019727 ENZ
Residuals:
     Min       1Q   Median       3Q      Max
-1.19415 -0.29725 -0.02198  0.34125  1.01853
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.558633   0.245526  14.494  < 2e-16 ***
enz         0.019727   0.003072   6.423 4.12e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4753 on 52 degrees of freedom
Multiple R-squared: 0.4424, Adjusted R-squared: 0.4316
F-statistic: 41.25 on 1 and 52 DF, p-value: 4.118e-08
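The model fits well: the overall F-test p-value is 4.118e-08, far below 0.05, so ENZ is a significant predictor of logTIME.
Normality test:
W = 0.98822, p-value = 0.8716, which is more than 0.05. We fail to reject the null hypothesis, so the residuals are consistent with a normal distribution.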
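To read the exponential model on the original TIME scale, the fitted equation can be back-transformed. The sketch below uses the two coefficients reported above. The enzyme value `enz_new` is an arbitrary illustrative choice, not a value taken from the data. Under the model's normal-error assumption, exponentiating a fitted logTIME value estimates the median TIME rather than the mean.

```r
b0 <- 3.558633   # intercept from the log(time) fit above
b1 <- 0.019727   # slope for enz from the log(time) fit above

# On the original scale the fitted model reads TIME = exp(b0) * exp(b1 * ENZ),
# so each one-unit increase in ENZ multiplies the fitted TIME by exp(b1).
multiplier <- exp(b1)
pct_change <- 100 * (multiplier - 1)

# Fitted value on the original scale at a hypothetical enzyme score.
enz_new  <- 80   # arbitrary illustrative value, not taken from the data
time_hat <- exp(b0 + b1 * enz_new)

print(c(multiplier = multiplier, pct_change = pct_change, time_hat = time_hat))
```

The multiplier is slightly above 1, so each additional unit of ENZ raises the fitted TIME by roughly 2 percent.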
The coefficient of determination is defined as the proportion of total variability in the dependent variable that is explained by the model. It is denoted by R².
A value of R² = 0 indicates that the dependent variable cannot be predicted from the independent variable. A value of R² = 1 indicates a perfect fit. Hence, the closer the value of R² is to 1, the better the fit.
For the linear model, the multiple R-squared (0.3367) and adjusted R-squared (0.3239) are both low, indicating a weak fit. For the exponential model, the multiple R-squared (0.4424) and adjusted R-squared (0.4316) are higher, indicating a comparatively better fit. The normality tests point the same way: the Shapiro-Wilk p-value is below 0.0001 for the linear model's residuals but 0.8716 for the exponential model's.
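The comparison can be organized along the three diagnostics the question names. The sketch below is illustrative only: `sim` is simulated data generated from an assumed exponential relationship, `diagnose()` is a hypothetical helper, and the correlation between absolute residuals and fitted values is only a crude stand-in for a residual plot or a formal test of constant variance. Because the two models have different response scales (TIME and logTIME), their R-squared values are not strictly comparable and are best read alongside the residual diagnostics.

```r
set.seed(42)  # reproducible simulated data

# Simulated stand-in for the patients data: time grows exponentially in enz.
sim <- data.frame(enz = runif(54, min = 20, max = 120))
sim$time <- exp(3.5 + 0.02 * sim$enz + rnorm(54, sd = 0.5))

fit_lin <- lm(time ~ enz, data = sim)
fit_log <- lm(log(time) ~ enz, data = sim)

# Hypothetical helper: collect three diagnostics for one fitted model.
diagnose <- function(fit) {
  res <- resid(fit)
  c(
    r_squared  = summary(fit)$r.squared,
    shapiro_p  = shapiro.test(res)$p.value,   # normality of the residuals
    spread_cor = cor(abs(res), fitted(fit))   # near 0 suggests constant variance
  )
}

comparison <- rbind(linear = diagnose(fit_lin), exponential = diagnose(fit_log))
print(round(comparison, 4))
```

With the real data, replacing `sim` by `patients` would give one row per model to compare side by side.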