Review the data below. It contains recovery time (minutes) from a certain procedure, along with age (years) and sex (1 = male, 0 = female), for 53 hospitalized patients.
"recovery" "age" "sex"
"1" 4 50.52 1
"2" 4 53.16 0
"3" 25 67.47 1
"4" 7 55.56 1
"5" 8 56.03 0
"6" 28 68.72 0
"7" 8 58.69 1
"8" 9 44.88 0
"9" 9 49.51 1
"10" 10 51.43 1
"11" 10 64.79 1
"12" 10 57.88 1
"13" 10 58.21 0
"14" 11 55.89 1
"15" 11 50.55 1
"16" 12 69.3 0
"17" 12 58.98 0
"18" 12 39.27 1
"19" 12 60.61 1
"20" 13 51.22 1
"21" 13 46.46 1
"22" 14 53.26 1
"23" 14 46.79 0
"24" 15 49.17 1
"25" 16 50 1
"26" 18 46.13 0
"27" 20 71.38 0
"28" 21 64.53 0
"29" 21 51.62 1
"30" 22 75.54 0
"31" 22 67.26 0
"32" 23 60.05 0
"33" 23 71.95 1
"34" 24 71.78 1
"35" 25 71.22 0
"36" 25 69.89 0
"37" 25 68.54 1
"38" 26 62.38 0
"39" 26 59.94 0
"40" 27 59.2 0
"41" 28 56.05 1
"42" 28 60.92 0
"43" 28 50.35 0
"44" 31 84.69 0
"45" 39 75.08 1
"46" 44 51.77 0
"47" 45 58.97 1
"48" 46 58.33 0
"49" 50 70.8 0
"50" 60 62.17 1
"51" 60 65.53 0
"52" 65 62.71 0
"53" 72 62.57 0
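After examining the scatterplot of age vs. recovery time, it seems that we might do better if we perform a transformation of the data and then fit the model. Use the following code to get started with the problem: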
dat6 <- read.table("RecoveryData.txt", header = TRUE)
recovery <- dat6$recovery
age <- dat6$age
sex <- dat6$sex
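A minimal sketch for looking at the raw and log-scale scatterplots side by side. It assumes the 53 observations are typed in as vectors (copied from the listing above) rather than read from RecoveryData.txt, so that it runs without the file; the plot titles and axis labels are illustrative choices, not part of the assignment.

```r
# Inline copy of the 53 observations listed above (assumption: used here
# instead of read.table("RecoveryData.txt") so the sketch is self-contained).
recovery <- c(4, 4, 25, 7, 8, 28, 8, 9, 9, 10,
              10, 10, 10, 11, 11, 12, 12, 12, 12, 13,
              13, 14, 14, 15, 16, 18, 20, 21, 21, 22,
              22, 23, 23, 24, 25, 25, 25, 26, 26, 27,
              28, 28, 28, 31, 39, 44, 45, 46, 50, 60,
              60, 65, 72)
age <- c(50.52, 53.16, 67.47, 55.56, 56.03, 68.72, 58.69, 44.88, 49.51, 51.43,
         64.79, 57.88, 58.21, 55.89, 50.55, 69.3, 58.98, 39.27, 60.61, 51.22,
         46.46, 53.26, 46.79, 49.17, 50, 46.13, 71.38, 64.53, 51.62, 75.54,
         67.26, 60.05, 71.95, 71.78, 71.22, 69.89, 68.54, 62.38, 59.94, 59.2,
         56.05, 60.92, 50.35, 84.69, 75.08, 51.77, 58.97, 58.33, 70.8, 62.17,
         65.53, 62.71, 62.57)
sex <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 1,
         1, 1, 0, 1, 1, 0, 0, 1, 1, 1,
         1, 1, 0, 1, 1, 0, 0, 0, 1, 0,
         0, 0, 1, 1, 0, 0, 1, 0, 0, 0,
         1, 0, 0, 0, 1, 0, 1, 0, 0, 1,
         0, 0, 0)
dat6 <- data.frame(recovery, age, sex)

# Two panels: untransformed recovery time and log(recovery time) against age.
old_par <- par(mfrow = c(1, 2))
plot(age, recovery,
     xlab = "Age (years)", ylab = "Recovery time (minutes)",
     main = "Untransformed")
plot(age, log(recovery),
     xlab = "Age (years)", ylab = "log(recovery time)",
     main = "Log-transformed")
par(old_par)  # restore the previous plotting layout
```

When comparing the two panels, the things to look for are whether the log-scale points follow a straighter trend and whether their vertical spread is more even across ages.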
a.) Take the logarithm of the recovery times and then make a scatterplot of log(recovery time) vs. age. (You do not need to create different plotting characters for males and females.) How can we use this scatterplot to justify fitting a model with log(recovery time) as the response and sex and age as the predictors (i.e., why would this model be preferable to the model with the untransformed recovery time as the outcome)?
b.) Fit the multiple linear regression model with log(recovery time) as the response and sex and age as the predictors in R. Provide the R output for the fitted model and write down the estimated regression equation.
c.) Compare the R² values for the model with log(recovery time) as the response and with recovery time as the response. Which model is better with respect to this value?
d.) Interpret the estimated slopes of sex and age for the fitted model with log(recovery time) as the response.
e.) What is the predicted recovery time for a new 53-year-old male patient based on the estimated model with log(recovery time) as the response? Do not use R.
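f.) Use R to find the 95% prediction interval for the predicted recovery time for a new 53-year-old male patient based on the estimated model with log(recovery time) as the response. Note that you will use the predict function in the same way as we have learned to use it in simple linear regression settings. The difference here is that you have two predictors, so you need to use the following code for the newX object: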
newX <- data.frame(sex = 1, age = 53)
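One way parts b.) through f.) might be approached in R is sketched below. This is not the posted solution. It assumes the data are entered inline (copied from the listing above) instead of read from RecoveryData.txt, and the object names fit_log, fit_raw, r2, pred_log, and pred_min are made up for illustration. It uses only lm, summary, coef, and predict from base R.

```r
# Inline copy of the 53 observations (assumption: replaces the read.table call).
recovery <- c(4, 4, 25, 7, 8, 28, 8, 9, 9, 10,
              10, 10, 10, 11, 11, 12, 12, 12, 12, 13,
              13, 14, 14, 15, 16, 18, 20, 21, 21, 22,
              22, 23, 23, 24, 25, 25, 25, 26, 26, 27,
              28, 28, 28, 31, 39, 44, 45, 46, 50, 60,
              60, 65, 72)
age <- c(50.52, 53.16, 67.47, 55.56, 56.03, 68.72, 58.69, 44.88, 49.51, 51.43,
         64.79, 57.88, 58.21, 55.89, 50.55, 69.3, 58.98, 39.27, 60.61, 51.22,
         46.46, 53.26, 46.79, 49.17, 50, 46.13, 71.38, 64.53, 51.62, 75.54,
         67.26, 60.05, 71.95, 71.78, 71.22, 69.89, 68.54, 62.38, 59.94, 59.2,
         56.05, 60.92, 50.35, 84.69, 75.08, 51.77, 58.97, 58.33, 70.8, 62.17,
         65.53, 62.71, 62.57)
sex <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 1,
         1, 1, 0, 1, 1, 0, 0, 1, 1, 1,
         1, 1, 0, 1, 1, 0, 0, 0, 1, 0,
         0, 0, 1, 1, 0, 0, 1, 0, 0, 0,
         1, 0, 0, 0, 1, 0, 1, 0, 0, 1,
         0, 0, 0)
dat6 <- data.frame(recovery, age, sex)

# b.) Multiple regression with log(recovery) as the response.
fit_log <- lm(log(recovery) ~ sex + age, data = dat6)
print(summary(fit_log))  # coefficients give the estimated regression equation

# c.) R-squared for the log-response and raw-response models.
fit_raw <- lm(recovery ~ sex + age, data = dat6)
r2 <- c(log = summary(fit_log)$r.squared, raw = summary(fit_raw)$r.squared)
print(r2)

# d.) exp(slope) is the multiplicative change in recovery time on the
# original (minutes) scale, holding the other predictor fixed.
print(exp(coef(fit_log)[c("sex", "age")]))

# f.) 95% prediction interval for a new 53-year-old male, on the log scale,
# then back-transformed to minutes with exp().
newX <- data.frame(sex = 1, age = 53)
pred_log <- predict(fit_log, newdata = newX,
                    interval = "prediction", level = 0.95)
pred_min <- exp(pred_log)
print(pred_min)  # columns: fit, lwr, upr
```

Two caveats apply. The two R² values describe different responses (log minutes vs. minutes), so the comparison is informal. Exponentiating the log-scale interval endpoints gives an interval in minutes, and exp(fit) is best read as a typical (median-type) recovery time rather than a mean.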
Hello, thank you for your help! For part e, I am not sure how you got the answer. Is there an equation you used? I have to find the recovery time for age 53, but I can't figure out how to do that based on your answer for age 54.
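The general form of the part e.) calculation can be written out as follows. This is a sketch, not the original answer's working: b_0, b_1, and b_2 are placeholder symbols for the intercept, sex slope, and age slope estimated in part b.), and it assumes the natural logarithm was used (as R's log does by default). Substituting age = 53 instead of 54 is the only change needed.

```latex
\begin{align*}
\widehat{\log(\text{recovery})} &= b_0 + b_1\,(\text{sex}) + b_2\,(\text{age}) \\
  &= b_0 + b_1\,(1) + b_2\,(53), \\[4pt]
\widehat{\text{recovery}} &= \exp\bigl(b_0 + b_1 + 53\,b_2\bigr)\ \text{minutes}.
\end{align*}
```

In words: plug sex = 1 and age = 53 into the fitted equation to get the predicted log recovery time, then exponentiate to return to minutes.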