Review the data below. They give the recovery time (minutes) from a certain procedure, together with the age (years) and sex (1 = male, 0 = female), of 53 hospitalized patients.
"recovery" "age" "sex"
"1" 4 50.52 1
"2" 4 53.16 0
"3" 25 67.47 1
"4" 7 55.56 1
"5" 8 56.03 0
"6" 28 68.72 0
"7" 8 58.69 1
"8" 9 44.88 0
"9" 9 49.51 1
"10" 10 51.43 1
"11" 10 64.79 1
"12" 10 57.88 1
"13" 10 58.21 0
"14" 11 55.89 1
"15" 11 50.55 1
"16" 12 69.3 0
"17" 12 58.98 0
"18" 12 39.27 1
"19" 12 60.61 1
"20" 13 51.22 1
"21" 13 46.46 1
"22" 14 53.26 1
"23" 14 46.79 0
"24" 15 49.17 1
"25" 16 50 1
"26" 18 46.13 0
"27" 20 71.38 0
"28" 21 64.53 0
"29" 21 51.62 1
"30" 22 75.54 0
"31" 22 67.26 0
"32" 23 60.05 0
"33" 23 71.95 1
"34" 24 71.78 1
"35" 25 71.22 0
"36" 25 69.89 0
"37" 25 68.54 1
"38" 26 62.38 0
"39" 26 59.94 0
"40" 27 59.2 0
"41" 28 56.05 1
"42" 28 60.92 0
"43" 28 50.35 0
"44" 31 84.69 0
"45" 39 75.08 1
"46" 44 51.77 0
"47" 45 58.97 1
"48" 46 58.33 0
"49" 50 70.8 0
"50" 60 62.17 1
"51" 60 65.53 0
"52" 65 62.71 0
"53" 72 62.57 0
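After examining the scatterplot of age vs. recovery time, it seems that we might do better if we perform a transformation of the data and then fit the model. Use the following code to get started with the problem: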
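# Read the data; the quoted row labels in the file become row names
dat6 <- read.table("RecoveryData.txt", header = TRUE)
recovery <- dat6$recovery   # recovery time (minutes)
age <- dat6$age             # age (years)
sex <- dat6$sex             # 1 = male, 0 = female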
a.) Take the logarithm of the recovery times and then make a scatterplot of log(recovery time) vs. age. (You do not need to create different plotting characters for males and females.) How can we use this scatterplot to justify fitting a model with log(recovery time) as the response and sex and age as the predictors (i.e., why would this model be preferable to the model with the untransformed recovery time as the outcome)?
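One possible way to approach part a.) is sketched below, assuming base R graphics. To keep the snippet runnable on its own, the values are typed in from the table above instead of being read from RecoveryData.txt. The name log_recovery and the side-by-side layout with the untransformed plot are illustrative choices, not part of the assignment.

```r
# Data transcribed from the table above (assumed to match RecoveryData.txt)
recovery <- c(4, 4, 25, 7, 8, 28, 8, 9, 9, 10,
              10, 10, 10, 11, 11, 12, 12, 12, 12, 13,
              13, 14, 14, 15, 16, 18, 20, 21, 21, 22,
              22, 23, 23, 24, 25, 25, 25, 26, 26, 27,
              28, 28, 28, 31, 39, 44, 45, 46, 50, 60,
              60, 65, 72)
age <- c(50.52, 53.16, 67.47, 55.56, 56.03, 68.72, 58.69, 44.88, 49.51, 51.43,
         64.79, 57.88, 58.21, 55.89, 50.55, 69.30, 58.98, 39.27, 60.61, 51.22,
         46.46, 53.26, 46.79, 49.17, 50.00, 46.13, 71.38, 64.53, 51.62, 75.54,
         67.26, 60.05, 71.95, 71.78, 71.22, 69.89, 68.54, 62.38, 59.94, 59.20,
         56.05, 60.92, 50.35, 84.69, 75.08, 51.77, 58.97, 58.33, 70.80, 62.17,
         65.53, 62.71, 62.57)

# Natural logarithm of the recovery times (all times are positive, so this is defined)
log_recovery <- log(recovery)

# Two panels: untransformed response on the left, log response on the right
old_par <- par(mfrow = c(1, 2))
plot(age, recovery,
     xlab = "Age (years)", ylab = "Recovery time (minutes)",
     main = "Untransformed")
plot(age, log_recovery,
     xlab = "Age (years)", ylab = "log(recovery time)",
     main = "Log-transformed")
par(old_par)  # restore the previous graphics settings
```

When comparing the two panels, the things to look for are whether the trend against age is closer to a straight line on the log scale and whether the vertical spread of the points is more even across ages. Those correspond to the linearity and constant-variance assumptions of the linear model, so a log-scale plot that satisfies them better is the justification the question asks for.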
b.) Fit the multiple linear regression model with log(recovery time) as the response and sex and age as the predictors in R. Provide the R output for the fitted model and write down the estimated regression equation.
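A minimal sketch of the fit requested in part b.), assuming the standard lm() function from base R. The data are again typed in from the table above so that the snippet runs without RecoveryData.txt. The object names fit and b are illustrative, and no coefficient values are quoted here because they should be read from the output.

```r
# Data transcribed from the table above (assumed to match RecoveryData.txt)
recovery <- c(4, 4, 25, 7, 8, 28, 8, 9, 9, 10,
              10, 10, 10, 11, 11, 12, 12, 12, 12, 13,
              13, 14, 14, 15, 16, 18, 20, 21, 21, 22,
              22, 23, 23, 24, 25, 25, 25, 26, 26, 27,
              28, 28, 28, 31, 39, 44, 45, 46, 50, 60,
              60, 65, 72)
age <- c(50.52, 53.16, 67.47, 55.56, 56.03, 68.72, 58.69, 44.88, 49.51, 51.43,
         64.79, 57.88, 58.21, 55.89, 50.55, 69.30, 58.98, 39.27, 60.61, 51.22,
         46.46, 53.26, 46.79, 49.17, 50.00, 46.13, 71.38, 64.53, 51.62, 75.54,
         67.26, 60.05, 71.95, 71.78, 71.22, 69.89, 68.54, 62.38, 59.94, 59.20,
         56.05, 60.92, 50.35, 84.69, 75.08, 51.77, 58.97, 58.33, 70.80, 62.17,
         65.53, 62.71, 62.57)
sex <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 1,
         1, 1, 0, 1, 1, 0, 0, 1, 1, 1,
         1, 1, 0, 1, 1, 0, 0, 0, 1, 0,
         0, 0, 1, 1, 0, 0, 1, 0, 0, 0,
         1, 0, 0, 0, 1, 0, 1, 0, 0, 1,
         0, 0, 0)

# Multiple linear regression with log(recovery time) as the response
fit <- lm(log(recovery) ~ age + sex)

# Coefficient table, residual standard error, R-squared, and F-test
print(summary(fit))

# Write out the estimated regression equation from the fitted coefficients
b <- coef(fit)
cat(sprintf("Estimated log(recovery) = %.4f + (%.4f) * age + (%.4f) * sex\n",
            b["(Intercept)"], b["age"], b["sex"]))
```

Because the response is on the log scale, each coefficient acts multiplicatively on the original scale: exp(b["age"]) is the estimated factor by which the typical (geometric mean) recovery time changes per additional year of age with sex held fixed, and exp(b["sex"]) is the estimated ratio for males relative to females at the same age.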