Correlation
Correlation describes the strength and direction of the relationship between two variables. It measures the degree to which the variables move in relation to each other: when two sets of data tend to change together, there is a correlation between them.
Linear Correlation
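Linear correlation measures how closely the relationship between two quantitative (numerical) variables follows a straight line. The Pearson correlation coefficient r ranges from -1 to +1: values near -1 or +1 indicate a strong linear relationship, and values near 0 indicate little or no linear relationship. Correlation analysis is the study of how variables are related.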
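As a minimal sketch of this definition, the Pearson coefficient can be computed by hand as the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. To test whether it differs from zero, the statistic t = r * sqrt((n - 2) / (1 - r^2)) is compared with a Student's t distribution on n - 2 degrees of freedom. The function names `pearson_r` and `t_statistic` are illustrative, and the five sample pairs are simply the first five rows of the data table in the question, so the printed values say nothing about the full data set.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# First five rows of the table only (illustration, not the full analysis).
run_time = [8.10, 6.22, 5.63, 6.33, 8.32]
oxygen = [47.01, 54.37, 59.73, 50.31, 45.17]

r = pearson_r(run_time, oxygen)
print(f"r = {r:.3f}, t = {t_statistic(r, len(oxygen)):.2f}")
```

Statistical software reports the p-value for this t statistic directly; the hand computation is only meant to show where that number comes from.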
Regression Analysis
Regression analysis is a statistical method that estimates the relationship between a dependent variable and one or more independent variables. The dependent variable is also called the outcome (or response) variable, and the independent variables are called predictors. Regression analysis is one way to find trends in data: given the value of a predictor, the fitted model provides an estimate of the associated dependent variable.
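For a single predictor, the least-squares line has slope Sxy / Sxx and intercept equal to the mean of y minus the slope times the mean of x. The sketch below shows that computation; `fit_line` is an illustrative name, and the sample values are only the first five rows of the data table in the question, so the printed equation is not the answer for the full data set.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# First five rows of the table only (illustration, not the full analysis).
run_time = [8.10, 6.22, 5.63, 6.33, 8.32]
oxygen = [47.01, 54.37, 59.73, 50.31, 45.17]

b0, b1 = fit_line(run_time, oxygen)
print(f"Oxygen = {b0:.2f} + ({b1:.2f}) * RunTimeOneMile")
```

The same two coefficients appear in the coefficients table of any statistical package's linear regression output.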
[You must include the statistical software output that you used for obtaining the statistics to conclude your analysis. Please place them right after your answer for each part of the question.]
Part 1
Correlation between Two Quantitative Variables and Simple Linear Regression
Oxygen consumption rate is a measure of aerobic fitness. However, the procedure is expensive and cumbersome. Your task in this project is to use the following recorded data to learn about the project participants and build a model to estimate oxygen consumption using one of the other variables provided in the following data set. The variables are
Oxygen intake rate (ml per kg body weight per minute),
Age (in years),
BMI (Weight/Height²),
RunTime (time to run one mile, in minutes),
RestPulse (resting pulse rate per minute),
RunPulse (heart rate while running, taken at the same time the oxygen rate was measured),
MaxPulse (maximum heart rate recorded while running),
Ranking (a runner’s club ranking)
The data can be downloaded from the following link: http://gchang.people.ysu.edu/stat/fitnessBMIOneMileRunHealthClub_19f.sav
[If you have trouble downloading the file from your browser, please try a different browser. Or right-click the link, save the data to your computer, and then open the file with statistical software.]
Please use the data above to answer the following questions and support them with statistics and charts from statistical software:
- Use statistical software on the data provided for this question to find the one best predictor for oxygen consumption among all other variables in the data set. Explain your findings using scatter plots or a matrix of scatter plots (an option in statistical software), the Pearson correlation coefficient, and the p-values from testing whether Pearson's correlation coefficient is statistically significantly different from zero at the 5% level of significance. (Show statistics or graphs to support your model, and if non-linearity is observed, try to linearize the relation to find a better linear fit. See the last page of the lecture note for an example of transforming a variable to linearize the relation.)
[Copy and paste your graphs and output from statistical software here!]
- Build a simple linear regression model using the best predictor you found in 1. to estimate oxygen consumption and write the regression equation in the following space. Please justify your reason for choosing your best predictor for this question.
[Copy and paste your statistical software output here!]
Oxygen | BMI | Age | RestPulse | RunPulse | MaxPulse | RunTimeOneMile | Rankings |
47.01 | 23.91 | 43 | 61 | 170 | 190 | 8.10 | 1438 |
54.37 | 24.99 | 45 | 46 | 157 | 169 | 6.22 | 618 |
59.73 | 25.12 | 44 | 39 | 165 | 173 | 5.63 | 277 |
50.31 | 26.26 | 39 | 56 | 179 | 181 | 6.33 | 559 |
45.17 | 24.54 | 48 | 59 | 177 | 177 | 8.32 | 4000 |
45.93 | 25.00 | 41 | 71 | 177 | 181 | 7.99 | 2961 |
49.14 | 25.10 | 44 | 65 | 163 | 171 | 7.35 | 1551 |
40.25 | 23.73 | 45 | 64 | 176 | 177 | 9.10 | 8955 |
60.42 | 24.91 | 39 | 49 | 171 | 187 | 6.25 | 516 |
51.52 | 24.73 | 45 | 46 | 169 | 169 | 7.15 | 1278 |
37.93 | 23.93 | 46 | 57 | 187 | 193 | 9.51 | 13449 |
44.78 | 23.97 | 46 | 52 | 177 | 177 | 7.77 | 2360 |
48.13 | 26.23 | 48 | 48 | 163 | 165 | 7.51 | 1820 |
52.85 | 25.25 | 55 | 51 | 167 | 171 | 6.89 | 985 |
49.95 | 26.80 | 50 | 45 | 181 | 186 | 8.10 | 551 |
41.04 | 26.29 | 52 | 58 | 169 | 173 | 7.76 | 2344 |
47.27 | 24.22 | 52 | 49 | 163 | 169 | 7.09 | 1195 |
47.04 | 26.75 | 49 | 49 | 163 | 165 | 6.93 | 1025 |
51.36 | 24.25 | 50 | 68 | 169 | 169 | 7.03 | 1126 |
40.07 | 25.69 | 58 | 59 | 175 | 177 | 8.73 | 6165 |
46.08 | 24.29 | 55 | 63 | 157 | 166 | 7.64 | 2079 |
46.35 | 24.46 | 53 | 49 | 165 | 167 | 7.30 | 710 |
55.13 | 26.29 | 51 | 49 | 147 | 156 | 6.47 | 643 |
46.09 | 24.51 | 52 | 49 | 173 | 173 | 7.81 | 2473 |
39.88 | 25.97 | 55 | 45 | 169 | 173 | 9.19 | 9831 |
45.85 | 25.43 | 52 | 60 | 187 | 189 | 7.42 | 1669 |
50.95 | 20.49 | 58 | 50 | 149 | 156 | 6.69 | 801 |
48.74 | 24.68 | 50 | 57 | 187 | 189 | 6.37 | 586 |
48.12 | 23.38 | 49 | 53 | 171 | 177 | 7.20 | 1000 |
48.17 | 24.45 | 53 | 54 | 171 | 173 | 9.30 | 1881 |
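One way to screen the candidate predictors, sketched here in plain Python rather than the statistical software the assignment asks for, is to compute Pearson's r between Oxygen and every other column and compare each t statistic with the two-sided 5% critical value for 28 degrees of freedom (2.048, since n = 30). The `log10(Rankings)` column is an assumption added for illustration: a log transform is one common way to try linearizing a skewed variable, and the lecture note may use a different one. The helper names are hypothetical, and the output should be checked against the software's own correlation table and scatter plots before drawing conclusions.

```python
import math

# Values copied from the table above, columns in this order.
COLUMNS = ["Oxygen", "BMI", "Age", "RestPulse", "RunPulse",
           "MaxPulse", "RunTimeOneMile", "Rankings"]
DATA = """
47.01 23.91 43 61 170 190 8.10 1438
54.37 24.99 45 46 157 169 6.22 618
59.73 25.12 44 39 165 173 5.63 277
50.31 26.26 39 56 179 181 6.33 559
45.17 24.54 48 59 177 177 8.32 4000
45.93 25.00 41 71 177 181 7.99 2961
49.14 25.10 44 65 163 171 7.35 1551
40.25 23.73 45 64 176 177 9.10 8955
60.42 24.91 39 49 171 187 6.25 516
51.52 24.73 45 46 169 169 7.15 1278
37.93 23.93 46 57 187 193 9.51 13449
44.78 23.97 46 52 177 177 7.77 2360
48.13 26.23 48 48 163 165 7.51 1820
52.85 25.25 55 51 167 171 6.89 985
49.95 26.80 50 45 181 186 8.10 551
41.04 26.29 52 58 169 173 7.76 2344
47.27 24.22 52 49 163 169 7.09 1195
47.04 26.75 49 49 163 165 6.93 1025
51.36 24.25 50 68 169 169 7.03 1126
40.07 25.69 58 59 175 177 8.73 6165
46.08 24.29 55 63 157 166 7.64 2079
46.35 24.46 53 49 165 167 7.30 710
55.13 26.29 51 49 147 156 6.47 643
46.09 24.51 52 49 173 173 7.81 2473
39.88 25.97 55 45 169 173 9.19 9831
45.85 25.43 52 60 187 189 7.42 1669
50.95 20.49 58 50 149 156 6.69 801
48.74 24.68 50 57 187 189 6.37 586
48.12 23.38 49 53 171 177 7.20 1000
48.17 24.45 53 54 171 173 9.30 1881
"""

rows = [[float(v) for v in line.split()] for line in DATA.strip().splitlines()]
cols = {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}
# Assumption: log10 is one candidate transform for linearizing Rankings.
cols["log10(Rankings)"] = [math.log10(v) for v in cols["Rankings"]]

T_CRIT = 2.048  # two-sided 5% critical value of Student's t, 28 df


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen(columns, response="Oxygen"):
    """Return {predictor: (r, t, significant_at_5_percent)} against the response."""
    y = columns[response]
    n = len(y)
    out = {}
    for name, x in columns.items():
        if name == response:
            continue
        r = pearson_r(x, y)
        t = r * math.sqrt((n - 2) / (1 - r * r))
        out[name] = (r, t, abs(t) > T_CRIT)
    return out


results = screen(cols)
# Print predictors from strongest to weakest absolute correlation.
for name, (r, t, sig) in sorted(results.items(), key=lambda kv: -abs(kv[1][0])):
    print(f"{name:>16}  r = {r:+.3f}  t = {t:+.2f}  significant at 5%: {sig}")
```

The predictor at the top of the printed list has the largest absolute correlation with Oxygen; a scatter plot of that pair is still needed to confirm that the relationship is roughly linear and not driven by a few unusual points.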