Country  Gender  Sun_Exposure  Age  Vitamin_D_Level
A        Male    Low           37   45
A        Male    Low           31   40
A        Male    Low           33   39
A        Male    Low           38   42
A        Male    Low           37   38
A        Female  Low           36   38
A        Female  Low           32   39
A        Female  Low           36   37
A        Female  Low           32   44
A        Female  Low           37   38
A        Other   Low           39   43
A        Other   Low           33   36
A        Other   Low           32   41
A        Other   Low           38   44
A        Other   Low           37   34
A        Male    Moderate      34   48
A        Male    Moderate      22   57
A        Male    Moderate      38   38
A        Male    Moderate      38   45
A        Male    Moderate      38   37
A        Female  Moderate      33   60
A        Female  Moderate      37   40
A        Female  Moderate      39   42
A        Female  Moderate      38   42
A        Female  Moderate      38   44
A        Other   Moderate      37   39
A        Other   Moderate      21   38
A        Other   Moderate      38   39
A        Other   Moderate      39   45
A        Other   Moderate      30   47
A        Male    High          38   55
A        Male    High          39   59
A        Male    High          23   46
A        Male    High          37   55
A        Male    High          35   51
A        Female  High          34   49
A        Female  High          37   51
A        Female  High          30   62
A        Female  High          37   55
A        Female  High          39   53
A        Other   High          36   53
A        Other   High          37   66
A        Other   High          37   53
A        Other   High          38   51
A        Other   High          29   57
B        Male    Low           52   53
B        Male    Low           47   47
B        Male    Low           46   57
B        Male    Low           45   47
B        Male    Low           47   51
B        Female  Low           47   49
B        Female  Low           46   53
B        Female  Low           41   55
B        Female  Low           42   45
B        Female  Low           47   48
B        Other   Low           51   51
B        Other   Low           47   48
B        Other   Low           49   46
B        Other   Low           35   51
B        Other   Low           51   45
B        Male    Moderate      49   45
B        Male    Moderate      47   58
B        Male    Moderate      46   48
B        Male    Moderate      41   55
B        Male    Moderate      47   52
B        Female  Moderate      49   44
B        Female  Moderate      43   48
B        Female  Moderate      48   48
B        Female  Moderate      47   48
B        Female  Moderate      49   62
B        Other   Moderate      45   52
B        Other   Moderate      49   52
B        Other   Moderate      49   55
B        Other   Moderate      40   65
B        Other   Moderate      47   55
B        Male    High          45   62
B        Male    High          49   65
B        Male    High          45   52
B        Male    High          47   57
B        Male    High          47   62
B        Female  High          41   60
B        Female  High          44   62
B        Female  High          51   64
B        Female  High          34   63
B        Female  High          47   63
B        Other   High          49   63
B        Other   High          46   62
B        Other   High          47   61
B        Other   High          46   67
B        Other   High          45   75
Scientists claim that there is a possible relationship between the severity of Covid-19 and a
low blood level of vitamin D. The normal level for vitamin D is around 30 ng/ml. For this reason,
the vitamin D blood level has been measured in two countries, A and B, and recorded in a
Microsoft Excel file (Vitamin_D.xlsx). Import this Excel file into the statistical package of your
choice and answer all questions. (Please add the code and output to your answers.)
a) Find the number of observations, mean, and standard deviation of vitamin D level for each
level of the variables Country, Gender, and Sun_Exposure using the statistical package of
your choice.
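One way to approach part (a) in base R is sketched below. It assumes the five columns of the table above line up row by row (rows 1 to 45 are Country A, rows 46 to 90 are Country B); the helper name describe_by is hypothetical, and the real analysis should read Vitamin_D.xlsx rather than retype the values.

```r
# Vitamin D values transcribed from the table above (assumed row order:
# rows 1-45 = Country A, rows 46-90 = Country B)
vit <- c(45, 40, 39, 42, 38, 38, 39, 37, 44, 38, 43, 36, 41, 44, 34,
         48, 57, 38, 45, 37, 60, 40, 42, 42, 44, 39, 38, 39, 45, 47,
         55, 59, 46, 55, 51, 49, 51, 62, 55, 53, 53, 66, 53, 51, 57,
         53, 47, 57, 47, 51, 49, 53, 55, 45, 48, 51, 48, 46, 51, 45,
         45, 58, 48, 55, 52, 44, 48, 48, 48, 62, 52, 52, 55, 65, 55,
         62, 65, 52, 57, 62, 60, 62, 64, 63, 63, 63, 62, 61, 67, 75)

Vitamin_D <- data.frame(
  Country = rep(c("A", "B"), each = 45),
  Gender = rep(rep(c("Male", "Female", "Other"), each = 5), times = 6),
  Sun_Exposure = rep(rep(c("Low", "Moderate", "High"), each = 15), times = 2),
  Vitamin_D_Level = vit
)

# n, mean and standard deviation of vitamin D within each level of one factor
# (describe_by is a hypothetical helper, not a function from any package)
describe_by <- function(data, factor_name) {
  groups <- split(data$Vitamin_D_Level, data[[factor_name]])
  data.frame(
    level = names(groups),
    n = sapply(groups, length),
    mean = sapply(groups, mean),
    sd = sapply(groups, sd),
    row.names = NULL
  )
}

by_country <- describe_by(Vitamin_D, "Country")
by_gender <- describe_by(Vitamin_D, "Gender")
by_sun <- describe_by(Vitamin_D, "Sun_Exposure")
print(by_country)
print(by_gender)
print(by_sun)
```

The same summaries can be produced with aggregate() or tapply(); a small helper is used here only so that all three factors share one code path.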
b) Write down the statistical hypotheses for three separate analyses comparing the means of
the groups in the following variables: (1) Country, (2) Gender, and (3) Sun_Exposure.
Estimate the treatment effects for each of these three factors.
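For part (b), a common formulation (assumed here, since the question does not state the model) is the one-way effects model y_ij = mu + tau_i + e_ij, with H0: all tau_i = 0 against H1: at least one tau_i differs from 0. Under that model each treatment effect is estimated as the group mean minus the grand mean. The sketch below uses a tiny made-up data frame, not the real dataset, so the arithmetic is easy to follow.

```r
# Toy data for illustration only (hypothetical values, not from Vitamin_D.xlsx)
toy <- data.frame(
  Country = rep(c("A", "B"), each = 4),
  Vitamin_D_Level = c(40, 44, 46, 50, 52, 54, 58, 60)
)

grand_mean <- mean(toy$Vitamin_D_Level)                      # 50.5
group_means <- tapply(toy$Vitamin_D_Level, toy$Country, mean)  # A = 45, B = 56

# Estimated treatment effect of each level: group mean minus grand mean
treatment_effects <- group_means - grand_mean                # A = -5.5, B = 5.5
print(treatment_effects)
```

In a balanced design the estimated effects sum to zero, which is a quick sanity check on a hand calculation. The same two lines apply to Gender and Sun_Exposure by swapping the grouping variable.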
Note: For the next questions, use statistical software to randomly sample 30
observations for each Country, and save this subset of observations in a new dataset
called "Question_1_Country".
In SPSS: Data > Select Cases > Random sample of cases (you can save the random
observations as a .sav or an excel file, if using SPSS).
In R: you can use the following code to perform this random sampling. Learn this
procedure as you may need to use it for other sections.
Note: every time you run the following code, you will get a different dataset.
Therefore, you will get different outputs (because of the random nature of random
sampling).
### 30 random observations for Country A ###
Country_A <- subset(Vitamin_D, subset = Country == "A")
C.A <- Country_A[sample(nrow(Country_A), 30), ]
### 30 random observations for Country B ###
Country_B <- subset(Vitamin_D, subset = Country == "B")
C.B <- Country_B[sample(nrow(Country_B), 30), ]
### Combining them in a new dataset ###
Question_1_Country <- rbind(C.A, C.B)
View(Question_1_Country)
c) Read the above code carefully and explain what each line is doing.
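Because the sampling code gives a different dataset on every run, it can help to fix the random seed while drafting answers so that reported output matches the saved data. The sketch below is an optional variation, not part of the assignment's code: the seed value is arbitrary, sample_rows is a hypothetical helper, and the data frame is a simulated stand-in for the real file.

```r
set.seed(2024)  # arbitrary seed; any fixed integer makes the draws repeatable

# Simulated stand-in for the full dataset (hypothetical values)
Vitamin_D <- data.frame(
  Country = rep(c("A", "B"), each = 45),
  Vitamin_D_Level = c(rnorm(45, mean = 46, sd = 8), rnorm(45, mean = 55, sd = 7))
)

# Draw n row indices without replacement and keep those rows
sample_rows <- function(d, n) d[sample(nrow(d), n), ]

# Split by country, sample 30 rows from each piece, then stack the pieces
Question_1_Country <- do.call(
  rbind,
  lapply(split(Vitamin_D, Vitamin_D$Country), sample_rows, n = 30)
)
print(table(Question_1_Country$Country))
```

Splitting and recombining scales to any number of groups without writing one subset() call per level, which is convenient when the same procedure is reused for other factors.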
d) To investigate the difference in vitamin D level between countries, write down the general
assumptions of the appropriate statistical test,
and check them using graphs and/or outputs from your statistical package of choice.
Note: use the "Question_1_Country" dataset for this question.
For normality, provide at least two graphs.
Note: If using R, for Levene's test, install the R package "car".
Does the response variable need any transformation? Why?
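A possible set of checks for part (d) is sketched below using base R only. The data are simulated as a stand-in for "Question_1_Country". Levene's test is written out by hand as a one-way ANOVA on absolute deviations from each group's median, which is the median-centred form that the leveneTest function in the "car" package uses by default.

```r
set.seed(1)
# Simulated stand-in for "Question_1_Country" (hypothetical values)
Question_1_Country <- data.frame(
  Country = factor(rep(c("A", "B"), each = 30)),
  Vitamin_D_Level = c(rnorm(30, mean = 46, sd = 8), rnorm(30, mean = 55, sd = 7))
)

# Residuals from the one-factor model carry the normality assumption
fit <- lm(Vitamin_D_Level ~ Country, data = Question_1_Country)
res <- residuals(fit)

# Normality: two graphs plus a formal test
hist(res, main = "Histogram of residuals", xlab = "Residual")
qqnorm(res)
qqline(res)
normality <- shapiro.test(res)

# Equal variances: ANOVA on absolute deviations from the group medians
abs_dev <- with(
  Question_1_Country,
  abs(Vitamin_D_Level - ave(Vitamin_D_Level, Country, FUN = median))
)
levene <- anova(lm(abs_dev ~ Question_1_Country$Country))
levene_p <- levene[1, "Pr(>F)"]

print(normality$p.value)
print(levene_p)
```

If both p-values exceed 0.05 and the plots show no strong skew, there is little evidence that a transformation of the response is needed; otherwise a log or square-root transformation is a usual first candidate.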
e) Compare vitamin D levels for country A and country B at a significance level of 5% using a
t-test, one-way ANOVA, and regression.
Compare the results of all three techniques. For the one-way ANOVA, use the LSD t-test (LSD.test).
Plot the t-test and regression results.
Note: use the "Question_1_Country" dataset for this question.
If you are using R, for the LSD.test function, install the package "agricolae".
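With only two groups, the pooled-variance t-test, the one-way ANOVA, and a regression on a two-level factor are the same test, so F equals t squared and all p-values agree. The sketch below illustrates this on simulated stand-in data. It uses base R's pairwise.t.test with no p-value adjustment in place of agricolae's LSD.test; that substitution is a choice made here to keep the example free of extra packages.

```r
set.seed(1)
# Simulated stand-in for "Question_1_Country" (hypothetical values)
Question_1_Country <- data.frame(
  Country = factor(rep(c("A", "B"), each = 30)),
  Vitamin_D_Level = c(rnorm(30, mean = 46, sd = 8), rnorm(30, mean = 55, sd = 7))
)

# 1) Two-sample t-test with pooled variance
t_res <- t.test(Vitamin_D_Level ~ Country, data = Question_1_Country,
                var.equal = TRUE)

# 2) One-way ANOVA table
aov_tab <- summary(aov(Vitamin_D_Level ~ Country, data = Question_1_Country))[[1]]

# 3) Regression with Country as a dummy-coded predictor
reg_coef <- summary(lm(Vitamin_D_Level ~ Country,
                       data = Question_1_Country))$coefficients

# Unadjusted pairwise t-test with pooled SD (base-R stand-in for LSD.test)
lsd <- pairwise.t.test(Question_1_Country$Vitamin_D_Level,
                       Question_1_Country$Country,
                       p.adjust.method = "none")

# t squared, F, and the squared regression t should coincide
print(c(t_squared = unname(t_res$statistic)^2,
        F_value = aov_tab[1, "F value"],
        reg_t_squared = reg_coef["CountryB", "t value"]^2))

# Simple plot of the two groups
boxplot(Vitamin_D_Level ~ Country, data = Question_1_Country,
        ylab = "Vitamin D level")
```

The regression slope for CountryB is the difference in group means (B minus A), so its sign is opposite to the t statistic reported by t.test, which subtracts in the other order.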
f) Write down the contrast and its related hypotheses to compare country A with country B for
their vitamin D levels.
Perform the contrast both by hand-calculation and using a statistical package and compare
the answers.
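For part (f), one natural contrast is L = mu_A - mu_B with coefficients (1, -1), tested as H0: L = 0 against H1: L is not 0. The hand calculation uses the estimate sum(c_i * ybar_i), standard error sqrt(MSE * sum(c_i^2 / n_i)), and N - k degrees of freedom. The sketch below runs those steps on simulated stand-in data and compares them with the pooled t-test; variable names are illustrative.

```r
set.seed(1)
# Simulated stand-in for "Question_1_Country" (hypothetical values)
g <- factor(rep(c("A", "B"), each = 30))
y <- c(rnorm(30, mean = 46, sd = 8), rnorm(30, mean = 55, sd = 7))

coefs <- c(A = 1, B = -1)          # contrast coefficients; they sum to zero
means <- tapply(y, g, mean)        # group means
n <- tapply(y, g, length)          # group sizes
df_error <- length(y) - length(means)

# Mean squared error: pooled within-group variance
mse <- sum((y - ave(y, g))^2) / df_error

# Contrast estimate, its standard error, t statistic and two-sided p-value
estimate <- sum(coefs * means)
se <- sqrt(mse * sum(coefs^2 / n))
t_hand <- estimate / se
p_hand <- 2 * pt(abs(t_hand), df = df_error, lower.tail = FALSE)

# Package answer for comparison
t_pkg <- t.test(y ~ g, var.equal = TRUE)
print(c(t_hand = t_hand, t_pkg = unname(t_pkg$statistic)))
print(c(p_hand = p_hand, p_pkg = t_pkg$p.value))
```

For a two-group contrast the hand result and the pooled t-test agree exactly; with more than two groups the MSE would come from the full ANOVA rather than from the two groups alone.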
g) Change the variable Age from a numerical variable to a categorical variable with two levels.
You need to select the criteria for this categorising. However, make sure you have enough
observations in each group.
Compare the variances of these two levels using Levene's test:
You may need to change the criteria for Age group membership a few times, to get an
appropriate spread of observations across groups.
Before categorising, a summary of the variable "Age" might be useful:
summary(Vitamin_D$Age)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  20.00   36.00   44.00   41.62   47.00   65.00
R code for solution:
### The comparison operator is illegible in the source; "<" is assumed here ###
New_Age <- ifelse(Vitamin_D$Age < 40, 'Age_level_1', 'Age_level_2')
### Adding the new variable to our original dataset ###
Vitamin_D$New_Age <- New_Age
### Count the number of each level ###
table(Vitamin_D$New_Age)
Now, check the equality of variances of vitamin D level between these two age groups using
Levene's test:
Note: use the standard residuals for the variance equality check.
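Choosing the Age cut-off is easier when several candidates are tabulated side by side. The sketch below does that for the Age and vitamin D values transcribed from the table at the top of the page (assuming the columns line up row by row); the candidate cut-offs are arbitrary and the "less than" rule is an assumption. The Levene call runs only if the "car" package is installed.

```r
# Age and vitamin D values transcribed from the table above (assumed row order)
age <- c(37, 31, 33, 38, 37, 36, 32, 36, 32, 37, 39, 33, 32, 38, 37,
         34, 22, 38, 38, 38, 33, 37, 39, 38, 38, 37, 21, 38, 39, 30,
         38, 39, 23, 37, 35, 34, 37, 30, 37, 39, 36, 37, 37, 38, 29,
         52, 47, 46, 45, 47, 47, 46, 41, 42, 47, 51, 47, 49, 35, 51,
         49, 47, 46, 41, 47, 49, 43, 48, 47, 49, 45, 49, 49, 40, 47,
         45, 49, 45, 47, 47, 41, 44, 51, 34, 47, 49, 46, 47, 46, 45)
vit <- c(45, 40, 39, 42, 38, 38, 39, 37, 44, 38, 43, 36, 41, 44, 34,
         48, 57, 38, 45, 37, 60, 40, 42, 42, 44, 39, 38, 39, 45, 47,
         55, 59, 46, 55, 51, 49, 51, 62, 55, 53, 53, 66, 53, 51, 57,
         53, 47, 57, 47, 51, 49, 53, 55, 45, 48, 51, 48, 46, 51, 45,
         45, 58, 48, 55, 52, 44, 48, 48, 48, 62, 52, 52, 55, 65, 55,
         62, 65, 52, 57, 62, 60, 62, 64, 63, 63, 63, 62, 61, 67, 75)
Vitamin_D <- data.frame(Age = age, Vitamin_D_Level = vit)

# Group sizes for several candidate cut-offs (values chosen for illustration)
cutoffs <- c(35, 38, 40, 45)
group_sizes <- sapply(cutoffs, function(cut) {
  table(factor(age < cut, levels = c(TRUE, FALSE),
               labels = c("Age_level_1", "Age_level_2")))
})
colnames(group_sizes) <- paste0("cut_", cutoffs)
print(group_sizes)

# Apply the chosen cut-off and test equality of variances between the groups
Vitamin_D$New_Age <- factor(ifelse(Vitamin_D$Age < 40,
                                   "Age_level_1", "Age_level_2"))
if (requireNamespace("car", quietly = TRUE)) {
  print(car::leveneTest(Vitamin_D_Level ~ New_Age, data = Vitamin_D))
}
```

A cut-off that leaves only a handful of observations in one group makes the variance estimate for that group unreliable, which is why the group sizes are inspected before running the test.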
h) Perform a two-way ANOVA to investigate the effects of Sun_Exposure (3 levels) and
Country (2 levels) on vitamin D levels.
Write all statistical hypotheses and the factorial models, produce the ANOVA table and
Tukey's HSD test with graphs (if needed),
and state your conclusions based on a 5% significance level.
Note: Use a statistical package to randomly sample 5 observations for each
combination of Sun_Exposure and Country, combine the datasets together and call
it "Question_1_Factorial".
In SPSS: Data > Select Cases > Random sample of cases (you can save the random
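In R, the sampling and analysis for part (h) could look like the sketch below. It assumes the usual factorial model y_ijk = mu + alpha_i + beta_j + (alpha*beta)_ij + e_ijk, with separate null hypotheses for the two main effects and the interaction. The data are simulated with no built-in effects, purely to show the mechanics.

```r
set.seed(1)
# Simulated stand-in for the full dataset: 15 rows per cell (hypothetical values)
Vitamin_D <- expand.grid(
  rep = 1:15,
  Sun_Exposure = c("Low", "Moderate", "High"),
  Country = c("A", "B")
)
Vitamin_D$Vitamin_D_Level <- rnorm(nrow(Vitamin_D), mean = 50, sd = 6)

# Randomly sample 5 rows from each Sun_Exposure x Country combination
cells <- split(Vitamin_D, list(Vitamin_D$Sun_Exposure, Vitamin_D$Country))
Question_1_Factorial <- do.call(
  rbind,
  lapply(cells, function(d) d[sample(nrow(d), 5), ])
)

# Two-way ANOVA with interaction
fit <- aov(Vitamin_D_Level ~ Sun_Exposure * Country, data = Question_1_Factorial)
anova_table <- summary(fit)[[1]]
print(anova_table)

# Tukey's HSD for the three-level factor, with its confidence-interval plot
tukey <- TukeyHSD(fit, which = "Sun_Exposure")
print(tukey)
plot(tukey)

# Interaction plot: roughly parallel lines suggest little interaction
with(Question_1_Factorial,
     interaction.plot(Sun_Exposure, Country, Vitamin_D_Level))
```

If the interaction term is significant at the 5% level, the main effects are usually interpreted within each level of the other factor rather than on their own.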