Alexander Klemp Biometrics Lab 3 - Fall2023_WC.docx

School

Beloit College

Course

247

Subject

Industrial Engineering

Date

Jan 9, 2024

Prof Cary & Werner Fall 2023

Biometrics Lab 3

Name(s): Alexander Klemp

You may work individually or collaboratively in a group of 2 people to develop your answers to this lab. If you work collaboratively, clearly explain the contribution of each member to each question. Submit one answer for your group and make sure that the file name identifies the group members. Please use the appropriate Greek/Latin symbols. If you write or draw any answers by hand, please photograph them and insert the photo in the appropriate position in the lab. Because mathematical formulas and symbols generated in Google Docs might not convert properly when saving as a Microsoft Word document, please submit your work in both Word and pdf formats.

Please read the statements below. When you have completed the lab, sign the statement by typing your name in an appropriate blank. By signing this contract, you acknowledge your commitment to the academic honesty policy.

Academic Honesty Policy of Beloit College: “In an academic institution, few offenses against the community are as serious as academic dishonesty. Such behavior is a direct attack upon the concept of learning and inquiry and casts doubts upon all measures of achievement. Beloit insists that only those who are committed to principles of honest scholarship may study at the college.”

Acts of Academic Dishonesty: “Cheating is an act of deception by which a student misrepresents that he/she has mastered information on an academic exercise that he/she has not mastered. For example, intentionally using or attempting to use unauthorized materials, information, or study aids in any academic exercise is considered cheating.”

I, ______________, hereby acknowledge that the academic work presented in this exam is an honest reflection of my own learning.

I, ______________, hereby acknowledge that the academic work presented in this exam is an honest reflection of my own learning.
Word identification: Fill in the blank with the term that is defined. (2 points each, 14pts)

1. Nonparametric test: Statistical tests that do not require estimates of population variance or mean and do not test hypotheses about any parameters.
2. Statistical power: The likelihood that a study will detect an effect when there is an effect to be detected.
3. One-tailed null hypothesis: A null hypothesis that contains a directional inequality.
4. The Wilcoxon rank sum test: A nonparametric two-sample test that is based on ranked data.
5. Ordinal scale: A scale that ranks values by magnitude.
6. ______________: A hypothesis of difference.
7. Type II error: The type of error that is made when one fails to reject the null hypothesis when it is false.

8. A student caught 35 squirrels and weighed them. The mean weight was 487 g and the standard deviation was 26 g. What was the standard error of the mean? Please show the formula(s) (define any terms) and your calculations. (3 points)

$SE = \frac{s}{\sqrt{n}} = \frac{26}{\sqrt{35}} = 4.395\ \text{g}$

SE = standard error of the mean (how precisely the sample mean estimates the population mean)
s = sample standard deviation (a measure of how dispersed the data are around the mean)
n = sample size (number of squirrels weighed)
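The arithmetic in Question 8 can be checked with a few lines of R. This is only a sketch using the summary values given in the question; the variable names `s`, `n`, and `se` are chosen here for illustration.

```r
# Standard error of the mean from the summary statistics in Question 8.
s <- 26   # sample standard deviation of squirrel weight (g)
n <- 35   # number of squirrels weighed

# SE = s / sqrt(n)
se <- s / sqrt(n)
round(se, 3)
```

Because the standard error divides by the square root of n, quadrupling the sample size would only halve it.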
9. You are interested in testing whether the mean food consumption of deer is the same during the months of February and May. To determine which two-sample test to run, you remember that you must first test your assumptions. The R command and output below tests one or more of the assumptions that are required for Student’s two sample t test. (14pts)

> shapiro.test(foo$Feb)
data: foo$Feb
W = 0.90202, p-value = 0.4211
> shapiro.test(foo$May)
data: foo$May
W = 0.95422, p-value = 0.0428

a. What assumption(s) does it test?
Whether each sample is drawn from a normally distributed population (the normality assumption).

b. Complete the table below to answer the following questions for each sample (food consumption by month). What are the associated hypotheses, symbol and value of the test statistic(s), the associated p-value(s), decision to reject or accept the null hypothesis, and the appropriate conclusion(s).

Null hypothesis (Feb and May): H0: The sample is drawn from a normally distributed population.
Alternate hypothesis (Feb and May): HA: The sample is drawn from a population that is not normally distributed.

                                       Feb                              May
Symbol & value of the test statistic   W = 0.90202                      W = 0.95422
p-value                                0.4211                           0.0428
Statistical decision                   Fail to reject the null          Reject the null
Conclusion                             Data are consistent with a       Data are not normally
                                       normal distribution              distributed

c. Based on this information, should the researchers apply a parametric test to analyze these data? Why or why not?
No. They should apply a nonparametric test, because one of the two samples (May) is not normally distributed.
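The decision rule used in Question 9 (reject normality when the Shapiro-Wilk p-value falls below α) can be written out explicitly. The sketch below uses only the two p-values reported in the output above; the significance level of 0.05 and the variable names are assumptions made for illustration, since the question does not state α for this item.

```r
# Decision rule for the Shapiro-Wilk results reported in Question 9.
alpha <- 0.05                                # assumed significance level
p_values <- c(Feb = 0.4211, May = 0.0428)    # p-values copied from the output

# TRUE means "reject the null hypothesis of normality" for that month.
reject_normality <- p_values < alpha
reject_normality

# A parametric two-sample test is only supported if neither sample rejects.
parametric_ok <- !any(reject_normality)
parametric_ok
```

When `parametric_ok` is `FALSE`, a rank-based alternative such as `wilcox.test()` is the usual fallback for two independent samples.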
10. Researchers were interested in testing whether bur oak (Quercus macrocarpa) trees were the same height in two adjacent counties in Wisconsin. They randomly selected 20 mature bur oak trees in each county and determined their heights. The heights (m) are listed for each county in the table below. Determine the appropriate statistical test and answer the following questions. (35 points)

County A   County B
12.5       20.3
15.9       25.6
19.7       24.3
23.8       26.9
20.6       30.2
18.5       34
16.8       27.5
13.5       43.2
22.6       24.6
24.1       29.1
20.8       25
21.4       41.9
18.6       31.2
23.5       30.6
23.7       32.2
17.4       33.5
16.9       27.6
24.5       28.7
25         34.2
19.3       36.1

a) To begin, first determine which parametric test would be appropriate to answer this question. Name the test, including the number of tails. (2pts)
A two-tailed two-sample t test.

b) What are the assumptions of this test? (2pts)
Both samples are independent and randomly sampled. The data in each sample are normally distributed. The variances of the two populations are equal.
c) Conduct the appropriate analyses and determine whether you have met the assumptions of the test. Make clear conclusion statements and report all statistical output necessary to support your conclusions. Also, copy and paste the R commands and output here that supports your conclusions. (6pts)

> countyA <- c(12.5, 15.9, 19.7, 23.8, 20.6, 18.5, 16.8, 13.5, 22.6, 24.6, 24.1, 20.8, 21.4, 18.6, 17.4, 16.9, 24.5, 25, 19.3, 25)
> countyB <- c(20.3, 25.6, 24.3, 26.9, 30.2, 34, 27.5, 43.2, 23.5, 30.6, 23.7, 32.2, 33.5, 27.6, 28.7, 25, 34.2, 36.1)
> shapiro.test(countyA)
Shapiro-Wilk normality test
data: countyA
W = 0.94144, p-value = 0.2553
> shapiro.test(countyB)
Shapiro-Wilk normality test
data: countyB
W = 0.95892, p-value = 0.5808

Both p-values are greater than 0.05, so we fail to reject the null hypothesis of normality for either county; both data sets are treated as normally distributed.

d) Should you continue with the test you stated in a) or do you need to use an equivalent non-parametric test (if so, name it here). (1pt)
Continue with the initial test stated in a).

e) Now that you’ve determined which test to run, write the null and alternative hypotheses for that test. Be sure to use notation that is specific to this example. (2pts)
$H_0: \mu_A = \mu_B$
$H_A: \mu_A \neq \mu_B$

f) What is the formula for the test statistic for this test? Identify all terms in the formula. Be sure to use notation that is specific to this example. (2pts)
$t = \frac{\bar{X}_A - \bar{X}_B}{s_{\bar{X}_A - \bar{X}_B}}$
$\bar{X}_A$, $\bar{X}_B$ are the sample means of County A and County B.
$s_{\bar{X}_A - \bar{X}_B}$ is the standard error of the difference between the means of County A and County B.
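For part c), the assumption checks can be sketched as follows. The vectors `county_a` and `county_b` are transcribed here from the Question 10 table on the assumption that each table row holds one County A value followed by one County B value, which gives 20 heights per county; they are not copied from the R transcript. `var.test()` (an F test) is used as one common check of the equal-variance assumption, which is an assumption about what the course expects rather than something the lab specifies.

```r
# Heights (m) transcribed from the Question 10 table, assuming each row
# lists County A first and County B second.
county_a <- c(12.5, 15.9, 19.7, 23.8, 20.6, 18.5, 16.8, 13.5, 22.6, 24.1,
              20.8, 21.4, 18.6, 23.5, 23.7, 17.4, 16.9, 24.5, 25, 19.3)
county_b <- c(20.3, 25.6, 24.3, 26.9, 30.2, 34, 27.5, 43.2, 24.6, 29.1,
              25, 41.9, 31.2, 30.6, 32.2, 33.5, 27.6, 28.7, 34.2, 36.1)

# Guard against transcription slips: the design calls for 20 trees per county.
stopifnot(length(county_a) == 20, length(county_b) == 20)

# Normality of each sample (Shapiro-Wilk).
normality_a <- shapiro.test(county_a)
normality_b <- shapiro.test(county_b)

# Equality of the two population variances (F test).
variance_check <- var.test(county_a, county_b)

# Collect the three p-values for the conclusion statement.
c(shapiro_A = normality_a$p.value,
  shapiro_B = normality_b$p.value,
  F_test    = variance_check$p.value)
```

The length check is worth keeping in any hand-entered data set, since a missing or duplicated value changes every statistic computed afterwards.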
g) Calculate the test statistic value here. Show all of your work. (3pts)

h) How many degrees of freedom are associated with the test statistic? What formula should be used to calculate the degrees of freedom? Identify all terms in the formula. (2pts)
$df = n_A + n_B - 2 = 20 + 20 - 2 = 38$
$n_A = 20$ is the number of trees sampled in County A and $n_B = 20$ is the number of trees sampled in County B.

i) What is the critical value for the test statistic using α=0.05? How does your test statistic value compare to the critical value? (2pts)
critical value = 2.024

j) Confirm your analysis using R and paste your command and output below. (5pts)

> t.test(countyA, countyB)
Welch Two Sample t-test
data: countyA and countyB
t = -5.8655, df = 29.701, p-value = 2.109e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-12.415876 -6.000791
sample estimates:
mean of x mean of y
20.07500 29.28333

k) What is the P-value and should the researchers reject the null hypothesis? What should the researchers conclude? Please be specific and include all important statistical output to support your conclusion. (3pts)
p-value = 2.109e-06, which is less than α = 0.05, so the researchers should reject the null hypothesis.
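For parts g) through j), the pooled-variance (Student's) statistic can be computed by hand and compared with R. Note that `t.test()` defaults to `var.equal = FALSE`, which runs Welch's test and reports fractional degrees of freedom; Student's test with $n_A + n_B - 2$ degrees of freedom requires `var.equal = TRUE`. The sketch below assumes the Question 10 table is read row by row (County A, then County B), so the vectors are a fresh transcription and all variable names are illustrative.

```r
# Heights (m), assuming each table row lists County A then County B.
county_a <- c(12.5, 15.9, 19.7, 23.8, 20.6, 18.5, 16.8, 13.5, 22.6, 24.1,
              20.8, 21.4, 18.6, 23.5, 23.7, 17.4, 16.9, 24.5, 25, 19.3)
county_b <- c(20.3, 25.6, 24.3, 26.9, 30.2, 34, 27.5, 43.2, 24.6, 29.1,
              25, 41.9, 31.2, 30.6, 32.2, 33.5, 27.6, 28.7, 34.2, 36.1)

n_a <- length(county_a)
n_b <- length(county_b)

# Pooled variance: weighted average of the two sample variances.
pooled_var <- ((n_a - 1) * var(county_a) + (n_b - 1) * var(county_b)) /
  (n_a + n_b - 2)

# Standard error of the difference between the two means.
se_diff <- sqrt(pooled_var * (1 / n_a + 1 / n_b))

# Test statistic, degrees of freedom, and two-tailed critical value.
t_stat <- (mean(county_a) - mean(county_b)) / se_diff
df     <- n_a + n_b - 2
t_crit <- qt(1 - 0.05 / 2, df)

# Confirmation with R's built-in Student's (equal-variance) t test.
pooled_test <- t.test(county_a, county_b, var.equal = TRUE)

c(t = t_stat, df = df, t_crit = t_crit, p = pooled_test$p.value)
```

The null hypothesis is rejected when the absolute value of `t_stat` exceeds `t_crit`, which is the same decision as comparing `pooled_test$p.value` with 0.05.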
l) Generate a publication quality figure to visually represent the tree height data collected from both counties. Include a figure legend/caption that contains all of the necessary statistical output. (5pts)
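One way to approach part l) is a base-R boxplot with the individual trees overlaid. This is a minimal sketch rather than a finished figure: the data frame `heights`, the axis labels, and the choice of a boxplot are all illustrative assumptions, and the vectors assume the Question 10 table is read row by row (County A, then County B). The caption with the test statistic, degrees of freedom, and p-value still has to be written from the analysis actually run.

```r
# Heights (m) in long format: one row per tree.
heights <- data.frame(
  county = rep(c("A", "B"), each = 20),
  height_m = c(
    # County A
    12.5, 15.9, 19.7, 23.8, 20.6, 18.5, 16.8, 13.5, 22.6, 24.1,
    20.8, 21.4, 18.6, 23.5, 23.7, 17.4, 16.9, 24.5, 25, 19.3,
    # County B
    20.3, 25.6, 24.3, 26.9, 30.2, 34, 27.5, 43.2, 24.6, 29.1,
    25, 41.9, 31.2, 30.6, 32.2, 33.5, 27.6, 28.7, 34.2, 36.1
  )
)

# Boxplot of height by county with labelled axes and units.
boxplot(height_m ~ county, data = heights,
        xlab = "County", ylab = "Bur oak height (m)",
        col = "grey90", outline = FALSE)

# Overlay the raw observations so the sample size and spread are visible.
stripchart(height_m ~ county, data = heights,
           vertical = TRUE, method = "jitter",
           pch = 16, add = TRUE)
```

Setting `outline = FALSE` avoids drawing outliers twice, since every point is already shown by `stripchart()`.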
11. For the data from Question 10, determine the 95% confidence interval associated with the mean tree height (m) for each county. Include the formulas and calculations for determining each limit. (6 pts)
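The confidence limits asked for in Question 11 follow the usual formula $\bar{X} \pm t_{0.05(2),\,n-1} \cdot s/\sqrt{n}$. The helper `ci_mean()` below is a hypothetical function written for this sketch, not part of base R, and the vectors assume the Question 10 table is read row by row (County A, then County B).

```r
# Confidence interval for a mean: mean +/- t_crit * s / sqrt(n).
ci_mean <- function(x, conf_level = 0.95) {
  n      <- length(x)
  se     <- sd(x) / sqrt(n)                          # standard error of the mean
  t_crit <- qt(1 - (1 - conf_level) / 2, df = n - 1) # two-tailed critical t
  c(lower = mean(x) - t_crit * se,
    upper = mean(x) + t_crit * se)
}

county_a <- c(12.5, 15.9, 19.7, 23.8, 20.6, 18.5, 16.8, 13.5, 22.6, 24.1,
              20.8, 21.4, 18.6, 23.5, 23.7, 17.4, 16.9, 24.5, 25, 19.3)
county_b <- c(20.3, 25.6, 24.3, 26.9, 30.2, 34, 27.5, 43.2, 24.6, 29.1,
              25, 41.9, 31.2, 30.6, 32.2, 33.5, 27.6, 28.7, 34.2, 36.1)

ci_a <- ci_mean(county_a)
ci_b <- ci_mean(county_b)
rbind(County_A = ci_a, County_B = ci_b)
```

The same limits are returned by the one-sample call `t.test(county_a)$conf.int`, which is a convenient cross-check on the hand calculation.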
12. Scientists were interested in whether proximity to an industrialized area influenced growth of saltwater crocodiles (Crocodylus porosus) living in that area. To answer this question, they first wanted to determine whether adult males living in a population near a heavily industrialized area had diminished health, as measured by growth (or lack of growth) over time. They previously sampled this population 5 years ago. At that time, captured juvenile males were measured for body length, body mass, tagged, and then released back into the environment. Scientists recently returned to this site and recaptured crocodiles (both juvenile and adult males). Again, they collected length and mass data and returned the crocodiles to the environment. Fortunately, they were able to successfully recapture 13 of their previously measured juveniles who were now adult crocodiles to investigate their question! Use these data to help the scientists with their project by performing the appropriate analysis and answering the following questions. (28 points)

Body length, m (5 years ago)   Body length, m (Present)
3.25 3.45 3.01 2.80 3.16 3.29 3.41 3.31 3.64 3.78 3.58 3.38 3.61 3.47 3.86 3.74 3.74 3.55 3.95 3.68 3.14 3.32 3.31 3.54 3.49 3.63

a) To begin, first determine which parametric test would be appropriate to answer this question. Name the test, including the number of tails. (2pts)
A two-tailed paired t-test.

b) What are the assumptions of this test? Explain how you would test them (but do not run the test). (2pts)
The differences between the paired measurements should be approximately normally distributed; this can be checked with a Shapiro-Wilk test of the differences. The pairs should be independent of one another.

c) Conduct the appropriate analysis and determine whether you have met the assumptions of the test. Make clear conclusion statements and report all statistical output necessary to support your conclusions. Also, copy and paste the R commands and output here that supports your conclusions. (3pts)

> shapiro.test(length_5_years_ago)
Shapiro-Wilk normality test
data: length_5_years_ago
W = 0.97551, p-value = 0.9502
> shapiro.test(length_present)
Shapiro-Wilk normality test
data: length_present
W = 0.98341, p-value = 0.992

d) Should you continue with the test you stated in a) or do you need to use an equivalent non-parametric test (if so, name it here). (1pt)
Yes, we can continue with the test stated in a).

e) Now that you’ve determined which test to run, write the null and alternative hypotheses for that test. Be sure to use notation that is specific to this example. (2pts)
$H_0: \mu_{\text{5 years}} = \mu_{\text{present}}$
$H_A: \mu_{\text{5 years}} \neq \mu_{\text{present}}$

f) What is the formula for the test statistic for this test? Identify all terms in the formula. Be sure to use notation that is specific to this example. (2 pts)
$t = \frac{\bar{d}}{s_d / \sqrt{n}}$
$\bar{d}$ is the sample mean of the differences between paired measurements.
$s_d$ is the sample standard deviation of the differences.
$n$ is the number of paired observations.

g) Calculate the test statistic value here. Show all of your work (if appropriate, a table might be useful!). (5 pts)
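The paired statistic in f) and g) can be worked through in R, and the same vector of differences is what the normality assumption in b) refers to. The sketch below rests on a loud assumption about the data layout: the 26 values listed in Question 12 are split so that the first 13 are the lengths from 5 years ago and the last 13 are the present lengths. That split is chosen because it reproduces the mean difference (-0.2115385) and t value (-1.8328) reported in the R output for part j); if the table is instead read as 13 row-wise pairs, the vectors and every result change.

```r
# ASSUMED layout: first 13 listed values = 5 years ago, last 13 = present.
length_5_years_ago <- c(3.25, 3.45, 3.01, 2.80, 3.16, 3.29, 3.41,
                        3.31, 3.64, 3.78, 3.58, 3.38, 3.61)
length_present     <- c(3.47, 3.86, 3.74, 3.74, 3.55, 3.95, 3.68,
                        3.14, 3.32, 3.31, 3.54, 3.49, 3.63)

# Paired differences, one per recaptured crocodile.
d <- length_5_years_ago - length_present

n     <- length(d)   # number of pairs
d_bar <- mean(d)     # mean difference
s_d   <- sd(d)       # standard deviation of the differences

# Paired t statistic: mean difference divided by its standard error.
t_stat <- d_bar / (s_d / sqrt(n))

# The normality assumption applies to the differences, not to each sample.
normality_of_differences <- shapiro.test(d)

c(n = n, d_bar = d_bar, s_d = s_d, t = t_stat,
  shapiro_p = normality_of_differences$p.value)
```

Testing `d` directly matches the assumption as stated in part b); running `shapiro.test()` on each column separately answers a different question.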
h) How many degrees of freedom are associated with the test statistic? What formula should be used to calculate the degrees of freedom? Identify all terms in the formula. (2 pts)
$df = n - 1 = 13 - 1 = 12$, where $n = 13$ is the number of paired observations.

i) What is the critical value for the test statistic using α=0.05? How does your test statistic value compare to the critical value? (2pts)
The two-tailed critical value for df = 12 is 2.179. The absolute value of the test statistic (1.8328) is less than the critical value.

j) Confirm your analysis using R and paste your commands and output below. (4pts)

> t.test(length_5_years_ago, length_present, paired = TRUE, alternative = "two.sided")
Paired t-test
data: length_5_years_ago and length_present
t = -1.8328, df = 12, p-value = 0.09175
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-0.46301535 0.03993843
sample estimates:
mean difference
-0.2115385

k) What is the P-value and should the researchers reject the null hypothesis? What should the researchers conclude? Please be specific and include all important statistical output to support your conclusion. (3 pts)
p-value = 0.09175, which is greater than α = 0.05 (equivalently, |t| = 1.8328 is less than the critical value), so we fail to reject the null hypothesis (paired t-test, t = -1.8328, df = 12). There is no significant difference between mean body length 5 years ago and at present.
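The critical value in i) and the p-value in k) can both be recovered from the reported test statistic and degrees of freedom using the t distribution functions `qt()` and `pt()`. This sketch uses only the values printed in the R output for part j) (t = -1.8328, df = 12); the variable names are illustrative.

```r
# Values taken from the paired t-test output in part j).
t_stat <- -1.8328
df     <- 12
alpha  <- 0.05

# Two-tailed critical value: the upper alpha/2 quantile of t with df = 12.
t_crit <- qt(1 - alpha / 2, df)

# Two-tailed p-value: probability of a |t| at least this large under H0.
p_value <- 2 * pt(-abs(t_stat), df)

# Decision rule: reject H0 only if |t| exceeds the critical value.
reject_null <- abs(t_stat) > t_crit

c(t_crit = round(t_crit, 3), p_value = round(p_value, 5),
  reject_null = reject_null)
```

The two decision rules always agree: |t| exceeds the critical value exactly when the p-value is below α.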