HWK3_Soln

pdf

School

University of Wisconsin, Madison

Course

371

Subject

Statistics

Date

Feb 20, 2024

Type

pdf

Pages

5

Uploaded by UltraDolphinMaster987

Stat 371 Homework #3 SOLUTIONS

Submit your homework to Canvas by the due date and time. Email your lecturer if you have extenuating circumstances and need to request an extension.

If an exercise asks you to use R, include a copy of all relevant code and output in your submitted homework file. You can copy/paste your code, take screenshots, or compile your work in an Rmarkdown document.

If a problem does not specify how to compute the answer, you may use any appropriate method. I may ask you to use R or use manual calculations on your exams, so practice accordingly.

You must include an explanation and/or intermediate calculations for an exercise to be complete.

Be sure to submit the HWK3 Autograde Quiz, which will give you ~20 of your 40 accuracy points.

50 points total: 40 points accuracy and 10 points completion.

Probability

Exercise 1. A geneticist is studying two genes. Each gene can be either dominant or recessive. A collection of 100 individuals is categorized and found to have 58 individuals with both genes dominant, 6 individuals with both genes recessive, and a total of 70 Gene 2 dominant individuals.

a. Create a 2-way table (like the one below) to organize the counts of individuals within each of the 4 combinations of dominant and recessive for the two genes.

                    Gene 2 Dominant   Gene 2 Recessive   Total
Gene 1 Dominant            58                24             82
Gene 1 Recessive           12                 6             18
Total                      70                30            100

b. What is the probability that a randomly sampled individual from this group has Gene 1 dominant?

82/100 = 0.82

c. What is the probability that a randomly sampled individual from this group has Gene 1 or Gene 2 dominant?

(58 + 24 + 12)/100 = 94/100 = 0.94

d. What is the probability that in a random sample of 3 individuals from this group (without replacement), at least one of the three has both recessive genes?

P(At least 1 has both recessive) = 1 - P(0 have both recessive)

$1 - \frac{94}{100} \cdot \frac{93}{99} \cdot \frac{92}{98} = 1 - 0.8289672 = 0.1710328$

e. What is the probability that a randomly sampled individual from this group has Gene 2 dominant, given we know they have Gene 1 dominant?
P(Gene 2 Dom | Gene 1 Dom) = 58/82 = 0.7073

f. The genes are said to be in linkage equilibrium if the event that Gene 1 is dominant is independent of the event that Gene 2 is dominant. Are these genes in linkage equilibrium in this group of 100 individuals?

There are multiple ways to do this, but let's check whether P(Gene 2 Dominant | Gene 1 Dominant) = 58/82 = 0.7073 equals P(Gene 2 Dominant) = 70/100 = 0.7. The probabilities are not equal, so the genes are not in linkage equilibrium in this group of 100 individuals.

g. Now suppose in a different group of 100 individuals, 6 individuals have both genes recessive and a total of 70 individuals are Gene 2 dominant. How many individuals would have both genes dominant if the event "Gene 1 is dominant" is independent of the event "Gene 2 is dominant" in this group of 100 individuals? Make sure to show how you calculated your answer.

                    Gene 2 Dominant   Gene 2 Recessive   Total
Gene 1 Dominant           ??
Gene 1 Recessive                              6
Total                      70                            100

Algebra: Let x be the number of individuals with both genes dominant. Since 70 of the 100 are Gene 2 dominant, 30 are Gene 2 recessive, and 6 of those have both genes recessive, so 24 have Gene 1 dominant and Gene 2 recessive. Independence requires P(Gene 2 Dominant) = 70/100 = P(Gene 2 Dom | Gene 1 Dom) = x/(x + 24); solving for x gives us 56.

$\frac{70}{100} = \frac{x}{x + 24} \;\Rightarrow\; 70(x + 24) = 100x \;\Rightarrow\; x = \frac{70 \cdot 24}{30} = 56$

Exercise 2. Prevention after acute myocardial infarction (AMI) is primarily managed through medications. A large cohort study of post-AMI patients >65 years old found only 74% of patients filled all their discharge prescriptions by 120 days after discharge. A physician at UW has 4 post-AMI patients >65 years old and would like to use 0.74 as his estimate for π, the probability of each of his patients filling all of their discharge prescriptions by 120 days after discharge. Define a random variable F, the count of the physician's four patients who fill all of their discharge prescriptions by 120 days after discharge. Assume that prescription-filling behavior is independent between the 4 patients and that π = 0.74.

a. Determine the probability function for F (write out the pmf) using probability theory.

$P(F = 0) = 1 \cdot (1 - 0.74)^4 = 0.00456976$
$P(F = 1) = 4 \cdot 0.74^1 \cdot (1 - 0.74)^3 = 0.05202496$
$P(F = 2) = 6 \cdot 0.74^2 \cdot (1 - 0.74)^2 = 0.2221066$
$P(F = 3) = 4 \cdot 0.74^3 \cdot (1 - 0.74)^1 = 0.421433$
$P(F = 4) = 1 \cdot 0.74^4 \cdot (1 - 0.74)^0 = 0.2998658$
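The five calculations in part (a) share one pattern. As a compact restatement, assuming the standard binomial pmf with n = 4 and π = 0.74 as set up in the exercise:

```latex
P(F = f) = \binom{4}{f}\,(0.74)^{f}\,(1 - 0.74)^{4 - f},
\qquad f = 0, 1, 2, 3, 4,
\qquad \sum_{f=0}^{4} P(F = f) = 1 .
```

The leading coefficients 1, 4, 6, 4, 1 are the binomial coefficients $\binom{4}{f}$.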
    dbinom(0:4, 4, 0.74)
    ## [1] 0.00456976 0.05202496 0.22210656 0.42143296 0.29986576

f    P(F = f)
0    0.00456976
1    0.05202496
2    0.2221066
3    0.421433
4    0.2998658

b. Compute the probability that F > 0. What does this value mean in the context of the scenario?

P(F > 0) = 1 - 0.00456976 = 0.9954302, or equivalently 0.05202496 + 0.2221066 + 0.421433 + 0.2998658 = 0.9954302.

This value gives the probability that at least one (1, 2, 3, or 4) of the four patients fills all of their discharge prescriptions within 120 days.

    sum(dbinom(1:4, 4, 0.74))
    ## [1] 0.9954302

c. What is the expected value for F, $\mu_F$? What does that value mean in the context of the scenario?

$\mu_F = 4 \cdot 0.74 = 2.96$, using the binomial shortcut. On average, in a group of 4 patients, 2.96 will get all discharge prescriptions filled within 120 days.

    probs <- dbinom(0:4, 4, 0.74)
    vals <- 0:4
    sum(probs*vals) # Expected Value "long way"
    ## [1] 2.96
    mean <- sum(probs*vals)

d. What is the standard deviation for F, $\sigma_F$?

$\sigma_F = \sqrt{4 \cdot 0.74 \cdot 0.26} = 0.8772685$. This uses the binomial RV shortcut.

    sqrt(sum(probs*(vals-mean)^2)) # SD the "long way"
    ## [1] 0.8772685

e. Suppose this physician now has 20 post-AMI patients >65 years old and wants to use a binomial model (n = 20, π = 0.74) to describe the number of those 20 patients who will get all discharge prescriptions filled within 120 days.

(i) What is the probability that exactly 15 of those 20 patients get all discharge prescriptions filled within 120 days? (You can use R.)

    dbinom(15, 20, 0.74)
    ## [1] 0.2012734
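As a rough cross-check of the Exercise 2 values up to this point, the sketch below recomputes the binomial probabilities directly from the formula choose(n, f) π^f (1 − π)^(n − f) and compares them with dbinom, then applies the shortcut formulas for the mean and standard deviation. The variable names (n_patients, pi_fill, pmf_manual, and so on) are illustrative assumptions and not part of the original solution.

```r
# Illustrative names only; n and pi come from the exercise statement.
n_patients <- 4
pi_fill <- 0.74
f_vals <- 0:n_patients

# pmf from the binomial formula: choose(n, f) * pi^f * (1 - pi)^(n - f)
pmf_manual <- choose(n_patients, f_vals) *
  pi_fill^f_vals * (1 - pi_fill)^(n_patients - f_vals)

# Agreement with R's built-in binomial pmf (part a)
pmf_matches <- isTRUE(all.equal(pmf_manual, dbinom(f_vals, n_patients, pi_fill)))

# P(F > 0) via the complement rule (part b)
p_at_least_one <- 1 - pmf_manual[1]

# Binomial shortcuts for the mean and standard deviation (parts c and d)
mu_F <- n_patients * pi_fill
sigma_F <- sqrt(n_patients * pi_fill * (1 - pi_fill))

# Same formula with n = 20, f = 15 (part e(i))
p_15_manual <- choose(20, 15) * pi_fill^15 * (1 - pi_fill)^5
p_15_matches <- isTRUE(all.equal(p_15_manual, dbinom(15, 20, pi_fill)))

print(c(pmf_matches = pmf_matches, p_15_matches = p_15_matches))
print(c(p_at_least_one = p_at_least_one, mu_F = mu_F, sigma_F = sigma_F))
```

Using mu_F rather than mean as the variable name also avoids masking R's built-in mean() function.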
(ii) What is the probability that 15 or more of those 20 patients get all discharge prescriptions filled within 120 days? (You can use R.)

    sum(dbinom(15:20, 20, 0.74))
    ## [1] 0.5765058

(iii) Which histogram given below correctly shows the pmf for the binomial model described in (e)?

[Figure: four histograms labeled Graph A, Graph B, Graph C, and Graph D, each with Probability on the vertical axis from 0.00 to 0.20. The horizontal axes of Graphs A and D are labeled 0 to 15; those of Graphs B and C are labeled 0 to 20.]

Graph B, since the values 0 to 20 are on the x-axis and P(X = 15) = 0.2012734, as calculated in (i).

Exercise 3. For each of the following questions, say whether the random variable is reasonably approximated by a binomial random variable or not, and explain your answer. Comment on the reasonableness of each of the conditions that must hold for a variable to be a binomial random variable (e.g., identify n, the number of Bernoulli trials; π, the probability of success; etc.).

a. A fair die is rolled until a 1 appears, and X denotes the number of rolls.

Not binomial, since there is not a fixed number of trials.

b. Twenty different Badger basketball players each attempt 1 free throw, and X is the total number of successful attempts.

Not binomial, since the probability of success isn't the same between players.

c. A die is rolled 50 times. Let X be the face that lands up.

Not binomial, since the outcome is not a number of successes.

d. In a bag of 10 batteries, I know 2 are old. Let X be the number of old batteries I choose when taking a sample of 4 to put into my calculator.
Not binomial. The sample size (4) is large relative to the population size (10), so sampling without replacement changes the probability of drawing an old battery from one draw to the next. The trials are not independent, and the probability of success is not the same on every draw.

e. It is reported that 20% of Madison homeowners have installed a home security system. Let X be the number of homes without home security systems installed in a random sample of 100 houses in the Madison city limits.

Binomial! 100 fixed trials, S = no home security installed, P(S) = 0.80, P(F) = 0.20, n = 100. The number of homes in Madison is so large that we can assume independence between homes in a random sample of just 100.
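To see how much the binomial model would be off in Exercise 3(d), here is a minimal sketch comparing it with the exact distribution for sampling without replacement. The solution does not name the hypergeometric distribution; it is brought in here only for comparison, using R's dhyper, and the variable names are illustrative assumptions.

```r
# Exercise 3(d): 10 batteries, 2 old, sample 4 without replacement.
# Variable names are illustrative, not from the original solution.
n_old <- 2      # "successes" in the bag
n_new <- 8      # "failures" in the bag
n_draw <- 4     # batteries sampled
x_vals <- 0:2   # possible counts of old batteries in the sample

# Exact distribution when sampling without replacement (hypergeometric)
p_hyper <- dhyper(x_vals, n_old, n_new, n_draw)

# What a binomial model with n = 4 and pi = 2/10 would claim instead
p_binom <- dbinom(x_vals, n_draw, n_old / (n_old + n_new))

# Side-by-side comparison for X = 0, 1, 2
print(round(rbind(hypergeometric = p_hyper, binomial = p_binom), 4))
```

The two rows differ noticeably (for example, P(X = 0) is 1/3 under the exact model and 0.4096 under the binomial), which is consistent with the "not binomial" answer. In part (e), the population of Madison homes is large relative to the sample of 100, so the same comparison would show much closer agreement.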