STAT200 2023 GE4

School

University of Delaware

Course

200

Subject

Statistics

Date

Jan 9, 2024

Type

docx

Pages

8

Uploaded by SuperHumanMouse356

STAT 200 Guided Exercise 4

For On-Line Students, be sure to: Please submit your answers in a Word or PDF file to Canvas at the place you downloaded the file. You can paste Excel/JMP output into a Word file. Please submit only one file for the assignment. It is OK to do problems by hand. Guided Assignments are not graded, but we check for completed work.

Key Topics: Probability, Discrete Probability, Binomial Distribution, Normal Distribution

1. Sensitivity and Specificity of a Test.

No medical test is 100% certain. There is usually a chance the test will say that the disease is present when it is not (False Positive) or that the disease is not present when it is (False Negative). This example will help us learn more about the specificity and sensitivity of a test, which are both probability concepts. This example was taken from PediaLabs, Calculating Sensitivity and Specificity.

A total of 1500 children have a Rapid Strep Test (RST) done by a standardized culture technique. Of the 1500 children, 1338 have a negative RST and 162 have a positive RST. In addition, a backup throat culture (gold standard) was done on all children. Of those children with a negative RST, 1302 have a negative throat culture. In the group with a positive RST, 159 have a positive throat culture. We will calculate the sensitivity and specificity of the RST. The table below reflects this result.

                      Throat Culture Results
RST Test Result    Present    Absent    Totals
Positive               159         3       162
Negative                36      1302      1338
Totals                 195      1305      1500

a. What is the probability that a person diagnosed as having Strep Throat by the RST actually has Strep Throat? Can you write the formal Probability Statement as part of your answer (think of what is the given)?

P(Strep present | RST positive) = 159/162 = .9815
This is a conditional probability; the given is that the RST is positive.

b. What is the probability that a person not diagnosed as having Strep Throat by the RST does not have Strep Throat? Can you write the formal Probability Statement as part of your answer (think of what is the given)?

P(Strep absent | RST negative) = 1302/1338 = .9731
This is a conditional probability; the given is that the RST is negative.
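The two conditional probabilities in parts a and b can be checked with a short script. This is only a sketch: the dictionary layout and the helper name `conditional` are illustrative choices, not part of the assignment; the counts are the ones in the table above.

```python
# Counts from the RST vs. throat-culture table.
# Keys are (RST result, throat culture result).
counts = {
    ("positive", "present"): 159,
    ("positive", "absent"): 3,
    ("negative", "present"): 36,
    ("negative", "absent"): 1302,
}


def conditional(counts, rst, culture):
    """P(culture result | RST result): the cell count divided by its row total."""
    row_total = sum(n for (r, _), n in counts.items() if r == rst)
    return counts[(rst, culture)] / row_total


p_strep_given_pos = conditional(counts, "positive", "present")  # 159 / 162
p_clear_given_neg = conditional(counts, "negative", "absent")   # 1302 / 1338

print(f"P(present | RST positive) = {p_strep_given_pos:.4f}")
print(f"P(absent  | RST negative) = {p_clear_given_neg:.4f}")
```

Dividing by the row total is what makes these conditional probabilities: the "given" restricts attention to one row of the table.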
We can think of our table in the following way:

                        Disease Status
Test Result    Present               Absent                Totals
Positive       True Positive (a)     False Positive (b)    a + b
Negative       False Negative (c)    True Negative (d)     c + d
Totals         a + c                 b + d                 a + b + c + d

c. The sensitivity of a test is expressed as the probability of a positive test among patients with the disease. The formula is given as:

Sensitivity = True Positive / (True Positive + False Negative)
Sensitivity = a / (a + c)

What is the sensitivity of this test?

159/195 = .8154
The sensitivity of this test is acceptable but not excellent: 36 of the 195 children with strep (about 18%) tested negative.

d. The specificity of a test is expressed as the probability of a negative test among patients without the disease. The formula is given as:

Specificity = True Negative / (True Negative + False Positive)
Specificity = d / (b + d)

What is the specificity of this test?

1302/1305 = .9977
The specificity of this test is excellent because it is close to 1.

e. The web site noted: "These data represent the actual sensitivity and specificity of most rapid strep tests. Because of this, in clinical practice, we trust a positive rapid strep and treat the patient based only on this result, but we do not completely trust a negative - a back up culture must be done to confirm that the patient truly does not have strep throat."
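The formulas in parts c and d translate directly into code. A minimal sketch, using the a, b, c, d cell labels from the table above; the function names are illustrative.

```python
def sensitivity(a, c):
    """True positives divided by everyone who has the disease: a / (a + c)."""
    return a / (a + c)


def specificity(b, d):
    """True negatives divided by everyone without the disease: d / (b + d)."""
    return d / (b + d)


# Cell counts from the strep table:
# a = true positives, b = false positives, c = false negatives, d = true negatives
a, b, c, d = 159, 3, 36, 1302

sens = sensitivity(a, c)
spec = specificity(b, d)

print(f"Sensitivity = {sens:.4f}")
print(f"Specificity = {spec:.4f}")
```

Note that both measures divide by a column total (disease status), whereas the probabilities in parts a and b divide by a row total (test result).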
I am only asking you to ponder about this! Given what you learned above (and look at my answer for sure on this one), does this make sense?

Yes, this makes sense. The specificity is excellent, so false positives are rare and a positive result can be trusted. The sensitivity is only good, so false negatives are more common and a negative result should be confirmed with a culture.

2. Discrete Random Variable: The Number of Games in a Baseball World Series.

Based on past results found in the Information Please Almanac, there is a 0.1809 probability that a baseball World Series contest will last four games, a 0.2234 probability that it will last five games, a 0.2234 probability that it will last six games, and a 0.3723 probability that it will last seven games. The probability table is given below:

X       4        5        6        7
P(X)    .1809    .2234    .2234    .3723

a. What is the mean (expected value) number of games in a World Series?

E(X) = 4(.1809) + 5(.2234) + 6(.2234) + 7(.3723) = 5.7871

b. What is the variance of the number of games in a World Series?

V(X) = E(X^2) - [E(X)]^2 = 34.7645 - 33.4905 = 1.2740

c. Is it unusual for a team to sweep the World Series (win all four games in a row)?

It is not unusual: P(X = 4) = .1809, so about 18% of Series end in a sweep. It is, however, less likely than a Series in which the winner loses at least one game.

3. Consider an experiment in which 10 identical small boxes are placed side-by-side on a table. A crystal is placed, at random, inside one of the boxes. A self-professed "psychic" is asked to pick the box that contains the crystal. This experiment is repeated seven times, and x is the number of correct decisions in seven tries. Thus, it is a Binomial random variable.

a. If the psychic is guessing, what is the value of p, the probability of a correct decision on each trial?

p = 1/10 = .1

b. Fill in the remaining portions of this table reflecting the probability distribution for this variable using the binomial table or the binomial formula. The Binomial Table for n = 7 and p = .10 is much easier!

x       0        1        2        3        4        5        6        7
p(x)    .4783    .3720    .1240    .0230    .0026    .0002    .0000    .0000

c. If the psychic is guessing, what is the expected number of correct decisions in seven trials, and what is the variance?

E(x) = np = 7(.1) = .7
V(x) = np(1 - p) = 7(.1)(.9) = .63

d. If the psychic is guessing, what is the probability of no correct decisions in seven trials?

P(x = 0) = .4783

e. One of the "psychics" who took the test got all seven wrong. Suppose the criterion for having ESP is that you could guess right with p = .5. In other words, if you are a psychic you might not get it right all the time, but you should be doing much better than chance. If p = .5 instead of .10, what is the probability of guessing incorrectly on all seven trials?

(.5)^7 = .0078

4. If a single bit of data (0 or 1) is transmitted over a noisy communication channel, it has a probability p of being incorrectly transmitted. To improve the reliability of the transmission, the bit is transmitted n times, where n is odd. A decoder at the receiving end, called a majority decoder, decides that the correct message is the one carried by the majority of the received bits. This means that if there are five transmissions of a (0,1) bit, the bit used by at least three of the transmissions would be considered correct. Assume that each bit is independently subject to being corrupted with the same probability p, and that p = .1. Note, p is the probability of an error, and in terms of a binomial problem we will think of X as the number of errors in n transmissions.

a. If a company sent only one transmission, what is the probability of it being received without an error?

1 - p = 0.9

b. A company decides to use 5 transmissions for each data item as a strategy to reduce errors (n = 5). This may seem excessive, but sending data is very fast. They will then choose the outcome where the majority of the transmissions agree. Set up the outcomes for 5 transmissions and the probabilities associated with each outcome using the binomial distribution.

n = 5, p = .1

X       0         1         2         3         4         5
p(X)    0.5905    0.3281    0.0729    0.0081    0.0005    0.0000

c. Calculate the mean, variance, and standard deviation for this problem.

Mean = np = 0.5
Variance = np(1 - p) = 0.45
Standard Dev = sqrt(0.45) = .6708
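The binomial tables and summary measures in problems 3 and 4 can be reproduced from the binomial formula. The sketch below uses only Python's standard library; the helper name `binomial_pmf` is an illustrative choice, and the course itself uses a binomial table or Excel/JMP.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Problem 4: number of transmission errors in n = 5 sends, p = .1 per send.
n, p = 5, 0.1
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = n * p                 # np
variance = n * p * (1 - p)   # np(1 - p)
sd = sqrt(variance)

for x, prob in enumerate(pmf):
    print(f"P(X = {x}) = {prob:.5f}")
print(f"mean = {mean:.2f}, variance = {variance:.2f}, sd = {sd:.4f}")

# Problem 3: the probability of zero correct guesses in seven tries.
p_none_guessing = binomial_pmf(0, 7, 0.1)   # psychic is guessing
p_none_esp = binomial_pmf(0, 7, 0.5)        # hypothetical ESP standard, p = .5
print(f"{p_none_guessing:.4f} {p_none_esp:.4f}")
```

Printing five decimals makes it clear that P(X = 4) is exactly .00045, which is why a four-decimal table may show either .0004 or .0005 depending on how it rounds.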
d. If five messages are sent for each bit, the probability that the message is correctly received is the probability of two or fewer errors. This is not easy to see, but think it through with me. If the system sends 3, 4, or 5 wrong messages, the majority decoder strategy will accept the wrong message and make a wrong decision. But if the wrong message is sent 2, 1, or 0 times, the right message will be accepted. Look at the probability of 0, 1, or 2 errors from our binomial table above. What is the probability that the message is correctly received in five transmissions (i.e., 2 or fewer errors)? Compare that with the answer you derived in Part a. Did sending five transmissions improve the chances of sending the message correctly?

P(X = 0) + P(X = 1) + P(X = 2) = .59049 + .32805 + .0729 = .9914 (the rounded table values sum to .9915)

This is much better than .9. The majority decoder strategy, with n = 5 transmissions, greatly improved the chance of a right transmission.

5. Discrete Random Variable Problem. A concert producer has scheduled an outdoor concert on a Saturday. If it does not rain, he expects to make $20,000 profit from the concert. If it does rain, the producer will be forced to cancel the concert and lose $12,000 (from fees, advertising, stadium rental, and so forth). The probability of rain on Saturday is .4.

a. What is the expected profit from the concert? Hint: write out the probability distribution and solve for the expectation. The values that your random variable can take are the dollar values.

X       20000    -12000
P(X)    0.6      0.4

E(X) = 20000(0.6) + (-12000)(0.4) = 7200

b. For a fee of $1,000 an insurance company will insure against all losses from a rained out concert. If the producer buys the insurance, what is the expected profit from the concert? Note: an insurance fee is a fixed cost incurred regardless of whether it rains or not.

X       19000    -1000
P(X)    0.6      0.4

E(X) = 19000(0.6) + (-1000)(0.4) = 11000
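The expectation calculations in parts a and b follow one pattern: multiply each outcome by its probability and add. A minimal sketch, with illustrative helper names; it assumes, as the problem states, that the policy covers the entire $12,000 loss, so the insured producer is out only the fee when it rains. The same helpers also reproduce the World Series mean and variance from problem 2.

```python
def expected_value(values, probs):
    """Sum of each outcome times its probability."""
    return sum(v * p for v, p in zip(values, probs))


def variance(values, probs):
    """Probability-weighted average of squared deviations from the mean."""
    mu = expected_value(values, probs)
    return sum(p * (v - mu) ** 2 for v, p in zip(values, probs))


p_rain = 0.4
probs = [1 - p_rain, p_rain]        # [no rain, rain]

uninsured = [20_000, -12_000]       # profit without insurance
fee = 1_000
insured = [20_000 - fee, -fee]      # assumption: the insurer reimburses the full loss

e_uninsured = expected_value(uninsured, probs)
e_insured = expected_value(insured, probs)
print(f"Expected profit, uninsured: {e_uninsured:.0f}")
print(f"Expected profit, insured:   {e_insured:.0f}")

# Problem 2: number of games in a World Series.
games = [4, 5, 6, 7]
game_probs = [0.1809, 0.2234, 0.2234, 0.3723]
ws_mean = expected_value(games, game_probs)
ws_var = variance(games, game_probs)
print(f"World Series: mean = {ws_mean:.4f}, variance = {ws_var:.4f}")
```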
c. Assuming the forecast is accurate, do you believe the insurance company has charged too much or too little? Hint: reformulate the problem to express outcomes in terms of the insurance company and what they expect to pay out.

Expected payout = 12000(0.4) + 0(0.6) = 4800

They charged too little: the $1,000 fee is well below the expected payout of $4,800.

6. Normal Distribution. Plastic bags used for packaging produce are manufactured so that the breaking strength of the bag is normally distributed with a mean of 5 pounds per square inch and a standard deviation of 1.5 pounds per square inch. What proportion of the bags produced have a breaking strength of:

You can use the Standard Normal Table to solve these problems. I provide one version of this table. Other versions are available on-line. Regardless of the version, the answer will be the same. I also provide an Excel file that is very useful for these problems, Normal.xlsx. The Excel file will have the most accuracy since you do not need to round the z-score to 2 decimal places. I use the Excel file for my answers. The difference between the table answers and Excel will be small.

The Standard Normal Problems all start the same way, by creating a z-score of the value of X. I suggest you do a quick sketch of the problem to remind yourself what part of the curve you are interested in.

a. Less than 3.17 pounds per square inch?

z = (3.17 - 5)/1.5 = -1.22; P(X < 3.17) = .1112

b. At least 3.6 pounds per square inch?

z = (3.6 - 5)/1.5 = -0.9333; P(X >= 3.6) = .8247

c. Between 5 and 5.5 pounds per square inch?

z from 0 to 0.3333; P(5 < X < 5.5) = .1306

d. Between 3.2 and 4.2 pounds per square inch?

z from -1.2 to -0.5333; P(3.2 < X < 4.2) = .1818

7. Normal Distribution. You have been hired as a consultant to provide analysis for the Personnel Department at ZTel company, a large communications company. Every applicant of ZTel must take a standardized exam, and the hire or no-hire decision depends in part on this exam. The exam was purchased from a company which says the exam is distributed approximately normal with:

μ = 80.0
σ = 13.5

The current interview policy has two phases. The first phase separates all applicants into one of three categories:

Automatic Interview    score of 89 or above
Maybe Interview        score of 74 to 89
Automatic Reject       score less than 74

The Maybe group is passed on to a second phase where their previous experiences, education, special skills, and other factors are taken into consideration in whether to grant an interview. No one at the company can remember why the values of 89 and 74 were used as the standards for automatic interview or rejection, and most likely they were decided arbitrarily by a former Personnel Manager. The current Personnel Manager of ZTel needs to know the following:

a. The probability associated with the current standard of being automatically rejected - what proportion of the applicants are automatically rejected (those < 74)?

z = (74 - 80)/13.5 = -0.4444; P(X < 74) = .3284

b. The probability associated with the current standard of being automatically interviewed - what proportion of the applicants are automatically interviewed (those > 89)?

z = (89 - 80)/13.5 = 0.6667; P(X > 89) = .2525, or about 25%

c. The manager notices that applicants who score between 85 and 92 tend to be good hires, having both good skills and a higher probability of accepting an offer to the company. She would like to give this group a higher priority in the second phase of evaluation. What percentage of the applicants should she expect to fall within this range?

z from 0.3704 to 0.8889; P(85 < X < 92) = .1685, or about 16.9%

d. The manager would prefer that the exam score for automatic interview be set at the top 15% (above the 85th percentile) and the automatic rejection be set at the bottom 20% (below the 20th percentile). What are the exam values in this distribution associated with these probabilities (in this case, round to whole numbers)? Hint: Draw out the distribution so you can see what percentiles you need to solve for.

85th percentile: 80 + 1.0364(13.5) = 93.99; use 94
20th percentile: 80 - 0.8416(13.5) = 68.64; use 69

e. Summarize your results as a recommendation to your client.
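The new standards (94 and 69) are more extreme than the current ones (89 and 74), which makes it harder to get an automatic interview but also harder to be automatically rejected. The share of automatic interviews falls from about 25% to 15%, and the share of automatic rejections falls from about 33% to 20%. This allows more people to stand out through their other skills and achievements in the second phase, which I believe is good.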
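The normal-distribution answers in problems 6 and 7 can be checked without a table. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later) rather than the course's Normal.xlsx file; the variable names are illustrative. Like the Excel file, it does not round the z-score, so results can differ slightly from table lookups.

```python
from statistics import NormalDist

# Problem 6: breaking strength, mean 5 and standard deviation 1.5 (psi).
bags = NormalDist(mu=5, sigma=1.5)
p_less_317 = bags.cdf(3.17)                     # a. less than 3.17
p_at_least_36 = 1 - bags.cdf(3.6)               # b. at least 3.6
p_5_to_55 = bags.cdf(5.5) - bags.cdf(5)         # c. between 5 and 5.5
p_32_to_42 = bags.cdf(4.2) - bags.cdf(3.2)      # d. between 3.2 and 4.2

# Problem 7: exam scores, mean 80 and standard deviation 13.5.
exam = NormalDist(mu=80, sigma=13.5)
p_reject = exam.cdf(74)                         # a. below 74
p_interview = 1 - exam.cdf(89)                  # b. above 89
p_priority = exam.cdf(92) - exam.cdf(85)        # c. between 85 and 92
cut_interview = exam.inv_cdf(0.85)              # d. 85th percentile
cut_reject = exam.inv_cdf(0.20)                 # d. 20th percentile

for label, value in [
    ("6a", p_less_317), ("6b", p_at_least_36), ("6c", p_5_to_55), ("6d", p_32_to_42),
    ("7a", p_reject), ("7b", p_interview), ("7c", p_priority),
]:
    print(f"{label}: {value:.4f}")
print(f"7d: interview cutoff = {cut_interview:.2f}, reject cutoff = {cut_reject:.2f}")
```

`cdf` gives the area to the left of a value, so "at least" and "above" questions use one minus the cdf, and "between" questions use a difference of two cdf values. `inv_cdf` goes the other way, from a percentile to a score.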