Chapter 9 Hypothesis Testing

Balan, R., & Lamothe, G. (2017). Expect the unexpected : A first course in biostatistics (second edition). World Scientific Publishing Company. Created from ottawa on 2023-09-29 20:14:56. Copyright © 2017. World Scientific Publishing Company. All rights reserved.

In this chapter, we introduce another statistical method for drawing conclusions about the value of a parameter. This method consists in confronting two hypotheses about the parameter. It is used when one wants to gain support (or evidence) for a desired statement, called the "research hypothesis" and denoted by $H_1$. The other hypothesis, which the researcher wants to reject, is called the "null hypothesis" and is denoted by $H_0$. When using this method, we formulate the two hypotheses with the goal of rejecting $H_0$ and gaining evidence for $H_1$.

9.1 Hypothesis Testing for the Mean: Large Samples

In this section, we introduce the method of hypothesis testing when the parameter of interest is the population mean $\mu$ and the sample size $n$ is large, i.e. $n \geq 40$. The null hypothesis $H_0$ says that the unknown parameter $\mu$ is equal to a specified numerical value $\mu_0$:

$$H_0: \mu = \mu_0.$$

Under new experimental conditions, the mean measurement $\mu$ is thought to deviate from $\mu_0$, which is a value obtained under standard conditions. The alternative hypothesis $H_1$ (that we would like to gain evidence for) specifies the direction of this change in $\mu$. This hypothesis can take three different forms:

(1) $\mu$ is larger than $\mu_0$. In this case, we write $H_1: \mu > \mu_0$, and we say that we perform a right-tailed test. This set-up is used when one wants to gain evidence that $\mu$ exceeds the hypothesized value $\mu_0$.
(2) $\mu$ is smaller than $\mu_0$. In this case, we write $H_1: \mu < \mu_0$, and we say that we perform a left-tailed test. This set-up is used when one wants to gain evidence that $\mu$ is lower than the hypothesized value $\mu_0$.
(3) $\mu$ is different from $\mu_0$. In this case, we write $H_1: \mu \neq \mu_0$, and we say that we perform a two-tailed test. This set-up is used when the direction of the change in $\mu$ is unknown.

Setting up the hypotheses in the desired way (i.e. choosing the appropriate alternative hypothesis $H_1$ among the three possibilities listed above) is the first and most important step of a statistical testing procedure. Before performing the test, the statistician has to decide what the alternative hypothesis $H_1$ is. This decision automatically dictates which of the three cases above has to be used for the problem at hand.

The conclusion of a test of hypothesis is one of the following:

(i) We reject $H_0$. In this case, we say that there is enough evidence in favour of $H_1$. (We may say that $H_1$ is true.)
(ii) We fail to reject $H_0$. In this case, we say that there is not enough evidence in favour of $H_1$. (We avoid saying that $H_0$ is true, although this may help with the logic.)

As a consequence, hypothesis testing can result in two types of errors:

A type I error (whose probability is denoted by $\alpha$) occurs if we reject $H_0$ when $H_0$ is true.
A type II error (whose probability is denoted by $\beta$) occurs if we fail to reject $H_0$ when $H_1$ is true.

Ideally, both probabilities $\alpha$ and $\beta$ should be small. The table below illustrates all four possibilities:

              Reject H0                          Fail to reject H0
H0 true       Type I error                       Correct decision
              (probability α)                    (probability 1 - α)
H1 true       Correct decision                   Type II error
              (probability 1 - β)                (probability β)

Fig. 9.1 Probabilities associated with a test of hypothesis
Example 9.1. The effects of inhaling particulate matter (PM) have been widely studied in humans. The smaller particles PM$_{10}$ (particles with a diameter of less than 10 micrometers) are especially dangerous, and possibly related to asthma and lung cancer. As of January 1, 2005, the European Commission has set the limit for PM$_{10}$ in the air at 50 μg/m³ (daily average). Local health organizations in a large European city are concerned that the PM$_{10}$ level in the outdoor air is higher than the permissible 50 μg/m³. To test the validity of this statement, levels of PM$_{10}$ were measured on 40 different days, yielding an average $\bar{x} = 52.5$ μg/m³ and a sample variance $s^2 = 33.5$.

To set up the two hypotheses correctly, we keep in mind that we want to reject $H_0$ in favor of $H_1$. Therefore, we set $H_0$: "the average level of PM$_{10}$ is equal to 50" and $H_1$: "the average level of PM$_{10}$ exceeds 50". We are confronting the following two hypotheses:

$$H_0: \mu = 50 \quad \text{versus} \quad H_1: \mu > 50.$$

A type I error occurs when we decide that the PM$_{10}$ level is higher than 50, when in fact it is not. This does not have a negative health impact, but may result in falsely alarming the public. A type II error occurs when we are unable to gain evidence that the PM$_{10}$ level is higher than 50, when in fact it is. This may have a negative health effect on the population.

Example 9.2. Cholesterol is one of the body's fats, used for making cell membranes, vitamin D and hormones. High levels of low-density lipoprotein (LDL) cholesterol in the blood can cause the build-up of plaque in the artery walls, which is a major risk factor for heart disease and stroke. The Canadian Heart and Stroke Foundation advises a diet low in saturated fats and regular physical activity as effective measures for reducing LDL blood cholesterol levels.
To gain evidence for this statement, we use a sample of 52 Canadians with a high LDL blood cholesterol level of 4.0 mmol/L, who were on a low-fat diet for 30 days, combined with 30 minutes of daily cardio exercise. After this period, the average LDL blood cholesterol level for this sample was found to be $\bar{x} = 3.5$ (which is lower than the initial value $\mu_0 = 4.0$), with a sample standard deviation $s = 1.12$.

We now set up the two hypotheses in the desired direction. The goal is to reject $H_0$ and gain evidence for $H_1$. The null hypothesis $H_0: \mu = 4.0$ says that despite the new measures, the average LDL blood cholesterol level stays the same. The alternative hypothesis $H_1: \mu < 4.0$ says that the LDL
blood cholesterol level is reduced. We are confronting the following two hypotheses:

$$H_0: \mu = 4.0 \quad \text{versus} \quad H_1: \mu < 4.0.$$

A type I error occurs when we decide that diet combined with exercise reduces the LDL blood cholesterol level, when in fact it does not. A type II error occurs if we are unable to gain evidence that diet combined with exercise reduces the LDL blood cholesterol level, when in fact it does.

Example 9.3. Recent studies suggest that Bacillus Calmette-Guérin (BCG) vaccination early in life is related to asthma. A commonly used index of asthma in a population is the level of forced expiratory volume in one second (FEV$_1$). The level of FEV$_1$ was measured in 46 adult men who were administered the BCG vaccine at the age of 14, yielding an average volume $\bar{x} = 4.52$ BTPS and a sample variance $s^2 = 2.1$. We would like to gain evidence that the BCG vaccination induces a change in the FEV$_1$ level, the direction of the change being unknown. For adult men, the normal level of FEV$_1$ is around 4.00 BTPS.

We set up the two hypotheses with the goal of rejecting $H_0$ in favor of $H_1$. Hypothesis $H_0$ says that the BCG vaccination does not induce a significant change in the FEV$_1$ level. The alternative hypothesis $H_1$ says that the BCG vaccination induces either an increase or a decrease in the FEV$_1$ level. We want to test:

$$H_0: \mu = 4.00 \quad \text{versus} \quad H_1: \mu \neq 4.00,$$

$\mu$ being the average FEV$_1$ level in the BCG-vaccinated male population.

A type I error occurs when we decide that the BCG vaccination induces a change in the FEV$_1$ level, when in fact it does not. A type II error occurs if we are unable to gain evidence that the BCG vaccination affects the FEV$_1$ level, when in fact it does.

We treat the three cases separately, explaining what method to use in each case.

Case (1). $H_0: \mu = \mu_0$ versus $H_1: \mu > \mu_0$

This is the case when we want to gain evidence that the true mean $\mu$ of the population is larger than a numerical value $\mu_0$. To move in the direction of $H_1$, we first have to make sure that, for our sample, the sample average $\bar{x}$ is larger than $\mu_0$. (If this is not the case, there is no hope that we can gather any evidence for $H_1$.)
We then calculate the difference $\bar{x} - \mu_0$, and hope that this difference is a large (positive) number. If so, then we reject $H_0$; otherwise, we say that we do not have enough evidence for rejecting $H_0$.

But how large should this number be? To answer this question, the difference $\bar{x} - \mu_0$ itself is not of much help. We have to calculate the standardized ratio

$$z_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

and see if this ratio is large, compared with all the possible values of the same ratio that may arise from other samples. The collection of all possible samples is huge and therefore it is impossible to calculate all the corresponding ratios. Luckily, the way these ratios fluctuate is well known: if $H_0$ is true, then by Theorem 8.1,

$$Z_0 = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \quad \text{has approximately an } N(0,1) \text{ distribution},$$

if $n$ is large enough (i.e. $n \geq 40$). In this context, $Z_0$ is called the test statistic.

The question is: supposing that $H_0$ is true, do we expect to see only rarely sample averages larger than our observed $\bar{x}$, or is our $\bar{x}$ a rather typical value, so that we should expect to see even larger values very often? To answer this question, we use Table 18.3 to calculate the probability that an $N(0,1)$ random variable takes a value larger (i.e. more extreme) than the value $z_0$ that we already observed. This probability is called the p-value of the right-tailed test:

$$p\text{-value} = P(Z > z_0),$$

and corresponds to the right tail of the $N(0,1)$ density. We say that $z_0$ is the observed value of the test statistic $Z_0$.

The smaller the p-value, the stronger the evidence against $H_0$. The interpretation is the following: a small p-value means that values larger than $\bar{x}$ are rarely encountered under $H_0$, and therefore the observed data cast doubt on $H_0$. On the other hand, a large p-value means that values larger than $\bar{x}$ are frequently encountered under $H_0$, and therefore the data are consistent with $H_0$.
Reporting the p-value is an important step in any statistical analysis, since it gives us an idea of how compatible the observed data are with $H_0$.

Sometimes, statisticians are supplied with an a priori $\alpha$-value for the probability of the type I error. In this case, the decision rule of the test is based on the following comparison between the p-value and $\alpha$:

if p-value $< \alpha$, then we reject $H_0$;
if p-value $\geq \alpha$, then we fail to reject $H_0$.
In this context, $\alpha$ is called the significance level of the test. Note that there is some uncertainty in this decision-making process. In fact, a statistician is never 100% sure of making the right decision.

The above rule ensures that the probability of the type I error is approximately equal to $\alpha$. To see this, note that

$$p\text{-value} = P(Z > z_0) < \alpha = P(Z > z_\alpha)$$

is equivalent to saying that $z_0 > z_\alpha$. Therefore

$$P(\text{type I error}) = P(\text{reject } H_0 \text{ when } H_0 \text{ is true}) = P(Z_0 > z_\alpha;\ \mu = \mu_0) \approx \alpha,$$

using the fact that $Z_0$ has approximately an $N(0,1)$ distribution when $H_0$ is true.

Example 9.1 (continued). Suppose that the daily level $X$ of PM$_{10}$ in the city's outdoor air is a random variable of unknown mean $\mu$. The sample size is $n = 40$. The sample average $\bar{x} = 52.5$ indicates that the unknown average $\mu$ may be higher than the threshold value $\mu_0 = 50$. To gain evidence for this claim, we calculate the ratio:

$$z_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{52.5 - 50}{\sqrt{33.5/40}} = 2.73.$$

We cannot say if the value of this ratio is large or small until we compare it with all the other possible values that may arise if we change the sample. This comparison is performed using the p-value. From Table 18.3,

$$p\text{-value} = P(Z > 2.73) = 1 - 0.9968 = 0.0032.$$

This p-value is very small: if $H_0: \mu = 50$ were true, a sample average this large would rarely be observed. Therefore, we have enough evidence for rejecting $H_0$ in favor of $H_1: \mu > 50$. The conclusion is that in this city, the average PM$_{10}$ level in the outdoor air exceeds the permissible level of 50 μg/m³ per day.

Case (2). $H_0: \mu = \mu_0$ versus $H_1: \mu < \mu_0$

In this case, we want to gain evidence that the average $\mu$ is smaller than a given value $\mu_0$. This time, we first have to make sure that the sample average $\bar{x}$ is smaller than $\mu_0$.
Then, we calculate the same ratio $z_0 = (\bar{x} - \mu_0)/(s/\sqrt{n})$, keeping in mind that this (negative) value should be compared against all the other negative values in Table 18.2, which could be obtained from different samples. The p-value is a measure
of how "negative" this ratio is compared with all the other values. It gives the probability that one can obtain something even more "negative" (or more extreme) with another sample. The p-value for the left-tailed test is:

$$p\text{-value} = P(Z < z_0).$$

Note that this probability corresponds to the left tail of the $N(0,1)$ density. A small p-value means that $\bar{x}$ is sufficiently small compared to $\mu_0$. In this case, we reject $H_0$ and conclude that there is enough evidence that $\mu$ is smaller than $\mu_0$. As in Case (1), if a preset $\alpha$-value is given for the probability of the type I error, we reject $H_0$ if and only if the p-value is smaller than $\alpha$.

Example 9.2 (continued). Let $X$ be the LDL blood cholesterol level of a randomly chosen person who was on a low-fat diet for 30 days, combined with daily exercise. The sample average $\bar{x} = 3.5$ is smaller than the initial cholesterol level of 4.0, so we can proceed with the test. We calculate the ratio:

$$z_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{3.5 - 4.0}{1.12/\sqrt{52}} = -3.22.$$

This is a very extreme value for an observation of an $N(0,1)$ random variable $Z$. More precisely, from Table 18.2,

$$p\text{-value} = P(Z < -3.22) = 0.0006.$$

This is a very small probability. We reject $H_0$ in favor of $H_1$. The conclusion of this study is that a low-fat diet and exercise are effective means of reducing the LDL blood cholesterol level.

Case (3). $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$

In this case, we want to show that the unknown average $\mu$ is significantly different from a value $\mu_0$, without any preference for the direction of the change in $\mu$ compared to $\mu_0$. This type of test can be performed whether $\bar{x}$ is larger or smaller than $\mu_0$. What matters is the absolute value of the difference $\bar{x} - \mu_0$. In fact, our conclusion is based on the absolute value of the ratio $z_0 = (\bar{x} - \mu_0)/(s/\sqrt{n})$.
If this value is very large (or extreme), then we reject $H_0$; otherwise, we do not have enough evidence for rejecting $H_0$.
The p-value calculation takes into account the fact that the same absolute value $|z_0|$ can be encountered in two different situations: when $\bar{x}$ is larger than $\mu_0$, or when $\bar{x}$ is smaller than $\mu_0$. Before selecting the sample, we do not know which of these two situations will be encountered. For this reason, the p-value calculation considers both tails of the $N(0,1)$ density. The p-value of a two-tailed test is:

$$p\text{-value} = 2P(Z > |z_0|).$$

The factor 2 in the formula above is due to the symmetry of the density of the $N(0,1)$ distribution.

As in the previous two cases, we reject $H_0$ if the p-value is small. Otherwise, we do not have enough evidence for rejecting $H_0$. If an a priori $\alpha$-value is given, we reject $H_0$ if and only if the p-value is smaller than $\alpha$.

Example 9.3 (continued). Let $X$ be the FEV$_1$ level in the BCG-vaccinated male population. We first calculate the absolute value:

$$|z_0| = \frac{|\bar{x} - \mu_0|}{s/\sqrt{n}} = \frac{|4.52 - 4.00|}{\sqrt{2.1/46}} = 2.43.$$

From Table 18.3, we find

$$p\text{-value} = 2P(Z > 2.43) = 2(1 - 0.9925) = 2(0.0075) = 0.015.$$

If the preset $\alpha$-value is $\alpha = 0.01$ (i.e. we are willing to accept only a 1% risk of making a type I error), then we fail to reject $H_0$, and conclude that there is not enough evidence that the BCG vaccination affects the FEV$_1$ level. However, if the preset $\alpha$-value is $\alpha = 0.05$ (i.e. we are willing to accept a 5% risk of making a type I error), then we reject $H_0$, and conclude that the BCG vaccination may affect the FEV$_1$ level.

In Case (3), we want to gain evidence that $\mu$ is significantly different from the value $\mu_0$, but we are not sure if it is larger or smaller. Someone might be tempted to conclude that $\mu \neq \mu_0$ if $H_0$ has been rejected in at least one of the one-sided tests. This type of argument is called data snooping.
It is equivalent to choosing a right-sided or a left-sided alternative based on the observed sample mean. This leads to an inflated risk of committing a type I error. We need to set up the hypotheses before looking at the data. If you do not have a priori information that $\mu$ is highly likely to be on a particular side of $\mu_0$, then a two-sided alternative should be used.
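The three cases of this section can be sketched in a few lines of Python. This is only an illustration, not part of the book: the function name `z_test` and its `alternative` argument are hypothetical names chosen here, and the $N(0,1)$ probabilities come from the standard library's `statistics.NormalDist` instead of Tables 18.2 and 18.3. Because $z_0$ is not rounded to two decimals before the lookup, the p-values may differ slightly from the tabled ones.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, s, n, alternative):
    """Large-sample test for the mean (illustrative sketch).

    alternative: "greater" (H1: mu > mu0), "less" (H1: mu < mu0)
                 or "two-sided" (H1: mu != mu0).
    Returns the observed test statistic z0 and the p-value.
    """
    z0 = (xbar - mu0) / (s / sqrt(n))
    Z = NormalDist()  # standard normal N(0, 1)
    if alternative == "greater":
        p_value = 1 - Z.cdf(z0)             # right tail: P(Z > z0)
    elif alternative == "less":
        p_value = Z.cdf(z0)                 # left tail: P(Z < z0)
    elif alternative == "two-sided":
        p_value = 2 * (1 - Z.cdf(abs(z0)))  # both tails: 2 P(Z > |z0|)
    else:
        raise ValueError("unknown alternative: " + alternative)
    return z0, p_value


# Example 9.1 (PM10): n = 40, xbar = 52.5, s^2 = 33.5
z1, p1 = z_test(52.5, 50, sqrt(33.5), 40, "greater")
# Example 9.2 (LDL cholesterol): n = 52, xbar = 3.5, s = 1.12
z2, p2 = z_test(3.5, 4.0, 1.12, 52, "less")
# Example 9.3 (FEV1): n = 46, xbar = 4.52, s^2 = 2.1
z3, p3 = z_test(4.52, 4.00, sqrt(2.1), 46, "two-sided")

for label, z0, p in [("9.1", z1, p1), ("9.2", z2, p2), ("9.3", z3, p3)]:
    print(f"Example {label}: z0 = {z0:.2f}, p-value = {p:.4f}")
```

Rounded to two decimals, the three statistics agree with the values 2.73, -3.22 and 2.43 obtained in Examples 9.1 to 9.3, and the p-values should be close to the tabled 0.0032, 0.0006 and 0.015. Note that the `alternative` argument must be fixed before looking at the data, for the reason given in the discussion of data snooping.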
9.2 Hypothesis Testing for the Mean: Small Samples

In this section, we slightly modify the procedure developed in the previous section for performing a test on the population average $\mu$, in the case when the sample size is small and the measurement $X$ is normally distributed. We consider the three cases separately:

Case (1). $H_0: \mu = \mu_0$ versus $H_1: \mu > \mu_0$

The method is very similar to the one encountered in Section 9.1. A large value of $\bar{x}$ (compared to $\mu_0$) is an indication that $H_0$ is not true. The only difference is that, when the sample size is small, we can no longer say that the distribution of the ratio $(\bar{X} - \mu_0)/(S/\sqrt{n})$ is approximately $N(0,1)$ (when $H_0$ is true). However, by Theorem 8.2, we know that if $H_0$ is true, then

$$T_0 = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \quad \text{has a } T(n-1) \text{ distribution}.$$

The test is based on the calculation of the studentized ratio $t_0 = (\bar{x} - \mu_0)/(s/\sqrt{n})$. In this case, $t_0$ is the observed value of the test statistic $T_0$. Note that $t_0$ has exactly the same expression as the ratio $z_0$ defined in Section 9.1. We use a different notation here to emphasize the fact that $t_0$ is the observed value of the variable $T_0$, which has a $T(n-1)$ distribution, whereas $z_0$ is the observed value of the variable $Z_0$, which has an $N(0,1)$ distribution.

A large (positive) value of the ratio $t_0$ is an indication that $H_0$ is not true. To see if this ratio is really large (compared with other values encountered from different samples), we consider the following p-value of the right-tailed test:

$$p\text{-value} = P(T > t_0),$$

where $T$ is a random variable with a $T(n-1)$ distribution. A small p-value is an indication that $\bar{x}$ is sufficiently large. In this case, we reject $H_0$; otherwise, we fail to reject $H_0$. If a preset $\alpha$-value is given, we reject $H_0$ if and only if the p-value is smaller than $\alpha$.
This means that we use the following decision rule:

if p-value $< \alpha$, then we reject $H_0$;
if p-value $\geq \alpha$, then we fail to reject $H_0$.

This rule guarantees that the probability of the type I error is equal to $\alpha$. To see this, note that

$$p\text{-value} = P(T > t_0) < \alpha = P(T > t_{\alpha,n-1})$$
is equivalent to saying that $t_0 > t_{\alpha,n-1}$. Therefore

$$P(\text{type I error}) = P(\text{reject } H_0 \text{ when } H_0 \text{ is true}) = P(T_0 > t_{\alpha,n-1};\ \mu = \mu_0) = \alpha,$$

using the fact that $T_0$ has a $T(n-1)$ distribution when $H_0$ is true.

We should say a few words about the p-value calculation in this case. Due to the limitations of Table 18.4, which gives only the values $t$ corresponding to a selected number of probabilities $P(T \geq t)$, in the examples below we content ourselves with reporting only the interval where the p-value lies. For this, we have to place the ratio $t_0$ between some values that we identify in Table 18.4, on row $\nu = n - 1$.

In some examples, this means finding two values $t_1 < t_2$ (whose corresponding areas to the right are $\alpha_1 > \alpha_2$), such that $t_1 < t_0 < t_2$. In this case, we report that:

$$\alpha_2 < p\text{-value} < \alpha_1.$$

In other examples, we may find only one value $t_1$ (whose corresponding area to the right is $\alpha_1$), such that $t_0 > t_1$. In this case, we report that:

$$p\text{-value} < \alpha_1.$$

Note that, due to the limitations of this procedure, a comparison with a preset $\alpha$-value is not always possible. In practice, the exact p-value is obtained using statistical software.

Example 9.4. The leatherback is one of the biggest and deepest-living of all sea turtles. Its immense mass of up to 2,000 pounds helps it stay warm in frigid water. In recent years, the number and the size of leatherbacks in the Atlantic have increased, due to the abundant jellyfish population off the coasts of Nova Scotia, where they come to feed after nesting on the beaches of Trinidad. The claim is that the average mass of an Atlantic leatherback is now higher than 1,000 pounds. We want to test this claim, using the hypotheses

$$H_0: \mu = 1{,}000 \quad \text{versus} \quad H_1: \mu > 1{,}000,$$

where $\mu$ is the average mass of an Atlantic leatherback.
A type I error occurs when we decide that the average mass of an Atlantic leatherback is higher than 1,000 pounds, when in fact it is not. A type II error occurs if we conclude that there is not enough evidence that the average mass is higher than 1,000 pounds, when in fact it is.

We use a sample of 7 leatherbacks, whose average mass is found to be $\bar{x} = 1{,}045$ pounds, with a standard deviation $s = 67$ pounds. We assume that the mass $X$ of a randomly chosen Atlantic leatherback has a normal distribution. To perform the test, we calculate the ratio:

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1{,}045 - 1{,}000}{67/\sqrt{7}} = 1.78.$$

To decide if the value 1.78 is sufficiently large for a random variable $T$ with a $T(6)$ distribution, we consider:

$$p\text{-value} = P(T > 1.78).$$

Searching on row $\nu = 7 - 1 = 6$ of Table 18.4 for a value close to 1.78, we find that 1.78 lies between 1.440 and 1.943, whose corresponding areas to the right are 0.10 and 0.05, respectively.

[Fig. 9.2 T(6) distribution]

We conclude that:

$$0.05 < p\text{-value} < 0.10.$$

Using statistical software, we see that p-value = 0.063.

Suppose first that the preset value $\alpha$ is 0.05, i.e. we are willing to accept a 5% risk of making a type I error. Since the p-value is higher than 0.05, we fail to reject $H_0$ and we conclude that there is not enough evidence that the average mass of Atlantic leatherbacks is larger than 1,000 pounds.

Suppose next that we are willing to accept a 10% risk of making a type I error (i.e. $\alpha = 0.10$). In this case, since the p-value is smaller than 0.10, we reject $H_0$ in favor of $H_1$, and conclude that the average mass of Atlantic leatherbacks is larger than 1,000 pounds.
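As a rough illustration of how software can produce an exact p-value such as the one quoted in Example 9.4, the sketch below integrates the $T(\nu)$ density numerically using only the Python standard library. The helper names `t_pdf` and `t_right_tail` and the `steps` parameter are hypothetical, and Simpson's rule is used here only for simplicity; statistical packages rely on more refined methods.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the Student T(df) distribution at x."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_right_tail(t, df, steps=2000):
    """P(T > t) for T ~ T(df): Simpson's rule on [0, |t|] plus symmetry.

    steps must be even (requirement of Simpson's rule).
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3   # P(0 < T < |t|)
    tail = 0.5 - area      # P(T > |t|)
    return tail if t >= 0 else 1 - tail


# Example 9.4: n = 7 leatherbacks, xbar = 1045, s = 67, H1: mu > 1000
n, xbar, s, mu0 = 7, 1045, 67, 1000
t0 = (xbar - mu0) / (s / sqrt(n))
p_value = t_right_tail(t0, n - 1)
print(f"t0 = {t0:.2f}, p-value = {p_value:.3f}")
```

The computed p-value should fall inside the interval (0.05, 0.10) found from Table 18.4. By the symmetry of the $T(\nu)$ density, a left-tail probability $P(T < t_0)$ can be obtained as `t_right_tail(-t0, df)`.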
Case (2). $H_0: \mu = \mu_0$ versus $H_1: \mu < \mu_0$

The test is also based on the calculation of the same ratio

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}.$$

A negative value of this ratio is an indication that $H_1$ might be true. To see if this ratio is far from 0 (compared with other values encountered from different samples), we consider the p-value of the left-tailed test:

$$p\text{-value} = P(T < t_0),$$

where $T$ is a random variable with a $T(n-1)$ distribution. If the p-value is small, we reject $H_0$; otherwise, we fail to reject $H_0$.

Example 9.5. More than 20% of the world's oxygen is produced in the Amazon rainforest. The giant kapok tree (Ceiba pentandra) is the tallest tree in the Amazon rainforest, with a height of up to 200 feet and a trunk diameter of 9 or 10 feet. This tree is host to numerous aerial plants, insects and birds. The average growth rate of the giant kapok tree is 10 feet per year. Researchers fear that in the past years, the growth rate of this tree has slowed down, due to climate change and deforestation. We want to test this claim, assuming that the growth rate $X$ of a randomly chosen tree has a normal distribution with unknown mean $\mu$.

A sample of 15 giant kapok trees yielded an average annual growth $\bar{x} = 8.5$ feet, and a sample standard deviation $s = 2.1$ feet. The hypotheses to be confronted are:

$$H_0: \mu = 10 \quad \text{versus} \quad H_1: \mu < 10.$$

A type I error occurs when we decide that the annual growth rate has decreased, when in fact it has not. A type II error occurs when we fail to conclude that the annual growth rate has decreased, when in fact it has.

To perform the test, we first observe that $\bar{x} = 8.5 < \mu_0 = 10$. Hence, we can proceed in the direction of $H_1$. For this, we calculate the ratio:

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{8.5 - 10}{2.1/\sqrt{15}} = -2.77.$$

In this case,

$$p\text{-value} = P(T < -2.77) = P(T > 2.77),$$

where $T$ is a random variable with a $T(14)$ distribution.
Note that for the second equality above, we used the symmetry of the $T(14)$ distribution. Looking on row $\nu = 14$ of Table 18.4, we see that 2.77 lies between the
values 2.624 and 2.977, whose corresponding areas to the right are 0.01 and 0.005, respectively. Hence,

$$0.005 < p\text{-value} < 0.01.$$

Using statistical software, we see that p-value = 0.008. Since the p-value is smaller than $\alpha = 0.01$, we reject $H_0$ and conclude that the annual growth rate of the kapok tree has slowed down.

Case (3). $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$

As in Case (3) of Section 9.1, we calculate the absolute value of $t_0$. The p-value formula uses both tails of the $T(n-1)$ distribution and the symmetry of this distribution (which explains the factor 2 in the formula below). More precisely, the p-value of the two-tailed test is:

$$p\text{-value} = 2P(T > |t_0|).$$

As in the previous two cases, we have to obtain a small p-value in order to reject $H_0$ in favor of $H_1$.

Example 9.6. Measurements of blood viscosity were made on laboratory mice. A normal value should be close to 3.95. Researchers who are testing a new drug suspect that it could have modified the blood viscosity level of the mice, but they do not know the direction of this change. Levels which are either too small or too large are not acceptable. We want to see if there is enough evidence that the average level of viscosity has deviated significantly from the value 3.95, due to the new drug. We are interested in testing:

$$H_0: \mu = 3.95 \quad \text{versus} \quad H_1: \mu \neq 3.95,$$

where $\mu$ is the average viscosity level for the mice which were treated with the new drug. We assume that the blood viscosity levels are normally distributed.

A type I error occurs when we decide that the viscosity level is affected by the new drug, when in fact it is not. A type II error occurs when we fail to gain evidence that the drug affects the viscosity level, when in fact it does.

A sample of 9 mice yields a sample average viscosity level $\bar{x} = 5.25$ and a sample standard deviation $s = 0.6$. We calculate the ratio:

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{5.25 - 3.95}{0.6/\sqrt{9}} = 6.50,$$
and then,

$$p\text{-value} = 2P(T > 6.50),$$

where $T$ is a random variable with a $T(8)$ distribution. Looking on row $\nu = 9 - 1 = 8$ of Table 18.4, we see that 6.50 is larger than the last value listed in the table, namely 3.355 (whose corresponding area to the right is 0.005). Hence, $P(T > 6.5) < 0.005$ and

$$p\text{-value} < 2(0.005) = 0.01.$$

Using statistical software, we see that p-value = 0.0002. Since the p-value is very small, we reject $H_0$ and conclude that the new drug affects the blood viscosity level.

Technology Component using R: Suppose that the data is saved in the variable x. To test $H_0: \mu = 3$ against $H_1: \mu \neq 3$, we use:

    t.test(x,mu=3)

If we omit "mu=3", the function tests $H_0: \mu = 0$ by default. To test $H_0: \mu = 3$ against $H_1: \mu > 3$, we use:

    t.test(x,mu=3,alternative="greater")

The output will also include a "one-sided" confidence interval which has an upper bound equal to $\infty$. This type of interval is not discussed in the book. To test $H_0: \mu = 3$ against $H_1: \mu < 3$, we use:

    t.test(x,mu=3,alternative="less")

The output will also include a "one-sided" confidence interval which has a lower bound equal to $-\infty$. This type of interval is not discussed in the book.

9.3 Hypothesis Testing for the Proportion

In this section, we are interested in confronting two hypotheses about the value of the proportion $p$ of individuals who share a common characteristic in a given population. Recall from Section 7.2 that an estimate for $p$ is:

$$\hat{p} = \frac{y}{n}$$
where $y$ denotes the number of individuals with the desired characteristic, in a sample of size $n$.

Example 9.7. An article in the National Geographic magazine (April 2009) draws attention to a form of fungal infection, chytridiomycosis (chytrid for short), which is wiping out amphibians on all continents where frogs live. The Amphibian Ark is an international project aimed at keeping at least 500 amphibian species in captivity for reintroduction when the crisis is resolved. In the wild, an infection rate higher than 90% is critical for the survival of a species. Researchers suspect that this rate has already been attained for the mountain yellow-legged frogs of the Sixty Lake Basin in California's Sierra Nevada. In a sample of 85 frogs, 77 tested positive for the chytrid fungus. We want to test the hypotheses:

$$H_0: p = 0.90 \quad \text{versus} \quad H_1: p > 0.90,$$

where $p$ is the proportion of mountain yellow-legged frogs in the Sixty Lake Basin which are infected by chytrid. An estimate for $p$ is:

$$\hat{p} = \frac{77}{85} = 0.906, \text{ or } 90.6\%.$$

A type I error occurs when we decide that the infection rate exceeds the critical rate of 90%, when in fact it does not. A type II error occurs when we fail to show that the infection rate exceeds the critical rate of 90%, when in fact it does.

Example 9.8. Topiramate (commonly known as topamax in Canada and the United States) was approved for use as a treatment for epilepsy in 1995. In 2004, the American Food and Drug Administration approved the drug for use in treating migraines. Side effects of topiramate treatment include fatigue, nausea and confusion. We want to gain evidence for the fact that these side effects appear in less than 6% of the population. In a group of 150 patients treated with topiramate, only 6 complained about side effects. We would like to test the hypotheses:

$$H_0: p = 0.06 \quad \text{versus} \quad H_1: p < 0.06,$$

where $p$ is the (unknown) proportion of people who experience side effects, among those who are using topiramate. An estimate for $p$ is:

$$\hat{p} = \frac{6}{150} = 0.04, \text{ or } 4\%.$$

A type I error occurs when we decide that the percentage of people who experience side effects is lower than 6%, when in fact it is not. A type II error occurs when we fail to show that the percentage of people who experience side effects is lower than 6%, when in fact it is.
Example 9.9. Conventional detergents are based on petrochemicals, which are derived from rapidly depleting non-renewable resources, and whose residues are poorly biodegradable and build up in the environment. By contrast, ecological detergents are biodegradable, being produced using ingredients of renewable origin. The effectiveness of ecological detergents in removing oil stains is thought to be around 80%. A new ecological detergent was used in a sample of 500 laundry loads containing oil-stained items. Of these, 435 loads resulted in the complete removal of oil stains. Based on this sample, we want to test the hypotheses:

$$H_0: p = 0.80 \quad \text{versus} \quad H_1: p \neq 0.80,$$

where $p$ is the effectiveness of the ecological detergent in removing oil stains. An estimate for $p$ is:

$$\hat{p} = \frac{435}{500} = 0.87, \text{ or } 87\%.$$

A type I error occurs when we decide that the effectiveness of the new detergent is different from 80%, when in fact it is not. A type II error occurs when we fail to show that the effectiveness of the new detergent is different from 80%, when in fact it is.

We consider the following three cases:

Case (1). $H_0: p = p_0$ versus $H_1: p > p_0$

In this case, we want to show that the unknown proportion $p$ is higher than a fixed numerical value $p_0$. To move in this direction, first we have to make sure that the estimate $\hat{p}$ is larger than $p_0$. A large difference between $\hat{p}$ and $p_0$ is a good sign in favor of $H_1$. The next question is: how large should the difference $\hat{p} - p_0$ be, to comfortably reject $H_0$? To answer this question, we use the fact that, if $H_0$ is true, then

$$Z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} \quad \text{has approximately an } N(0, 1) \text{ distribution},$$

if $n$ is large. Our decision is based on the observed value of the test statistic:

$$z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}.$$

A large value of this ratio is an indication that $H_1$ might be true.
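The claim that $Z_0$ is approximately $N(0, 1)$ when $H_0$ is true can be checked informally by simulation. The sketch below is only an illustration: it borrows $n = 500$ and $p_0 = 0.80$ from Example 9.9, while the seed, the number of simulated samples (10,000) and all variable names are arbitrary choices that do not come from the book.

```r
# Simulation sketch: distribution of the ratio Z0 when H0 is true.
# n and p0 are taken from Example 9.9; seed and n_sim are arbitrary assumptions.
set.seed(1)
n     <- 500
p0    <- 0.80
n_sim <- 10000

# Each simulated y is the number of "successes" in a sample of size n drawn under H0.
y <- rbinom(n_sim, size = n, prob = p0)

# The ratio from the text, computed for every simulated sample.
z <- (y / n - p0) / sqrt(p0 * (1 - p0) / n)

# For a standard normal variable: mean 0, standard deviation 1, P(Z > 1.645) = 0.05.
z_mean    <- mean(z)
z_sd      <- sd(z)
tail_freq <- mean(z > 1.645)
print(c(mean = z_mean, sd = z_sd, tail_freq = tail_freq))
```

If the normal approximation is reasonable, the printed mean should be close to 0, the standard deviation close to 1, and the tail frequency close to 0.05. Small discrepancies are expected, because $y$ takes only integer values.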
The idea is to compare this ratio against all other possible values (which may arise from different samples), by means of the $p$-value. The $p$-value of the right-tailed test for $p$ is:

$$p\text{-value} = P(Z > z_0).$$

The smaller the $p$-value, the stronger the evidence against $H_0$. We reject $H_0$ (and gain evidence for $H_1$) if the $p$-value is very small.

Example 9.7 (continued). In this case, $\hat{p} = 0.906$, $n = 85$ and $p_0 = 0.90$. We calculate the observed value of the test statistic:

$$z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{0.906 - 0.90}{\sqrt{(0.90)(0.10)/85}} = 0.18.$$

The $p$-value is:

$$p\text{-value} = P(Z > 0.18) = 1 - 0.5714 = 0.4286.$$

Since the $p$-value is large, we cannot reject $H_0$. We conclude that there is not enough evidence that the infection rate is higher than 90%.

Case (2). $H_0: p = p_0$ versus $H_1: p < p_0$

To move in the direction of $H_1$, the estimate $\hat{p}$ has to be smaller than $p_0$. The testing procedure is based on the same ratio $z_0$ as in Case (1). In this case, the ratio is negative. The $p$-value of the left-tailed test is:

$$p\text{-value} = P(Z < z_0).$$

We reject $H_0$ if the $p$-value is smaller than the significance level $\alpha$.

Example 9.8 (continued). Using $p_0 = 0.06$, $n = 150$, and $\hat{p} = 0.04$, we calculate the observed value of the test statistic:

$$z_0 = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{0.04 - 0.06}{\sqrt{(0.06)(0.94)/150}} = -1.03.$$

Using Table 18.2, we obtain:

$$p\text{-value} = P(Z < -1.03) = 0.1515.$$

Since the $p$-value is large, we cannot reject $H_0$ in favor of $H_1$. The conclusion is that this sample does not contain enough evidence that the proportion of people suffering side effects from topamax is smaller than 0.06.
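The calculations in Examples 9.7 and 9.8 can be reproduced in R. The following is a minimal sketch, not a method given in the book: prop_z_test is a hypothetical helper name introduced here for illustration, and it relies only on the base R function pnorm for the standard normal distribution.

```r
# Sketch of the one-sided large-sample test for a proportion (Cases 1 and 2).
# prop_z_test is a hypothetical helper, not a function from the book or from base R.
prop_z_test <- function(y, n, p0, alternative = c("greater", "less")) {
  alternative <- match.arg(alternative)
  p_hat <- y / n                                   # estimate of p
  z0 <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)     # observed test statistic
  p_value <- if (alternative == "greater") {
    pnorm(z0, lower.tail = FALSE)                  # right-tailed: P(Z > z0)
  } else {
    pnorm(z0)                                      # left-tailed: P(Z < z0)
  }
  list(p_hat = p_hat, z0 = z0, p_value = p_value)
}

# Example 9.7: 77 infected frogs out of 85, H1: p > 0.90
frogs <- prop_z_test(77, 85, p0 = 0.90, alternative = "greater")

# Example 9.8: 6 patients with side effects out of 150, H1: p < 0.06
topiramate <- prop_z_test(6, 150, p0 = 0.06, alternative = "less")

print(frogs)
print(topiramate)
```

Because the code works with the unrounded values of $\hat{p}$ and $z_0$, the results may differ from the table-based values above in the third decimal place. Base R also provides prop.test; calling it with correct=FALSE and a one-sided alternative should give the same $p$-value, although it reports the squared statistic.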
Case (3). $H_0: p = p_0$ versus $H_1: p \neq p_0$

In this case, we calculate the absolute value of the ratio $z_0$. The $p$-value of the two-tailed test for $p$ is:

$$p\text{-value} = 2P(Z > |z_0|).$$

As in the previous two cases, we reject $H_0$ in favor of $H_1$ if the $p$-value is smaller than the significance level $\alpha$.

Example 9.9 (continued). In this case, $p_0 = 0.80$, $\hat{p} = 0.87$ and $n = 500$. We calculate the absolute value:

$$|z_0| = \frac{|\hat{p} - p_0|}{\sqrt{p_0(1 - p_0)/n}} = \frac{|0.87 - 0.80|}{\sqrt{(0.80)(0.20)/500}} = 3.91.$$

Using Table 18.3, we see that:

$$p\text{-value} = 2P(Z > 3.91) < 2(0.0001) = 0.0002.$$

(We used the fact that the value 3.91 is larger than the largest value given in Table 18.3, namely 3.89, whose area to the right is 0.0001.) Since the $p$-value is very small, we reject $H_0$ and conclude that the effectiveness of the new ecological detergent is different from 0.80.

9.4 Problems

Problem 9.1. The Greenland ice sheet covers roughly 80% of the surface of Greenland, being the second largest body of ice in the world, after the Antarctic ice sheet. As the arctic climate is rapidly warming, the Greenland ice sheet has experienced record melting in recent years. The following data give the depth of the ice sheet (in m) measured at various locations during the summer months in the Northeast Greenland National Park:

    3115 3133 3123 3145 3125 3131 3127 3120 3118 3124

Using these data, is there enough evidence that the average depth of the ice sheet is below 3140 m? Assume that the depth of the ice sheet is normally distributed.

Problem 9.2. Ciprofloxacin is an antibiotic used for the treatment of urinary tract infections (UTI). It is estimated that 10% of the patients treated with ciprofloxacin will have a recurrent UTI within three weeks after treatment. A new drug has been developed for the treatment of UTI.
In a group of 347 patients who were treated with this drug, only 29 had a recurrent UTI within three weeks after treatment. Can we conclude that the new drug is effective in reducing the proportion of patients with recurrent UTI below 10%? Use a test of hypotheses of level $\alpha = 0.05$.

Problem 9.3. The article [7] studies the acquisition of rainfall data in the Guinea Savanna part of Nigeria. One of the major data acquisition problems in Sub-Saharan Africa is instrumental error, which is associated with the functioning of the instruments. An error encountered frequently with rain gauges (instruments used by hydrologists) occurs during the siphoning cycle, when rain continues to enter the rain gauge. In a sample of 64 observations, it was found that the mean measurement error was $\bar{x} = 0.28$ mm with a standard deviation $s = 0.5$ mm. Is there enough evidence that the average measurement error $\mu$ exceeds the threshold of 0.25 mm? Use level $\alpha = 0.10$.

Problem 9.4. Studies were conducted in a metropolitan area to determine the concentration of carbon monoxide near a large highway. There are concerns that the average concentration of carbon monoxide exceeds 100 parts per million (ppm) at this location. The researchers captured air in bags and used a spectrophotometer to determine the concentration of carbon monoxide. Below are the results for 25 randomly chosen times over a period of 6 months:

    100.1 101.9 101.3 102.1  98.3
    100.3 100.2 109.6  98.5  92.0
    103.7 108.5 104.9 109.8  95.3
     93.1 107.0  92.1 109.2  93.2
     93.1 107.3  97.1 104.4 102.3

For this sample, the mean and standard deviation are 101.012 ppm and 5.7644 ppm, respectively.

(a) Formulate the null and alternative hypotheses to test that the average concentration of carbon monoxide exceeds 100 ppm.
(b) Interpret the type I error in the context of this question.
(c) Interpret the type II error in the context of this question.
(d) Use statistical software to verify that the concentration of carbon monoxide is normally distributed.
(e) Compute the $p$-value of the test of the hypotheses formulated in part (a). Give your conclusion at the significance level $\alpha = 0.10$.
(f) If your conclusion in part (e) is wrong, did you commit a type I error or a type II error?
Problem 9.5. Refer to Problem 8.11. Using these data, is there enough evidence to conclude that the mean yield has increased, compared to the average yield of 2.7 kg per year?

Problem 9.6. Percutaneous nephrolithotomy (PN) is a surgical procedure used for removing kidney stones by a small puncture incision through the skin. The authors of [13] studied the efficiency of PN in removing kidney stones. The treatment was defined as successful if stones were eliminated or reduced to less than 2 mm after three months. PN was successful for 289 out of 350 patients who underwent the treatment. The traditional open surgery has a success rate of 78%.

(a) Give a point estimate for the success rate of PN in treating kidney stones and give the corresponding estimated standard error.
(b) Is there significant evidence that the success rate of PN in treating kidney stones is different from the success rate of open surgery? Formulate a null hypothesis and an alternative hypothesis and find the corresponding $p$-value. Use $\alpha = 0.01$.
(c) Use a 95% confidence interval for comparing the success rate of PN in treating kidney stones with the success rate of open surgery.

Problem 9.7. Refer to Problem 4.7. This medication was given to 20 patients and 17 reported a significant reduction in their pain. Using these data, is there enough evidence to conclude that the use of this medication is better than not using any medication for reducing pain? Hint: Consider testing $p = 0.5$ against $p > 0.5$, where $p$ is the proportion of patients for whom the pain subsides, among those using the medication. Use the binomial distribution to compute the $p$-value, since the sample size is small.

Problem 9.8. Consider the following R output. It is based on a sample of size $n = 125$:

            One Sample t-test

    data:  x
    t = 1.5001, df = 124, p-value = ?
    alternative hypothesis: true mean is greater than 100
    95 percent confidence interval:
     99.96231      Inf
    sample estimates:
    mean of x
     100.3598

(a) Give a point estimate for the population mean $\mu$ and give the corresponding estimated standard error.
(b) Is this a one-sided or a two-sided test? If it is one-sided, is the alternative right-sided or left-sided?
(c) Since the sample size is very large, we can approximate the $T(124)$ distribution with the standard normal distribution. Find the $p$-value.
(d) Can we reject $H_0$ in favor of $H_1$ at a level of significance $\alpha = 0.05$?
(e) What is the $p$-value if the alternative hypothesis is $H_1: \mu \neq 100$?
(f) Can we reject $H_0$ in favor of $H_1: \mu > 100$ at a level of significance $\alpha = 0.10$?
(g) Use the values from part (a) and Table 17.3 to compute a 97% confidence interval for the population mean $\mu$.

Problem 9.9. For many years a farmer has not kiln-dried his barley seeds before sowing. (To kiln-dry means to dry in an insulated chamber where airflow, temperature and humidity are controlled.) The non-kiln-dried seeds yield on average 672 kg of barley per 4000 m$^2$. This year the farmer decides to kiln-dry his barley seeds before sowing. Ten varieties of kiln-dried barley seeds are sown. The yields (in kg per 4000 m$^2$) are below.

    652.3 706.1 679.9 630.9 664.0 647.5 697.6 686.8 722.6 655.0

(a) Using these data, is there enough evidence to conclude that the mean yield has increased? (First verify that the yields are normally distributed.)
(b) Construct a 95% confidence interval for the mean yield of kiln-dried barley.

Problem 9.10. A cigarette manufacturer claims that the average tar content $\mu$ in his brand of cigarettes is 14 mg. Assume that the tar content in one cigarette has a normal distribution. A medical association is concerned that the tar content of these cigarettes may exceed 14 mg. A sample of 5 randomly selected cigarettes produced by this manufacturer has mean $\bar{x} = 14.4$ mg and variance $s^2 = 0.025$ mg$^2$. Is there enough evidence to justify the concern of the medical association?
Formulate an appropriate test of hypotheses, give the range of the $p$-value of the test, and state your conclusion at the level $\alpha = 0.05$.

Problem 9.11. Refer to Problem 8.15. Is there enough evidence to conclude that the new drug has a larger rate of effectiveness compared to another drug, which is effective in 66% of the cases? Use a test of hypotheses at the significance level $\alpha = 0.10$.

Problem 9.12. The Canadian federal government is trying to assess whether an anti-smoking campaign has reduced the proportion of teenagers under 16 who smoke. Before the campaign began, 1/3 of teenagers were smokers. After the campaign, a national survey of 200 teenagers revealed that there are 50 smokers in this group. On the basis of this information, would you conclude that the campaign was effective? Formulate an appropriate test of hypotheses, give the $p$-value of the test, and state your conclusion at level $\alpha = 0.05$.

Did you know? William Gosset was a student of both chemistry and mathematics. He also worked and studied during the period of 1906-1907 in the biometrics laboratory of Karl Pearson at University College London. The Student distribution was discovered by Gosset while working as a brewer and scientist at Guinness in the early 20th century. Guinness prohibited its employees from publishing, so Gosset used the pseudonym Student. While performing quality control for Guinness, Gosset saw the need for developing statistical methods for small samples, i.e. methods that do not rely on asymptotic results such as the Central Limit Theorem. He was able to guess the form of the density of the studentization of the sample mean

$$\frac{\bar{X} - \mu}{S/\sqrt{n}},$$

under the assumption that the population is normal. He used mathematical arguments and empirical work (experiments) to construct the $T$-distribution. Gosset's results were confirmed later by Ronald Fisher. Fisher appreciated the importance of Gosset's small-sample work, which inspired much of his own work.