zyBooks- Chapter 1
School: University of Phoenix
Course: 218
Subject: Mathematics
Date: Jun 7, 2024
Pages: 78
Uploaded by BaronInternetMandrill43
1.1 Example: Paired designs
Learning goals
Identify a study design as having paired or independent groups.
Identify a study design as paired using repeated measures or paired using
matching.
Does listening to music with lyrics interfere with studying?
Many students like to study while listening to music. What does research say about how well students can
focus while listening to music with lyrics? Researchers plan to design an experiment with 27 students with two
treatments: listening to music with lyrics and listening to music without lyrics. The researchers will then
compare students' performance on a memorization game: Students will study a list of 25 common five-letter
words for 90 seconds and then write down as many of the words as they can remember.
PARTICIPATION
ACTIVITY
1.1.1: Possible experimental design.
[Figure: 27 students are randomly assigned to two groups, With Lyrics and Without Lyrics.]
©zyBooks 06/05/24 21:24 2090300
Melinie Weaver
MTH_218_58079156
6/5/24, 9:24 PM
zyBooks
https://learn.zybooks.com/zybook/MTH_218_58079156/chapter/1/print
PARTICIPATION
ACTIVITY
1.1.2: Music study design.
1)
The experimental units are _____.
the students
the lyrics
the music
2)
The explanatory variable is _____, which
is a _____ variable.
the number of words memorized;
quantitative
whether the student listened to
music with lyrics; categorical
Animation content:
Static figure: A possible study design to measure the effect of listening to music with lyrics on memorization.
Step 1: Suppose 14 students are randomly assigned to listen to music with lyrics and 13 students are randomly assigned to listen to music with no lyrics.
Step 2: Each of the 27 individuals provides exactly one response value, which is the number of words memorized.
Data
Student | Music Type | # Words
1 | With lyrics | 5
... | ... | ...
14 | With lyrics | 10
15 | No lyrics | 5
... | ... | ...
27 | No lyrics | 10
3)
The response variable is _____, which is
a _____ variable.
whether the student listened to
music with lyrics; categorical
the number of words memorized;
quantitative
4)
For the music study, what is the
purpose of the random assignment?
To generalize the results from
these 27 students to a larger
population of students.
To create similar treatment
groups.
To generate a null distribution.
Another study design
The above study design can be referred to as an independent groups design. In an independent groups design,
the data recorded for one group are unrelated to the data recorded in the other group(s). The goal of random
assignment is to create groups that are similar on all possible confounding variables. However, sometimes the
created groups could be unequal on one or more variables just by chance. Ex: Because memorization skill varies
substantially between individuals, random assignment may still produce groups that are not balanced with
respect to that variable. To guard against such unevenness, other study designs are possible.
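As a minimal sketch of the random assignment behind an independent groups design, the Python below shuffles 27 student IDs and splits them into groups of 14 and 13, the group sizes used in the music study example. The function name `assign_independent_groups`, the ID numbering, and the seed are illustrative assumptions, not part of the study.

```python
import random


def assign_independent_groups(unit_ids, n_first_group, seed=None):
    """Randomly split experimental units into two independent groups."""
    rng = random.Random(seed)  # the seed only makes this sketch reproducible
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:n_first_group]), sorted(shuffled[n_first_group:])


# Hypothetical student IDs 1..27: 14 hear music with lyrics, 13 without.
with_lyrics, without_lyrics = assign_independent_groups(range(1, 28), 14, seed=1)
print(len(with_lyrics), len(without_lyrics))
```

Each student lands in exactly one group, so each contributes a single response value. Nothing in the shuffle forces the two groups to be balanced on memorization skill, which is the weakness a paired design addresses.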
PARTICIPATION
ACTIVITY
1.1.3: Paired study design.
[Figure: An independent groups design, with separate With Lyrics and Without Lyrics groups, compared with a paired study design in which each student plays the memorization game both With Lyrics and Without Lyrics and the first treatment is randomly assigned.]
Animation content:
Static figure: A paired study design is explored to minimize the chance of unequal groups on the observed difference in effect.
Step 1: In an independent groups design for the music study, more of the people with better memorization skills could end up in the without lyrics group by random chance.
Step 2: To account for the variability in scores due to prior differences in memorization skills between people, a paired study design instead asks each student in the study to play the memorization game twice, once with lyrics and once without.
Step 3: Randomization is used to decide which treatment the students participate in first.
Step 4: The resulting dataset has two measurements of the number of words remembered for each person.
Paired study design data
Student | First Music Type | # Words | Second Music Type | # Words
1 | With lyrics | 5 | No lyrics | 6
... | ... | ... | ... | ...
14 | With lyrics | 10 | No lyrics | 7
15 | No lyrics | 5 | With lyrics | 8
... | ... | ... | ... | ...
27 | No lyrics | 10 | With lyrics | 11
PARTICIPATION
ACTIVITY
1.1.4: Paired design in the music study.
1)
In a paired study design, randomness is
present in _____.
deciding which treatment is given
to each experimental unit
deciding which experimental
units are paired
deciding the orders of the
treatments
2)
Using a paired study design, any
difference in words memorized by a
particular student will be mainly due to
_____.
randomness
the explanatory variable
confounding variables
3)
In using the paired study design, the
Lyrics and Without Lyrics groups are ___
to be balanced on general
memorization skill.
likely
guaranteed
not likely
Other paired designs
A paired design using repeated measurements is a study in which each experimental unit is measured twice
under different conditions where the order of the conditions is randomly assigned. Randomizing the order of
the treatments is important to guard against any changes over time. Ex: Both familiarity with the memorization
game and fatigue can change over time and could affect the number of words memorized. But changes over
time are accounted for in the music study by randomizing the order of the treatments.
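One way to picture the order randomization in a repeated-measures design is a coin flip per student, sketched below. The function name `randomize_treatment_order`, the treatment labels, and the seed are assumptions made for illustration.

```python
import random

TREATMENTS = ("With lyrics", "No lyrics")


def randomize_treatment_order(unit_ids, seed=None):
    """Flip a fair coin for each unit to decide which treatment comes first."""
    rng = random.Random(seed)
    orders = {}
    for unit in unit_ids:
        order = list(TREATMENTS)
        if rng.random() < 0.5:  # "heads": reverse the default order
            order.reverse()
        orders[unit] = tuple(order)
    return orders


# Hypothetical student IDs 1..27; every student receives both treatments.
orders = randomize_treatment_order(range(1, 28), seed=2)
for student in (1, 2, 3):
    print(student, orders[student])
```

Independent coin flips need not produce an even split of orders. If a fixed split is wanted, such as 14 students starting with lyrics and 13 without, a shuffled list of orders could be used instead.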
Having each subject experience both treatments is not always an option, and other paired designs are
possible. A paired design using matching creates pairs by matching up two very similar experimental units
and then randomly assigning one to each treatment.
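A matched-pairs assignment can be sketched as sorting the units on the matching variable, pairing neighbors, and randomizing the treatment within each pair. The skill scores, unit labels, and the function name `assign_matched_pairs` below are hypothetical, and pairing adjacent ranks is only one simple matching rule.

```python
import random


def assign_matched_pairs(skill_by_unit, seed=None):
    """Pair units with adjacent skill ranks, then randomize treatment within each pair."""
    rng = random.Random(seed)
    ranked = sorted(skill_by_unit, key=skill_by_unit.get)
    assignments = []
    # Step through the ranking two at a time; with an odd count the last unit stays unpaired.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # the "coin flip" for this pair
        assignments.append({"With lyrics": pair[0], "Without lyrics": pair[1]})
    return assignments


# Hypothetical pre-study memorization scores, for illustration only.
skills = {"A": 12, "B": 7, "C": 11, "D": 8, "E": 15, "F": 16}
pairs = assign_matched_pairs(skills, seed=3)
for pair in pairs:
    print(pair)
```

Here randomness decides only which member of a pair gets which treatment. The pairs themselves are fixed by the skill scores.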
PARTICIPATION
ACTIVITY
1.1.5: Paired design using matching.
PARTICIPATION
ACTIVITY
1.1.6: Matched paired design in the music study.
[Figure: Students are ordered by memorization skill, paired by memorization skill, and assigned a treatment (With Lyrics or Without Lyrics) within each pair.]
Animation content:
Static figure: A paired study design using matching is explored to minimize the chance of unequal groups on the observed difference in effect.
Step 1: Suppose the researchers can record general memorization skill for each student before the start of the study.
Step 2: Students with similar memorization skill can be paired together.
Step 3: A coin can be flipped for each pair to decide who gets which treatment.
The resulting dataset has one measurement of the number of words remembered for each person.
1)
In the above paired design using
matching, randomness is present in
_____.
deciding which treatment each
experimental unit receives
deciding which experimental
units are paired
deciding the orders of the
treatments
2)
In the above paired design using
matching, the Lyrics and Without Lyrics
groups are _____ to be balanced on
general memorization skill.
likely
not likely
Can observational studies use pairing?
The main goal of a paired design is to create treatment groups that are even more likely to be similar to each
other, at least on "important" potential confounding variables. Ex: General memorization ability is clearly related
to the response variable for the music study, and the study would be much less informative if random
assignment happened to put more of the better memorizers into one of the treatment groups than another. A
paired design using matching guards against such unfortunate events and helps the researchers better
compare the explanatory variable conditions of interest. Less important variables are hopefully still balanced
by the random assignment process. Ex: Age is not as relevant as general memorization skills to the number of
words memorized, but the random assignment process will still help account for differences between older
and younger participants in number of words.
Paired designs are not limited to experiments, as pairing can also be part of the design of an observational
study. Randomization is no longer used within each pair, but pairing will still "control" for an important source
of variation.
PARTICIPATION
ACTIVITY
1.1.7: Advantages of paired designs.
Animation content:
Static figure: An independent groups design is compared to a paired design for two possible research questions.
Step 1: For the research question, "In heterosexual couples, do married men tend to be older than married women?", how would an independent groups design differ from a paired design using matching?
Independent groups design: Record the ages of a random sample of married men and an independent random sample of married women.
Paired design: Record the ages for both the wife and the husband of a random sample of married couples.
Step 2: Why is the pairing likely to be helpful? The differences will have less variability than the ages.
Advantage of paired design: The difference in ages in the same couple will tend to be smaller than differences in ages among married men and married women.
Step 3: For the research question, "Does Store A tend to have cheaper prices than Store B?", how would an independent groups design differ from a paired design using matching?
Independent groups design: Record the prices of a random sample of products from store A and an independent sample of products from store B.
Paired design: Record the prices at both store A and store B for a random sample of products common to both stores.
Step 4: What source of variation does the pairing account for? The variation in the differences in prices of the same product should be smaller than the variation in grocery store products.
Advantage of paired design: Pairing accounts for the wide variation in prices of grocery store products.
PARTICIPATION
ACTIVITY
1.1.8: Identifying designs as paired or independent.
For the proposed studies below, identify the study plan as paired or independent groups.
1)
A dietician evaluates the effectiveness
of a weight loss program by looking at
the difference between pre- and post-
diet weights of the same individuals.
Such an observational study would be
_____.
a paired design using matching
a paired design using repeated
measures
an independent groups design
2)
A school lunch program provides one
meat entree and one similar vegetarian
entree each day. Ex: When beef
hamburgers are served, a vegetarian
hamburger is also served. A dietician
wants to compare the number of
calories between these two types of
entrees. Such an observational study
would be _____.
a paired design using matching
a paired design using repeated
measures
an independent groups design
3)
A teacher comparing scores of two
classes taking the same test where one
class used a traditional print textbook
and another used an online interactive
textbook is an example of an
experiment with _____.
a paired design using matching
a paired design using repeated
measures
an independent groups design
CHALLENGE
ACTIVITY
1.1.1: Paired designs.
1.2 Example: Simulation-based approach for
analyzing paired data
Learning goals
Understand the difference between independent samples and paired samples in
terms of the study design, how variability can be lower in a paired design, and how
this can influence the strength of evidence.
Complete a simulation-based test of significance of a paired design by writing out
the hypothesis, determining the observed statistic, computing the p-value, and
writing out an appropriate conclusion.
Does the path taken to "round" a base make a difference?
In baseball, after the batter hits the ball, the batter attempts to run to first base (single), second base (double),
or third base (triple) as fast as possible. Suppose a baseball player is trying to stretch a single to a double.
Does the strategy used to "round" first base make a difference? A graduate student (Woodward, 1970) decided
to compare times for 22 runners to run from home plate to second base using a narrow angle and a wide
angle.
Step 1: Do the running times using the narrow-angle strategy tend to differ from the running times using
the wide-angle strategy?
PARTICIPATION
ACTIVITY
1.2.1: Rounding bases study design and data collection.
[Figure: The narrow-angle and wide-angle paths for rounding first base.]
Animation content:
Static figure: The study design and data of the rounding bases study.
Step 1: Woodward wanted to determine whether rounding first base using a narrow-angle strategy or a wide-angle strategy makes a difference in the time to travel from home to second base.
The narrow-angle path stays the same distance from the baseball diamond throughout, while the wide-angle path is far from the diamond until first base and then close to it between first base and second base.
Step 2: Each runner ran twice, once by taking a wide-angle strategy and once by taking a narrow-angle strategy, with a rest period in between. Both times are recorded in seconds.
Step 3: The order of the strategies was randomized for each runner.
Some runners ran the narrow angle before the wide angle and others ran the wide angle first.
Step 4: For each of the 22 runners, times are recorded for both the narrow-angle and wide-angle strategies.
Rounding bases study data
Runner | Narrow angle (seconds) | Wide angle (seconds)
1 | 5.50 | 5.55
2 | 5.70 | 5.75
3 | 5.60 | 5.50
4 | 5.50 | 5.40
5 | 5.85 | 5.70
6 | 5.55 | 5.60
7 | 5.40 | 5.35
8 | 5.50 | 5.35
9 | 5.15 | 5.00
10 | 5.80 | 5.70
... | ... | ...
PARTICIPATION
ACTIVITY
1.2.2: Rounding bases study design.
1)
The observational units are the _____.
bases
runners
whether the angle is narrow or
wide
2)
The explanatory variable is _____, which
is a _____ variable.
time to round first base;
quantitative
time to round first base;
categorical
whether the angle strategy is
narrow or wide; categorical
whether the angle strategy is
narrow or wide; quantitative
3)
The response variable is _____, which is
a _____ variable.
the time to round first base;
quantitative
time to round first base;
categorical
whether the angle is narrow or
wide; categorical
whether the angle strategy is
narrow or wide; quantitative
4)
The design of the rounding bases study
follows _____.
a paired design using matching
a paired design using repeated
measures
an independent groups design
5)
The paired study design aims to
account for _____.
runner-to-runner variability
variability between the base
running angles
fatigue as a confounding variable
6)
Randomizing which angle runners used
first and the rest period control for
_____.
runner-to-runner variability
sample-to-sample variability
fatigue as a confounding variable
Base-running times as matched pairs
A paired design allows a more direct comparison of the effects of the treatment conditions: the two base-
running strategies. To analyze the data, rather than using the run times from each strategy, the analysis can
use the differences (\(d\)) in the two running times for each runner:
\[d = (\text{narrow angle time}) - (\text{wide angle time}).\]
The time difference \(d\) is a single quantitative response variable, which means that the usual one-sample
methods can be applied. In the context of a single sample of differences, the sample statistics can be written
using the subscript "d":
\(\overline{x}_d\) : mean of the differences in run time between the narrow-angle and wide-angle
strategies
\(s_d\) : standard deviation of the differences in run time between the narrow-angle and wide-angle
strategies.
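The definitions above can be sketched in Python as follows. The run times are transcribed from the rounding bases data in this section. The final runner's pair is taken as 6.30 and 6.25, an assumption chosen because it agrees with the reported sample means of 5.534 and 5.459 seconds. The variable names are illustrative.

```python
from statistics import mean, stdev

# Run times in seconds, one entry per runner (runner order does not affect the summaries).
narrow = [5.50, 5.70, 5.60, 5.50, 5.85, 5.55, 5.40, 5.50, 5.15, 5.80, 5.20,
          5.55, 5.35, 5.00, 5.50, 5.55, 5.55, 5.50, 5.45, 5.60, 5.65, 6.30]
wide = [5.55, 5.75, 5.50, 5.40, 5.70, 5.60, 5.35, 5.35, 5.00, 5.70, 5.10,
        5.45, 5.45, 4.95, 5.40, 5.50, 5.35, 5.55, 5.25, 5.40, 5.55, 6.25]

# d = (narrow angle time) - (wide angle time), rounded to remove floating-point noise.
differences = [round(n - w, 2) for n, w in zip(narrow, wide)]

x_bar_d = mean(differences)  # mean of the differences
s_d = stdev(differences)     # sample standard deviation of the differences

print(round(x_bar_d, 3), round(s_d, 4))  # about 0.075 and 0.0883
```

Because each runner contributes one difference, the 22 values in `differences` can be treated as a single sample and summarized with the usual one-sample statistics.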
PARTICIPATION
ACTIVITY
1.2.3: Exploring paired data: Calculating the mean of the differences.
[Figure: Table of narrow times, wide times, and differences for runners 1, 2, ..., 22, shown with dot plots of Time (sec) and Differences (sec).]
Animation content:
Static figure: Calculating the mean of the differences in running times.
Step 1: The data for the narrow-angle and wide-angle times are paired, so such data should not be treated as coming from two independent samples. Instead, the differences in times are examined.
Rounding bases study data
Runner | Narrow angle (seconds) | Wide angle (seconds) | Difference (seconds)
1 | 5.00 | 4.95 | 0.05
2 | 5.35 | 5.45 | -0.10
... | ... | ... | ...
22 | 5.15 | 5.00 | 0.15
Both the narrow and wide running times cluster around 5.5 seconds and range from approximately 4.9 to 6.4 seconds.
Step 2: Runner number 1 took longer on the narrow angle than the wide angle, so the difference in time is 0.05 seconds.
Step 3: Runner number 2 took longer on the wide angle than on the narrow angle, so the difference in time is -0.10 seconds.
Step 4: The distribution of the differences in running times has a mean of \(\overline{x}_d\) = 0.075 sec. and a standard deviation of \(s_d\) = 0.0883 sec.
The differences range from -0.1 to 0.2 seconds. The majority of the differences are greater than 0.
Summary statistics
Variable | Sample size, \(n\) | Sample mean | Sample SD
Narrow | 22 | \(\bar{x}_{n} = 5.534\) | \(s_{n} = 0.260\)
Wide | 22 | \(\bar{x}_{w} = 5.459\) | \(s_{w} = 0.273\)
Difference | 22 | \(\bar{x}_{d} = 0.075\) | \(s_{d} = 0.088\)
PARTICIPATION
ACTIVITY
1.2.4: Exploring the distribution of the difference in run times.
1)
A single orange dot in the animation
above represents _____.
a runner's time for the narrow
angle
a runner's time for the wide angle
the mean running time for all
runners using the narrow-angle
strategy
the difference in a runner's time
between the two angles
2)
A single grey dot in the animation
represents _____.
a runner's time for the narrow
angle
the difference in mean time for
the two angles
the difference in a runner's time
between the two angles
3)
Based on the dot plot, the distribution of
the differences is _____.
symmetric
slightly skewed
4)
Which of the following is the most
appropriate null and alternative
hypotheses for this study?
\(H_0{:}\ \mu_{narrow} - \mu_{wide} = 0\); \(H_a{:}\ \mu_{narrow} - \mu_{wide} > 0\)
\(H_0{:}\ \mu_d = 0\); \(H_a{:}\ \mu_d \neq 0\)
\(H_0{:}\ \pi = 0.50\); \(H_a{:}\ \pi > 0.50\)
5)
The variability of the differences in
running time is _____ the variability in
running times between runners.
less than
about the same as
greater than
Step 4: A chance model for paired base-running data
As hoped with the paired design, the variation in the time differences (\(s_{d}\) = 0.088 seconds) is smaller than
the variation in running times for each of the base-running strategies (\(s_{narrow}\) = 0.260, \(s_{wide}\) =
0.273 seconds). To decide whether the observed value of the mean difference (\(\bar{x}_d\) = 0.075 seconds)
in run times is significantly above 0, the paired design also needs to be reflected in the chance model. The null
hypothesis assumes no association between strategy and run time (a mean of the differences in run times
equal to 0), so within each pair of individual run times, the narrow and wide run times are interchangeable. So
to generate a set of "could-have-been" outcomes, the simulation will randomly decide whether or not to swap
the two observations for each runner.
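The swapping procedure can be sketched in Python as below. Swapping a runner's two times only flips the sign of that runner's difference, so the sketch works directly with the 22 differences (narrow minus wide) from the rounding bases data. The function name `simulate_null_means`, the seed, and the use of Python's `random` module in place of physical coin flips are assumptions for illustration.

```python
import random
from statistics import mean

# Differences in seconds (narrow - wide) for the 22 runners.
differences = [-0.05, -0.05, 0.10, 0.10, 0.15, -0.05, 0.05, 0.15, 0.15, 0.10, 0.10,
               0.10, -0.10, 0.05, 0.10, 0.05, 0.20, -0.05, 0.20, 0.20, 0.10, 0.05]


def simulate_null_means(diffs, repetitions=1000, seed=None):
    """Build a simulated null distribution of the mean difference."""
    rng = random.Random(seed)
    simulated = []
    for _ in range(repetitions):
        # "Heads" swaps the pair, which changes the sign of that difference.
        flipped = [-d if rng.random() < 0.5 else d for d in diffs]
        simulated.append(mean(flipped))
    return simulated


null_means = simulate_null_means(differences, repetitions=1000, seed=0)
print(round(mean(null_means), 3), round(min(null_means), 3), round(max(null_means), 3))
```

Each entry of `null_means` is one "could-have-been" mean difference under the null hypothesis, so the list as a whole should be centered near 0.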
PARTICIPATION
ACTIVITY
1.2.5: A chance model for the mean of the differences for paired data.
[Figure: Paired run times with a coin-flip "Swap?" column, a dot plot of the differences, and a dot plot of the simulated average differences.]
ID | Narrow | Wide | Swap? | Diff (narrow - wide, before any swap)
1 | 5.50 | 5.55 | Y | -0.05
2 | 5.70 | 5.75 | N | -0.05
3 | 5.60 | 5.50 | N | 0.10
4 | 5.50 | 5.40 | Y | 0.10
5 | 5.85 | 5.70 | ? | 0.15
6 | 5.55 | 5.60 | ? | -0.05
7 | 5.40 | 5.35 | ? | 0.05
8 | 5.50 | 5.35 | ? | 0.15
9 | 5.15 | 5.00 | ? | 0.15
10 | 5.80 | 5.70 | ? | 0.10
11 | 5.20 | 5.10 | ? | 0.10
12 | 5.55 | 5.45 | ? | 0.10
13 | 5.35 | 5.45 | ? | -0.10
14 | 5.00 | 4.95 | ? | 0.05
15 | 5.50 | 5.40 | ? | 0.10
16 | 5.55 | 5.50 | ? | 0.05
17 | 5.55 | 5.35 | ? | 0.20
18 | 5.50 | 5.55 | ? | -0.05
19 | 5.45 | 5.25 | ? | 0.20
20 | 5.60 | 5.40 | ? | 0.20
21 | 5.65 | 5.55 | ? | 0.10
22 | 6.30 | 6.25 | ? | 0.05
Outcomes (average difference), 1000 repetitions: -0.007, 0.039, 0.052, ...
Animation content:
Static figure: Creating a null distribution of simulated average differences.
Step 1: If the wide and narrow strategies take the same amount of time, the times in each pair can be swapped at random (Ex: flip a coin) to generate a repetition of the study, assuming the type of angle doesn't matter.
Step 2: If a coin lands heads, the wide and narrow times are swapped for that observation. When the times are swapped, the sign of the difference changes.
Ex: A runner's narrow time is 5.50 and their wide time is 5.55, which corresponds to a difference of 5.50 - 5.55 = -0.05. The coin lands on heads, so the times are switched. The new difference is narrow - wide = 5.55 - 5.50 = 0.05.
Step 3: A coin is flipped for each pair. In the dot plot of the differences, the difference for each runner either stays the same or changes sign.
Step 4: The value of the mean of the simulated differences in run time is calculated for the first repetition.
The mean of the simulated differences is -0.007.
Step 5: A second set of shuffles produces a second simulated value of the mean difference in run time assuming the wide and narrow angles take the same amount of time in the long run.
The data are shuffled by the same process. The second simulation produces a mean difference of 0.039.
Step 6: A simulated null distribution for the sample mean difference in run times, assuming the two running angles take the same amount of time, is made by performing 998 additional repetitions.
The 1000 total randomizations create a null distribution of average differences. The distribution is bell-shaped and centered at 0. The values range from approximately -0.07 to 0.07.
PARTICIPATION
ACTIVITY
1.2.6: The chance model for a paired design.
1)
A single value in the simulated null
distribution corresponds to _____.
a runner's time for the narrow
angle run
a runner's time for the wide angle
run
the mean time for all runners
using the narrow angle
the mean time for runners
reshuffled to the wide angle
the mean of the differences in
runners' times between the
angles
2)
A repetition of the simulation generating
a negative \(\overline{x}_d\) means
_____.
something went wrong as
negative times are impossible
all runners ran faster using the
narrow angle
individual runners' times using
the narrow path were faster on
average
3)
The center of the null distribution
should be 0 because _____.
the chance model assumes that
\(H_0\) is true
the chance model assumes that
all groups are paired
the statistic is close to zero
4)
The shape of the null distribution is
_____.
skewed to the left
skewed to the right
bell-shaped
5)
A p-value could be calculated by finding
the proportion of repetitions in the null
distribution which have a statistic
\(\bar{x}_d\) of _____.
0 or greater
0.052 or greater
0.052 or greater and -0.052 or
smaller
0.075 or greater
0.075 or greater and -0.075 or
less
Step 4, cont.: Measuring the strength of evidence
The chance model can be used to estimate the p-value and standardized statistic as usual.
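The chance model for a paired design can be sketched in code. The following is a minimal Python sketch, assuming hypothetical paired run times (the study's raw data are not reproduced here); the names `narrow`, `wide`, and `simulate_null` are illustrative only. Each repetition randomly swaps the two times within each runner, which is equivalent to flipping the sign of that runner's difference, and records the mean of the resulting differences.

```python
import random
import statistics

def simulate_null(differences, repetitions=1000, seed=0):
    """Simulate mean differences under the 'no difference' chance model.

    Swapping a runner's two times flips the sign of that runner's difference.
    """
    rng = random.Random(seed)
    null_means = []
    for _ in range(repetitions):
        swapped = [d if rng.random() < 0.5 else -d for d in differences]
        null_means.append(statistics.mean(swapped))
    return null_means

# Hypothetical paired times in seconds (NOT the study's data).
narrow = [5.50, 5.70, 5.60, 5.50, 5.85, 5.55, 5.40, 5.50]
wide = [5.55, 5.75, 5.50, 5.40, 5.70, 5.60, 5.35, 5.35]
differences = [n - w for n, w in zip(narrow, wide)]

observed = statistics.mean(differences)
null_means = simulate_null(differences)

# Two-sided p-value: proportion of repetitions at least as extreme as observed.
extreme = sum(1 for m in null_means if abs(m) >= abs(observed))
p_value = extreme / len(null_means)

# Standardized statistic: (observed - hypothesized value 0) / SD of null distribution.
standardized = (observed - 0) / statistics.stdev(null_means)
print(f"observed mean difference = {observed:.3f}")
print(f"p-value = {p_value:.3f}, standardized statistic = {standardized:.2f}")
```

Because the swaps are random, the exact p-value and standardized statistic change with the seed and the number of repetitions; only the procedure is meant to carry over to the study.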
PARTICIPATION
ACTIVITY
1.2.7: Measuring strength of evidence using the simulated null distribution of
the average differences.
[Figure: simulated null distribution of the average difference (Mean = 0.000, SD = 0.024), with p-value \(= \frac{0+0}{1000}\) and standardized statistic \(= \dfrac{\bar{x}_d - 0}{0.024}\).]
Animation content:
Static figure: Measuring the strength of evidence using the simulated null distribution.
The observed average difference is \(\bar{x}_d = 0.075\).
Step 1: The simulated null distribution of the average differences in run time for the two strategies found a mean of 0 and a standard deviation of 0.024.
The simulated null distribution is created from 1000 repetitions. The distribution is approximately bell shaped.
Step 2: The two-sided p-value can be calculated by counting the number of repetitions that have a simulated mean difference at least as extreme as 0.075 seconds. The estimated p-value is less than 0.001.
None of the simulated average differences are less than or equal to -0.075 and none are greater than or equal to 0.075. Therefore, the simulation-based p-value \(= \frac{0+0}{1000} = 0\), which is reported as an estimated p-value of less than 0.001.
Step 3: A standardized statistic can be calculated by subtracting the hypothesized value of the parameter and dividing by the standard error of the average difference, which can be estimated from the null distribution.
Standardized statistic \(= \dfrac{\bar{x}_d - 0}{0.024}\).
Animation captions:
1. The simulated null distribution of the average differences in run time for the two strategies
found a mean of 0 and a standard deviation of 0.024.
2. The two-sided p-value can be calculated by counting the number of repetitions that have a
simulated mean difference at least as extreme as 0.075 seconds. The estimated p-value is less
PARTICIPATION
ACTIVITY
1.2.8: Measuring strength of evidence.
1)
The standard deviation of the null
distribution measures _____.
how much the differences vary
within a repetition
how much the mean differences
vary between repetitions
2)
Considering the simulation, an observed
difference of 0.075 _____ when the null
hypothesis is true.
is surprising
is not surprising
3)
Based on the p-value and standardized
statistic, the study indicates very strong
evidence ___.
of a difference in running times
between the two angles, on
average, for the 22 runners in the
study
that the time to second base
does not differ between the two
angles, on average, for the 22
runners in the study
of a difference in running times
between the two angles, on
average, for the population of
runners similar to the ones in the
study
that the time to second base
does not differ between the two
angles, on average, for the
population of runners similar to
the ones in the study.
than 0.001.
3. A standardized statistic can be calculated by subtracting the hypothesized value of the
parameter and dividing by the standard error of the average difference, which can be estimated
from the null distribution.
4)
The p-value and standardized statistic
present _____ evidence against the null
hypothesis.
very strong
strong
moderate
little to no
Step 4, cont.: Estimating how large of a difference the base-running strategy makes
According to the results of the simulation, very strong evidence exists in favor of the alternative hypothesis
that the two base-running angles differ on average in the long run. The observed statistic \(\bar{x}_d\) is
greater than 0, suggesting the wide angle is faster than the narrow angle on average. But how much faster?
A 2SD confidence interval (CI) estimates how large the effect is. To estimate how much faster the wide-angle route tends to be, a
95% confidence interval for \(\mu_d\) can be approximated as the observed statistic \(\bar{x}_d\) plus or minus 2 standard deviations of the null
distribution.
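As a worked illustration of the 2SD method, the short Python sketch below plugs in the two values reported earlier in this section (observed mean difference 0.075 and null-distribution SD 0.024). The variable names are illustrative assumptions, not part of the original text.

```python
observed_mean_diff = 0.075  # observed average difference in seconds, from the text
null_sd = 0.024             # SD of the simulated null distribution, from the text

# 2SD method: statistic plus or minus 2 standard deviations of the null distribution.
margin = 2 * null_sd
lower = observed_mean_diff - margin
upper = observed_mean_diff + margin
print(f"2SD interval: ({lower:.3f}, {upper:.3f})")  # prints (0.027, 0.123)
```

The interval lies entirely above 0, which agrees with the small p-value found for the same data.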
Figure 1.2.1: A 2SD confidence interval for the long-run difference in mean
run times.
PARTICIPATION
ACTIVITY
1.2.9: Calculating the 2SD confidence interval.
1)
Find the lower boundary of the
confidence interval. Type as #.###.
2)
Find the upper boundary of the
confidence interval. Type as #.###.
PARTICIPATION
ACTIVITY
1.2.10: Estimating the size of the difference in run times between the two
strategies.
1)
The confidence interval can be
interpreted as being _____.
95% confident that the narrow
route takes 0.027 to 0.123
seconds longer than the wide
route on average
95% confident that the wide route
takes 0.027 to 0.123 seconds
longer than the narrow route on
average
confident that 95% of base
runners saved between 0.027
and 0.123 seconds by taking the
narrow route over the wide route
confident that 95% of base
runners saved between 0.027
and 0.123 seconds by taking the
narrow route over the wide route
2)
The 95% confidence interval does not
include _____ because the p-value and
standardized statistic provided strong
evidence of a difference in running
times between the two angles on
average.
0
0.05
0.075
PARTICIPATION
ACTIVITY
1.2.11: Step 5: Forming conclusions.
1)
The study _____ that the choice of angle
(wide or narrow) causes
a difference in
running time from home plate to
second base.
proves
provides convincing evidence
does not provide convincing
evidence
2)
The results from the study _____ be
generalized to all baseball players
because _____.
can; random assignment was
used
can; random sampling was used
cannot; random assignment was
not used
cannot; random sampling was
not used
Step 6: Did pairing help?
The data could be incorrectly analyzed using a two-independent-samples procedure. If the researcher had
treated the 22 observations from the narrow angle and the 22 observations from the wide angle as coming
from two different sets of 22 runners, would the p-value stay the same, increase, or decrease?
PARTICIPATION
ACTIVITY
1.2.12: Benefits of a paired design.
[Figure: null distributions from 1000 repetitions. Paired data: Mean = 0.000, SD = 0.024. Independent groups (difference in means): Mean = -0.003, SD = 0.079.]
Animation content:
Static figure: Comparing the results when the data is incorrectly treated as independent groups and when the data is correctly treated as paired data.
Step 1: If the analysis incorrectly treats the observations from the narrow-angle and wide-angle strategies as independent samples, the pairing is ignored.
The difference in time for each runner is no longer calculated. Only the single difference in the means of each group is recorded.
Step 2: To simulate, all outcomes are randomly shuffled into two groups. Because observations are no longer paired, both times from a single runner may end up in the same group.
Step 3: The mean of each group is recorded and the difference in group means is plotted.
Ex: one shuffling produces an average narrow time of 5.498 and an average wide time of 5.495. The difference \(5.498 - 5.495 = 0.003\) is recorded in the null distribution.
Step 4: The process is repeated for 1,000 total shuffles. The null distribution has a mean of -0.003 and a standard deviation of 0.079.
The differences in group means range from approximately -0.52 to 0.25.
Step 5: The null distributions produced by the two approaches can be compared on the same scale. Ignoring the paired study design results in a different null distribution and therefore a different p-value.
Incorrectly treating the data as independent groups produces a simulation-based p-value greater than 0.3. Correctly treating the data as paired data produces a simulation-based p-value less than 0.001.
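The contrast between the two analyses can be reproduced in a small simulation. The Python sketch below is illustrative only: the run times are hypothetical (generated so that runners differ a lot from one another while each runner's two times stay close), and the function names `paired_null` and `independent_null` are assumptions, not part of the original study or text.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical data (NOT the study's): large runner-to-runner variation,
# small within-runner variation between the two angles.
baseline = [rng.uniform(5.0, 6.0) for _ in range(22)]
narrow = [b + rng.gauss(0.04, 0.05) for b in baseline]
wide = [b + rng.gauss(-0.04, 0.05) for b in baseline]

def paired_null(narrow, wide, reps=1000):
    """Swap times within each runner, then record the mean difference."""
    diffs = [n - w for n, w in zip(narrow, wide)]
    return [statistics.mean([d if rng.random() < 0.5 else -d for d in diffs])
            for _ in range(reps)]

def independent_null(narrow, wide, reps=1000):
    """Ignore pairing: shuffle all times into two groups, record difference in means."""
    pooled = narrow + wide
    k = len(narrow)
    results = []
    for _ in range(reps):
        rng.shuffle(pooled)
        results.append(statistics.mean(pooled[:k]) - statistics.mean(pooled[k:]))
    return results

sd_paired = statistics.stdev(paired_null(narrow, wide))
sd_independent = statistics.stdev(independent_null(narrow, wide))
print(f"paired null SD = {sd_paired:.3f}, independent null SD = {sd_independent:.3f}")
```

With data of this kind, the independent-groups null distribution is much wider because runner-to-runner variation is mixed into every shuffle, so the same observed difference looks far less unusual.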
Animation captions:
1. If the analysis incorrectly treats the observations from the narrow-angle and wide-angle
strategies as independent samples, the pairing is ignored.
2. To simulate, all outcomes are randomly shuffled into two groups. Because observations are
no longer paired, both times from a single runner may end up in the same group.
3. The mean of each group is recorded and the difference in group means is plotted.
4. The process is repeated for 1,000 total shuffles. The null distribution has a mean of -0.003
and a standard deviation of 0.079.
5. The null distributions produced by the two approaches can be compared on the same scale.
Ignoring the paired study design results in a different null distribution and therefore a different
PARTICIPATION
ACTIVITY
1.2.13: Consequences of an incorrect two-sample analysis.
1)
Treating "narrow" and "wide" as
independent samples produces a null
distribution with a _____ standard
deviation compared to treating data as
paired.
smaller
similar
larger
2)
Treating "narrow" and "wide" as
independent samples yields _____
conclusion about the null and
alternative hypotheses compared to
treating data as paired.
the same
a different
3)
A study design which does not pair the
runners fails to account for _____.
sample-to-sample variation
runner-to-runner variation
confounding variables
CHALLENGE
ACTIVITY
1.2.1: Simulation-based approach for analyzing paired data.
p-value.
1.3 Example: Theory-based approach to analyzing
data from paired samples
Learning goals
Identify when a theory-based approach would be valid to find the p-value or a
confidence interval for paired data.
Use the Two Means tool to find theory-based p-values and confidence intervals for
paired data.
Does adding a laugh track make a dad joke better received?
A "dad joke" is a short joke told with either sincere humorous intent or with the intent to provoke a negative
response at the joke's over-simplicity. Ex: What is brown and sticky? A stick. Television shows often add laugh
tracks to make the show seem funnier. Step 1: Can funniness ratings of dad jokes be increased on average by
adding a laugh track?
Step 2: Researchers (
Cai et al., 2019
) had a professional comedian record 40 dad jokes. People listened to the
jokes and scored the joke on a scale of 1 to 7, with 1 being not funny at all and 7 being extremely funny. The
average score was computed for each joke, giving each joke a funniness rating. Then different people listened
to the same jokes but with a laugh track, and a funniness rating was again determined for each joke. Step 3:
The results for this paired design are shown below.
PARTICIPATION
ACTIVITY
1.3.1: Laugh track study design and data.
[Figure: dot plots of funniness ratings with the laugh track (Mean = 3.010, SD = 0.490) and with no laugh track (Mean = 2.715, SD = 0.507), and of the differences (Mean = 0.295, SD = 0.427).]
Animation content:
Static figure: The study design and data of the laugh track study.
Laugh track study data
Joke ID   Laugh track   No laugh track   Difference
1         3.22          2.80             0.42
2         3.09          2.56             0.53
3         3.08          2.65             0.43
...       ...           ...              ...
Step 1: Ratings for each joke, with the laugh track and without the laugh track, were determined for 40 dad jokes. Lower ratings correspond to a less funny joke.
Jokes with the laugh track were rated between approximately 2.1 to 4.1 with a mean of 3.010 and a standard deviation of 0.490.
Jokes with no laugh track were rated between approximately 1.5 to 4.7 with a mean of 2.715 and a standard deviation of 0.507.
The dot plots for both the laugh track ratings and no laugh track ratings appear to have multiple clusters.
Step 2: To analyze the data, the difference in ratings (laugh track - no laugh track) was computed for each joke.
The difference in ratings has a mean of 0.295 and a standard deviation of 0.427. The majority of the differences fall between approximately -0.5 and 1, one difference is approximately -1.3.
The dot plot of differences has one unusual observation at approximately -1.3 but is not drastically skewed.
Step 3: The results show that, while some jokes with the laugh track had a lower rating than without the laugh track, on average the rating was higher when a laugh track was used.
Summary statistics
Laugh track?     Sample size, \(n\)   Sample mean                Sample SD
Laugh track      40                   \(\bar{x}_{lt} = 3.010\)   \(s_{lt} = 0.490\)
No laugh track   40                   \(\bar{x}_{nl} = 2.715\)   \(s_{nl} = 0.507\)
Difference       40                   \(\bar{x}_{d} = 0.295\)    \(s_{d} = 0.427\)
PARTICIPATION
ACTIVITY
1.3.2: Laugh track study design.
1)
The observational units in the study are
______.
the subjects listening to jokes
whether or not a laugh track was
used
the 40 jokes
2)
The explanatory variable is _____, which
is a _____ variable.
whether or not a laugh track was
used; categorical
whether or not a laugh track was
used; quantitative
the rating for each joke;
categorical
the rating for each joke;
quantitative
3)
The response variable is _____, which is
a _____ variable.
whether or not a laugh track was
used; categorical
whether or not a laugh track was
used; quantitative
the rating for each joke;
categorical
the rating for each joke;
quantitative
Animation captions:
1. Ratings for each joke, with the laugh track and without the laugh track, were determined for 40
dad jokes. Lower ratings correspond to a less funny joke.
2. To analyze the data, the difference in ratings (laugh track - no laugh track) was computed for
each joke.
3. The results show that, while some jokes with the laugh track had a lower rating than without the
laugh track, on average the rating was higher when a laugh track was used.
4)
The laugh track study is an ______.
observational study
experiment
5)
The laugh track study is an example of
a/an ______.
independent groups design
paired design
6)
The statistic is the ______ funniness
ratings.
mean of the differences in
difference of the mean
Step 4: Draw inferences beyond the data
The parameter in the laugh track study is defined as the long-run mean difference in the ratings when a laugh track is
present compared to no laugh track and can be expressed using \(\mu_d\).
Using \(\mu_d\), the hypotheses can be stated as:
$$ H_0{:}\ \mu_d = 0 \text{ (Including a laugh track makes no difference, on average, with regard to rating the
joke.)}\\ H_a{:}\ \mu_d > 0 \text{ (Including a laugh track is associated with higher average rating for dad
jokes.)} $$
If the null hypothesis is true, the presence of a laugh track makes no difference in the funniness rating. The
chance model accounts for the "no difference" assumption and the pairing in the study design by randomly
swapping the funniness ratings between laugh track and no laugh track for each joke. Ex: The random
swapping decisions could be based on the outcome of a coin flip. The results of the random swapping are
used to build up a simulated null distribution as shown below.
PARTICIPATION
ACTIVITY
1.3.3: Simulation-based p-value.
PARTICIPATION
ACTIVITY
1.3.4: Exploring the null distribution of the average difference in joke ratings.
[Figure: simulated null distribution of the average difference from 1000 repetitions; p-value \(= \frac{0}{1000}\) for simulated average differences \(\ge 0.295\).]
Animation content:
Static figure: Creating a null distribution of simulated average differences and calculating a simulation-based p-value.
The observed average difference is \(\bar{x}_d = 0.295\).
Step 1: To model the null hypothesis, the ratings with and without the laugh track are randomly swapped for each joke. Repeating the random swaps for the 40 jokes 1,000 times creates a null distribution.
1000 randomizations create a null distribution of average differences that range from approximately -0.3 to 0.25. The mean is 0.000 and the standard deviation is 0.083.
Step 2: The observed average difference of 0.295 is in the right tail of the null distribution.
Step 3: The estimated p-value is less than 0.001.
No randomized difference is greater than or equal to the observed mean difference of 0.295.
Animation captions:
1. To model the null hypothesis, the ratings with and without the laugh track are randomly
swapped for each joke. Repeating the random swaps for the 40 jokes 1,000 times creates a null
distribution.
2. The observed average difference of 0.295 is in the right tail of the null distribution.
3. The estimated p-value is less than 0.001.
1)
The above animation indicates less
than a 0.1% chance of seeing a sample
mean difference in funniness rating of
0.295 or higher assuming _____.
including a laugh track makes no
difference, on average, with
regard to rating the joke.
including a laugh track causes an
increase, on average, in the joke
funniness rating.
the p-value is small
2)
The value of the standardized statistic
is _____.
0.295
0.691
3.55
3)
The p-value and standardized statistic
suggest very strong evidence that on
average, _____.
the 40 jokes in this study were
rated more highly with a laugh
track than without a laugh track
the 40 jokes in this study were
not rated more highly with a
laugh track than without a laugh
track
dad jokes are rated more highly
with a laugh track than without a
laugh track
dad jokes are not rated more
highly with a laugh track than
without a laugh track
Theory-based approach: Matched pairs \(t\)-test
To analyze the differences, a theory-based approach for one sample mean can be used as long as the validity
conditions for the Central Limit Theorem are met:
the distribution of the differences is symmetric, OR
the sample size (number of differences) is larger than 20 AND the distribution of the differences is not
strongly skewed.
When applied to paired data, the one-sample theory-based approach is called a paired \(t\)-test. In terms of
the observed mean difference \(\bar{x}_d\) and standard deviation of the differences \(s_d\), the standard error
and \(t\)-test statistic are computed using the formulas:
$$ SE(\bar{x}_d) = \frac{s_d}{\sqrt{n}} \qquad\qquad t = \frac{\bar{x}_d - 0}{s_d / \sqrt{n}} $$
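As a quick check of these formulas, the Python sketch below applies them to the rounded summary statistics reported for the laugh track study (\(n = 40\), \(\bar{x}_d = 0.295\), \(s_d = 0.427\)). The variable names are illustrative. Because the inputs are rounded, the computed \(t\) is about 4.37, slightly below the 4.38 reported in this section; the gap is presumably due to rounding of the summary statistics.

```python
import math

n = 40             # number of jokes (pairs), from the text
mean_diff = 0.295  # observed mean difference, x-bar_d
sd_diff = 0.427    # standard deviation of the differences, s_d

# Standard error of the mean difference and the paired t-statistic.
standard_error = sd_diff / math.sqrt(n)
t_statistic = (mean_diff - 0) / standard_error
print(f"SE = {standard_error:.3f}, t = {t_statistic:.2f}")
```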
PARTICIPATION
ACTIVITY
1.3.5: Matched pairs \(t\)-test.
[Figure: validity conditions (differences symmetric, OR \(20 < n\) and differences not strongly skewed) and theory-based results for the differences (Mean = 0.295, SD = 0.427): observed statistic \(\bar{x}_d = 0.295\), standardized to \(t = \dfrac{\bar{x}_d - 0}{s_d/\sqrt{n}} = \dfrac{0.295 - 0}{0.427/\sqrt{40}} = 4.38\), p-value < 0.0001.]
Animation content:
Static figure: Calculating a theory-based p-value.
The observed differences have a mean of \(\bar{x}_d = 0.295\) and a standard deviation of \(s_d = 0.427\).
Step 1: The paired \(t\)-test can be used if either the sample distribution of differences is symmetric or the sample includes at least 20 differences and the sample distribution of differences is not strongly skewed.
Step 2: Though one low outlier is present, the sample distribution of differences is not strongly skewed, and the sample size is 40 jokes, satisfying the validity condition.
The distribution of differences is not symmetric. However, there are at least 20 observations and the data is not strongly skewed. Therefore the data passes the validity conditions.
Step 3: The standardized \(t\)-statistic is 4.38.
\(t = \dfrac{\bar{x}_d - 0}{s_d/\sqrt{n}} = \dfrac{0.295 - 0}{0.427/\sqrt{40}} = 4.38\).
PARTICIPATION
ACTIVITY
1.3.6: Comparing the simulation and theory-based approaches.
1)
In the laugh track study, the validity
conditions for a theory-based approach
are met because ______.
there are at least 10 successes
and at least 10 failures
the sample size is more than 20
and the distribution of
differences is fairly symmetric
Each joke was rated on both
conditions, laughter and no
laughter.
The individuals rating the jokes
were not aware that other
individuals were/were not
hearing a laugh track.
2)
The theory-based p-value estimates the
long-run proportion of shuffles with a
\(t\)-statistic _____.
at or below 0.295
at or above 0.295
at or below 4.38
at or above 4.38
Step 4: The \(t\)-statistic of 4.38 can be compared to the appropriate \(t\)-distribution to measure the strength of evidence against the null hypothesis.
4.38 is in the right tail of the t-distribution. The area under the curve to the right of 4.38 is less than 0.0001. Therefore, the theory-based p-value is less than 0.0001.
Animation captions:
1. The paired \(t\)-test can be used if either the sample distribution of differences is symmetric or
the sample includes at least 20 differences and the sample distribution of differences is not
strongly skewed.
2. Though one low outlier is present, the sample distribution of differences is not strongly skewed,
and the sample size is 40 jokes, satisfying the validity condition.
3. The standardized \(t\)-statistic is 4.38.
4. The \(t\)-statistic of 4.38 can be compared to the appropriate \(t\)-distribution to measure the
strength of evidence against the null hypothesis.
3)
The theory-based p-value and
standardized statistic _____ the result
from the simulation-based approach.
match
do not match
Step 4 cont.: Estimating the long-run mean difference adding a laugh track causes
Both the simulation-based and theory-based approaches indicate very strong evidence in favor of the
alternative hypothesis that the presence of a laugh track does increase the long-run average funniness rating
of a dad joke. However, the test of significance does not estimate the size of the difference.
As with previous studies, a confidence interval can be used to estimate how large the difference is. In other
words, how much funnier are jokes with the laugh track rated on average?
PARTICIPATION
ACTIVITY
1.3.7: Theory-based confidence intervals for the mean of the differences in
funniness rating.
[Figure: simulation-based \(SE(\bar{x}_d) = 0.083\) from the null distribution of mean differences, compared with theory-based \(SE(\bar{x}_d) = s_d/\sqrt{n} = 0.427/\sqrt{40} = 0.068\) from the observed data (\(\bar{x}_d = 0.295\), \(s_d = 0.427\)).]
Animation content:
PARTICIPATION
ACTIVITY
1.3.8: Calculating the confidence interval.
1)
The lower boundary for the 95% t-confidence interval is ___. Type as
#.###.
2)
The upper boundary for the 95% t-confidence interval is ___. Type as
#.###.
PARTICIPATION
ACTIVITY
1.3.9: Simulation and theory-based confidence intervals for the long-run mean
of the differences in funniness rating.
Static figure: Comparing the simulation-based and theory-based standard error.
The observed differences have a mean of \(\bar{x}_d = 0.295\) and a standard deviation of \(s_d = 0.427\).
Step 1: The simulated null distribution has a mean of approximately zero and a standard deviation of 0.083.
Simulation-based \(SE(\bar{x}_d) = 0.083\).
Step 2: The theory-based approach predicts a standard deviation of \(s_d / \sqrt{n} = 0.427/\sqrt{40} = 0.068\).
Theory-based \(SE(\bar{x}_d) = s_d/ \sqrt{n} = 0.068\).
Step 3: For 95% confidence and a sample size of 40, the multiplier for the t-interval is 2.023.
Animation captions:
1. The simulated null distribution has a mean of approximately zero and a standard deviation of
0.083.
2. The theory-based approach predicts a standard deviation of \(s_d / \sqrt{n} = 0.427 / \sqrt{40} =
0.068\).
3. For 95% confidence and a sample size of 40, the multiplier for the \(t\)-interval is 2.023.
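The pieces above combine into a theory-based interval of the form statistic plus or minus multiplier times standard error. The Python sketch below is a minimal illustration using the values given in this section (\(\bar{x}_d = 0.295\), \(s_d = 0.427\), \(n = 40\), multiplier 2.023); the variable names are assumptions made for the example.

```python
import math

n = 40              # number of jokes (pairs), from the text
mean_diff = 0.295   # observed mean difference, x-bar_d
sd_diff = 0.427     # standard deviation of the differences, s_d
multiplier = 2.023  # t multiplier for 95% confidence, as given in the text

standard_error = sd_diff / math.sqrt(n)
margin = multiplier * standard_error
lower, upper = mean_diff - margin, mean_diff + margin
print(f"95% t-interval: ({lower:.3f}, {upper:.3f})")  # prints (0.158, 0.432)
```

Using the unrounded standard error rather than 0.068 is what reproduces the endpoints 0.158 and 0.432 quoted in the interpretation questions.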
1)
0.068 measures the ___.
the variability in the joke ratings
the variability in the differences in
joke ratings
the variability in the mean
differences in joke ratings
the margin of error
2)
The confidence interval estimates _____.
the p-value
the difference in the sample
mean ratings of dad jokes
the population mean of the
differences in the ratings of dad
jokes
3)
A dad joke with a laugh track _____
compared to the same dad joke without
a laugh track.
will be rated 0.158 to 0.432
points higher 95% of the time
will be rated 0.158 to 0.432
points higher on average with
95% confidence
will be rated 0.158 to 0.432
points higher on average by 95%
of audience members
4)
The p-value and standardized statistic
indicated that the sample mean
difference ___ significantly different
from 0, so 0 ___ in the confidence
interval.
is; is
is; is not
is not; is
is not; is not
PARTICIPATION
ACTIVITY
1.3.10: Step 5: Formulate conclusions.
1)
The researchers _____ conclude that the
laugh track caused the increased rating.
can
cannot
2)
The study results are generalizable to
_____.
all jokes
all dad jokes
all dad jokes told by this
professional comedian
Step 6: Look back and ahead
While these results were statistically significant, a difference of less than 1 point on a 7-
point scale may not be worth the additional cost of producing the laugh track. The
researchers also wondered whether more authentic laughter might have an even larger
effect on funniness ratings. In the actual study, a third condition was also included:
spontaneous, genuine laughter. How can these results be analyzed along with the two
conditions presented here? The researchers also considered different types of
participants, in particular participants who were neurotypical or had autism. How might
the results differ in these different populations?
CHALLENGE
ACTIVITY
1.3.1: Theory-based approach to analyzing data from paired samples.
1.4 Supplemental Exploration: Paired designs
Rounding first base
Imagine you are at the plate in baseball and have hit a hard line drive. You want to try to stretch your hit from a
single to a double. Does the path that you take to “round” first base make much of a difference? For example
(see the figure below), is it better to take a "narrow angle" or a "wide angle" around first base? (This exploration
is based on an actual study reported in a master’s thesis by W. F. Woodward in 1970.)
Figure 1.4.1: Narrow angle versus wide angle.
Think about designing a study to investigate this question.
PARTICIPATION
ACTIVITY
1.4.1: Question 1.
Identify the explanatory and response variables in this study.
PARTICIPATION
ACTIVITY
1.4.2: Question 2.
How would you design an observational study to investigate this question? Explain why an
observational study would not allow you to decide which base-running angle is better than
the other.
PARTICIPATION
ACTIVITY
1.4.3: Question 3.
Suppose 20 baseball players volunteered to participate in an experiment. Suppose that you
also plan to assign a single angle, either wide or narrow, to each player. How would you
decide which player ran with which base-running angle?
A reasonable experimental design would be to randomly assign 10 of the 20 players to run with the wide angle
and the other 10 to run with the narrow angle.
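The random assignment described above can be sketched in a few lines of Python. This is only an illustration: the player labels and the helper name `assign_groups` are hypothetical, and any fair shuffling mechanism (cards, a random number table) would serve equally well.

```python
import random

def assign_groups(players, seed=None):
    """Randomly split the players into two equal-sized treatment groups."""
    rng = random.Random(seed)
    shuffled = players[:]            # copy so the caller's list is untouched
    rng.shuffle(shuffled)            # every ordering is equally likely
    half = len(shuffled) // 2
    return {"wide": shuffled[:half], "narrow": shuffled[half:]}

# Hypothetical labels standing in for the 20 volunteers.
players = [f"player_{i}" for i in range(1, 21)]
groups = assign_groups(players, seed=1)
print(groups["wide"])
print(groups["narrow"])
```

Each player lands in exactly one group, and chance alone decides which group that is.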
PARTICIPATION
ACTIVITY
1.4.4: Question 4.
Some runners are faster than others. Explain how random assignment controls for this, so
that speed is not likely to be a confounding variable in this study.
Even though random assignment tends to balance out other variables (such as speed) between the two
groups, there’s still a chance that most of the fast runners could be in one group and most of the slow runners
in the other group. More importantly, there’s likely to be a good bit of variability in the runners’ speeds, and that
variability would make it harder to spot a difference between the base-running angles even if one angle really is
superior to the other.
PARTICIPATION
ACTIVITY
1.4.5: Question 5.
Suggest a different way of conducting the experiment to make sure that speed is completely
balanced between the two groups.
Definition.
For a paired design, response values come in pairs, with one response value in the pair for
Group 1 and the other for Group 2. Sometimes the pairs come from matching individuals to
create groups of two (we call this paired design using matching); sometimes the pairs come
from measuring the same individual twice, once under each condition (we call this paired
design using repeated measures). For an independent groups design, each individual in a
group is unrelated to all the other individuals in the study. Each individual provides only one
response value.
In this study each runner can use both base-running angles. That way we can be sure that neither treatment
has more of the fast or slow runners, and we can also expect that differences in times for each runner will
show considerably less variability than individual running times.
PARTICIPATION
ACTIVITY
1.4.6: Question 6.
What aspect of this experiment should be determined randomly? (Hint: The treatment is not
determined randomly, because each runner experiences both treatments. But what other
factor could still have an effect on the response unless it was randomized?)
PARTICIPATION
ACTIVITY
1.4.7: Question 7.
What do you suggest using as the variable to be analyzed with this paired-design
experiment? (Hint: Think of a better option than simply analyzing the set of times with the
wide angle and the set of times with the narrow angle separately the way you would for an
independent groups design of the sort described in #3 and #4.)
With a paired design, we analyze the differences in the response between the two treatments. In this case we
would calculate the difference in running times between the wide and narrow angles for each player and then
analyze the sample of differences. You will learn how to do this in future sections.
The order in which the players run with the two angles should be determined randomly; otherwise, the order
could be a confounding variable. Perhaps players would be slower with their second angle because of fatigue
or perhaps they would be faster with their second angle if they were slow to get warmed up. Randomizing the
order takes away any worries about an order effect.
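As a rough sketch of these two ideas (randomizing the order for each player, then analyzing differences), consider the Python below. The running times are invented purely for illustration and are not data from the Woodward study; the function names are hypothetical as well.

```python
import random
from statistics import mean, stdev

def random_order(rng):
    """Coin flip deciding which angle a player runs first."""
    return ("wide", "narrow") if rng.random() < 0.5 else ("narrow", "wide")

def paired_differences(wide_times, narrow_times):
    """Difference (wide - narrow) for each player, in seconds."""
    return [w - n for w, n in zip(wide_times, narrow_times)]

rng = random.Random(0)
orders = [random_order(rng) for _ in range(4)]   # one coin flip per player

# Invented times in seconds for four hypothetical players (not study data).
wide = [5.50, 5.70, 5.60, 5.45]
narrow = [5.55, 5.75, 5.70, 5.50]

diffs = paired_differences(wide, narrow)
print(orders)
print(mean(diffs), stdev(diffs), stdev(wide))
```

With numbers like these, the differences vary much less than the individual times do, which is the benefit that pairing is meant to deliver.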
PARTICIPATION
ACTIVITY
1.4.8: Question 8.
So far you have seen three designs for this study. The first (#2) was observational. The
second (#3 and #4) was a randomized experiment with independent groups. The third (#5
and #6) used a paired design with pairs created by repeated measures. Consider a fourth
design: Suppose you have 20 players, as before, and you have the time for each player in a
100-yard dash. Explain how you could use this information to create pairs of runners that you
expect to be similar and how you would assign one runner in each pair to the narrow angle
and the other to a wide angle.
PARTICIPATION
ACTIVITY
1.4.9: Question 9.
Of the four designs (observational with independent groups, experimental with independent
groups, paired design using repeated measures, and paired design using matching), which do
you think is best for this context? Explain why. (Hint: Pairing works best when the units in a
pair are as similar to each other as possible.)
PARTICIPATION
ACTIVITY
1.4.10: Question 10.
As noted in #9, pairing works best when the two units in a pair are as similar as possible.
When the units in a pair are not similar, pairing will not be effective. Invent and describe an
example where you think pairing will be effective. Give the response and explanatory variable,
how you would create the pairs, and why you think pairing will be effective.
PARTICIPATION
ACTIVITY
1.4.11: Question 11.
Invent and describe an example where you think pairing will not be effective. Again, give the
response and explanatory variable, how you would create the pairs, and this time why you
think pairing will not be effective.
PARTICIPATION
ACTIVITY
1.4.12: Question 12.
For some contexts, the strategy of using repeated measures to create pairs is either
impossible or a bad idea. Invent and describe such a context.
PARTICIPATION
ACTIVITY
1.4.13: Question 13.
Pairing can also be used in the design of an observational study. Revisit the observational
study in #2, and suppose also that you have times in the hundred yard dash for each of the
20 players. Explain how you could use those times to create pairs. (Suppose that, of the 20
players, 12 used the wide angle and only 8 used the narrow angle. This means you can create
only 8 pairs.)
1.5 Supplemental Exploration: Simulation-based
approach for analyzing paired data
Exercise and heart rate
Raising your heart rate is one of the major goals of exercise. Do some exercises raise your heart rate more
than others? Which common exercise, jumping jacks or bicycle kicks, raises your heart rate more? In this
exploration, you are going to compare heart rates of you and your classmates doing jumping jacks and bicycle
kicks. If you don’t want to collect your own data, you may use data already collected on a class where jumping
jacks were compared to bicycle kicks. The dataset is JJvsBicycle. Below are links to YouTube videos explaining
how to do each of these exercises.
Jumping jack: http://www.youtube.com/watch?v=dmYwZH_BNd0
Bicycle kicks: https://www.youtube.com/watch?v=9FGilxCbdz8
After you do the exercise for 30 seconds, you will measure your heart rate. For instructions on how to measure
your heart rate, visit a website like WebMD (https://www.webmd.com/heart-disease/heart-failure/watching-rate-monitor#1).
It is very important that you follow these protocols carefully so we have high-quality data to
analyze. Although it is best to have the largest sample size possible, if you feel uncomfortable doing any of
these exercises, please don’t feel you have to.
Question 1.
You will first flip a coin to randomize the order (jumping jacks first or bicycle kicks first) in which you
measure your heart rate. If heads:
Do jumping jacks for 30 seconds. Then, measure your jumping jack heart rate.
Sit down and take a break for two minutes.
Now do bicycle kicks for 30 seconds. Then, measure your bicycle kick heart rate.
If tails:
Do bicycle kicks for 30 seconds. Then, measure your bicycle kick heart rate.
Sit down and take a break for two minutes.
Now do jumping jacks for 30 seconds. Then, measure your jumping jack heart rate.
PARTICIPATION
ACTIVITY
1.5.1: Question 1a.
Report your heart rate for bicycle kicks and jumping jacks in beats per minute.
PARTICIPATION
ACTIVITY
1.5.2: Question 1b.
Find the difference in your two heart rates (jumping jack heart rate − bicycle kick heart rate) in
beats per minute.
Question 1c.
Your instructor will give you and your classmates instructions on where to record your
jumping jack heart rate, bicycle kick heart rate, and difference (jumping jacks − bicycle kicks).
PARTICIPATION
ACTIVITY
1.5.3: Question 2.
Explain why it is reasonable to say that the two heart rates you collected should not be
treated as independent data.
PARTICIPATION
ACTIVITY
1.5.4: Question 3.
Why do you think we randomized the order of jumping jacks and bicycle kicks before
measuring heart rates?
As seen in previous chapters, we can summarize the quantitative data on heart rates using averages or means.
But, because the data are paired, instead of comparing mean jumping jack heart rate to mean bicycle kick
heart rate, we will instead look at the mean difference in heart rate between doing jumping jacks and bicycle
kicks. Thus, we can define our parameter of interest to be
\[ \mu_d = \text{long-run mean difference in heart rate when doing jumping jacks and bicycle kicks in the population of interest} \]
Note that the subscript \(d\) in \(\mu_d\) is used to denote that we are looking at an average of differences.
PARTICIPATION
ACTIVITY
1.5.5: Question 4.
State the null and alternative hypotheses to test whether the mean difference in heart rate
between the two exercises is not 0.
Key Idea.
When the parameter of interest is the long-run mean difference or population mean difference,
the corresponding statistic is the sample mean difference.
PARTICIPATION
ACTIVITY
1.5.6: Question 5.
Find the average of the differences between the two heart rates for the entire class. This is
the statistic we will use to summarize the data.
Your null hypothesis should essentially state that there is no difference in the heart rates between the two
exercises, on average. If that is the case, it doesn’t really matter if we swap someone’s jumping jack heart rate
with his or her bicycle kick heart rate. This is how we will model the null to develop a null distribution. To
randomly swap some of the values we can just use a coin flip. If the coin lands heads you will swap the two
heart rates. If the coin lands tails you won’t swap the heart rates.
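One repetition of this coin-flip model could be sketched in Python as follows. The heart rates are made-up values for four hypothetical students, not the JJvsBicycle data, and `rerandomize_once` is a name chosen only for this illustration.

```python
import random
from statistics import mean

def rerandomize_once(first, second, rng):
    """Flip a coin for each pair; heads swaps the two values.
    Returns the mean of the (possibly swapped) differences."""
    diffs = []
    for a, b in zip(first, second):
        if rng.random() < 0.5:       # heads: swap this student's two heart rates
            a, b = b, a
        diffs.append(a - b)          # tails: leave the pair as recorded
    return mean(diffs)

# Invented heart rates (beats per minute) for four hypothetical students.
jj = [100, 110, 95, 120]
bk = [90, 105, 100, 100]

observed = mean(a - b for a, b in zip(jj, bk))
simulated = rerandomize_once(jj, bk, random.Random(3))
print(observed, simulated)
```

A swap only flips the sign of that student's difference, so each repetition yields one could-have-been mean difference under the null hypothesis.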
PARTICIPATION
ACTIVITY
1.5.7: Question 6.
Flip a coin for each pair of heart rates and switch the appropriate ones. Recalculate the
difference in heart rates and find the new simulated mean difference. Plot this value on the
board in the classroom along with those from the rest of the class. Where does the actual
statistic you found in question 5 fit in this null distribution? Is it out in the tail?
Question 7.
As you know, it would be better to have many more simulations than what your class just did.
We will do this by using an applet.
Go to the Matched Pairs applet.
Press Clear to erase the default data and then copy and paste the JJvsBicycle data
(both columns, jacks and kicks) into the Data window. Then press Use Data.
Notice that the applet graphs the individual heart rates in each group, along with the
means and standard deviations for the two groups.
Below that, the applet provides a dotplot of the differences in the heart rates in the
sample. Please note that these difference values can be negative numbers because you
are looking at change or difference in heart rates. The "Differences" graph also shows the
mean of the differences and the standard deviation of the differences.
Write down these values in the following table:
Condition | Sample mean \(\overline{x}\) | Sample standard deviation \(s\)
Jumping jacks | \( \overline{x}_{jj} = \) | \( s_{jj} = \)
Bicycle kicks | \( \overline{x}_{bicycle} = \) | \( s_{bicycle} = \)
Diff = JJ - BK | \( \overline{x}_d = \) | \( s_d = \)
Question 8.
The Matched Pairs applet will perform the simulation similar to what you did with flipping a
coin.
Check the Randomize box.
Set the number of times to Randomize to 1 and press Randomize.
Once the coin tosses have determined which heart rate will be in which column, the
applet displays the rerandomized data (the colors show you the original column for each
observation, so you should see a mix in each group now).
This could-have-been value for the mean difference is added to the Average Difference
graph.
Write down these could-have-been values for the re-randomized data:
Condition | Rerandomized sample mean \(\overline{x}\) | Rerandomized sample standard deviation \(s\)
Jumping jacks | \( \overline{x}_{jj} = \) | \( s_{jj} = \)
Bicycle kicks | \( \overline{x}_{bicycle} = \) | \( s_{bicycle} = \)
Diff = JJ - BK | \( \overline{x}_d = \) | \( s_d = \)
PARTICIPATION
ACTIVITY
1.5.8: Question 8a.
How does the actual mean difference compare to your simulated mean difference: More
extreme/Less extreme/Similar.
PARTICIPATION
ACTIVITY
1.5.9: Question 8b.
How are you deciding?
Question 9.
Update the number of times to Randomize to 99 (for a total of 100 repetitions), and press
Randomize. Consider the graph "Average Difference" that the applet has created.
PARTICIPATION
ACTIVITY
1.5.10: Question 9a.
How many dots are in this graph?
PARTICIPATION
ACTIVITY
1.5.11: Question 9b.
What does each dot represent?
The table below summarizes the key aspects of the simulation:
Null hypothesis = Long-run average difference in heart rates is 0
One repetition = Rerandomizing (possibly swapping) exercise heart rates within students
Statistic = Average difference in heart rates in the sample
Question 10.
To see many more possible values of the sample mean difference that could have been,
IF jumping jacks and bicycle kick rates were swappable, update the number of times to
Randomize to 900 and press Randomize (for a total of 1,000 repetitions). Describe the
updated "Average Difference" graph with the 1,000 samples or repetitions, with regard to the
following features.
PARTICIPATION
ACTIVITY
1.5.12: Question 10a.
Shape:
PARTICIPATION
ACTIVITY
1.5.13: Question 10b.
About what number is this graph centered? Explain why you were expecting this.
PARTICIPATION
ACTIVITY
1.5.14: Question 10c.
This graph also reports a value for standard deviation, SD. Report this value and give a simple
interpretation of this value, as in, "What is this value measuring?"
PARTICIPATION
ACTIVITY
1.5.15: Question 11.
You now should have generated 1,000 possible values of the mean difference in jumping
jacks and bicycle kick heart rates that were simulated assuming the null hypothesis was true
and these rates were the same, on average. How does the observed mean difference from
your data (as reported in question 5 and question 7) compare to these simulated values? Is
an average difference in heart rates like that observed in the actual study unlikely to happen
by chance alone if jumping jacks and bicycle kick heart rates are the same, on average? How
are you deciding?
PARTICIPATION
ACTIVITY
1.5.16: Question 12.
To quantify the strength of evidence against the null hypothesis, you can find the p-value. Go
back to the Matched Pairs applet. In the Count Samples box, make an appropriate selection
from the drop-down menu (Hint: In what direction does your alternative hypothesis look?)
and enter the appropriate number in the box (Hint: At least as extreme as what number?).
Report the approximate p-value.
PARTICIPATION
ACTIVITY
1.5.17: Question 13.
Use the p-value to state a conclusion in the context of the problem. Be sure to comment on
statistical significance. Can you conclude that there is strong evidence that jumping jack
heart rate and bicycle kick heart rate differ? Why or why not? Can you draw a cause-and-effect
conclusion? To what population are you willing to generalize the results?
PARTICIPATION
ACTIVITY
1.5.18: Question 14.
Alternatively, you can summarize the strength of evidence using a standardized statistic. Find
the standardized statistic and confirm that the strength of evidence you receive from the
p-value is approximately the same as with the standardized statistic.
PARTICIPATION
ACTIVITY
1.5.19: Question 15.
We can again use the 2SD method to approximate a 95% confidence interval for the mean
difference in heart rates between the two exercises. The overall structure of the
formula is the same: statistic ± 2 (SD), where the statistic is the sample mean difference in
heart rates for your class and SD is the standard deviation of your null distribution when you
did 1,000 repetitions in the applet (not the standard deviation from the data). Use these
numbers to find an approximate 95% confidence interval for the mean difference in heart
rates for those who do the two exercises your class did.
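The arithmetic of the 2SD method is simple enough to sketch in Python. The function name `two_sd_interval` is hypothetical, and the numbers in the usage line are placeholders rather than results from any class; substitute your own sample mean difference and the SD of your simulated null distribution.

```python
def two_sd_interval(statistic, null_sd):
    """Approximate 95% confidence interval: statistic ± 2 (SD)."""
    margin = 2 * null_sd             # margin of error under the 2SD method
    return statistic - margin, statistic + margin

# Placeholder inputs only: a mean difference of 4.0 bpm and a null SD of 1.5 bpm.
low, high = two_sd_interval(4.0, 1.5)
print(low, high)
```

The SD passed in should be the one reported on the simulated "Average Difference" graph, not the standard deviation of the raw differences.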
PARTICIPATION
ACTIVITY
1.5.20: Question 16.
Provide an interpretation of this confidence interval, being sure to explain the parameter in
this context.
As you may have already noticed, the strategy we used to find the p-value for this study is the same 3S
strategy that is found in Step 4 of our statistical investigation method and has been used in analyses involving
one or two groups:
Statistic: Compute the statistic in the sample. In this case, the statistic you looked at was the observed
mean difference in heart rates.
Simulate: Identify a chance model that reflects the null hypothesis. To simulate what could have been if
the null hypothesis is true, you can toss a coin for each student, and if it lands heads, swap the two heart
rates recorded for that student. If the coin lands tails, do not swap the heart rates. Repeat this process
1,000 times, recording the mean difference in heart rates each time and thus obtaining a distribution of
these mean differences that were simulated assuming the null hypothesis were true.
Strength of evidence: If your actual observed statistic falls in the tail of the null distribution, then you
have strong evidence that there is a genuine difference in the average heart rates between the two
exercises.
Note: The heart rates from both exercises were paired on the same individuals, and so you used a simulation
method that lets you use this information.
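The whole 3S strategy for paired data can be sketched in Python along the following lines. This is an illustration under stated assumptions, not the applet's actual code: the heart rates are invented for ten hypothetical students, the function name is made up, and the p-value is computed as two-sided (counting simulated mean differences at least as far from 0 as the observed one).

```python
import random
from statistics import mean, pstdev

def paired_randomization_test(first, second, reps=1000, seed=None):
    """Statistic, Simulate, Strength of evidence for paired data."""
    rng = random.Random(seed)
    # Statistic: observed mean of the paired differences.
    observed = mean([a - b for a, b in zip(first, second)])
    # Simulate: a coin flip per pair decides whether the two values are swapped.
    null_means = []
    for _ in range(reps):
        sim = [(b - a) if rng.random() < 0.5 else (a - b)
               for a, b in zip(first, second)]
        null_means.append(mean(sim))
    # Strength of evidence: proportion of repetitions at least as extreme.
    extreme = sum(abs(m) >= abs(observed) - 1e-12 for m in null_means)
    return observed, null_means, extreme / reps

# Invented heart rates (beats per minute) for ten hypothetical students.
jj = [66, 74, 85, 96, 104, 116, 125, 134, 146, 155]
bk = [60, 70, 80, 90, 100, 110, 120, 130, 140, 150]

observed, null_means, p_value = paired_randomization_test(jj, bk, seed=42)
print(observed, pstdev(null_means), p_value)
```

The standard deviation of `null_means` plays the role of the SD shown on the "Average Difference" graph.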
Let’s check out how things would have worked had we ignored the pairing and analyzed the data as if the
jumping jacks heart rates and bicycle kicks heart rates had come from two totally different samples that were
independent of each other.
Question 17.
Go to the Multiple Means applet and analyze the data as though we have two independent
samples, as you did in a previous chapter. (The heart rate data are unstacked, meaning the
heart rates for the two exercises are given in two columns. Make sure you have the unstacked
box checked before you paste in your data.)
PARTICIPATION
ACTIVITY
1.5.21: Question 17a.
With regard to the graph of the distribution of "Shuffled Differences in Means": What are the
mean and SD?
PARTICIPATION
ACTIVITY
1.5.22: Question 17b.
Find and report the approximate p-value.
PARTICIPATION
ACTIVITY
1.5.23: Question 18.
Compare the SD of the null distribution obtained using the two-independent-samples method
to that obtained using the paired samples method. Which SD is larger?
PARTICIPATION
ACTIVITY
1.5.24: Question 19.
Compare the p-value obtained using the two-independent-samples method to that obtained
using the paired samples method. Which p-value is smaller and hence provides stronger
evidence against the null hypothesis of no difference?
Note: Using a paired samples method will often give a smaller p-value and hence stronger evidence against the
null hypothesis than the two-independent-samples method. Perhaps this is what you found in Question 19.
This would happen if students with high heart rates after doing jumping jacks tended to have high heart rates
after doing bicycle kicks, and students with low heart rates after doing jumping jacks tended to have low heart
rates after doing bicycle kicks. This would make the variability of the differences in heart rates small. However, this
may not have happened for students in your class. If the variability within your jumping jack data or within your
bicycle kick data is small but there is a lot of variability in the differences, you could get a smaller p-value using the
two-independent-samples method.
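To see why the two methods can disagree, the Python sketch below compares the SD of the null distribution under each approach. Everything here is an assumption made for illustration: the heart rates are invented so that students differ a lot from one another but each student's two rates are close, and the independent-samples simulation is modeled as shuffling all values between the two groups, which may differ in detail from what the Multiple Means applet does.

```python
import random
from statistics import mean, pstdev

def null_sd_paired(first, second, reps, rng):
    """SD of simulated mean differences when values are swapped within pairs."""
    sims = []
    for _ in range(reps):
        sims.append(mean([(b - a) if rng.random() < 0.5 else (a - b)
                          for a, b in zip(first, second)]))
    return pstdev(sims)

def null_sd_independent(first, second, reps, rng):
    """SD of simulated differences in means when all values are shuffled
    between the two groups, ignoring the pairing."""
    pooled = list(first) + list(second)
    n = len(first)
    sims = []
    for _ in range(reps):
        rng.shuffle(pooled)
        sims.append(mean(pooled[:n]) - mean(pooled[n:]))
    return pstdev(sims)

# Invented heart rates (beats per minute) for ten hypothetical students.
jj = [66, 74, 85, 96, 104, 116, 125, 134, 146, 155]
bk = [60, 70, 80, 90, 100, 110, 120, 130, 140, 150]

rng = random.Random(7)
sd_paired = null_sd_paired(jj, bk, 1000, rng)
sd_indep = null_sd_independent(jj, bk, 1000, rng)
print(sd_paired, sd_indep)
```

For data like these, the paired null distribution is far narrower, so the same observed mean difference sits much further out in its tail.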
1.6 Supplemental Exploration: Theory-based
approach for analyzing paired data
Comparing auction formats
An economist at Vanderbilt University devised a study to compare different types of online auctions. In one
experiment he compared a Dutch auction to a first-price sealed bid auction. In the Dutch auction the item for
sale starts at a very high price and is lowered gradually until someone finds the price low enough to buy. In the
first-price sealed bid auction each bidder submits a single sealed bid before a particular deadline. After the
deadline, the person with the highest bid wins. The researcher auctioned off collectible trading cards from the
game Magic: The Gathering. He placed pairs of identical cards up for auction; one would go into the Dutch
auction and the other to the first-price sealed bid auction. He then looked at the difference in the prices he
received on the pair. He repeated this for a total of 88 pairs.
Question 1.
Before we look at the data that were collected and start the analysis, let’s make sure you
understand the study design.
PARTICIPATION
ACTIVITY
1.6.1: Question 1a.
Explain why the price data should be analyzed using paired samples as opposed to two
independent samples.
PARTICIPATION
ACTIVITY
1.6.2: Question 1b.
What makes a pair?
PARTICIPATION
ACTIVITY
1.6.3: Question 1c.
What is the explanatory variable? Is it categorical or quantitative?
PARTICIPATION
ACTIVITY
1.6.4: Question 1d.
What is the response variable? Is it categorical or quantitative?
PARTICIPATION
ACTIVITY
1.6.5: Question 2.
State the relevant hypotheses in words. (Use a two-sided alternative.)
PARTICIPATION
ACTIVITY
1.6.6: Question 3.
Define the parameter of interest and give the symbol that should be assigned to it.
PARTICIPATION
ACTIVITY
1.6.7: Question 4.
State the relevant hypotheses in symbols.
Question 5.
The data for the auction can be found in the file Auction. The selling prices for the Dutch
auction are labeled Dutch and the prices for the first-price sealed bid auction are labeled FP. Paste
these data into the Matched Pairs applet.
PARTICIPATION
ACTIVITY
1.6.8: Question 5a.
What is the sample mean price for the Dutch auction?
PARTICIPATION
ACTIVITY
1.6.9: Question 5b.
What is the sample mean price for the first-price sealed bid auction?
PARTICIPATION
ACTIVITY
1.6.10: Question 5c.
What are the mean and standard deviation for the difference in the two prices?
PARTICIPATION
ACTIVITY
1.6.11: Question 5d.
Determine a p-value using the applet. Explain how you did so.
Notice that the simulated null distribution is bell-shaped. This is no coincidence. A theory-based method exists
that predicts this to occur when certain validity conditions are met.
Validity condition.
Validity conditions for theory-based analysis of paired data: Theory-based methods of
inference will work well for paired data if the distribution of differences is symmetric,
or if you have at least 20 pairs (i.e., at least 20 differences) and the distribution of
the sample differences is not strongly skewed. This test is known as a paired t-test.
PARTICIPATION
ACTIVITY
1.6.12: Question 6.
Are the validity conditions met for these data? Explain.
Question 7.
Because the sample is large enough without strong skewness in the distribution of
differences, we can use a theory-based approach. We will do this by using the Theory-Based
Inference applet. To do this:
Open the Theory-Based Inference applet.
Choose One mean from the pull-down menu.
Enter the sample size, sample mean, and sample standard deviation for the differences
as you found in question 5.
Check the box for Test of Significance.
Enter the appropriate information for the hypotheses.
Make sure the appropriate sign for the alternative hypothesis is chosen.
Press Calculate.
PARTICIPATION
ACTIVITY
1.6.13: Question 7a.
What is the value of the standardized statistic? What does that number tell you?
PARTICIPATION
ACTIVITY
1.6.14: Question 7b.
In the light of the value of the standardized statistic, do you expect the p-value to be small or
not small? How are you deciding?
PARTICIPATION
ACTIVITY
1.6.15: Question 7c.
What is the value of the p-value? Is it similar to the p-value found using the simulation-based
method?
Question 8.
Recall that the Theory-Based Inference applet can also produce confidence intervals for the
parameter of interest.
Go to the applet, check the Confidence interval box and let the confidence level be 95%.
Press Calculate CI.
PARTICIPATION
ACTIVITY
1.6.16: Question 8a.
Report the 95% confidence interval.
PARTICIPATION
ACTIVITY
1.6.17: Question 8b.
Interpret the 95% confidence interval in context.
PARTICIPATION
ACTIVITY
1.6.18: Question 8c.
Does the 95% confidence interval agree with your conclusion when using the p-value? How
are you deciding?
PARTICIPATION
ACTIVITY
1.6.19: Question 9.
Based on the analysis in question 7 and question 8, state your conclusion in the context of
the study. Be sure to comment on: Statistical significance: Do the data provide evidence that
the price of cards sold in a Dutch auction differs, on average, from the price of cards sold in a
first-price sealed bid auction? How are you deciding? Estimation: Find and interpret a 95% confidence interval.
Causation: Can you conclude causation? If yes, what causes what? If not, how are you
deciding? Generalization: Can you extend the results of this study? Other kinds of cards?
Other types of items? Anything sold in an auction format on the Internet? How are you
deciding?
1.7 Investigation: Filtering water in Cameroon
E. coli rates over time
Students and professors from Hope College installed water filters in a rural village in
Cameroon as part of a program to improve community health. The village had no electricity or
water distribution system, so villagers got water directly from a stream.
Students working on this project examined the quality of the filters by looking at many
different variables, including: general observations, filter observations, microbiology
observations, household practice observations, user perceptions, and water source
observations. When making inferences, the water filters considered are a sample of all filters
that could be constructed if this pilot project were expanded to other villages. Thus, inference
is to an as-yet-unbuilt, larger population of filters.
Step 1: Ask a research question.
This investigation will have several research questions. The first is: On average, does a significant
difference in E. coli counts exist between water that has just been filtered and water
that is sitting in the bottom of a filter after it was filtered the previous day? The students
measured E. coli counts (per 100 mL) on the first and second day after water was filtered for
14 different water filters.
PARTICIPATION
ACTIVITY
1.7.1: Step 2: Design a study and collect data.
What are the observational units?
PARTICIPATION
ACTIVITY
1.7.2: Step 2: Design a study and collect data.
What are the variables that were measured/recorded on each unit?
PARTICIPATION
ACTIVITY
1.7.3: Step 2: Design a study and collect data.
Is there an explanatory/response relationship for the variables?
PARTICIPATION
ACTIVITY
1.7.4: Step 2: Design a study and collect data.
Are the variables categorical or quantitative?
PARTICIPATION
ACTIVITY
1.7.5: Step 2: Design a study and collect data.
Are the first day's E. coli counts independent of or dependent on the second day's E. coli
counts? Based on your answer, is this an independent samples or paired design?
PARTICIPATION
ACTIVITY
1.7.6: Step 2: Design a study and collect data.
Could the sample of size 14 be large enough to give a valid p-value from a theory-based test
of significance?
PARTICIPATION
ACTIVITY
1.7.7: Step 2: Design a study and collect data.
In words, what are the null and alternative hypotheses?
Summary statistics.
Displayed below are the summary statistics for E. coli counts for day 1, day 2, and the
differences \(d = (\text{day 1}) - (\text{day 2})\).
            | sample size | mean    | standard deviation
Day 1       | 14          | 43.479  | 64.185
Day 2       | 14          | 94.786  | 82.456
Differences | 14          | -51.307 | 58.687
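As a rough check on the summary statistics above, the following Python sketch recomputes the mean difference from the two daily means and the theory-based standard error \(s_d/\sqrt{n}\) for the differences. The variable names are illustrative and are not part of the zyBooks tools. The independent-samples standard error is computed only for contrast, because it ignores the pairing of the two measurements on each filter.

```python
import math

# Summary statistics copied from the table above (E. coli counts per 100 mL).
n = 14
mean_day1, sd_day1 = 43.479, 64.185
mean_day2, sd_day2 = 94.786, 82.456
mean_diff, sd_diff = -51.307, 58.687

# The mean of the differences equals the difference of the two means.
diff_of_means = mean_day1 - mean_day2

# Paired analysis: standard error and standardized statistic for the differences.
se_paired = sd_diff / math.sqrt(n)
t_paired = (mean_diff - 0) / se_paired

# For contrast only: the standard error that would apply to two independent samples.
se_independent = math.sqrt(sd_day1**2 / n + sd_day2**2 / n)

print(f"difference of means:        {diff_of_means:.3f}")
print(f"paired SE:                  {se_paired:.3f}")
print(f"paired standardized stat:   {t_paired:.3f}")
print(f"independent-samples SE:     {se_independent:.3f}")
```

Note that the standard deviation of the differences cannot be recovered from the two daily standard deviations alone, which is why the table reports it separately.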
PARTICIPATION
ACTIVITY
1.7.8: Step 3: Explore the data.
What are the average E. coli counts for each day? Did the E. coli counts in the sample increase or
decrease on average from day 1 to day 2?
PARTICIPATION
ACTIVITY
1.7.9: Step 3: Explore the data.
What is the average difference (day 1 - day 2) in E. coli counts? Does the sign of
the difference match the increase/decrease between day 1 and day 2?
Study data.
The data set for the study is available as a .csv file and can be uploaded into the multiple
means tool below. When doing so, you should load columns as groups and indicate that the
groups are matched pairs.
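One common way to carry out a simulation-based analysis for matched pairs is to randomly swap the two values within each pair, which is equivalent to randomly flipping the sign of each difference, and then recompute the mean difference. The Python sketch below illustrates that idea. It is an assumption about the general technique, not a description of how the zyBooks tool is implemented, and the day 1 and day 2 values are hypothetical placeholders rather than the contents of the study's .csv file. The function name and its parameters are also made up for illustration.

```python
import random

def paired_randomization_pvalue(first, second, reps=10_000, seed=0):
    """Two-sided randomization test for matched pairs using the mean difference."""
    diffs = [a - b for a, b in zip(first, second)]
    observed = sum(diffs) / len(diffs)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    extreme = 0
    for _ in range(reps):
        # Swapping the two values within a pair flips the sign of its difference.
        shuffled = [d if rng.random() < 0.5 else -d for d in diffs]
        simulated = sum(shuffled) / len(shuffled)
        # Count simulated statistics at least as extreme as the observed one.
        if abs(simulated) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / reps

# Hypothetical E. coli counts for 8 filters (NOT the study data).
day1 = [10, 25, 3, 40, 18, 7, 55, 12]
day2 = [30, 60, 20, 45, 50, 15, 90, 20]

observed_mean_diff, p_value = paired_randomization_pvalue(day1, day2)
print(f"observed mean difference: {observed_mean_diff:.2f}")
print(f"approximate two-sided p-value: {p_value:.4f}")
```

The p-value is the proportion of simulated mean differences that are at least as far from 0 as the observed mean difference, so it changes slightly with the seed and the number of repetitions.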
PARTICIPATION
ACTIVITY
1.7.10: Two means simulator.
Instructions: Samples: Upload a .csv file, then pick variables. Load columns as variables or groups, then pick the explanatory (group) variable and the response variable.
PARTICIPATION
ACTIVITY
1.7.11: Step 4: Draw inferences beyond the data.
Conduct the appropriate test of significance. What are your p-value and 95% confidence
interval?
PARTICIPATION
ACTIVITY
1.7.12: Step 5: Formulate conclusions.
Based on your p-value and confidence interval, what conclusions can you draw?
PARTICIPATION
ACTIVITY
1.7.13: Step 5: Formulate conclusions.
What broader population are you willing to generalize your findings to, if any? Is there a
cause-and-effect relationship between the variables?
PARTICIPATION
ACTIVITY
1.7.14: Step 6: Look back and ahead.
Discuss the study design. Does pairing make sense? What would you do differently to
improve upon this study? What further research would you propose based on your findings
from this study?
E. coli rates for different types of filters
The filters installed contained a diffuser plate, fine sand, coarse sand, and gravel. The
filters had different amounts of sand and were split into two groups: filters with more than 2
inches of sand and filters with less than 2 inches of sand. You will
investigate the difference in E. coli counts when filtering the water with the two types of filters.
In this sample, we have results from 19 filters, 14 with more than 2 inches of sand and 5 with
less than 2 inches of sand.
PARTICIPATION
ACTIVITY
1.7.15: Investigating further.
What are the explanatory and response variables, and what type is each?
PARTICIPATION
ACTIVITY
1.7.16: Investigating further.
Are the first group's E. coli counts independent of or dependent on the
second group's E. coli counts? Based on your answer, is this an independent samples or
paired design?
PARTICIPATION
ACTIVITY
1.7.17: Investigating further.
Could the sample of 19 filters split into groups of size 14 and 5 be large enough to give a valid
p-value from a theory-based test of significance?
Study data.
The data set for the study is available as a .csv file and can be uploaded into the multiple
means tool below. When doing so, you should load columns as variables.
PARTICIPATION
ACTIVITY
1.7.18: Two means simulator.
Instructions: Samples: Upload a .csv file, then pick variables. Load columns as variables or groups, then pick the explanatory (group) variable and the response variable.
PARTICIPATION
ACTIVITY
1.7.19: Investigating further.
What are the average E. coli counts and standard deviation of E. coli counts for each group?
Is the E. coli level higher or lower with the higher level of sand in the filter?
PARTICIPATION
ACTIVITY
1.7.20: Investigating further.
Conduct the appropriate test of significance. What are your p-value and 95% confidence
interval? What is your conclusion?
Water filter flow rate
As a general rule, the flow rate of a water filter in good working condition should be around
1,000 mL/min. Did the water filters have an average flow rate that is significantly different from
1,000 mL/min?
PARTICIPATION
ACTIVITY
1.7.21: Investigating further.
What are some similarities and differences between this research question and the previous
research question about water filters?
Investigating further.
The 23 water filters' flow rates have a sample mean \(\overline{x}\) = 913.57 and sample
standard deviation \(s\) = 582.89. Graphed below is a dotplot showing the distribution of the
flow rates.
PARTICIPATION
ACTIVITY
1.7.22: Investigating further.
Describe the distribution of flow rates.
You can use the one mean tool below to carry out the appropriate test of significance.
Assume that a hypothetical population of water filters has normally distributed flow rates
with population mean 1000. Estimate the population standard deviation \(\sigma\) as \(s\) = 582.89.
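For comparison with the simulator, a theory-based version of this test can be sketched in Python using only the summary statistics given above (23 filters, \(\overline{x}\) = 913.57, \(s\) = 582.89, hypothesized mean 1000). This is a minimal sketch that assumes a one-sample t-test with \(n-1\) degrees of freedom is appropriate. The helper functions `t_pdf` and `two_sided_p` are written here for illustration and approximate the p-value by numerical integration of the t density, so they are not part of any zyBooks tool.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return max(0.0, 1 - 2 * area_zero_to_t)

# Summary statistics from the text; the hypothesized mean is 1,000 mL/min.
n, x_bar, s, mu_0 = 23, 913.57, 582.89, 1000

se = s / math.sqrt(n)        # theory-based standard error of the sample mean
t_stat = (x_bar - mu_0) / se  # standardized statistic
p_value = two_sided_p(t_stat, df=n - 1)

# Rough 95% interval using the 2SD shortcut, with SE standing in for the SD of the statistic.
ci_2sd = (x_bar - 2 * se, x_bar + 2 * se)

print(f"SE = {se:.2f}, t = {t_stat:.3f}, two-sided p-value = {p_value:.3f}")
print(f"2SD interval: ({ci_2sd[0]:.1f}, {ci_2sd[1]:.1f})")
```

The simulator and this sketch should give similar, though not identical, p-values, because the simulation-based p-value varies from run to run.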
PARTICIPATION
ACTIVITY
1.7.23: One mean simulator with confidence intervals.
Instructions: Samples: Enter manually, or upload a file then pick variable(s). Set the population distribution, population mean, and population standard deviation.
PARTICIPATION
ACTIVITY
1.7.24: Investigating further.
Conduct the appropriate test of significance. What are your p-value and 95% confidence
interval? What is your conclusion?
PARTICIPATION
ACTIVITY
1.7.25: Investigating further.
Do your results imply that all the filters have flow rates of about 1,000 mL/min? Does testing
a single mean really tell you what you want to know about these filters?
1.8 Tools, data, and formulas
Learning Goals
Distinguish between the following study designs for comparing two groups on a
quantitative response: observational, randomly assigned independent groups,
matched pairs, repeated measures.
Perform and interpret a simulation-based analysis of paired data for a quantitative
response.
Perform and interpret a theory-based analysis of paired data for a quantitative
response.
Tools
The standalone tool below is the previously introduced simulation zyTool for two means (8.59) with the
capabilities for two independent samples plus added functionality for matched pairs.
PARTICIPATION
ACTIVITY
1.8.1: Two means simulator.
Instructions: Samples: Upload a .csv file, then pick variables. Load columns as variables or groups, then pick the explanatory (group) variable and the response variable.
Table 1.8.1: Multiple means simulator relevant features by section.
Section | Feature
Simulation-based approach for analyzing paired data | Simulation-based analysis for matched pairs using the mean difference
Theory-based approach for analyzing paired data | Theory-based analysis for matched pairs using the mean difference
Tools, data, and formulas | Input sample by uploading a .csv file
Data
Table 1.8.2: References.
Section | Data | Reference
Example: Paired designs | | Strauss (2009)
Supplemental Exploration: Paired designs | FirstBase.csv | Woodward (1970)
Example: Simulation-based approach for analyzing paired data | FirstBase.csv | Woodward (1970)
Supplemental Exploration: Simulation-based approach for analyzing paired data | JJvsBicycle.csv |
Example: Theory-based approach for analyzing paired data | DadJokes.csv | Cai et al. (Curr. Biol. 2019)
Supplemental Exploration: Theory-based approach for analyzing paired data | Auction.csv | Lucking-Reiley (Am. Econ. Rev. 1999)
Investigation: Filtering water in Cameroon | Ecoli_time.csv, Ecoli_sand.csv, Ecoli_flow.csv | Hope College (2007)
Formulas for two means
Independent samples:
\(\mu_1\) and \(\mu_2\) are the population means for the two groups.
\(n_1\) and \(n_2\) are the sample sizes for the two groups.
\(\overline{x}_1\) and \(\overline{x}_2\) are the sample means for the two groups.
\(s_1\) and \(s_2\) are the sample standard deviations for the two groups.
\(SD(\overline{x}_1 - \overline{x}_2)\) is the standard deviation of the statistic.
\(\text{SE} = \sqrt{ \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} }\) is the theory-based standard error that approximates the SD of the null distribution and \(SD(\overline{x}_1 - \overline{x}_2)\).
\(\text{standardized statistic} = \frac{\text{statistic} - \text{mean of the null distribution}}{\text{SD of the null distribution}}\) is the number of standard deviations the statistic lies from the mean of the null distribution.
\(t = \frac{\big(\overline{x}_1 - \overline{x}_2\big) - 0}{\text{Simulated SD of null distribution}}\) is the simulation-based standardized statistic for two independent means.
\(t = \frac{\big(\overline{x}_1 - \overline{x}_2\big) - 0}{\text{SE}} = \frac{\big(\overline{x}_1 - \overline{x}_2\big) - 0}{\sqrt{ \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} }}\) is the theory-based standardized statistic for two independent means.
\(\text{statistic} \pm \text{margin of error} = \text{statistic} \pm \text{(multiplier)} \times \text{(SD of statistic)}\) is a confidence interval.
\((\overline{x}_1 - \overline{x}_2) \pm 2 \times SD(\overline{x}_1 - \overline{x}_2)\) is a 2SD CI approximating a 95% confidence interval estimating \(\mu_1 - \mu_2\).
\((\overline{x}_1 - \overline{x}_2) \pm \text{(multiplier)} \times (\text{SE})\) is a theory-based confidence interval estimating \(\mu_1 - \mu_2\), where the multiplier comes from a t distribution with \(\frac{ \big( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \big)^2 }{ \frac{1}{n_1-1}\big( \frac{s_1^2}{n_1} \big)^2 + \frac{1}{n_2-1}\big( \frac{s_2^2}{n_2} \big)^2 }\) degrees of freedom.
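The independent-samples formulas above can be sketched in Python as follows. The function name `two_sample_summary` and the summary statistics passed to it are hypothetical and chosen only to show the arithmetic; the group sizes 14 and 5 echo the sand-depth comparison, but the means and standard deviations are not study results.

```python
import math

def two_sample_summary(n1, mean1, s1, n2, mean2, s2):
    """Return the theory-based SE, standardized statistic, and degrees of freedom."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)
    t = ((mean1 - mean2) - 0) / se
    # Degrees of freedom from the formula above (often called Welch-Satterthwaite).
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, t, df

# Hypothetical summary statistics (NOT taken from the study data).
se, t, df = two_sample_summary(n1=14, mean1=40.0, s1=60.0, n2=5, mean2=90.0, s2=80.0)
print(f"SE = {se:.3f}, t = {t:.3f}, df = {df:.2f}")
```

The degrees of freedom always fall between the smaller of \(n_1 - 1\) and \(n_2 - 1\) and the pooled value \(n_1 + n_2 - 2\), which gives a quick sanity check on the calculation.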
Matched pairs:
\(d = (\text{group 1}) - (\text{group 2})\) is the variable of differences for each matched pair.
\(\mu_d\) is the population mean difference.
\(n\) is the sample size (number of matched pairs).
\(\overline{x}_d\) is the sample mean of the differences.
\(s_d\) is the sample standard deviation of the differences.
\(SD(\overline{x}_d)\) is the standard deviation of the statistic.
\(\text{SE} = s_d / \sqrt{n}\) is the theory-based standard error that approximates the SD of the null distribution and \(SD(\overline{x}_d)\).
\(\text{standardized statistic} = \frac{\text{statistic} - \text{mean of the null distribution}}{\text{SD of the null distribution}}\) is the number of standard deviations the statistic lies from the mean of the null distribution.
\(t = \frac{\overline{x}_d - 0}{\text{Simulated SD of null distribution}}\) is the simulation-based standardized statistic for matched pairs.
\(t = \frac{\overline{x}_d - 0}{\text{SE}} = \frac{\overline{x}_d - 0}{s_d/\sqrt{n}}\) is the theory-based standardized statistic for matched pairs.
\(\text{statistic} \pm \text{margin of error} = \text{statistic} \pm \text{(multiplier)} \times \text{(SD of statistic)}\) is a confidence interval.
\(\overline{x}_d \pm 2 \times SD(\overline{x}_d)\) is a 2SD CI approximating a 95% confidence interval estimating \(\mu_d\).
\(\overline{x}_d \pm \text{(multiplier)} \times (\text{SE})\) is a theory-based confidence interval estimating \(\mu_d\), where the multiplier comes from a t distribution with \(n-1\) degrees of freedom.
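A minimal Python sketch of the matched-pairs formulas, starting from raw paired values, might look like the following. The function name `paired_summary` and the small data set are hypothetical and serve only to show each quantity being computed. The 2SD interval here uses SE in place of \(SD(\overline{x}_d)\), which is an assumption made so that the sketch needs no simulation.

```python
import math
from statistics import mean, stdev

def paired_summary(group1, group2):
    """Compute the matched-pairs quantities defined above from raw paired data."""
    d = [a - b for a, b in zip(group1, group2)]  # differences, group 1 minus group 2
    n = len(d)
    x_bar_d = mean(d)
    s_d = stdev(d)                 # sample standard deviation (divides by n - 1)
    se = s_d / math.sqrt(n)
    t = (x_bar_d - 0) / se
    ci_2sd = (x_bar_d - 2 * se, x_bar_d + 2 * se)
    return {"n": n, "mean_d": x_bar_d, "s_d": s_d, "se": se,
            "t": t, "df": n - 1, "ci_2sd": ci_2sd}

# Hypothetical paired measurements (NOT from any data set in this chapter).
before = [5, 7, 9, 11]
after = [4, 5, 6, 7]

result = paired_summary(before, after)
for name, value in result.items():
    print(name, value)
```

Because every quantity is computed from the single column of differences, the matched-pairs analysis is a one-mean analysis applied to \(d\).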