STAT251
Fundamentals of Biostatistics
LABORATORY NOTES Week 5
Probability Distributions
Aim:
The focus of this lab is computing theoretical probabilities, means and standard deviations for discrete random
variables, and simulating random observations from discrete probability distributions.
1. Discrete probability distributions
Reference Table:
Refer to this table to help you complete the Logbook questions:
Table 1: Some properties of discrete distributions.

| | General discrete distribution | Binomial distribution (Week 4 Lectures) | Poisson distribution (Week 4 Lectures) |
|---|---|---|---|
| Assumptions | 1. $0 \le P(X = x) \le 1$; 2. $\sum P(X = x) = 1$ | 1. Two possible outcomes, success and failure; 2. Fixed number of trials, $n$; 3. Trials independent; 4. Fixed probability of success | In a sufficiently short time interval: 1. Only 0 or 1 events can occur; 2. Prob. of 1 event is proportional to the length of the interval; 3. Numbers of events in non-overlapping intervals are independent |
| Probability $P(X = x)$ | $P(X = x)$ | $p(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$, where ${}^{n}C_{x} = \binom{n}{x} = \frac{n!}{x!\,(n - x)!}$ | $p(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$ |
| Mean (expected value) $\mu$ | $\mu = E(X) = \sum x \cdot p(x)$ | $E(X) = np$ | $E(X) = \lambda$ |
| Standard deviation $\sigma$ | $\sigma = \sqrt{\sum (x - \mu)^{2} \cdot p(x)}$ | $\sqrt{np(1 - p)}$ | $\sqrt{\lambda}$ |
| Variance $\sigma^{2}$ | $\sigma^{2} = \sum (x - \mu)^{2} \cdot p(x)$ | $np(1 - p)$ | $\lambda$ |
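As a rough illustration of the formulas in Table 1, the following R sketch computes the mean, variance and standard deviation of a general discrete distribution from its probability table, and compares the binomial formula with R's built-in `dbinom` function. The distribution stored in `x` and `px` is a made-up example and is not taken from this lab.

```r
# Hypothetical discrete distribution (illustrative values only)
x  <- c(0, 1, 2, 3)
px <- c(0.1, 0.2, 0.3, 0.4)

# Check the two assumptions for a general discrete distribution
stopifnot(all(px >= 0 & px <= 1), isTRUE(all.equal(sum(px), 1)))

mu     <- sum(x * px)            # mean: sum of x * p(x)
sigma2 <- sum((x - mu)^2 * px)   # variance: sum of (x - mu)^2 * p(x)
sigma  <- sqrt(sigma2)           # standard deviation

# Binomial probability from the formula, compared with dbinom()
n <- 10; p <- 0.3; k <- 0
by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)
built_in   <- dbinom(k, size = n, prob = p)
```

The two binomial values should agree, which is a quick way to confirm that a hand calculation uses the formula correctly.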
2. The Binomial Distribution
2.1 Binomial Probabilities
Log book questions:
1. Check the capabilities of your calculator, and use it to evaluate
   a. $10!$
   b. $0!$
   c. ${}^{10}C_{2}$
   d. ${}^{10}C_{0}$
2. Bill is very clumsy when it comes to using laboratory equipment. His supervisor has estimated that the probability that Bill will break at least one test tube in any laboratory class in which he takes part is 0.3. The supervisor believes that this occurs independently of what happens in any other of Bill’s laboratory sessions. This semester, Bill will take part in 10 laboratory classes; the supervisor is interested in the number of labs in which at least one test tube is broken.
   a. Can the binomial probability formula be applied in this scenario? Justify your answer.
   b. Define an appropriate binomial random variable X, and state the number of trials n.
   c. If Bill takes part in 10 laboratory classes, use a hand calculator to find the probability that Bill breaks test tubes in
      i. exactly two classes.
      ii. at least two classes.
2.2. Finding Binomial Probabilities using Jamovi
Now we will use the set-up of logbook question 2 to demonstrate how binomial probabilities may be computed with Jamovi. In order to do so, we will need to install the distrACTION module.
Installation instructions:
On the top right corner of Jamovi, click on the Modules button, then select Jamovi library. A list of modules will appear. Click on the Available tab, then scroll down (about halfway) to find the distrACTION module and click Install.
Once the distrACTION module has been installed, it will appear alongside the other modules on the top bar.
To compute binomial probabilities, select distrACTION > Binomial Distribution. The binomial distribution has two parameters: size (the number of trials) and probability (the probability of success). For this example, set size = 10 and probability = 0.3.
Now to calculate the probability of observing x successes, tick the Compute probability option under Function. In the “x1 =” box, enter the value of x (the number of successes) whose probability we want to compute. For this example, we will calculate the probability of having 0, 1, …, up to 10 successes. Note that you can only enter one value of x at a time. The resulting probabilities will appear in the output on the right-hand side.
To reset the number of decimal places:
Click on the 3 dots on the top right of the window (just above the Modules button).
Change Results – Number format to 5.
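For readers who prefer R (which the Appendix already uses through the Rj module), the same probabilities can be sketched with the built-in `dbinom` function. This is an alternative to the Jamovi steps above rather than part of the lab procedure, and the variable names are illustrative.

```r
# Binomial parameters from logbook question 2
size <- 10   # number of trials
prob <- 0.3  # probability of success

x <- 0:10
probs <- dbinom(x, size = size, prob = prob)  # P(X = x) for each x

# Round to 5 decimal places, matching the Jamovi number format set above
prob_table <- data.frame(x = x, p = round(probs, 5))
print(prob_table)
```

Unlike the Jamovi dialog, `dbinom` accepts a whole vector of x values at once. The first row should agree with the value 0.02825 given for x = 0 in the logbook table, and the unrounded probabilities should sum to 1.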
Log book questions:
3. For question 2 above,
   a. Complete the table of outcomes (possible x values) by copying the associated probabilities from Jamovi. From the output check your answers in question 2 above.

| x | P(X = x) |
|---|---|
| 0 | 0.02825 |
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | |
| 8 | |
| 9 | |
| 10 | |
| Total | 1 |

   b. Find the probability that X = 3, P(X = 3).
   c. Find the probability that X is less than 3, P(X < 3).
   d. Find the probability that X is at least 3, P(X ≥ 3).
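Parts (c) and (d) rely on adding individual probabilities and on the complement rule, P(X ≥ 3) = 1 − P(X ≤ 2). A minimal R sketch, assuming the same n = 10 and p = 0.3 as in question 2, could look like this:

```r
n <- 10; p <- 0.3

# P(X < 3) = P(X <= 2): add the individual probabilities ...
p_less_3 <- sum(dbinom(0:2, size = n, prob = p))
# ... or use the cumulative distribution function directly
p_less_3_cdf <- pbinom(2, size = n, prob = p)

# P(X >= 3) by the complement rule
p_at_least_3 <- 1 - p_less_3_cdf
```

Both routes to P(X < 3) should give the same number, so either can be used to check the table entries copied from Jamovi.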
2.3 Expected Value and Standard Deviation of a Binomial Random Variable
The expected value or mean E(X) of a random variable is the value expected on average for a large number of repetitions of the random experiment. Due to sampling variation, an observed value of a random variable is often higher or lower than expected. This variation between observations is described by the standard deviation.
Log book questions:
4. Use your calculator and the appropriate entries in the Reference Table in Section 1 to find the
   a. expected value (mean), and
   b. standard deviation
   of the number of laboratory classes during which Bill breaks at least one test tube.
2.4 What happens when we have data from a sample?
When we have data sampled from a population, we will not directly observe the EXACT expectations or SDs we have calculated from the THEORY.
Why? Remember that those theoretical values are for the whole population, or “in the long run”; hence sampling error (which arises simply because we do not have the FULL POPULATION) will creep in.
We will now check the theoretical calculations in Sections 2.2 and 2.3 by using simulated data. Simulated data are generated by the computer as if they had been sampled from a population with the specified distribution.
Download and open the “Xbinom##.csv” file from Moodle, where “##” is the month you were born. If you are really adventurous (and I suggest doing this after the lab), go to the Appendix, where I show how to simulate the data yourself.
2.4.1 Using simulated Binomial Data
Each dataset contains 3 samples, each of which contains 100 values from the Binomial distribution with number of trials n = 10 and probability of success p = 0.3.
Now that we have a randomly generated sample from the binomial distribution with n = 10 and p = 0.3, we can compute summary statistics and a bar plot for this sample.
Note: these 3 variables are the observed outcomes of a discrete random variable. However, to get the required graphical output in Jamovi, we need to keep the variable type set to Nominal (this is not ideal but necessary within the limitations of the package).
To compute the mean and standard deviation, use Analyses > Exploration > Descriptives: put sample1, sample2, and sample3 into the Variables box. Under Plots, select Bar plot.
Note: These values are particular to your “month” of randomly generated data. Yours are likely to be different from a neighbour’s! Compare with your neighbour. Paste in your own!
Log book questions:
5. For your randomly generated sample1, write down the
   a. Sample mean.
   b. Sample standard deviation.
   c. Comment on these values in comparison to the theoretical mean and standard deviation calculated in Section 2.3.
   d. An estimate for p can be determined from the sample using the sample mean to estimate the mean µ; that is, $\bar{x} \approx np$ can be rearranged to find an estimate of p. Use the sample mean of your sample1 to estimate p as $\hat{p} = \bar{x}/n$. (Note that here, n is the number of binomial trials for each random draw, not the number of random numbers you’ve generated, which Jamovi calls N.)
Optional Log book question:
6. Compare the mean and standard deviation of your sample2 and sample3 against sample1. Estimate p based on the mean of sample2 and do the same for sample3, then compare these estimated values of p to the theoretical value used in 2.2.
Example output:
Descriptives

| | sample1 | sample2 | sample3 |
|---|---|---|---|
| N | 100 | 100 | 100 |
| Missing | 0 | 0 | 0 |
| Mean | 3.3000 | 2.9400 | 3.0400 |
| Standard deviation | 1.3962 | 1.3395 | 1.5037 |
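As a worked illustration of question 5(d), the R sketch below applies $\hat{p} = \bar{x}/n$ to the sample means shown in the example output above. The vector name `sample_means` is introduced here only for illustration, and your own samples will give different values.

```r
n <- 10  # number of binomial trials per observation (not N = 100)

# Sample means copied from the example Descriptives output
sample_means <- c(sample1 = 3.30, sample2 = 2.94, sample3 = 3.04)

p_hat <- sample_means / n   # estimate of p for each sample
print(p_hat)

# Theoretical mean for comparison: n * p with p = 0.3
theoretical_mean <- n * 0.3
```

Each estimate lands near, but not exactly on, the value p = 0.3 used to generate the data, which is the sampling variation described in Section 2.4.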
3. Poisson Distribution
3.1 Poisson Probabilities
Log book questions:
The following example has been taken from Daniel, W. (1999) Biostatistics: A Foundation for Analysis in the Health Sciences, John Wiley & Sons: New York.
In a study of suicides, Gibbons et al. found that the monthly distribution of adolescent suicides between 1977 and 1987 closely followed a Poisson distribution with parameter $\lambda = 2.75$. Let X represent the number of suicides in a month.
7. Find the probabilities that:
   a. There are 2 adolescent suicides in a month.
   b. There are fewer than 2 adolescent suicides in a month.
   c. There are 2 or more adolescent suicides in a month.
   d. What are the mean and the standard deviation of random variable X?
8. How would you answer questions 7 (a) – (d) for a two-month period? (i.e., what is the probability that there is a total of two adolescent suicides in two months?)
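Poisson probabilities can be computed in R with `dpois` and `ppois`, in the same way as the binomial functions. The sketch below is only an illustration: the helper name `poisson_summary` is hypothetical, and the two-month rate of 2 × 2.75 = 5.5 assumes that the rate scales with the length of the interval (assumption 2 for the Poisson distribution in Table 1).

```r
# Hypothetical helper: probabilities and moments for a Poisson(lambda) count
poisson_summary <- function(lambda, k = 2) {
  list(
    exactly_k = dpois(k, lambda),          # P(X = k)
    fewer_k   = ppois(k - 1, lambda),      # P(X < k) = P(X <= k - 1)
    k_or_more = 1 - ppois(k - 1, lambda),  # P(X >= k), complement rule
    mean      = lambda,
    sd        = sqrt(lambda)
  )
}

one_month <- poisson_summary(2.75)
two_month <- poisson_summary(2 * 2.75)  # assumed rate for a two-month period
```

Wrapping the calculation in a function makes it easy to see that only the value of lambda changes between the one-month and two-month questions.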
3.2 OPTIONAL: Simulating Poisson Data
Again, if you feel adventurous, try simulating the data, but I suggest doing it after the lab; see Appendix A.2.
Optional question:
Find the mean and standard deviation and include the ‘bar plot’. Comment on the mean and standard deviation with reference to your answer in question 7(d).
4. General Discrete Random Variable
Log book questions:
9. Consider all the possible outcomes of tossing two 4-sided dice, each labelled from 1 to 4. Now define the random variable X as the maximum of the two numbers obtained on each possible outcome.
   a. Draw a tree diagram to list the 16 possible outcomes of tossing two 4-sided dice.
   b. What are the possible values that X can take?
   c. Determine how frequently each unique value of X appears given the list of all possible outcomes of tossing two 4-sided dice.
   d. Using your answers to (a), (b), and (c), determine the probability distribution of X, and write it in the following table.

| x | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| P(X = x) | | | | |

   e. Calculate the mean, variance and standard deviation of X using the formulae in the Reference Table in Section 1.
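The enumeration in question 9 can also be checked by computer. The following R sketch lists all 16 outcomes, takes the maximum of each pair, and applies the general formulas for the mean, variance and standard deviation. It is intended as a check on hand working, not a replacement for the tree diagram, and the object names are illustrative.

```r
# All 16 equally likely outcomes of two 4-sided dice
outcomes <- expand.grid(die1 = 1:4, die2 = 1:4)

# X is the larger of the two faces
outcomes$X <- pmax(outcomes$die1, outcomes$die2)

# Frequency of each value of X, then probabilities
freq <- table(outcomes$X)
x    <- as.numeric(names(freq))
px   <- as.numeric(freq) / nrow(outcomes)

# Mean, variance and standard deviation from the general formulas
mu     <- sum(x * px)
sigma2 <- sum((x - mu)^2 * px)
sigma  <- sqrt(sigma2)
```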
Continuous Random Variables: The Normal Distribution
Aim:
The aim of this part of the lab is to determine probabilities and quantiles for the normal distribution, and to determine and interpret confidence intervals for population means. The normal and t-distributions are required for some of these intervals.
Note: Starred (*) exercises do not require Jamovi.
1. The Normal Distribution
Mean μ and standard deviation σ.
1.1 Finding probabilities using the properties of the normal curve.
Log book questions*:
1. For the following questions:
   - match the interval with the appropriate diagram of the shaded area under the Normal curve; and
   - use the diagram above to calculate the area shaded.
   i. mean = 0 and sd = 1; between -1 and 3.
   ii. mean = 0 and sd = 1; between -1 and 1.
   iii. mean = 55 and sd = 4; between 47 and 59.
   iv. μ = 30 and σ = 5; between 30 and 35.
   v. μ = 100 and σ = 15; less than or equal to 100.
   vi. μ = 546.6 and σ = 73.1; between 619.7 and 692.8.
[Diagrams a) to f): shaded areas under the Normal curve; images not reproduced.]
1.2 The Standard Normal Distribution
The Standard Normal distribution has a mean μ = 0 and a standard deviation σ = 1.
To calculate probabilities from the normal distribution in Jamovi, we need the distrACTION module.
Instructions:
1. Click on distrACTION > Normal distribution.
2. Specify the mean and standard deviation under Parameters.
3. To compute the probability $P(X \le x_1)$, tick the Compute probability option.
4. Specify the value $x_1$.
5. The resulting probability will appear in the right-hand side output under Results.
Alternatively, you can also use the online calculator in Appendix A to calculate probabilities from the normal distribution.
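The same lower-tail probabilities are available in R through `pnorm`, which returns P(X ≤ x) for a given mean and standard deviation. The sketch below is an optional alternative to the Jamovi steps; the values −1.96, 1.96, −1 and 1 are chosen to match intervals that appear in the logbook questions, and the variable names are illustrative.

```r
# P(Z <= z) for the standard normal (mean 0, sd 1)
z <- -1.96
lower_tail <- pnorm(z, mean = 0, sd = 1)

# Upper-tail probability P(Z > 1.96), via the complement or lower.tail = FALSE
upper_tail        <- 1 - pnorm(1.96)
upper_tail_direct <- pnorm(1.96, lower.tail = FALSE)

# Probability between two values: P(a < Z < b) = P(Z <= b) - P(Z <= a)
between <- pnorm(1) - pnorm(-1)
```

Because the normal curve is symmetric about its mean, `lower_tail` and `upper_tail` should be equal.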
Log book questions:
2. For the following questions:
   - Illustrate on the curve the area represented by the probability.
     Note: If using Word, use the Insert -> Shapes function, draw the line to outline the region of interest, and use a shape (e.g., a star) to mark the area you need. Alternatively, print these pages out and draw and shade by hand.
   - Find the probability using Jamovi, an online calculator, or the Standard Normal Distribution Tables (on the Moodle site under Probability Calculators and Statistical Tables).
Reference: derived from those in the Stat131 Laboratory Manual compiled by Assoc. Prof. Anne Porter.
   a. P(Z < -1.2) =
   b. P(Z > -1.2) =
   c. P(Z > 1.8) =
   d. P(-1.2 < Z < 1.8) =
   e. Show P(Z < -1.96) = P(Z > 1.96)
1.3 Standardising values and finding probabilities for a given distribution
Two steps are involved:
1. Standardise the x values to find the corresponding z-score: $z = \frac{x - \mu}{\sigma}$
2. Use the z-score to find the probabilities.
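The two steps can be sketched in R as follows, using the hand span distribution from logbook question 3 below (mean 27 cm, standard deviation 4.5 cm) and the value x = 22.5 as an example. The variable names are illustrative.

```r
# Hand span distribution from logbook question 3
mu    <- 27
sigma <- 4.5
x     <- 22.5

# Step 1: standardise
z <- (x - mu) / sigma

# Step 2: use the z-score with the standard normal distribution
p_from_z <- pnorm(z)

# Equivalent one-step call on the original scale
p_direct <- pnorm(x, mean = mu, sd = sigma)
```

The two probabilities should be identical, which shows that standardising does not change the area under the curve; it only relabels the horizontal axis.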
Log book questions:
3. The right hand span of males is normally distributed with a mean of 27 cm and standard deviation of 4.5 cm. Let X denote the right hand span of males. For parts (a) to (e), complete the following steps:
   - Determine the z-score.
   - Illustrate on the curve the area represented by the expression.
   - Determine the probability requested.
   a. P(X < 22.5) =
   b. P(X > 22.5) =
   c. P(X < 19.5) =
   d. P(19.5 < X < 22.5) =
   e. P(X < 19) or P(X > 23) =
APPENDIX : Simulating the data
A.1 Simulating Binomial Data
We will now check the theoretical calculations in Sections 2.2 and 2.3 by simulation. Simulating Binomial Data requires the Rj module, which you can install by clicking the Modules button on the top right of the Analyses screen, and selecting jamovi library. Find the Rj module and click INSTALL.
Once the Rj module has been installed, a new icon labelled ‘R’ should appear along the taskbar next to the other modules. Click on this button and select Rj Editor.
Now, to simulate samples from the binomial distribution, copy and paste the following block of code into Notepad or TextEdit, then copy and paste again (or type it all) into the Editor window.
NOTE: It is important to copy and paste into a simple text editor first, because Word adds hidden characters that we cannot see. If you copy and paste directly from Word, you may run into trouble.
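for (i in 1:12) {
  sample1 <- rbinom(100, 10, 0.3)  # 100 draws from Binomial(n = 10, p = 0.3)
  sample2 <- rbinom(100, 10, 0.3)
  sample3 <- rbinom(100, 10, 0.3)
  # Combine the samples into a data frame and save one file per value of i
  XBinom <- data.frame(sample1, sample2, sample3)
  write.csv(XBinom, file = paste("…/Xbinom", i, ".csv", sep = ""), row.names = FALSE)
}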
Replace the “…” in the last line with the location on your device where you would like to save the samples. For example, you could write file = "C:/Users/Documents/STAT251/XBinom.csv". Make sure the slashes in your file path are forward slashes (/) and NOT backward slashes (\).
The first 3 lines inside the loop generate 3 samples, each of which contains 100 values from the Binomial distribution with number of trials n = 10 and probability of success p = 0.3. These variables are then stored in a “data frame” (data table) and saved as Xbinom##.csv files (## goes from 1 to 12) in the specified location on your device.
Click the run button to run the code. Once you have run the code, the Xbinom##.csv files containing the binomial samples should appear in the location you just specified. In your Jamovi screen, click the triple-barred icon on the top left, select Open, use Browse to navigate to the location you specified, and open the new file.
Now that we have generated a random sample from the binomial distribution with n = 10 and p = 0.3, we can compute summary statistics and a bar plot for this sample.
Note: these 3 variables are the observed outcomes of a discrete random variable. However, to get the
required output in Jamovi, we need to keep the variable type as set to Nominal (this is not ideal but
necessary within limitations of the package).
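If you would rather stay inside the Rj Editor than switch back to the Jamovi menus, the comparison between sample statistics and theory can be sketched directly in R. This is an optional variation on the procedure above, not part of it; the seed value is arbitrary and is included only so that the run is repeatable.

```r
set.seed(251)  # arbitrary seed, chosen here only so the run is repeatable

n <- 10; p <- 0.3
sample1 <- rbinom(100, size = n, prob = p)

# Sample statistics versus the theoretical values from Table 1
comparison <- data.frame(
  quantity    = c("mean", "sd"),
  sample      = c(mean(sample1), sd(sample1)),
  theoretical = c(n * p, sqrt(n * p * (1 - p)))
)
print(comparison)

# Observed frequency of each value of X (the numbers behind a bar plot)
freq <- table(sample1)
print(freq)
```

Changing or removing the `set.seed` line gives a different sample each time, which makes the sampling variation discussed in Section 2.4 easy to see.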
A.2 Poisson simulation
If time permits, repeat the steps in A.1 to simulate a sample of 100 from the Poisson distribution with $\lambda = 2.75$. Note that the Jamovi procedure is exactly the same, except that instead of rbinom we use rpois for the Poisson distribution, and specify the value for lambda.
In Jamovi, click on the R module and select Rj Editor. Then modify the code from A.1 as follows. Instead of
sample1 <- rbinom(100, 10, 0.3)
use
sample1 <- rpois(100, lambda = 2.75)
and do the same for sample2 and sample3. Change XBinom to XPois wherever it appears in the previous code, including in the file name. Make sure to replace the “…” in the file path with the path where you want the file to be saved.
Here is the full code:
for (i in 1:12) {
  sample1 <- rpois(100, lambda = 2.75)  # 100 draws from Poisson(lambda = 2.75)
  sample2 <- rpois(100, lambda = 2.75)
  sample3 <- rpois(100, lambda = 2.75)
  # Combine the samples into a data frame and save one file per value of i
  XPois <- data.frame(sample1, sample2, sample3)
  write.csv(XPois, file = paste("…/XPois", i, ".csv", sep = ""), row.names = FALSE)
}