Notes4-RandomVariables_S
School: University of Wisconsin, Madison
Course: 324
Subject: Statistics
Date: Feb 20, 2024
Statistics
University of Wisconsin Madison – Chelsey Green
chelseygreen@wisc.edu
NOTES 4: DEFINING RANDOM VARIABLES AND TWO COMMON ONES
QUANTIFYING THE POSSIBLE OUTCOMES OF A STUDY
Before a study is performed, we do not know with certainty what outcome will be observed. Sometimes outcomes have qualitative descriptions (e.g., moderate, severe, or no nausea in response to a medicine) and sometimes numerical values (e.g., the length of a bolt produced). By focusing on numeric summaries of qualitative data, we can associate a numerical value with each outcome of an experiment.
In these notes we will learn some theoretical tools to describe characteristics of the population we are drawing from and also the numeric outcomes of some studies. We will also apply the probability rules learned in Notes 3 and continue to develop our simulation and coding skills.
RANDOM VARIABLE TERMINOLOGY
A random variable (RV) associates a numerical value with each outcome of a random process. It is customary to denote a random variable with an uppercase letter when considering all the values it could take. It is called “random” because we don’t know the value observed until the experiment is completed.
*You can think of the RV as the population of values that could be observed.
E.g.: Many measurements are taken on babies moments after birth. A few random variables of interest for newborn babies include: W = birth weight, X = Apgar score, Y = length of baby at birth, N = number of medical staff in the room. An example from a different setting is G = number of hemlock seeds that germinate out of 4.
A realization of the RV is the value that is observed when the experiment is performed or the recording is made. Realizations are usually denoted by lowercase letters.
*You can think of a sample as a collection of realizations of a RV.
E.g.: w = 3,321 grams is the weight of a recently born baby; x = 9 is the Apgar score for a recently born baby; y1 = 20.0, y2 = 19.3, y3 = 19.4 are the lengths of the last 3 babies born; g = 2 if only 2 of the 4 hemlock seeds germinate.
Weld Failures Example: According to a past study of weld failures in a certain assembly, 85% of them occur in the weld metal itself, 10% occur in the base metal, and the cause is unknown in 5% of failures. Consider a scenario where 3 weld failures in this type of assembly are observed.
Failure type          Probability
Weld Metal Failure    0.85
Base Metal Failure    0.10
Unknown Failure       0.05
Weld Failures Example a: Define a random variable X: the number of weld failures caused by the weld metal in the 3 failures observed. Discuss what we know about this random variable before and after the experiment is conducted.
X: the number of weld-metal-caused failures in the 3 observed failures.
Before:
After:
Types of Random Variables
A RV is called discrete if it has a countable number of values. If the values are arranged in order, there is a gap between each value and the next. The set of possible values may be infinite.
E.g.: X: the Apgar score, which is on a scale from 0-10 based on skin condition, heart rate, muscle tone, breathing, and response when stimulated. Also, N: the number of nurses in the room for a procedure, is a discrete random variable.
A RV is called continuous if it is capable of taking an uncountable number of values in an interval. It represents some measurement on a continuous scale.
E.g.: T: the daily maximum temperature in Madison, WI, can be measured to any precision. Similarly for birth weight, W.
DESCRIBING THE VALUES OF A DISCRETE RANDOM VARIABLE
A probability distribution of a random variable consists of the RV’s possible values along with the probabilities of realizations occurring. The descriptions of the possible values and probabilities can take the form of a probability histogram, table (discrete RV only), or formula.
*Probability distributions are often approximated from empirical studies.
A Probability Mass Function (pmf) is the probability distribution for a discrete random variable. It is a list of the values that can be obtained, together with the probability of each value.
*Values are mutually exclusive
*Each value has a probability between 0 and 1
*Sum of the probabilities is 1
E.g.1: We can write out the probability mass function for the random variable F: the number of dots on the face that lands up when rolling a fair 6-sided die. There are 6 possible outcomes: [1,2,3,4,5,6], which we assume are equally likely, so each outcome has a probability of 1/6.
f         1     2     3     4     5     6
P(F=f)    1/6   1/6   1/6   1/6   1/6   1/6
E.g.2: Researchers recorded the Apgar scores of over 2 million newborns in a single year. The approximate probability mass function for Apgar score (X) based on these 2 million newborns is given below. Each probability is based on the relative frequency observed in the 2 million newborns (e.g., 2% of newborns in the 2 million had an Apgar score of 5).
x         0      1      2      3      4      5      6      7      8      9      10
P(X=x)    0.001  0.006  0.007  0.008  0.012  0.020  0.038  0.099  0.319  0.437  0.053
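The three pmf properties listed above can be checked directly in R. The following is a minimal sketch using the die and Apgar tables; the helper name is_valid_pmf() and the tolerance are assumptions made for illustration, not something defined in these notes or built into R.

```r
# Values and probabilities copied from the two pmf tables above
die_values <- 1:6
die_probs  <- rep(1/6, 6)

apgar_values <- 0:10
apgar_probs  <- c(0.001, 0.006, 0.007, 0.008, 0.012, 0.020,
                  0.038, 0.099, 0.319, 0.437, 0.053)

# is_valid_pmf() is a hypothetical helper: it checks that every probability
# is between 0 and 1 and that the probabilities sum to 1 (up to rounding error)
is_valid_pmf <- function(probs, tol = 1e-8) {
  all(probs >= 0 & probs <= 1) && abs(sum(probs) - 1) < tol
}

is_valid_pmf(die_probs)     # TRUE
is_valid_pmf(apgar_probs)   # TRUE

# Because the values are mutually exclusive, probabilities of events add:
# P(X >= 7) = P(X=7) + P(X=8) + P(X=9) + P(X=10)
sum(apgar_probs[apgar_values >= 7])   # 0.908
```

Storing the values and probabilities as two parallel vectors keeps the table structure visible and makes later calculations (means, variances) one-line sums.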
Weld Failures b: According to a past study of weld failures in a certain assembly, 85% of them occur in the weld metal itself, 10% occur in the base metal, and the cause is unknown in 5% of failures. Consider a scenario where 3 weld failures in this type of assembly are observed.
Complete the probability distribution for X: the number of weld-metal-caused failures in the 3 observed weld failures.
What assumptions are we making in our calculations?
*Consider a weld-metal-caused failure a Success (S) and all other outcomes a Failure (F)
Meaning                   x    P(X=x)
0 weld metal, 3 other     0    P(X=0) = P(FFF) = 0.15*0.15*0.15 = 0.003375
1 weld metal, 2 other     1
2 weld metal, 1 other     2    P(X=2) = P(SSF or SFS or FSS) = 3*0.85^2*0.15^1 = 0.325125
                          3
Notice: P(X=0)+P(X=1)+P(X=2)+P(X=3)=
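One way to check a completed table is to list every Success/Failure sequence for the 3 failures and add up the sequence probabilities for each value of x. The sketch below assumes the 3 failures are independent and each has the same 0.85 chance of being a weld metal failure (the assumptions the question above asks about); the variable names are illustrative only.

```r
p_success <- 0.85            # P(weld metal failure), treated as "Success"
p_failure <- 1 - p_success   # P(base metal or unknown), treated as "Failure"

# All 2^3 = 8 sequences of S/F for the 3 observed failures
outcomes <- expand.grid(first = c("S", "F"), second = c("S", "F"),
                        third = c("S", "F"), stringsAsFactors = FALSE)

# Number of successes in each sequence
outcomes$x <- rowSums(outcomes[, c("first", "second", "third")] == "S")

# Probability of each sequence, multiplying because of assumed independence
outcomes$prob <- p_success^outcomes$x * p_failure^(3 - outcomes$x)

# Add up the sequence probabilities for each value of x
pmf <- tapply(outcomes$prob, outcomes$x, sum)
pmf        # entries for x = 0 and x = 2 match 0.003375 and 0.325125 above
sum(pmf)   # 1
```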
Weld Failures c: An approximate probability histogram for X is given at right. Use it to compute the probability that in 3 weld failures, at least one of the failures will be in the weld metal, that is, \( P(X \geq 1) \).
CENTER AND SPREAD OF A DISCRETE RANDOM VARIABLE
The expected value or mean value of a RV X is denoted E(X) or \( \mu_X \). It represents the mean of an infinite number of realizations of X (an infinite number of replications of the random process). Note: the mean doesn’t need to equal an observable value.
*For a discrete RV, we find the mean by summing the products of each value and its probability: \( \mu_X = E(X) = \sum_i x_i \cdot P(X = x_i) \)
*This will be the balancing point of a histogram showing the distribution of a random variable.
E.g.1: To compute the mean of F: the number of dots on the face that lands up when rolling a fair 6-sided die:
\( \mu_F = \frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \frac{1}{6} \cdot 3 + \frac{1}{6} \cdot 4 + \frac{1}{6} \cdot 5 + \frac{1}{6} \cdot 6 = \frac{1}{6} \cdot (1 + 2 + 3 + 4 + 5 + 6) = \frac{1}{6} \cdot (21) = \frac{21}{6} = 3.5 \)
E.g.2: To compute the mean Apgar score in this population of 2 million:
\( \mu_X = 0.001 \cdot 0 + 0.006 \cdot 1 + 0.007 \cdot 2 + 0.008 \cdot 3 + 0.012 \cdot 4 + 0.02 \cdot 5 + 0.038 \cdot 6 + 0.099 \cdot 7 + 0.319 \cdot 8 + 0.437 \cdot 9 + 0.053 \cdot 10 = 8.128 \)
The variance of a RV X, denoted VAR(X) or \( \sigma_X^2 \), gives a measurement of the variability of an infinite number of realizations of X.
*For a discrete RV, we multiply each squared deviation (the squared distance from a value of the random variable to the mean) by the probability of that value, and then find the sum. (This is equivalent to how we computed variance for a population of values.)
\( \sigma_X^2 = VAR(X) = \sum_i (x_i - \mu_X)^2 \cdot P(X = x_i) \)
E.g.: To compute the variance of F: the number of dots on the face that lands up when rolling a fair 6-sided die:
\( \sigma_F^2 = \frac{1}{6} \cdot (1 - 3.5)^2 + \frac{1}{6} \cdot (2 - 3.5)^2 + \frac{1}{6} \cdot (3 - 3.5)^2 + \frac{1}{6} \cdot (4 - 3.5)^2 + \frac{1}{6} \cdot (5 - 3.5)^2 + \frac{1}{6} \cdot (6 - 3.5)^2 = 2.916667 \)
The standard deviation of a RV X, denoted SD(X) or \( \sigma_X \), is the standard deviation of an infinite number of realizations of X. It is the square root of the variance computed above.
E.g.: To compute the standard deviation of F: the number of dots on the face that lands up when rolling a fair 6-sided die: \( \sigma_F = \sqrt{2.916667} = 1.707825 \)
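The mean, variance, and standard deviation formulas above translate almost symbol for symbol into R. The sketch below wraps them in a small function; pmf_summary() is a hypothetical helper name chosen here, not a function from R or from these notes.

```r
# pmf_summary() is a hypothetical helper that applies the discrete-RV formulas:
#   mean     = sum of x_i * P(X = x_i)
#   variance = sum of (x_i - mean)^2 * P(X = x_i)
#   sd       = square root of the variance
pmf_summary <- function(values, probs) {
  mu     <- sum(values * probs)
  sigma2 <- sum((values - mu)^2 * probs)
  c(mean = mu, variance = sigma2, sd = sqrt(sigma2))
}

# Fair 6-sided die: mean 3.5, variance 2.916667, sd 1.707825
die <- pmf_summary(1:6, rep(1/6, 6))
die

# Apgar scores, using the probabilities from the table earlier in the notes
apgar_probs <- c(0.001, 0.006, 0.007, 0.008, 0.012, 0.020,
                 0.038, 0.099, 0.319, 0.437, 0.053)
apgar <- pmf_summary(0:10, apgar_probs)
apgar["mean"]   # 8.128
```

The same function can be reused to check hand computations for any discrete distribution given as a table.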
Weld Failures d: Calculate the mean and standard deviation of X: [the count of weld metal failures in 3 failures] using the approximate probability distribution given at right in the histogram. Check your computations in R. Interpret the values in context. (Note: the probability values are rounded to 3 decimal places.)
Mean of X: \( \mu_X \):
Meaning of \( \mu_X \):
Variance of X: \( \sigma_X^2 \):
Standard Deviation of X: \( \sigma_X \):
Meaning of \( \sigma_X \):
BINOMIAL: A COMMON DISCRETE RANDOM VARIABLE
Often we are interested in counting the number of times an outcome of interest occurs in identically (or approximately so) repeated trials.
Bernoulli Trials: The requirements of Bernoulli trials are that they:
(i) only yield one of two outcomes (usually called Success and Failure)
(ii) have the same probability of success P(S) = \( \pi \) and probability of failure P(F) = \( 1 - \pi \)
(iii) are independent.
Type O Blood Example a: Genetics says that each child of a particular pair of parents has probability 0.25 of having type O blood. Explain why each child’s blood type being type O from this pair of parents is reasonably approximated by a Bernoulli trial.
Requirement (i) is met if we define the two outcomes as “Type O blood” (we’ll call this “Success”) and “Not type O blood” (we’ll call this “Failure”). Each child has the same probability, which meets requirement (ii): P(S) = 0.25 and P(F) = 0.75, so \( \pi = 0.25 \).
Because each child’s blood type does not affect their siblings’ blood types, the trials are independent and (iii) is met.
The RV X is called a Binomial Random Variable if it gives the number of successes in n Bernoulli trials. The number of trials (n) must be fixed and \( \pi \) is the [consistent] probability of success in each trial. We denote such a RV as \( X \sim Bin(n, \pi) \).
*\( P(X = x) = \binom{n}{x} \pi^x (1 - \pi)^{n - x} = \frac{n!}{x!(n - x)!} \pi^x (1 - \pi)^{n - x} \) for x = 0, 1, …, n gives the binomial probability distribution with n trials. In R: dbinom(x, size, prob, log = FALSE)
*Mean of X: \( \mu_X = n\pi \) and Variance of X: \( \sigma_X^2 = n\pi(1 - \pi) \)
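To see how the binomial formula, dbinom(), and the mean and variance shortcuts fit together, here is a minimal sketch using the weld failure numbers (n = 3, probability 0.85). It assumes the binomial model is reasonable for that setting, which the notes ask you to check separately. binom_pmf() is a hypothetical helper written for this illustration; choose(), dbinom(), and pbinom() are base R functions.

```r
# binom_pmf() is a hypothetical helper implementing the formula
# P(X = x) = choose(n, x) * prob^x * (1 - prob)^(n - x)
binom_pmf <- function(x, n, prob) {
  choose(n, x) * prob^x * (1 - prob)^(n - x)
}

n    <- 3      # number of trials (weld failures observed)
prob <- 0.85   # probability of success on each trial
x    <- 0:n

by_formula <- binom_pmf(x, n, prob)
by_dbinom  <- dbinom(x, size = n, prob = prob)
all.equal(by_formula, by_dbinom)   # TRUE: the formula and dbinom() agree

# P(X >= 1) two ways: complement of P(X = 0), or the upper tail P(X > 0)
1 - dbinom(0, size = n, prob = prob)                   # 0.996625
pbinom(0, size = n, prob = prob, lower.tail = FALSE)   # same value

# Shortcut formulas versus the general discrete-RV definitions
mu     <- n * prob                # 2.55
sigma2 <- n * prob * (1 - prob)   # 0.3825
sum(x * by_dbinom)                # matches mu
sum((x - mu)^2 * by_dbinom)       # matches sigma2
```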
Weld Failures Example e: Consider X: the number of weld metal failures observed in 3 weld failures. Check the assumptions that need to be made so that X is reasonably approximated by a binomial random variable. Is X counting the number of successes in a fixed number of trials? Are they Bernoulli trials?
1. Success?
2. P(Success) the same for each trial?
3. Are trials independent?
Weld Failures Example f: Use the binomial formula and R to calculate P(X=2) and explain what that value means. Check your computations with R. (Revisit parts b & c.)
Weld Failures Example g: Use R to calculate \( P(X \geq 1) \) and explain what that value means. Check your computations for parts b and c with R.
Weld Failures Example h: Calculate the mean, variance, and standard deviation of X using the binomial shortcuts and compare these values to what we computed in Weld Failures Example d.
Mean of X: \( \mu_X = \)
Variance of X: \( \sigma_X^2 = \)
Standard Deviation of X: \( \sigma_X = \)
Weld Failures Example i: Consider the random variable Y: the number of base metal failures in 20 weld failures. Is Y a binomial random variable? If so, calculate P(Y=2) using the formula and also using the R function. If not, explain why not.
Weld Failures Example j: Consider the random variable U: the number of weld failures that occur before a base metal failure is observed. Is U a binomial random variable? If so, calculate P(U=2) using the formula and also using the R function. If not, explain why not.
Type O Blood Example b: Suppose a set of parents has 4 children. Let X be the number of children born with type O blood. Calculate the probability of each possible value of X, assuming a binomial model \( X \sim Bin(4, 0.25) \) is reasonable, and create a histogram of its distribution. Describe the shape, center, and spread of the distribution of X.
*The probability that no children get type O blood: \( P(X=0) = \binom{4}{0} 0.25^0 \cdot 0.75^4 = 0.3164062 \)
*The probability that one child gets type O blood: \( P(X=1) = \binom{4}{1} 0.25^1 \cdot 0.75^3 = 0.421875 \)
*The probability that two children get type O blood: \( P(X=2) = \binom{4}{2} 0.25^2 \cdot 0.75^2 = 0.2109375 \)
*The probability that three children get type O blood: \( P(X=3) = \binom{4}{3} 0.25^3 \cdot 0.75^1 = 0.046875 \)
*The probability that all four children get type O blood: \( P(X=4) = \binom{4}{4} 0.25^4 \cdot 0.75^0 = 0.00390625 \)
Check in R: dbinom(0:4, size=4, prob=0.25)
0.31640625 0.42187500 0.21093750 0.04687500 0.00390625
We see that the distribution of \( X \sim Bin(4, 0.25) \) is right skewed. The average number of children with type O blood in families with 4 children is \( \mu_X = 4 \cdot 0.25 = 1 \), and the number of children with type O blood will typically differ from this average by about \( \sigma_X = \sqrt{4 \cdot 0.25 \cdot 0.75} = 0.866 \).
Type O Blood Example c: Suppose we evaluate 100 children from pairs of parents whose children, according to genetics, each have probability 0.25 of having type O blood. Use R to calculate the probability of each possible value of X, assuming a binomial model \( X \sim Bin(100, 0.25) \) is reasonable. Create a histogram for the distribution of \( X \sim Bin(100, 0.25) \) and describe its shape, center, and spread.
y=dbinom(0:100, size=100, prob=0.25)
x=0:100
plot(x, y, type="h", ylab="Probability", main="Number of Children with Type O blood")
The distribution of \( X \sim Bin(100, 0.25) \) is discrete, approximately symmetric, and unimodal. It has mean \( \mu_X = 100 \cdot 0.25 = 25 \) and standard deviation \( \sigma_X = \sqrt{100 \cdot 0.25 \cdot 0.75} = 4.330 \). It is well approximated by a normal distribution (see below and CLT, Notes 5).
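The idea that the mean and standard deviation describe "an infinite number of realizations" can be explored by simulation. The sketch below draws many realizations from Bin(100, 0.25) with rbinom() and compares the sample summaries to the theoretical values; the seed and the number of replications are arbitrary choices made for this illustration.

```r
set.seed(324)      # arbitrary seed so the sketch is reproducible
n_trials <- 100    # children evaluated
prob_o   <- 0.25   # probability each child has type O blood
n_reps   <- 100000 # number of simulated realizations of X (arbitrary)

# Each entry of sims is one realization of X ~ Bin(100, 0.25)
sims <- rbinom(n_reps, size = n_trials, prob = prob_o)

mean(sims)   # should be close to the theoretical mean 25
sd(sims)     # should be close to the theoretical sd 4.330

# Relative frequency of one value versus its exact probability
mean(sims == 25)
dbinom(25, size = n_trials, prob = prob_o)
```

With more replications, the simulated summaries should settle closer to the theoretical values.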
Describing the Values Of A Continuous Random Variable
Probability Density Function (pdf) is the probability distribution for a continuous RV. It consists of ranges of values the RV can take, together with a density function that lives on those ranges.
*The total area under the density function is 1.
*The area under the density function between two possible realizations gives the probability the RV will realize to a value in that range.
*P(Y=y)=0 for each distinct realization of Y.
E.g.1: P(Y=27.3 g of emissions)=0, but 27.3 could be a measured emission value. The probability of Y being exactly 27.3 is 0, but in practice we can measure it as 27.3 if we are only able to measure to one decimal place of precision.
E.g.2: The height of the frequency histogram for birthweights of 3,228 newborns is approximated by a smooth curve at right. The approximate density curve for the birthweight data is given far right. A density curve has area of exactly 1 underneath it and also assumes we have an infinite number of unique birthweights on any interval. We can use the area under the density curve between W=3 and W=4 to approximate the probability of a baby being born with birth weight between 3 and 4 kg.
The expected value or mean value of a continuous RV X is denoted E(X) or \( \mu_X \). It represents the mean of an infinite number of realizations of X (an infinite number of replications of the random process).
*For generic continuous RVs, we need to use integration, which we will not be doing in this class.
*This will be the balancing point of a histogram showing the distribution of a random variable.
The variance and standard deviation of a continuous RV X, denoted VAR(X) = \( \sigma_X^2 \) and SD(X) = \( \sigma_X \), respectively, give a measurement of the variability of an infinite number of realizations of X.
*For generic continuous RVs, we again need to use integration.
TRANSFORMING RANDOM VARIABLES
Linear Transformations of RV: Suppose we have constants a, b, c and a random variable X with \( E(X) = \mu_X \) and \( Var(X) = \sigma_X^2 \). We can define new random variables by transforming X by adding/subtracting and multiplying/dividing by a constant [a linear transformation]. The properties of the new random variables are described below:
*Let \( Y = X + c \): \( E(Y) = E(X + c) = E(X) + c = \mu_X + c \) and \( Var(Y) = Var(X + c) = Var(X) = \sigma_X^2 \); \( SD(Y) = SD(X) = \sigma_X \)
*Let \( P = aX \): \( E(P) = E(aX) = a \cdot E(X) = a\mu_X \) and \( Var(P) = Var(aX) = a^2 Var(X) = a^2 \sigma_X^2 \); \( SD(P) = |a| SD(X) = |a| \sigma_X \)
*Let \( L = aX + c \): \( E(L) = E(aX + c) = aE(X) + c = a\mu_X + c \) and \( Var(L) = Var(aX + c) = a^2 Var(X) = a^2 \sigma_X^2 \); \( SD(L) = |a| SD(X) = |a| \sigma_X \)
Discrete E.g.: A [hypothetical] college considers a student to be full-time if they are taking between 12 and 18 units in a semester. The number of credits C taken by a randomly selected full-time student at this college has the following distribution, which is based on current enrollment. Confirm that the number of credits taken is on average \( \mu_C = 14.73 \) with standard deviation \( \sigma_C = 1.933 \):
c         12    13    14    15    16    17    18
P(C=c)    0.21  0.08  0.07  0.37  0.10  0.02  0.15
If each credit is $75 and each full-time student is assessed a student fee of $100 per semester, a probability distribution for T=75C+100, where T is the semester cost, would be as given below:
t         1000  1075  1150  1225  1300  1375  1450
P(T=t)    0.21  0.08  0.07  0.37  0.10  0.02  0.15
We can compute that the average total cost is \( \mu_T = 75 \cdot \mu_C + 100 = 75 \cdot 14.73 + 100 = 1204.75 \) with standard deviation \( \sigma_T = 75 \cdot \sigma_C = 75 \cdot 1.933 = 144.975 \). These values match what we would find calculating the mean and standard deviation from the T probability distribution.
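The claim that the transformation rules match a direct calculation from the T distribution can be confirmed in R. This is a minimal sketch with illustrative variable names; it computes the mean and standard deviation of T both ways and compares them.

```r
credits <- 12:18
probs   <- c(0.21, 0.08, 0.07, 0.37, 0.10, 0.02, 0.15)

# Mean and sd of C from its probability distribution
mu_c <- sum(credits * probs)                    # 14.73
sd_c <- sqrt(sum((credits - mu_c)^2 * probs))   # about 1.933

# Route 1: linear transformation rules for T = 75*C + 100
mu_t_rule <- 75 * mu_c + 100    # 1204.75
sd_t_rule <- abs(75) * sd_c     # about 144.99 (144.975 if sd_c is first rounded to 1.933)

# Route 2: compute directly from the distribution of T
cost        <- 75 * credits + 100
mu_t_direct <- sum(cost * probs)
sd_t_direct <- sqrt(sum((cost - mu_t_direct)^2 * probs))

all.equal(mu_t_rule, mu_t_direct)   # TRUE
all.equal(sd_t_rule, sd_t_direct)   # TRUE
```

The small difference between 144.99 and the 144.975 shown in the notes comes only from rounding the standard deviation of C before multiplying.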
Ferry Example: A [hypothetical] small car ferry runs every hour from one side of a river to the other. Based on historical records, the number of vehicles V on a randomly chosen ferry trip has the probability distribution shown below in the table and histogram. You can confirm that the expected value is \( \mu_V = 3.87 \) and the standard deviation is \( \sigma_V = 1.286 \) (rounded to 3 decimal places).
# of Vehicles, V:    0      1      2      3      4      5
Probability:         0.02   0.05   0.08   0.16   0.27   0.42
Ferry Example a: The ferry charges $5 for each vehicle that makes the trip. Let C=5*V be the random variable for the amount of money collected on a randomly selected trip. Fill in the probability distribution for C and compute the expected value (\( \mu_C \)), standard deviation (\( \sigma_C \)), and variance (\( \sigma_C^2 \)) for C.
Collected, C:        ____   1*5=5  ____   3*5=15 4*5=20 5*5=25
Probability:         0.02   0.05   ____   0.16   0.27   0.42
\( \mu_C = \)
\( \sigma_C^2 = \)
\( \sigma_C = \)
Ferry Example b: The ferry’s expenses are $20 per trip and they charge each car $5 for the trip. Let the random variable P=5*V-20 be the profit made by the ferry company on a randomly selected trip. Fill in the probability distribution for P and compute the expected value (\( \mu_P \)) and standard deviation (\( \sigma_P \)) for P.
Profit, P:           ____   -15    ____   -5     5*4-20=0  5*5-20=5
Probability:         0.02   0.05   ____   0.16   0.27      0.42
\( \mu_P = \)
\( \sigma_P^2 = \)
\( \sigma_P = \)
*Example adapted from Starnes, et al The Practice of Statistics
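One way to check answers to the ferry example is to apply the linear transformation rules in R and compare them to a direct calculation from the transformed values. The sketch below is an illustration with assumed variable names, not the textbook's or the notes' own solution.

```r
vehicles <- 0:5
probs    <- c(0.02, 0.05, 0.08, 0.16, 0.27, 0.42)

# Mean and sd of V from its probability distribution
mu_v <- sum(vehicles * probs)                    # 3.87
sd_v <- sqrt(sum((vehicles - mu_v)^2 * probs))   # about 1.286

# Ferry Example a: C = 5*V (multiplying by a constant)
mu_c <- 5 * mu_v        # 19.35
sd_c <- abs(5) * sd_v   # about 6.43

# Ferry Example b: P = 5*V - 20 (multiplying, then shifting)
mu_p <- 5 * mu_v - 20   # -0.65
sd_p <- abs(5) * sd_v   # the shift of -20 does not change the spread

# Direct check from the distribution of P
profit      <- 5 * vehicles - 20
mu_p_direct <- sum(profit * probs)
sd_p_direct <- sqrt(sum((profit - mu_p_direct)^2 * probs))
all.equal(c(mu_p, sd_p), c(mu_p_direct, sd_p_direct))   # TRUE
```

Note that C and P have the same standard deviation: subtracting the fixed $20 expense moves the center of the distribution but not its spread.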
Birthweights Example: From the histogram at right, we are going to model birth weight, W, as a continuous random variable with mean 3.39 kg and standard deviation 0.55 kg. Describe the distribution of values if we instead had the measurements in pounds. You can use the conversion: 1 kg ≈ 2.205 lbs.
Let P be the continuous random variable describing birth weight in pounds. So \( P = 2.205 \cdot W \).
The overall shape of the density histogram would not change; however, the new mean and standard deviation would be:
\( \mu_P = E(2.205 \cdot W) = 2.205 \cdot E(W) = 2.205 \cdot 3.39 = 7.475 \) lbs
\( \sigma_P = |2.205| \cdot \sigma_W = 2.205 \cdot 0.55 = 1.213 \) lbs
The most common linear transformation is converting values into Standardized Units (a z score) using the formula \( z = \frac{x - \mu}{\sigma} \), where \( \mu \) and \( \sigma \) are the mean and standard deviation of the population from which x was drawn.
*A z score tells us how many standard deviations an observation is above or below the population mean
*One can compare z scores across distributions to compare scores within their relative distributions
E.g.: A value that has a z score of 1 is a value that is exactly 1 standard deviation above the mean.
A Common Continuous Random Variable:
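A RV X is called a Normal (Gaussian) RV if it has probability density function: \( f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}} \). (Here \( \pi \) is the constant 3.14159..., not a probability of success.)
*A Normal RV X has a bell-shaped distribution symmetric about its mean \( \mu_X \) with st. dev. \( \sigma_X \), and these properties are specified in the notation: \( X \sim N(\mu_X, \sigma_X^2) \)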
*Within 1, 2, and 3 standard deviations of the mean, there is ~68%, ~95%, and ~99.7% of the data, respectively.
\( P(\mu - \sigma < X < \mu + \sigma) = .683 \)
\( P(\mu - 2\sigma < X < \mu + 2\sigma) = .954 \)
\( P(\mu - 3\sigma < X < \mu + 3\sigma) = .997 \)
*Total area under the curve of f(x) over \( -\infty < x < \infty \) is 1.
*If a RV X is distributed \( X \sim N(\mu, \sigma^2) \), the standardized variable \( Z = \frac{X - \mu}{\sigma} \) has the standard normal distribution \( Z \sim N(0, 1) \).
*If a RV X is normally distributed, then any linear transformation of X is also normally distributed.
*The area under the density function between two possible realizations gives the probability the RV will realize to a value in that range. Computing this area requires advanced computational methods, so we compute it in R.
*Some common R functions for a normally distributed random variable, X:
pnorm(): compute the area below [or above] a specified value x
qnorm(): compute the specific value x with a desired amount of area below [or above]
rnorm(): generate n random observations from a normal distribution
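The three R functions above can be tried out on the 68-95-99.7 rule. The following sketch uses only base R; the mean of 10, the standard deviation of 2, the seed, and the sample size are made-up values for illustration.

```r
# Area within k = 1, 2, 3 standard deviations of the mean (standard normal)
k <- 1:3
pnorm(k) - pnorm(-k)   # approximately 0.683 0.954 0.997

# The same areas hold for any normal distribution; mu and sigma are made up
mu    <- 10
sigma <- 2
pnorm(mu + k * sigma, mean = mu, sd = sigma) -
  pnorm(mu - k * sigma, mean = mu, sd = sigma)

# pnorm() gives area below by default; lower.tail = FALSE gives area above
pnorm(1, lower.tail = FALSE)   # equal to 1 - pnorm(1)

# qnorm() undoes pnorm(): it returns the value with a given area below it
qnorm(pnorm(1.5))   # 1.5

# rnorm() simulates realizations; about 68% should fall within 1 sd of the mean
set.seed(324)       # arbitrary seed so the sketch is reproducible
draws <- rnorm(10000, mean = mu, sd = sigma)
mean(abs(draws - mu) < sigma)
```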
AA Battery Voltage Example: All of the AA batteries of a certain model produced by a company had their voltage tested at production, and the voltages were well approximated by a normal random variable with mean 1.6 and standard deviation 0.05: \( V \sim N(\mu_V = 1.6, \sigma_V^2 = 0.05^2) \). Each battery is marketed as being 1.5V.
AA Battery Voltage a: Label a density histogram of V assuming a normal distribution and identify some of its characteristics.
AA Battery Voltage b: Draw where a 1.5V battery falls on the V distribution. Also, convert 1.5V to its standardized value (z-score) and draw it on the standard normal distribution.
AA Battery Voltage c: What is the probability that a randomly chosen battery from this population has lower voltage than the advertised amount (1.5V)?
AA Battery Voltage d: If you examine one AA battery off of the production line, what is the probability that it has a voltage between 1.58V and 1.64V?
AA Battery Voltage e: Above what voltage is the highest 1% of voltages? That is, what voltage is at the 99th percentile? What is the value’s z score?
AA Battery Voltage f: What voltage is the 3rd quartile?
AA Battery Voltage g: Suppose two batteries are randomly chosen off of the production line. They are selected without replacement, but because there are so many batteries made by the production line (the population is so large relative to the sample size), we can assume both are selected from \( V \sim N(\mu_V = 1.6, \sigma_V^2 = 0.05^2) \). What is the probability that both batteries have voltage above the labeled amount (1.5 V)?
AA Battery Voltage h: Suppose ten (10) AA batteries are randomly chosen off of the production line. They are selected without replacement, but because there are so many batteries made by the production line, we can assume all are selected from \( V \sim N(\mu_V = 1.6, \sigma_V^2 = 0.05^2) \). What is the probability that exactly 6 batteries have voltage above the labeled amount (1.5V)?
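The battery questions combine the normal functions with the multiplication rule and the binomial distribution. Below is one possible way to set up parts b through h in R; it is a sketch rather than an official solution, the variable names are illustrative, and parts g and h assume the sampled batteries are independent, as the questions suggest.

```r
mu_v  <- 1.6    # mean voltage
sd_v  <- 0.05   # standard deviation of voltage
label <- 1.5    # advertised voltage

# Part b: z score of the labeled voltage
z_label <- (label - mu_v) / sd_v   # -2: two standard deviations below the mean

# Part c: P(V < 1.5)
p_below <- pnorm(label, mean = mu_v, sd = sd_v)   # about 0.0228

# Part d: P(1.58 < V < 1.64) as a difference of two areas
p_between <- pnorm(1.64, mean = mu_v, sd = sd_v) -
  pnorm(1.58, mean = mu_v, sd = sd_v)

# Part e: 99th percentile and its z score
v_99 <- qnorm(0.99, mean = mu_v, sd = sd_v)
z_99 <- (v_99 - mu_v) / sd_v

# Part f: 3rd quartile (75th percentile)
v_q3 <- qnorm(0.75, mean = mu_v, sd = sd_v)

# Part g: both of two independent batteries above 1.5V
p_above      <- 1 - p_below
p_both_above <- p_above^2

# Part h: the number above 1.5V out of 10 is Bin(10, p_above) under independence
p_six_of_ten <- dbinom(6, size = 10, prob = p_above)
```

Part h shows how a continuous model (normal voltage) can supply the success probability for a discrete model (binomial count).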
Boys’ Heights Example: In the children’s growth chart that medical professionals reference, the heights of two-year-old boys are nearly normally distributed with a mean of 34.5 inches and a standard deviation of 1.4 inches. We approximate the height of two-year-old boys with a normally distributed random variable \( H \sim N(\mu_H = 34.5, \sigma_H^2 = 1.4^2) \).
Boys’ Heights a: Show that \( Z = \frac{H - 34.5}{1.4} \) has a Normal distribution with a mean of 0 and standard deviation of 1.
Before we go through any mechanics, think of what the equation is doing: it is subtracting 34.5 from each value of H (which shifts H left by 34.5 units so the new mean is 0) and dividing all values by 1.4 (reducing spread). Z is a linear transformation of the normal RV H, so Z is also normally distributed. Let’s confirm the mean and variance with our transforming RV rules:
\( E(Z) = E\left(\frac{H - 34.5}{1.4}\right) = \frac{1}{1.4} \cdot [E(H) - 34.5] = \frac{1}{1.4} \cdot (34.5 - 34.5) = 0 \)
\( Var(Z) = Var\left(\frac{H - 34.5}{1.4}\right) = \frac{1}{1.4^2} \cdot Var(H) = \frac{1}{1.4^2} \cdot 1.4^2 = 1 \)
Boys’ Heights b: What is the probability that a randomly chosen two-year-old boy has a height less than 33 in?
We can convert the height to standard units: \( P(H < 33) = P\left(Z < \frac{33 - 34.5}{1.4}\right) = P(Z < -1.071429) \). This height is more than 1 standard deviation below the mean height. By default, pnorm() uses mean=0 and sd=1, so we can plug in the standardized score of the value of interest: pnorm(-1.071429)=0.1419883, or we can specify the original value, mean, and sd: pnorm(33, mean=34.5, sd=1.4)=0.1419884. The probability is ≈ .142, since ≈ 14.2% of heights (values of H) are below 33 inches. We say the height of 33 is around the 14th percentile of H.
Boys’ Heights c: Approximately what percent of two-year-old boys have heights between 33.1 and 35.9?
We can convert to standard units: \( P(33.1 < H < 35.9) = P\left(\frac{33.1 - 34.5}{1.4} < Z < \frac{35.9 - 34.5}{1.4}\right) = P(-1 < Z < 1) \) to see we are looking for the percent of values of H within one standard deviation of the mean. Since H is approximately normally distributed, our rule of thumb tells us ≈ 68% of heights. We can also use R to check: pnorm(1)-pnorm(-1)= 0.6826895
Boys’ Heights d: What height of two-year-old boys is at the 95th percentile? (i.e., what value of H has ≈ 95% below and ≈ 5% above?)
We can look for what standardized (z) score has .95 area below it with qnorm(.95, mean=0, sd=1)= 1.644854 and then convert that value to h by undoing the standardization (z score) transformation: \( 1.644854 = \frac{h - 34.5}{1.4} \), so \( h = 1.644854 \cdot 1.4 + 34.5 = 36.8028 \) inches. R will also do the conversion for us if we specify the mean and sd of H: qnorm(.95, mean=34.5, sd=1.4)= 36.8028.
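The answers to Boys’ Heights a through d can also be checked by simulation with rnorm(). The sketch below generates a large sample of heights and compares relative frequencies and sample quantiles with the exact pnorm() and qnorm() results above; the seed and sample size are arbitrary choices for this illustration.

```r
set.seed(324)   # arbitrary seed so the sketch is reproducible
heights <- rnorm(100000, mean = 34.5, sd = 1.4)   # simulated realizations of H

# Part a: standardized heights should have mean near 0 and sd near 1
z <- (heights - 34.5) / 1.4
mean(z)
sd(z)

# Part b: proportion below 33 inches, compare with pnorm(33, 34.5, 1.4) = 0.142
mean(heights < 33)

# Part c: proportion between 33.1 and 35.9 inches, compare with about 0.683
mean(heights > 33.1 & heights < 35.9)

# Part d: sample 95th percentile, compare with qnorm(.95, 34.5, 1.4) = 36.8028
quantile(heights, 0.95)
```

The simulated values will not match the exact ones perfectly, but they should agree to about two decimal places with a sample this large.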