Entity Academy Lesson 5 Normal Distribution Notes
School
Liberty University
Course
BASIC
Subject
Statistics
Date
Jan 9, 2024
Uploaded by JusticeFogPrairieDog37
The Normal Distribution
Data distributions come in all shapes and sizes.
The Normal Distribution is perfectly symmetrical and shaped like a bell, which is why it is sometimes called the “bell curve”. It is also called the Gaussian distribution, after Carl Friedrich Gauss, so the normal distribution, the bell curve, and the Gaussian distribution all mean the same thing. It is a very common distribution. Speaking in general terms, the smooth shape implies that an infinite number of measurements was taken. Measurements near the center are common, while those far from the center are not.
[Figure: Parameters of Normal Distribution, with vertical lines at -1 SD, the mean, and +1 SD, and the majority of the data between them]
Though the picture doesn’t show it, the curve never touches the horizontal axis; it extends to the left and right forever, getting closer to the axis without ever reaching it.
Descriptive Statistics for the Normal Distribution
Mean – the mean of a normally distributed variable is shown graphically as the vertical line at the center of symmetry, in the middle of the bell curve.
Standard Deviation – vertical lines are placed at “mean +/- x standard deviations”. Much of the data in a normally distributed variable is within one standard deviation of the mean. If you go out to +/- 2 standard deviations from the mean, you have captured most of the data in the distribution. If you go out to +/- 3 standard deviations, you have captured almost all of the data.
Median – the median is the “middle value,” or the point at which half of the area under the curve of the distribution is to the left, and the other half is to the right. In a normal distribution, the mean and the median are the same number.
Sindy Saintclair
Monday, November 28 2021
Lesson 5 – Normal Distribution and the Central Limit Theorem
Learning Objectives and
Questions
Notes and Answers
Range – because the distribution goes on forever to the left and right, there is no min, and there is no max. So, there is no range.
Practical Usage of the Normal Distribution
Most of the berries measured were between 70 and 130 mg. Interpret that as meaning that the mean of the distribution is 100 mg, and the standard deviation is 10 mg. Can you see how a lot of the berries are between 90 and 110, most are between 80 and 120, and virtually all are between 70 and 130? This is the practical interpretation of the mean and standard deviation. Later, you will learn how you can determine the probability of finding a berry in a certain weight range, and how rare it is to see an unusually large or unusually small berry.
The Standard Normal Distribution
The mean will always be 0 and the standard deviation will always be 1. The Greek letter σ (sigma) stands for the standard deviation.
68% of the values will be within 1 SD of the mean
95% of values will be within 2 SDs of the mean
99.7% of values will be within 3 SDs of the mean
The 68-95-99 Rule
Here are a few values for the standard deviation areas on the Standard Normal Distribution:
- area between -3 and -2 = 0.021
- area between -2 and -1 = 0.136
- area between -1 and 0 = 0.341
- area between 0 and 1 = 0.341
- area between 1 and 2 = 0.136
- area between 2 and 3 = 0.021
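The band areas listed above can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `area_between` is an illustrative assumption and not part of the lesson.

```python
from statistics import NormalDist

# Standard Normal Distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def area_between(lo, hi, dist=std_normal):
    """Area under the curve between lo and hi (a probability)."""
    return dist.cdf(hi) - dist.cdf(lo)

# Areas within 1, 2 and 3 standard deviations of the mean.
within = {k: area_between(-k, k) for k in (1, 2, 3)}

# Areas of the one-SD-wide bands from -3 to 3.
bands = {(lo, lo + 1): area_between(lo, lo + 1) for lo in range(-3, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
for (lo, hi), area in bands.items():
    print(f"between {lo} and {hi}: {area:.4f}")
```

Because the curve is symmetrical, each band on the left has the same area as its mirror image on the right, and the six bands together add up to the "within 3 SDs" figure.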
The z-score
Example One: The height of young adult women in Ghana has a mean of 159.0 cm and a standard deviation of about 4.9 cm. Gabianu is a college student originally from Ghana who stands 169.0 cm tall. That is 10 cm (about 4 inches) taller than the average woman, which is roughly 2 standard deviations (2 x 4.9 = 9.8). More precisely, Gabianu is about 2.04 standard deviations taller than the average Ghanaian woman. Is she extraordinarily tall or just a bit on the tall side?
Rather than depending on a subjective declaration of Gabianu’s stature, standardize her height. The result is a z-score, a great way to measure any individual piece of data relative to the population. To calculate the z-score, you need to know a couple of things:
- population mean
- population standard deviation
Though population parameters are technically unknown, treat them as if they are known because many calculations depend on knowing population parameters.
Wording used to imply that the population parameters are known:
- the baseline value of fat content in cheese
- the historical mean test score
- the agreed upon value for the speed of light
- the average lifespan of an incandescent light bulb
mu or µ stands for the population mean
sigma or σ stands for the population standard deviation
The z-score is simply the difference between the x-value and µ, divided by σ:
z = (x - µ) / σ
If the value is larger than the mean, the numerator will be positive. If the value is less than the mean, the numerator will be negative, which is fine.
A z-score example
In the case of Gabianu, the numerator for her z-score is 10 (169.0 - 159.0). With sigma in the denominator of the fraction, the difference (in the numerator) is simply getting scaled. Gabianu’s height difference from the mean is 10 cm. When you divide that by 4.9 (which is the population SD, or sigma), you are essentially converting the height difference of 10 cm and expressing it in terms of sigma.
So, ten divided by 4.9 is about 2.04, which means Gabianu’s height is about 2.04 SDs more than the average height.
Gabianu’s friend Rashida from Ghana is 154.8 cm tall. Calculate her z-score:
z = (154.8 - 159.0) / 4.9 = -4.2 / 4.9 ≈ -0.86
Because Rashida has a z-score of -0.86, she is a little less than one SD shorter than the average Ghanaian woman.
In short, the z-score is a measure of how many standard deviations your value is away from the population mean.
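The two height calculations can be reproduced in a few lines. This is a minimal sketch; the function name `z_score` and the constant names are illustrative assumptions, while the numbers come from the example.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the population mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Population parameters from the height example (in cm).
MU_HEIGHT = 159.0
SIGMA_HEIGHT = 4.9

z_gabianu = z_score(169.0, MU_HEIGHT, SIGMA_HEIGHT)
z_rashida = z_score(154.8, MU_HEIGHT, SIGMA_HEIGHT)

print(round(z_gabianu, 2), round(z_rashida, 2))  # prints: 2.04 -0.86
```

A positive result means the value is above the mean and a negative result means it is below, which matches the sign of the numerator.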
z-scores in the Standard Normal Distribution
Overlay the Standard Normal Distribution on IQ
- in the Standard Normal Distribution, the z-score will always be equal to x (because µ = 0 and σ = 1)
- unusual events are those that happen 5% of the time or less
- z-scores can be used to calculate probabilities, which are equal to the area between the curve and the horizontal axis of the distribution from which the random value is taken. To make the math easy, you can arbitrarily set the value of the area under the entire curve to 1.
Berry example – figure out the probability of selecting a single berry at random with a weight between 90 and 110 mg. In other words, you would like to figure out the area under the curve between 90 and 110, and compare that to the area under the whole curve.
What is the area of the green shaded region, relative to the area under the entire curve (blue and green regions combined)? The z-score for 90 is -1, and the z-score for 110 is 1, because µ=100 and σ=10. The probability of a single berry being between 90 and 110 is the same as the probability of a single z-score being between -1 and 1.
The area under the curve between -1 and 1 on the Standard Normal Distribution is 0.683. Thus, the probability of a single berry being between 90 and 110 mg is the same as the probability of the z-score being between -1 and 1, which is the same as the area between -1 and 1 on the Standard Normal Distribution. The answer is 0.683, or about 68%.
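The claim that the berry question and the z-score question have the same answer can be checked directly. The sketch below uses the standard library's `statistics.NormalDist` and computes the probability both ways; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

MU, SIGMA = 100, 10                    # berry weights in mg, from the notes
berries = NormalDist(mu=MU, sigma=SIGMA)
std_normal = NormalDist()              # defaults to mean 0, SD 1

# Route 1: area between 90 and 110 on the berry distribution itself.
p_direct = berries.cdf(110) - berries.cdf(90)

# Route 2: convert both limits to z-scores, then use the standard normal.
z_lo = (90 - MU) / SIGMA               # -1.0
z_hi = (110 - MU) / SIGMA              # 1.0
p_via_z = std_normal.cdf(z_hi) - std_normal.cdf(z_lo)

print(f"{p_direct:.3f} {p_via_z:.3f}")
```

Both routes give the same area, which is why a single table (or applet) for the Standard Normal Distribution is enough for any normally distributed variable.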
Unusual Events
When an unusual event occurs, it indicates one of two things:
1. something rare just happened
2. something is wrong (for example, an assumption about the system)
For example, in golf, a hole-in-one is a pretty rare event. Among the total population, the chances of making a hole-in-one are about 1 in 12,500. The probability of making a hole-in-one is 1/12,500 or 0.00008.
In the 2016 Masters at Augusta, in the final round on Sunday, 3 different golfers made a hole-in-one on the same hole, and all 3 happened within about an hour. To make it even more unusual, there were only 57 golfers playing that day, so 1 out of every 19 golfers hit a hole-in-one that day.
Whenever there is a probability for an event of 0.05 or less, and the event occurs, it is considered to be a rare event. When a rare event occurs, the first response of a data scientist is to question the validity of the assumptions in the system. The truth of the matter is that sometimes, it is just a rare event that just happened to occur.
Calculating a z-test
A Z-test will tell you how far away a particular value is from the mean and whether or not it is significant.
Examples Using the z-score Applet
Example 1
I want to know the probability that a randomly selected berry weighs between 85 mg and 115 mg. As a reference point, the 68-95-99 rule says the answer should be somewhere between 0.68 and 0.95, since 85 and 115 are each 1.5 standard deviations away from 100.
Start with the z-score. If x=85, µ=100, and sigma=10, then the z-score is:
z = (85-100)/10 = -15/10 = -1.5
When you plug -1.5 into the z-score applet at the bottom left and hit return, you will notice that the bottom right value changes to 1.5. Inputting the correct z-score is only half the battle here. The goal is the probability that a single berry will be between 85 and 115 mg.
The area (probability) shown at the top is 0.8664, so the probability of a single berry being between 85 and 115 mg is about 0.87.
Example 2
– Suppose I want to know the probability of a single berry weighing less than 88 mg. The z-score for 88 is: (88-100)/10 = -12/10 = -1.2
Plug the mean and standard deviation values in, then choose the below option on the applet.
Example 3
– Suppose I want to know the probability of a single berry weighing more than 94 mg. Plug the mean and the standard deviation in, and select the above option:
The probability of a single berry weighing more than 94 mg is about 0.73.
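The three applet examples can also be reproduced without the applet. This is a minimal sketch using the standard library's `statistics.NormalDist`; the helper names `prob_below`, `prob_above` and `prob_between` are illustrative assumptions that mirror the applet's below, above and between options.

```python
from statistics import NormalDist

# Berry weights from the notes: mean 100 mg, standard deviation 10 mg.
berries = NormalDist(mu=100, sigma=10)

def prob_below(x, dist):
    """P(X < x): the area to the left of x."""
    return dist.cdf(x)

def prob_above(x, dist):
    """P(X > x): the area to the right of x."""
    return 1.0 - dist.cdf(x)

def prob_between(lo, hi, dist):
    """P(lo < X < hi): the area between lo and hi."""
    return dist.cdf(hi) - dist.cdf(lo)

p_example_1 = prob_between(85, 115, berries)   # Example 1
p_example_2 = prob_below(88, berries)          # Example 2
p_example_3 = prob_above(94, berries)          # Example 3

print(f"{p_example_1:.4f} {p_example_2:.4f} {p_example_3:.4f}")
```

Under these assumptions the three areas come out to about 0.8664, 0.1151 and 0.7257, in line with the 0.87 and 0.73 quoted for Examples 1 and 3.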
Using the z-score to Determine a Percentile
Percentiles divide the whole data set into hundredths and are often seen in standardized exam results. For a z-score of 1.2, select everything below that point and you will get an area of 0.88.
Finding x from a z score
Remember that a percentile is the point at which a certain amount of a distribution is below a certain value, and the rest is above that value. For instance, the 45th percentile is the point in the distribution where 45% of the data are below (to the left of) that value, and 55% are above (to the right of) that value. Suppose you took a standardized exam and the mean score is 440 with a standard deviation of 23. Your score is 472. What percentile is that?
Since the area reads 0.9179, the percentile is 91.8. Approximately 91.8% of all scores are less than or equal to 472, and therefore 8.2% of test scores are greater than or equal to 472.
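The percentile for a score of 472 can be worked out as the area below that score. A minimal sketch, assuming the exam scores are normally distributed with the stated mean and standard deviation; the variable names are illustrative.

```python
from statistics import NormalDist

# Exam scores from the notes: mean 440, standard deviation 23.
exam = NormalDist(mu=440, sigma=23)

score = 472
z = (score - exam.mean) / exam.stdev     # about 1.39
area_below = exam.cdf(score)             # fraction of scores below 472
percentile = 100 * area_below

print(f"z = {z:.2f}, percentile = {percentile:.1f}")
```

The area above the score is simply 1 minus the area below it, which is where the remaining 8.2% comes from.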
Using a Percentile to Determine a z-Score or a Value for x
Suppose your friend’s results for the same test say they scored in the 73rd percentile, and they want to know what their raw score was. In the applet, choose the option for finding a value from an area and enter the mean (440), standard deviation (23), and the area (0.73). Then select the below option, since percentiles are all about finding the amount below:
The output in the below section is 454. That means your friend’s raw score was 454.
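Going from a percentile back to a raw score is the inverse of the area calculation. The sketch below uses `NormalDist.inv_cdf` from the Python standard library, again assuming normally distributed scores; the variable names are illustrative.

```python
from statistics import NormalDist

# Exam scores from the notes: mean 440, standard deviation 23.
exam = NormalDist(mu=440, sigma=23)

area_below = 0.73                        # 73rd percentile
raw_score = exam.inv_cdf(area_below)     # value with 73% of scores below it

# Equivalent route: find the z-score first, then un-standardize it.
z = NormalDist().inv_cdf(area_below)
raw_score_via_z = exam.mean + z * exam.stdev

print(round(raw_score), round(raw_score_via_z))  # prints: 454 454
```

The second route makes the relationship explicit: x = µ + z·σ is just the z-score formula solved for x.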
Parent and Child Distributions
Same mean, smaller standard deviation. The overall distribution, called the parent distribution, typically has a similar mean to the distribution of the means of the samples you collected, called the child distribution. However, the SD of the child distribution is smaller. The SD of the child distribution is referred to as the “standard error”!
Berry Example
A simulation selected and weighed about 5600 individual berries, and the resulting distribution looked like this:
The weights of individual berries are normally distributed. Now, instead of plotting the weight of every individual berry, take a sample of size 16, one sample at a time, and calculate the sample means.
Here is the first simulated sample:
A second sample is collected, 16 more berries are selected, each is weighed individually, and then the mean of the second sample is calculated:
Fast forward, take 500 samples, each of size 16, calculate the sample means, and write the means down on your spreadsheet. Then, construct another distribution, with the sample means instead of individual berry weights.
The distribution of sample means also looks to be normally distributed.
The mean of the distribution of sample means looks to be somewhere around 100. This is because the mean of the child distribution is the same as the mean of the parent distribution. It is important to also note that the SD of the child distribution is smaller than the SD of the parent distribution, according to this formula:
SD of child distribution = σ / √n, where σ is the SD of the parent distribution and n is the sample size
The distribution of individual berry weights (parent) has a mean of 100, and a SD of 10. The distribution of sample means (child) has a mean of 100, and a SD of 2.5:
The “width” of the child distribution is about ¼ the “width” of the parent distribution. If you apply what you know about the 68-95-99 rule, the distribution of sample means should have almost all of its values between 92.5 and 107.5, which are the mean +/- 3 SDs for the child distribution.
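The berry sampling experiment can be imitated with a short simulation. This is a minimal sketch, not the simulation used in the lesson: it draws berry weights with `random.gauss`, uses a fixed seed so the run is repeatable, and all constant names are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

MU, SIGMA = 100, 10       # parent distribution: individual berry weights (mg)
SAMPLE_SIZE = 16          # berries per sample
NUM_SAMPLES = 500         # number of samples taken

# Take 500 samples of 16 berries each and record each sample's mean.
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(SAMPLE_SIZE)]
    sample_means.append(mean(sample))

# Predicted SD of the child distribution (the standard error).
standard_error = SIGMA / sqrt(SAMPLE_SIZE)

print(f"mean of sample means: {mean(sample_means):.2f}")
print(f"SD of sample means:   {stdev(sample_means):.2f}")
print(f"predicted SE:         {standard_error:.2f}")
```

The simulated mean should land near 100 and the simulated SD near the predicted 2.5, though the exact figures vary with the seed and the number of samples.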
Central Limit Theorem
- the larger the sample size, the more likely it is to be:
  - normally distributed
  - approximating the population
  - accurate
- also referred to as the “true” population