Shah_Random Sampling f21
School: Rowan University
Course: 02280
Subject: Statistics
Date: Feb 20, 2024
Uploaded by: Ishashah221022
Biometry, Random Sampling
Biometry
Random Sampling
Text: Sections 1.3, 2.7
Objectives
o How to take a random sample from a population
o Discussion and questions on why random samples are necessary
o Create and interpret scatterplots, time plots
o Linear transformations on data sets: when, why, and how to do them
Terminology
simple random sample (SRS) of size n = a sample of n items in which (a) every member of the population has the same chance of being included; and (b) the members of the sample are chosen independently of each other.
sampling bias = a bias resulting from a faulty sampling method.
sampling frame = a list of individuals from which the sample was actually selected.
population distribution (of a variable) = the distribution of its values for all members of the population.
sampling variability (error) = variability that arises because the observed value of a statistic depends on the particular sample selected, and typically varies from sample to sample. We cannot eliminate this variability but we can learn how it works.
sampling distribution (of a statistic) = the theoretical probability distribution of ALL of the possible values of the sample statistic. Note that the sample size (n) has an important effect on this distribution: with larger samples, the statistic varies less from sample to sample.
parameter = a number describing the population, or a population characteristic.
statistic = a number describing the sample, or a sample characteristic.

                      Parameter    Statistic
Mean                  μ            ȳ
Standard deviation    σ            s
Proportion            p            p̂
Parameters vs. statistics: In the science of statistics we need to use sample statistics (like the sample mean number of squares in a rectangle, ȳ) to estimate the population parameter (like the population mean number of squares in a rectangle, μ), because often a census is impossible or too expensive.
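The difference between a parameter and a statistic, and the idea of sampling variability, can be seen in a small simulation. The sketch below is only an illustration: the population of 100 rectangle sizes is made up (it is not the lab's "random rectangle" document), and the names `population` and `sample_means` are hypothetical.

```python
import random
import statistics

# Hypothetical population: number of squares in each of 100 rectangles.
# These values are made up for illustration; they are not the lab's rectangles.
rng = random.Random(1)
population = [rng.randint(1, 18) for _ in range(100)]

mu = statistics.mean(population)  # parameter: one fixed number for the population


def sample_means(n, repeats=2000):
    """Means of many simple random samples of size n (drawn without replacement)."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]


means_n5 = sample_means(5)
means_n20 = sample_means(20)

print(f"population mean (mu): {mu:.2f}")
for n, means in ((5, means_n5), (20, means_n20)):
    print(f"n = {n:2d}: average of sample means = {statistics.mean(means):.2f}, "
          f"SD of sample means = {statistics.stdev(means):.2f}")
```

Each value in `means_n5` is one statistic (ȳ) from one sample; the collection of them approximates the sampling distribution. Comparing the two printed SD values shows how the sample size n affects that distribution.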
PART 1: Random Sampling and Experimental Design
Random rectangles
For this exercise, we are trying to estimate the mean number of squares in a rectangle. For example, each of the rectangles below has 4 squares, so the mean number of squares per rectangle is 4.
Use this table to record your estimates of the mean number of squares:

Estimate                               Number of squares in each rectangle    Mean # of squares (ȳ)
First estimate (guess)
Second estimate (judgment)             8, 6, 3, 5, 18                         8
Third estimate (random # table)        1, 4, 4, 14, 16                        7.8
Fourth estimate (calculator or JMP)    12, 1, 1, 5, 2                         4.2
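As a quick check on the arithmetic, the means in the last column can be recomputed from the five counts in each row. This is a minimal sketch; the dictionary name `counts` is only for illustration.

```python
from statistics import mean

# Counts copied from the filled-in rows of the table above.
counts = {
    "Second estimate (judgment)": [8, 6, 3, 5, 18],
    "Third estimate (random # table)": [1, 4, 4, 14, 16],
    "Fourth estimate (calculator or JMP)": [12, 1, 1, 5, 2],
}

# Sample mean = sum of the counts divided by the number of rectangles (5).
estimates = {label: mean(values) for label, values in counts.items()}
for label, y_bar in estimates.items():
    print(f"{label}: mean = {y_bar}")
```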
1. First estimate (Guess): When the instructor says “Go”, you will look at the “random rectangle” document for 5 seconds and from this come up with an estimate (guess) of the average number of squares. When the instructor says “Stop”, write your estimate in the table above in the row labeled “First estimate”.
2. Second estimate (Judgment): Look at the “random rectangle” document and select five (5) different rectangles that, in your judgment, are representative of the population of rectangles. Write down the # of squares for those five rectangles in the table above, and calculate the mean number of squares (ȳ).
3. Third estimate (random sample using the random number table above): We will use a random number table to pick a simple random sample (SRS, see “Terminology” at the beginning of the lab) of 5 different random numbers between 01 and 100. Skip over any repeats, so that you get 5 different numbers. Your instructor will demonstrate the use of the random numbers table.
Write down 5 random numbers from the random numbers table on the lines below: ______ ______ ______ ______ ______
You will now use these random numbers to identify rectangles in the “random rectangle” document and count the squares in those rectangles.
• Write down the number of squares in each of the 5 rectangles in the table above.
• Calculate the mean number of squares in the 5 rectangles and write it into the table.
DO NOT RECORD YOUR RANDOM NUMBERS FROM THE RANDOM NUMBER TABLE IN THE TABLE OF YOUR ESTIMATES OF THE MEAN!!
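The table-reading procedure in step 3 can be sketched in code. This is one possible reading rule, not necessarily the one your instructor will demonstrate: it assumes digits are read two at a time, that the pair "00" stands for rectangle 100, and that repeats are skipped. The function name `srs_from_digit_row` and the row of digits are illustrative and are not taken from the lab's table.

```python
def srs_from_digit_row(digits, k=5, population_size=100):
    """Read a row of random digits two at a time and return the first k
    distinct labels between 1 and population_size, skipping repeats.

    Assumption: the pair "00" stands for rectangle 100.
    """
    digits = [d for d in digits if d.isdigit()]  # ignore spaces in the table
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i] + digits[i + 1])
        if label == 0:
            label = 100
        if label > population_size or label in chosen:
            continue  # out of range or a repeat: skip it
        chosen.append(label)
        if len(chosen) == k:
            break
    return chosen


# Illustrative row of digits (not taken from the lab's random number table).
row = "19223 95034 05756 28713 96409"
print(srs_from_digit_row(row))  # [19, 22, 39, 50, 34]
```

The returned labels identify which rectangles to look up; they are not the counts that go into the table of estimates.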
4. Fourth estimate (random sample using JMP): Generate 5 random integers between 1 and 100 using JMP (instructions below). Skip over any repeats, so that you get 5 different numbers.
Write down 5 random numbers from JMP on the lines below: ______ ______ ______ ______ ______
You will now use these random numbers to identify rectangles in the “random rectangle” document and count the squares in those rectangles.
• Write down the number of squares in each of the 5 rectangles in the table above.
• Calculate the mean number of squares in the 5 rectangles and write it into the table.
DO NOT RECORD YOUR RANDOM NUMBERS FROM JMP IN THE TABLE OF YOUR ESTIMATES OF THE MEAN!!
5. When you are done, record your four estimates on the spreadsheet at the front of the class.
How to generate random numbers in JMP
1. Create a new column in JMP.
2. Right-click on the top of the column. Select “Column Info…”, and give it a name.
3. Pull down the menu next to “Initial Data Values” and pick “Random”.
4. The number of rows you pick will be the number of random numbers generated.
5. Leave “Random Integer” selected, then pick the range of numbers you want generated. For this purpose, we want numbers between 1 and 100.
Note: the random number generator may pick the same number twice, even in a small sample (e.g. 5 numbers). If you’re sampling without replacement, you may want to have it generate more than you need, and just use the first 5 that are not the same number. If you are sampling with replacement, then repeats are ok and you’ll use them.
6. Hit “OK”, and JMP will populate the column with your random numbers. Always start with the one on top and work down; don’t pick numbers, use them in order. Only skip a number if it’s a repeat of one above it (again, if you want numbers that are sampled without replacement).
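For comparison, the same idea can be sketched outside JMP. The Python below is not part of the lab procedure; it only illustrates the difference between sampling without and with replacement, and the "generate extra, then use the first 5 different numbers in order" approach described in the note. The seed value and variable names are arbitrary choices for the example.

```python
import random

rng = random.Random(2024)  # seed fixed only so the example is repeatable

# Without replacement: 5 different rectangle numbers between 1 and 100.
without_replacement = rng.sample(range(1, 101), 5)

# With replacement: repeats are allowed and kept.
with_replacement = [rng.randint(1, 100) for _ in range(5)]

# "Generate more than you need" approach: work down the list in order,
# skipping any number that has already appeared, and keep the first 5.
extra = [rng.randint(1, 100) for _ in range(20)]
first_five_distinct = list(dict.fromkeys(extra))[:5]

print(without_replacement)
print(with_replacement)
print(first_five_distinct)
```

`dict.fromkeys` keeps the first occurrence of each number and preserves the original order, which matches "use them in order; only skip repeats."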
Questions: please answer these using the class data
a) Compare the class’ first and second estimates (guess and judgment). Are these two estimates close to each other? Please answer YES or NO and explain briefly why they might be similar or different.
b) Compare the class’ first and third estimates (guess and simple random sample using the random numbers table). Are these two estimates close to each other? Answer YES or NO and explain briefly why they might be similar or different.
c) Compare the class’ second and third estimates (judgment and simple random sample using the random numbers table). Are these two estimates close to each other? Answer YES or NO and explain briefly why they might be similar or different.
d) Compare the class’ third and fourth estimates (simple random samples, one using the random numbers table, the other using JMP to generate random numbers). Are these two estimates close to each other? Answer YES or NO and explain briefly why they might be similar or different.
Make separate histograms for each of the four class estimates to answer the following questions:
e) Compare the histograms for each of the four estimates by looking at their means (point of balance). Do you see differences among them? What might explain any differences you see, if present? Explain briefly.
f) Is there a sampling bias in the first two estimates, when everyone guessed or used their judgment? Explain briefly. (Reminder: the definition of sampling bias is at the beginning of the lab, under “Terminology”).
PART 2: SCATTERPLOTS, TIME PLOTS, LINEAR TRANSFORMATIONS
Tips on graphing
● The independent variable goes on the x-axis; the dependent variable goes on the y-axis.
● Double-check your graph once it’s done. Does it look like you expected it should, from the data? This final check will help you catch a lot of errors. Don’t just make a graph and consider it done: always ask if it looks right!!!
Terminology
Scatterplot: a plot of two related, quantitative variables on the Cartesian coordinate system (the x-y plane). If you think one of the variables explains the other, the former, or explanatory (independent) variable, goes on the x-axis, and the latter, or dependent variable, goes on the y-axis.
Data are from: Samuels et al., 2012, Problem 12.3.8
Time plot or time series plot: a scatterplot where one variable is time. Plot the values of the other variable on the vertical axis (y-axis) against time on the horizontal axis. Tips on interpretation of time plots: look for the overall pattern (trend: a general upward or downward ‘movement’ over time), ‘seasonal’ variation (a pattern that repeats every 12 months, 4 quarters, or other fixed time period), and striking deviations from the overall pattern (peculiar bumps, valleys, etc., over a short period of time).
Data are from NOAA, at: ftp://ftp.cmdl.noaa.gov/ccg/co2/trends/co2_mm_mlo.txt, accessed 9.18.11
Lab Exercises
Scatterplots
1. Make a scatterplot in JMP using the Survey Data for height vs. footprint length. Before making the graph, think about which variable goes on each axis.
Questions:
g) Of the two variables (height, footprint length), which is the explanatory variable, if either, that should go on the x-axis? Explain your choice.
h) Interpret the pattern in the scatterplot you just made: what does the graph tell you about the relationship between height and footprint length?
i) Now focus on the variability in the relationship between height and footprint length. Is this a strong or weak relationship (i.e. little or quite a bit of scatter in the pattern)? Why is there any variability (why don’t the data fall along a straight line)?
Points to help with the interpretation of scatterplots
Look for the overall pattern and striking deviations from that pattern (from Moore & McCabe’s Introduction to the Practice of Statistics, Freeman):
● overall pattern is made up of: direction, form, and strength
   direction – is there a positive association, a negative association, or none?
   form – is it linear or curved?
   strength – is the pattern strong, moderate, or weak? Look for how tight the overall pattern is. (Hint: If you cannot see a pattern, then what is the strength?)
● striking deviations from the overall pattern: usually we are talking about outliers, which are points outside the overall pattern.
Time plots, or time series plots
2. Construct a time plot (or time series plot) of the data provided below (these data are entered into a JMP spreadsheet posted on Canvas), which represent the number of bacterial colony-forming units (CFUs) per mL over time. Connect the points on this graph to make it easier to view the trend over time. You can do this by selecting the red arrow by “Bivariate Fit…” and then choosing “Flexible” and “Fit each Value”.
Time (h)    CFUs / mL        Time (h)    CFUs / mL
0           12000000         5.0         750000000
0.5         11300000         5.5         2000000000
1.0         10300000         6.0         1700000000
1.5         24800000         6.5         2000000000
2.0         60000000         7.5         3480000000
2.5         84000000         8.0         4000000000
3.0         166000000        12          4100000000
3.5         330000000        16          3700000000
4.0         2900000000       20          800000000
4.5         620000000        24.5        125000000
Questions:
j) Interpret the time plot you just created. What do the data tell you about how bacterial colony-forming units (CFUs) change over time? Is this what you expected the graph to look like?
k) Imagine we could do another experiment related to the one that generated these bacterial growth data. What new experiment could we do that would be likely to change the shape of the graph? Explain one variable you could change, how it would change the graph, and why you think it would change the graph in the way you predict (i.e. what’s your rationale?).
3. Construct a time plot (or time series plot) of the data provided below (they are entered into a JMP spreadsheet posted on Canvas), which represent the power load used over time, where each time period represents a 3-month period. Connect the points on this graph to make it easier to see a pattern in the data over time.
Time period    Power load (MW)    Time period    Power load (MW)
1              68.8               26             116.8
2              65                 27             144.2
3              88.4               28             123.3
4              69                 29             142.3
5              83.6               30             124
6              69.7               31             146.1
7              90.2               32             135.5
8              72.5               33             147.1
9              106.8              34             119.3
10             89.2               35             138.2
11             110.7              36             127.6
12             91.7               37             143.4
13             108.6              38             134
14             98.9               39             159.6
15             120.1              40             135.1
16             102.1              41             149.5
17             113.1              42             123.3
18             94.2               43             154.4
19             120.5              44             139.4
20             107.4              45             151.6
21             116.2              46             133.7
22             104.4              47             154.5
23             131.7              48             135.1
24             117.9
25             130.6
Question: l) Interpret the time plot you just created. How does the power load change over time? Why do you think the time plot looks the way it does (i.e. why does it change over time with the pattern it does)? Remember that each time point represents a 3-month interval.
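When a time plot zigzags from one period to the next, averaging over one full cycle can make the overall pattern easier to see. The sketch below is an optional aid and not part of the JMP instructions: it computes a 4-period moving average (4 quarters = 1 year) of the power load values from the table. The moving-average technique and the function name `moving_average` are additions for illustration.

```python
# Power load (MW) for time periods 1-48, copied from the table above.
power_load = [
    68.8, 65, 88.4, 69, 83.6, 69.7, 90.2, 72.5, 106.8, 89.2, 110.7, 91.7,
    108.6, 98.9, 120.1, 102.1, 113.1, 94.2, 120.5, 107.4, 116.2, 104.4,
    131.7, 117.9, 130.6, 116.8, 144.2, 123.3, 142.3, 124, 146.1, 135.5,
    147.1, 119.3, 138.2, 127.6, 143.4, 134, 159.6, 135.1, 149.5, 123.3,
    154.4, 139.4, 151.6, 133.7, 154.5, 135.1,
]


def moving_average(values, window=4):
    """Average of each run of `window` consecutive values (4 quarters = 1 year)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


trend = moving_average(power_load)
print(f"first 4-period average: {trend[0]:.1f} MW")
print(f"last 4-period average:  {trend[-1]:.1f} MW")
```

Plotting `trend` alongside the raw values is one way to compare the smoothed series with the period-to-period changes.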
Biometry, Lab 3 – scatterplots, time plots, linear transformations, random sampling and design
Linear transformations
Many transformations are linear and their effects on ȳ and s are predictable. Let Y' be the ‘new’ value and Y be the ‘old’ value. Linear transformations change an old variable into a new one by the equation:
Y' = mY + b
Under a linear transformation:
● The effect on ȳ is ‘natural’. That is, ȳ changes like Y: ȳ' = mȳ + b. It works for the median, Q1, and Q3, too.
● The effect on s is just multiplicative. That is, s' = |m|s, which is simply s' = ms when m is positive. It works for the IQR, too. Because we are talking about measures of dispersion, or variability, the addition of the constant b gets ‘wiped out.’
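These two rules can be checked numerically. The sketch below is only an illustration: the data reuse the five counts from the “Second estimate” row in Part 1, and the constants m = 2.5 and b = 10 are arbitrary values chosen for the example.

```python
import statistics

# Data reused from the "Second estimate" row; m and b are arbitrary
# illustrative constants, not values from the lab.
y = [8, 6, 3, 5, 18]
m, b = 2.5, 10

y_new = [m * value + b for value in y]  # Y' = mY + b

mean_old, mean_new = statistics.mean(y), statistics.mean(y_new)
sd_old, sd_new = statistics.stdev(y), statistics.stdev(y_new)
median_old, median_new = statistics.median(y), statistics.median(y_new)

# The mean and median follow Y' = mY + b; the SD is only multiplied by |m|.
print(f"mean:   {mean_new:.3f} vs m*mean + b   = {m * mean_old + b:.3f}")
print(f"median: {median_new:.3f} vs m*median + b = {m * median_old + b:.3f}")
print(f"sd:     {sd_new:.3f} vs |m|*sd        = {abs(m) * sd_old:.3f}")
```

In each printed line the two numbers agree, and b does not appear in the standard deviation line at all.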
4. Using the Survey Data, do the following calculations on the footprint length data, which were measured in cm:
o Write out the equation in the form above for the linear transformation of footprint length in centimeters to footprint length in inches. (Hint: think about what is “m” or “b” in the equation above, for your transformation)
o Do this calculation for 2 footprint lengths. Show your work and show the final result of footprint lengths in inches.
o Now, have JMP do this calculation for you for all the footprint length data in the Survey Data file. To do this, create a new column by scrolling in JMP to the right until there is an empty column. Double-click in the top of the column and type in a column name (e.g. “Footprint length (in.)”). Hit Enter. Then right-click on the new column title and select “Formula”. In the window that appears, re-create the equation you wrote above for the linear transformation from footprint length in cm to inches. Click “Apply” or “Ok”. The new column should populate with the values of footprint length in inches.
o To calculate the mean and standard deviation of footprint length in inches, create a histogram of the data in this new column. The descriptive statistics you need will be displayed.
Question: m) When would doing a linear transformation be useful? Give an example other than the calculation you just did on footprint lengths, and then explain why this other transformation would be useful.
Submission Details
● Submit this complete assignment on Canvas before the due date, which is the beginning of next week’s lab.
● Your assignment should include all of the graphs/charts you were asked to make, with figure legends, and correctly formatted as figures. You will also need to turn in your answers to questions a-m. Do not turn in any tables or lists of raw data.
● This assignment should be completed with your lab partner. If you are at a table with only three students, you may complete the assignment as a group of three. Only one assignment should be turned in per group.
Participation statement removed and updated for canvas F21 NAR
Participation statement s21 by nar and tjo
Reformatted and revised by nar F19
V6 based on cer: tjo revised 2.4.13; dcw & chd statistics revised 12.30.13; NR proofed and assignment change 9/19/2014.