ISYE 3030 Assignment 3
Lokranjan Lakshmikanthan
10/6/2021
Libraries
library(tidyverse)
Question 8.2.9
Part A
Chapter8 = read.csv("ch08.csv")
data = na.omit(Chapter8$EX.8.2.9)
ggplot(as.data.frame(data), aes(sample = data)) +
  stat_qq() +
  stat_qq_line()
The data appear to be approximately normal. Most points fall on the reference line in the normal probability plot above, and the deviations from the line are small enough to assume normality. Because the population variance is not known, the confidence interval below is built from t-scores rather than z-scores.
Part B
sMean = mean(data)
SD = sd(data)
n = length(data)
t = qt(0.975, n - 1)
TwoSidedError = t * SD / sqrt(n)
Upper = sMean + TwoSidedError
Lower = sMean - TwoSidedError
Upper
## [1] 2282.516
Lower
## [1] 2237.317
The two-sided 95% confidence interval on the mean strength is (2237.317 <= True Mean <= 2282.516)
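As a cross-check on the hand computation, the same kind of interval can be obtained from R's built-in t.test(). The sketch below is only illustrative: the vector x is made up (it is not the textbook data), and the helper name t_ci is an assumption introduced here.

```r
# Hypothetical sample, used only to illustrate the method.
x <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 12.8, 11.6)

# Two-sided t confidence interval for the mean (illustrative helper).
t_ci <- function(x, conf = 0.95) {
  n <- length(x)
  alpha <- 1 - conf
  err <- qt(1 - alpha / 2, df = n - 1) * sd(x) / sqrt(n)
  c(lower = mean(x) - err, upper = mean(x) + err)
}

manual  <- t_ci(x)                               # formula used in Part B
builtin <- t.test(x, conf.level = 0.95)$conf.int # built-in equivalent

manual
builtin
```

Both objects should hold the same two bounds, which suggests the manual formula matches what t.test() reports.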
Part C
OneSidedError = qt(0.95, n - 1) * SD / sqrt(n)
Lowerz = sMean - OneSidedError
Lowerz
## [1] 2241.477
The one-sided lower 95% confidence interval on the mean strength is (2241.477 <= True Mean). The lower bound of the two-sided interval is less than the lower bound of the one-sided interval. Both intervals use alpha = 0.05, but the one-sided interval places all of alpha in a single tail, so it uses the 0.95 quantile of the t distribution instead of the 0.975 quantile. The smaller t-score reduces the margin of error on the one-sided interval.
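The two critical values can be compared directly. This is a minimal sketch; df = 11 is an assumed value chosen for illustration, since the sample size is not printed in the write-up.

```r
# Assumed degrees of freedom, for illustration only (corresponds to n = 12).
df <- 11

t_two_sided <- qt(0.975, df)  # 0.025 in each tail
t_one_sided <- qt(0.95, df)   # 0.05 in a single tail

c(two_sided = t_two_sided, one_sided = t_one_sided)
```

For any fixed degrees of freedom the 0.95 quantile is below the 0.975 quantile, so the one-sided margin of error is always the smaller of the two.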
Question 8.3.3
data2 = as.numeric(Chapter8$EX.8.3.3.1[2:9])
ggplot(as.data.frame(data2), aes(sample = data2)) +
  stat_qq() +
  stat_qq_line()
The average mean temperatures at the 8 sites appear to be normally distributed. The normal probability plot shows few extreme deviations from the line, so a normal distribution is a reasonable assumption.
SQU = qchisq(0.975, 7)
SQL = qchisq(0.025, 7)
V1 = var(data2)
n1 = length(data2) - 1
LowerC = sqrt((n1 * V1) / SQU)
UpperC = sqrt((n1 * V1) / SQL)
LowerC
## [1] 1.402562
UpperC
## [1] 4.317464
By assuming a normal distribution for the data we can arrive at a 95% two-sided confidence interval for the standard deviation over the sites: (1.402562 <= True Standard Deviation <= 4.317464).
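The chi-square calculation above can be wrapped in a small reusable function. This is a sketch under assumptions: the helper name sd_ci and the temps vector are hypothetical and are not taken from the exercise data.

```r
# Chi-square confidence interval for a standard deviation,
# valid when the data are (approximately) normal. Illustrative helper.
sd_ci <- function(x, conf = 0.95) {
  df <- length(x) - 1
  alpha <- 1 - conf
  s2 <- var(x)
  lower <- sqrt(df * s2 / qchisq(1 - alpha / 2, df))  # divide by upper quantile
  upper <- sqrt(df * s2 / qchisq(alpha / 2, df))      # divide by lower quantile
  c(lower = lower, upper = upper)
}

# Hypothetical temperatures for 8 sites (not the textbook values).
temps <- c(21.2, 18.6, 22.9, 17.0, 19.4, 23.6, 20.5, 16.8)
ci <- sd_ci(temps)
ci
```

Note that the larger chi-square quantile produces the lower bound and the smaller quantile produces the upper bound, which is why the interval is not symmetric about the sample standard deviation.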
Question 8.3.6
TumorAppearanceTimes = na.omit(Chapter8$EX.8.3.6)
ggplot(as.data.frame(TumorAppearanceTimes), aes(sample = TumorAppearanceTimes)) +
  stat_qq() +
  stat_qq_line()
The data does not appear to be normally distributed. It does not follow a linear trend on the normal probability plot.
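A visual judgment from a normal probability plot can be supplemented with a formal test such as Shapiro-Wilk, available in base R as shapiro.test(). The sketch below uses a simulated right-skewed sample (an assumption, not the tumor data) purely to show the call.

```r
set.seed(1)

# Hypothetical right-skewed sample standing in for non-normal data.
skewed <- rexp(40, rate = 1 / 30)

sw <- shapiro.test(skewed)
sw$statistic  # W close to 1 is consistent with normality
sw$p.value    # a small p-value is evidence against normality
```

This test was not part of the original analysis; it is offered only as one possible numerical companion to the plot.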
n2 = length(TumorAppearanceTimes) - 1
CHSQU_Tumor = qchisq(0.975, n2)
CHSQL_Tumor = qchisq(0.025, n2)
V_Tumor = var(TumorAppearanceTimes)
Upper_Tumor = sqrt((n2 * V_Tumor) / CHSQL_Tumor)
Lower_Tumor = sqrt((n2 * V_Tumor) / CHSQU_Tumor)
Upper_Tumor
## [1] 20.46307
Lower_Tumor
## [1] 13.13045
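If the data did have a normal distribution, the 95% two-sided confidence interval for the standard deviation of the tumor appearance times would be (13.13045 <= True SD <= 20.46307). Since the normality assumption appears to be violated, this interval should be interpreted with caution.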
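When normality is doubtful, one alternative that does not rely on the chi-square result is a percentile bootstrap interval for the standard deviation. This is a different technique from the one used above, shown here only as a sketch; the sample times is simulated and the number of resamples B is an arbitrary choice.

```r
set.seed(42)

# Hypothetical skewed sample (not the tumor appearance data).
times <- rexp(40, rate = 1 / 30)

B <- 2000  # number of bootstrap resamples (arbitrary choice)

# Resample with replacement and recompute the standard deviation each time.
boot_sd <- replicate(B, sd(sample(times, replace = TRUE)))

# Percentile bootstrap 95% interval for the standard deviation.
boot_ci <- quantile(boot_sd, c(0.025, 0.975))
boot_ci
```

The percentile method is the simplest bootstrap interval; it can still perform poorly for small or heavily skewed samples.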
Question 8.4.3
Part A
n_r = 30
phat = 12 / n_r
Error_r = qnorm(0.975) * sqrt(phat * (1 - phat) / n_r)
Upper_r = phat + Error_r
Lower_r = phat - Error_r
Upper_r
## [1] 0.5753045
Lower_r
## [1] 0.2246955
The 95% two-sided confidence interval for the true proportion of underweight rats is (0.2247 <= p <= 0.5753). The normal approximation is reasonable here because n * phat = 12 and n * (1 - phat) = 18 are both greater than 5.
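The interval above is the large-sample (Wald) interval for a proportion. A minimal sketch of it as a function follows; the name wald_ci is hypothetical, and the inputs 12 and 30 are the counts used in Part A.

```r
# Large-sample (Wald) confidence interval for a proportion. Illustrative helper.
wald_ci <- function(successes, n, conf = 0.95) {
  phat <- successes / n
  z <- qnorm(1 - (1 - conf) / 2)
  err <- z * sqrt(phat * (1 - phat) / n)
  c(lower = phat - err, upper = phat + err)
}

rats <- wald_ci(12, 30)
rats

# Counts behind the usual rule of thumb for the normal approximation.
c(n_phat = 30 * (12 / 30), n_one_minus_phat = 30 * (1 - 12 / 30))
```

Calling wald_ci(12, 30) should reproduce the bounds 0.2246955 and 0.5753045 reported above.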
Part B
Required_SampleSize = phat * (1 - phat) / (0.02 / qnorm(0.975))^2
Required_SampleSize
## [1] 2304.875
A sample size greater than or equal to 2305 is needed to keep the error less than 0.02.
Part C
n = (1.96 / 0.02)^2
n = n * 0.25
n
## [1] 2401
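Using the conservative value p = 0.5, which maximizes p * (1 - p) at 0.25, a sample size of n >= 2401 keeps the error below 0.02 regardless of the true proportion.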
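Parts B and C apply the same sample-size formula with different planning values for p. The sketch below combines them in one function; the name n_required is an assumption, and it rounds up because a sample size must be a whole number.

```r
# Sample size needed so the margin of error for a proportion is at most `error`.
# p defaults to 0.5, the most conservative planning value. Illustrative helper.
n_required <- function(error, p = 0.5, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  ceiling(p * (1 - p) * (z / error)^2)
}

n_with_estimate <- n_required(0.02, p = 12 / 30)  # Part B: uses phat = 0.4
n_conservative  <- n_required(0.02)               # Part C: uses p = 0.5

c(part_b = n_with_estimate, part_c = n_conservative)
```

This sketch uses qnorm(0.975) in both cases, whereas Part C above uses the rounded value 1.96; after rounding up, the results agree with 2305 and 2401.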
Question 8.4.6
Part A
n_seeds = 200
phat_seeds = 180 / n_seeds
Error_seeds = qnorm(0.975) * sqrt(phat_seeds * (1 - phat_seeds) / n_seeds)
Upper_seeds = phat_seeds + Error_seeds
Lower_seeds = phat_seeds - Error_seeds
Upper_seeds
## [1] 0.9415771
Lower_seeds
## [1] 0.8584229
Since n is large enough to use the normal approximation (n * phat = 180 and n * (1 - phat) = 20), the 95% two-sided confidence interval for the true proportion of seeds that germinate is (0.8584 <= p <= 0.9416).
Part B
Yes, the confidence interval we constructed provides evidence that the claim made on the packet of seeds is reasonable. The claimed germination rate of 93% falls within the confidence interval we constructed.
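The containment check in Part B can be made explicit in code. The sketch below recomputes the interval from the counts in Part A and tests whether the claimed 0.93 lies inside it. It also calls binom.test() to obtain an exact (Clopper-Pearson) interval as a second opinion; that second method is an addition here and was not part of the original solution.

```r
n_seeds <- 200
phat_seeds <- 180 / n_seeds
err <- qnorm(0.975) * sqrt(phat_seeds * (1 - phat_seeds) / n_seeds)

claim <- 0.93  # germination rate claimed on the packet

# TRUE when the claimed proportion lies inside the large-sample interval.
claim_in_interval <- (phat_seeds - err <= claim) && (claim <= phat_seeds + err)
claim_in_interval

# Exact binomial interval for comparison (a different method).
exact <- binom.test(180, n_seeds, p = claim)
exact$conf.int
```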