Using R, please show the code needed for each step
Question
Using R, please show the code needed for each step.
1) Initial data overview
a. Load the faithful dataset in R
b. What are the column headers for this data set?
c. How many rows of data are in the data set?
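A minimal sketch of step 1, assuming a standard R installation where `faithful` is available from the built-in `datasets` package:

```r
# faithful ships with base R (datasets package), so nothing needs installing
data(faithful)

# b. column headers
names(faithful)

# c. number of rows
nrow(faithful)
```

`str(faithful)` or `head(faithful)` give a similar overview in one call.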
2) Summary stats for the full data set
a. Compute all of the following for the duration of the eruptions and the waiting time
between the eruptions
i. mean
ii. population variance
iii. population standard deviation
iv. population coefficient of variation
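One possible way to compute the step 2 statistics. R's `var()` and `sd()` divide by n - 1, so this sketch defines its own population versions; the helper names `pop_var`, `pop_sd`, and `pop_cv` are illustrative and not part of base R. It assumes the 272 rows are treated as the entire population.

```r
data(faithful)

# var() and sd() use the n - 1 denominator, so define population
# versions that divide by n (helper names are hypothetical)
pop_var <- function(x) mean((x - mean(x))^2)
pop_sd  <- function(x) sqrt(pop_var(x))
pop_cv  <- function(x) pop_sd(x) / mean(x)

# apply all four statistics to both columns; the result is a 4 x 2 matrix
summary_stats <- sapply(faithful, function(x) {
  c(mean = mean(x), pop_var = pop_var(x), pop_sd = pop_sd(x), pop_cv = pop_cv(x))
})
summary_stats
```

An equivalent route is to rescale the built-in result, for example `var(x) * (n - 1) / n`.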
3) Sampling
a. Create a new data frame that contains 100 samples of size 10 from the eruption
duration column of the faithful data set
i. You can use the sample() function to create your samples of size 10
ii. You can use the replicate() function to repeat the sampling 100 times
iii. You can cast the result as a data frame using data.frame()
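The three hints in step 3 can be combined into a single expression. This is a sketch: the `set.seed(1)` call is an assumption added only so the result is reproducible, and `sample()` is left at its default of sampling without replacement, which the question does not specify.

```r
data(faithful)
set.seed(1)  # assumption: fixed seed, only for reproducibility

# replicate() returns a 10 x 100 matrix: each column is one sample of size 10
samples <- data.frame(replicate(100, sample(faithful$eruptions, size = 10)))

dim(samples)  # rows = observations within a sample, columns = samples
```

With this layout each sample is a column, so later per-sample statistics are computed column by column.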
4) Analyze the Samples
a. Create 3 new empty vectors – these will store the sample mean, sample variance, and
sample standard deviation of each of your 100 samples
b. For each sample
i. compute the sample mean and add it to the sample means vector
ii. compute the sample variance and add it to the sample variances vector
iii. compute the sample standard deviation and add it to the sample standard
deviations vector
c. Compute the average and variance of each new vector
i. sample means
ii. sample variances
iii. sample standard deviations
d. Compute the bias (estimate – true parameter) of the
i. sample means
ii. sample variances
iii. sample standard deviations
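Step 4 could be sketched with an explicit loop, as the question describes. The sketch rebuilds the samples so it runs on its own. It assumes the "true parameters" are the population values computed from all 272 eruption durations, and that the "estimate" in the bias is the average of each vector.

```r
data(faithful)
set.seed(1)  # assumption: fixed seed, only for reproducibility
x <- faithful$eruptions
samples <- data.frame(replicate(100, sample(x, size = 10)))

# true parameters, treating the full column as the population
true_mean <- mean(x)
true_var  <- mean((x - true_mean)^2)
true_sd   <- sqrt(true_var)

# a. three empty vectors
sample_means <- c()
sample_vars  <- c()
sample_sds   <- c()

# b. each column of the data frame is one sample
for (j in seq_len(ncol(samples))) {
  s <- samples[[j]]
  sample_means <- c(sample_means, mean(s))
  sample_vars  <- c(sample_vars, var(s))  # n - 1 denominator
  sample_sds   <- c(sample_sds, sd(s))
}

# c. average and variance of each vector
avg_of <- c(means = mean(sample_means), vars = mean(sample_vars), sds = mean(sample_sds))
var_of <- c(means = var(sample_means), vars = var(sample_vars), sds = var(sample_sds))

# d. bias = average estimate - true parameter
bias <- avg_of - c(true_mean, true_var, true_sd)
bias
```

A vectorized alternative is `sapply(samples, mean)`, which gives the same values without growing vectors inside a loop.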
5) Distribution Quantiles
a. Look up the z value that corresponds to a two-sided 95% confidence interval
b. Look up the t value that corresponds to a two-sided 95% confidence interval with n = 10
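In R the two quantiles can be looked up with `qnorm()` and `qt()`. A two-sided 95% interval leaves 2.5% in each tail, so the 0.975 quantile is used, and the t distribution has n - 1 = 9 degrees of freedom:

```r
# z value for a two-sided 95% interval (about 1.96)
z_crit <- qnorm(0.975)

# t value for a two-sided 95% interval with n = 10 (about 2.262)
t_crit <- qt(0.975, df = 10 - 1)

c(z = z_crit, t = t_crit)
```

The t value is larger than the z value, which makes the unknown-variance intervals wider for the same standard error.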
6) Confidence Intervals
a. Make 6 empty vectors to store the confidence interval lower and upper bounds when
the variance is and is not assumed to be known, and whether the true parameter
falls inside each interval
b. For each of the 100 samples
i. Find the lower and upper bounds for a 95% confidence interval of the mean
when the variance is known and store these values in the appropriate vectors
ii. Determine if the true mean is within the known variance CI and store this
information in the appropriate vector
iii. Find the lower and upper bounds for a 95% confidence interval of the mean
when the variance is not known and store these values in the appropriate
vectors
iv. Determine if the true mean is within the unknown variance CI and store this
information in the appropriate vector
c. Count the number of known variance intervals that contain the true mean. Is the result
what you expected? Why or why not?
d. Count the number of unknown variance intervals that contain the true mean. Is the
result what you expected? Why or why not?
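A sketch of step 6, rebuilt from scratch so it runs on its own. It assumes the "known" variance is the population variance of all 272 eruption durations, and it ignores any finite-population correction even though `sample()` draws without replacement. The vector names are illustrative.

```r
data(faithful)
set.seed(1)  # assumption: fixed seed, only for reproducibility
x <- faithful$eruptions
n <- 10
samples <- data.frame(replicate(100, sample(x, size = n)))

true_mean <- mean(x)
true_sd   <- sqrt(mean((x - true_mean)^2))  # population sd, treated as known
z_crit <- qnorm(0.975)
t_crit <- qt(0.975, df = n - 1)

# a. six empty vectors
known_lower   <- c(); known_upper   <- c(); known_contains   <- c()
unknown_lower <- c(); unknown_upper <- c(); unknown_contains <- c()

# b. one pass per sample (column)
for (j in seq_len(ncol(samples))) {
  s <- samples[[j]]
  m <- mean(s)

  # i-ii. variance known: z interval using the population sd
  lo <- m - z_crit * true_sd / sqrt(n)
  hi <- m + z_crit * true_sd / sqrt(n)
  known_lower    <- c(known_lower, lo)
  known_upper    <- c(known_upper, hi)
  known_contains <- c(known_contains, lo <= true_mean && true_mean <= hi)

  # iii-iv. variance unknown: t interval using the sample sd
  lo <- m - t_crit * sd(s) / sqrt(n)
  hi <- m + t_crit * sd(s) / sqrt(n)
  unknown_lower    <- c(unknown_lower, lo)
  unknown_upper    <- c(unknown_upper, hi)
  unknown_contains <- c(unknown_contains, lo <= true_mean && true_mean <= hi)
}

# c-d. how many intervals contain the true mean
sum(known_contains)
sum(unknown_contains)
```

For 95% intervals one would expect roughly 95 of the 100 to contain the true mean, with the exact count varying from run to run. The counts can also drift from 95 because the eruption durations are bimodal and n = 10 is small, so the normal and t approximations are only rough.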
7) Data Frame Modification
a. Use the rbind() function to append the vectors you have made to the data frame
containing your 100 samples of size 10 in the following order
i. sample mean
ii. known variance CI lower bound
iii. known variance CI upper bound
iv. if each known variance CI contains the true mean
v. unknown variance CI lower bound
vi. unknown variance CI upper bound
vii. if each unknown variance CI contains the true mean
b. Use the row.names() function to rename the rows of the data frame with appropriate &
informative names
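Step 7 might look like the sketch below. To stay short and self-contained, the seven vectors are computed here with vectorized calls rather than the loop the question asks for; the values are the same kind of quantities. All row names are illustrative choices, not ones given in the question.

```r
data(faithful)
set.seed(1)  # assumption: fixed seed, only for reproducibility
x <- faithful$eruptions
n <- 10
samples <- data.frame(replicate(100, sample(x, size = n)))

true_mean <- mean(x)
true_sd   <- sqrt(mean((x - true_mean)^2))  # population sd, treated as known
sample_means <- unname(sapply(samples, mean))
sample_sds   <- unname(sapply(samples, sd))
z_half <- qnorm(0.975) * true_sd / sqrt(n)
t_half <- qt(0.975, df = n - 1) * sample_sds / sqrt(n)

known_lower      <- sample_means - z_half
known_upper      <- sample_means + z_half
known_contains   <- known_lower <= true_mean & true_mean <= known_upper
unknown_lower    <- sample_means - t_half
unknown_upper    <- sample_means + t_half
unknown_contains <- unknown_lower <= true_mean & true_mean <= unknown_upper

# a. append one row per vector, in the order the question lists them;
#    the logical flags end up as 1/0 because the columns are numeric
result <- rbind(samples, sample_means,
                known_lower, known_upper, known_contains,
                unknown_lower, unknown_upper, unknown_contains)

# b. informative row names (hypothetical naming scheme)
row.names(result) <- c(paste0("obs_", 1:n), "sample_mean",
                       "known_ci_lower", "known_ci_upper", "known_ci_contains_mean",
                       "unknown_ci_lower", "unknown_ci_upper", "unknown_ci_contains_mean")

result[11:17, 1:3]  # appended rows for the first three samples
```

The result has 17 rows: the 10 observations of each sample followed by the 7 summary rows.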
8) Write a Data Frame to a CSV File
a. Use the write.csv() function to write your data frame with the 100 samples of size 10
and the confidence interval information to a csv file.
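A minimal sketch of the export in step 8. The file name `faithful_samples_ci.csv` and the use of `tempdir()` are assumptions; substitute any path you prefer. The data frame here is a shortened stand-in (samples plus a sample-mean row) for the full data frame from step 7.

```r
data(faithful)
set.seed(1)  # assumption: fixed seed, only for reproducibility
samples <- data.frame(replicate(100, sample(faithful$eruptions, size = 10)))

# stand-in for the step 7 data frame: samples plus one summary row
result <- rbind(samples, unname(sapply(samples, mean)))
row.names(result) <- c(paste0("obs_", 1:10), "sample_mean")

# hypothetical output location; row.names = TRUE keeps the row labels
out_file <- file.path(tempdir(), "faithful_samples_ci.csv")
write.csv(result, file = out_file, row.names = TRUE)

# read the file back to check that it round-trips
read_back <- read.csv(out_file, row.names = 1)
dim(read_back)
```

Keeping `row.names = TRUE` matters here because the row labels are what identify the summary rows in the file.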