Using R, please do not use the t.test() function.
1) You will need the datarium package to work with the mice2 data set. Install the package in R using the install.packages() function.
2) Initial data overview
a. Load the Nile data set in R.
b. How many years of data are included in this data set?
c. Load the AirPassengers data set in R.
d. What are the column headers for this data set?
e. How many rows of data are in the data set?
f. Load the mice2 data set in R. We need to use the datarium package when loading the
data, so we will include that as a parameter in the data() function.
data("mice2", package = "datarium")
Note that the mice2 data set contains paired data, so we have 2 data points for each
mouse.
g. How many rows of data are in the data set?
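One possible way to approach the data overview in part 2 is sketched below. It assumes a standard R installation, where Nile and AirPassengers come from the built-in datasets package, and it only touches mice2 if datarium is already installed (run install.packages("datarium") once beforehand). The variable names are illustrative. Because AirPassengers is a time series rather than a data frame, "column headers" and "rows" are read here as the month labels and yearly rows of its printed layout, which is an interpretation rather than something the question states.

```r
# Nile and AirPassengers ship with base R's 'datasets' package.
data("Nile")
n_years    <- length(Nile)        # one observation per year
year_range <- range(time(Nile))   # first and last year covered

data("AirPassengers")
# AirPassengers is a monthly 'ts' object, not a data frame, so it has no
# column names; print() lays it out with month abbreviations as headers.
ap_class    <- class(AirPassengers)
ap_colnames <- colnames(AirPassengers)             # NULL for a univariate ts
ap_n_obs    <- length(AirPassengers)               # total monthly observations
ap_n_rows   <- ap_n_obs / frequency(AirPassengers) # printed rows, one per year

# mice2 lives in datarium; only load it if the package is installed.
if (requireNamespace("datarium", quietly = TRUE)) {
  data("mice2", package = "datarium")
  mice2_rows <- nrow(mice2)
  print(head(mice2))
}
```

Printing AirPassengers itself shows the month-by-year layout, which is the easiest way to read off the headers by eye.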
3) Summary stats for the Nile data set
a. Compute the following for the annual flow of the Nile River:
i. mean
ii. variance
b. Create a qq plot of the data that includes a colored reference line using the qqnorm and
the qqline functions. It does not matter what color your reference line is as long as it is
not black. Would you conclude that the data is approximately normally distributed?
Why or why not?
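A minimal sketch of part 3, assuming the built-in Nile series: mean() and var() give the sample mean and sample variance, and qqnorm() followed by qqline() draws the Q-Q plot with a reference line. The color "red" is an arbitrary choice; any non-black color satisfies the question.

```r
data("Nile")

nile_mean <- mean(Nile)
nile_var  <- var(Nile)   # sample variance (divides by n - 1)

# Q-Q plot against the normal distribution with a colored reference line.
qqnorm(Nile, main = "Normal Q-Q plot: annual Nile flow")
qqline(Nile, col = "red", lwd = 2)
```

If the points track the reference line closely, with only mild departures in the tails, that supports treating the data as approximately normal; strong curvature would argue against it.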
4) 2-sided hypothesis test when the true variance is known
Using the Nile data set, conduct a hypothesis test to determine if the true mean is equal to 920.
It is given that the true variance is 28350. You should use α = 0.05.
Do not use the z.test() function to complete your work for this question.
a. Define the null and alternative hypotheses for this problem.
b. Use the qnorm function to look up the z value that corresponds to α = 0.05 for a two-sided
test.
c. Compute the upper and lower rejection values for this test.
d. Compute the z statistic for this problem.
e. Would you reject or fail to reject the null hypothesis?
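The steps in part 4 could be sketched as follows, without z.test(). The hypotheses are H0: mu = 920 against H1: mu != 920. The question does not say whether the "rejection values" are on the z scale or the sample-mean scale, so this sketch computes both; all variable names are illustrative.

```r
data("Nile")

mu0    <- 920     # hypothesized mean under H0
sigma2 <- 28350   # true variance, as given in the question
alpha  <- 0.05

n    <- length(Nile)
xbar <- mean(Nile)
se   <- sqrt(sigma2 / n)        # standard error of the mean

# Two-sided critical value: alpha is split across both tails.
z_crit <- qnorm(1 - alpha / 2)

# Rejection bounds on the z scale are -z_crit and +z_crit;
# on the sample-mean scale they are:
lower <- mu0 - z_crit * se
upper <- mu0 + z_crit * se

z_stat <- (xbar - mu0) / se
reject <- abs(z_stat) > z_crit  # TRUE means reject H0

cat("z critical:", z_crit, "\n")
cat("rejection bounds for the mean:", lower, upper, "\n")
cat("z statistic:", z_stat, "\n")
cat("reject H0:", reject, "\n")
```

Comparing |z| with the critical value and comparing the sample mean with the two bounds are equivalent decisions, so either presentation answers part e.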
5) 1-sided hypothesis test when the true variance is unknown
Using the AirPassengers data set, conduct a hypothesis test to determine if the true mean of the
monthly number of passengers in 1955 is equal to 280 or greater than 280. Use α = 0.05 for
your test.
Do not use the t.test() function to complete your work for this question.
a. Define the null and alternative hypotheses for this test.
b. Create a new data structure that contains only the data from the year 1955. Note: The
AirPassengers data set is time series data, so you will need to identify the starting and
ending indices for 1955 in this full dataset to be able to create the subset for 1955 only.
c. Find the sample mean and sample standard deviation of the number of monthly
passengers in 1955.
d. Use R to look up the t-value from the t-table for this test using the qt function. In your
report be sure to include what the values of alpha and n are for the t-value.
e. Compute the t-statistic for this test.
f. What is the p-value for this test?
g. Should you reject or fail to reject the null hypothesis?
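A hedged sketch of part 5, without t.test(). It reads the hypotheses as H0: mu = 280 against H1: mu > 280. The series starts in January 1949, so 1955 occupies indices 73 to 84; window() extracts the same months by date. Variable names are illustrative.

```r
data("AirPassengers")

mu0   <- 280
alpha <- 0.05

# Subset to 1955 by date; AirPassengers[73:84] selects the same 12 months.
ap_1955 <- window(AirPassengers, start = c(1955, 1), end = c(1955, 12))

n    <- length(ap_1955)
xbar <- mean(ap_1955)
s    <- sd(ap_1955)             # sample standard deviation

# Upper-tail critical value with n - 1 degrees of freedom.
t_crit <- qt(1 - alpha, df = n - 1)

t_stat <- (xbar - mu0) / (s / sqrt(n))

# One-sided p-value: probability of a t value at least this large under H0.
p_val  <- pt(t_stat, df = n - 1, lower.tail = FALSE)
reject <- t_stat > t_crit       # TRUE means reject H0

cat("n:", n, " alpha:", alpha, " t critical:", t_crit, "\n")
cat("mean:", xbar, " sd:", s, "\n")
cat("t statistic:", t_stat, " p-value:", p_val, "\n")
cat("reject H0:", reject, "\n")
```

The decision from the critical value and the decision from the p-value (reject when p < alpha) should always agree for the same test.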
6) Set up the differences column and check assumptions for the mice2 data set
a. Compute the sample mean and standard deviation of the weights of the mice before
and after treatment. From these summaries alone, do you think that there is a
difference in the average weight of the mice before and after the experiment?
b. Add a column to the mice2 data frame that contains the difference between the weight
after the experiment and the weight before the experiment (use after - before).
A new column can be added to a data frame using the following approach:
dataFrame$newColumn <- newDataVector
d. Create a qq plot of the weight difference and add a colored reference line. Would you
conclude that the differences are normally distributed?
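Part 6 might look like the sketch below. It assumes datarium is installed and that the weight columns in mice2 are named before and after; check names(mice2) first, since the question does not list the column names. The new column name diff and the color "blue" are arbitrary choices.

```r
if (requireNamespace("datarium", quietly = TRUE)) {
  data("mice2", package = "datarium")

  # Assumption: the weight columns are called 'before' and 'after'.
  summary_stats <- sapply(mice2[, c("before", "after")],
                          function(x) c(mean = mean(x), sd = sd(x)))
  print(summary_stats)

  # Add the differences column (after - before), as the question asks.
  mice2$diff <- mice2$after - mice2$before

  # Q-Q plot of the differences with a colored reference line.
  qqnorm(mice2$diff, main = "Normal Q-Q plot: weight difference")
  qqline(mice2$diff, col = "blue", lwd = 2)
}
```

With only a handful of mice, the Q-Q plot gives a rough check at best, so a few points straying from the line are not strong evidence against normality.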
7) Paired t-test
Do not use the t.test() function to complete your work for this question.
Using the mice2 data set
a. Conduct a paired t-test to determine if the weight of the mice is the same before and
after the treatment. Use α = 0.02.
i. Define the null and alternative hypotheses for this test.
ii. Compute the value of the test statistic for the difference in the weights of the
mice
iii. Use R to look up the t-value from the t-table for this test.
iv. Should we reject or fail to reject the null hypothesis?
b. Conduct a paired t-test to determine if the weight of the mice is greater after the
treatment. Use α = 0.02.
i. Define the null and alternative hypotheses for this test.
ii. Compute the value of the test statistic for the difference in the weights of the
mice
iii. Use R to look up the t-value from the t-table for this test.
iv. Should we reject or fail to reject the null hypothesis?
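Both paired tests in part 7 reduce to a one-sample t-test on the differences d = after - before, so they can share one helper. The function paired_t below is a hypothetical name introduced for this sketch, not part of any package, and the mice2 column names before and after are assumed. For part a the hypotheses are H0: mu_d = 0 against H1: mu_d != 0; for part b they are H0: mu_d = 0 against H1: mu_d > 0.

```r
# Hypothetical helper: paired t-test by hand, without t.test().
paired_t <- function(before, after, alpha,
                     alternative = c("two.sided", "greater")) {
  alternative <- match.arg(alternative)
  d  <- after - before
  n  <- length(d)
  df <- n - 1
  t_stat <- mean(d) / (sd(d) / sqrt(n))

  if (alternative == "two.sided") {
    t_crit <- qt(1 - alpha / 2, df)   # alpha split across both tails
    reject <- abs(t_stat) > t_crit
  } else {
    t_crit <- qt(1 - alpha, df)       # all of alpha in the upper tail
    reject <- t_stat > t_crit
  }
  list(t_stat = t_stat, df = df, t_crit = t_crit, reject = reject)
}

if (requireNamespace("datarium", quietly = TRUE)) {
  data("mice2", package = "datarium")
  # Assumption: the weight columns are called 'before' and 'after'.
  res_two_sided <- paired_t(mice2$before, mice2$after, alpha = 0.02)
  res_greater   <- paired_t(mice2$before, mice2$after, alpha = 0.02,
                            alternative = "greater")
  print(res_two_sided)
  print(res_greater)
}
```

The test statistic is identical in both parts; only the critical value and the rejection rule change between the two-sided and one-sided versions.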