Using SPSS to Generate 95% Confidence Intervals in NEUR 3001

NEUR 3001 Assignment 3: Using SPSS to generate 95% confidence interval estimates

Preamble

Confidence intervals are fundamentally important for many inferential statistics. In this assignment, you will generate confidence intervals to estimate the population parameter of the mean for 5 different samples, based on the central limit theorem (CLT) or based on bootstrapping. Note that the analyses in SPSS are fairly simple, and will become rather repetitive as I am asking you to essentially do the same steps 5 times. Apologies – this is necessary in order for me to ask the questions at the end.

Is your SPSS patch working?

When bootstrapping, your SPSS output table will have the word bootstrap within the column headings (see lecture slides for details). If you do not see this amongst the headers, you need to correctly patch your version of SPSS (see cuLearn section on software installation for details).

Part 1: Running the SPSS analyses

You have 5 samples, one in each of the 5 SPSS files (A-E). The samples differ in their sample size, and at least some will come from different populations. For each sample, generate the following:

- Histogram. For each histogram, include an overlay of the normal distribution.
- 95% confidence intervals based on the central limit theorem (CLT). To apply the CLT, ensure you use the one-sample t test menu in SPSS.
- 95% confidence intervals based on bootstrapping. Use BCa bootstrapping, and set the number of samples used in bootstrapping to 5000. When applying bootstrapping, ensure you use the descriptive statistics menu in SPSS – do not use the one-sample t test menu for bootstrapping confidence intervals. See lecture slides for details.

Insert screenshots of the 5 histograms generated across the 5 analyses below. Submit a video of you generating all histograms and running all confidence interval analyses for each of the 5 datasets.
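For readers who want to see what the two SPSS procedures compute, here is a minimal Python sketch. It makes several assumptions: the ten scores are hypothetical (they are not any of the assignment files), the critical t value for df = 9 is hard-coded, and a simple percentile bootstrap stands in for the BCa method that SPSS applies, so the bootstrap limits would not match SPSS output exactly.

```python
import math
import random
import statistics

# Hypothetical sample of 10 scores (NOT taken from the assignment files).
sample = [4, 5, 5, 6, 6, 6, 6, 7, 7, 8]

T_CRIT_DF9 = 2.262  # two-tailed 95% critical t value for df = n - 1 = 9


def clt_ci(data, t_crit):
    """t-based 95% CI for the mean: mean +/- t_crit * s / sqrt(n)."""
    n = len(data)
    mean = statistics.fmean(data)
    se = statistics.stdev(data) / math.sqrt(n)
    return mean - t_crit * se, mean + t_crit * se


def percentile_bootstrap_ci(data, n_resamples=5000, seed=0):
    """Percentile bootstrap 95% CI (a simpler stand-in for BCa)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample n scores with replacement and record the mean each time.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lower = means[int(0.025 * n_resamples)]
    upper = means[int(0.975 * n_resamples) - 1]
    return lower, upper


clt_low, clt_high = clt_ci(sample, T_CRIT_DF9)
boot_low, boot_high = percentile_bootstrap_ci(sample)
print(f"CLT 95% CI:       {clt_low:.2f} to {clt_high:.2f}")
print(f"Bootstrap 95% CI: {boot_low:.2f} to {boot_high:.2f}")
```

The t-based interval is always centred on the sample mean by construction, whereas the bootstrap limits are read directly from the sorted resampled means and need not be.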

Use the data in each of the 5 analyses to complete the table below. Make sure you are reading your tables correctly (for CLT, the CI is called the 95% confidence interval of the difference. For bootstrapping, remember you are looking for the 95% CIs for the population mean).

Sample   N    Mean   CLT 95% CI    CLT 95% CI    Bootstrap 95% CI   Bootstrap 95% CI
                     lower limit   upper limit   lower limit        upper limit
A        10   6      4.88          7.12          5.10               7
B        30   6      5.44          6.56          5.47               6.53
C        11   10     1.03          18.97         5.86               18.36
D        31   7.42   4.47          10.37         5.61               10.52
E        40   8.05   7.28          8.82          7.28               8.75

By reference to the histograms, raw scores, and the table that you have filled in above, answer the questions below. Note that in general, I am after answers that are simple and often obvious. Don’t worry if some questions seem ‘too easy’. I’m asking these questions to try and guide your thinking through the data in a step-by-step manner (and some of the steps are very straightforward).
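One quick way to check that the table has been read correctly is to compute how far each limit sits from the sample mean. The short Python sketch below does that arithmetic using the values copied from the table; the variable and function names are illustrative only.

```python
# Values copied from the completed table:
# sample -> (mean, CLT lower, CLT upper, bootstrap lower, bootstrap upper)
table = {
    "A": (6.00, 4.88, 7.12, 5.10, 7.00),
    "B": (6.00, 5.44, 6.56, 5.47, 6.53),
    "C": (10.00, 1.03, 18.97, 5.86, 18.36),
    "D": (7.42, 4.47, 10.37, 5.61, 10.52),
    "E": (8.05, 7.28, 8.82, 7.28, 8.75),
}


def limit_distances(mean, low, high):
    """Distance from the mean down to the lower limit and up to the upper limit."""
    return round(mean - low, 2), round(high - mean, 2)


results = {}
for name, (mean, clt_lo, clt_hi, boot_lo, boot_hi) in table.items():
    results[name] = {
        "clt": limit_distances(mean, clt_lo, clt_hi),
        "bootstrap": limit_distances(mean, boot_lo, boot_hi),
    }
    print(name, "CLT:", results[name]["clt"], "Bootstrap:", results[name]["bootstrap"])
```

For every sample the two CLT distances are equal (for example 8.97 on each side for sample C), while the bootstrap distances can differ: for sample C the lower limit is 4.14 below the mean and the upper limit is 8.36 above it.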

Part 2: Questions based on histograms and raw data

These questions aim to get you more familiar with the data, before answering questions about confidence intervals.

a) Which two distributions most closely approximate the normal distribution? (1 mark)
Distributions A and B are closest to the normal distribution.

b) What is the main difference between sample A and sample C? (Note that it is useful here to actually look at the numbers in SPSS, not just at the histogram) (1 mark)
C has a very small lower limit and a very large upper limit, creating a skewed distribution. A has upper and lower limits that are somewhat similar, creating a normal distribution.

c) What is the main difference between sample B and sample D? (1 mark)
B is a normal distribution and D is a skewed distribution.

d) Which one distribution do you think is the most different from a normal distribution? Briefly justify your answer. (1 mark)
I think distribution C is the most different from a normal distribution because it has the most dramatically different upper and lower limits.

Part 3: Questions about confidence intervals for samples A and B

a) Samples A and B look very similar, with similar means and standard deviations. Yet sample B generated narrower 95% confidence intervals. Why would the Central Limit Theorem (CLT) generate narrower 95% confidence intervals for sample B compared with A? (1 mark) Your answer should demonstrate an understanding of how those distributions are generated by the CLT.
Sample B has a narrower confidence interval because it has a larger sample size. When the sample size is larger, the sample is more representative of the population and shows a narrower distribution.

b) Why would bootstrapping also generate narrower 95% confidence intervals for B compared with A? (1 mark) Again, your answer should demonstrate an understanding of how those distributions are generated. This question may require some thought and is not directly answered in my lecture slides.
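Bootstrapping would also generate a narrower 95% confidence interval for B because each bootstrap resample is the same size as the original sample (30 scores rather than 10), so the resampled means vary less from one resample to the next.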
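The effect of sample size on both methods can be explored with a small simulation. The Python sketch below is illustrative only: the scores are hypothetical, and the larger sample is simply the smaller one repeated three times so that the two samples share the same mean and a similar spread. It compares the standard error from the formula s / sqrt(n) with the standard deviation of 5000 bootstrapped means.

```python
import math
import random
import statistics

# Hypothetical scores (NOT the assignment data). The larger sample repeats the
# smaller one three times, so both have the same mean and a similar spread.
small = [4, 5, 5, 6, 6, 6, 6, 7, 7, 8]  # n = 10
large = small * 3                        # n = 30


def bootstrap_means(data, n_resamples=5000, seed=1):
    """Means of resamples drawn with replacement, each the size of the data."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]


def spread_summary(data):
    """Return (formula standard error, SD of the bootstrapped means)."""
    se_formula = statistics.stdev(data) / math.sqrt(len(data))
    se_bootstrap = statistics.stdev(bootstrap_means(data))
    return se_formula, se_bootstrap


se_small, boot_small = spread_summary(small)
se_large, boot_large = spread_summary(large)
print(f"n = 10: formula SE = {se_small:.3f}, bootstrap SD of means = {boot_small:.3f}")
print(f"n = 30: formula SE = {se_large:.3f}, bootstrap SD of means = {boot_large:.3f}")
```

In this sketch both measures of spread shrink as n grows, which is why either method would be expected to give a narrower interval for the larger sample.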

Part 4: Questions about confidence intervals for samples C and D

a) Sample C is similar to sample A (above), and sample D is similar to sample B. The only differences are that C and D each contain an additional outlier. What is the most obvious impact on confidence intervals of having an outlier in the sample? Is the impact of the outlier affected by sample size? (2 marks)
When an outlier is present, the distribution is skewed rather than normal. Outliers affect any data set, but an outlier in a smaller sample has a greater impact on the data set.

b) Considering just sample C, how does the confidence interval generated by CLT differ from that generated by bootstrapping? (1 mark) Don’t just tell me if the interval is bigger or smaller – focus on comparing the values that define the lower and upper limits of the confidence intervals generated by the two methods.
The CLT confidence interval is symmetrical about the mean, with the two 95% tails being even, whereas the bootstrapped confidence interval is not.

c) What does your answer to (b) tell you about the shapes of the sampling distributions based on the CLT compared with bootstrapping? (1 mark)
My answer to (b) tells me that the sampling distribution based on the CLT is normal, whereas the one based on bootstrapping is skewed.

Part 5: Questions about confidence intervals for sample E

a) Our histogram of sample E showed a clearly non-normal distribution of raw scores. It is likely therefore that it comes from a population that also has a non-normal distribution. We know that the distribution of sample means generated by the CLT has to be symmetrical (see lecture for details). Do you think that the distribution of sample means generated by bootstrapping also approximates to a symmetrical distribution? To help you answer, try calculating the difference between the mean and the upper and lower limits of the bootstrapped 95% confidence interval. (1 mark)
No, bootstrapping does not approximate a symmetrical distribution.

b) The CLT assumes that the distribution of sample means approximates normality if sample size is >30, irrespective of the distribution of scores in the underlying population. Is this assumption supported by your answer above? Explain why. (1 mark)
Yes, it is. This is because the CLT 95% confidence interval gives a normal distribution for sample E, whose sample size is above 30, but bootstrapping never assumes that the distribution is normal, which is shown in our values in the table.

Submission checklist

- Video(s) of you generating histograms, CLT confidence intervals, and bootstrapped confidence intervals for all 5 datasets (A-E)
- This document in which you have
  - Completed table above (with data on N, Mean, 95% CI limits)
  - Answered all questions (Parts 2-5)
- Note that I am not asking you to include the output tables generated by SPSS (though I will typically request this in future assignments)
