STAT847_W24_Reading1

School: University of Waterloo
Course: 847
Subject: Statistics
Date: Apr 3, 2024

STAT 847: Reading Assignment 1

DUE: Friday January 19, 2024 by 11:59pm Eastern

NOTES

Your assignment must be submitted by the due date listed at the top of this document, and it must be submitted electronically in .pdf format via Crowdmark.

Organization and comprehensibility are part of a full solution. Consequently, points will be deducted for solutions that are disorganized or incomprehensible.

Furthermore, if you submit your assignment to Crowdmark, but you do so incorrectly in any way (e.g., you upload your Question 2 solution in the Question 1 box), you will receive a 5% deduction (i.e., 5% of the assignment’s point total will be deducted from your point total).

Reading: Hands-On Exploratory Data Analysis with R [8 marks]

Open the UWaterloo Library website, lib.uwaterloo.ca, and use your WatIAM account to search for and open the book Hands-On Exploratory Data Analysis with R, by Radhika Datar and Harish Garg. The following questions can be answered by reading the “Univariate and Control Datasets” chapter.

Please put your answers to questions 1-4 on a separate page from your answers to questions 5-8; this can be done in Word with Ctrl + Enter, or in Markdown with \newpage. Each question is worth one mark. (Unless “in your own words” is specified, you can directly quote the book.)

Winter 2024 Reading Assignment Questions

Q1. What’s the name of the test for outliers used in this chapter?

Answer: Tietjen-Moore test

Q2. What does the variable ‘pdays’ represent?

Answer: The variable ‘pdays’ represents the number of days that have elapsed since the customer was last contacted.

Q3. How many rows are there in the bank marketing data?

Answer: 11162

Q4. What does each row represent in the bank marketing data?

Answer: Each row in the dataset represents the details of a single client contact made during a bank marketing campaign. The columns of each row can be categorized as follows:

1. the client’s personal information
2. the client’s financial information
3. details of the interaction regarding the campaign
4. historical information about previous campaigns

Each row is a comprehensive record of a single marketing interaction with a client, encompassing personal, financial, and campaign-related information.

Q5. (Challenge) What is the two-sample t-test actually comparing in the “the t-test in R” page?

Answer: The t-test is a method for comparing two samples. On the page in question, the clients’ ages and bank balances are being compared, i.e. we are trying to determine whether their means differ significantly. The test is conducted under the null hypothesis that there is no difference between the means; the alternative hypothesis is that there is a difference. In this case, the result suggests rejecting the null hypothesis in favor of the alternative.

Q6. What makes a model parsimonious?

Answer: A model is considered parsimonious if it is simple yet has great explanatory or predictive power, using a minimum number of parameters or predictor variables. It should employ parsimonious covariance structures and only consider relevant variables.

Q7. Almost every named distribution (e.g., the normal, the uniform) has a function that calculates its cumulative distribution function. What is the letter that all such functions start with?

Answer: The letter ‘p’. Density or probability functions start with the letter ‘d’ (e.g. dnorm), and the R function that calculates the corresponding cumulative distribution starts with the letter ‘p’ (e.g. pnorm).

Q8. According to the Shapiro-Wilk test, are bank balances normally distributed?

Answer: In the example provided in the book, the Shapiro-Wilk test is performed on a small subset of the data; the p-value is less than 0.05, so we reject the null hypothesis that bank balances are normally distributed (for those 10 samples only). However, the text states that as the number of samples increases, the p-value rises above 0.05, so for the larger sample we fail to reject the null hypothesis of normality; that is, the larger sample is consistent with a normal distribution.
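
The naming convention in the Q7 answer separates a distribution's density from its cumulative distribution function. The following is a minimal sketch of that distinction, written in Python rather than R so that it runs with only the standard library. The pairing is an analogy, not the book's code: `NormalDist.pdf` computes the same quantity as R's `dnorm`, and `NormalDist.cdf` the same quantity as R's `pnorm`. The variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Density at x: the quantity R's dnorm(x) returns.
density_at_zero = std_normal.pdf(0.0)

# Cumulative probability P(X <= x): the quantity R's pnorm(x) returns.
cumulative_at_zero = std_normal.cdf(0.0)
cumulative_at_196 = std_normal.cdf(1.96)

print(f"density at 0       : {density_at_zero:.4f}")    # 0.3989
print(f"P(X <= 0)          : {cumulative_at_zero:.4f}")  # 0.5000
print(f"P(X <= 1.96)       : {cumulative_at_196:.4f}")   # 0.9750
```

The density is a height of the curve, while the cumulative value is an area under it up to x, which is why only the ‘p’ functions return probabilities directly for a continuous distribution.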
