STA3701_2023_TL_016_0_E

University of South Africa
STA3701/016/0/2023

Tutorial Letter 016/0/2023
Applied Statistics III
STA3701
Year module
Department of Statistics

ASSIGNMENT 6 QUESTIONS
ASSIGNMENT 06
Unique Nr.: 859830
Due date: 10 November 2023

Instructions

1. Do not PLAGIARISE. Students suspected of plagiarism will be subjected to disciplinary processes.
2. Use R to answer all the questions. Present or attach the R outputs. Label all figures and tables. Append the R code to the essay; do not include it in the body of the discussion. Only the R outputs may appear in the body.

Question 1

Each of the datasets hellung, coking and cystfibr, available in the ISwR package in R, can be analysed using multiple regression, two-way analysis of variance (Anova) or one-way analysis of covariance (Ancova). Download and analyse ONE of these datasets using the relevant model, and report the results of the data analysis in the form of an essay. The essay should be structured as follows:

1 Exploratory data analysis. (10 marks)
Briefly describe the chosen dataset and specify the response variable and the explanatory variables. Present and interpret numerical and graphical summaries of the data. For the Anova and Ancova, use interaction plots to investigate the interaction effects of the independent variables on the response variable.

2 Methodology. (10 marks)
Introduce the method of analysis (model) and motivate why it is the appropriate technique for analysing the data. List and discuss the assumptions underpinning the selected method of analysis.

3 Data analysis and discussion of results. (30 marks)
Fit the model.

(i) If the chosen model is the multiple regression model, your discussion should address the following questions.
- Does multicollinearity exist in the model? Use the relevant plots (scatterplot matrix) and numerical analyses (correlation and VIF analysis) to investigate whether multicollinearity is present.
- Investigate whether the assumptions of independence, homoscedasticity and normality of the errors have been violated. Use both graphical representations and hypothesis-based tests.
- Use the relevant regression diagnostics to assess the model for any unusual observations.
- Use Mallows' Cp criterion to select the "best" model and give the fitted model.
- Use the p-value procedure to test the significance of the parameters in the "best" model. Set the level of significance to five percent.
- Interpret the regression coefficients.

(ii) If the chosen method of analysis is the analysis of variance, your discussion should address the following questions.
- Investigate whether any of the model assumptions have been violated.
- Determine the significance of the main effects and the interaction effect. (Formulate hypotheses in terms of treatment means comparisons.)
- Use proper analysis of variance diagnostics to verify the assumptions and identify outliers / influential points in the data, and propose appropriate remedies.

(iii) If the chosen method of analysis is the analysis of covariance, your discussion should address the following questions.
- Investigate whether the assumptions of the Ancova model hold.
- Formulate hypotheses in terms of treatment means comparisons and hypotheses to compare two or more regression models.
- Use the relevant analysis of covariance diagnostics to verify the assumptions and identify outliers / influential points in the data, and propose appropriate remedies.

NB: For all tests of hypotheses, state the null and alternative hypotheses, critical regions (or rejection regions), test statistics and conclusions.

Grand total = [50]
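For orientation only, the VIF and Mallows' Cp quantities named under option (i) can be sketched in base R as shown here. This is a minimal sketch, not a model answer: it assumes the ISwR package is installed, takes cystfibr as the chosen dataset with pemax as the response, and uses a hypothetical candidate subset (weight, bmp, fev1) purely for illustration. The helper name mallows_cp is not part of any package.

```r
# Sketch under stated assumptions: ISwR is installed, cystfibr is the chosen
# dataset, and pemax is the response (an illustrative choice).
library(ISwR)

data(cystfibr)
full <- lm(pemax ~ ., data = cystfibr)

# Variance inflation factors from first principles: VIF_j = 1 / (1 - R_j^2),
# where R_j^2 comes from regressing predictor j on the remaining predictors.
predictors <- setdiff(names(cystfibr), "pemax")
vif_values <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  fit_j <- lm(reformulate(others, response = v), data = cystfibr)
  1 / (1 - summary(fit_j)$r.squared)
})

# Mallows' Cp for a submodel: Cp = SSE_p / MSE_full - (n - 2p), where p counts
# the submodel's coefficients including the intercept. (Hypothetical helper.)
mallows_cp <- function(sub, full) {
  n <- nobs(full)
  p <- length(coef(sub))
  sum(residuals(sub)^2) / summary(full)$sigma^2 - (n - 2 * p)
}

# Hypothetical candidate subset, chosen only to show the call.
candidate <- lm(pemax ~ weight + bmp + fev1, data = cystfibr)

print(round(vif_values, 2))
print(mallows_cp(candidate, full))
```

By construction the full model's Cp equals its number of coefficients, so a candidate whose Cp is close to its own p is the usual target. Functions in packages such as car and leaps offer the same quantities, but their use is a choice left to the student.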