Questions-HW3-STA305F2023

University of Toronto
Department of Statistical Sciences
Course: STA305H1F, Section L0101
Homework: HW3
Due date: 24 November 2023

0. Read the assignment descriptions and the information inside each box before submitting your answer files on Crowdmark.
1. Submit your response to each Image/PDF question box (e.g., Box Qx) as a single PDF file containing everything: handwritten responses, R/Rmd output and R/Rmd code. Marks will be assigned to your answer in the PDF file (Box Qx).
2. Submit your R/Rmd code to the Text answer question box (e.g., Box Qx R/Rmd) as separate file(s). These files will be downloaded and evaluated, but marks will be assigned to your answer in the PDF file (Box Qx).
3. R/Rmd code may not be required for some questions or parts of questions; this is stated explicitly where it applies.
4. Submit files separately for individual questions.
5. Computations using a language other than R will not be accepted.
6. PENALTY: Upload/submit your files on time to avoid a penalty. Also consult the syllabus.
7. Full credit will be given if you justify your answers using a clear and systematic approach. The justification must help the grader assess how you reached your answers. You may draw on the code and concepts in Prof. Nathan's textbook (2022). Link: http://designexptr.org/index.html/
STA305H1F2023-HW3: Q1 [Points = 3+3+2 = 8]

Q1. DCM (2010), Problem 4.16. An article in Communications of the ACM (Vol. 30, No. 5, 1987) studied different algorithms for estimating software development costs. Six algorithms were applied to several different software development projects, and the percent error in estimating the development cost was observed. Some of the data from this experiment are shown in the file DataHW3Q1F2023Algorithms.csv.

(a) [Points = 3] Do the algorithms differ in their mean cost estimation accuracy? Use α = 0.05. Justify your answer.

(b) [Points = 1+1+1 = 3] Analyze the residuals from this experiment for departures from the model assumptions you made to analyze this dataset. In particular, examine the constancy of the error variance, the normality of the residuals, and any outliers. Comment on any observed departures from the assumptions.

(c) [Points = 2] Which algorithm would you recommend for use in practice? Justify your answer.
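As a rough illustration of the workflow Q1 asks for, the sketch below fits a one-way ANOVA and runs numerical residual checks. It does not read the assignment file: the data are simulated, and the column names `Algorithm` and `PctError` are assumptions, since the actual header of DataHW3Q1F2023Algorithms.csv is not given here.

```r
# Simulated stand-in data: 6 algorithms, 5 projects each (all values made up).
# Column names Algorithm and PctError are assumed, not taken from the real file.
set.seed(305)
dat <- data.frame(
  Algorithm = factor(rep(paste0("Alg", 1:6), each = 5)),
  PctError  = rnorm(30, mean = rep(c(10, 12, 15, 11, 20, 13), each = 5), sd = 3)
)

# One-way fixed-effects model: PctError_ij = mu + tau_i + e_ij
fit <- aov(PctError ~ Algorithm, data = dat)
anova_tab <- summary(fit)[[1]]   # columns: Df, Sum Sq, Mean Sq, F value, Pr(>F)
print(anova_tab)

# Residual checks: normality, constant variance, outliers
res       <- residuals(fit)
norm_test <- shapiro.test(res)                              # normality of residuals
var_test  <- bartlett.test(PctError ~ Algorithm, data = dat) # equal variances
std_res   <- rstandard(fit)                                 # standardized residuals
outliers  <- which(abs(std_res) > 3)                        # flag large residuals

# All pairwise comparisons of algorithm means at the 5% family-wise level
tukey <- TukeyHSD(fit, conf.level = 0.95)
print(tukey)
```

Graphical checks such as `plot(fit, which = 1:2)` (residuals versus fitted values and a normal Q-Q plot) would normally accompany the tests above.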
STA305H1F2023-HW3: Q2 [Points = 4+5 = 9]

Q2 (a) [Marks = 4] For this part, R/Rmd coding is NOT REQUIRED; the solution should be handwritten and submitted as a PDF file.
Consider an experiment conducted in a Latin square design (LSD) of order p. Let the observations be denoted by y_ijk for the row factor (Row) at level i, the treatment factor (Trt) at level j and the column factor (Col) at level k, where i, j, k = 1, ..., p. The observation in the (single) cell corresponding to the factor levels Row = i, Col = k and Trt = j is missing. All other observations are available. Derive an expression to estimate the missing value by minimizing the error sum of squares.

(b) [Marks = 1+1+2+1] For part (b), you may use R/Rmd code.
The file DataHW3Q2F2023MissingV.csv has data from an LSD with four treatments. The orthogonal blocking factors are Row and Col. The treatment factor is Treat and the response is Yield. Note that there is a missing value in the Yield column.
Write a model for the data and fit the model. Use the fitted model to estimate the missing value. Then use the formula (expression) you derived in part (a) to estimate the missing value in this dataset. Report both estimates and compare their values to 2 decimal places.

(Submit your answer PDF in the Box for Q2, for parts (a) and (b), and the R/Rmd code files associated with part (b) in the Box for Q2 R/Rmd, all on Crowdmark.)
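The model-based half of part (b) can be sketched as follows. This is only an illustration on a made-up 4 x 4 cyclic Latin square, not the assignment data; the column names follow the description above (Row, Col, Treat, Yield), and the position of the missing cell is chosen arbitrarily. The sketch also checks numerically that plugging the fitted value into the empty cell leaves the error sum of squares unchanged, which is the sense in which the estimate minimizes it.

```r
# Hypothetical 4 x 4 cyclic Latin square with simulated yields (not the real file).
p  <- 4
sq <- expand.grid(Row = 1:p, Col = 1:p)
sq$Treat <- ((sq$Row + sq$Col - 2) %% p) + 1   # each treatment once per row and column
set.seed(2023)
sq$Yield <- round(rnorm(p^2, mean = 50, sd = 5), 1)
sq[c("Row", "Col", "Treat")] <- lapply(sq[c("Row", "Col", "Treat")], factor)

# Blank out one cell (index chosen arbitrarily for illustration).
miss <- 6
sq$Yield[miss] <- NA

# Additive model: Yield = mu + Row + Col + Treat + error.
# lm() drops the row with the missing response by default.
fit   <- lm(Yield ~ Row + Col + Treat, data = sq)
y_hat <- unname(predict(fit, newdata = sq[miss, ]))   # model-based estimate

# Refit with the estimate plugged in: the residual in that cell should be zero
# and the error sum of squares should match the fit without the cell.
filled <- sq
filled$Yield[miss] <- y_hat
fit_full <- lm(Yield ~ Row + Col + Treat, data = filled)
res_miss <- unname(residuals(fit_full)[miss])

cat("estimated missing value:", round(y_hat, 2), "\n")
cat("residual at imputed cell:", signif(res_miss, 3), "\n")
```

The value from the closed-form expression derived in part (a) can then be compared against `y_hat`.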
STA305H1F2023-HW3: Q3 [Points = 4+4+4 = 12]

Q3 (a) [Marks = 3+1 = 4] Examine the following incomplete block design, where the first column is the block number and the remaining four columns show the treatments (1 to 9) in the 4 plots of the block. Is this design a BIBD? Show your details. If the design is a BIBD, then write the parameters of the design.

Block   Treatments in plots
  1     1  4  6  7
  2     2  6  8  9
  3     1  3  8  9
  4     1  2  3  4
  5     1  5  7  8
  6     4  5  6  9
  7     2  3  6  7
  8     2  4  5  8
  9     3  5  7  9
 10     1  2  5  7
 11     2  3  5  6
 12     3  4  7  9
 13     1  2  4  9
 14     1  5  6  9
 15     1  3  6  8
 16     4  6  7  8
 17     3  4  5  8
 18     2  7  8  9

(b) An engineer is studying the mileage performance characteristics of five types of gasoline additives. In the road test he wishes to use cars as blocks; however, because of a time constraint, he must use an incomplete block design. He runs the balanced design with the five blocks that follow. The data on the mileage are given in the file DataHW3Q3F2023Gasoline.csv, where the columns are Additive, Car and Mileage.

(i) [Marks = 3+1] Analyze the data from this experiment in terms of the ANOVA and draw conclusions (use α = 0.05).

(ii) [Marks = 3+1] Write the adjusted means of the treatments and their standard errors.

(Hint: To carry out the analysis, you may use the formulae for the various sums of squares and for the adjusted treatment effects, kQ_i/(λa), with their estimated standard errors from the intrablock analysis, or use other linear model fitting functions in R (see the class notes and R codes). Do not forget to add the general mean from the data. There is no need to carry out an interblock analysis.)
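The balance conditions in part (a) can also be checked mechanically. The sketch below is one possible helper, not part of the course material: `check_bibd` is a hypothetical function name, and it simply counts replications and pairwise concurrences from the treatment-by-block incidence matrix. It is applied here to the 18-block table above; the handwritten details requested in the question are still needed.

```r
# check_bibd (hypothetical helper): blocks is a matrix with one row per block,
# each row listing the treatment labels that appear in that block.
check_bibd <- function(blocks) {
  trts <- sort(unique(as.vector(blocks)))
  v <- length(trts); b <- nrow(blocks); k <- ncol(blocks)
  # Incidence matrix N (v x b): N[i, j] = 1 if treatment i occurs in block j
  N <- sapply(seq_len(b), function(j) as.integer(trts %in% blocks[j, ]))
  conc   <- N %*% t(N)            # diagonal: replications; off-diagonal: concurrences
  r      <- diag(conc)
  lambda <- conc[upper.tri(conc)]
  binary <- all(colSums(N) == k)  # no treatment repeated within a block
  ok <- binary && k < v &&
    length(unique(r)) == 1 && length(unique(lambda)) == 1
  list(v = v, b = b, k = k, r = unique(r), lambda = unique(lambda), is_bibd = ok)
}

# The design from part (a), one row per block.
design <- matrix(c(
  1, 4, 6, 7,   2, 6, 8, 9,   1, 3, 8, 9,   1, 2, 3, 4,   1, 5, 7, 8,
  4, 5, 6, 9,   2, 3, 6, 7,   2, 4, 5, 8,   3, 5, 7, 9,   1, 2, 5, 7,
  2, 3, 5, 6,   3, 4, 7, 9,   1, 2, 4, 9,   1, 5, 6, 9,   1, 3, 6, 8,
  4, 6, 7, 8,   3, 4, 5, 8,   2, 7, 8, 9
), ncol = 4, byrow = TRUE)

result <- check_bibd(design)
str(result)
```

For a BIBD the returned parameters should also satisfy bk = vr and λ(v − 1) = r(k − 1), which gives a quick consistency check on the counts.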
STA305H1F2023-HW3: Q4 [Points = 3+3+4 = 10]

Q4 (a) [Marks = 3] Construct a single replicate 2^5 factorial design in four blocks by confounding the interactions ACE and BCD with blocks. Name the factors A, B, C, D and E. Show your steps clearly in handwriting. Do not use any R function or other software to generate the design.

(b) A dataset from a fraction of a 2^5 factorial, run on homogeneous experimental material, is given in the file "DataHW3Q4F2023Lenth.csv". Analyze the data by providing the following information:

(i) [Marks = 3] Prepare an ANOVA table for the data using all the estimable lower order effects and interactions in a model.

(ii) [Marks given in part (iii)] Examine the contributions of the various factors' effects and interactions. You need to screen the model terms and retain the active effects/interactions. You may find Daniel plots helpful.

(iii) [Marks = 2+2] Fit a model to the same data again, but this time with only the active effects/interactions you have found. Draw conclusions about the significance of the effects and interactions. If you need a significance level, use α = 0.01.

(Submit your answer PDFs in the Box for Q4, for parts (a) and (b), and the R/Rmd code files associated with part (b) in the Box for Q4 R/Rmd, all on Crowdmark.)
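The screening step in part (b) might look roughly like the sketch below. Everything about the data here is assumed: the assignment does not state which fraction was run, so a 2^(5-1) half fraction with generator E = ABCD is used purely for illustration, the response is simulated, and Lenth's pseudo standard error is computed by hand as a numerical companion to a Daniel plot.

```r
# Assumed design: half fraction of a 2^5 with generator E = ABCD (an assumption,
# not taken from the assignment). Responses are simulated.
set.seed(42)
d   <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1), D = c(-1, 1))
d$E <- with(d, A * B * C * D)
d$y <- 20 + 3 * d$A - 2 * d$C + 1.5 * d$A * d$C + rnorm(16, sd = 0.5)

# Saturated model: 5 main effects + 10 two-factor interactions on 16 runs.
fit <- lm(y ~ (A + B + C + D + E)^2, data = d)
eff <- 2 * coef(fit)[-1]          # effect = 2 x regression coefficient (-1/+1 coding)

# Lenth's pseudo standard error and margin of error
s0  <- 1.5 * median(abs(eff))
pse <- 1.5 * median(abs(eff)[abs(eff) < 2.5 * s0])
me  <- qt(0.975, df = length(eff) / 3) * pse
active <- names(eff)[abs(eff) > me]
print(active)

# Refit with the terms retained after screening (hierarchy is kept automatically
# here because both parents of A:C are themselves flagged in this simulation).
reduced <- lm(reformulate(active, response = "y"), data = d)
print(anova(reduced))
```

A half-normal (Daniel) plot of `abs(eff)` shows the same information graphically; effects that fall well off the line through the small effects are the candidates to retain.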