STAT 151A Lab 7: Midterm Review Session
October 6, 2023

Note: there is no submission required for lab 7. This worksheet doesn’t include everything you need to review for the midterm. Please see the midterm study guide posted on bCourses for a more comprehensive list of concepts, examples and exercises.

1 Data transformation

Problem 1 Conceptual Review
(a) Why do we transform data?
(b) What is the Box-Cox transformation of $X$?
(c) What $p$ do you use to correct positive skewness (right skew)? What $p$ do you use to correct negative skewness (left skew)?
(d) A good transformation will make the ratio $\frac{UQ - M}{M - LQ}$ close to 1.
(e) What is Tukey and Mosteller’s bulging rule, and how is it used to correct monotone non-linearity?

Problem 2 Exercise 4.1 - Fox
Create a graph for the ordinary power transformations $X \to X^p$ for $p = -1, 0, 1, 2, 3$. (When $p = 0$, however, use the log transformation.) Compare the graph to Figure 4.1, and comment on the similarities and differences between the two families of transformations $x^p$ and $(x^p - 1)/p$.
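
As a starting point for Problems 1(d) and 2, here is a minimal sketch that tabulates the two families $x^p$ and $(x^p - 1)/p$ and computes the ratio $(UQ - M)/(M - LQ)$ before and after a log transformation. The function names, the grid, and the sample (powers of 2) are made up for illustration and are not part of the worksheet. Quartiles come from Python's `statistics.quantiles` with its default method, which may differ slightly from the hinges used in Fox.

```python
import math
from statistics import quantiles


def power_transform(x, p):
    """Ordinary power transformation x -> x**p, with log(x) used at p = 0."""
    return math.log(x) if p == 0 else x ** p


def box_cox(x, p):
    """Box-Cox family (x**p - 1) / p, with log(x) as the p = 0 case."""
    return math.log(x) if p == 0 else (x ** p - 1) / p


def hinge_ratio(values):
    """Return (UQ - M) / (M - LQ); values near 1 suggest a symmetric middle half."""
    lq, m, uq = quantiles(values, n=4)
    return (uq - m) / (m - lq)


# Made-up, strongly right-skewed sample (powers of 2); not from the worksheet.
sample = [2.0 ** k for k in range(9)]

ratio_raw = hinge_ratio(sample)
ratio_log = hinge_ratio([power_transform(v, 0) for v in sample])
print(f"raw ratio: {ratio_raw:.3f}, log ratio: {ratio_log:.3f}")

# Tabulate both families on a small grid for p = -1, 0, 1, 2, 3.
grid = [0.5, 1.0, 2.0, 4.0]
for p in (-1, 0, 1, 2, 3):
    plain = [round(power_transform(x, p), 3) for x in grid]
    bc = [round(box_cox(x, p), 3) for x in grid]
    print(f"p={p:2d}  x^p: {plain}  (x^p-1)/p: {bc}")
```

Because the logs of powers of 2 are evenly spaced, the log-scale ratio equals 1 up to rounding, while the raw ratio is well above 1. In the table, compare the direction of each row at $p = -1$: $x^p$ decreases in $x$, whereas $(x^p - 1)/p$ still increases.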

2 Simple linear regression

Problem 3 SLR review
Consider simple linear regression $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$.
(a) What are the assumptions?
(b) Derive the least squares estimates of $\beta_0$ and $\beta_1$.
(c) Show that $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased. What assumptions are used?
(d) Derive $\mathrm{var}(\hat{\beta}_0)$, $\mathrm{var}(\hat{\beta}_1)$ and $\mathrm{cov}(\hat{\beta}_0, \hat{\beta}_1)$. What assumptions are used?
(e) What is an unbiased estimator for $\sigma^2$?

Problem 4 TSS, RSS and $R^2$ review
Consider simple linear regression $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ under standard linear model assumptions:
(a) What is the residual standard error, and how is it interpreted?
(b) What are the total sum of squares, regression sum of squares, and residual sum of squares?
(c) What is the definition of R-squared, and what does it represent?

Problem 5 (SP23 HW)
Consider simple linear regression where there is one response variable $y$ and one explanatory variable $x$, and there are $n$ subjects with values $y_1, \dots, y_n$ and $x_1, \dots, x_n$.
(a) What are the estimates for $\alpha_0$ and $\alpha_1$ if we regress $x$ on $y$?
(b) Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the estimates from regressing $y$ on $x$. Intuition might suggest that $\hat{\alpha}_1 = 1/\hat{\beta}_1$. Is this true?

Problem 6 Exercise 5.9
Show that in simple-regression analysis, the standardized slope coefficient $B$ is equal to the correlation coefficient $r$. (In general, however, standardized slope coefficients are not correlations and can be outside of the range [0, 1].)
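
Before working Problems 5 and 6 algebraically, it can help to check the claims numerically. The sketch below uses the standard closed-form least squares slope $S_{xy}/S_{xx}$ and intercept $\bar{y} - \hat{\beta}_1 \bar{x}$, fits both regressions on a small made-up data set, and compares $\hat{\alpha}_1$ with $1/\hat{\beta}_1$ and the standardized slope with $r$. The data and helper names are hypothetical and chosen only for illustration; a numerical check is not a substitute for the derivation.

```python
import math


def slr_fit(u, v):
    """Least squares (intercept, slope) for regressing v on u."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    slope = s_uv / s_uu
    return v_bar - slope * u_bar, slope


def sample_sd(u):
    """Sample standard deviation with the n - 1 denominator."""
    u_bar = sum(u) / len(u)
    return math.sqrt(sum((a - u_bar) ** 2 for a in u) / (len(u) - 1))


def correlation(u, v):
    """Pearson correlation coefficient r."""
    u_bar, v_bar = sum(u) / len(u), sum(v) / len(v)
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


# Made-up data for illustration only; not from the worksheet.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.5, 4.0, 3.5, 6.5, 5.0]

beta0, beta1 = slr_fit(x, y)    # regress y on x
alpha0, alpha1 = slr_fit(y, x)  # regress x on y
r = correlation(x, y)
std_slope = beta1 * sample_sd(x) / sample_sd(y)

print(f"beta1 = {beta1:.4f}, alpha1 = {alpha1:.4f}, 1/beta1 = {1 / beta1:.4f}")
print(f"alpha1 * beta1 = {alpha1 * beta1:.4f}, r^2 = {r ** 2:.4f}")
print(f"standardized slope = {std_slope:.4f}, r = {r:.4f}")
```

On this data the two slopes multiply to $r^2$ rather than to 1, so $\hat{\alpha}_1 = 1/\hat{\beta}_1$ can only hold when $|r| = 1$, and the standardized slope matches $r$.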

3 Multiple regression

Problem 7 MR Review
Consider multiple regression $y = X\beta + \epsilon$.
(a) What are the assumptions?
(b) Derive the least squares estimates of $\beta$.
(c) Show that $\hat{\beta}$ is unbiased. What assumptions are used?
(d) Derive $\mathrm{cov}(\hat{\beta})$. What assumptions are used?
(e) What is an unbiased estimator for $\sigma^2$?

Problem 8 Other concepts of MR
(a) What is adjusted R-squared? Why can $R^2$ only rise?
(b) How do correlated variables impact the regression coefficients?
(c) What are the standardized coefficients, and how are they interpreted?

Problem 9 True/False (Past midterm)
(a) $R^2$ is an effective model selection criterion for deciding the best size for a linear model.
(b) If I assume the data-generating process is $y = X\beta + \epsilon$ with full rank matrix $X$ treated as fixed, then the following is true: $\arg\min_{\beta} \|y - X\beta\|_2^2 = (X^T X)^{-1} X^T y$ regardless of the distribution of $\epsilon$.
(c) The R-squared summary output will always increase if I add more covariates to the regression.

Problem 10 SP23 midterm
In many data analyses, $y$ observations are collected from various sensors with different measurement variabilities. Let’s say that I know the variability of each sensor such that I can safely assume the following model: