Quiz3

School: University of California, Los Angeles
Course: 100A
Subject: Statistics
Date: Feb 20, 2024

Uploaded by aed.academics

In R there is a dataset called BikeCommute. This dataset has information on several variables, including Minutes, which is the number of minutes the person spent on their commute. You can see the beginning of the dataset by typing head(BikeCommute).

Match the sum of squared residuals, variance, and standard deviation of Minutes to the numbers in the matrix below.

Numbers:
0.524
1664.739
5.502
30.268
28.823
0.724

Statistics to match:
Sum of squared residuals for a simple model of Minutes
Variance of Minutes
Standard deviation of Minutes

Calculate the Z-score for a case with a score of 12 from a sample with a mean of 14 and a variance of 25. (Remember you can use R like a calculator.)
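The three statistics in the matching question are closely related, and a short sketch may help show how. The vector below is hypothetical data chosen for easy arithmetic, not the real BikeCommute values, and the variable names are illustrative only.

```r
# Hypothetical commute times in minutes (NOT the real BikeCommute data).
minutes <- c(12, 15, 20, 18, 25, 30)

# Simple model: predict the mean for every case.
prediction <- mean(minutes)
resid_vals <- minutes - prediction

# Sum of squared residuals around the mean.
ss <- sum(resid_vals^2)

# Sample variance is the sum of squares divided by n - 1.
n <- length(minutes)
variance <- ss / (n - 1)

# Standard deviation is the square root of the variance.
std_dev <- sqrt(variance)

print(c(SS = ss, variance = variance, sd = std_dev))

# The built-in functions give the same variance and standard deviation.
print(c(var = var(minutes), sd = sd(minutes)))
```

Assuming BikeCommute is already loaded in the session, the same steps should apply with BikeCommute$Minutes in place of the hypothetical vector.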

Answer choices for the Z-score calculation (score of 12, mean of 14, variance of 25):
11.44
-0.4
9.2
-0.08

Which of the following statistics is used to describe how unusual a specific case is compared to the rest of the cases on a specific variable?
Variance
Mean
Z-score
Sum of Squared Residuals

Consider the BikeCommute data, introduced above. Case 4 has a Z-score on the Minutes variable of 2.03. Which of the following interpretations is correct?
Case 4's commute is 2.03 standard deviations
Case 4's commute is 2.03 minutes
Case 4's commute is 2.03 standard deviations above the average commute time
Case 4's commute is 2.03 minutes above the average commute time

In the simple model we select a predicted score which is the same for all cases. Depending on how we measure total error, the predicted score which minimizes error will differ. Match the method of measuring error to the predicted score which would minimize that error.

Method of measuring error to match: Sum of squared residuals
Mean
Median
Interquartile Range
Standard Deviation
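For the Z-score questions above, a minimal sketch of using R "like a calculator" might look like this. The helper name z_score is hypothetical and not part of base R. Note that the question gives a variance, so the standard deviation is its square root.

```r
# Hypothetical helper: how many standard deviations a score lies from the mean.
z_score <- function(score, mean_value, variance) {
  (score - mean_value) / sqrt(variance)
}

# Values from the question: score 12, mean 14, variance 25 (so SD = 5).
z <- z_score(12, 14, 25)
print(z)  # -0.4

# The sign gives the direction: negative is below the mean, positive is above.
# A Z-score is measured in standard deviations, not in the variable's own units.
```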

Method of measuring error to match: Sum of absolute deviations
Mean
Median
Interquartile Range
Standard Deviation

If you run this code in R:

    model <- lm(Distance ~ NULL, data = BikeCommute)
    model
    anova(model)

The output looks like this:

    Call:
    lm(formula = Distance ~ NULL, data = BikeCommute)

    Coefficients:
    (Intercept)
          27.16

                 Df   Sum Sq    Mean Sq F value Pr(>F)
              <int>    <dbl>      <dbl>   <dbl>  <dbl>
    Residuals    55 4.620352 0.08400641      NA     NA

Based on this output, which of the following statements is TRUE? (Check all that apply)
If we had used 28 as our predicted score (instead of the mean), the variance would be greater than 0.08400641
If we had used 28 as our predicted score (instead of the mean), the sum of squared residuals would be greater than 4.620352
If we had used 26 as our predicted score (instead of the mean), the sum of squared residuals would be less than 4.620352
If we had used 26 as our predicted score (instead of the mean), the variance would be less than 0.08400641

Maya is doing a study and wants to use a general linear model to analyze her data. She wants her predictions to be as close to the data as possible while still making the same prediction for everyone, so she says she wants to use a statistic that maximizes squared error. What is the issue with Maya's logic?
Maximizing squared error is not as good as maximizing the sum of absolute residuals, so she should do that instead.
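The questions above about single predicted scores can be explored with a small sketch. The data here are hypothetical (not the real BikeCommute distances), the helpers ss_for and sad_for are illustrative names, and the intercept-only model is written with the standard formula `~ 1` rather than the quiz's `~ NULL`.

```r
# Hypothetical skewed data (NOT the real BikeCommute distances).
distance <- c(2, 3, 4, 5, 16)

# Intercept-only model: one prediction for every case.
model <- lm(distance ~ 1)
b0 <- unname(coef(model))

# Hypothetical helpers: total error for any single predicted score.
ss_for  <- function(prediction) sum((distance - prediction)^2)
sad_for <- function(prediction) sum(abs(distance - prediction))

# The fitted intercept is the mean.
print(c(intercept = b0, mean = mean(distance)))

# Sum of squared residuals at the mean versus one unit either side of it.
print(c(at_mean = ss_for(6), below = ss_for(5), above = ss_for(7)))

# Sum of absolute deviations at the median versus at the mean.
print(c(at_median = sad_for(median(distance)), at_mean = sad_for(mean(distance))))
```

On this hypothetical data, moving the prediction one unit either side of the mean raises the sum of squared residuals from 130 to 135, while the median (4) gives a smaller sum of absolute deviations (16) than the mean (20).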
