In-Class SPSS Problem Set_SPSS Exercise_1 (1)

School

York University *

Course

211

Subject

Statistics

Date

Feb 20, 2024

Type

docx

Pages

10

Uploaded by imafattiemeh

Week 8 In-Class Activity: SPSS Exercise #1

Setup:
1. Go to AppsAnywhere, Sheridan College, at https://apps.sheridancollege.ca/. If you do not already have the application, the system might ask you to install it, so please go ahead and install it.
2. Once you are in AppsAnywhere, scroll down to find Sheridan Virtual Desktop, then click Launch > Launch the Client > Pick your Sheridan Account > Click General Purpose. You should now be in Sheridan Virtual Desktop (depending on how your laptop is set up, these steps might be slightly different).
3. Click the AppsAnywhere icon again on your Virtual Desktop, then find and launch SPSS 28.
4. Bring the Data Analysis Excel file into your Virtual Desktop (copy it from Slate and paste it into the Virtual Desktop).
5. Import the data into SPSS from Excel. Follow this path: File > Import Data > Excel > Choose Data Analysis Excel File > Open > Check Appropriate Boxes > OK

Task: Set up the hypotheses to test the association between gender and cola preference. Cross-tabulate the respondents' genders with cola preference and test the hypothesis. Do you see any statistically significant association between gender and cola preference? Explain.

Hypotheses:
H0: There is no association between sex and cola preference.
H1: There is an association between sex and cola preference.

SPSS procedure: Analyze > Descriptive Statistics > Crosstabs > Drag and drop one variable into the Row box and the other into the Column box > Statistics > Check off Chi-square > Continue > OK

Note: The Chi-square test in cross-tabulation is used to test the association between categorical variables only. We will use a level of significance of 5%.

Go to the Output window to see the results. Report the cross-tabulation and Chi-square results.
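To make the Chi-square step less of a black box, here is a minimal Python sketch of what the Crosstabs procedure computes. The counts are hypothetical (they are not the class data), the function names are made up for illustration, and the p-value shortcut only holds for a 2x2 table (1 degree of freedom).

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under H0 (no association)
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def p_value_df1(chi2):
    """Upper-tail p-value for a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts: rows = sex, columns = cola preference (assumed 2 categories)
observed = [[10, 10],
            [10, 10]]
chi2, df = chi_square_independence(observed)
p = p_value_df1(chi2)
decision = "reject H0" if p < 0.05 else "fail to reject H0"
print(chi2, df, p, decision)
```

With perfectly balanced counts the statistic is 0 and the p-value is 1, so H0 is not rejected. SPSS also reports a continuity-corrected value for 2x2 tables, which this sketch does not compute.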
Findings: Since the p-value (1.000) is greater than the level of significance (5%, or 0.05), we fail to reject H0 (p-value method).

Decision: There is no statistically significant association between sex and cola preference.

In-Class Work: SPSS Exercise #2

1. Formulate a statistical hypothesis appropriate for the consumer group's purpose.

H0: The mileage of the car is 30 miles per gallon, or µ = 30 (there is no difference between the sample mean and the population mean).
H1: The mileage of the car isn't 30 miles per gallon, or µ ≠ 30 (there is a difference between the sample mean and the population mean).

2. Calculate the mean (average) miles per gallon and the sample standard deviation.

SPSS procedure: Analyze > Compare Means > One-Sample T Test > Drag and drop the variable into the Test Variable(s) box > Enter Test Value as 30 > OK
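The one-sample t statistic that SPSS reports can be reproduced by hand. The sketch below uses the mean (28.17) and standard deviation (3.03) reported for this exercise. The sample size n = 25 is an assumption: it is not stated in the worksheet, but it is inferred from the critical value 2.064 quoted in part 3, which corresponds to 24 degrees of freedom.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, mu0):
    """t statistic for H0: mu = mu0, computed from summary statistics."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

n_assumed = 25        # ASSUMPTION: inferred from df = 24, not stated in the worksheet
t_stat = one_sample_t(28.17, 3.03, n_assumed, 30)
t_critical = 2.064    # two-tailed, alpha = 0.05, df = 24 (value quoted in the worksheet)
reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 2), reject_h0)
```

The result is about -3.02, close to the 3.013 in the SPSS output; the small gap comes from using the rounded mean and standard deviation.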
Mean = 28.17
Standard Deviation = 3.03

3. Conduct an appropriate hypothesis test to decide whether or not Premier Motorcar's claim is correct, using a 0.05 significance level.

Since the absolute value of the test statistic (3.013; ignore the negative sign) is greater than the critical value (2.064 from the t table for a two-tailed test; the rule-of-thumb value is 2), we reject H0. Therefore, Premier Motorcar's claim is not supported by the data.

In-Class Work: SPSS Exercise #3

1. Paired Samples t-test

Used to test the difference in means between two groups with paired observations.

H0: There is no difference in distance run between the protein and carb-protein groups (μ1 = μ2).
H1: There is a difference in distance run between the protein and carb-protein groups (μ1 ≠ μ2) (two-tailed test).

SPSS procedure: Analyze > Compare Means > Paired Samples T Test > Drag the two variables under Variable 1 and Variable 2 on the right (it doesn't matter where you place them) > OK

Go to the Output window to see the results. Choose the third table.
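A paired-samples t-test is a one-sample t-test on the pairwise differences. The sketch below illustrates this with hypothetical distances (not the class data); the function name and the numbers are made up for illustration, and the critical value 2.776 is the standard two-tailed t-table value for 4 degrees of freedom at the 0.05 level.

```python
import math
import statistics

def paired_t(first, second):
    """t statistic and degrees of freedom for a paired-samples t-test."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)   # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical distances run by the same five runners under each condition
protein = [4.0, 5.0, 6.0, 5.0, 4.0]
carb_protein = [5.0, 7.0, 7.0, 8.0, 5.0]

t_stat, df = paired_t(protein, carb_protein)
t_critical = 2.776   # two-tailed, alpha = 0.05, df = 4 (standard t table)
reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 3), df, reject_h0)
```

Because each runner serves as their own control, only the spread of the differences enters the standard error, which is why the variable order in SPSS changes the sign of t but not the conclusion.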
Findings: Since the p-value (reported as .000, i.e., less than 0.001) is less than the level of significance (0.05), we reject H0 (p-value method).

Decision: There is a difference in distance run between the protein and carb-protein groups.

2. Independent samples t-test

Used to test the difference in means between two groups with independent observations.

H0: There is no difference in engagement with TV between males and females (μ1 = μ2).
H1: There is a difference in engagement with TV between males and females (μ1 ≠ μ2) (two-tailed test).

SPSS procedure: Analyze > Compare Means > Independent-Samples T Test > Drag the Engagement variable to the top box on the right and gender into the Grouping Variable box > Click Define Groups > Enter 1 under Group 1 and 2 under Group 2 > Continue > OK

Note: The numbers 1 and 2 come from the way your gender variable has been coded. If the groups were defined as F and M, for example, you should enter F and M here.

Go to the Output window to see the results. Choose the table in the middle.

Findings: There are two rows, and you need to pick the first row because equal variances are assumed, which is the case when the p-value of Levene's test is greater than 0.05. Since the p-value (0.023) is less than the level of significance (0.05), we reject H0 (p-value method).

Decision: There is a difference in engagement with TV between males and females.

3. One-Way ANOVA
Used to test the difference in means among more than two groups with independent observations. Here you have four groups (sedentary, low, moderate, high). You can find this in the Variable View window of the SPSS file (under 'Values' for the variable 'group').

H0: There is no difference in stress level among the four groups of people (μ1 = μ2 = μ3 = μ4).
H1: At least one group is different in stress level (at least one μ is different).

SPSS procedure: Analyze > Compare Means > One-Way ANOVA > Drag the Coping Stress variable to the top box on the right and Physical Activity Level under Factor > OK

Go to the Output window and choose the top table.

Findings: Since the p-value (reported as .000, i.e., less than 0.001) is less than the level of significance (0.05), we reject H0 (p-value method).

Decision: At least one group is different in stress level.
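The F statistic in the ANOVA table is the ratio of between-group to within-group variability. The sketch below computes it for four hypothetical groups (not the class data); the function name and scores are made up for illustration, and the critical value 4.07 is the standard F-table value for (3, 8) degrees of freedom at the 0.05 level.

```python
import statistics

def one_way_anova(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical stress scores for the four activity levels
sedentary = [8, 9, 10]
low = [6, 7, 8]
moderate = [4, 5, 6]
high = [2, 3, 4]

f_stat, df_between, df_within = one_way_anova([sedentary, low, moderate, high])
f_critical = 4.07   # alpha = 0.05, df = (3, 8) (standard F table)
reject_h0 = f_stat > f_critical
print(f_stat, df_between, df_within, reject_h0)
```

Rejecting H0 only says that at least one group mean differs; it does not say which one.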
In-Class Work: SPSS Exercise #4

In a pretest, data were obtained from 20 respondents on preferences for sneakers as measured by number of purchases in the last five years (Preference). The respondents also provided their evaluations of the sneakers on comfort as measured by the variable 'Comfort', style by 'Style', and durability by 'Durability', on 7-point scales, with 1 being 'poor' and 7 being 'excellent'. The resulting data are given in the 'Sneakers Study' SPSS data file.

1. Run a bivariate regression with preference for sneakers (Preference) as the dependent variable and evaluation on Comfort as the independent variable. Does comfort have any impact on people's preference for sneakers? Why?
2. Interpret the regression coefficient on Comfort and R-square.
3. Run a correlation test using the variables Style and Durability. Interpret the correlation coefficient. Do you think there is a statistically significant correlation between them? Give a reason.

Note: Know the dependent variable. Here, Preference is the dependent variable.

1. Testing Impact of Comfort on Preference (two-tailed)

H0: There is no impact of comfort on preference for sneakers (β = 0).
H1: There is an impact of comfort on preference for sneakers (β ≠ 0).

SPSS procedure: Analyze > Regression > Linear > Drag and drop the dependent and independent variables into the corresponding boxes on the right > OK

Coefficients (a)
                 Unstandardized           Standardized
Model 1          B        Std. Error      Beta      t        Sig.
(Constant)       .028     1.337                     .021     .983
Comfort          .921     .310            .573      2.967    .008
a. Dependent Variable: Preference for Sneakers

Since the p-value (0.008) is less than the level of significance (0.05), we reject H0. Therefore, a shoe's comfort level has an impact on people's preference for sneakers.

2. Interpretation of Regression Coefficient on Comfort and R-Square

Coefficients (a)
                 Unstandardized           Standardized
Model 1          B        Std. Error      Beta      t        Sig.
(Constant)       .028     1.337                     .021     .983
Comfort          .921     .310            .573      2.967    .008
a. Dependent Variable: Preference for Sneakers

Interpretation of coefficient on Comfort: When the comfort rating increases by one unit, people's preference (number of purchases in the last five years) increases by 0.921 on average.

Interpretation of R-Square

Model Summary
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .573 (a)   .328        .291                 1.599
a. Predictors: (Constant), Comfort

32.8% of the variation in people's preference for sneakers is explained by the variation in comfort level; the remaining 67.2% is due to other factors.

3. Correlation Coefficient between Style and Durability

H0: There is no correlation between Style and Durability (ρ = 0).
H1: There is a correlation between Style and Durability (ρ ≠ 0).

SPSS procedure: Analyze > Correlate > Bivariate > Drag and drop the variables into the box on the right > OK

Correlations
                                     Durability    Style
Durability    Pearson Correlation    1             .364
              Sig. (2-tailed)                      .114
              N                      20            20
Style         Pearson Correlation    .364          1
              Sig. (2-tailed)        .114
              N                      20            20

Interpretation of correlation coefficient: The Pearson correlation coefficient is .364, which is a weak-to-moderate positive correlation. Since the p-value (0.114) is greater than the level of significance (0.05), we fail to reject H0. Therefore, there is no statistically significant correlation between Style and Durability.

Note on CORRELATION: a positive sign (+) means the variables move in the same direction; a negative sign (-) means they move in opposite directions.
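The Exercise #4 output can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers from the tables above, plus one outside value: the critical t of 2.101 (two-tailed, 0.05 level, 18 degrees of freedom) comes from a standard t table, not from the worksheet. The function name is made up for illustration.

```python
import math

def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

n = 20                                   # respondents in the Sneakers Study
t_critical = 2.101                       # standard t table, df = 18 (not from the worksheet)

# Correlation between Style and Durability
t_corr = correlation_t(0.364, n)
corr_significant = abs(t_corr) > t_critical

# Bivariate regression of Preference on Comfort
t_comfort = 0.921 / 0.310                # t = B / Std. Error
comfort_significant = abs(t_comfort) > t_critical
r_squared = 0.573 ** 2                   # with one predictor, R Square = Beta squared

print(round(t_corr, 3), corr_significant)
print(round(t_comfort, 3), comfort_significant, round(r_squared, 3))
```

Both checks agree with the p-value decisions: the correlation t of about 1.66 falls short of 2.101, while the Comfort t of about 2.97 exceeds it.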
Note on R^2: it shows how much of the variation in the dependent variable is explained.

In-Class Work: SPSS Exercise #5

Use the Profit Margin data and answer the following questions:

1. Set up the multiple regression model of margin on sales, labour, experience, and performance.
2. Identify the variables that are significant predictors of profit margin and the ones that are not. Interpret the regression coefficients of the significant predictors.
3. Interpret R-squared.
4. Interpret the F-test under ANOVA.
5. Conduct a test of multicollinearity and obtain the right model after addressing the problem of multicollinearity.

1. Multiple regression model

Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

Where,
Y = Profit margin (dependent)
X1 = Sales (independent)
X2 = Labour (independent)
X3 = Experience (independent)
X4 = Performance (independent)

2. Identifying significant predictors of profit margin

To identify the variables that are significant predictors, we need to do a t-test on each regression coefficient.

SPSS procedure: Analyze > Regression > Linear > Drag and drop the dependent and independent variables into the appropriate boxes on the right > OK

Coefficients (a)
                           Unstandardized            Standardized
Model 1                    B          Std. Error     Beta       t         Sig.
(Constant)                 171.243    235.937                   .726      .473
Average Sales              .091       .031           2.340      2.944     .006
Labor Costs                -.070      .035           -1.588     -2.007    .053
Years Experience           -.488      .956           -.054      -.511     .613
Job Performance Rating     -1.856     3.034          -.069      -.612     .545
a. Dependent Variable: Average Margin
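Each t value in the Coefficients table is the coefficient divided by its standard error. The sketch below redoes that division with the rounded values from the table and compares the result with a critical value. Two assumptions are worth flagging: the 35 residual degrees of freedom are taken from the ANOVA table for this model, and the critical value 2.030 comes from a standard t table rather than from the worksheet.

```python
# (B, Std. Error) pairs copied from the Coefficients table
coefficients = {
    "Average Sales": (0.091, 0.031),
    "Labor Costs": (-0.070, 0.035),
    "Years Experience": (-0.488, 0.956),
    "Job Performance Rating": (-1.856, 3.034),
}

t_critical = 2.030   # two-tailed, alpha = 0.05, df = 35 (standard t table)

# t = B / Std. Error for every predictor
t_values = {name: b / se for name, (b, se) in coefficients.items()}

# Predictors whose |t| exceeds the critical value
significant = [name for name, t in t_values.items() if abs(t) > t_critical]

for name, t in t_values.items():
    print(f"{name}: t = {t:.3f}")
print("Significant at 5%:", significant)
```

The recomputed values differ slightly from the SPSS column (for example 2.935 versus 2.944 for Average Sales) because the table shows B and Std. Error rounded to three decimals.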
Average Sales is a significant predictor of average profit margin. This is because the p-value for the t-test on this coefficient (.006) is less than the level of significance (0.05), so we reject the null hypothesis that this variable has no impact on average profit margin. Labor Costs (.053), Years Experience (.613), and Job Performance Rating (.545) are not significant predictors, because their p-values are greater than 0.05.

Interpretation of coefficient on average sales: When average sales increase by one unit, the average profit margin increases by 0.091, keeping all other independent variables constant.

3. Interpretation of R-Square

Model Summary
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .780 (a)   .608        .563                 51.25132
a. Predictors: (Constant), Job Performance Rating, Years Experience, Labor Costs, Average Sales

Together, the independent variables explain 60.8% of the variation in profit margin; the remaining 39.2% is accounted for by other unknown factors.

4. Interpretation of F-test under ANOVA

H0: The model isn't statistically significant.
H1: The model is statistically significant.

ANOVA (a)
Model 1       Sum of Squares    df    Mean Square    F         Sig.
Regression    142566.533        4     35641.633      13.569    .000 (b)
Residual      91934.438         35    2626.698
Total         234500.972        39
a. Dependent Variable: Average Margin
b. Predictors: (Constant), Job Performance Rating, Years Experience, Labor Costs, Average Sales

The F value is the regression mean square divided by the residual mean square (35641.633 / 2626.698 ≈ 13.569), and R Square is the regression sum of squares divided by the total (142566.533 / 234500.972 ≈ 0.608). Since the p-value for the F-test (reported as .000, i.e., less than 0.001) is less than the level of significance (0.05), we reject H0. This shows that the model is statistically significant.

Note (rule of thumb): F > 4 means reject the null hypothesis (H0).

5. Test for multicollinearity

A multicollinearity problem arises when independent variables are correlated with each other, in which case our predictions may not be accurate. One way to test for multicollinearity is to calculate the Variance Inflation Factor (VIF). As a rule of thumb, if the VIF is greater than 5, there is potentially a problem of multicollinearity.
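For intuition about the VIF, the sketch below handles the simplest case of two predictors, where the VIF is 1 / (1 - r^2) and r is the correlation between them. With more predictors, r^2 is replaced by the R-square from regressing one predictor on all the others, which this sketch does not do. The data and function names are hypothetical, not the Profit Margin file.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors; Tolerance is 1 / VIF."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictors: labour tracks sales closely, experience does not
sales = [100, 200, 300, 400, 500]
labour = [52, 98, 151, 199, 250]
experience = [3, 1, 4, 1, 5]

vif_high = vif_two_predictors(sales, labour)
vif_low = vif_two_predictors(sales, experience)
print(round(vif_high, 1), vif_high > 5)   # far above the rule-of-thumb cutoff
print(round(vif_low, 3), vif_low > 5)     # close to 1, no concern
```

A VIF near 1 means the predictor shares almost no variation with the other one, while a very large VIF means the two carry nearly the same information.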
SPSS procedure: Analyze > Regression > Linear > Drag and drop the dependent and independent variables into the appropriate boxes on the right > Statistics > Check 'Collinearity Diagnostics' > OK

Coefficients (a)
                           Unstandardized            Standardized                        Collinearity Statistics
Model 1                    B          Std. Error     Beta       t         Sig.     Tolerance    VIF
(Constant)                 171.243    235.937                   .726      .473
Average Sales              .091       .031           2.340      2.944     .006     .018         56.384
Labor Costs                -.070      .035           -1.588     -2.007    .053     .018         55.897
Years Experience           -.488      .956           -.054      -.511     .613     .990         1.011
Job Performance Rating     -1.856     3.034          -.069      -.612     .545     .881         1.135
a. Dependent Variable: Average Margin

Average Sales and Labor Costs both have a VIF greater than 5, which means they are correlated and affect each other's estimates. To resolve this, remove both of them, or remove only one, and run the model again. Here, only Labor Costs (which is not significant) is removed, giving a regression model without the multicollinearity problem:

Coefficients (a)
                           Unstandardized            Standardized                        Collinearity Statistics
Model 1                    B          Std. Error     Beta       t         Sig.     Tolerance    VIF
(Constant)                 117.564    244.074                   .482      .633
Years Experience           -.392      .994           -.044      -.395     .695     .992         1.008
Job Performance Rating     -1.367     3.149          -.051      -.434     .667     .887         1.128
Average Sales              .030       .005           .761       6.494     .000     .884         1.131
a. Dependent Variable: Average Margin