In a simulation of 30 mobile computer networks, the average speed, pause time, and number of neighbors were measured. A “neighbor” is a computer within the transmission
- a. Fit the model with Neighbors as the dependent variable, and independent variables Speed, Pause, Speed·Pause, Speed², and Pause².
- b. Construct a reduced model by dropping any variables whose P-values are large, and test the plausibility of the model with an F test.
- c. Plot the residuals versus the fitted values for the reduced model. Are there any indications that the model is inappropriate? If so, what are they?
- d. Someone suggests that a model containing Pause and Pause² as the only independent variables is adequate. Do you agree? Why or why not?
- e. Using a best subsets software package, find the two models with the highest R² value for each model size from one to five variables. Compute Cp and adjusted R² for each model.
- f. Which model is selected by minimum Cp? By adjusted R2? Are they the same?
a.
Construct a multiple linear regression model with Neighbors as the dependent variable and Speed, Pause, Speed·Pause, Speed², and Pause² as the independent variables.
Answer to Problem 5SE
A multiple linear regression model for the given data is:
Explanation of Solution
Calculation:
The data represents the values of the variables number of neighbors, average speed and pause time for a simulation of 30 mobile network computers.
Multiple linear regression model:
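A multiple linear regression model is given as y = β0 + β1x1 + β2x2 + ... + βpxp + ε.
Let Y denote the number of neighbors, X1 the average speed, and X2 the pause time.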
Regression:
Software procedure:
Step by step procedure to obtain regression using MINITAB software is given as,
- Choose Stat > Regression > General Regression.
- In Response, enter the numeric column containing the response data Y.
- In Model, enter the numeric column containing the predictor variables X1, X2, X1*X2, X1*X1 and X2*X2.
- Click OK.
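The MINITAB steps above amount to an ordinary least squares fit on five constructed columns. The following is a minimal sketch in Python of that idea, not the MINITAB procedure itself. It assumes synthetic, noise-free stand-in data in arbitrary rescaled units (the 30 simulated observations are not reproduced on this page), and the helper names `solve_linear_system`, `design_row`, and `fit_least_squares` are hypothetical.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def design_row(speed, pause):
    """Columns: intercept, X1, X2, X1*X2, X1*X1, X2*X2."""
    return [1.0, speed, pause, speed * pause, speed ** 2, pause ** 2]

def fit_least_squares(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic stand-in data (NOT the textbook data): 30 runs, rescaled units.
random.seed(0)
true_beta = [10.0, -2.0, -1.5, 0.5, 0.3, 0.4]  # hypothetical coefficients
speeds = [random.uniform(0.0, 2.0) for _ in range(30)]
pauses = [random.uniform(0.0, 2.0) for _ in range(30)]
rows = [design_row(s, p) for s, p in zip(speeds, pauses)]
neighbors = [sum(b * v for b, v in zip(true_beta, r)) for r in rows]  # no noise

beta_hat = fit_least_squares(rows, neighbors)
print([round(b, 6) for b in beta_hat])
```

Because the synthetic response is generated without noise, the fit should recover the hypothetical coefficients up to rounding. With real data, a statistics package would also report standard errors and P-values, which this sketch does not compute.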
Output obtained from MINITAB is given below:
The ‘Coefficient’ column of the regression analysis MINITAB output gives the slopes corresponding to the respective variables stored in the column ‘Term’.
A careful inspection of the output shows that the fitted model is:
Hence, the multiple linear regression model for the given data is:
b.
Construct a reduced model by dropping the variables with large P-values.
Check whether the reduced model is plausible or not.
Answer to Problem 5SE
A multiple linear regression model for the given data is:
Yes, there is enough evidence to conclude that the reduced model is plausible.
Explanation of Solution
Calculation:
From part (a), the ‘P’ column of the regression analysis MINITAB output gives the P-values corresponding to the respective variables stored in the column ‘Term’.
By observing the P-values of the MINITAB output, it is clear that the largest P-value is 0.390, corresponding to the predictor variable X1*X2.
Now, the new regression has to be fitted after dropping the predictor variable X1*X2.
Regression:
Software procedure:
Step by step procedure to obtain regression using MINITAB software is given as,
- Choose Stat > Regression > General Regression.
- In Response, enter the numeric column containing the response data Y.
- In Model, enter the numeric column containing the predictor variables X1, X2, X1*X1 and X2*X2.
- Click OK.
Output obtained from MINITAB is given below:
The ‘Coefficient’ column of the regression analysis MINITAB output gives the slopes corresponding to the respective variables stored in the column ‘Term’.
A careful inspection of the output shows that the fitted model is:
Hence, the multiple linear regression model for the given data is:
The full model is,
The reduced model is,
The test hypotheses are given below:
Null hypothesis:
That is, the dropped predictor of the full model is not significant to predict y.
Alternative hypothesis:
That is, the dropped predictor of the full model is significant to predict y.
Test statistic:
Where,
n represents the total number of observations.
p represents the number of predictors on the full model.
k represents the number of predictors on the reduced model.
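With n, p, and k as defined above, the test statistic for comparing a reduced model with a full model is usually written in the following standard form. The subscripted SSE symbols are notation chosen here for illustration, since the original formula is not shown on this page.

```latex
F = \frac{\left(SSE_{\text{reduced}} - SSE_{\text{full}}\right)/(p - k)}
         {SSE_{\text{full}}/(n - p - 1)},
\qquad
F \sim F_{\,p-k,\; n-p-1} \ \text{under } H_0 .
```

For this part, n = 30, p = 5, and k = 4, which gives 1 and 24 degrees of freedom, matching the values entered in the P-value procedure.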
From the obtained MINITAB outputs, the value of error sum of squares for full model is
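The total number of observations is n = 30.
The number of predictors in the full model is p = 5, and the number of predictors in the reduced model is k = 4.
Degrees of freedom of F-statistic for reduced model:
In a reduced multiple linear regression analysis, the F-statistic is the ratio of two mean squares.
In the ratio, the numerator is obtained by dividing the quantity SSE(reduced) - SSE(full) by p - k, and the denominator is obtained by dividing SSE(full) by n - p - 1.
Thus, the degrees of freedom for the F-statistic in a reduced multiple regression analysis are p - k and n - p - 1.
Hence, the numerator degrees of freedom is 5 - 4 = 1 and the denominator degrees of freedom is 30 - 5 - 1 = 24.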
Test statistic under null hypothesis:
Under the null hypothesis, the test statistic is obtained as follows:
Thus, the test statistic is F = 0.76638.
Since, the level of significance is not specified. The prior level of significance
P-value:
Software procedure:
- Choose Graph > Probability Distribution Plot, choose View Probability, and click OK.
- From Distribution, choose F, enter 1 in numerator df and 24 in denominator df.
- Click the Shaded Area tab.
- Choose X-Value and Right Tail for the region of the curve to shade.
- Enter the X-value as 0.76638.
- Click OK.
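The shaded right-tail area requested above is the upper-tail probability of an F distribution. As a rough cross-check outside MINITAB, that probability can be computed from the regularized incomplete beta function. The sketch below uses only the Python standard library and a textbook continued-fraction evaluation. The function names are hypothetical, and this is one possible implementation, not the method MINITAB uses.

```python
import math

def _beta_continued_fraction(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even step of the recurrence
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # odd step of the recurrence
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b

def f_right_tail(f_value, df_num, df_den):
    """P(F >= f_value) for an F(df_num, df_den) distribution."""
    x = df_den / (df_den + df_num * f_value)
    return regularized_incomplete_beta(df_den / 2.0, df_num / 2.0, x)

p_value_b = f_right_tail(0.76638, 1, 24)  # statistic and df from part (b)
p_value_d = f_right_tail(15.702, 3, 24)   # statistic and df from part (d)
print(p_value_b, p_value_d)
```

Under these assumptions the first value should land close to the 0.39 read from the MINITAB plot, and the second should be far below any conventional significance level.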
Output obtained from MINITAB is given below:
From the output, the P-value is 0.39.
Decision criteria based on P-value approach:
If the P-value is less than or equal to the level of significance, reject the null hypothesis.
If the P-value is greater than the level of significance, fail to reject the null hypothesis.
Conclusion:
The P-value is 0.39, which is greater than the level of significance.
By the rejection rule, fail to reject the null hypothesis.
Hence, there is not sufficient evidence to conclude that the dropped predictor variable is significant for predicting the response variable y.
Thus, the reduced model is plausible.
c.
Plot the residuals versus the fitted values for the reduced model.
Check whether the model is appropriate.
Answer to Problem 5SE
Residual plot:
Yes, the model seems to be appropriate.
Explanation of Solution
Calculation:
Residual plot:
Software procedure:
Step by step procedure to obtain regression using MINITAB software is given as,
- Choose Stat > Regression > General Regression.
- In Response, enter the numeric column containing the response data Y.
- In Model, enter the numeric column containing the predictor variables X1, X2, X1*X1 and X2*X2.
- In Graphs, Under Residuals for plots, select Regular.
- Under Residual plots select box Residuals versus fits.
- Click OK.
Conditions for the appropriateness of regression model using the residual plot:
- The plot of the residuals vs. fitted values should fall roughly in a horizontal band centered on and symmetric about the horizontal axis. That is, the residuals should not show any bend.
- The plot of residuals should not contain any outliers.
- The residuals have to be scattered randomly around “0” with constant variability. That is, the spread should be consistent.
Interpretation:
In the residual plot there is a bend or pattern, which can violate the straight-line condition, and the spread of the residuals changes from one part of the plot to another.
However, it is difficult to determine whether the assumptions are actually violated without more data.
Thus, the model seems to be appropriate.
d.
Check whether the model with only the two independent variables Pause and Pause² is adequate.
Answer to Problem 5SE
No, the model with only the two independent variables Pause and Pause² is not adequate.
Explanation of Solution
Calculation:
Regression:
Software procedure:
Step by step procedure to obtain regression using MINITAB software is given as,
- Choose Stat > Regression > General Regression.
- In Response, enter the numeric column containing the response data Y.
- In Model, enter the numeric column containing the predictor variables X2 and X2*X2.
- Click OK.
Output obtained from MINITAB is given below:
The ‘Coefficient’ column of the regression analysis MINITAB output gives the slopes corresponding to the respective variables stored in the column ‘Term’.
A careful inspection of the output shows that the fitted model is:
Hence, the multiple linear regression model for the given data is:
The full model is,
The reduced model is,
The test hypotheses are given below:
Null hypothesis:
That is, the dropped predictors of the full model are not significant to predict y.
Alternative hypothesis:
That is, at least one of the dropped predictors of the full model is significant to predict y.
Test statistic:
Where,
n represents the total number of observations.
p represents the number of predictors on the full model.
k represents the number of predictors on the reduced model.
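The quantities n, p, and k defined above combine with the two error sums of squares into the partial F statistic. A minimal sketch follows, assuming hypothetical SSE values (the actual MINITAB sums of squares are not shown on this page) and a hypothetical function name `partial_f_statistic`.

```python
def partial_f_statistic(sse_reduced, sse_full, n, p, k):
    """Return (F, numerator df, denominator df) for a full-vs-reduced test.

    n: observations, p: predictors in the full model,
    k: predictors in the reduced model.
    """
    df_num = p - k
    df_den = n - p - 1
    f_value = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_value, df_num, df_den

# Hypothetical SSE values, chosen only to illustrate the arithmetic.
f_value, df_num, df_den = partial_f_statistic(
    sse_reduced=300.0, sse_full=100.0, n=30, p=5, k=2
)
print(f_value, df_num, df_den)
```

With n = 30, p = 5, and k = 2 the degrees of freedom come out as 3 and 24, the same pair entered in the P-value procedure for this part.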
From the obtained MINITAB outputs, the value of error sum of squares for full model is
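The total number of observations is n = 30.
The number of predictors in the full model is p = 5, and the number of predictors in the reduced model is k = 2.
Degrees of freedom of F-statistic for reduced model:
In a reduced multiple linear regression analysis, the F-statistic is the ratio of two mean squares.
In the ratio, the numerator is obtained by dividing the quantity SSE(reduced) - SSE(full) by p - k, and the denominator is obtained by dividing SSE(full) by n - p - 1.
Thus, the degrees of freedom for the F-statistic in a reduced multiple regression analysis are p - k and n - p - 1.
Hence, the numerator degrees of freedom is 5 - 2 = 3 and the denominator degrees of freedom is 30 - 5 - 1 = 24.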
Test statistic under null hypothesis:
Under the null hypothesis, the test statistic is obtained as follows:
Thus, the test statistic is F = 15.702.
Since, the level of significance is not specified. The prior level of significance
P-value:
Software procedure:
- Choose Graph > Probability Distribution Plot, choose View Probability, and click OK.
- From Distribution, choose F, enter 3 in numerator df and 24 in denominator df.
- Click the Shaded Area tab.
- Choose X-Value and Right Tail for the region of the curve to shade.
- Enter the X-value as 15.702.
- Click OK.
Output obtained from MINITAB is given below:
From the output, the P- value is
Thus, the P- value is
Decision criteria based on P-value approach:
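If the P-value is less than or equal to the level of significance, reject the null hypothesis.
If the P-value is greater than the level of significance, fail to reject the null hypothesis.
Conclusion:
Here, the P-value is less than the level of significance.
By the rejection rule, reject the null hypothesis.
Hence, there is sufficient evidence to conclude that at least one of the dropped predictors of the full model is significant to predict y.
Thus, the model with only the two independent variables Pause and Pause² is not adequate.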
e.
Find the two models with the highest R² value for each model size from one to five variables.
Obtain the values of Mallows’ Cp and adjusted R² for each model.
Answer to Problem 5SE
The two models with the highest
First model with
The values of Mallows’ Cp and adjusted R² are:
Number of predictors | Mallows’ Cp | Adjusted R² (%) |
1 | 92.5 | 60.1 |
1 | 97 | 58.6 |
2 | 47.1 | 75.2 |
2 | 53.3 | 73 |
3 | 7.9 | 89.2 |
3 | 15.5 | 86.4 |
4 | 4.8 | 90.7 |
4 | 9.2 | 89 |
5 | 6 | 90.6 |
Explanation of Solution
Calculation:
Coefficient of multiple determination
The coefficient of multiple determination,
The subset with larger
Regression:
Software procedure:
Step by step procedure to obtain regression using MINITAB software is given as,
- Choose Stat > Regression > Regression > Best Subsets.
- In Response, enter the numeric column containing the response data Y.
- In Model, enter the numeric column containing the predictor variables X1, X2, X1*X2, X1*X1 and X2*X2.
- Click OK.
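The two criteria reported by a best subsets routine can be computed from sums of squares. The sketch below uses the common textbook definitions, with p counting predictors (not the intercept). The SSE and SST numbers are hypothetical, since the MINITAB output is not reproduced on this page, and the function names are illustrative.

```python
def adjusted_r2(sse, sst, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n observations."""
    return 1.0 - (sse / (n - p - 1)) / (sst / (n - 1))

def mallows_cp(sse_subset, mse_full, n, p):
    """Mallows' Cp for a subset model with p predictors.

    mse_full is the error mean square of the model containing all predictors.
    """
    return sse_subset / mse_full - n + 2 * (p + 1)

# Hypothetical sums of squares, for illustration only.
n, p_full = 30, 5
sst, sse_full = 1000.0, 90.0
mse_full = sse_full / (n - p_full - 1)

cp_full = mallows_cp(sse_full, mse_full, n, p_full)
adj_full = adjusted_r2(sse_full, sst, n, p_full)
print(cp_full, adj_full)
```

For the model containing all predictors, Cp always equals p + 1 under this definition, which agrees with the value 6 reported for the five-predictor model.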
Output obtained from MINITAB is given below:
For the one predictor case, the highest value of
For the two predictor case, the highest value of
For the three predictor case, the highest value of
For the four predictor case, the highest value of
For the five predictor case, the value of
The value of
Thus, depending upon the factors affecting the analysis it would be most preferable to use the regression equation corresponding to the predictors
The second highest value of
That is, 90.6 and 90.3 do not differ by much.
Therefore, the model with
Thus, the two best models are:
First model with
From the accompanying MINITAB output, the values of Mallows’ Cp and adjusted R² are:
Number of predictors | Mallows’ Cp | Adjusted R² (%) |
1 | 92.5 | 60.1 |
1 | 97 | 58.6 |
2 | 47.1 | 75.2 |
2 | 53.3 | 73 |
3 | 7.9 | 89.2 |
3 | 15.5 | 86.4 |
4 | 4.8 | 90.7 |
4 | 9.2 | 89 |
5 | 6 | 90.6 |
f.
Select the variables for the model, using Mallows’ Cp and adjusted R².
Check whether both criteria select the same model.
Answer to Problem 5SE
The variables for the model using the Mallows’
The variables for the model using the adjusted-
Yes, both criteria select the same model.
Explanation of Solution
Mallows’
An important utility of the Mallows’
Mallows’
The predictor with the lowest value of
From part (e), the values of Mallows’ Cp and adjusted R² are:
Number of predictors | Mallows’ Cp | Adjusted R² (%) |
1 | 92.5 | 60.1 |
1 | 97 | 58.6 |
2 | 47.1 | 75.2 |
2 | 53.3 | 73 |
3 | 7.9 | 89.2 |
3 | 15.5 | 86.4 |
4 | 4.8 | 90.7 |
4 | 9.2 | 89 |
5 | 6 | 90.6 |
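The selection step can be checked directly on the nine tabulated pairs. In the sketch below the Cp and adjusted R² values are taken from the table above. The model-size labels are an inference from the best subsets layout (two models per size, one for size five), and the predictor names are omitted because they are not recoverable from this page.

```python
# (number of predictors, Mallows' Cp, adjusted R^2 in percent), in table order.
# The first element of each tuple is inferred, not read from the output.
rows = [
    (1, 92.5, 60.1), (1, 97.0, 58.6),
    (2, 47.1, 75.2), (2, 53.3, 73.0),
    (3, 7.9, 89.2), (3, 15.5, 86.4),
    (4, 4.8, 90.7), (4, 9.2, 89.0),
    (5, 6.0, 90.6),
]

best_by_cp = min(rows, key=lambda r: r[1])      # smallest Cp
best_by_adj_r2 = max(rows, key=lambda r: r[2])  # largest adjusted R^2
same_model = best_by_cp == best_by_adj_r2

print(best_by_cp, best_by_adj_r2, same_model)
```

Both criteria pick the row with Cp = 4.8 and adjusted R² = 90.7, which supports the answer that the two criteria select the same model.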
For the one predictor case, the lowest value of Cp is 92.5.
For the two predictor case, the lowest value of Cp is 47.1.
For the three predictor case, the lowest value of Cp is 7.9.
For the four predictor case, the lowest value of Cp is 4.8.
For the five predictor case, the value of Cp is 6.
The value of Cp is lowest, at 4.8, for the four predictor case.
Thus, depending upon the factors affecting the analysis it would be most preferable to use the regression equation corresponding to the predictors
Hence, the variables for the model using the Mallows’
Adjusted
An important utility of the adjusted coefficient of multiple determination or
The adjusted coefficient of multiple determination,
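For the one predictor case, the highest value of adjusted R² is 60.1.
For the two predictor case, the highest value of adjusted R² is 75.2.
For the three predictor case, the highest value of adjusted R² is 89.2.
For the four predictor case, the highest value of adjusted R² is 90.7.
For the five predictor case, the value of adjusted R² is 90.6.
The value of adjusted R² is highest, at 90.7, for the four predictor case.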
Thus, provided other factors do not affect the analysis it could be most preferable to use the regression equation corresponding to the predictors,
Hence, the variables for the model using the adjusted-
Both Mallows’
Chapter 8 Solutions
Statistics for Engineers and Scientists