
Baseball Pitcher Performance. Major League Baseball (MLB) consists of teams that play in the American League and the National League. MLB collects a wide variety of team and player statistics. Some of the statistics often used to evaluate pitching performance are as follows:
ERA: The average number of earned runs given up by the pitcher per nine innings. An earned run is any run that the opponent scores off a particular pitcher except for runs scored as a result of errors.
SO/IP: The average number of strikeouts per inning pitched.
HR/IP: The average number of home runs per inning pitched.
R/IP: The number of runs given up per inning pitched.
The following data show values for these statistics for a random sample of 20 pitchers from the American League for a full season.
- a. Develop an estimated regression equation that can be used to predict the average number of runs given up per inning given the average number of strikeouts per inning pitched.
- b. Develop an estimated regression equation that can be used to predict the average number of runs given up per inning given the average number of home runs per inning pitched.
- c. Develop an estimated regression equation that can be used to predict the average number of runs given up per inning given the average number of strikeouts per inning pitched and the average number of home runs per inning pitched.
- d. A. J. Burnett, a pitcher for the New York Yankees, had an average number of strikeouts per inning pitched of .91 and an average number of home runs per inning of .16. Use the estimated regression equation developed in part (c) to predict the average number of runs given up per inning for A. J. Burnett. (Note: The actual value for R/IP was .6.)
- e. Suppose a suggestion was made to also use the earned run average as another independent variable in part (c). What do you think of this suggestion?
a.

Find the estimated regression equation that could be used to predict the average number of runs given up per inning when the average number of strikeouts per inning pitched is given.
Answer to Problem 10E
The estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of strikeouts per inning pitched, is
Explanation of Solution
Calculation:
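The Major League Baseball (MLB) data set consists of statistics on the average number of earned runs given up by the pitcher per nine innings (ERA), the average number of strikeouts per inning pitched (SO/IP), the average number of home runs per inning pitched (HR/IP), and the number of runs given up per inning pitched (R/IP).
Multiple linear regression model:
A multiple linear regression model with $p$ independent variables is given as
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon,$$
where $y$ is the dependent variable, $x_1, \ldots, x_p$ are the independent variables, $\beta_0, \ldots, \beta_p$ are the parameters, and $\varepsilon$ is the error term. With a single independent variable ($p = 1$), as in this part, it reduces to the simple linear regression model $y = \beta_0 + \beta_1 x_1 + \varepsilon$.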
Regression:
Software procedure:
Step-by-step procedure to get the regression equation using EXCEL software:
- Open an EXCEL sheet and enter the data on W, L, ERA, SO/IP, HR/IP and R/IP corresponding to columns A, B, C, D, E and F, respectively.
- Select Data > Data Analysis > Regression.
- Click OK.
- Under Input Y Range enter $F$1:$F$21.
- Under Input X Range enter $D$1:$D$21.
- Click the box of Labels.
- Under Output Range enter $I$1.
- Click OK.
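The same least-squares fit can be sketched outside Excel. The Python snippet below is a minimal illustration, assuming NumPy is available. The function name `fit_simple_regression` and the demo arrays are hypothetical placeholders chosen so the answer is known in advance; they are not the 20-pitcher data set, which is not reproduced here.

```python
import numpy as np

def fit_simple_regression(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    # Intercept: the fitted line passes through (x_bar, y_bar).
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

# Placeholder values only (NOT the textbook data): y is exactly 1 - 0.5 * x.
so_ip_demo = [0.4, 0.6, 0.8, 1.0]
r_ip_demo = [0.8, 0.7, 0.6, 0.5]
b0, b1 = fit_simple_regression(so_ip_demo, r_ip_demo)
print(f"R/IP_hat = {b0:.3f} + ({b1:.3f}) * SO/IP")
```

Replacing the placeholder lists with the SO/IP and R/IP columns should give the same intercept and slope that the Excel Regression tool reports.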
The output using EXCEL software is given as follows:
Thus, the estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of strikeouts per inning pitched, is
b.

Find the estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of home runs per inning pitched.
Answer to Problem 10E
The estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of home runs per inning pitched, is
Explanation of Solution
Calculation:
The regression equation can be obtained using EXCEL software.
Software procedure:
Step-by-step procedure to get regression equation using EXCEL software:
- Open an EXCEL sheet and enter the data for W, L, ERA, SO/IP, HR/IP and R/IP in columns A, B, C, D, E and F, respectively.
- Select Data > Data Analysis > Regression.
- Click OK.
- Under Input Y Range enter $F$1:$F$21.
- Under Input X Range enter $E$1:$E$21.
- Click the box of Labels.
- Under Output Range enter $L$1.
- Click OK.
The output using EXCEL software is given as,
Thus, the estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of home runs per inning pitched, is
c.

Find the estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of strikeouts per inning pitched and the average number of home runs per inning pitched.
Answer to Problem 10E
The estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of strikeouts per inning pitched and the average number of home runs per inning pitched, is
Explanation of Solution
Calculation:
The regression equation can be obtained using EXCEL software.
Software procedure:
Step-by-step procedure to get regression equation using EXCEL software:
- Open an EXCEL sheet and enter the data for W, L, ERA, SO/IP, HR/IP and R/IP in columns A, B, C, D, E and F, respectively.
- Select Data > Data Analysis > Regression.
- Click OK.
- Under Input Y Range enter $F$1:$F$21.
- Under Input X Range enter $D$1:$E$21.
- Click the box of Labels.
- Under Output Range enter $O$1.
- Click OK.
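For readers working outside Excel, the two-predictor fit can be sketched with a least-squares solve. This is a minimal illustration assuming NumPy; `fit_multiple_regression` and the demo values are hypothetical and are not the textbook data.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Return [b0, b1, ..., bp] for y_hat = b0 + b1*x1 + ... + bp*xp."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

# Placeholder (SO/IP, HR/IP) pairs only, NOT the textbook data.
predictors_demo = [[0.5, 0.10], [0.7, 0.05], [0.9, 0.15], [1.1, 0.10], [0.6, 0.20]]
# Response built exactly as 0.5 - 0.2*SO/IP + 1.0*HR/IP so the fit is checkable.
response_demo = [0.5 - 0.2 * so + 1.0 * hr for so, hr in predictors_demo]
coeffs_demo = fit_multiple_regression(predictors_demo, response_demo)
print(coeffs_demo)  # intercept, SO/IP coefficient, HR/IP coefficient
```

Passing the SO/IP and HR/IP columns as `X` and the R/IP column as `y` corresponds to the Input X Range and Input Y Range in the steps above.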
The output using EXCEL software is given as,
Thus, the estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of strikeouts per inning pitched and the average number of home runs per inning pitched, is
d.

Predict the average number of runs given up per inning for Pitcher A using the regression equation of Part c.
Answer to Problem 10E
The predicted average number of runs given up per inning for Pitcher A is 0.4757.
Explanation of Solution
Calculation:
The average number of strikeouts per inning pitched and the average number of home runs per inning pitched for Pitcher A are 0.91 and 0.16, respectively.
From Part (c), the estimated regression equation that could be used to predict the average number of runs given up per inning, given the average number of strikeouts per inning pitched and the average number of home runs per inning pitched, is
Thus, using the given information, the predicted average number of runs given up per inning for Pitcher A is calculated below.
Thus, the predicted average number of runs given up per inning for Pitcher A is 0.4757.
It is to be noted that the actual average number of runs given up per inning for Pitcher A was 0.6.
Thus, the predicted value is less than the actual value. The residual, 0.6 - 0.4757 = 0.1243, is positive, which means the model underestimates the runs given up per inning for Pitcher A.
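The comparison between the actual and predicted values can be checked with a short calculation. The sketch below uses only the two figures quoted in this solution; the helper name `residual` is hypothetical.

```python
def residual(actual, predicted):
    """Residual = observed value minus predicted value."""
    return actual - predicted

actual_r_ip = 0.6        # actual R/IP quoted in the problem statement
predicted_r_ip = 0.4757  # prediction reported in the solution to part (d)
res = residual(actual_r_ip, predicted_r_ip)
print(round(res, 4))  # 0.1243
```

A positive residual means the observed value lies above the fitted value, so the prediction is too low for this pitcher.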
e.

Explain about the suggestion to use the earned run average as another independent variable in the regression model developed in Part c.
Explanation of Solution
Calculation:
To evaluate the suggestion, consider how strongly the earned run average (ERA) is linearly related to the response variable, R/IP. The correlation coefficient measures the strength of the linear relationship between two variables.
Software procedure:
Step-by-step procedure to get correlation coefficient using EXCEL software:
- Open an EXCEL sheet and enter the data for W, L, ERA, SO/IP, HR/IP and R/IP in columns A, B, C, D, E and F, respectively.
- In cell I1 enter the formula =CORREL(F1:F21,C1:C21).
- Press Enter.
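The CORREL calculation has a direct counterpart in Python. The snippet below is a minimal sketch assuming NumPy; `pearson_r` is a hypothetical helper, and the demo values are placeholders rather than the textbook data. They are built on the simplifying assumption that every run is earned, in which case R/IP equals ERA divided by 9.

```python
import numpy as np

def pearson_r(x, y):
    """Return the sample correlation coefficient between x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; take the off-diagonal entry.
    return float(np.corrcoef(x, y)[0, 1])

# Placeholder ERA values only (NOT the textbook data).
era_demo = [3.0, 3.5, 4.0, 4.5]
# Assumption for illustration: all runs are earned, so R/IP = ERA / 9.
r_ip_from_era_demo = [era / 9 for era in era_demo]
r_demo = pearson_r(era_demo, r_ip_from_era_demo)
print(r_demo)
```

Under that assumption the two variables are perfectly correlated, which shows how closely ERA and R/IP are tied by definition.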
The EXCEL output is given as follows:
Thus, the correlation coefficient between R/IP and ERA is approximately 0.964, which is very close to 1, indicating a very strong positive linear relationship between the variables.
Although a strong association might appear to support adding ERA, this correlation arises because the two variables measure nearly the same thing: ERA counts earned runs per nine innings, so if a pitcher gives up more runs per inning, the pitcher's earned run average must also increase. Using ERA as a predictor would therefore produce a high coefficient of determination almost automatically.
Thus, the suggestion does not make sense: adding the earned run average as another independent variable in the regression model developed in Part c would amount to predicting runs given up from another measure of runs given up.
Chapter 15 Solutions
Bundle: Essentials Of Statistics For Business And Economics, Loose-leaf Version, 8th + Lms Integrated For Mindtap Business Statistics, 1 Term (6 Months) Printed Access Card



