Income and Percent Audited. The Transactional Records Access Clearinghouse at Syracuse University reported data showing the odds of an Internal Revenue Service audit. The following table shows the average adjusted gross income reported and the percent of the returns that were audited for 20 selected IRS districts.
- a. Develop the estimated regression equation that could be used to predict the percent audited given the average adjusted gross income reported.
- b. At the .05 level of significance, determine whether the adjusted gross income and the percent audited are related.
- c. Did the estimated regression equation provide a good fit? Explain.
- d. Use the estimated regression equation developed in part (a) to calculate a 95% confidence interval for the expected percent audited for districts with an average adjusted gross income of $35,000.
a.

Find the estimated regression equation to predict the percent audited given the average adjusted gross income reported.
Answer to Problem 67SE
The estimated regression equation to predict the percent audited, given the average adjusted gross income reported is as follows:
Explanation of Solution
Calculation:
The data give the average adjusted gross income ($) and the percent audited for 20 selected IRS districts.
In the given problem, the percent audited is the dependent variable (y) and the adjusted gross income is the independent variable (x).
Regression:
Software procedure:
Step-by-step procedure to obtain the estimated regression equation using Excel:
- In an Excel sheet, enter Adjusted Gross Income and Percent Audited in separate columns.
- In Data, select Data Analysis and choose Regression.
- In Input Y Range, select Percent Audited.
- In Input X Range, select Adjusted Gross income.
- Select Labels.
- Click OK.
Output obtained using EXCEL is given below:
Thus, the estimated regression equation to predict the percent audited, given the average adjusted gross income reported is as follows:
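The Excel Regression tool fits the line by ordinary least squares. A minimal sketch of that computation follows, assuming nothing beyond the standard formulas $b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $b_0 = \bar{y} - b_1 \bar{x}$. The function name `fit_simple_regression` and the four data points are made up for illustration, because the percent-audited values are not reproduced here.

```python
def fit_simple_regression(x, y):
    """Return (b0, b1) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squared deviations of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical toy data (NOT the IRS data): points lying exactly on y = 1 + 2x
toy_x = [1, 2, 3, 4]
toy_y = [3, 5, 7, 9]
b0, b1 = fit_simple_regression(toy_x, toy_y)
print(f"y_hat = {b0:.4f} + {b1:.4f} x")
```

With the actual district data, `x` would hold the adjusted gross incomes and `y` the percent audited.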
b.

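Use the .05 level of significance to determine whether the adjusted gross income and the percent audited are related.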
Answer to Problem 67SE
There is a significant relationship between the adjusted gross income and the percent audited.
Explanation of Solution
Calculation:
State the test hypotheses.
Null hypothesis: $H_0: \beta_1 = 0$
That is, there is no significant relationship between the adjusted gross income and the percent audited.
Alternative hypothesis: $H_a: \beta_1 \neq 0$
That is, there is a significant relationship between the adjusted gross income and the percent audited.
From the output in Part (a), it is found that the F-test statistic is 4.99.
Level of significance:
The given level of significance is $\alpha = 0.05$.
p-value:
From the output in Part (a), it is found that the p-value is 0.038.
Rejection rule:
If the p-value $\le \alpha$, then reject the null hypothesis.
Conclusion:
Here, the p-value is less than the level of significance.
That is, p-value (0.038) < $\alpha$ (0.05).
Thus, the decision is “reject the null hypothesis”.
Therefore, the data provide sufficient evidence to conclude that there is a significant relationship between the adjusted gross income and the percent audited.
Thus, the adjusted gross income and the percent audited are related.
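The reported p-value can be checked from the F statistic alone. The sketch below assumes the standard fact that in simple linear regression the F test with (1, n − 2) degrees of freedom is equivalent to a two-sided t test with $t = \sqrt{F}$. It integrates the t density numerically with Simpson's rule so that only the Python standard library is needed. The helper names are illustrative, not part of any library.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_t_p_value(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t_stat|]; steps must be even."""
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 <= T <= |t_stat|)
    return 1 - 2 * central_area


f_stat = 4.99      # F statistic quoted from the output in Part (a)
df_error = 20 - 2  # n - 2 error degrees of freedom
alpha = 0.05

p_value = two_sided_t_p_value(math.sqrt(f_stat), df_error)
reject_null = p_value <= alpha
print(f"p-value = {p_value:.3f}, reject H0: {reject_null}")
```

Because the F statistic is rounded to two decimals, the computed p-value should agree with the reported 0.038 only to about three decimal places.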
c.

Explain whether the estimated regression equation provides a good fit to the data.
Answer to Problem 67SE
The estimated regression equation does not provide a good fit to the data.
Explanation of Solution
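The coefficient of determination ($r^2$) is the proportion of the variation in the dependent variable that is explained by the estimated regression equation.
In the given output of Part (a), $r^2 = 0.2171$.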
Thus, only 21.71% of the variability in percent audited is explained by its linear relationship with the adjusted gross income.
Thus, the estimated regression equation does not provide a good fit to the data.
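As a consistency check, the coefficient of determination and the F statistic in simple linear regression are tied together by $F = r^2 (n - 2) / (1 - r^2)$. The short sketch below uses only the values quoted in this solution ($r^2 = 0.2171$, $n = 20$) and should recover the F statistic of 4.99 used in part (b), up to rounding.

```python
r_squared = 0.2171  # coefficient of determination from the output in Part (a)
n = 20              # number of IRS districts

# Identity for simple linear regression: F = (SSR / 1) / (SSE / (n - 2))
f_from_r2 = r_squared * (n - 2) / (1 - r_squared)
print(f"F implied by r^2: {f_from_r2:.2f}")
```

A low $r^2$ together with a significant F test is not a contradiction: the slope is distinguishable from zero, yet most of the variation in percent audited is left unexplained.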
d.

Find a 95% confidence interval for the expected percent audited for districts with an average adjusted gross income of $35,000.
Answer to Problem 67SE
The 95% confidence interval for the expected percent audited for districts with an average adjusted gross income of $35,000 is
Explanation of Solution
Calculation:
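The estimate of the standard deviation of $\hat{y}^*$ is given by $s_{\hat{y}^*} = s\sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$, where $s$ is the standard error of the estimate and $x^*$ is the given value of the independent variable.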
From Part (a), the estimated regression equation is as follows:
Also, the mean square error (MSE) is 0.0436.
Substituting $x^* = 35{,}000$ into the estimated regression equation gives the point estimate $\hat{y}^*$ of the expected percent audited.
The standard error of the estimate is obtained as follows: $s = \sqrt{\text{MSE}} = \sqrt{0.0436} = 0.2088$
Thus, the standard error of the estimate is 0.2088.
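It is known that for a sample of size n, the mean of a random variable x can be obtained as follows: $\bar{x} = \dfrac{\sum x_i}{n}$
Thus, the mean of the random variable x is obtained as follows: $\bar{x} = \dfrac{672{,}713}{20} = 33{,}635.65 \approx 33{,}636$
The mean of the random variable x is approximately 33,636.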
The value of $\sum (x_i - \bar{x})^2$ is obtained as follows:
$x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
36,664 | 3,028 | 9,168,784 |
38,845 | 5,209 | 27,133,681 |
34,886 | 1,250 | 1,562,500 |
32,512 | –1,124 | 1,263,376 |
34,531 | 895 | 801,025 |
35,995 | 2,359 | 5,564,881 |
37,799 | 4,163 | 17,330,569 |
33,876 | 240 | 57,600 |
30,513 | –3,123 | 9,753,129 |
30,174 | –3,462 | 11,985,444 |
30,060 | –3,576 | 12,787,776 |
37,153 | 3,517 | 12,369,289 |
34,918 | 1,282 | 1,643,524 |
33,291 | –345 | 119,025 |
31,504 | –2,132 | 4,545,424 |
29,199 | –4,437 | 19,686,969 |
33,072 | –564 | 318,096 |
30,859 | –2,777 | 7,711,729 |
32,566 | –1,070 | 1,144,900 |
34,296 | 660 | 435,600 |
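For an adjusted gross income of $x^* = 35{,}000$, with $\sum (x_i - \bar{x})^2 = 145{,}383{,}321$ (the sum of the last column of the table above), the standard deviation of $\hat{y}^*$ is $s_{\hat{y}^*} = 0.2088\sqrt{\dfrac{1}{20} + \dfrac{(35{,}000 - 33{,}636)^2}{145{,}383{,}321}} \approx 0.0523$
Thus, the standard deviation of $\hat{y}^*$ is approximately 0.0523.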
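The arithmetic in this step can be reproduced directly from the income column of the table. The sketch below uses only values quoted in this solution (the 20 incomes, MSE = 0.0436, and $x^* = 35{,}000$) and the standard formula for the standard deviation of $\hat{y}^*$. It follows the table in rounding the mean to 33,636 before taking deviations, which is an assumption about how the table was built.

```python
import math

# Average adjusted gross income ($) for the 20 districts, from the table
incomes = [
    36664, 38845, 34886, 32512, 34531, 35995, 37799, 33876, 30513, 30174,
    30060, 37153, 34918, 33291, 31504, 29199, 33072, 30859, 32566, 34296,
]

n = len(incomes)
x_bar = sum(incomes) / n      # exact mean of the incomes
x_bar_rounded = round(x_bar)  # the table's deviations use the rounded mean

# Sum of squared deviations (last column of the table)
sxx = sum((x - x_bar_rounded) ** 2 for x in incomes)

mse = 0.0436                  # mean square error from the output in Part (a)
s = math.sqrt(mse)            # standard error of the estimate

x_star = 35000
s_yhat = s * math.sqrt(1 / n + (x_star - x_bar_rounded) ** 2 / sxx)

print(f"x_bar = {x_bar:.2f}, Sxx = {sxx:,}, s = {s:.4f}, s_yhat = {s_yhat:.4f}")
```

Using the unrounded mean instead changes the result only beyond the fourth decimal place.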
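The confidence interval for the expected value of y is given by $\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*}$, where $s_{\hat{y}^*}$ is the estimated standard deviation of $\hat{y}^*$ and $t_{\alpha/2}$ is based on the t distribution with $n - 2$ degrees of freedom.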
Degrees of freedom:
For a sample of size n, the degrees of freedom is given as $n - 2$.
In this given problem, for a sample size 20, the degrees of freedom is as follows: $n - 2 = 20 - 2 = 18$
Thus, the degrees of freedom is 18.
Level of significance:
For 95% confidence, the level of significance is $\alpha = 0.05$.
For a two-tailed distribution: $\alpha/2 = 0.025$
From Table 2 of “t Distribution” in Appendix B, it is found that the value of t with an upper-tail area of 0.025 and 18 degrees of freedom is $t_{0.025} = 2.101$.
Therefore, the required confidence interval is obtained as follows:
Thus, the 95% confidence interval for the expected percent audited for districts with an average adjusted gross income of $35,000 is
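The interval has the form $\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*}$, so its half-width can be computed from the summary values alone. In the sketch below, the sum of squared deviations 145,383,321 is the column total of the table above, and 2.101 is the standard t-table value for an upper-tail area of 0.025 with 18 degrees of freedom. The helper `mean_response_interval` is a hypothetical name, and the point estimate passed to it in the demonstration is a placeholder, not the value from this problem.

```python
import math


def mean_response_interval(y_hat_star, s_yhat_star, t_crit):
    """Return (lower, upper) limits of y_hat_star +/- t_crit * s_yhat_star."""
    margin = t_crit * s_yhat_star
    return y_hat_star - margin, y_hat_star + margin


# Summary values quoted in this solution
s = 0.2088           # standard error of the estimate
n = 20               # number of districts
x_bar = 33636        # mean adjusted gross income (rounded)
sxx = 145_383_321    # column total of (x_i - x_bar)^2 from the table
x_star = 35000       # income at which the mean response is estimated
t_crit = 2.101       # t value for upper-tail area 0.025 with 18 degrees of freedom

s_yhat_star = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)
margin = t_crit * s_yhat_star
print(f"margin of error = {margin:.4f}")

# Placeholder inputs, only to show how the helper is called
demo_lower, demo_upper = mean_response_interval(1.0, 0.5, 2.0)
```

Passing the point estimate from the regression equation at $x^* = 35{,}000$ as `y_hat_star`, together with `s_yhat_star` and `t_crit`, yields the interval limits.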
Chapter 14 Solutions
Essentials of Statistics for Business and Economics