
Predicting software defects. Refer to the PROMISE Software Engineering Repository data on 498 modules of software code written in the “C” language for a NASA spacecraft instrument, saved in the file. (See Exercise 3.132, p. 185.) Recall that the software code in each module was evaluated for defects; 49 were classified as “true” (i.e., module has defective code), and 449 were classified as “false” (i.e., module has correct code). Consider these to be independent random samples of software code modules. Researchers predicted the defect status of each module using the simple algorithm, “If the number of lines of code in the module exceeds 50, predict the module to have a defect.” The accompanying SPSS printout shows the number of modules in each of the two samples that were predicted to have defects (PRED_LOC = “yes”) and predicted to have no defects (PRED_LOC = “no”). Now, define the accuracy rate of the algorithm as the proportion of modules that were correctly predicted. Compare the accuracy rate of the algorithm when applied to modules with defective code with the accuracy rate of the algorithm when applied to modules with correct code. Use a 99% confidence interval.
DEFECT * PRED_LOC Crosstabulation
Count

| DEFECT | PRED_LOC = no | PRED_LOC = yes | Total |
| --- | --- | --- | --- |
| false | 400 | 49 | 449 |
| true | 29 | 20 | 49 |
| Total | 429 | 69 | 498 |
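One way to set up the requested comparison is sketched below. It is a minimal illustration and not the textbook's worked solution. It assumes the usual large-sample (Wald) interval for a difference between two independent proportions, with p1 the accuracy rate on defective modules (20 of 49 correctly predicted “yes”) and p2 the accuracy rate on correct modules (400 of 449 correctly predicted “no”). The function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.99):
    """Large-sample (Wald) confidence interval for p1 - p2.

    x1, x2 are the counts of correct predictions and n1, n2 the
    sample sizes. This is an assumed textbook-style formula:
    (p1 - p2) +/- z * sqrt(p1*q1/n1 + p2*q2/n2).
    """
    p1, p2 = x1 / n1, x2 / n2
    # Two-sided critical value, e.g. about 2.576 for 99% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Counts read from the crosstabulation
correct_defective, n_defective = 20, 49   # DEFECT = true, PRED_LOC = yes
correct_clean, n_clean = 400, 449         # DEFECT = false, PRED_LOC = no

lower, upper = diff_proportion_ci(
    correct_defective, n_defective, correct_clean, n_clean
)

print(f"accuracy on defective code: {correct_defective / n_defective:.3f}")
print(f"accuracy on correct code:   {correct_clean / n_clean:.3f}")
print(f"99% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval is roughly (−0.67, −0.30). Because it lies entirely below zero, it suggests the algorithm is less accurate on modules with defective code than on modules with correct code.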

Chapter 8 Solutions
Statistics for Business and Economics (13th Edition)

