Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner
3rd Edition
ISBN: 9781118729274
Author: Galit Shmueli, Peter C. Bruce, Nitin R. Patel
Publisher: WILEY
Students have asked these similar questions
What is the best way to decide how many epochs of training to perform?
- It is always obvious looking at the decision boundary when the model begins to overfit.
- None of the others.
- As soon as the value of the Testing dataset performance begins to decrease.
- As soon as the value of the Tuning dataset performance begins to decrease.
- As soon as the value of the Training dataset performance (accuracy, F1) begins to decrease.
- As soon as the value of the Testing dataset loss begins to increase.
- As soon as the value of the Tuning dataset loss begins to increase.
- As soon as the value of the Training dataset loss begins to increase.
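For readers who want to see what stopping on a tuning-set loss looks like in practice, here is a minimal sketch of early stopping in plain Python. It is an illustration under stated assumptions, not a solution taken from the textbook: the function name `best_epoch`, the `patience` parameter, and the loss values are all hypothetical.

```python
def best_epoch(tuning_losses, patience=1):
    """Return the 1-based epoch with the lowest tuning loss seen before stopping.

    Training is considered stopped once the tuning loss has failed to
    improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_at = 0
    bad_epochs = 0
    for epoch, loss in enumerate(tuning_losses, start=1):
        if loss < best_loss:
            # New best: remember it and reset the counter of bad epochs.
            best_loss, best_at, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break  # tuning loss stopped improving, so stop training
    return best_at


# Hypothetical per-epoch loss measured on the tuning (validation) set.
tuning_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.66]
print(best_epoch(tuning_losses))
```

With `patience=1` the loop stops at the first epoch whose tuning loss is not lower than the best so far. A larger `patience` tolerates short noisy upticks before stopping.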
Medical records show a sample population of 1000 people. Of those 1000 people, 98% do not have a terminal illness and 2% do have a terminal illness. A health insurance company would like to try out a new, cheaper test for terminal illness. Their results show that 98% of the people who do have a terminal illness test positive, while 1% of the people who do not have a terminal illness test positive for one. A corporation known as Ken’s Kids is concerned about patients who are slipping through the cracks with this new medical testing. If the new medical testing is adopted, what % of the people will be misdiagnosed as not having a terminal illness, but really have one? Assuming a population of 200 million people, how many people who have a terminal illness will, given this new testing, never know that they do? (Please show all work, and have a legend for symbols.)
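The arithmetic in this question can be sketched in a few lines of Python. This is one possible reading, not the textbook's worked answer: it assumes "% of the people" means the share of the whole population that is ill and tests negative. Legend: D = has the illness, T- = tests negative. The variable names are illustrative.

```python
from fractions import Fraction

# Figures stated in the question.
prevalence = Fraction(2, 100)     # P(D): share of people who have the illness
sensitivity = Fraction(98, 100)   # P(T+ | D): ill people who test positive
population = 200_000_000

# An ill person is missed when the test comes back negative.
false_negative_rate = 1 - sensitivity             # P(T- | D)
missed_share = prevalence * false_negative_rate   # P(D and T-)
missed_people = missed_share * population

print(f"P(T- | D)   = {float(false_negative_rate):.2%}")
print(f"P(D and T-) = {float(missed_share):.2%}")
print(f"Missed in {population:,} people: {int(missed_people):,}")
```

Under this reading, 0.04% of the whole population is ill but tests negative. If the question instead means the share of ill people who are missed, the relevant figure is P(T- | D) = 2%. Exact fractions are used so that the head count comes out as a whole number.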
Solve the problem using STEPWISE Method. Remember to use the step by step procedures (Step 1-7)
Chapter 5 Solutions
Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner
Similar questions
- Three classifiers are to be benchmarked. To this end, using the same data, the classifiers were trained, and the following table shows the validation results obtained with n = 16 observations. [Table: columns Ytrue, Y1, Y2, Y3 for observations 1 to 16; the cell values are garbled in the source.] Match the classifiers with the performance measures: accuracy and error rate for Y3; accuracy and error rate for Y2; TPR and FPR for Y1.
- Python regression.
  Model 1: train MSE = 0.423, test MSE = 0.978
  Model 2: train MSE = 0.572, test MSE = 0.644
  Model 3: train MSE = 0.218, test MSE = 1.103
  Based on this information, which of these models generalises the best to unseen data?
- What are the limitations of using the weighted scoring model?
- You decide to run a simpler model to predict churn, using only the variables tenure (in months) and TotalCharges (in US$). The output is given below. The AIC of this model is 4727.6 (in contrast to the AIC of 4240 for the full model). On the basis of this, which model would be expected to give superior predictive performance?
    ## Coefficients:
    ##                Estimate Std. Error z value Pr(>|z|)
    ## (Intercept)   2.471e-01  5.360e-02   4.611 4.01e-06 ***
    ## tenure       -1.124e-01  5.816e-03 -19.334  < 2e-16 ***
    ## TotalCharges  8.236e-04  5.618e-05  14.660  < 2e-16 ***
    ## ---
    ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ## (Dispersion parameter for binomial family taken to be 1)
    ## Null deviance: 5701.5 on 4921 degrees of freedom
    ## Residual deviance: 4721.6 on 4919 degrees of freedom
    ## AIC: 4727.6
  Confusion Matrix (Training), Actual vs. Predicted (Yes/No): cell counts 515, 345, 795, 3267 (the row and column assignment is not recoverable from the source). A second confusion matrix is truncated: 220, 145, 339…
  Options: The simpler model (with just tenure and TotalCharges); The full model (with all variables).
- Electronic Spreadsheet Applications. Compare What-If Analysis using Trial and Error and Goal Seek for the given scenario: Let's say a student is enrolled in an online class at a learning institution for a semester. His overall average grade stands at 43% in the course (Term Grade is 45%, Midterm Grade is 65%, Class Participation is 62% and Final Exam is 0%). Unfortunately, he missed his Final Exam and was given 0%. However, he has the opportunity to redo his Final Exam and needs at least an overall average of 60% to pass the course. How can you use Trial and Error and Goal Seek to find out the lowest grade he needs on the Final Exam to pass the class? Which method worked best for you and why?
- A number of parents have volunteered their children to participate in a developmental study administered by a local child psychologist. The following chart summarizes the results of the psychologist's assessments. [Chart: "Developmental Test Scores"; x-axis: Age (years), y-axis: Score on Developmental Test.] Which of the following statistical techniques can the psychologist use to determine the developmental score of a typical 4-year-old child despite the fact that no 4-year-old children participated in the study? Options: Regression; Clustering; Classification; Outlier/anomaly detection.
- Draw an ER model based on the following scenario. A travel agency has different bus stations at different locations. Each station has a unique name, location and buses. Each bus has a unique bus number, color and type. The bus takes routes between stations. Each route has a name, arrival time and departure time. A single bus can take routes to different stations. Some buses may not take any route on some days; however, a station must have a route each day. Each bus contains seats. Each seat has a unique number, row and seat type. There are three different types of seats: first class, standard and economy. Passengers can make reservations. A passenger has an id number, full name, age, gender and address. Each reservation will have a unique reservation number, fare, date and time. Each reservation can have any number of seats. The reservation is dependent on the passenger.
- Suppose you have to select one project partner from a set of four classmates, who have different GPAs. Assume you do not know any student’s GPA in advance but can get to know it after you have picked a student from that group. (a) Suppose you pick one of the four students at random and accept that student as your project partner. What is the probability that your partner is the one with the highest GPA? (b) Suppose you decide to reject the first student and to then accept the next student if and only if that student has a higher GPA. Note that you MUST have a partner, so if the first three are rejected by you, then you have to accept the fourth student. What is the probability that your partner will be the one with the highest GPA?
- The predictive performance of a model is the measure of how close the model’s prediction values are to the actual values. A close-to-ideal model would have the minimum error between the predicted and actual values. The validation set is used to assess the predictive ability of the model which has been trained using the training set. True or False?
- You are interested in looking at how competition affects gas prices, where you think that if a gas station has more competitors nearby, it will tend to have lower prices. To test this, you decide to use a natural experiment in which a chain of gas stations ("Thrifty") suddenly closed. This closure occurred between June and October of 2012, and meant that gas stations that were previously close to a Thrifty station experienced less competition than before. On the other hand, those stations that were not close to a Thrifty station did not experience any change in competition. You decide to use a differences-in-differences design, in which gas stations located near a Thrifty station at the start of 2012 are the treatment group and those that were not located near a Thrifty at the start of 2012 are the control group. You look at how prices change before and after the exit of Thrifty. This is plotted in the graph below, where the solid line represents the average price of stations that were…
- Consider a project in which a hospital emergency room aims to predict times of heavy/light demand, so they can schedule staff to handle the demand. They’ll use past data about demand to do so. Explain what underfitting a model to the past data would mean in this context, and how it would harm their predictions. Explain what overfitting would mean, and how it would harm their predictions.
- When assessing the accuracy of a logistic regression model, the percent of incorrectly classified observations out of the total observations in the validation or test data is called the: (Question 14 options) Accuracy; Overall error rate; Class Error.
Recommended textbooks for you
Database System Concepts
Computer Science
ISBN: 9780078022159
Author: Abraham Silberschatz, Henry F. Korth, S. Sudarshan
Publisher: McGraw-Hill Education
Starting Out with Python (4th Edition)
Computer Science
ISBN: 9780134444321
Author: Tony Gaddis
Publisher: PEARSON
Digital Fundamentals (11th Edition)
Computer Science
ISBN: 9780132737968
Author: Thomas L. Floyd
Publisher: PEARSON
C How to Program (8th Edition)
Computer Science
ISBN: 9780133976892
Author: Paul J. Deitel, Harvey Deitel
Publisher: PEARSON
Database Systems: Design, Implementation, & Manag...
Computer Science
ISBN: 9781337627900
Author: Carlos Coronel, Steven Morris
Publisher: Cengage Learning
Programmable Logic Controllers
Computer Science
ISBN: 9780073373843
Author: Frank D. Petruzella
Publisher: McGraw-Hill Education