a.
Explanation of Solution
Given: Insurance records are examined to build a model that predicts fraudulent claims. Based on the historical data, only 1% of claims are fraudulent.
The model is applied to a sample of n = 800 records. It correctly classifies 310 frauds and 270 non-frauds. It misses 90 frauds and incorrectly marks 130 non-fraud records as fraud...
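The four counts in the Given statement can be laid out as a classification matrix. The following is a minimal sketch, assuming rows are the actual class and columns are the predicted class; the names `counts`, `classes`, and `build_matrix` are illustrative and do not come from the textbook.

```python
# Counts from the Given statement, keyed as (actual class, predicted class).
counts = {
    ("fraud", "fraud"): 310,          # frauds correctly classified
    ("fraud", "non-fraud"): 90,       # frauds the model missed
    ("non-fraud", "fraud"): 130,      # non-frauds incorrectly marked as fraud
    ("non-fraud", "non-fraud"): 270,  # non-frauds correctly classified
}
classes = ["fraud", "non-fraud"]


def build_matrix(counts, classes):
    """Return a nested list: one row per actual class, one column per predicted class."""
    return [[counts[(actual, predicted)] for predicted in classes] for actual in classes]


matrix = build_matrix(counts, classes)
total = sum(sum(row) for row in matrix)

# Off-diagonal cells are the misclassified records in the sample as it stands.
sample_error = (matrix[0][1] + matrix[1][0]) / total

print("actual \\ predicted", classes)
for label, row in zip(classes, matrix):
    print(label, row)
print("n =", total, "| unadjusted misclassification rate =", sample_error)
```

The row totals (400 actual frauds and 400 actual non-frauds) show that the sample is balanced, which is very different from the 1% fraud rate in the historical data.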
b.
Explanation of Solution
Given: Insurance records are examined to build a model that predicts fraudulent claims. Based on the historical data, only 1% of claims are fraudulent.
The model is applied to a sample of n = 800 records. It correctly classifies 310 frauds and 270 non-frauds. It misses 90 frauds and incorrectly marks 130 non-fraud records as fraud.
To find: The adjusted misclassification rate from the records of the predictive model.
Solution:
By analyzing the records from the classification matrix,
Predicted records of fraudulent without any record of non-fraudulent = 310 - 90 = 220...
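One standard way to adjust a misclassification rate for oversampling is to weight each class's error rate by that class's share of the population. The sketch below assumes that the sample holds 400 actual frauds (310 + 90) and 400 actual non-frauds (270 + 130), and that the 1% historical rate is the population fraud rate. The function name is hypothetical, and this may not be the exact route taken by the truncated solution above.

```python
def adjusted_misclassification(tp, fn, fp, tn, population_fraud_rate):
    """Reweight per-class error rates by population class shares.

    tp, fn: actual frauds classified as fraud / as non-fraud
    fp, tn: actual non-frauds classified as fraud / as non-fraud
    """
    miss_rate = fn / (tp + fn)          # share of actual frauds that were missed
    false_alarm_rate = fp / (fp + tn)   # share of actual non-frauds marked as fraud
    return (population_fraud_rate * miss_rate
            + (1 - population_fraud_rate) * false_alarm_rate)


# Counts from the Given statement; 0.01 is the historical fraud rate.
adjusted = adjusted_misclassification(tp=310, fn=90, fp=130, tn=270,
                                      population_fraud_rate=0.01)
unadjusted = (90 + 130) / 800

print(f"unadjusted: {unadjusted:.4f}, adjusted: {adjusted:.4f}")
```

Under these assumptions the adjusted rate is dominated by the error rate on non-fraud records, because non-frauds make up 99% of the population.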
c.
Explanation of Solution
Given: Insurance records are examined to build a model that predicts fraudulent claims. Based on the historical data, only 1% of claims are fraudulent.
The model is applied to a sample of n = 800 records. It correctly classifies 310 frauds and 270 non-frauds. It misses 90 frauds and incorrectly marks 130 non-fraud records as fraud...
Chapter 5 Solutions
Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner