4. Consider a node, t, in a decision tree. The set of training records in node t is Dt, which contains records from three different classes C₁, C₂, and C₃. The class distribution is as follows:

Class    Number of records
C₁       x
C₂       x
C₃       2x

What are the impurity measures of this node using the Gini index and the misclassification error?
Expert Solution

Step 1 Explanation
Dear Student,
Formula for the misclassification error: $1 - \max_j p(j \mid t)$. For two classes, where one class has probability $p$, this reduces to $1 - \max(p, 1 - p)$.
Formula for the Gini index: $1 - \sum_j p(j \mid t)^2$
Here, $p(j \mid t)$ is the estimated probability that a record in node t belongs to class j, that is, the fraction of the records in t that are in class j.
The total number of records is $x + x + 2x = 4x$: classes C₁ and C₂ have x records each and class C₃ has 2x records.
Thus, the class probabilities are $p(C_1 \mid t) = p(C_2 \mid t) = x/4x = 1/4$
and $p(C_3 \mid t) = 2x/4x = 1/2$.
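The remaining arithmetic can be checked with a short Python sketch. It is only an illustration, not part of the posted solution: the helper names (`class_probabilities`, `gini_index`, `misclassification_error`) are hypothetical, and it assumes the standard definitions, Gini index = 1 minus the sum of squared class probabilities and misclassification error = 1 minus the largest class probability. Exact fractions are used so that the value chosen for x visibly cancels out.

```python
from fractions import Fraction


def class_probabilities(counts):
    """Turn per-class record counts into exact class fractions."""
    total = sum(counts)
    return [Fraction(c, total) for c in counts]


def gini_index(probs):
    """Gini index: 1 minus the sum of squared class probabilities."""
    return 1 - sum(p * p for p in probs)


def misclassification_error(probs):
    """Misclassification error: 1 minus the largest class probability."""
    return 1 - max(probs)


# x is arbitrary (assumed positive); it cancels in every fraction.
x = 3
probs = class_probabilities([x, x, 2 * x])  # C1, C2, C3

gini = gini_index(probs)
error = misclassification_error(probs)
print(gini, error)  # prints: 5/8 1/2
```

Under these definitions the node has a Gini index of 1 - (1/16 + 1/16 + 1/4) = 5/8 = 0.625 and a misclassification error of 1 - 1/2 = 0.5, whatever the value of x.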