HomeWork 2
Name: TARUN TARIKERE VENKATESHA    UCID: tt383

Q1)
A. What is the entropy of this collection of training examples?

The entropy of the training set is calculated using the formula
H(S) = -p1*log2(p1) - p2*log2(p2),
where p1 = P(yes) = 4/10 and p2 = P(no) = 6/10.

H(S) = -(4/10)*log2(4/10) - (6/10)*log2(6/10)
     = -0.4*log2(0.4) - 0.6*log2(0.6)
     = -0.4*(-1.322) - 0.6*(-0.737)
     = 0.529 + 0.442
     = 0.971
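This value can be reproduced with a minimal Python sketch; the entropy helper below is only an illustration of the formula above, not part of the required answer.

import math

def entropy(counts):
    # Shannon entropy (base 2) of a list of class counts.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# 4 "yes" and 6 "no" examples in the training set.
print(round(entropy([4, 6]), 3))   # -> 0.971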
B. What are the information gains of splitting on Body Temperature and on Gives Birth?

Gain = Entropy(Dataset) - [ Count(Group1)/Count(Dataset) * Entropy(Group1)
                          + Count(Group2)/Count(Dataset) * Entropy(Group2) ]

Body Temperature:
Entropy(warm-blooded) = -(4/5)*log2(4/5) - (1/5)*log2(1/5)
                      = -(0.8*(-0.322) + 0.2*(-2.322))
                      = 0.258 + 0.464
                      = 0.722
Entropy(cold-blooded) = -((5/5)*log2(5/5) + (0/5)*log2(0/5))
                      = -(1 * 0)
                      = 0
Gain = 0.971 - (0.5*0.722) - (0.5*0) = 0.971 - 0.361 = 0.610
Body Temperature, IG ≈ 0.61.

Gives Birth (5 yes, 5 no):
Gain = 0.971 - (5/10)*Entropy(S[yes]) - (5/10)*Entropy(S[no])
Entropy(S[yes]) = -((4/5)*log2(4/5) + (1/5)*log2(1/5))
                = -(0.8*(-0.322) + 0.2*(-2.322))
                = 0.258 + 0.464
                = 0.722
Entropy(S[no])  = -((5/5)*log2(5/5) + (0/5)*log2(0/5))
                = -(1 * 0)
                = 0
Gain = 0.971 - (0.5*0.722) - (0.5*0) = 0.971 - 0.361 = 0.610

Both attributes, "Body Temperature" and "Gives Birth", have approximately the same information gain of about 0.61.
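The information gain calculation can also be checked with a short Python sketch; the entropy and information_gain helpers are illustrative names, not part of the assignment.

import math

def entropy(counts):
    # Shannon entropy (base 2) of a list of class counts.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, children):
    # Gain = H(parent) - weighted sum of H(child) over the split's children.
    n = sum(parent_counts)
    return entropy(parent_counts) - sum(sum(ch) / n * entropy(ch) for ch in children)

# Both Body Temperature and Gives Birth split the 10 animals into a (4 yes, 1 no)
# group and a (0 yes, 5 no) group, so they have the same information gain.
print(round(information_gain([4, 6], [[4, 1], [0, 5]]), 3))   # -> 0.61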
C. Between "Gives Birth" and "Four Legged", which is the best split according to the classification error rate?

For each group, classification error = 1 - max(p_class).

Split on Gives Birth:
Group 1 (gives birth): 5 animals (4 yes, 1 no), error = 1 - 4/5 = 0.2
Group 2 (does not give birth): 5 animals (0 yes, 5 no), error = 1 - 5/5 = 0
Total classification error = (5/10)*0.2 + (5/10)*0 = 0.1

Split on Four Legged:
Group 1 (four-legged): 4 animals (2 yes, 2 no), error = 1 - 2/4 = 0.5
Group 2 (not four-legged): 6 animals (2 yes, 4 no), error = 1 - 4/6 ≈ 0.33
Total classification error = (4/10)*0.5 + (6/10)*0.33 = 0.2 + 0.2 = 0.4

"Gives Birth" has the lower classification error rate (0.1 vs. 0.4), so it is the better split according to the classification error rate.

D. Between "Gives Birth" and "Four Legged", which is the best split according to the Gini index?

Gini index: G = 1 - Σ(p_j)^2

Split on Gives Birth:
Gini(gives birth)        = 1 - (4/5)^2 - (1/5)^2 = 1 - 0.64 - 0.04 = 0.32
Gini(does not give birth) = 1 - (5/5)^2 - (0/5)^2 = 0
Weighted Gini = (5/10)*0.32 + (5/10)*0 = 0.16

Split on Four Legged:
Gini(four-legged)     = 1 - (2/4)^2 - (2/4)^2 = 0.5
Gini(not four-legged) = 1 - (2/6)^2 - (4/6)^2 ≈ 0.44
Weighted Gini = (4/10)*0.5 + (6/10)*0.44 ≈ 0.47

"Gives Birth" has the lower weighted Gini index (0.16 vs. about 0.47), so it is the better split according to the Gini index.

E. Using "Gives Birth" as the first split attribute and "Four Legged" as the second split attribute, the decision tree can be drawn and the information gain computed at each level to determine the best split for the given data.
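The split comparison in parts C and D can be double-checked with the Python sketch below. The (yes, no) counts for the Four Legged split are inferred from the fractions used in part C, and the helper names are illustrative only.

def gini(counts):
    # Gini index of a list of class counts: 1 - sum(p_j^2).
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def classification_error(counts):
    # Classification error of a node: 1 - fraction of the majority class.
    total = sum(counts)
    return 1.0 - max(counts) / total

def weighted_impurity(children, impurity):
    # Weighted impurity of a split over its child nodes.
    n = sum(sum(ch) for ch in children)
    return sum(sum(ch) / n * impurity(ch) for ch in children)

# (yes, no) class counts in each child group.
gives_birth = [[4, 1], [0, 5]]   # gives birth / does not give birth
four_legged = [[2, 2], [2, 4]]   # four-legged / not four-legged

for name, split in [("Gives Birth", gives_birth), ("Four Legged", four_legged)]:
    print(name,
          "error:", round(weighted_impurity(split, classification_error), 3),
          "gini:", round(weighted_impurity(split, gini), 3))
# Gives Birth: error 0.1, gini 0.16 -- Four Legged: error 0.4, gini 0.467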
Q2)
a - What does zero entropy mean? What is the possible minimum of entropy?

Entropy describes the uncertainty (disorder/randomness) in the class distribution of a dataset. Zero entropy means there is no uncertainty at all: every data point belongs to the same class, so the node is pure. Entropy is used to choose splits when building decision trees. The minimum possible entropy is zero; as a node becomes more impure, its entropy increases.

b - Describe pre-pruning and post-pruning techniques for dealing with decision tree overfitting.

Pre-pruning: also known as early stopping. Pre-pruning sets constraints while the decision tree is being grown, such as limiting the tree depth, requiring a minimum number of samples per leaf, or capping the number of leaf nodes. This keeps the decision tree from becoming too complex.

Post-pruning: the tree is first grown to its full size and then pruned back, replacing subtrees that do not improve generalization (for example, as judged on a validation set or by a cost-complexity criterion) with leaf nodes. This simplifies the tree after it has been built and reduces overfitting.
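As one possible illustration (not required by the question), scikit-learn's DecisionTreeClassifier supports both styles of pruning; the Iris data here is just placeholder data, and the choice of ccp_alpha from the pruning path is only an example.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pre-pruning: constrain the tree while it is being grown.
pre_pruned = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5, max_leaf_nodes=8,
                                    random_state=0).fit(X_train, y_train)

# Post-pruning (cost-complexity pruning): grow the full tree, then prune it back
# by choosing a non-zero ccp_alpha from the pruning path.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)
post_pruned = DecisionTreeClassifier(ccp_alpha=path.ccp_alphas[-2],
                                     random_state=0).fit(X_train, y_train)

print("pre-pruned depth:", pre_pruned.get_depth(),
      "post-pruned depth:", post_pruned.get_depth())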
Q3.
a) Compute the generalization error rate of the trees using the optimistic approach.
b) Compute the generalization error rate of the trees using the pessimistic approach. As the penalty term (hyperparameter) use α = 0.5, α = 0.75, α = 1. Use the depth of the tree as the complexity.
c) Based on the generalization errors from parts (a) and (b), which tree is preferred and why?
d) Based on Occam's Razor, which tree is better and why?

a) Generalization error, optimistic approach:

In the given decision trees each node reports two classes: Correct Results (represented with +) and Mistakes (represented with -). In the optimistic approach, the class with the lower count at a node is treated as the error:
i) If the count of 'Correct Results' is greater than that of 'Mistakes', then 'Mistakes' is taken as the error; that is, the tree is assumed to have correctly classified the 'Correct Results' instances and to have misclassified the 'Mistakes' instances.
ii) If the count of 'Mistakes' is greater than that of 'Correct Results', then 'Correct Results' is taken as the error; that is, the tree is assumed to have correctly classified the 'Mistakes' instances and to have misclassified the 'Correct Results' instances.
iii) If one class has a count of zero and the other class has the full count, the tree has classified that node perfectly, and the node contributes no error.

Optimistic generalization error = (Incorrect Classifications) / (Total Classifications)

For Tree 1, generalization error rate = 15/73 ≈ 0.205
For Tree 2, generalization error rate = 20/73 ≈ 0.274

b) Pessimistic generalization error = [Optimistic error count + (depth of tree * penalty hyperparameter α)] / (Total Classifications)

For Tree 1 (depth of tree = 4):
α = 0.5:  error rate = [15 + 4*0.5]/73  = 17/73  ≈ 0.233
α = 0.75: error rate = [15 + 4*0.75]/73 = 18/73  ≈ 0.247
α = 1:    error rate = [15 + 4*1]/73    = 19/73  ≈ 0.260

For Tree 2 (depth of tree = 3):
α = 0.5:  error rate = [20 + 3*0.5]/73  = 21.5/73  ≈ 0.295
α = 0.75: error rate = [20 + 3*0.75]/73 = 22.25/73 ≈ 0.305
α = 1:    error rate = [20 + 3*1]/73    = 23/73    ≈ 0.315
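The error rates in parts (a) and (b) can be reproduced for all three values of α with a small Python sketch; the error counts (15, 20), total (73), and depths (4, 3) are taken from the trees as described above, and the function names are illustrative.

def optimistic_error(training_errors, total):
    # Optimistic estimate: training errors divided by total instances.
    return training_errors / total

def pessimistic_error(training_errors, depth, alpha, total):
    # Pessimistic estimate: penalize the training errors by (tree depth * alpha),
    # as specified in the question (depth used as the complexity measure).
    return (training_errors + depth * alpha) / total

total = 73
trees = {"Tree 1": {"errors": 15, "depth": 4}, "Tree 2": {"errors": 20, "depth": 3}}

for name, t in trees.items():
    print(name, "optimistic:", round(optimistic_error(t["errors"], total), 3))
    for alpha in (0.5, 0.75, 1.0):
        print(name, "alpha =", alpha, "pessimistic:",
              round(pessimistic_error(t["errors"], t["depth"], alpha, total), 3))
# Tree 1: optimistic 0.205; pessimistic 0.233, 0.247, 0.26
# Tree 2: optimistic 0.274; pessimistic 0.295, 0.305, 0.315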
c) Observing the results from parts (a) and (b), the generalization error for Tree 1 is lower under both the optimistic and the pessimistic approach (for every value of α). Hence Tree 1 is preferred, because it makes fewer classification errors.

d) According to Occam's Razor, the simplest of the trees is the best for generalization. Of the two given trees, Tree 2 is the better one by Occam's Razor: Tree 1 has more leaf nodes and is therefore more specific than general, whereas Tree 2 has fewer leaf nodes and tends to generalize more.