Homework 8C

Problem 3

Suppose that we use bootstrapping (i.e., sampling with replacement) to sample 5 data points out of the training dataset {x1, x2, ..., x10}. Answer the following questions.

1. Is it possible for the sampled bootstrap to be {x1, x2, x3, x4, x5}? If no, please explain why; otherwise calculate the probability of getting this specific bootstrap sample.

Yes, {x1, x2, x3, x4, x5} can be obtained as a bootstrap sample, because bootstrapping samples with replacement and every data point has the same chance of being selected on each draw. The probability of drawing these five points in a particular order is:

Probability of selecting x1 on one draw = 1/10 (since there are 10 data points)
Probability of selecting x2, x3, x4, and x5 on the subsequent draws = 1/10 each
Probability of the ordered sequence (x1, x2, x3, x4, x5) = (1/10)^5 = 1/100,000

If the bootstrap sample is instead treated as an unordered collection, these five distinct points can appear in any of 5! = 120 orders, so the probability of the set {x1, x2, x3, x4, x5} is 120 * (1/10)^5 = 0.0012.

2. Is it possible for the sampled bootstrap to be {x1, x1, x1, x1, x1}? If no, please explain why; otherwise calculate the probability of getting this specific bootstrap sample.

Yes, {x1, x1, x1, x1, x1} is a possible bootstrap sample. Because sampling is done with replacement, the same data point can be chosen on every draw. The probability of selecting x1 on all five draws is (1/10)^5 = 1/100,000.

3. Compute the probability of a data sample, say x1, NOT being included in the bootstrap.

Because there are 9 other data points besides x1, the probability that x1 is not chosen on a single draw is 9/10. Since the five draws are independent, the probability that x1 is not included in any of them is:

Probability of x1 not being selected on one draw = 9/10
Probability of x1 not being selected on any of the five draws = (9/10)^5 = 59049/100000 = 0.59049 ≈ 59.05%
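As a sanity check on these numbers, here is a minimal Python sketch (assuming NumPy is available; index 0 stands in for x1) that estimates by simulation both the ordered-sequence probability from part 1 and the probability from part 3 that x1 never appears in the bootstrap sample:

```python
import numpy as np

rng = np.random.default_rng(0)
n_points, n_draws, n_trials = 10, 5, 200_000

# Draw n_trials bootstrap samples of size 5 from indices 0..9 (with replacement).
samples = rng.integers(0, n_points, size=(n_trials, n_draws))

# Fraction of samples that are exactly (x1, x2, x3, x4, x5) in order -- compare with (1/10)^5.
in_order = np.mean(np.all(samples == np.arange(n_draws), axis=1))

# Fraction of samples in which x1 (index 0) never appears -- compare with (9/10)^5 = 0.59049.
x1_missing = np.mean(np.all(samples != 0, axis=1))

print(f"ordered (x1..x5): {in_order:.6f}  (exact {0.1**5:.6f})")
print(f"x1 not in sample: {x1_missing:.5f}  (exact {0.9**5:.5f})")
```

The ordered-sequence event is rare (one in 100,000), so its simulated estimate is noisy at this trial count, while the (9/10)^5 estimate converges closely to 0.59049.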
Problem 4

In a trained classification tree, each leaf node may contain data points from different classes. For instance, a leaf node might have 10 data points from the positive class (+1) and 5 data points from the negative class (-1). Suppose that we use this classification tree to predict a test input x, and as the tree's decision process unfolds, this input eventually arrives at a specific leaf node L.

1. If the leaf node L contains 2 positive training samples and 5 negative samples, how will this tree classify x? What is the estimated probability, P[x is +1], that x is positive?

The classification is determined by the majority class at the leaf node. Since leaf node L contains more negative samples (5) than positive samples (2), the tree classifies the test input x as belonging to the negative class (-1). The estimated probability that x is positive is

P[x is +1] = (number of positive samples at leaf node L) / (total number of samples at leaf node L) = 2 / (2 + 5) = 2/7.

2. Suppose we produce ten bootstrapped samples from a data set containing +1 and -1 classes. We then apply a classification tree to each bootstrapped sample and, for a specific test input x, produce 10 estimates of P[x is +1]: 0.1, 0.15, 0.2, 0.2, 0.55, 0.6, 0.6, 0.65, 0.7, 0.75. There are two common ways to combine these results together into a single class prediction. One is the majority vote approach discussed in this chapter. The second approach is to classify based on the average probability. In this example, what is the final classification under each of these two approaches?

Majority Vote Approach: In the majority vote approach, the final classification is determined by the majority of the individual classifications. Counting the estimates greater than or equal to 0.5, we find 6 of the 10 (0.55, 0.6, 0.6, 0.65, 0.7, 0.75), so 6 trees vote for the positive class. Therefore, the majority vote results in a classification of +1.

Average Probability Approach:
In the average probability approach, the final classification is based on the average of the individual probability estimates:

Average probability = (0.1 + 0.15 + 0.2 + 0.2 + 0.55 + 0.6 + 0.6 + 0.65 + 0.7 + 0.75) / 10 = 4.5 / 10 = 0.45.

Since 0.45 is less than 0.5, the final classification under the average probability approach is -1.

As a result, for this test input x and the given set of probability estimates, the final classification is +1 under the majority vote approach but -1 under the average probability approach.
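A short Python sketch of the two combination rules applied to these ten estimates (assuming the usual convention that a tree votes +1 when its estimate is at least 0.5):

```python
# Per-tree estimates of P[x is +1] from the ten bootstrapped trees.
probs = [0.1, 0.15, 0.2, 0.2, 0.55, 0.6, 0.6, 0.65, 0.7, 0.75]

# Majority vote: each tree predicts +1 if its probability estimate >= 0.5.
votes_plus = sum(p >= 0.5 for p in probs)
majority_label = +1 if votes_plus > len(probs) / 2 else -1

# Average probability: predict +1 if the mean estimate >= 0.5.
avg = sum(probs) / len(probs)
average_label = +1 if avg >= 0.5 else -1

print(f"majority vote: {votes_plus}/{len(probs)} trees vote +1 -> {majority_label:+d}")
print(f"average prob : {avg:.2f} -> {average_label:+d}")
```

Running this prints 6/10 votes for +1 (majority vote gives +1) and an average of 0.45 (average probability gives -1), matching the calculations above.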
Problem 5

Single-layer perceptrons are simplified neural networks. In this exercise, we use the sign function (the "hard" version of the logistic function) as the activation function and assume that the input layer has only two variables, x1 and x2. Formally, such a perceptron can be expressed in the following way:

G_p(x1, x2) := sign(w1*x1 + w2*x2 + w3),

where w1, w2, w3 represent the parameters.

1. Given two binary variables x1 and x2, the AND operator is defined as:

AND(x1, x2) := 1 if x1 = x2 = 1, and 0 otherwise.

Is it possible to find parameters w1, w2, w3 such that G_p(x1, x2) = AND(x1, x2) for all x1, x2 in {0, 1}? If no, please explain why; otherwise identify the parameters and show that your answer is correct.

Yes, it is possible to find parameters w1, w2, w3 such that G_p(x1, x2) = AND(x1, x2) for all x1, x2 in {0, 1}. Set the parameters to w1 = 1, w2 = 1, and w3 = -1.5. With these values, w1*x1 + w2*x2 + w3 is positive only for x1 = x2 = 1 (1 + 1 - 1.5 = 0.5 > 0) and negative for the other three input combinations, so the perceptron outputs 1 exactly when both inputs are 1 and 0 otherwise. This is consistent with the behavior of the AND operator.

2. Given two binary variables x1 and x2, the OR operator is defined as:

OR(x1, x2) := 1 if x1 = 1 or x2 = 1, and 0 otherwise.
Is it possible to find parameters w1, w2, w3 such that G_p(x1, x2) = OR(x1, x2) for all x1, x2 in {0, 1}? If no, please explain why; otherwise identify the parameters and show that your answer is correct.

Yes, it is possible to find parameters w1, w2, w3 such that G_p(x1, x2) = OR(x1, x2) for all x1, x2 in {0, 1}. The parameters can be set as w1 = 1, w2 = 1, w3 = -0.5. With these values, w1*x1 + w2*x2 + w3 is negative only for x1 = x2 = 0 (0 + 0 - 0.5 = -0.5 < 0) and positive whenever at least one input is 1, so the perceptron outputs 1 whenever x1 = 1 or x2 = 1 and 0 otherwise. This matches the behavior of the OR operator.

3. Given two binary variables x1 and x2, the XOR operator is defined as:

XOR(x1, x2) := 1 if x1 ≠ x2, and 0 otherwise.

Is it possible to find parameters w1, w2, w3 such that G_p(x1, x2) = XOR(x1, x2) for all x1, x2 in {0, 1}? If no, please explain why; otherwise identify the parameters and show that your answer is correct.

No, it is not possible to find parameters w1, w2, w3 such that G_p(x1, x2) = XOR(x1, x2) for all x1, x2 in {0, 1}. XOR is not linearly separable: a single-layer perceptron can only draw one straight line w1*x1 + w2*x2 + w3 = 0 in the (x1, x2) plane, and no single line places (0, 1) and (1, 0) on one side while keeping (0, 0) and (1, 1) on the other. Representing XOR therefore requires a more complex network structure, such as a multi-layer perceptron with a hidden layer.
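A small Python check of the AND and OR parameters above (assuming the threshold-at-zero convention that the hard unit outputs 1 when the weighted sum is strictly positive and 0 otherwise):

```python
from itertools import product

def perceptron(w1, w2, w3, x1, x2):
    """Hard-threshold unit: 1 if w1*x1 + w2*x2 + w3 > 0, else 0."""
    return 1 if w1 * x1 + w2 * x2 + w3 > 0 else 0

# Weight settings proposed above, paired with the target Boolean operator.
gates = {
    "AND": ((1, 1, -1.5), lambda a, b: int(a == 1 and b == 1)),
    "OR":  ((1, 1, -0.5), lambda a, b: int(a == 1 or b == 1)),
}

for name, ((w1, w2, w3), truth) in gates.items():
    # Check all four binary input combinations against the truth table.
    ok = all(perceptron(w1, w2, w3, a, b) == truth(a, b)
             for a, b in product((0, 1), repeat=2))
    print(f"{name}: weights ({w1}, {w2}, {w3}) reproduce the gate: {ok}")

# No choice of (w1, w2, w3) passes the same check for XOR, since XOR is not
# linearly separable.
```

Both checks print True, confirming the chosen weights for AND and OR.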