PS#8

School: California Lutheran University
Course: IDS575
Subject: Computer Science
Date: Apr 3, 2024

Q1 Gaussian Discriminant Analysis (60 Points)

Q1.1 (5 Points) What is the goal of linear discriminant analysis?
- to minimize the variance between the classes
- to maximize the within-class variance
- to maximize the variance between the classes
- None of the above

Q1.2 (5 Points) In GDA, the μ_y are defined as
- the median of the independent variables for a group
- the average of the independent variables for a group
- the center values of the independent variables for all groups
- the averages of the independent variables for all groups

Q1.3 (5 Points) Which of the following is not true when using GDA?
- The prediction of GDA is the same as logistic regression
- Discriminant analysis is one of the generative models
- Two Gaussian distributions must share the same covariance
- If p(x|y) is Gaussian, GDA is better than logistic regression given the same data
Q1.4 (5 Points) There are two classes following Gaussian distributions, centered at (−1, 2) and (1, 4). They have identical covariance matrices. Which is the separating decision boundary?
- y − x = 3
- x + y = 3
- x + y = 6
- x + y = 3 and x + y = 6 are both possible
- None of the above

Q1.5 (5 Points) We have four data points with two classes in different colors as shown below. Which of the following models (without additional complexity) can achieve zero training error for classification?
- Logistic regression
- PCA
- LDA
- None of the above
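As a worked check for Q1.4: with a shared covariance the GDA boundary is linear, and under the additional assumption of an isotropic covariance and equal priors (the problem does not pin these down), it is the perpendicular bisector of the segment joining the means:

```latex
% Means and their midpoint:
\[
\mu_0 = (-1,\,2), \qquad \mu_1 = (1,\,4), \qquad
\frac{\mu_0 + \mu_1}{2} = (0,\,3).
\]
% With \Sigma = \sigma^2 I and equal priors, the boundary is the set of
% points equidistant from the two means:
\[
(\mu_1 - \mu_0)^\top \big( (x,\,y) - (0,\,3) \big) = 0
\;\Longrightarrow\; 2x + 2(y - 3) = 0
\;\Longrightarrow\; x + y = 3.
\]
```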
Q1.6 (5 Points) A teacher wants to evaluate high school students based on their Math and English grades. Students are classified as either successful (group 1) or not successful (group 2) in their college applications. The teacher has data on 20 students as follows [table not shown in this excerpt]:
Assume the data follow a Gaussian distribution and the two groups share the same shape (covariance). Please use MLE to estimate the parameters:

ϕ = 0.5

Q1.7 (10 Points)
μ_group1 = [683.8, 654.2]

Q1.8 (10 Points)
μ_group2 = [610.7, 605.7]

Q1.9 (10 Points)
Σ = [ 1094.35   −296.15 ]
    [ −296.15   2867.55 ]
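A minimal sketch of the MLE computations behind Q1.6 through Q1.9, assuming NumPy and placeholder grades, since the 20-student table is not part of this excerpt:

```python
import numpy as np

# Placeholder stand-in for the 20-student table (not shown in the excerpt):
# each row of X is [math_grade, english_grade]; y = 1 marks group 1.
X = np.array([[690.0, 660.0], [670.0, 650.0], [600.0, 610.0], [620.0, 600.0]])
y = np.array([1, 1, 0, 0])

phi = y.mean()                      # MLE of P(y = 1); 0.5 in the problem
mu1 = X[y == 1].mean(axis=0)        # group-1 mean (Q1.7)
mu2 = X[y == 0].mean(axis=0)        # group-2 mean (Q1.8)

# Shared ("same shape") covariance: pool residuals around each group's mean.
resid = X - np.where((y == 1)[:, None], mu1, mu2)
Sigma = resid.T @ resid / len(y)    # pooled MLE covariance (Q1.9)
print(phi, mu1, mu2, Sigma, sep="\n")
```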
Q2 Naive Bayes Model (40 Points)

Q2.1 (5 Points) What is the number of parameters needed to represent a Bernoulli Naive Bayes classifier with n Boolean variables and a Boolean label?
- 2n + 1
- n + 1
- 2n
- n

Q2.2 (10 Points) Our data has three Boolean input variables a, b, c and a single Boolean output K, as shown below [table not shown in this excerpt]. According to the Naive Bayes classifier, what is P(a = 1 ∧ b = 0 ∧ c = 1 | K = 1)?

0.1875

Q2.3 (10 Points) Assume three word types a, b, c. Consider a Naive Bayes model with the following conditional probability table [not shown in this excerpt]. Given a new sample x = (1, 0, 1), compute the probability of predicting x as positive. (Please write detailed steps rather than just the result.)

Q2.3.pdf (attachment)
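Both Q2.2 and Q2.3 rest on the same conditional-independence factorization. For Q2.2 it reads as follows; each factor is an empirical frequency taken from the omitted data table, and with that table's counts the product evaluates to the 0.1875 above:

```latex
\[
P(a{=}1 \wedge b{=}0 \wedge c{=}1 \mid K{=}1)
  = P(a{=}1 \mid K{=}1)\; P(b{=}0 \mid K{=}1)\; P(c{=}1 \mid K{=}1).
\]
```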
Q2.4 (5 Points) Which class should it be predicted as?
- positive
- negative
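A minimal sketch of the Q2.3/Q2.4 posterior computation; the conditional probabilities below are placeholders, since the real table lives in the omitted attachment:

```python
# Placeholder model: prior and per-word conditionals P(word = 1 | class).
p_pos = 0.5                                 # prior P(y = positive), assumed
cond = {
    "positive": {"a": 0.6, "b": 0.3, "c": 0.7},   # placeholders
    "negative": {"a": 0.2, "b": 0.5, "c": 0.4},   # placeholders
}
x = {"a": 1, "b": 0, "c": 1}                # the new sample x = (1, 0, 1)

def joint(label, prior):
    """Unnormalized P(x, y) under the Naive Bayes factorization."""
    p = prior
    for word, present in x.items():
        q = cond[label][word]
        p *= q if present else 1.0 - q
    return p

num = joint("positive", p_pos)
den = num + joint("negative", 1.0 - p_pos)
print("P(positive | x) =", num / den)       # Q2.4: predict positive iff > 0.5
```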
Q2.5 (5 Points) Consider a 5-letter string (containing only lowercase letters). If we want to classify whether it is a valid English word using a multinomial Naive Bayes model, this task is equivalent to preparing a die with N faces and tossing it L times. (Assume the whole English vocabulary has 5000 words.)
- N = 5000, L = 5
- N = 26, L = 5000
- N = 26, L = 5
- N = 5, L = 26

Q2.6 (5 Points) When applying Laplace smoothing to a multinomial Naive Bayes model, if we add 0.5 to the numerator instead of 1, what should be added in the denominator?

In this case, we should add 0.5 times the number of classes (0.5 · |V|) to the denominator.
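For Q2.6, the general additive-smoothing estimate makes the rule mechanical: whatever pseudo-count α goes in the numerator, α times the number of possible outcomes goes in the denominator. Here α = 0.5:

```latex
\[
\hat{P}(w_j \mid y)
  = \frac{\operatorname{count}(w_j, y) + \alpha}
         {\sum_{k=1}^{|V|} \operatorname{count}(w_k, y) + \alpha\,|V|},
\qquad \alpha = 0.5 .
\]
```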
Q3 Kernel Methods (from PS6) (35 Points)

Q3.1 (4 Points) If we consider all M-order interactions as features, given an original X with N attributes (x_1, x_2, ..., x_n), what is the computational complexity, in general, of naively computing the feature maps?
- O(N^2)
- O(M^2)
- O(N^M)
- O(M^N)

Q3.2 (7 Points) Given that K_1 and K_2 are valid kernels, select all valid kernels from:
- K(x, z) := K_1(x, z) + K_2(x, z)
- K(x, z) := K_1(x, z) − K_2(x, z)
- K(x, z) := a·K_1(x, z), with a > 0
- K(x, z) := b·K_1(x, z), with b < 0
- K(x, z) := K_1(x, z)·K_2(x, z) (note that the matrix K is the Hadamard product, i.e. the element-by-element product, of K_1 and K_2)
- K(x, z) := p(K_1(x, z)), where p(x) is a polynomial function with positive coefficients
- K(x, z) := f(x)·f(z), where f: R^n → R is a real-valued function
- K(x, z) := K_3(ϕ(x), ϕ(z)), where K_3 is another valid kernel over R^d × R^d
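For Q3.2, a quick empirical filter (not a proof) is to check that a candidate's Gram matrix stays positive semidefinite on a random sample; a sketch assuming NumPy and an RBF base kernel for K_1:

```python
import numpy as np

# A valid kernel must yield a PSD Gram matrix on any finite sample, so a
# negative eigenvalue exposes an invalid candidate (e.g. b*K1 with b < 0).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))

def rbf(x, z):
    return np.exp(-0.5 * np.sum((x - z) ** 2))

K1 = np.array([[rbf(x, z) for z in X] for x in X])

for name, K in [("K1 + K1", K1 + K1),
                ("2 * K1 (a > 0)", 2.0 * K1),
                ("-1 * K1 (b < 0)", -1.0 * K1),    # should fail the check
                ("K1 * K1 (Hadamard)", K1 * K1)]:
    min_eig = np.linalg.eigvalsh(K).min()
    print(f"{name}: min eigenvalue = {min_eig:.3e}")
```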
Q3.3 (10 Points) Given the unit-variance radial basis kernel K(x, z) = exp{−||x − z||^2 / 2}, prove that the feature mappings of x and z are distanced by at most √2. (Hint: think about ||ϕ(x) − ϕ(z)||^2.)

Q3.3.pdf (attachment)
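Following the hint, the bound drops out of expanding the squared feature-space distance and using K(x, x) = 1:

```latex
\[
\|\phi(x) - \phi(z)\|^2
  = \langle \phi(x), \phi(x) \rangle - 2\langle \phi(x), \phi(z) \rangle
    + \langle \phi(z), \phi(z) \rangle
  = K(x, x) - 2K(x, z) + K(z, z)
\]
\[
  = 2 - 2\exp\!\big\{-\tfrac{1}{2}\|x - z\|^2\big\} \;\le\; 2
  \quad\Longrightarrow\quad \|\phi(x) - \phi(z)\| \le \sqrt{2}.
\]
```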
Q3.4 (7 Points) Choose all the factors that affect the ability of an SVM to reduce errors and overfitting:
- selection of kernel
- kernel parameters
- soft-margin parameter C
- how to assign {−1, 1} to different labels

Q3.5 (7 Points) Select all true statements about kernels in SVMs:
- Kernel functions map low-dimensional data to a high-dimensional space
- A valid kernel has only one feature mapping
- They transform the problem from nonlinear to linear
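An illustration of the Q3.4 factors using scikit-learn's SVC (assumed available): kernel choice, kernel parameters, and C are the knobs that shape the fit, while flipping which class gets −1 or +1 leaves the learned boundary unchanged. The toy data here are hypothetical:

```python
import numpy as np
from sklearn.svm import SVC  # assumes scikit-learn is installed

# XOR-style toy data: not linearly separable, so the kernel does the work.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([1, 1, -1, -1])

clf = SVC(
    kernel="rbf",   # selection of kernel
    gamma=2.0,      # kernel parameter
    C=10.0,         # soft-margin parameter C
)
clf.fit(X, y)
print(clf.predict(X))  # with these settings the RBF kernel can fit the XOR labels
```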
GRADED: Problem Set (PS) #08
STUDENT: Urvashiben Patel
TOTAL POINTS: 128 / 135 pts

QUESTION 1, Gaussian Discriminant Analysis: 60 / 60 pts
  1.1: 5/5 · 1.2: 5/5 · 1.3: 5/5 · 1.4: 5/5 · 1.5: 5/5 · 1.6: 5/5 · 1.7: 10/10 · 1.8: 10/10 · 1.9: 10/10

QUESTION 2, Naive Bayes Model: 40 / 40 pts
  2.1: 5/5 · 2.2: 10/10 · 2.3: 10/10 · 2.4: 5/5 · 2.5: 5/5 · 2.6: 5/5

QUESTION 3, Kernel Methods (from PS6): 28 / 35 pts
  3.1: 4/4 · 3.2: 0/7 · 3.3: 10/10 · 3.4: 7/7 · 3.5: 7/7