(b). Derive the maximum likelihood estimate of the parameter a in terms of the training examples X_i's and Y_i's. We recommend you start with the simplest form of the problem you found above.

Question 3. Regression
need answer of part b
Consider real-valued variables X and Y. The Y variable is generated, conditional on X, from the following process:

    ε ~ N(0, σ²)
    Y = aX + ε

where every ε is an independent variable, called a noise term, which is drawn from a Gaussian distribution with mean 0 and standard deviation σ. This is a one-feature linear regression model, where a is the only weight parameter. The conditional probability of Y has distribution p(Y | X, a) ~ N(aX, σ²), so it can be written as

    p(Y | X, a) = (1 / (σ√(2π))) · exp(−(1/(2σ²)) (Y − aX)²)
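As a quick sanity check on this setup, the generative process can be simulated directly; below is a minimal sketch. The true weight `a_true = 2.0`, noise level `sigma = 0.5`, and sample size are illustrative assumptions, not values given in the problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) values: true weight and noise standard deviation.
a_true, sigma, n = 2.0, 0.5, 1000

X = rng.uniform(-1.0, 1.0, n)
eps = rng.normal(0.0, sigma, n)   # noise term: eps ~ N(0, sigma^2)
Y = a_true * X + eps              # generative process: Y = aX + eps

def density(y, x, a, sigma):
    """Conditional density p(Y | X, a) = N(aX, sigma^2), evaluated at y."""
    return np.exp(-(y - a * x) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
```

The log of the product of these densities over the n training pairs is the log-likelihood that part (a) asks about.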
The following questions are all about this model.
MLE estimation
(a)
Assume we have a training dataset of n pairs (X_i, Y_i) for i = 1..n, and σ is known.
Which ones of the following equations correctly represent the maximum likelihood problem for
estimating a? Say yes or no to each one. More than one of them should have the answer "yes."
    [Solution: no]  arg max_a  Σ_i (1 / (σ√(2π))) · exp(−(1/(2σ²)) (Y_i − aX_i)²)

    [Solution: yes] arg max_a  Π_i (1 / (σ√(2π))) · exp(−(1/(2σ²)) (Y_i − aX_i)²)

    [Solution: no]  arg max_a  Σ_i exp(−(1/(2σ²)) (Y_i − aX_i)²)

    [Solution: yes] arg max_a  Π_i exp(−(1/(2σ²)) (Y_i − aX_i)²)

    [Solution: no]  arg max_a  Σ_i (Y_i − aX_i)²

    [Solution: yes] arg min_a  Σ_i (Y_i − aX_i)²

(b). Derive the maximum likelihood estimate of the parameter a in terms of the training examples X_i's and Y_i's. We recommend you start with the simplest form of the problem you found above.
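A sketch of the standard least-squares derivation for part (b): taking the log of the product-of-Gaussians likelihood and dropping terms that do not depend on a reduces the problem to minimizing the sum of squared residuals, after which setting the derivative to zero gives the estimate.

```latex
\hat{a} = \arg\min_a \sum_{i=1}^{n} (Y_i - aX_i)^2
% Differentiate with respect to a and set the derivative to zero:
\frac{d}{da}\sum_{i=1}^{n}(Y_i - aX_i)^2
  = -2\sum_{i=1}^{n} X_i (Y_i - aX_i) = 0
% Solve for a:
\sum_{i=1}^{n} X_i Y_i = a \sum_{i=1}^{n} X_i^2
\quad\Rightarrow\quad
\hat{a} = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2}
```

The second-order condition holds since the objective is a convex quadratic in a (its second derivative, 2Σ_i X_i², is nonnegative), so this stationary point is indeed the minimizer.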