IDS575_PS2_Q3
School: University of Illinois, Chicago
Course: 575
Subject: Mathematics
Date: Apr 3, 2024
Pages: 7
Q3 Linear Regression
48 Points
In class, we derived linear regression and various learning algorithms based on gradient descent. In addition to the least-squares objective, we also learned its probabilistic perspective, in which each observation is assumed to carry Gaussian noise (i.e., the noise of each example is an independent and identically distributed sample from a normal distribution). In this problem, you are supposed to deal with the following regression model, which includes one feature x₁ that relates both linearly and quadratically, and another feature x₂ that relates linearly:

y = θ₀ + θ₁x₁ + θ₂x₂ + θ₃x₁² + ϵ, where ϵ ∼ N(0, σ²)

Q3.1
5 Points
The above equation says that linear regression assumes y is also a random variable, due to the amount of uncertainty given by the noise term ϵ. What distribution would the random output variable y follow?
uniform distribution
poisson distribution
normal distribution
Bernoulli distribution
multinomial distribution
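To see why the noise model settles this question, here is a minimal simulation sketch of the regression model above; the parameter values (1.0, 2.0, −1.0, 0.5) and σ = 0.3 are made-up illustrative choices, not values given in the problem.

import numpy as np

# Made-up illustrative parameters; the problem statement leaves theta and sigma unspecified.
theta0, theta1, theta2, theta3 = 1.0, 2.0, -1.0, 0.5
sigma = 0.3
rng = np.random.default_rng(0)

def sample_y(x1, x2, n=100_000):
    """Draw n observations of y at a fixed input (x1, x2) under the Q3 model."""
    mean = theta0 + theta1 * x1 + theta2 * x2 + theta3 * x1 ** 2
    eps = rng.normal(0.0, sigma, size=n)  # epsilon ~ N(0, sigma^2), i.i.d.
    return mean + eps

ys = sample_y(x1=1.5, x2=-0.5)
# The only randomness is the additive Gaussian noise, so y given (x1, x2) is itself
# normally distributed around the deterministic part of the model.
print(ys.mean())  # approx 5.625 = 1.0 + 2.0*1.5 + (-1.0)*(-0.5) + 0.5*1.5**2
print(ys.std())   # approx 0.3 = sigma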
Q3.2
5 Points
Which of the following equations corresponds to the mean of the distribution that y follows (i.e., the expectation E[y | x₁, x₂])?
θ₀
θ₀ + θ₂x₂
θ₀ + θ₁x₁ + θ₃x₁²
θ₀ + θ₁x₁ + θ₂x₂ + θ₃x₁²
θ₀ + θ₁x₁ + θ₂x₂ + θ₃x₁
θ₀ + θ₁x₁ + θ₂x₂² + θ₃x₁
Q3.3
6 Points
You are provided with training observations D = {(x₁⁽ⁱ⁾, x₂⁽ⁱ⁾, y⁽ⁱ⁾) | 1 ≤ i ≤ m}. Derive the conditional log-likelihood that will later be maximized to make D most likely.
Q3.4
6 Points
If you omit all the constants that do not relate to our parameters θ, what will be the objective function J(θ₀, θ₁, θ₂, θ₃) on which you are going to perform Maximum Likelihood Estimation?
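For orientation, here is a sketch of the standard Gaussian-noise derivation that these two parts build toward, writing h(x) = θ₀ + θ₁x₁ + θ₂x₂ + θ₃x₁² for the model's mean; the ½ scaling of J below is a customary convention rather than something fixed by the problem. Because the examples are i.i.d., the conditional log-likelihood factorizes:

ℓ(θ) = Σᵢ log P(y⁽ⁱ⁾ | x⁽ⁱ⁾; θ) = −m·log(√(2π)·σ) − (1/(2σ²)) · Σᵢ (y⁽ⁱ⁾ − h(x⁽ⁱ⁾))²   (sums over i = 1, …, m)

Dropping the terms that do not involve θ, maximizing ℓ(θ) is equivalent to minimizing the least-squares objective

J(θ₀, θ₁, θ₂, θ₃) = ½ · Σᵢ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)²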
Q3.5
10 Points
What is the probability of y⁽ⁱ⁾ given a certain point x⁽ⁱ⁾? (Hint: The answer must use the probability mass/density function of the distribution that you chose for y | x₁, x₂; θ in Q3.1.) (Free Response)

P(y⁽ⁱ⁾ | x⁽ⁱ⁾; θ) =

Question 3.5.pdf
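For reference, such an answer builds on the density template of a normal distribution with mean μ and variance σ²; substituting the model's prediction for μ is the step the question asks for:

p(v; μ, σ²) = (1/(√(2π)·σ)) · exp(−(v − μ)² / (2σ²))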
Q3.6
10 Points
Compute the gradient of J(θ) by evaluating the partial derivatives with respect to each parameter θⱼ (0 ≤ j ≤ 3). (Hint: You should evaluate the partial derivatives based on your Q3.4 answer.) Please upload a picture or pdf file showing all the detailed steps.

Question 3.6.pdf
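Assuming the J(θ) = ½ · Σᵢ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)² form sketched under Q3.4, the chain rule gives one partial derivative per parameter, each weighting the residual by the feature that multiplies that parameter:

∂J/∂θⱼ = Σᵢ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾) · ∂h/∂θⱼ,  where ∂h/∂θ₀ = 1, ∂h/∂θ₁ = x₁⁽ⁱ⁾, ∂h/∂θ₂ = x₂⁽ⁱ⁾, ∂h/∂θ₃ = (x₁⁽ⁱ⁾)²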
Q3.7
6 Points
Now you are ready to train your linear regression model by Batch Gradient Descent (BGD) and Stochastic Gradient Descent (SGD). When you consume all m training examples in D once, how many times does each parameter get updated?
BGD 1 time / SGD 1 time
BGD m times / SGD 1 time
BGD 1 time / SGD m times
BGD m times / SGD m times
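The contrast these options probe is easy to make concrete in code. Below is a minimal Python sketch under stated assumptions: the dataset, the learning rate lr, and the helper names features, epoch_bgd, and epoch_sgd are all made up for illustration, and the per-example gradient follows the form sketched under Q3.6.

import numpy as np

def features(x1, x2):
    """Feature map for the Q3 model: [1, x1, x2, x1^2]."""
    return np.array([1.0, x1, x2, x1 ** 2])

def epoch_bgd(theta, X, y, lr=0.01):
    """Batch gradient descent: one pass over all m examples = 1 update per parameter."""
    residuals = X @ theta - y        # shape (m,)
    grad = X.T @ residuals           # gradient of 1/2 * sum of squared residuals
    return theta - lr * grad         # a single simultaneous update

def epoch_sgd(theta, X, y, lr=0.01):
    """Stochastic gradient descent: one update per example = m updates per parameter."""
    for i in range(len(y)):
        grad_i = (X[i] @ theta - y[i]) * X[i]
        theta = theta - lr * grad_i
    return theta

# Tiny made-up dataset: m = 4 examples with inputs (x1, x2).
X = np.stack([features(0.0, 1.0), features(1.0, -1.0),
              features(2.0, 0.5), features(-1.0, 2.0)])
y = np.array([1.2, 3.1, 6.0, 0.4])   # made-up targets
theta = np.zeros(4)
theta = epoch_bgd(theta, X, y)       # each parameter updated once
theta = epoch_sgd(theta, X, y)       # each parameter updated m = 4 times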
GRADED
Problem Set (PS) #02
STUDENT: Urvashiben Patel
TOTAL POINTS: 100 / 100 pts

QUESTION 1: Linear Regression Basic (22 / 22 pts)
1.1 (no title): 5 / 5 pts
1.2 (no title): 5 / 5 pts
1.3 (no title): 7 / 7 pts
1.4 (no title): 5 / 5 pts

QUESTION 2: Multivariate Calculus Basic (30 / 30 pts)
2.1 (no title): 10 / 10 pts
2.2 (no title): 5 / 5 pts
2.3 (no title): 10 / 10 pts
2.4 (no title): 5 / 5 pts

QUESTION 3: Linear Regression (48 / 48 pts)
3.1 (no title): 5 / 5 pts
3.2 (no title): 5 / 5 pts
3.3 (no title): 6 / 6 pts
3.4 (no title): 6 / 6 pts
3.5 (no title): 10 / 10 pts
3.6 (no title): 10 / 10 pts
3.7 (no title): 6 / 6 pts