CS 189/289A
Introduction to Machine Learning
Fall 2023
Jennifer Listgarten, Jitendra Malik
DIS10
1 Probabilistic Graphical Models
Recall that we can represent joint probability distributions with directed acyclic graphs (DAGs). Let $G$ be a DAG with vertices $X_1, \ldots, X_k$. If $P$ is a (joint) distribution for $X_1, \ldots, X_k$ with (joint) probability mass function $p$, we say that $G$ represents $P$ if
$$p(x_1, \cdots, x_k) = \prod_{i=1}^{k} P(X_i = x_i \mid \mathrm{pa}(X_i)), \tag{1}$$
where $\mathrm{pa}(X_i)$ denotes the parent nodes of $X_i$. (Recall that in a DAG, node $Z$ is a parent of node $X_i$ iff there is a directed edge going out of $Z$ into $X_i$.)
Consider the following DAG:

[Figure 1: $G$, a DAG with nodes $Z$, $Y$, $X$, and $S$; per the factorization in part (a), its edges are $X \to Y$, $Z \to Y$, $X \to S$, and $Y \to S$.]
(a) Write down the joint factorization of $P_{S,X,Y,Z}(s, x, y, z)$ implied by the DAG $G$ shown in Figure 1.
Solution:
$$P_{S,X,Y,Z}(s, x, y, z) = P(X = x)\, P(Z = z)\, P(Y = y \mid X = x, Z = z)\, P(S = s \mid X = x, Y = y).$$
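For concreteness, here is a tiny Python sketch (our own illustration, with made-up CPT values, not part of the original solution) showing how a joint pmf factored as in Equation (1) can be evaluated for this DAG:

```python
# Evaluating the factorization from part (a) for binary variables.
# The CPT numbers below are made up purely for illustration.
from itertools import product

pX = {0: 0.4, 1: 0.6}                          # P(X = x)
pZ = {0: 0.5, 1: 0.5}                          # P(Z = z)
pY = {(x, z): 0.9 if x == z else 0.2           # P(Y = 1 | X = x, Z = z)
      for x, z in product([0, 1], repeat=2)}
pS = {(x, y): 0.7 if x == y else 0.1           # P(S = 1 | X = x, Y = y)
      for x, y in product([0, 1], repeat=2)}

def joint(s, x, y, z):
    """p(s, x, y, z) = P(X=x) P(Z=z) P(Y=y | x, z) P(S=s | x, y)."""
    py = pY[(x, z)] if y == 1 else 1 - pY[(x, z)]
    ps = pS[(x, y)] if s == 1 else 1 - pS[(x, y)]
    return pX[x] * pZ[z] * py * ps

# A valid factorization must sum to 1 over all assignments.
print(sum(joint(*v) for v in product([0, 1], repeat=4)))  # -> 1.0
```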
(b) Is $S \perp Z \mid Y$?
Solution: No. As a counterexample, consider the case where all nodes represent binary random variables, $P(X = 1) = P(Z = 1) = 0.5$, $Y = X \oplus Z$, and $S = X \oplus Y$, where $\oplus$ is the XOR operator. Then we can see that $S = Z$, whereas knowing $Y$ does not fully determine $S$ or $Z$.
A version of these solutions from a previous semester erroneously said that this conditional independence did hold. As a result, you may have wrongly heard in section that this statement is true, via faulty algebraic manipulation and/or other algorithms such as the Bayes ball (d-separation). Running these algorithms correctly should show that $S$ and $Z$ are indeed not conditionally independent given $Y$.
If $X$ is fully removed from $G$, then we do indeed have $S \perp Z \mid Y$. This is left as an exercise in algebraic manipulation of probability distributions.
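As a sanity check on the counterexample above, here is a minimal Python sketch (our own, not part of the original solution) that enumerates the joint distribution and compares $P(S, Z \mid Y)$ against $P(S \mid Y)\,P(Z \mid Y)$:

```python
# Sanity check for part (b): enumerate the XOR counterexample and test
# whether P(S, Z | Y) equals P(S | Y) * P(Z | Y).
from itertools import product
from collections import defaultdict

joint = defaultdict(float)
for x, z in product([0, 1], repeat=2):   # X, Z: independent Bernoulli(0.5)
    y = x ^ z                            # Y = X XOR Z
    s = x ^ y                            # S = X XOR Y (note: equals Z)
    joint[(s, x, y, z)] += 0.25

# Marginals needed for the conditional-independence check.
p_y, p_sy, p_zy, p_szy = (defaultdict(float) for _ in range(4))
for (s, x, y, z), p in joint.items():
    p_y[y] += p
    p_sy[(s, y)] += p
    p_zy[(z, y)] += p
    p_szy[(s, z, y)] += p

for s, z, y in product([0, 1], repeat=3):
    lhs = p_szy[(s, z, y)] / p_y[y]                       # P(S=s, Z=z | Y=y)
    rhs = p_sy[(s, y)] / p_y[y] * p_zy[(z, y)] / p_y[y]   # P(S=s|Y=y) P(Z=z|Y=y)
    print(f"s={s} z={z} y={y}: P(S,Z|Y)={lhs:.2f} vs P(S|Y)P(Z|Y)={rhs:.2f}")
# e.g. s=0, z=0, y=0 gives 0.50 vs 0.25, witnessing that S and Z are
# not independent given Y.
```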
(c) Is $S \perp X \mid Y$?

Solution: No. Consider the same example from above with binary random variables. Knowing $Y$ does not determine $S$, but knowing both $X$ and $Y$ does.
2 PGMs: Sleeping in Class
In this question, you'll be reasoning about a Dynamic Bayesian Network (DBN), a form of Probabilistic Graphical Model.

Your favorite discussion section TA wants to know if their students are getting enough sleep. Each day, the TA observes the students in their section, noting if they fall asleep in class or have red eyes. The TA makes the following conclusions:
1. The prior probability of getting enough sleep, $S$, with no observations, is 0.7.
2. The probability of getting enough sleep on night $t$ is 0.8 given that the student got enough sleep the previous night, and 0.3 if not.
3. The probability of having red eyes, $R$, is 0.2 if the student got enough sleep, and 0.7 if not.
4. The probability of sleeping in class, $C$, is 0.1 if the student got enough sleep, and 0.3 if not.
(a) Formulate this information as a dynamic Bayesian network that the TA could use to filter or predict from a sequence of observations. If you were to reformulate this network as a hidden Markov model instead (that has only a single observation variable), how would you do so? Give a high-level description (probability tables for the HMM formulation are not necessary).
Solution: Our Bayesian network has three variables: $S_t$, whether the student gets enough sleep; $R_t$, whether they have red eyes in class; and $C_t$, whether the student sleeps in class. $S_t$ is a parent of $S_{t+1}$, $R_t$, and $C_t$. The network can be provided pictorially, or fully through conditional probability tables (CPTs). The CPTs for this problem are given by:
$$P(s_0) = 0.7$$
$$P(s_{t+1} \mid s_t) = 0.8 \qquad P(s_{t+1} \mid \lnot s_t) = 0.3$$
$$P(r_t \mid s_t) = 0.2 \qquad P(r_t \mid \lnot s_t) = 0.7$$
$$P(c_t \mid s_t) = 0.1 \qquad P(c_t \mid \lnot s_t) = 0.3$$
To reformulate this problem as an HMM with a single observation node, we can combine the 2-valued variables $r_t$ and $c_t$ into a single 4-valued variable, multiplying together the emission probabilities.
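To make the CPTs and the combined observation concrete, here is a minimal NumPy sketch (our own illustration, not part of the original solution) that builds the transition matrix and the 4-valued emission matrix by multiplying the $R$ and $C$ emission probabilities:

```python
# A sketch of the DBN's parameters as arrays, assuming index 0 means
# "enough sleep" and index 1 means "not enough sleep".
import numpy as np

prior = np.array([0.7, 0.3])          # P(s_0)
T = np.array([[0.8, 0.2],             # P(S_{t+1} | S_t = enough sleep)
              [0.3, 0.7]])            # P(S_{t+1} | S_t = not enough sleep)
p_red   = np.array([0.2, 0.7])        # P(R_t = red eyes | S_t)
p_class = np.array([0.1, 0.3])        # P(C_t = sleeps in class | S_t)

# HMM reformulation: one 4-valued observation (r, c); emissions multiply.
# Rows: hidden state; columns: (not r, not c), (not r, c), (r, not c), (r, c).
E = np.stack([(1 - p_red) * (1 - p_class),
              (1 - p_red) * p_class,
              p_red * (1 - p_class),
              p_red * p_class], axis=1)
print(E)   # each row sums to 1, as a valid emission distribution must
```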
(b) Consider the following evidence values at timesteps 1, 2, and 3:

(a) $e_1$ = not red eyes, not sleeping in class
(b) $e_2$ = red eyes, not sleeping in class
(c) $e_3$ = red eyes, sleeping in class
Compute state estimates for timesteps $t$ at 1, 2, and 3; that is, calculate $P(S_t \mid e_{1:t})$. Assume a prior on $P(s_0)$ that is consistent with the prior in the previous part; that is, $P(s_0) = 0.7$.
Solution: We can apply the filtering (forward computation) method. We walk through the computation step by step:
$$P(S_0) = \langle 0.7, 0.3 \rangle$$
$$P(S_1) = \sum_{s_0} P(S_1 \mid s_0) P(s_0) = \langle 0.8, 0.2 \rangle \cdot 0.7 + \langle 0.3, 0.7 \rangle \cdot 0.3 = \langle 0.65, 0.35 \rangle$$
$$P(S_1 \mid e_1) = \alpha P(e_1 \mid S_1) P(S_1) = \alpha \langle 0.8 \cdot 0.9,\; 0.3 \cdot 0.7 \rangle \langle 0.65, 0.35 \rangle$$
After normalizing, we get the following (the rest of the solution(s) will normalize implicitly):
$$= \langle 0.8643, 0.1357 \rangle$$
$$P(S_2 \mid e_1) = \sum_{s_1} P(S_2 \mid s_1) P(s_1 \mid e_1) = \langle 0.7321, 0.2679 \rangle$$
$$P(S_2 \mid e_{1:2}) = \alpha P(e_2 \mid S_2) P(S_2 \mid e_1) = \langle 0.5010, 0.4990 \rangle$$
$$P(S_3 \mid e_{1:2}) = \sum_{s_2} P(S_3 \mid s_2) P(s_2 \mid e_{1:2}) = \langle 0.5505, 0.4495 \rangle$$
$$P(S_3 \mid e_{1:3}) = \alpha P(e_3 \mid S_3) P(S_3 \mid e_{1:2}) = \langle 0.1045, 0.8955 \rangle$$
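The forward recursion above is easy to verify numerically. Here is a minimal Python sketch (our own, not the official course code) that reproduces the filtered estimates:

```python
# Forward (filtering) pass for part (b) -- a sketch, not the official code.
# State index 0 = enough sleep, 1 = not. The evidence likelihood vectors
# P(e_t | S_t) are products of the R and C emission probabilities.
import numpy as np

prior = np.array([0.7, 0.3])
T = np.array([[0.8, 0.2],
              [0.3, 0.7]])             # T[i, j] = P(S_{t+1} = j | S_t = i)
# e1 = (not red, not class), e2 = (red, not class), e3 = (red, class)
likelihoods = [np.array([0.8 * 0.9, 0.3 * 0.7]),
               np.array([0.2 * 0.9, 0.7 * 0.7]),
               np.array([0.2 * 0.1, 0.7 * 0.3])]

belief = prior
for t, lik in enumerate(likelihoods, start=1):
    belief = T.T @ belief              # predict: sum_s P(S_t | s) P(s | e_{1:t-1})
    belief = lik * belief              # update with evidence e_t
    belief /= belief.sum()             # normalize (the alpha factor)
    print(f"P(S_{t} | e_1:{t}) = {belief.round(4)}")
# Expected: [0.8643 0.1357], [0.501 0.499], [0.1045 0.8955]
```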
(c) Compute smoothing estimates $P(S_t \mid e_{1:3})$ for each timestep, using the same evidence as the previous part.
Solution: First, we do the backwards computations:
$$P(e_3 \mid S_3) = \langle 0.2 \cdot 0.1,\; 0.7 \cdot 0.3 \rangle = \langle 0.02, 0.21 \rangle$$
$$P(e_3 \mid S_2) = \sum_{s_3} P(e_3 \mid s_3) P(s_3 \mid S_2) = \langle 0.02 \cdot 0.8 + 0.21 \cdot 0.2,\; 0.02 \cdot 0.3 + 0.21 \cdot 0.7 \rangle = \langle 0.058, 0.153 \rangle$$
$$P(e_{2:3} \mid S_1) = \sum_{s_2} P(e_2 \mid s_2) P(e_3 \mid s_2) P(s_2 \mid S_1) = \langle 0.0233, 0.0556 \rangle$$
Now, we can combine them with the forwards computation and normalize.
$$P(S_1 \mid e_{1:3}) = \alpha P(S_1 \mid e_1) P(e_{2:3} \mid S_1) = \langle 0.7277, 0.2723 \rangle$$
$$P(S_2 \mid e_{1:3}) = \alpha P(S_2 \mid e_{1:2}) P(e_3 \mid S_2) = \langle 0.2757, 0.7243 \rangle$$
$$P(S_3 \mid e_{1:3}) = \langle 0.1045, 0.8955 \rangle$$
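As with filtering, the backward pass can be checked numerically. Here is a minimal forward-backward sketch (again our own illustration) that reproduces the smoothed estimates:

```python
# Forward-backward smoothing for part (c) -- a sketch, not the official code.
import numpy as np

prior = np.array([0.7, 0.3])
T = np.array([[0.8, 0.2],
              [0.3, 0.7]])             # T[i, j] = P(S_{t+1} = j | S_t = i)
likelihoods = [np.array([0.8 * 0.9, 0.3 * 0.7]),   # e1
               np.array([0.2 * 0.9, 0.7 * 0.7]),   # e2
               np.array([0.2 * 0.1, 0.7 * 0.3])]   # e3

# Forward pass: filtered estimates P(S_t | e_{1:t}), as in part (b).
fwd = []
belief = prior
for lik in likelihoods:
    belief = lik * (T.T @ belief)
    belief /= belief.sum()
    fwd.append(belief)

# Backward pass: messages b_t = P(e_{t+1:3} | S_t), starting from ones.
back = np.ones(2)
for t in reversed(range(3)):
    smoothed = fwd[t] * back
    smoothed /= smoothed.sum()
    print(f"P(S_{t+1} | e_1:3) = {smoothed.round(4)}")
    back = T @ (likelihoods[t] * back)   # fold in e_t for the next step down
# Expected (printed from t=3 down to t=1):
# [0.1045 0.8955], [0.2757 0.7243], [0.7277 0.2723]
```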
(d) Compare, in plain English, the filtered estimates you computed for timesteps 1 and 2 with the smoothed estimates. How do the two analyses differ?

Solution: The smoothed analysis places the time the student started sleeping poorly one timestep earlier than the filtering-only computation does, because smoothing incorporates future observations, which indicated a lack of sleep at the last step.