University of Toronto
Department of Electrical and Computer Engineering
ECE 1508S2: Applied Deep Learning
A. Bereyhi - Winter 2024
Assignment 2
Feedforward Neural Networks
DATE: Feb 2, 2024
DUE: Feb 16, 2024
PREFACE
This is the second series of assignments for the course Special Topics in Communications: Applied Deep Learning. The exercises are aimed at reviewing the topics of Chapter 2, i.e., Feedforward Neural Networks. Below you can find information regarding the contents of these exercises, as well as instructions on how to submit them.
GENERAL INFORMATION
The assignments are given in two sections. The first section contains written questions that you can answer in words or by derivation. The questions are consistent with the material of Chapter 2, and you do not need any further resources to answer them. The second section includes the programming assignments. For these assignments, you need to use the package torch in Python. For beginners, an introduction is given and some useful online resources are cited.

If a question is unclear or contains any flaw, please reach out over Piazza. Also, if any particular assumption is required to solve a problem, feel free to make that assumption and state it in your solution.
The total mark of the assignments is 100 points, with the written questions having the following mark distribution:

• Question 1: 10 points
• Question 2: 5 points
• Question 3: 10 points
• Question 4: 5 points

The mark distribution of the programming assignments is further as follows: 50 points for the first assignment and 20 points for the second assignment. Therefore, the total mark of the written questions adds up to 30 points and the total mark of the programming assignments adds up to 70 points.
HOW TO SUBMIT
Please submit your answers to the written exercises as a PDF or image file. They do not need to be machine-typed; you can submit a photo of your handwritten solutions. For the programming tasks, it is strongly suggested to use the Python notebook Assgn_2.ipynb that is available on Quercus, and you can use it for your submission. Note that most of the code for the programming assignments is already given in the Python notebook, and you are only asked to complete the code in the indicated lines. Nevertheless, it is not mandatory to use this file, and you can use any other file format for your submission. Regardless of what format or template you choose, your submission for the programming assignments should be contained in a single file.[1]

[1] A zip file of multiple executable files is also accepted.
The deadline for your submission is February 16, 2024 at 11:59 PM.

• You can delay up to three days, i.e., until February 19, 2024 at 11:59 PM. After this extended deadline, no submission is accepted.
• In case of a delay, you lose one of your two penalty-free delays. After two penalty-free delays, each day of delay deducts 10% of the assignment mark.

Please submit your assignment only through Quercus, not by email.
1 WRITTEN EXERCISES
QUESTION 1: FORWARD AND BACKWARD PASS
In this exercise, we try forward and backward propagation for the simple feedforward neural network (FNN) we had in the first series of assignments. This FNN is shown in Figure 1.1. In this FNN, we have used the soft-ReLU function for activation in the hidden layer. This means that $f(\cdot)$ in Figure 1.1 is

$$f(z) = \log(1 + e^z)$$

with the logarithm taken in natural base. The output layer is further activated via the sigmoid function, i.e.,

$$\sigma(z) = \frac{1}{1 + e^{-z}}.$$

[Figure 1.1: Fully-connected FNN with two-dimensional input $x = [x_1, x_2]^T$.]

For training of this FNN, we use the cross-entropy function as the loss function. We are given the data-point

$$x = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
whose true label is $v_0 = 1$. We intend to perform one forward and one backward pass by hand. To this end, assume that all weights and biases are initialized to the value 0.1, i.e., all entries of $W_1^{(0)}$ and $w_2^{(0)}$ are 0.1, where $W_1$ is the matrix containing all weights and biases of the hidden layer and $w_2$ is the vector containing all weights and biases of the output layer.

1. Determine all variables calculated in the forward pass. You have to explain the order of your calculation using the forward-propagation algorithm.

2. Determine the gradient of the loss with respect to all the weights and biases at the given initial values via backpropagation.
Note: You must use the backpropagation algorithm.

3. Assume we are doing sample-level training. Calculate the updated weights and biases for the next iteration of gradient descent, i.e., $W_1^{(1)}$ and $w_2^{(1)}$.
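If you want to sanity-check your hand calculations, here is a minimal sketch using torch's automatic differentiation (the tensor names are illustrative; splitting each layer's weights and biases into separate tensors, all filled with 0.1, matches the initialization above, and the binary cross-entropy for true label $v_0 = 1$ reduces to $-\log(y)$):

import torch

# Architecture of Figure 1.1: 2 inputs -> 2 soft-ReLU hidden neurons -> 1 sigmoid output.
# All weights and biases start at 0.1, as the question specifies.
W1 = torch.full((2, 2), 0.1, requires_grad=True)  # hidden-layer weights
b1 = torch.full((2,), 0.1, requires_grad=True)    # hidden-layer biases
w2 = torch.full((2,), 0.1, requires_grad=True)    # output-layer weights
b2 = torch.full((1,), 0.1, requires_grad=True)    # output-layer bias

x = torch.tensor([1.0, 1.0])

z1 = W1 @ x + b1                   # forward affine, hidden layer
y1 = torch.log(1 + torch.exp(z1))  # soft-ReLU activation
z2 = w2 @ y1 + b2                  # forward affine, output layer
y2 = torch.sigmoid(z2)             # sigmoid activation

R = -torch.log(y2)                 # cross-entropy loss for true label v0 = 1
R.backward()

print(z1, y1, z2, y2, R)                   # forward-pass variables
print(W1.grad, b1.grad, w2.grad, b2.grad)  # gradients at the initial point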
QUESTION 2: FORWARD-PROPAGATION REVISITED
Consider a fully-connected feedforward neural network (FNN) with $L$ hidden layers. The input data-point $x$ to this FNN has $N$ entries, i.e., $x \in \mathbb{R}^N$. The hidden layer $\ell$ for $\ell \in \{1, \dots, L\}$ has $W_\ell$ neurons, all being activated with activation function $f_\ell(\cdot) : \mathbb{R} \mapsto \mathbb{R}$, and its output layer contains $W_{L+1}$ neurons with activation function $f_{L+1}(\cdot) : \mathbb{R} \mapsto \mathbb{R}$. For this network, we derived the forward-propagation algorithm in the lecture as given in Algorithm 1.
Algorithm 1 ForwardProp(): Standard Form Derived in Lecture
1: Initiate with $y_0 = x$
2: for $\ell = 0, \dots, L$ do
3:    Add $y_\ell[0] = 1$ and determine $z_{\ell+1} = W_{\ell+1} y_\ell$    # forward affine
4:    Determine $y_{\ell+1} = f_{\ell+1}(z_{\ell+1})$    # forward activation
5: end for
6: for $\ell = 1, \dots, L+1$ do
7:    Return $y_\ell$ and $z_\ell$
8: end for
In this algorithm, matrix $W_{\ell+1} \in \mathbb{R}^{W_{\ell+1} \times (W_\ell + 1)}$ contains all the weights and biases of the neurons in layer $\ell + 1$, where we define the input layer to be layer 0 with $W_0 = N$ nodes, i.e., the input entries, and the output layer to be layer $L + 1$.

In this exercise, we intend to represent an alternative form of forward-propagation in which we represent the weights and biases as separate components.[2]

[2] This means that we do not want to use the dummy node 1 in each layer as we did in the lecture.
For $\ell \in \{0, \dots, L\}$, let $\tilde{W}_{\ell+1} \in \mathbb{R}^{W_{\ell+1} \times W_\ell}$ be a matrix whose entry in row $j$ and column $i$ denotes the weight of neuron $j$ in layer $\ell + 1$ for its $i$-th input. Moreover, let $b_{\ell+1} \in \mathbb{R}^{W_{\ell+1}}$
be the vector of biases in layer $\ell + 1$, whose entry $j$ denotes the bias of neuron $j$ in layer $\ell + 1$.
1. Write the affine transform of layer $\ell + 1$ in terms of the weight matrix $\tilde{W}_{\ell+1}$ and the bias vector $b_{\ell+1}$.
2. Re-write the forward-propagation algorithm in terms of $\tilde{W}_{\ell+1}$ and $b_{\ell+1}$. For the sake of simplicity, an incomplete version of the algorithm is given below in Algorithm 2: you should only complete the blank lines.
Hint: Note that this alternative form should not contain $W_\ell$ anymore.
Algorithm 2 ForwardProp(): Alternative Form
1: ----------    # complete
2: for $\ell = 0, \dots, L$ do
3:    ----------    # complete
4:    Determine $y_{\ell+1} = f_{\ell+1}(z_{\ell+1})$    # forward activation
5: end for
6: for $\ell = 1, \dots, L+1$ do
7:    Return $y_\ell$ and $z_\ell$
8: end for
3. Explain the relation between matrix $W_\ell$ in Algorithm 1 and $\tilde{W}_\ell$ and $b_\ell$ in Algorithm 2.
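To see the two representations side by side, here is a small numerical sketch (the dimensions and variable names are illustrative, not part of the assignment; it assumes the lecture's convention that the dummy entry 1 sits at index 0, so the biases occupy column 0 of the augmented matrix):

import torch

W_in, W_out = 4, 3                  # illustrative layer widths
W_tilde = torch.randn(W_out, W_in)  # weights only
b = torch.randn(W_out)              # biases only

# Augmented matrix of Algorithm 1: biases in column 0, weights in the rest.
W = torch.cat([b.unsqueeze(1), W_tilde], dim=1)  # shape (W_out, W_in + 1)

y_prev = torch.randn(W_in)
y_aug = torch.cat([torch.ones(1), y_prev])       # prepend the dummy entry 1

z_standard = W @ y_aug                # Algorithm 1 style: one augmented matrix
z_alternative = W_tilde @ y_prev + b  # Algorithm 2 style: separate weights and biases
print(torch.allclose(z_standard, z_alternative))  # True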
QUESTION 3: CHAIN-RULE FOR AFFINE OPERATION
Assume that scalar $\hat{R} \in \mathbb{R}$ is a function of vector $y \in \mathbb{R}^K$, i.e., $\hat{R} = \mathcal{L}(y)$ for some $\mathcal{L}(\cdot) : \mathbb{R}^K \mapsto \mathbb{R}$. We already have calculated the gradient of $\hat{R}$ with respect to $y$, i.e., we have the vector

$$\nabla_y \hat{R} = \begin{bmatrix} \partial \hat{R} / \partial y_1 \\ \vdots \\ \partial \hat{R} / \partial y_K \end{bmatrix}.$$
We further know that $y$ is an affine function of an input vector $z \in \mathbb{R}^N$, i.e.,

$$y = Az + b$$

for some matrix $A \in \mathbb{R}^{K \times N}$ and $b \in \mathbb{R}^K$. We want to calculate the gradient of $\hat{R}$ with respect to any of these three components, i.e., $z$, $A$ and $b$, from $\nabla_y \hat{R}$.
1. First assume that $A$ and $b$ are given. We intend to calculate the gradient of $\hat{R}$ with respect to $z$, i.e., $\nabla_z \hat{R}$. The computation graph for this problem is shown in Figure 1.2. Determine $\nabla_z \hat{R}$ in terms of $\nabla_y \hat{R}$.
Hint: You need to present the result compactly as a matrix-vector multiplication.
[Figure 1.2: Computation graph for Case 1, where we aim to calculate $\nabla_z \hat{R}$.]
2. Now, assume another case in which $z$ and $b$ are given, and we intend to calculate the gradient of $\hat{R}$ with respect to $A$, i.e., $\nabla_A \hat{R}$. The computation graph for this problem is shown in Figure 1.3. Determine $\nabla_A \hat{R}$ in terms of $\nabla_y \hat{R}$.
Hint: You need to present the result compactly as a vector-vector multiplication.
[Figure 1.3: Computation graph for Case 2, where we aim to calculate $\nabla_A \hat{R}$.]
3. As the last case, assume that $A$ and $z$ are given. We now intend to calculate the gradient of $\hat{R}$ with respect to $b$, i.e., $\nabla_b \hat{R}$. The computation graph for this problem is shown in Figure 1.4. Determine $\nabla_b \hat{R}$ in terms of $\nabla_y \hat{R}$.
Hint: You need to present the result compactly as a vector.
[Figure 1.4: Computation graph for Case 3, where we aim to calculate $\nabla_b \hat{R}$.]
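These three cases are exactly what automatic differentiation computes; here is a minimal torch sketch you can use to check your derived expressions numerically (the dimensions and the particular loss are illustrative, and the closed forms in the comments are the ones you are asked to derive):

import torch

K, N = 3, 4                                # illustrative dimensions
A = torch.randn(K, N, requires_grad=True)
b = torch.randn(K, requires_grad=True)
z = torch.randn(N, requires_grad=True)

y = A @ z + b                              # the affine operation
R = (y ** 2).sum()                         # an arbitrary scalar loss L(y)

g_y, g_z, g_A, g_b = torch.autograd.grad(R, (y, z, A, b))

# Compare autograd against compact closed forms built from grad_y R.
print(torch.allclose(g_z, A.T @ g_y))            # matrix-vector form (Case 1)
print(torch.allclose(g_A, torch.outer(g_y, z)))  # vector-vector outer product (Case 2)
print(torch.allclose(g_b, g_y))                  # plain vector form (Case 3)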
QUESTION 4: BACKPROPAGATION REVISITED
We now extend the alternative representation of Question 2 to the backward pass, using the results of Question 3. Recall the backpropagation algorithm derived in the lecture: this is given in Algorithm 3. In this algorithm, $\overset{\leftharpoonup}{y}_\ell$ represents the gradient of the loss $\hat{R}$ with respect to $y_\ell$, determined for data-point $(x, v)$; i.e., if $y_{L+1}$ denotes the output of the FNN for input point $x$, then $\hat{R} = \mathcal{L}(y_{L+1}, v)$. Furthermore, $\dot{f}_\ell(\cdot) : \mathbb{R} \mapsto \mathbb{R}$ denotes the derivative of activation $f_\ell(\cdot)$ with respect to its argument, and $\odot$ is the entry-wise product.
Algorithm 3 BackProp(): Standard Form Derived in Lecture
1: Initiate with $\overset{\leftharpoonup}{y}_{L+1} = \nabla \mathcal{L}(y_{L+1}, v)$ and $\overset{\leftharpoonup}{z}_{L+1} = \overset{\leftharpoonup}{y}_{L+1} \odot \dot{f}_{L+1}(z_{L+1})$
2: for $\ell = L, \dots, 1$ do
3:    Determine $\overset{\leftharpoonup}{y}_\ell = W_{\ell+1}^T \overset{\leftharpoonup}{z}_{\ell+1}$ and drop $\overset{\leftharpoonup}{y}_\ell[0]$    # backward affine
4:    Determine $\overset{\leftharpoonup}{z}_\ell = \dot{f}_\ell(z_\ell) \odot \overset{\leftharpoonup}{y}_\ell$    # backward activation
5: end for
6: for $\ell = 1, \dots, L+1$ do
7:    Return $\nabla_{W_\ell} \hat{R} = \overset{\leftharpoonup}{z}_\ell \, y_{\ell-1}^T$
8: end for
1. Using the results of Question 3, complete the alternative form of the backpropagation algorithm given below in Algorithm 4. This alternative form should only contain the matrices $\tilde{W}_{\ell+1}$ and vectors $b_{\ell+1}$, as defined in Question 2.
Algorithm 4 BackProp(): Alternative Form
1: Initiate with $\overset{\leftharpoonup}{y}_{L+1} = \nabla \mathcal{L}(y_{L+1}, v)$ and $\overset{\leftharpoonup}{z}_{L+1} = \overset{\leftharpoonup}{y}_{L+1} \odot \dot{f}_{L+1}(z_{L+1})$
2: for $\ell = L, \dots, 1$ do
3:    ----------    # complete
4:    Determine $\overset{\leftharpoonup}{z}_\ell = \dot{f}_\ell(z_\ell) \odot \overset{\leftharpoonup}{y}_\ell$    # backward activation
5: end for
6: for $\ell = 1, \dots, L+1$ do
7:    ----------    # complete
8: end for
2. Explain the relation between $\nabla_{W_\ell} \hat{R}$ in Algorithm 3 and the gradients that are returned in line 7 of Algorithm 4.
2 PROGRAMMING EXERCISES
Throughout the programming tasks, we use the library torch to implement the forward and backward propagation through a three-layer FNN. We also use torchvision to access the MNIST dataset. If you are a beginner, you may find the following description useful to get started with these packages.
USING TENSORS IN PYTORCH
To use any library in Python, you need to install it first. This can be done directly through the terminal (command line) using the pip installer, which is an inline installer of Python packages. Below is an example of installing PyTorch:
pip install torch torchvision torchaudio
Depending on your operating system, you can find the exact installation command at pytorch.org/get-started/locally.
Once the packages are installed, you can import them using the command import:

import torch

and access modules and functions by calling torch: for instance, in this assignment, we use the random generator. We can generate a 3 × 2 × 2 uniform random tensor by

torch.rand(3, 2, 2)
This gives a random tensor, for example:

>> tensor([[[0.3428, 0.4368],
            [0.5732, 0.4344]],

           [[0.7477, 0.8229],
            [0.8687, 0.4596]],

           [[0.9962, 0.0207],
            [0.4515, 0.8986]]])
You can learn more at docs.python.org/3/tutorial/modules#packages.
PREFACE: SPECIFYING THE FNN AND LOSS
We are going to work with the three-layer FNN studied in the lecture. Wherever needed, we consider the following specifications: we have a three-layer fully-connected FNN whose input is an image from the MNIST dataset. This means that the input dimension is $N = 28 \times 28 = 784$. Both hidden layers have 128 neurons, i.e., $W_1 = W_2 = 128$. All hidden neurons are activated by the ReLU function. In MNIST, we have $C = 10$ classes. We hence use a softmax-activated neuron at the output layer: it gets the 128 outputs of the second hidden layer and returns a 10-dimensional vector.
We further determine the loss via the cross-entropy function: assume the image belongs to class $v \in \{1, \dots, 10\}$ and let $y_3 \in \mathbb{R}^{10}$ be the output of our FNN. The loss is then calculated as

$$\hat{R} = -\log y_3[v]$$

where $y_3[v]$ is entry $v$ of vector $y_3$.
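As a tiny worked example of this loss (the numbers are illustrative; note that the programming templates later use 0-based labels $\{0, \dots, 9\}$, so entry $v$ of $y_3$ is indexed directly):

import torch

# Illustrative softmax output y3 and true class v.
y3 = torch.tensor([0.05, 0.02, 0.70, 0.03, 0.05, 0.02, 0.04, 0.03, 0.04, 0.02])
v = 2
R = -torch.log(y3[v])  # cross-entropy loss of the specification
print(R)               # tensor(0.3567)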
2.1 FORWARD AND BACKWARD PROPAGATION FROM SCRATCH
In this assignment, we intend to implement the forward and backpropagation algorithms for the specified three-layer FNN.
TASK 1: IMPLEMENTING HIDDEN LAYERS
Now that we know the exact architecture of the FNN and the loss function, we intend to implement the forward and backpropagation. To this end, we need to first define our model. We do this by first defining the hidden layers.

1. Start your code by writing a class called hidden(). This class gets the input and output dimensions of the hidden layer and performs the forward and backward passes. In this class, define the input and output dimensions as attributes. Also, initiate the matrix of weights via a matrix given as the input. You can make this class by completing the following code:
class hidden():
    def __init__(self, input_size, output_size, W):
        # define the attributes of this class
        self.input_size =  # complete
        self.output_size =  # complete
        # initiate the matrix of weights
        self.weights = W
Next, we implement the forward propagation from the input to the output of the hidden layer. This includes two passes, i.e., one affine transform and one activation. We can implement the affine transform by multiplying the extended input (input + dummy entry 1) by the weight matrix of the hidden layer. For activation, we need to pass every entry through ReLU, i.e., make it zero if negative and leave it unchanged otherwise.
2. Add the function forward to the class hidden(). This function gets an input vector. This input is the output of the previous layer, to which a dummy entry 1 has been added at index 0. The function returns both the affine transform and the output of the hidden layer after ReLU activation. It also adds a dummy entry 1 at index 0 of its output. You can write this function by completing the following code.
class hidden():
    def __init__(self, input_size, output_size, W):
        # we implemented this in the previous part
        pass

    def forward(self, x):
        '''
        x is the input to the hidden layer
        x is of size input_size + 1
        '''
        self.z =  # complete
        self.y =  # complete
        # self.y should be of size output_size + 1
        return
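For orientation, here is a sketch of one possible completed forward computation, written as a standalone function (it assumes the weight matrix has shape (output_size, input_size + 1) with biases in column 0, matching the lecture's augmented convention; it is one possible completion, not the official solution):

import torch

def hidden_forward(weights, x):
    """One possible hidden-layer forward pass (illustrative sketch).

    weights: (output_size, input_size + 1) tensor, biases in column 0
    x: (input_size + 1,) input with the dummy entry 1 at index 0
    """
    z = weights @ x                    # affine transform, size output_size
    y = torch.clamp(z, min=0)          # ReLU: zero if negative, unchanged otherwise
    y = torch.cat([torch.ones(1), y])  # prepend dummy entry 1 for the next layer
    return z, y

# illustrative usage: 784 inputs (plus dummy) into 128 neurons
z, y = hidden_forward(torch.randn(128, 785),
                      torch.cat([torch.ones(1), torch.rand(784)]))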
We finally implement the backward propagation through the hidden layer. To understand what we are going to do, let's denote the input to the hidden layer by $x$, the affine transform by $z$, and the output by $y$. We assume that we have the gradient of the loss with respect to the output, i.e., we already have $\nabla_y \hat{R}$. Note that this gradient does not contain the derivative with respect to the dummy entry. We should now calculate $\nabla_z \hat{R}$ and $\nabla_x \hat{R}$.
3. Add the function backward to the class hidden(). This function gets an input vector. This input is the gradient with respect to the layer's output, given by the next layer in the backward pass. This vector has output_size entries. The function returns both the gradient with respect to the affine transform, i.e., $\nabla_z \hat{R}$, and the gradient with respect to the input of the hidden layer, i.e., $\nabla_x \hat{R}$. It also drops the first dummy entry of $\nabla_x \hat{R}$. You can write this function by completing the following code.
class hidden():
    def __init__(self, input_size, output_size, W):
        # we implemented this in the previous part
        pass

    def forward(self, x):
        # we implemented this in the previous part
        pass

    def backward(self, g_y):
        '''
        g_y is the gradient w.r.t. the output
        g_y is of size output_size
        '''
        self.g_z =  # complete
        self.g_x =  # complete
        # self.g_x should be of size input_size
        return
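Similarly, here is a sketch of one possible backward computation, using the chain-rule results of Question 3 (same augmented-weight assumption as above; illustrative, not the official solution):

import torch

def hidden_backward(weights, z, g_y):
    """One possible hidden-layer backward pass (illustrative sketch).

    weights: (output_size, input_size + 1) tensor, biases in column 0
    z: (output_size,) affine transform saved during the forward pass
    g_y: (output_size,) gradient of the loss w.r.t. the layer output
    """
    g_z = g_y * (z > 0).float()  # backward activation: ReLU derivative is 1 for z > 0
    g_x = (weights.T @ g_z)[1:]  # backward affine; drop the dummy-entry gradient
    return g_z, g_x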
TASK 2: IMPLEMENTING OUTPUT LAYER
In this task, we implement the output layer, which is a softmax-activated vector neuron.
Recall that for a $K$-dimensional input $x \in \mathbb{R}^K$ and $C$ classes, this layer first calculates an affine function as

$$z = W \begin{bmatrix} 1 \\ x \end{bmatrix}$$

for some matrix $W \in \mathbb{R}^{C \times (K+1)}$. It then passes $z \in \mathbb{R}^C$ through the softmax activation to determine the output $y \in \mathbb{R}^C$. Entry $i$ of $y$ is given by

$$y[i] = \frac{e^{z[i]}}{\sum_{j=1}^{C} e^{z[j]}} \qquad (2.1)$$

where $z[i]$ is entry $i$ of vector $z$.
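A direct implementation of (2.1) can overflow when entries of $z$ are large; a common, numerically safer sketch subtracts $\max_i z[i]$ first, which leaves the ratio in (2.1) unchanged:

import torch

def softmax(z):
    """Softmax of (2.1), computed in a numerically stable way."""
    e = torch.exp(z - z.max())  # shifting by max(z) cancels in the ratio of (2.1)
    return e / e.sum()

print(softmax(torch.tensor([1000.0, 1001.0, 1002.0])))  # no overflow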
1. Write a new class outLayer(). This class gets the input and output dimensions of the output layer, i.e., $K$ and $C$ in the above example, and performs the forward and backward passes. In this class, define the input and output dimensions as attributes. Also, initiate the matrix of weights via a matrix given as input. You can make this class by completing the following code:
class outLayer():
    def __init__(self, input_size, output_size, W):
        # define the attributes of this class
        self.input_size =  # complete
        self.output_size =  # complete
        # initiate the matrix of weights
        self.weights = W
We implement the forward propagation from the input to the output of the output layer. This includes two passes, i.e., one affine transform and one softmax activation. We can implement the affine transform by multiplying the extended input (input + dummy entry 1) by the weight matrix. For activation, we need to pass the output of the affine transform through the softmax function defined in (2.1).
2. Add the function forward to the class outLayer(). This function gets an input vector. This input is the output of the last hidden layer, to which a dummy entry 1 has been added at index 0. The function returns both the affine transform and the output of the softmax activation. You can write this function by completing the following code.
class outLayer():
    def __init__(self, input_size, output_size, W):
        # we implemented this in the previous part
        pass

    def forward(self, x):
        '''
        x is the output of the last hidden layer
        x is of size input_size + 1
        '''
        self.z =  # complete
        self.y =  # complete
        # size of self.y should be the number of classes
        return
We next implement the backward propagation through the softmax-activated neuron. To understand what we are going to do, let's again denote the input to the output layer by $x$, the affine transform by $z$, and the output of the softmax function by $y$. We should calculate $\nabla_y \hat{R}$ using the definition of the cross-entropy loss, and $\nabla_z \hat{R}$ and $\nabla_x \hat{R}$ via the backward pass.
3. Add the function loss to the class outLayer(). This function gets the true label as input and determines the cross-entropy loss, as well as its gradient with respect to the output layer's output. You can write this function by completing the following code.
class outLayer():
    def __init__(self, input_size, output_size, W):
        # we implemented this in the previous part
        pass

    def forward(self, x):
        # we implemented this in the previous part
        pass

    def loss(self, v):
        '''
        v is the true label in {0, 1, ..., 9}
        '''
        self.loss =  # complete
        # self.loss is the cross-entropy between self.y and v
        self.g_y =  # complete
        # self.g_y is the gradient of the loss w.r.t. the output
        # self.g_y is of size output_size
        return
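Since $\hat{R} = -\log y[v]$ depends on $y$ only through entry $v$, its gradient with respect to $y$ has a single nonzero entry; a sketch of this computation (illustrative, assuming the 0-based labels of the template):

import torch

def cross_entropy_and_grad(y, v):
    """Cross-entropy -log y[v] and its gradient w.r.t. y (illustrative sketch)."""
    loss = -torch.log(y[v])
    g_y = torch.zeros_like(y)
    g_y[v] = -1.0 / y[v]  # only entry v of y enters the loss
    return loss, g_y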
4. Add the function backward to the class outLayer(). This function returns both the gradient with respect to the affine transform, i.e., $\nabla_z \hat{R}$, and the gradient with respect to the input to the output layer, i.e., $\nabla_x \hat{R}$. It also drops the first dummy entry of $\nabla_x \hat{R}$. You can write this function by completing the following code.
class outLayer():
    def __init__(self, input_size, output_size, W):
        # we implemented this in the previous part
        pass

    def forward(self, x):
        # we implemented this in the previous part
        pass

    def loss(self, v):
        # we implemented this in the previous part
        pass

    def backward(self):
        self.g_z =  # complete
        self.g_x =  # complete
        # size of self.g_x should be input_size
        # no input is needed, as loss() generated self.g_y
        return
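For the backward step itself, one possible sketch chains the softmax Jacobian with the gradient produced by loss() (illustrative, not the official solution; combined with the cross-entropy gradient above, the result collapses to the familiar y minus one-hot(v) form, which is a handy check):

import torch

def out_backward(y, g_y, weights):
    """One possible output-layer backward pass (illustrative sketch).

    y: (C,) softmax output saved in forward()
    g_y: (C,) gradient of the loss w.r.t. y, from loss()
    weights: (C, K + 1) weight matrix, biases in column 0
    """
    jac = torch.diag(y) - torch.outer(y, y)  # softmax Jacobian: diag(y) - y y^T
    g_z = jac @ g_y                          # backward activation
    g_x = (weights.T @ g_z)[1:]              # backward affine; drop dummy-entry gradient
    return g_z, g_x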
TASK 3: COMPLETING FNN IMPLEMENTATION
Since we have all the layers implemented, we can now implement our specified three-layer FNN with its complete forward and backward pass.

1. Write a new class myFNN(). The attributes of this class are the widths of our FNN and its initially chosen weights, i.e., $W_\ell^{(0)}$ for $\ell = 1, 2, 3$. Set these initial weights to be matrices whose entries are randomly chosen from the interval $[-1, 1]$. You can write this class by completing the following code:
class myFNN():
    def __init__(self):
        # define the attributes of this class
        self.input_size = 784
        weights_1 =  # complete
        self.hidden_size_1 = 128
        self.hidden1 = hidden(784, 128, weights_1)
        weights_2 =  # complete
        self.hidden_size_2 = 128
        self.hidden2 = hidden(128, 128, weights_2)
        weights_3 =  # complete
        self.num_classes = 10
        self.outLayer = outLayer(128, 10, weights_3)
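For the random initialization, one simple sketch rescales torch.rand, which samples from [0, 1) (the helper name is illustrative):

import torch

def uniform_init(rows, cols):
    """Matrix with entries drawn uniformly from [-1, 1) (illustrative sketch)."""
    return 2 * torch.rand(rows, cols) - 1  # rescale [0, 1) to [-1, 1)

weights_1 = uniform_init(128, 785)  # hidden layer 1: 784 inputs plus the dummy entry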
2. Write the function forward for this class. This function gets a data-point $x$ with its label $v$ and implements the forward pass through the three-layer FNN. You can write this function by completing the following code:
class myFNN():
    def __init__(self):
        # we implemented this in the previous part
        pass

    def forward(self, x, v):
        # add dummy 1 to x at index 0
        x =  # complete
        # forward pass through hidden layer 1
        self.hidden1.forward(x)
        # forward pass through hidden layer 2
        self.hidden2.forward(self.hidden1.y)
        # forward pass through output layer
        # complete
        # compute loss
        # complete <loss>
        return
3. Write the function backward for this class that implements the backward pass through the FNN. You can write this function by completing the following code:
class myFNN():
    def __init__(self):
        # we implemented this in the previous part
        pass

    def forward(self, x, v):
        # we implemented this in the previous part
        pass

    def backward(self):
        # backward pass through output layer
        self.outLayer.backward()
        # backward pass through hidden layer 2 (it receives the gradient
        # w.r.t. its own output, i.e., the output layer's g_x)
        self.hidden2.backward(self.outLayer.g_x)
        # backward pass through hidden layer 1
        # complete
        # now, compute the gradients w.r.t. the weights
        self.grad_1 =  # complete <gradient for layer 1>
        self.grad_2 =  # complete <gradient for layer 2>
        self.grad_3 =  # complete <gradient for output layer>
        return
The class myFNN() now implements the forward and backward pass. We can instantiate it to get the three-layer FNN with some random weights. We can then give an input along with its true label to compute the forward pass, and then apply the backward pass to get the gradients. This can readily be done with just a few lines of code:
# define the model
model = myFNN()
# for input data-point x and label v, pass forward
model.forward(x, v)
# then we pass backward
model.backward()
# now we have the sample gradients
print(model.grad_1)
print(model.grad_2)
print(model.grad_3)
We next need to read some data-points from MNIST.
2.2 MNIST DATASET
In this short assignment, we learn how to load MNIST data-points as a torch.Tensor. As mentioned in the lecture, you can read MNIST from the module torchvision.datasets. Let's import this module, and also the module torchvision.transforms, which helps us convert MNIST data-points to torch.Tensor. While importing, we give them names:

import torchvision.datasets as DS
import torchvision.transforms as transform
TASK 1: LOADING A DATA-POINT
We can now readily load MNIST as

mnist = DS.MNIST('./data',
                 train=True,
                 transform=transform.ToTensor(),
                 download=True)
In the above code, we indicate that the dataset is saved in the folder 'data' inside our current directory and that we load the training dataset. We apply the transform .ToTensor() to load the images as torch.Tensor, and finally we allow the dataset to be downloaded. The object mnist is a collection of 60,000 tuples: the first entry of each tuple is a torch.Tensor whose entries are the pixels of the image, and the second entry is the label.
1. Use the command len() to check the length of the object mnist.

2. Read the first tuple in mnist and specify its pixel tensor and label.

3. Use the method .reshape() to reshape the pixel tensor into a form that can be given to the three-layer FNN implemented in the previous assignment.

4. Call the reshaped pixel tensor x and the label v, and run the code below to check your myFNN() implementation; a sketch of steps 1-3 follows the code.
# define the model
model = myFNN()
# for input data-point x and label v, pass forward
model.forward(x, v)
# then we pass backward
model.backward()
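A minimal sketch of steps 1-3 above (illustrative; the printed shape and the label 5 correspond to the standard MNIST ordering):

import torchvision.datasets as DS
import torchvision.transforms as transform

mnist = DS.MNIST('./data', train=True,
                 transform=transform.ToTensor(), download=True)

print(len(mnist))        # 60000

pixels, v = mnist[0]     # first tuple: pixel tensor and its label
print(pixels.shape, v)   # torch.Size([1, 28, 28]) 5

x = pixels.reshape(784)  # flatten to the 784-entry input our FNN expects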
TASK 2: MAKING MINI-BATCHES
PyTorch has modules that can be used to divide a dataset into mini-batches, but we want to do this ourselves in this task. You can imagine how easy it is: we only need to make a loop.
1. Write the function myBatcher that gets batch_size as input and returns a list of mini-batches of size batch_size. You can write this function by completing the following code:
def myBatcher(batch_size):
    # initiate with an empty list of mini-batches
    batch_list = []
    # compute the number of mini-batches
    num_batches =  # complete
    for j in range(num_batches):
        # initiate with tensors of all zeros
        batch_x = torch.zeros(batch_size, 784)
        batch_v = torch.zeros(batch_size)
        for i in range(batch_size):
            # read pixel batch entry
            batch_x[i] =  # complete
            # read label batch entry
            batch_v[i] =  # complete
        # put the pixel and label batches in a tuple
        batch = (batch_x, batch_v)
        # append this mini-batch to the list
        batch_list.append(batch)
    return batch_list
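For orientation only, here is one possible completion (illustrative, not the official solution; it assumes the mnist object from the previous task is in scope and silently drops the last incomplete batch):

import torch

def myBatcher(batch_size):
    """Split mnist into mini-batches (one possible completion, illustrative)."""
    batch_list = []
    num_batches = len(mnist) // batch_size  # drop the last incomplete batch
    for j in range(num_batches):
        batch_x = torch.zeros(batch_size, 784)
        batch_v = torch.zeros(batch_size)
        for i in range(batch_size):
            pixels, label = mnist[j * batch_size + i]
            batch_x[i] = pixels.reshape(784)  # flatten the image
            batch_v[i] = label
        batch_list.append((batch_x, batch_v))
    return batch_list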
2. Run the function myBatcher with batch_size = 100 and print the labels of the first and third mini-batches.