Question
Please help step by step with an explanation of the TensorFlow code, and include the final code for understanding. Thank you.

Transcribed Image Text: Problem #2

The ReLU activation function reduces the effect of the vanishing gradient problem; that is why it is preferred over the sigmoid and tanh activation functions. The gradient of each of the following 3 activation functions is specified in the table below (the derivation of these gradients will be discussed later in this course and is also given in the Appendix).
Function | Function Definition | Gradient of Activation Function | Gradient computed at x = -4.0 | x = 0.5 | x = 4.0
Sigmoid | σ(x) = 1 / (1 + e^(-x)) = e^x / (1 + e^x) | σ'(x) = σ(x)(1 − σ(x)) | ? | ? | ?
Tanh | tanh(x) = sinh(x) / cosh(x) | d/dx tanh(x) = 1 − tanh²(x) | ? | ? | ?
ReLU | f(x) = x for x > 0; 0 otherwise | f'(x) = 0 when x <= 0; 1 when x > 0 | ? | ? | ?

Write TensorFlow (or NumPy) code to compute the gradient of the 3 activation functions at the following points:
• x = -4.0
• x = 0.5
• x = 4.0
(The value of the gradient of the sigmoid and tanh functions will be much smaller than the value of the gradient of the ReLU activation function.)

Answer:

Function | Function Definition | Gradient of Activation Function | Gradient computed at x = -4.0 | x = 0.5 | x = 4.0
Sigmoid | σ(x) = 1 / (1 + e^(-x)) = e^x / (1 + e^x) | σ'(x) = σ(x)(1 − σ(x)) | 0.0176 | 0.235 | 0.0176
Tanh | tanh(x) = sinh(x) / cosh(x) | d/dx tanh(x) = 1 − tanh²(x) | 0.0013 | 0.7864 | 0.0013
ReLU | f(x) = x for x > 0; 0 otherwise | f'(x) = 0 when x <= 0; 1 when x > 0 | 0 | 1 | 1
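The sketch below shows one way the requested code could look, using TensorFlow 2.x automatic differentiation with tf.GradientTape. It is an illustrative sketch only, not the expert solution referenced further down, and the names points and activations are placeholder choices.

```python
# Minimal sketch: gradients of sigmoid, tanh and ReLU via automatic differentiation.
# Assumes TensorFlow 2.x with eager execution; names are illustrative.
import tensorflow as tf

points = [-4.0, 0.5, 4.0]

# Activation functions from the problem's table.
activations = {
    "sigmoid": tf.math.sigmoid,
    "tanh": tf.math.tanh,
    "relu": tf.nn.relu,
}

for name, fn in activations.items():
    for p in points:
        x = tf.Variable(p)              # point at which the gradient is evaluated
        with tf.GradientTape() as tape:
            y = fn(x)                   # forward pass through the activation
        grad = tape.gradient(y, x)      # dy/dx at x = p
        print(f"{name}: x = {p:5.1f}, gradient = {grad.numpy():.4f}")
```

Running this reproduces the table values up to rounding (for example, the sigmoid gradient at x = -4.0 prints as 0.0177, which matches 0.0176 after truncation); tf.GradientTape records the forward pass so tape.gradient can return dy/dx without using the closed-form derivatives.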
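As a cross-check, the same numbers can be reproduced in NumPy directly from the closed-form derivatives given in the table. Again, this is only a sketch with illustrative variable names, not the posted solution.

```python
# Closed-form gradients from the table:
#   sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))
#   tanh'(x)    = 1 - tanh(x)^2
#   relu'(x)    = 0 for x <= 0, 1 for x > 0
import numpy as np

x = np.array([-4.0, 0.5, 4.0])

sig = 1.0 / (1.0 + np.exp(-x))           # sigmoid(x)
sig_grad = sig * (1.0 - sig)             # sigmoid'(x)

tanh_grad = 1.0 - np.tanh(x) ** 2        # tanh'(x)

relu_grad = np.where(x > 0, 1.0, 0.0)    # relu'(x), gradient taken as 0 at x <= 0

print("sigmoid':", np.round(sig_grad, 4))   # approx. [0.0177 0.235  0.0177]
print("tanh'   :", np.round(tanh_grad, 4))  # approx. [0.0013 0.7864 0.0013]
print("relu'   :", relu_grad)               # [0. 1. 1.]
```

np.where implements the piecewise ReLU gradient using the table's convention that the gradient is 0 at x <= 0, and the sigmoid and tanh gradients at x = ±4.0 are indeed far smaller than the ReLU gradient at x = 4.0, which is the point of the problem.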
Expert Solution

This question has been solved!
Step by step
Solved in 5 steps with 3 images
