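Let $f$ be a differentiable and $\mu$-strongly convex function whose minimum is achieved at $x^*$. Let us assume that the variance of the gradients is controlled: there exist $\sigma > 0$ and $L \ge 0$ such that

$$\mathbb{E}_i\left[\|\nabla f_i(x_k)\|^2 \mid x_k\right] \le \sigma^2 + L\,\|x_k - x^*\|^2.$$

Prove the following statements:

1. If $\sigma > 0$ and $L = 0$, SGD with step size $\eta_k$ satisfies

$$\mathbb{E}\left[f(z_k) - f^*\right] \le \frac{\mathbb{E}\|x_0 - x^*\|^2 + \sum_{j=0}^{k} \eta_j^2 \sigma^2}{2\sum_{j=0}^{k} \eta_j}, \tag{1}$$

where

$$z_k = \frac{\sum_{j=0}^{k} \eta_j x_j}{\sum_{j=0}^{k} \eta_j}. \tag{2}$$

In particular, $\mathbb{E}[f(z_k) - f^*]$ converges to 0 if and only if $\sum_i \eta_i = \infty$ and […] $= 0$.

2. If $\sigma > 0$ and $L > 0$, SGD with a constant step size $\eta$ satisfies

$$\mathbb{E}\|x_{k+1} - x^*\|^2 \le (1 - 2\eta\mu + \eta^2 L)^{k+1}\,\mathbb{E}\|x_0 - x^*\|^2 + \left[1 - (1 - 2\eta\mu + \eta^2 L)^{k+1}\right]\frac{\eta\sigma^2}{2\mu - \eta L}. \tag{3}$$

What is the restriction on the step size?

3. Let us observe that, by definition, SGD with step size $\eta_k$ satisfies

$$\|x_{k+1} - x^*\|^2 = \|x_k - x^*\|^2 + \eta_k^2\,\|\nabla f_i(x_k)\|^2 - 2\eta_k\,\langle x_k - x^*, \nabla f_i(x_k)\rangle. \tag{4}$$

Derive the optimal step size and comment on it.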
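As a numerical sanity check on bound (3), here is a minimal sketch under assumptions that are not part of the exercise: a one-dimensional toy objective $f(x) = \mu x^2/2$ (so $x^* = 0$) with a gradient oracle $g = \mu x + \sigma\xi$, $\xi \sim \mathcal{N}(0, 1)$. For this oracle $\mathbb{E}[g^2 \mid x] = \mu^2 x^2 + \sigma^2$, so the variance assumption holds with $L = \mu^2$, and the one-step recursion $\mathbb{E}\|x_{k+1} - x^*\|^2 \le (1 - 2\eta\mu + \eta^2 L)\,\mathbb{E}\|x_k - x^*\|^2 + \eta^2\sigma^2$ behind (3) is an equality. The function names `bound_3` and `sgd_mean_sq_dist`, the noise model and all parameter values are illustrative choices.

```python
import random


def bound_3(k, eta, mu, L, sigma2, d0):
    """Right-hand side of (3) after k steps, with d0 = E||x_0 - x*||^2.

    Requires 0 < eta < 2*mu/L (written as eta*L < 2*mu to allow L = 0),
    which keeps the contraction factor rho below 1.
    """
    if eta <= 0 or eta * L >= 2 * mu:
        raise ValueError("need 0 < eta and eta * L < 2 * mu")
    rho = 1.0 - 2.0 * eta * mu + eta * eta * L   # contraction factor
    floor = eta * sigma2 / (2.0 * mu - eta * L)  # limiting noise level
    return rho ** k * d0 + (1.0 - rho ** k) * floor


def sgd_mean_sq_dist(k, eta, mu, sigma, x0, runs, seed=0):
    """Monte Carlo estimate of E|x_k - x*|^2 for the assumed toy problem:
    f(x) = mu * x^2 / 2 (x* = 0), noisy gradient mu*x + sigma*xi."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        x = x0
        for _ in range(k):
            g = mu * x + sigma * rng.gauss(0.0, 1.0)  # stochastic gradient
            x -= eta * g                               # constant-step SGD
        total += x * x
    return total / runs


if __name__ == "__main__":
    mu, sigma, eta, x0, k = 1.0, 1.0, 0.1, 5.0, 50
    L = mu * mu  # variance constant of the toy oracle
    print("bound (3):  ", bound_3(k, eta, mu, L, sigma ** 2, x0 ** 2))
    print("Monte Carlo:", sgd_mean_sq_dist(k, eta, mu, sigma, x0, runs=5000))
```

On this toy problem the two printed values should agree up to Monte Carlo error. Shrinking `eta` lowers the noise floor $\eta\sigma^2/(2\mu - \eta L)$ but slows the geometric decay of the first term, which is the trade-off the step-size questions in parts 2 and 3 are probing.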
