
Question
1. (Proximal Gradient Descent) In this problem, we will show sublinear convergence of proximal gradient descent (PGD).
To be precise, we assume that the objective $f(x)$ can be written as $f(x) = g(x) + h(x)$, where
(a) $g$ is convex and differentiable, with $\operatorname{dom}(g) = \mathbb{R}^d$.
(b) $\nabla g$ is Lipschitz continuous with constant $L > 0$ (in the standard sense recalled after this list).
(c) $h$ is convex, not necessarily differentiable, and we take $\operatorname{dom}(h) = \mathbb{R}^d$ for simplicity.
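Here, Lipschitz continuity of $\nabla g$ in (b) means, in the standard sense,
$$\|\nabla g(x) - \nabla g(y)\| \le L\,\|x - y\| \qquad \text{for all } x, y \in \mathbb{R}^d.$$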
Define the generalized gradient to be
$$G(x_k) = L\,(x_k - x_{k+1}),$$
where $x_{k+1}$ is the next iterate obtained by applying PGD to $x_k$ (the standard update is recalled below). Show that
$$f(x_{k+1}) - f(x^*) \le \frac{L}{2}\left(\|x_k - x^*\|^2 - \|x_{k+1} - x^*\|^2\right),$$
where $x^*$ is the minimizer of $f$, and use it to conclude
$$f(x_k) - f(x^*) \le \frac{L}{2k}\,\|x_0 - x^*\|^2.$$
That is, the proximal descent method achieves $O(1/k)$ accuracy at the $k$-th iteration.
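For concreteness, the PGD update referred to above is the standard proximal step with step size $1/L$ (a standard definition the problem leaves implicit):
$$x_{k+1} = \operatorname{prox}_{h/L}\!\left(x_k - \tfrac{1}{L}\,\nabla g(x_k)\right), \qquad \operatorname{prox}_{h/L}(y) = \operatorname*{arg\,min}_{x \in \mathbb{R}^d}\left\{\tfrac{L}{2}\,\|x - y\|^2 + h(x)\right\},$$
so the update can equivalently be written as $x_{k+1} = x_k - \tfrac{1}{L}\,G(x_k)$, i.e., $G(x_k)$ plays the role of a gradient step direction.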
Hint: You can freely use the following lemma, which shows that PGD is also a "descent method":
Lemma 1 (Proximal Descent Lemma).
$$f(x_{k+1}) - f(z) \le G(x_k)^\top (x_k - z) - \frac{1}{2L}\,\|G(x_k)\|^2, \qquad \forall\, z \in \mathbb{R}^d.$$
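As a sketch of how the lemma yields the first inequality (one standard route, under the definitions above): take $z = x^*$ in Lemma 1 and complete the square on the right-hand side,
$$G(x_k)^\top (x_k - x^*) - \frac{1}{2L}\,\|G(x_k)\|^2 = \frac{L}{2}\left(\|x_k - x^*\|^2 - \left\|x_k - x^* - \tfrac{1}{L}\,G(x_k)\right\|^2\right),$$
and observe that $x_k - \tfrac{1}{L}\,G(x_k) = x_{k+1}$ by definition of $G$. Summing the per-step bound over iterations $0, \dots, k-1$ telescopes the right-hand side, and the descent property $f(x_{k+1}) \le f(x_k)$ (Lemma 1 with $z = x_k$) upgrades the resulting bound on the average to a bound at the $k$-th iterate.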
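To see the claimed behavior numerically, here is a minimal Python sketch of PGD (ISTA) on a lasso objective; the data, the choice $h(x) = \lambda\|x\|_1$, and the iteration budget are illustrative assumptions, not part of the problem:

import numpy as np

# Illustrative lasso instance: g(x) = 0.5*||Ax - b||^2, h(x) = lam*||x||_1.
# grad g is Lipschitz with constant L = lambda_max(A^T A).
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))
b = rng.standard_normal(40)
lam = 0.1
L = np.linalg.eigvalsh(A.T @ A).max()

def f(x):
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def soft_threshold(y, t):
    # prox of t*||.||_1: shrink each coordinate toward zero by t
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

x = np.zeros(20)
for k in range(1, 201):
    grad = A.T @ (A @ x - b)                   # gradient of g at x_k
    x = soft_threshold(x - grad / L, lam / L)  # x_{k+1} = prox_{h/L}(x_k - grad/L)
    if k in (1, 10, 100, 200):
        print(k, f(x))  # objective decreases monotonically toward the optimum

Each iterate is exactly the proximal step recalled above, and the decrease of the printed values of $f(x_k)$ reflects the $O(1/k)$ rate this problem asks you to prove.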