Question

Consider Slide 55 in "8 Reinforcement Learning.ppt". We have part of another trajectory: (c1, E, -1) -> (c2, N, -1). That is, the agent takes action E in state c1 and receives reward -1, then takes action N in state c2 and receives reward -1.

  • Under SARSA, the new Q(c1,E) = ? (round your answer to one decimal place)
  • Under Q-Learning, the new Q(c1,E) = ? (round your answer to one decimal place)
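
The two updates differ only in the bootstrap term: SARSA uses the value of the action actually taken next (here N in c2), while Q-Learning uses the largest action value available in c2. The sketch below shows that difference as a pair of small functions. Every number in it (the learning rate `alpha`, the discount `gamma`, and all Q-values) is a hypothetical placeholder, not a value read from Slide 55; the slide's own Q(s,a) table and the course's choice of learning rate (which may depend on the visit counts N(s,a)) have to be substituted to get the requested answers.

```python
def sarsa_update(q_sa, reward, q_next_taken, alpha, gamma):
    """On-policy update: bootstrap from the action actually taken in s'."""
    return q_sa + alpha * (reward + gamma * q_next_taken - q_sa)


def q_learning_update(q_sa, reward, q_next_all, alpha, gamma):
    """Off-policy update: bootstrap from the best action value in s'."""
    return q_sa + alpha * (reward + gamma * max(q_next_all) - q_sa)


# Hypothetical placeholders (NOT taken from the slide).
alpha = 0.5           # assumed learning rate
gamma = 1.0           # assumed discount factor
q_c1_E = 2.0          # assumed current Q(c1, E)
q_c2 = {"N": 4.0, "E": 10.0, "S": 0.0, "W": 1.0}  # assumed Q(c2, .)
reward = -1.0         # reward from the step (c1, E, -1)

# The trajectory continues with action N in c2, so SARSA uses Q(c2, N).
new_sarsa = sarsa_update(q_c1_E, reward, q_c2["N"], alpha, gamma)

# Q-Learning ignores the action taken and uses max over Q(c2, a).
new_q_learning = q_learning_update(q_c1_E, reward, q_c2.values(), alpha, gamma)

print(round(new_sarsa, 1), round(new_q_learning, 1))
```

With these placeholder numbers the two rules give different results because N is not the highest-valued action in c2; when the action taken next is also the greedy one, the two updates coincide.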
[Figure: Slide 55 from "8 Reinforcement Learning.ppt", showing two grids, N(s,a) and Q(s,a), over rows a to c and columns 1 to 4, with cells marked +100 and -100.]
Transcribed Image Text: Consider Slide 55 in "8 Reinforcement Learning.ppt". We have a part of another trajectory: (c1,E, -1) -> (c2, N, -1).
  • Under SARSA, the new Q(c1,E) = (keep one decimal value in your answer)
  • Under Q-Learning, the new Q(c1,E) = (keep one decimal value in your answer)
Picture below is of the slide, with grids labeled N(s,a) and Q(s,a). Raw values as transcribed from the image (grid layout lost): 6 3 1 -2.4 -1.8 41.0 10 10 55 12 0 25 25 +100 C-2.4 16.4-1.8 60.8 0.0 88.8 +100 11 3 1 -2.4 -1.8 3.4 18 5 13 9 -2.9 45.8 9 55 55 2 -100 b-2.9 -2.9 12.2 -101.0 -100 4 -2.9 -51.5 6 4 28 4 -2.6 -1.9 11.0 -101.0 00 8 17 3 20 1 88 8a-2.5 -2.2-2.0 0.5 -2.0 -2.1 -15.0 -14.9 6 4 4 77 -2.4 -2.1 -1.9 -3.7 1 2 3 4 1 2 3 4