Below is a 3x2 version of the reinforcement learning task from class. Other than being a smaller world, the details are the same. The agent can execute four actions: up, down, left, or right. When executing an action, the agent has an 80% chance of actually moving in that direction, a 10% chance of moving in the -90 degrees direction, and a 10% chance of moving in the +90 degrees direction. If the agent attempts to move into a wall, it stays in the same location. If the agent moves into location (3,1), it receives a +1 reward and the task is over. If the agent moves into location (3,2), it receives a -1 reward and the task is over. For all other actions, the agent receives a -0.04 reward.

[Figure: 3x2 grid with columns labeled 1 to 3 and rows labeled 1 and 2; cell (3,2) is marked -1 and cell (3,1) is marked +1.]

(a) Show the utility equations for U(1,1), U(1,2), U(2,1), and U(2,2) for the policy in the above picture, assuming the discount factor gamma = 0.9.

(b) Show the final utility values for U(1,1), U(1,2), U(2,1), and U(2,2) for this policy. You do not need to show the computations, just the final values rounded to two-digit precision.
![Question image: 3x2 grid world with the policy referenced in parts (a) and (b)](/v2/_next/image?url=https%3A%2F%2Fcontent.bartleby.com%2Fqna-images%2Fquestion%2F629eb2d8-2aec-4b81-96f5-36db7876a096%2F77b275de-60e5-4caf-870b-1d7e12d4f416%2Fk6dzr1i_processed.png&w=3840&q=75)
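The 80/10/10 transition model and the fixed-policy utility equations can be sketched in code. This is a minimal sketch under stated assumptions, not the expert solution. Each utility is taken to be U(s) = sum over s' of P(s' | s, pi(s)) * (r(s') + gamma * U(s')), where r(s') is +1 or -1 on entering a terminal cell (after which nothing further is earned) and -0.04 otherwise. That is one reading of the problem statement, and the convention used in class may differ. The policy arrows are not available in the text, so `HYPOTHETICAL_POLICY` is a placeholder. The names `step`, `transitions`, and `evaluate` are illustrative, and "up" is assumed to mean moving from row 1 to row 2.

```python
GAMMA = 0.9
STEP_REWARD = -0.04
COLS, ROWS = 3, 2
# Reward received on entering a terminal cell; the task then ends.
TERMINALS = {(3, 1): 1.0, (3, 2): -1.0}
# Assumption: states are (column, row) and "up" increases the row index.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
# The -90 and +90 degree slip directions for each intended action.
SIDEWAYS = {"up": ("left", "right"), "down": ("right", "left"),
            "left": ("down", "up"), "right": ("up", "down")}


def step(state, action):
    """Deterministic move; bumping into a wall leaves the state unchanged."""
    x, y = state
    dx, dy = MOVES[action]
    nx, ny = x + dx, y + dy
    if 1 <= nx <= COLS and 1 <= ny <= ROWS:
        return (nx, ny)
    return state


def transitions(state, action):
    """Return {next_state: probability} under the 80/10/10 model."""
    probs = {}
    slip_a, slip_b = SIDEWAYS[action]
    for a, p in ((action, 0.8), (slip_a, 0.1), (slip_b, 0.1)):
        s2 = step(state, a)
        probs[s2] = probs.get(s2, 0.0) + p
    return probs


def evaluate(policy, sweeps=1000):
    """Iterative policy evaluation: repeatedly apply the utility equations."""
    U = {s: 0.0 for s in policy}
    for _ in range(sweeps):
        U = {
            s: sum(
                p * (TERMINALS[s2] if s2 in TERMINALS
                     else STEP_REWARD + GAMMA * U[s2])
                for s2, p in transitions(s, a).items()
            )
            for s, a in policy.items()
        }
    return U


# Placeholder only: replace with the arrows shown in the question's picture.
HYPOTHETICAL_POLICY = {(1, 1): "right", (2, 1): "right",
                       (1, 2): "down", (2, 2): "down"}

if __name__ == "__main__":
    for s, u in sorted(evaluate(HYPOTHETICAL_POLICY).items()):
        print(s, round(u, 2))
```

For part (a), `transitions(s, policy[s])` gives the coefficients of each equation. For example, choosing "right" in (1,1) yields 0.8 for (2,1), 0.1 for (1,2), and 0.1 for staying in (1,1) after bumping into the bottom wall. For part (b), substitute the real policy and round the values returned by `evaluate`.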