Below is a 3x2 version of the reinforcement learning task from class. Other than being a smaller world, the details are the same. The agent can execute four actions: up, down, left, or right. When executing an action, the agent has an 80% chance of actually moving in that direction, a 10% chance of moving in the -90 degrees direction, and a 10% chance of moving in the +90 degrees direction. If the agent attempts to move into a wall, it stays in the same location. If the agent moves into location (3,1), it receives a +1 reward and the task is over. If the agent moves into location (3,2), it receives a -1 reward and the task is over. For all other actions, the agent receives a -0.04 reward.

[Figure: 3x2 grid world with columns labeled 1 to 3 and rows labeled 1 to 2. Cell (3,2) is marked -1 and cell (3,1) is marked +1. The policy arrows in the remaining cells are not recoverable from the text.]

(a) Show the utility equations for U(1,1), U(1,2), U(2,1), and U(2,2) for the policy in the above picture, assuming the discount factor gamma = 0.9.

(b) Show the final utility values for U(1,1), U(1,2), U(2,1), and U(2,2) for this policy. You do not need to show the computations, just the final values rounded to two-digit precision.
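Because the policy arrows are missing from the text, the actual answers to (a) and (b) cannot be reproduced here. The following is only a minimal sketch of how a fixed policy could be evaluated on this grid. It rests on several assumptions that are not in the original: `HYPOTHETICAL_POLICY` is an invented stand-in for the pictured policy, "up" is taken to increase the row number, and the reward is applied on entering a cell, so that U(s) = sum over s' of P(s' | s, policy(s)) * [R(s') + gamma * U(s')] with U = 0 after a terminal cell. If the class uses a different convention (for example, U(s) = R(s) + gamma * sum P * U(s') with the terminal utilities fixed at +1 and -1), the update line would need to change accordingly.

```python
GAMMA = 0.9
STEP_REWARD = -0.04
TERMINALS = {(3, 1): 1.0, (3, 2): -1.0}    # reward received on entering the cell
STATES = [(1, 1), (1, 2), (2, 1), (2, 2)]  # non-terminal cells, as (column, row)

# Assumption: "up" increases the row number, "right" increases the column number.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
# The two directions at -90 and +90 degrees from each intended action.
SLIPS = {"up": ("left", "right"), "down": ("right", "left"),
         "left": ("down", "up"), "right": ("up", "down")}


def step(state, direction):
    """Move one cell; bumping into a wall leaves the agent where it is."""
    x, y = state
    dx, dy = MOVES[direction]
    nx, ny = x + dx, y + dy
    return (nx, ny) if 1 <= nx <= 3 and 1 <= ny <= 2 else state


def transitions(state, action):
    """Return {next_state: probability} for the 80/10/10 motion model."""
    side_a, side_b = SLIPS[action]
    probs = {}
    for direction, p in ((action, 0.8), (side_a, 0.1), (side_b, 0.1)):
        nxt = step(state, direction)
        probs[nxt] = probs.get(nxt, 0.0) + p
    return probs


def evaluate_policy(policy, sweeps=1000):
    """Iterate the fixed-policy Bellman update until it has converged."""
    U = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        new_U = {}
        for s in STATES:
            total = 0.0
            for nxt, p in transitions(s, policy[s]).items():
                if nxt in TERMINALS:
                    total += p * TERMINALS[nxt]  # episode ends, no future utility
                else:
                    total += p * (STEP_REWARD + GAMMA * U[nxt])
            new_U[s] = total
        U = new_U
    return U


# Hypothetical policy for illustration only; replace with the arrows from the figure.
HYPOTHETICAL_POLICY = {(1, 1): "right", (2, 1): "right",
                       (1, 2): "down", (2, 2): "down"}

U = evaluate_policy(HYPOTHETICAL_POLICY)
for s in STATES:
    print(f"U{s} = {U[s]:.2f}")
```

Calling `transitions(s, policy[s])` for each of the four cells lists exactly the terms that appear on the right-hand side of that cell's utility equation in part (a); the printed values correspond to part (b) only once the real policy is substituted.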