Project 4 | CS 188 Spring 2023

Project 4: Reinforcement Learning

Due: Tuesday, March 14, 11:59 PM PT.

TABLE OF CONTENTS

• Introduction
• MDPs
• Question 1 (6 points): Value Iteration
• Question 2 (5 points): Policies
• Question 3 (6 points): Q-Learning
• Question 4 (2 points): Epsilon Greedy
• Question 5 (2 points): Q-Learning and Pacman
• Question 6 (4 points): Approximate Q-Learning
• Submission

Introduction

In this project, you will implement value iteration and Q-learning. You will test your agents first on Gridworld (from class), then apply them to a simulated robot controller (Crawler) and Pacman. Questions 1 and 2 are on MDPs and are in-scope for the midterm.

As in previous projects, this project includes an autograder for you to grade your solutions on your machine. This can be run on all questions with the command:

    python autograder.py

It can be run for one particular question, such as q2, by:

    python autograder.py -q q2

It can be run for one particular test by commands of the form:

    python autograder.py -t test_cases/q2/1-bridge-grid

The code for this project contains the following files, available as a zip archive.

Files you'll edit:

    valueIterationAgents.py      A value iteration agent for solving known MDPs.
    qlearningAgents.py           Q-learning agents for Gridworld, Crawler and Pacman.
    analysis.py                  A file to put your answers to questions given in the project.

Files you might want to look at:

    mdp.py                       Defines methods on general MDPs.
    learningAgents.py            Defines the base classes ValueEstimationAgent and QLearningAgent, which your agents will extend.
    util.py                      Utilities, including util.Counter, which is particularly useful for Q-learners.
    gridworld.py                 The Gridworld implementation.
    featureExtractors.py         Classes for extracting features on (state, action) pairs. Used for the approximate Q-learning agent (in qlearningAgents.py).

Supporting files you can ignore:

    environment.py               Abstract class for general reinforcement learning environments. Used by gridworld.py.
    graphicsGridworldDisplay.py  Gridworld graphical display.
    graphicsUtils.py             Graphics utilities.
    textGridworldDisplay.py      Plug-in for the Gridworld text interface.
    crawler.py                   The crawler code and test harness. You will run this but not edit it.

    graphicsCrawlerDisplay.py    GUI for the crawler robot.
    autograder.py                Project autograder.
    testParser.py                Parses autograder test and solution files.
    testClasses.py               General autograding test classes.
    test_cases/                  Directory containing the test cases for each question.
    reinforcementTestClasses.py  Project 4 specific autograding test classes.

Files to Edit and Submit: You will fill in portions of valueIterationAgents.py, qlearningAgents.py, and analysis.py during the assignment. Once you have completed the assignment, you will submit these files to Gradescope (for instance, you can upload all .py files in the folder). Please do not change the other files in this distribution.

Evaluation: Your code will be autograded for technical correctness. Please do not change the names of any provided functions or classes within the code, or you will wreak havoc on the autograder. However, the correctness of your implementation, not the autograder’s judgements, will be the final judge of your score. If necessary, we will review and grade assignments individually to ensure that you receive due credit for your work.

Academic Dishonesty: We will be checking your code against other submissions in the class for logical redundancy. If you copy someone else’s code and submit it with minor changes, we will know. These cheat detectors are quite hard to fool, so please don’t try. We trust you all to submit your own work only; please don’t let us down. If you do, we will pursue the strongest consequences available to us.

Getting Help: You are not alone! If you find yourself stuck on something, contact the course staff for help. Office hours, section, and the discussion forum are there for your support; please use them. If you can’t make our office hours, let us know and we will schedule more. We want these projects to be rewarding and instructional, not frustrating and demoralizing. But we don’t know when or how to help unless you ask.

Discussion: Please be careful not to post spoilers.
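A note on the file list above, which calls util.Counter particularly useful for Q-learners: the rough sketch below illustrates why a dictionary with a default value of zero is convenient for storing values keyed by (state, action) pairs. It uses the standard library's collections.defaultdict as a stand-in and is not util.Counter itself; the state, the action names, and the numbers are made up for illustration and are not taken from the project code.

```python
from collections import defaultdict

# Hypothetical stand-in for a default-zero table; standard library only.
q_values = defaultdict(float)

state = (0, 0)               # illustrative grid coordinate (assumption)
actions = ["north", "east"]  # illustrative action names (assumption)

# A pair that was never stored reads as 0.0, so no key checks are needed.
unseen = q_values[(state, "north")]

# Entries can be adjusted in place without initializing them first.
q_values[(state, "east")] += 0.5

# Choose the action with the highest stored value for this state.
best_action = max(actions, key=lambda a: q_values[(state, a)])
```

The design point is only that missing entries behave like zeros, which keeps lookup and update code free of special cases for pairs that have not been visited yet.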
