Spring 2022 CS 444Assignment 5: Deep Reinforcement LearningDue date: Friday, May 5, 11:59:59PM![]() In this assignment, you will implement the famous Deep Q-Network (DQN) and its successor Double DQN on the game of Atari Breakout using the OpenAI Gym. The goals of this assignment are to (1) understand how deep reinforcement learning works when interacting with the pixel-level information of an environment and (2) implement a recurrent state to encode and maintain history. Download the starting code here. The top-level notebook ( Note, as you look in the iPython notebook, in our terminology, a single episode is a game played by the agent till it loses all its lives (in this case, your agent has 5 lives). In the paper, however, an episode refers to almost 30 minutes of training on the GPU and such training is not feasible for us. Below is a more thorough example of expected rewards vs. number of episodes to help with your debugging.
Your goal is to have either agent.py or agent_double.py reach an evaluation score of 10. To have a fair comparison in the report, we ask you to run both files the same amount of episodes, but only one model is required to reach the evaluation score of 10. We recommend that you look at the following links:
We highly recommend that you understand the Official DQN PyTorch tutorial before starting this assignment. This will give you a great starting point to implement DQN and Double DQN as the tutorial implements a version of double DQN for cartpole! However, we expect you to follow our code instructions and implement code in our format. Uploading code that does not follow our format will receive a zero. This is a computationally expensive assignment. It is expected that your code should run for at least 4 hours to complete 2000 episodes. You can stop training early if you reach a mean score of 10 in the game. As mentioned, we will be providing some initial expectations of score values with respect to episodes on Campuswire. This assignment requires a GPU, so use your Google Cloud credits (colab could work for this assignment as well). Extra Credit
Environment SetupThe assignment is given to you in the If you will be working on the assignment on a local machine then you will need a Python environment set up with the appropriate packages. We suggest that you use Conda to manage Python package dependencies (https://conda.io/docs/user-guide/getting-started.html). Unless you have a machine with a GPU, running this assignment on your local machine will be very slow and is not recommended. Submission InstructionsThis is your last assignment, so feel free to use up your remaining late days if you so choose!
Please refer to course policies on collaborations, late submission, etc. |