Fall 2018 CS498DL
Assignment 5: Deep Reinforcement Learning
Due date: Thursday, December 20th, 11:59:59PM -- No late submissions accepted!
In this assignment, you will implement the famous Deep Q-Network (DQN) on the game of Breakout using the OpenAI Gym. The goal of this assignment is to understand how reinforcement learning with deep neural networks works when an agent interacts with the pixel-level information of an environment.
Download the starting code here.
The top-level notebook (
A note on terminology as you read the IPython notebook: for us, a single episode is one game played by the agent until it loses all of its lives (in this case, your agent has 5 lives). In the paper, however, an episode refers to almost 30 minutes of training on the GPU, and training at that scale is not feasible for us. We will soon post a more thorough table of expected rewards vs. number of episodes on Piazza to help with your debugging.
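The episode definition above can be sketched as a loop that keeps stepping the environment until the game reports that it is over. This is only an illustration, not the starter code: `ToyLivesEnv` and `run_episode` are hypothetical names, and the toy environment merely imitates the classic Gym convention in which `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`.

```python
import random


class ToyLivesEnv:
    """Hypothetical stand-in for a Gym-style game with a fixed number of lives."""

    def __init__(self, lives=5, seed=0):
        self.start_lives = lives
        self.lives = lives
        self.rng = random.Random(seed)

    def reset(self):
        self.lives = self.start_lives
        return 0  # dummy observation

    def step(self, action):
        # Each step either scores a point or costs one life.
        if self.rng.random() < 0.7:
            reward = 1.0
        else:
            reward = 0.0
            self.lives -= 1
        done = self.lives == 0  # the episode ends only when all lives are gone
        return 0, reward, done, {"lives": self.lives}


def run_episode(env, policy):
    """Play one episode: a full game until the agent has lost every life."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward, info["lives"]


score, lives_left = run_episode(ToyLivesEnv(lives=5), policy=lambda obs: 0)
print(f"episode score: {score}, lives left: {lives_left}")
```

With this definition, the score of an episode is the total reward collected across all 5 lives, not the reward for a single life.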
We recommend that you look at the links provided.
We highly recommend that you do the Official DQN PyTorch tutorial before starting this assignment.
This is a computationally expensive assignment. Expect your code to run for at least 13-15 hours to complete 5000 episodes. You can stop training early if you reach a mean score of 10 in the game. As mentioned, we will soon post some initial expectations of score values with respect to episodes on Piazza, so stay tuned, and in the meantime please get started.
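One way to check the early-stopping condition is to keep a running window of recent episode scores and compare its mean against the target of 10. The sketch below is an assumption about how this could be done, not part of the starter code: `ScoreTracker` is a hypothetical helper, and the window of 100 episodes is an arbitrary choice since the assignment does not say how many episodes the mean is taken over.

```python
from collections import deque

TARGET_MEAN_SCORE = 10.0  # threshold stated in the assignment
WINDOW = 100              # assumption: number of recent episodes to average


class ScoreTracker:
    """Hypothetical helper that tracks the mean score of recent episodes."""

    def __init__(self, window=WINDOW, target=TARGET_MEAN_SCORE):
        self.scores = deque(maxlen=window)  # old scores fall off automatically
        self.target = target

    def add(self, score):
        self.scores.append(score)

    def mean(self):
        return sum(self.scores) / len(self.scores) if self.scores else 0.0

    def should_stop(self):
        # Wait for a full window so a few lucky games do not end training.
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and self.mean() >= self.target


tracker = ScoreTracker(window=3)
for episode_score in [12, 9, 11]:
    tracker.add(episode_score)
print(f"mean score: {tracker.mean():.2f}, stop: {tracker.should_stop()}")
```

In a training loop, `tracker.add(score)` would be called once at the end of each episode, followed by a `break` when `tracker.should_stop()` is true.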
This assignment requires a GPU, so use your Google Cloud credits.
Environment Setup (Local)
If you work on the assignment on a local machine, you will need a Python environment set up with the appropriate packages. We suggest that you use Conda to manage Python package dependencies (https://conda.io/docs/user-guide/getting-started.html).
Unless you have a machine with a GPU, running this assignment on your local machine will be very slow and is not recommended.
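Before starting a long run, it may be worth confirming that PyTorch can actually see a GPU. The following is a minimal sketch: `pick_device` is a hypothetical helper name, and it falls back to the CPU when PyTorch is not installed or no CUDA device is visible.

```python
def pick_device():
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is missing from this environment; nothing can run on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"training would run on: {device}")
```

If this prints `cpu` on a machine that you expect to have a GPU, the installed PyTorch build or the GPU driver is a likely place to look first.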
The assignment is given to you in the
Late submissions will not be accepted!
Please refer to course policies on collaborations, late submission, and extension requests.