

DS595 - Reinforcement Learning - Spring 2022
Version: Jan 4th, 2022


Tentative Schedule:
Slides will be uploaded before each lecture.
-1. Week 1 (1/19 W):
Topic: Overview of Reinforcement Learning and Class Logistics (Slides)
Readings: N/A
-2. Week 2 (1/26 W):
-3. Week 3 (2/2 W):
-4. Week 4 (2/9 W):
Topic: Model-free Control. (Slides updated on 2/10/2022.)
Note: Quiz 1 on Markov Decision Process and Model-based Control (20min).
Note: Project 1 due.
-5. Week 5 (2/16 W):
-6. Week 6 (2/23 W):
Topic: Review of Deep Learning and Deep Reinforcement Learning. (Slides)
Note: Quiz 2 on Model-free Policy Evaluation.
-7. Week 7 (3/2 W):
Topic: Advanced Deep Reinforcement Learning (by Prof. Li) and Deep Learning Implementation in PyTorch (by TA Yingxue). (Slides updated on 3/3.)
Optional Reading #1: [AAAI 2016, Double DQN] Deep Reinforcement Learning with Double Q-learning, Hado van Hasselt, Arthur Guez, and David Silver (Google DeepMind), https://arxiv.org/pdf/1509.06461.pdf.
Optional Reading #2: [ICLR 2016] Prioritized Experience Replay, Tom Schaul, John Quan, Ioannis Antonoglou, and David Silver (Google DeepMind), https://arxiv.org/pdf/1511.05952.pdf.
Optional Reading #3: [ICML 2016, Dueling DQN] Dueling Network Architectures for Deep Reinforcement Learning, Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas https://arxiv.org/pdf/1511.06581.pdf.
Optional Reading #4: [AAAI 2018, Rainbow] Rainbow: Combining Improvements in Deep Reinforcement Learning, Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver, https://arxiv.org/pdf/1710.02298.pdf.
Note: Quiz 3 on Model-free Control.
Note: Project 2 due.
Note: Project 3 starts.
-8. Week 8 (3/9 W): No Class; Spring Break
-9. Week 9 (3/16 W):
Topic: Advanced DQNs (Continued), Inverse Reinforcement Learning, and Imitation Learning. (Slides)
Note: Quiz 4 on linear function approximation for policy evaluation and control.
Note: We will have an in-class self-introduction session, so you can start forming a team for Project 4.
-10. Week 10 (3/23 W):
Topic: Imitation Learning (Continued!) and Policy as a Deep Neural Network: Policy Gradient Reinforcement Learning. (Slides) (Updated on 3/29 to fix some typos; Vanilla PG algorithm updated on 3/31, thanks to Zhitian's comments.)
Suggested Reading: Policy Gradient RL algorithms (a good and comprehensive blog) (For this class, reading the sections we covered is sufficient.)
Note: Project 4 starts.
-11. Week 11 (3/30 W):
Topic: Policy Gradient RL (continued) (See the slides from last week.)
Note: Project 4 Proposal due.
-12. Week 12 (4/6 W):
Topic: Advanced Policy Gradient (PPO, TRPO, PPO2) (continued), Actor-Critic Approaches (A2C, A3C, Pathwise Derivative PG). (Slides)
Optional Reading #1: [TRPO] https://arxiv.org/pdf/1502.05477.pdf
Optional Reading #2: [PPO] https://arxiv.org/pdf/1707.06347.pdf
Note: Quiz 5 on policy gradient (including Basic PG, REINFORCE PG, and Vanilla PG).
Note: Project 3 due (extended! Originally Week 10 on 3/23).
-13. Week 13 (4/13 W):
Topic: Advanced RL methods: Sparse reward, hierarchical RL; Multi-Agent RL (MARL). (Slides)
Optional Readings: DDPG, MA-DDPG, GAIL, MA-GAIL.
Note: Project #4 Progress Report is due. Please submit it as a team to the Canvas discussion board.
-14. Week 14 (4/20 W):
Topic: Deep Inverse Reinforcement Learning, Multi-agent IRL, Meta-RL, and Class Review. (Slides) (Updated on 4/20!)
Optional Readings: Meta-RL.
-15. Week 15 (4/27 W):
Topic: Project #4 Presentations.
Note: Project 4 due.
To be updated.
yli15 at wpi.edu