DS551/CS525 - Reinforcement Learning - Fall 2024
Version: June 24th, 2024
Tentative Schedule:
-1. Week 1 (8/27 T):
-2. Week 2 (9/3 T):
Optional readings: Reinforcement Learning: An Introduction, Sutton and Barto, 2nd Edition, Sec. 3.1-3.6, 4.1-4.4.
-3. Week 3 (9/10 T):
Optional readings: Reinforcement Learning: An Introduction, Sutton and Barto, 2nd Edition, Sec. 5.1-5.3, 6.1-6.6.
-4. Week 4 (9/17 T):
Optional readings: Reinforcement Learning: An Introduction, Sutton and Barto, 2nd Edition, Sec. 5.1-5.3, 6.1-6.6.
-5. Week 5 (9/24 T):
Optional readings: Reinforcement Learning: An Introduction, Sutton and Barto, 2nd Edition, Sec. 9.1-9.4.
-6. Week 6 (10/1 T):
Topic: Deep Reinforcement Learning.
Optional readings: Mnih, Volodymyr, et al., Playing Atari with Deep Reinforcement Learning, arXiv preprint arXiv:1312.5602 (2013).
-7. Week 7 (10/8 T):
Optional Reading #1: [AAAI 2016, Double DQN] Deep Reinforcement Learning with Double Q-learning, Hado van Hasselt, Arthur Guez, and David Silver (Google DeepMind), https://arxiv.org/pdf/1509.06461.pdf
Optional Reading #2: [ICLR 2016] Prioritized Experience Replay, Tom Schaul, John Quan, Ioannis Antonoglou, and David Silver (Google DeepMind), https://arxiv.org/pdf/1511.05952.pdf
Optional Reading #3: [ICML 2016, Dueling DQN] Dueling Network Architectures for Deep Reinforcement Learning, Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas, https://arxiv.org/pdf/1511.06581.pdf
Optional Reading #4: [AAAI 2018, Rainbow] Rainbow: Combining Improvements in Deep Reinforcement Learning, Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver, https://arxiv.org/pdf/1710.02298.pdf
-8. Week 8 (10/15 T): No Class; Fall Break
-9. Week 9 (10/22 T):
-10. Week 10 (10/29 T):
Optional readings: Reinforcement Learning: An Introduction, Sutton and Barto, 2nd Edition, Chapter 13.
Optional reading: Policy Gradient RL algorithms (a good and comprehensive blog). For this class, reading the sections we covered is sufficient.
-11. Week 11 (11/5 T):
Topic: Advanced Policy Gradient (PPO, TRPO, PPO2) (continued), Actor-Critic Approaches (A2C, A3C, Pathwise Derivative PG), Sparse Reward, Hierarchical RL.
Optional Reading #1: [TRPO] https://arxiv.org/pdf/1502.05477.pdf
Optional Reading #2: [PPO] https://arxiv.org/pdf/1707.06347.pdf
Optional Reading #3: [Actor-critic RL algorithms]
Optional Reading #2: [DDPG] https://spinningup.openai.com/en/latest/algorithms/ddpg.html
-13. Week 13 (11/19 T):
Optional Readings: DDPG, MA-DDPG, AlphaTensor.
-14. Week 14 (11/26 T):
Optional Readings: GAIL, MA-GAIL.
Optional Readings: A Beginner's Guide to Generative Adversarial Networks (GANs).
Optional Readings: Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... Bengio, Y. (2014). Generative adversarial nets. In Advances in neural information processing systems (pp. 2672-2680).
-15. Week 15 (12/3 T):
Optional Readings: Meta-RL.
-16. Week 16 (12/10 T):
yli15 at wpi.edu