Reinforcement learning
General data
Course ID: | 1000-2M20UZW |
Erasmus code / ISCED: | 11.3 |
Course title: | Reinforcement learning |
Name in Polish: | Uczenie ze wzmocnieniem (wspólnie z 1000-318bRL) |
Organizational unit: | Faculty of Mathematics, Informatics, and Mechanics |
Course groups: |
Elective courses for second-cycle studies in Bioinformatics
Elective courses for Computer Science and Machine Learning |
ECTS credit allocation (and other scores): | (not available) |
Language: | English |
Type of course: | elective monographs |
Short description: |
The classes present contemporary techniques and algorithms of reinforcement learning. |
Full description: |
1. Model-free methods
   a) Reinforcement Learning formalism: Markov Decision Processes (MDPs) & Dynamic programming (DP)
   b) Value methods
      * SARSA and TD(1)
      * Bias-variance trade-off and TD(lambda)
      * Function approximators and corresponding challenges
   c) Policy gradient methods
      * Vanilla policy gradients
      * Generalized Advantage Estimator (GAE)
      * Problems with policy gradient methods
   d) Actor-critic methods
      * Trust Region Policy Optimization (TRPO)
      * Proximal Policy Optimization (PPO)
      * Soft Actor-Critic (SAC)
2. Model-based methods
   a) Model estimation
   b) Planning
      * Continuous and discrete control problems
      * Monte-Carlo Tree Search
      * AlphaZero
3. Exploration
   a) Multi-armed bandit model
   b) Uncertainty-related exploration strategies
4. Research topics
5. Talks by practitioners |
Bibliography: |
R. Sutton, G. Barto, Reinforcement Learning: An Introduction.
V. François-Lavet, P. Henderson, R. Islam, M. G. Bellemare, J. Pineau, An Introduction to Deep Reinforcement Learning.
C. Szepesvári, Algorithms for Reinforcement Learning. |
Learning outcomes: |
Knowledge
* Knows the mathematical formalism of reinforcement learning, which makes it possible to develop efficient RL algorithms and analyse existing ones.
* Understands the basic components of RL algorithms and how they interact.
* Knows when to apply and how to implement the most important RL algorithms from the policy gradient, value-based, and actor-critic classes.
* Has basic knowledge of popular RL libraries.
Skills
* Can develop efficient algorithms and test them.
* Can distinguish types of RL problems and estimate their difficulty.
* Can appropriately apply methods to develop an algorithm, or apply already known methods in own research projects.
* Can implement own algorithms and use existing RL libraries.
* Can test implemented and developed algorithms.
* Can find and use the information contained in research papers.
Competences
* Knows the limits of own RL knowledge and recognizes the need for continuous learning.
* Understands the need for systematic work and meeting deadlines.
* Understands and appreciates the importance of intellectual honesty in the use of someone else's software. Behaves ethically during the implementation of algorithmic projects.
* Is able to independently find and use various types of information about algorithms, also in foreign languages. |
Assessment methods and assessment criteria: |
Attendance and project. |
Copyright by University of Warsaw.