Applied Reinforcement Learning

Lecturer: Hao Shen
Assistants: Martin Gottwald and Zhiwei Han
Contact: adprl.ldv@xcit.tum.de
Target Audience: Elective, supplementary lecture (Master)
ECTS: 6
Scope: 2/2 (weekly semester hours (SWS), lecture/tutorial)
Term: Summer
Registration phase: ??.??.2023 - 01.04.2023
Time & Place:  
Lecture:

03.04.2023 - 05.04.2023 and 12.04.2023 - 14.04.2023
(6 days in total before the semester starts)
N 1090 ZG, 9:00 - 17:00 h

Regular Question Session: Wednesday, Z995 or in NavigaTUM, 15:00 - 16:30
Extra Question Session: on request

Short-term Announcements

  • The room of the block course has been adjusted (30.01.2023)

Content

Reinforcement learning (RL) is one approach to solving sequential decision-making problems. A reinforcement learning agent interacts with its environment and uses the experience it gathers to make decisions that bring it closer to solving the problem. The technique has succeeded in various applications in operations research, robotics, game playing, network management, and computational intelligence.
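
The agent-environment interaction described above can be sketched in a few lines of Python. This is only an illustrative sketch and not course material: the `Corridor` environment, its reward scheme, and the `run_episode` helper are hypothetical names invented for this example, and the policy passed in is fixed rather than learned.

```python
import random


class Corridor:
    """Hypothetical toy environment: walk from cell 0 to the rightmost cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clip the move
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 0.0 if done else -1.0  # every step that misses the goal costs 1
        return self.state, reward, done


def run_episode(env, policy, max_steps=1000):
    """Let the policy interact with env for one episode; return (return, steps)."""
    state = env.reset()
    total_reward, steps = 0.0, 0
    while steps < max_steps:
        action = policy(state)                   # the agent decides
        state, reward, done = env.step(action)   # the environment responds
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


if __name__ == "__main__":
    rng = random.Random(0)
    print(run_episode(Corridor(), lambda s: +1))                    # always move right
    print(run_episode(Corridor(), lambda s: rng.choice((-1, +1))))  # random walk
```

A learning agent would replace the fixed policy with one that is updated from the observed states and rewards.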

This lecture provides an overview of basic concepts, practical techniques, and programming tools used in reinforcement learning. It focuses on the applied side of the subject, such as problem solving and implementation. By design, it complements the theoretical treatment of the subject (mathematical derivations, convergence proofs, and bound analysis), which is covered in the lecture "Approximate Dynamic Programming and Reinforcement Learning" in winter semesters.

In this lecture, we will cover the following topics:

  • Reinforcement learning problems as Markov decision processes
  • Dynamic programming (value iteration and policy iteration)
  • Monte Carlo reinforcement learning methods
  • Temporal difference learning (SARSA and Q-learning)
  • Simulation-based reinforcement learning algorithms
  • Linear value function approximation, e.g. tile coding
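
To give a flavor of the dynamic programming topic, here is a minimal sketch of value iteration. It is not taken from the lecture material: the five-state corridor MDP, the reward scheme, the discount factor `GAMMA = 0.9`, and all function names are assumptions chosen for illustration.

```python
# Hypothetical corridor MDP: states 0..4, state 4 is terminal.
N_STATES = 5
ACTIONS = (-1, +1)   # move left or right
GAMMA = 0.9          # assumed discount factor


def transition(state, action):
    """Deterministic model: returns (next_state, reward)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 0.0 if nxt == N_STATES - 1 else -1.0
    return nxt, reward


def backup(state, action, values):
    """One-step lookahead value of taking `action` in `state`."""
    nxt, reward = transition(state, action)
    return reward + GAMMA * values[nxt]


def value_iteration(tol=1e-10):
    """Sweep the Bellman optimality backup until the values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):  # the terminal state keeps value 0
            best = max(backup(s, a, values) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(values):
    """Pick the action with the best one-step lookahead in each state."""
    return [max(ACTIONS, key=lambda a: backup(s, a, values))
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    v = value_iteration()
    print(v)
    print(greedy_policy(v))
```

In this toy problem the greedy policy moves right in every non-terminal state. Value iteration needs the transition model; Monte Carlo and temporal difference methods such as SARSA and Q-learning estimate values from sampled experience instead.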

We will not cover:

  • Deep Reinforcement Learning in any flavor
  • Deep function approximation architectures that change during the learning process

The extensive hyperparameter tuning that these methods require exceeds the time and computational constraints of the lecture.

The goal of the lecture is for students to master the application of reinforcement learning algorithms to continuous control problems. Specifically, students should be able to:

  • describe classic scenarios of reinforcement learning problems
  • explain basics of reinforcement learning methods
  • select proper reinforcement learning algorithms in accordance with specific problems, and argue their choices
  • construct and implement reinforcement learning algorithms
  • model an engineering problem as a Markov decision process
  • design and plan experiments
  • analyze and evaluate systematically the outcome of experiments
  • compare performance of the reinforcement learning algorithms covered by the course
  • create a concise report / documentation of the project and its results

Students will apply the reinforcement learning algorithms to a simulated game in groups of three.

Lecture Details

The course consists of two phases:

  1. block lecture before the semester starts:
    1. frontal teaching sessions in the morning covering the RL basics
    2. practical coding part with discussions and individual feedback after lunch
  2. project phase:
    1. weekly consulting sessions (two hours per week) throughout the semester, where we will be available for questions and feedback
    2. additional sessions if requested and required

Registration Details

Because the reports require individual feedback and the course is project-based, we have to restrict the number of students to 15. Please note the following procedure:

  • If you are interested in the course, register for the waiting list in TUMonline
  • Room and location are listed at the top of this page
  • Attend the block lecture before the semester starts and participate in the hands-on part to demonstrate your motivation
  • The slots for the actual course (project and reports during the semester) will be filled from among all active and motivated students. The order is still determined by TUMonline and its lottery.
  • Once you are signed up for the actual course, you are enrolled and occupy a slot. If you quit the course later on, you will have prevented another student from taking it. Therefore, only sign up if you are sure you will stay in the course for the whole semester!

Literature

  • Sutton, R. S. & Barto, A. G., Reinforcement Learning: An Introduction. The MIT Press, 1998 (or the newer edition)
  • Bertsekas, D. P. & Tsitsiklis, J. N., Neuro-Dynamic Programming. Athena Scientific, 1996
  • Bertsekas, D. P., Dynamic Programming and Optimal Control, Vol. 1 & 2.
  • Szepesvári, C., Algorithms for Reinforcement Learning. Morgan & Claypool, 2010 (a draft)

Target Audience and Signup

Students in a Master's degree program. Registration via TUMonline.