Summer Internships

Are you looking for a full-time summer internship with us? Compared to a regular practical course, you will be more closely involved in our research group and have the opportunity to gain deeper insights into our work. Currently, we are offering the topics listed below. If you are interested in one of these topics, please contact the indicated supervisor for details and submit your application using our application form. If you want to propose your own topic, please look for a suitable supervisor here and contact them directly.

Offered Internship Topics

EDGAR Datacenter

Our research vehicle, EDGAR, is generating vast amounts of valuable driving data. We're looking for talented contributors to help us shape this data into one of the largest open-source datasets for autonomous driving research.

A key step toward this goal is the implementation of an efficient and reliable data warehouse that allows researchers to easily access and work with this data.

Your tasks include:

  • API Design: Craft intuitive endpoints for seamless data access.
  • Data Transformation: Extract and transform raw rosbag files into usable formats (see the sketch after this list).
  • CI Testing: Build robust pipelines to ensure data integrity and project reliability.
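
To give a flavor of the data-transformation task, here is a minimal, hedged sketch of reading one topic from a ROS 2 bag and writing it to a columnar format. The bag path, topic name, and message fields are illustrative assumptions, not our actual data schema.

```python
# A minimal sketch of one possible extraction step, assuming a ROS 2 bag
# recorded with the default sqlite3 storage and CDR serialization.
# The bag path, topic name, and flattened fields are hypothetical.
import rosbag2_py
import pandas as pd
from rclpy.serialization import deserialize_message
from rosidl_runtime_py.utilities import get_message

def extract_topic(bag_path: str, topic: str) -> pd.DataFrame:
    """Read one topic from a rosbag2 recording into a flat DataFrame."""
    reader = rosbag2_py.SequentialReader()
    reader.open(
        rosbag2_py.StorageOptions(uri=bag_path, storage_id="sqlite3"),
        rosbag2_py.ConverterOptions(
            input_serialization_format="cdr",
            output_serialization_format="cdr",
        ),
    )
    # Map topic names to message types so raw bytes can be deserialized.
    type_map = {t.name: t.type for t in reader.get_all_topics_and_types()}
    msg_type = get_message(type_map[topic])

    rows = []
    while reader.has_next():
        name, raw, timestamp_ns = reader.read_next()
        if name != topic:
            continue
        msg = deserialize_message(raw, msg_type)
        # Flattening is message-specific; a PoseStamped topic is assumed here.
        rows.append({
            "t": timestamp_ns * 1e-9,
            "x": msg.pose.position.x,
            "y": msg.pose.position.y,
        })
    return pd.DataFrame(rows)

# Hypothetical usage: convert one recording into a columnar file.
df = extract_topic("edgar_run_001", "/localization/pose")
df.to_parquet("localization_pose.parquet")
```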

Essential skills:

  • Python
  • API design
  • ROS 2 (preferable)

If you are interested, feel free to contact me at roland.stolz@tum.de.

End-to-end Safe Reinforcement Learning Using Differentiable Simulation

Reinforcement learning (RL) has demonstrated remarkable success in solving complex control tasks, such as robotic manipulation and autonomous driving. However, many real-world control scenarios impose safety constraints that vanilla RL algorithms struggle to satisfy, and guaranteeing constraint satisfaction in RL is an active field of research. Most safeguarding approaches, such as predictive safety filters, rely on a (potentially simplified) analytical model of the system under control. From the perspective of the RL agent, however, this model is treated as a black box.

The central idea of this work is to incorporate the model knowledge used in safeguarding into the training process. By combining a differentiable simulation with a fully differentiable safeguarding approach, we can obtain the gradient of the reward w.r.t. the agent's actions. This promises to improve sample efficiency and speed up training, which is particularly advantageous because the safeguarding itself is computationally expensive. Concretely, we aim to combine previous work on policy learning with fully differentiable simulation and a differentiable action-projection safety shield that can be integrated into the RL agent's policy. Your goal is to evaluate whether this approach improves sample efficiency and wall-clock time during training compared to model-free RL algorithms with non-differentiable safety layers.
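
To make the core idea concrete, here is a minimal sketch (not our codebase) of training a policy by backpropagating the return through a hand-written differentiable simulator. The point-mass system, quadratic cost, and network sizes are illustrative assumptions.

```python
# Sketch: first-order policy learning through a differentiable simulator.
# Because the dynamics are written in PyTorch, the loss gradient flows
# through the entire rollout instead of relying on score-function estimators.
import torch

dt, horizon = 0.05, 50
policy = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                             torch.nn.Linear(32, 1))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def step(state, action):
    # Double-integrator (point-mass) dynamics, fully differentiable.
    pos, vel = state[..., 0:1], state[..., 1:2]
    vel = vel + dt * action
    pos = pos + dt * vel
    return torch.cat([pos, vel], dim=-1)

for it in range(1000):
    state = torch.tensor([[1.0, 0.0]])  # start away from the origin
    loss = 0.0
    for _ in range(horizon):
        action = policy(state)
        state = step(state, action)
        loss = loss + (state ** 2).sum()  # quadratic cost: drive state to zero
    opt.zero_grad()
    loss.backward()  # backpropagate through the whole rollout
    opt.step()
```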

Your tasks will include:

  • replicating results from an existing paper on our hardware (reinforcement learning with fully differentiable simulation),
  • implementing new systems (e.g., quadrotor) as fully differentiable gym environments,
  • integrating a fully differentiable safety shield, e.g., using cvxpylayers (a sketch follows this list).
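
As a flavor of the shield integration, the following hedged sketch builds a differentiable action projection with cvxpylayers: the raw policy output is projected onto a polytopic safe set, and gradients flow through the solution of the projection QP. The action dimension and constraint matrices are illustrative assumptions, not our actual safe set.

```python
# Sketch of a differentiable action-projection shield with cvxpylayers:
# project the raw action a onto the safe set {u : G u <= h}.
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

n = 2  # action dimension (assumed)
u = cp.Variable(n)
a = cp.Parameter(n)
G = cp.Parameter((4, n))
h = cp.Parameter(4)
# Euclidean projection: the closest safe action to the raw policy output.
problem = cp.Problem(cp.Minimize(cp.sum_squares(u - a)), [G @ u <= h])
shield = CvxpyLayer(problem, parameters=[a, G, h], variables=[u])

# Usage inside a policy: the safe action keeps a gradient w.r.t. the raw one.
raw_action = torch.tensor([2.0, -3.0], requires_grad=True)
G_t = torch.cat([torch.eye(n), -torch.eye(n)])  # box constraints |u_i| <= 1
h_t = torch.ones(4)
(safe_action,) = shield(raw_action, G_t, h_t)
safe_action.sum().backward()  # gradients flow through the QP solution
```

Because the projection is the solution of a parametric convex program, its derivative is obtained by implicitly differentiating the optimality conditions, which cvxpylayers handles internally.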

Essential skills:

  • Python
  • PyTorch

If you are interested, feel free to contact me at hannah.markgraf@tum.de with your CV and transcript of records.