Student projects and final year projects at the Chair of Media Technology

We constantly offer topics for student projects (engineering internships, research internships, working student positions, IDPs) and final-year projects (Bachelor's or Master's theses).

 

Open Theses

Self-Attention-Based End-to-End Scan2CAD

Description

In this work, we want to improve on Scan2CAD methods as proposed in [1][2][3].

This work aims at enhancing [2] through self-attention mechanisms.
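As a rough illustration of the kind of module this could involve, the following PyTorch sketch shows a residual self-attention block over per-object feature embeddings (the dimensions and the per-object formulation are our own assumptions, not the implementation of [1]-[3]):

```python
# Minimal self-attention block (PyTorch). Dimensions and the idea of
# attending over per-object embeddings are illustrative assumptions.
import torch
import torch.nn as nn

class SelfAttentionBlock(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_objects, dim) per-object feature embeddings
        attended, _ = self.attn(x, x, x)  # every object attends to all others
        return self.norm(x + attended)    # residual connection + layer norm

features = torch.randn(2, 10, 256)        # 2 scans, 10 candidate objects each
out = SelfAttentionBlock()(features)      # same shape: (2, 10, 256)
```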

References

[1] https://arxiv.org/pdf/1811.11187.pdf

[2] https://arxiv.org/pdf/1906.04201.pdf

[3] https://arxiv.org/pdf/2003.12622.pdf

Prerequisites

  • Python (PyTorch)
  • Experience with Git

Contact

driton.salihu@tum.de

Supervisor:

Driton Salihu

Scene-Graph-Based Refinement for 3D Scene Point Cloud Completion

Keywords:
Scene Completion, Point Clouds, Scene Graph, Deep Learning

Description

In this work, we want to investigate how scene graphs can help to improve scene completion and point cloud completion. The scene graph will be generated by detecting objects, attributes, and relationships from 2D RGB images. In the first stage of this work, we exploit a state-of-the-art scene reconstruction framework to construct the scene point cloud. In the second stage, we utilize the generated scene graphs to refine and improve the reconstructed scene point cloud.
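For illustration, a scene graph can be represented as typed nodes and edges; the following minimal Python sketch (all class and field names are hypothetical) shows the structure the refinement stage would consume:

```python
# Hypothetical scene-graph representation: objects with attributes as
# nodes, pairwise relationships as directed edges.
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    obj_id: int
    label: str                                       # e.g. "chair"
    attributes: list = field(default_factory=list)   # e.g. ["wooden"]

@dataclass
class Relationship:
    subject_id: int
    predicate: str                                   # e.g. "standing on"
    object_id: int

# Example: a wooden chair standing on the floor
objects = [SceneObject(0, "chair", ["wooden"]), SceneObject(1, "floor")]
relations = [Relationship(0, "standing on", 1)]
```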

Prerequisites

  • High motivation to learn and conduct research
  • Good programming skills in Python and PyTorch
  • Basic experience with computer vision

Contact

dong.yang@tum.de

(Please attach your CV and transcript)

Supervisor:

Dong Yang, Xiao Xu

Inverse Rendering in a Digital Twin for Augmented Reality

Keywords:
Digital Twin, Illumination, HDR

Description

The task is to develop an end-to-end pipeline for illumination estimation inside a digital twin.

Finally, an AR application can also be created.
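One common formulation, sketched below under our own assumptions (the architecture and output parameterization are illustrative, not the method of the cited papers), is to regress low-order spherical-harmonics lighting coefficients from an input image:

```python
# Illustrative sketch: a small CNN regressing 2nd-order spherical-
# harmonics (SH) lighting, i.e. 9 coefficients per RGB channel.
import torch
import torch.nn as nn

class SHLightingNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 27)      # 9 SH coefficients x 3 channels

    def forward(self, img):
        f = self.backbone(img).flatten(1)
        return self.head(f).view(-1, 3, 9)

sh = SHLightingNet()(torch.randn(1, 3, 128, 128))   # -> (1, 3, 9)
```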

Possible References

[1] https://arxiv.org/pdf/1905.02722.pdf

[2] https://arxiv.org/pdf/1906.07370.pdf

[3] https://arxiv.org/pdf/2011.10687.pdf

Prerequisites

  • Python (PyTorch)
  • Experience with Git

Contact

driton.salihu@tum.de

Supervisor:

Driton Salihu

An effective GCN for real-time skeleton-based human action detection

Keywords:
Action Recognition, Skeleton Sequence, Graph Convolutional Network, EfficientNet

Description

We aim to build a real-time skeleton-based human action detection system. This work is divided into two parts: first, extracting skeleton information using a skeleton detector such as OpenPose; second, using a graph-based action detection model to perform real-time detection.
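The core operation of such a graph-based model is a graph convolution over the skeleton's joint graph; a minimal PyTorch sketch (layer design and dimensions are illustrative, in the spirit of ST-GCN-style models) is:

```python
# Minimal skeleton graph-convolution layer: mix per-joint features along
# a normalized skeleton adjacency, then transform them linearly.
import torch
import torch.nn as nn

class SkeletonGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, adjacency: torch.Tensor):
        super().__init__()
        a = adjacency + torch.eye(adjacency.size(0))   # add self-loops
        d = a.sum(dim=1).pow(-0.5)                     # degree^(-1/2)
        self.register_buffer("a_norm", d[:, None] * a * d[None, :])
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # x: (batch, joints, in_dim), e.g. 2D keypoints from OpenPose
        return torch.relu(self.lin(self.a_norm @ x))

adj = torch.tensor([[0., 1., 0.],      # toy 3-joint chain 0-1-2
                    [1., 0., 1.],
                    [0., 1., 0.]])
out = SkeletonGCNLayer(2, 16, adj)(torch.randn(4, 3, 2))   # -> (4, 3, 16)
```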

Prerequisites

- High motivation

- Python 3 and PyTorch

- Deep learning basics

Supervisor:

Yuankai Wu

Unsupervised Vital Sign Detection

Description

Accurately detecting vital signs such as heart rate or blood pressure from facial images is a key step towards personal health monitoring at home or remote doctor visits via Zoom. However, vital sign measurements obtained this way are typically very noisy. For the purpose of training models, having clean and ideal vital signs available would be highly beneficial.

In this work, an approach for unsupervised heart rate detection from face videos is investigated. Then, a supervised learning task is supplemented with these unsupervised predictions to make the vital sign detection more robust. For this, state-of-the-art machine learning techniques such as contrastive learning, convolutional neural networks, and autoencoders are investigated. The work will be done externally at CareX.AI, a healthcare company focused on cutting-edge AI concepts for vital sign monitoring.
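As a starting point, a classical unsupervised baseline needs no labels at all: band-pass the mean green-channel intensity of the face region and read the heart rate off the dominant spectral peak. The sketch below is such a baseline under our own simplifying assumptions (a fixed face region, clean 30 fps video), not CareX.AI's method:

```python
# Unsupervised heart-rate baseline: FFT peak of the band-passed
# green-channel trace of the face region.
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_heart_rate(green_trace: np.ndarray, fps: float) -> float:
    """green_trace: mean green value of the face region, one per frame."""
    x = green_trace - green_trace.mean()
    b, a = butter(3, [0.7, 4.0], btype="band", fs=fps)   # 42-240 bpm band
    x = filtfilt(b, a, x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    return 60.0 * freqs[np.argmax(np.abs(np.fft.rfft(x)))]  # peak in bpm

fps = 30.0
t = np.arange(0, 20, 1 / fps)
fake_trace = np.sin(2 * np.pi * 1.2 * t) + 0.5 * np.random.randn(t.size)
print(estimate_heart_rate(fake_trace, fps))              # close to 72 bpm
```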

 

Prerequisites

Basic experience in machine learning and PyTorch

 

Supervisor:

Markus Hofbauer, Christopher Kuhn (christopher@carex.ai, CareX.AI)

Skin Tone Analysis for Vital Sign Detection: Towards More Unbiased AI

Description

Accurately detecting vital signs such as heart rate or blood pressure from facial images is a key step towards personal health monitoring at home or remote doctor visits via Zoom. However, vital sign measurements obtained this way are typically very noisy. For the purpose of training models, having clean and ideal vital signs available would be highly beneficial.

In this work, existing vital sign detection methods are investigated regarding how their performance depends on the skin tone distribution. Methods for synthetically changing the skin tone in face videos, such as Generative Adversarial Networks, need to be tested to allow adjusting existing training sets for optimal performance. The work will be done externally at CareX.AI, a healthcare company focused on cutting-edge AI concepts for vital sign monitoring.

Prerequisites

Basic experience in machine learning and PyTorch

 

Supervisor:

Markus Hofbauer, Christopher Kuhn (christopher@carex.ai, CareX.AI)

Synthetic Vital Sign Generation and Application

Description

Accurately detecting vital signs such as heart rate or blood pressure from facial images is a key step towards personal health monitoring at home or remote doctor visits via Zoom. However, vital sign measurements obtained this way are typically very noisy. For the purpose of training models, having clean and ideal vital signs available would be highly beneficial.

In this work, synthetic vital signs such as the blood volume change are generated using just the blood pressure value. Then, a model is designed to reconstruct these clean signals from the noisy signal estimates obtained from face videos. For this, a range of machine learning, computer vision, and signal processing techniques needs to be investigated and applied. The work will be done externally at CareX.AI, a healthcare company focused on cutting-edge AI concepts for vital sign monitoring.
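To make the idea concrete, the sketch below generates a clean blood-volume-pulse-like waveform for a given heart rate; the two-Gaussian beat shape is purely an illustrative assumption, not CareX.AI's generator:

```python
# Synthetic blood-volume-pulse waveform: one systolic peak plus a smaller
# dicrotic bump per beat, repeated at the requested heart rate.
import numpy as np

def synthetic_bvp(heart_rate_bpm: float, fps: float = 30.0,
                  seconds: float = 10.0) -> np.ndarray:
    t = np.arange(0, seconds, 1.0 / fps)
    period = 60.0 / heart_rate_bpm
    phase = (t % period) / period          # position within each beat, 0..1
    systolic = np.exp(-0.5 * ((phase - 0.25) / 0.08) ** 2)
    dicrotic = 0.4 * np.exp(-0.5 * ((phase - 0.55) / 0.10) ** 2)
    return systolic + dicrotic

clean = synthetic_bvp(72.0)
noisy = clean + 0.3 * np.random.randn(clean.size)   # simulated camera noise
```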

 

Prerequisites

Basic experience in machine learning and PyTorch

 

Supervisor:

Markus Hofbauer, Christopher Kuhn (christopher@carex.ai, CareX.AI)

Removing Personal Priors in Vital Sign Detection

Description

Accurately detecting vital signs such as heart rate or blood pressure from facial images is a key step towards personal health monitoring at home or remote doctor visits via Zoom. One major challenge is the unique shape of every person’s vital signs. Removing the personal component (or prior) from the signal obtained from a person’s face would make detecting the relevant signals easier. In this work, state-of-the-art signal processing and machine learning concepts will be used to investigate how personal priors can be removed from vital signs. The work will be done externally at CareX.AI, a healthcare company focused on cutting-edge AI concepts for vital sign monitoring.

 

Prerequisites

Basic experience in machine learning and PyTorch

 

Supervisor:

Markus Hofbauer, Christopher Kuhn (christopher@carex.ai, CareX.AI)

3D object model extraction during teleoperation

Description

 

Robotic teleoperation is often applied for tasks that robots are not proficient at. During teleoperation, the operator observes the remote scene via a camera. In addition, most robots have a depth sensor, which can be used to extract useful information for future tasks.

 

In this project, we will analyze and test situations where the recorded depth data can be used to extract and reconstruct the 3D model of a novel object grasped and manipulated by the robot.
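The geometric core of this is back-projecting depth pixels into 3D; a minimal numpy sketch with a pinhole camera model (the intrinsics below are placeholders for the robot's calibrated values) is:

```python
# Back-project a metric depth image into a point cloud in the camera frame.
import numpy as np

def depth_to_pointcloud(depth: np.ndarray, fx, fy, cx, cy) -> np.ndarray:
    """depth: (H, W) metric depth image -> (N, 3) 3D points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]        # drop invalid (zero-depth) pixels

depth = np.random.uniform(0.5, 2.0, size=(480, 640))      # placeholder data
cloud = depth_to_pointcloud(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```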

Prerequisites

Necessary and useful background:

- ROS, Python, C++

- Image processing, video processing

 

Additionally:

- Motivation to produce good work

 

Contact

furkan.kaynar@tum.de

 

(Please provide your CV and transcript in your application)

 

 

Supervisor:

Hasan Furkan Kaynar

Implementation of robotic motion planning

Description

The motion planner of a robotic arm must plan the necessary motion under collision avoidance while respecting the joint limits of the robot. In this project, we will focus on motion planning for the Panda robot arm using several planners. We will test the OpenRAVE motion planner and compare it to the MoveIt motion planner.

We will also implement and test Cartesian path planning using methods like the Descartes path planner.

 

At the end of this project, the student will learn about the implementation and usage of different motion and path planners.
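For orientation, a minimal sketch of the MoveIt Python interface for the Panda is shown below ("panda_arm" is the planning group name in the standard panda_moveit_config; adjust it to the actual setup). It plans to a pose goal and then follows a Cartesian path:

```python
# MoveIt Python-interface sketch: pose-goal planning plus a Cartesian path.
import sys
import rospy
import moveit_commander
from geometry_msgs.msg import Pose

moveit_commander.roscpp_initialize(sys.argv)
rospy.init_node("panda_motion_demo")
group = moveit_commander.MoveGroupCommander("panda_arm")

# Pose goal: MoveIt picks a sampling-based planner and executes the plan.
target = Pose()
target.position.x, target.position.y, target.position.z = 0.4, 0.0, 0.4
target.orientation.w = 1.0
group.set_pose_target(target)
group.go(wait=True)
group.stop()
group.clear_pose_targets()

# Cartesian path: straight-line interpolation through waypoints.
waypoint = group.get_current_pose().pose
waypoint.position.z -= 0.1                 # move 10 cm straight down
plan, fraction = group.compute_cartesian_path([waypoint],
                                              eef_step=0.01,
                                              jump_threshold=0.0)
group.execute(plan, wait=True)
```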

 

 

Prerequisites

Useful background:

- Robotic control

- Experience with ROS

 

Necessary background:

- Experience with C++

 

 

Contact

furkan.kaynar@tum.de

 

(Please provide your CV and transcript in your application)

 

 

Supervisor:

Hasan Furkan Kaynar

Change Detection in a Digital Twin

Keywords:
2D Pose Estimation, Nvidia Omniverse, Digital Twin, Point Clouds

Description

The core idea is to detect movements in a pre-defined scene (Digital Twin) and adapt the moving objects in the simulation (Nvidia Omniverse).

In this work, the student would have to:

  • Create a sensor setup
  • Segment and estimate the pose of known objects in the room
  • Introduce dynamics and tracking
  • Update the environment in Nvidia Omniverse

For a research internship, further contributions in the field of pose estimation would have to be made.

Prerequisites

Preferable

  • Experience with Git
  • Python (PyTorch)
  • Nvidia Omniverse

Contact

driton.salihu@tum.de

Supervisor:

Driton Salihu

Multiband evaluation of passive signals for human activity recognition

Keywords:
CSI; HAR; AI
Short Description:
Collect multiband RF samples and classify human activities from CSI

Description

The student must use an RF system to collect samples for different activities, then implement classification algorithms to determine the different activities from CSI or using time-frequency (T-F) transforms.
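A plausible baseline pipeline (all data, window sizes, and labels below are placeholders for the recorded dataset) is to compute a time-frequency representation of the CSI amplitude and feed time-averaged spectra to a simple classifier:

```python
# CSI activity classification sketch: STFT features + linear SVM.
import numpy as np
from scipy.signal import stft
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def tf_features(csi_amplitude: np.ndarray, fs: float = 100.0) -> np.ndarray:
    """csi_amplitude: (time,) amplitude of one subcarrier."""
    _, _, z = stft(csi_amplitude, fs=fs, nperseg=64)
    return np.abs(z).mean(axis=1)          # time-averaged spectrum

# Placeholder data: 40 windows, two activities (e.g. walking vs. sitting)
rng = np.random.default_rng(0)
X = np.stack([tf_features(rng.standard_normal(1000)) for _ in range(40)])
y = np.repeat([0, 1], 20)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="linear").fit(Xtr, ytr)
print(clf.score(Xte, yte))
```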

Prerequisites

 

 

Contact

fabian.seguel@tum.de

Office 2940

Supervisor:

Fabian Esteban Seguel Gonzalez

Optimizing End-to-End Model Performance for Human Activity Detection

Keywords:
Deep Learning, Long-term Temporal Features, Human Activity Detection

Description

In this work, we want to extend an implemented end-to-end model for long-term human activity detection in videos. The goal of this topic is to recognize the current activity by finding the most effective input features and deep learning model structure.

Prerequisites

  • Good programming skills in Python and PyTorch
  • Knowledge of computer vision
  • Basic knowledge of handling sequential data

Contact

Please send me your CV and transcripts.

yuankai.wu@tum.de

Supervisor:

Yuankai Wu

Human-Object Interaction Detection in Videos

Keywords:
Deep Learning, Spatio-temporal feature, Object detection, Transformer.

Description

We identify semantic information about humans and objects in the video via object and human detection, and then classify the interaction with a Transformer model. A public dataset will be used, and we will additionally collect a dataset ourselves (we provide the sensors and code for data collection).

Prerequisites

  • Good programming skills in Python and PyTorch
  • Knowledge of computer vision
  • Basic knowledge of handling sequential data

Contact

yuankai.wu@tum.de

Supervisor:

Yuankai Wu

OCC for indoor positioning and identification in harsh environments

Keywords:
optical camera communications; video tracking
Short Description:
Identify and track multiple light sources in a video stream

Description

Identify and track multiple light sources in a video stream.

The student must record videos with multiple objects to be tracked in the video stream.
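A simple starting point (our own sketch, with a placeholder video path and a deliberately naive tracker) is to threshold for bright blobs per frame and associate detections across frames by nearest neighbor:

```python
# Detect bright light sources per frame and associate them across frames.
import cv2
import numpy as np

def detect_lights(frame_gray: np.ndarray, thresh: int = 240):
    """Return (x, y) centroids of bright blobs in a grayscale frame."""
    _, mask = cv2.threshold(frame_gray, thresh, 255, cv2.THRESH_BINARY)
    n, _, stats, centroids = cv2.connectedComponentsWithStats(mask)
    return [tuple(centroids[i]) for i in range(1, n)          # 0 = background
            if stats[i, cv2.CC_STAT_AREA] > 5]                # ignore specks

cap = cv2.VideoCapture("lights.mp4")       # placeholder video path
prev = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    lights = detect_lights(gray)
    for x, y in lights:
        if prev:                           # naive nearest-neighbor matching
            dists = [(x - px) ** 2 + (y - py) ** 2 for px, py in prev]
            track_id = int(np.argmin(dists))
            # ... update track `track_id` or spawn a new track (omitted)
    prev = lights
cap.release()
```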

Prerequisites

Python

Signal/image processing

 

Contact

fabian.seguel@tum.de

Office 2940

 

Supervisor:

Fabian Esteban Seguel Gonzalez

Sub-band analysis for indoor positioning: extracting robust features

Keywords:
OFDMA; CSI; indoor positioning
Short Description:
Set up a MIMO-OFDMA transmission scheme and collect CSI samples for indoor positioning

Description

The student must set up a transmission scheme based on MIMO technology and OFDMA, then take samples at different points inside an indoor environment and process the signal characteristics to estimate the position of the mobile node.
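A classic processing scheme for this is fingerprinting, sketched below with placeholder data: CSI feature vectors measured at known survey points, with an unknown position estimated as the inverse-distance-weighted average of its k nearest fingerprints:

```python
# k-nearest-neighbor fingerprinting sketch for indoor positioning.
import numpy as np

def knn_position(query: np.ndarray, fingerprints: np.ndarray,
                 positions: np.ndarray, k: int = 3) -> np.ndarray:
    d = np.linalg.norm(fingerprints - query, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-9)              # inverse-distance weights
    return (w[:, None] * positions[idx]).sum(axis=0) / w.sum()

# Placeholder survey: 50 reference points in a 10 m x 10 m room,
# each with a 64-dimensional CSI feature vector.
rng = np.random.default_rng(1)
positions = rng.uniform(0, 10, size=(50, 2))
fingerprints = rng.standard_normal((50, 64))
print(knn_position(fingerprints[7] + 0.01, fingerprints, positions))
```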

Prerequisites

Python

Wireless communications with a focus on channel state information and OFDMA/MIMO systems

Knowledge of USRP is not required but is a plus

Contact

fabian.seguel@tum.de

Office 2940

Supervisor:

Fabian Esteban Seguel Gonzalez

High-level Robotic Teleoperation via Scene Editing

Description

Autonomous grasping and manipulation are complicated tasks which require precise planning and a high level of scene understanding. Although robot autonomy has been evolving for decades, there is still a need for improvement, especially for operating in unstructured environments like households. Human demonstration can further improve autonomous robot abilities to increase task success in different scenarios. In this thesis, we will work on user interaction methods for describing a robotic task by modifying the viewed scene.

 

Prerequisites

Useful background:

- 3D Human-computer interfaces

- Game Design

- Digital signal processing

 

Required abilities:

- Experience with Unity and ROS

- Motivation to produce good work

 

 

Contact

furkan.kaynar@tum.de

 

(Please provide your CV and transcript in your application)

 

 

Supervisor:

Hasan Furkan Kaynar

Robotic grasp learning from human demonstrations

Description

Autonomous grasping and manipulation are complicated tasks which require precise planning and a high level of scene understanding. Although robot autonomy has been evolving for decades, there is still a need for improvement, especially for operating in unstructured environments like households. Human demonstration can further improve autonomous robot abilities to increase task success in different scenarios. In this thesis, we will work on learning from human demonstration to improve robot autonomy.

Prerequisites

Required background:

- Digital signal processing

- Computer vision

- Neural networks and other ML algorithms

 

Required abilities:

- Experience with Python or C++

- Experience with Tensorflow or PyTorch

- Motivation to produce a good thesis

 

 

Contact

furkan.kaynar@tum.de

 

(Please provide your CV and transcript in your application)

 

 

Supervisor:

Hasan Furkan Kaynar

Reinforcement Learning for Estimating Virtual Fixture Geometry to Improve Robotic Manipulation

Description

Robotic teleoperation is often used to accomplish complex tasks remotely with a human in the loop. In cases where the task requires very precise manipulation, virtual fixtures can be used to restrict and guide the motion of the robot's end effector while the person teleoperates. In this thesis, we will analyze the geometry of virtual fixtures depending on the scene and task. We will use reinforcement learning to estimate ideal virtual fixture model parameters. At the end of the thesis, the performance can be evaluated in user experiments.
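As a toy illustration of the learning problem (every modeling choice below is an assumption; a real setup would use a robot simulator and a full RL library), the structure is already visible in a bandit-style loop that learns a good discretized fixture stiffness from noisy task rewards:

```python
# Epsilon-greedy value learning over discretized virtual-fixture stiffness.
import numpy as np

stiffness_levels = np.linspace(0.0, 1.0, 11)   # candidate fixture stiffness
q = np.zeros(len(stiffness_levels))            # estimated value per level
counts = np.zeros(len(stiffness_levels))
rng = np.random.default_rng(0)

def simulated_reward(k: float) -> float:
    # placeholder: best task performance at moderate stiffness, plus noise
    return -(k - 0.6) ** 2 + 0.05 * rng.standard_normal()

for step in range(2000):
    a = rng.integers(len(q)) if rng.random() < 0.1 else int(np.argmax(q))
    r = simulated_reward(stiffness_levels[a])
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]             # incremental mean update

print("learned stiffness:", stiffness_levels[np.argmax(q)])   # approx. 0.6
```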

Prerequisites

Useful background:

- Machine learning (Reinforcement Learning)

- Robotic simulation

 

Requirements:

- Experience with Python and deep learning frameworks (PyTorch / TensorFlow ...)

- Experience with an RL framework

- Motivation to produce a good outcome

 

Contact

(Please provide your CV and transcript in your application)

 

furkan.kaynar@tum.de

diego.prado@tum.de

 

Supervisor:

Hasan Furkan Kaynar, Diego Fernandez Prado

Jacobian Null-Space Energy Dissipation TDPA for Redundant Robots in Teleoperation

Keywords:
Teleoperation, Robotics, Control Theory

Description

Teleoperation Systems

 

Bilateral teleoperation with haptic feedback provides its users with a new dimension of immersion in virtual or remote environments. This technology enables a great variety of applications in robotics and virtual reality, such as remote surgery and industrial digital twins [1]. Figure 1 shows a generalized human-in-the-loop teleoperation system with kinesthetic feedback, where the operator commands a remote/virtual robot to explore the environment and experiences interactive force feedback through the haptic interface.

Teleoperation systems face many challenges caused by unpredictable environment changes, time-delayed feedback, limited network capacity, etc. [2]. These issues inevitably distort the force feedback signal, degrading the transparency and stability of the system. In the past decades, many control algorithms and hardware architectures have been developed to tackle these problems [3].

Time Domain Passivity Approach (TDPA)

 

TDPA is a passivity-based control scheme that ensures the stability of teleoperation systems in the presence of communication delays [4] (see Figure 2). It abstracts two-port networks from the haptic system and observes the energy flow between the networks. The passivity condition is maintained by dissipating the extra energy generated by non-passive networks. The original TDPA suffers from position drift and feedback force jumps [5]; one reason for the position drift is that the energy generated by the delayed communication is dissipated in the task space of the robots.
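To make the mechanism concrete, here is a minimal per-sample passivity observer/controller (PO/PC) sketch for a single port; the sign convention and the impedance-type port are simplifying assumptions, not the exact scheme of the cited works:

```python
# One TDPA control cycle: integrate the port energy (passivity observer);
# if it goes negative, inject just enough damping to dissipate the surplus
# (passivity controller).
def tdpa_step(energy: float, f: float, v: float, dt: float,
              eps: float = 1e-9):
    """Returns (updated energy, possibly damped force)."""
    energy += f * v * dt                   # observed energy at the port
    if energy < 0.0 and abs(v) > eps:      # port turned active
        alpha = -energy / (dt * v * v)     # adaptive damping coefficient
        f = f + alpha * v                  # add damping force
        energy = 0.0                       # surplus dissipated
    return energy, f
```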

 

Jacobian Null-Space for Redundant Robots

 

Many robot mechanisms have redundant degrees of freedom (rDOFs), meaning they have more joints than the number of dimensions of their task or configuration space. The null space of the Jacobian represents these redundant dimensions, which can be exploited to dissipate extra energy by damping the null-space motion without affecting the task space [5].
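As a small numerical illustration (numpy sketch with a random Jacobian; a real robot supplies its own), the projector N = I - pinv(J) J isolates exactly the joint motion that leaves the end effector still:

```python
# Null-space projector: joint velocities projected through N do not move
# the end effector, so damping them dissipates energy outside task space.
import numpy as np

J = np.random.randn(6, 7)                  # 6-DOF task, 7-DOF arm (redundant)
N = np.eye(7) - np.linalg.pinv(J) @ J      # null-space projector

qdot = np.random.randn(7)                  # arbitrary joint velocity
qdot_null = N @ qdot                       # its null-space component
print(np.linalg.norm(J @ qdot_null))       # ~0: no end-effector motion
```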

Your Task and Target

In this work, we aim to improve the performance of TDPA by dissipating the energy generated by time delay and other factors in the Jacobian null-space of kinematically redundant robots. With the help of the Jacobian null-space method, we can avoid dissipating energy in the task space, thereby alleviating position drift and force distortion while keeping the system passive. For more information, see previous work [7-9].

In this master's internship, your work will include:

1. Surveying the related algorithms

2. Constructing the simulation environment

3. Experimenting with the state-of-the-art Jacobian null-space TDPA method

4. Analyzing system passivity in Cartesian task space, joint space, null space, etc.

Prerequisites

Requirements

All of the following are recommended but not mandatory; however, you will need extra effort to catch up if you are unfamiliar with these topics:

1. Basic knowledge of robotics and control theory is favorable.

2. Experience with robotics simulation software and platforms is favorable.

3. C++, Matlab, and Python will be the primary working languages; basic knowledge of one or more of them is highly recommended.

Contact

zican.wang@tum.de

xiao.xu@tum.de

Supervisor:

Zican Wang, Xiao Xu

Learning-based human-robot shared autonomy

Keywords:
robot learning, shared control

Description

In shared-control teleoperation, robot intelligence and human input can be blended to achieve improved task performance and a reduced human workload. In this topic, we would like to investigate how to combine human input and robot intelligence effectively, with the goal of ultimately achieving full robot autonomy. We will employ robot learning from demonstration approaches, where we provide task demonstrations via teleoperation.

We aim to test the developed algorithms in simulation and on a Franka Emika robot arm.
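The simplest form of such blending is linear arbitration between the two command sources, as in this sketch (the confidence model is a placeholder; learning it is part of the topic):

```python
# Arbitrated shared control: shift authority from the human to the robot
# policy as the robot's confidence grows.
import numpy as np

def blend(u_human: np.ndarray, u_robot: np.ndarray,
          robot_confidence: float) -> np.ndarray:
    alpha = np.clip(robot_confidence, 0.0, 1.0)   # 0 = full human authority
    return (1.0 - alpha) * u_human + alpha * u_robot

u = blend(np.array([0.10, 0.00, 0.05]),           # human command
          np.array([0.00, 0.02, 0.04]),           # learned policy command
          robot_confidence=0.3)
```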

Requirements:

Basic experience in C/C++

ROS is a plus

High motivation to learn and conduct research

Supervisor:

Embodied Object Detection in Real Environments

Description

In this project, we investigate embodied (active) object detection in real-world environments. The goal is to actively adapt the path of an autonomous agent to increase the performance of an object detector.

Prerequisites

  • Object detection
  • PyTorch
  • Knowledge of graph optimization

Supervisor:

Analysis of scene graph generation models using the Visual Genome benchmark and self-recorded datasets

Description

Scene graphs were first proposed in [1] as a data structure that describes the object instances in a scene and the relationships between these objects. The nodes of the scene graph represent the detected target objects, whereas the edges denote the detected pairwise relationships. A complete scene graph can represent the detailed semantics of a dataset of scenes.

Scene graph generation (SGG) is a visual detection task for building structured scene graphs. In this work, we would like to compare three classic SGG models: Neural Motifs [2], Graph R-CNN [3], and IMP [4]. We will train and evaluate the models on the Visual Genome dataset and other commonly used datasets. In addition, our own dataset will be annotated and then utilized.
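For comparing the models, the standard SGG metric is triplet Recall@K; a small sketch (matching by label triplet only, whereas real evaluations also check bounding-box overlap) is:

```python
# Triplet Recall@K: fraction of ground-truth (subject, predicate, object)
# triplets found among the K highest-scoring predictions.
def recall_at_k(pred_triplets, scores, gt_triplets, k=50):
    top_k = [t for _, t in sorted(zip(scores, pred_triplets),
                                  reverse=True)[:k]]
    hits = sum(1 for gt in gt_triplets if gt in top_k)
    return hits / max(len(gt_triplets), 1)

gt = [("person", "riding", "horse")]
pred = [("person", "riding", "horse"), ("person", "near", "tree")]
print(recall_at_k(pred, [0.9, 0.8], gt, k=50))    # 1.0
```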

[1]: J. Johnson, et al. "Image retrieval using scene graphs."

[2]: R. Zellers, et al. "Neural motifs: Scene graph parsing with global context."

[3]: J. Yang, et al. "Graph r-cnn for scene graph generation."

[4]: D. Xu, et al. "Scene graph generation by iterative message passing."

Prerequisites

  • Good Programming Skills (Python, GUI design)
  • Knowledge of Ubuntu/Linux and PyTorch
  • Knowledge of computer vision and neural networks
  • Motivation to learn and conduct research

Contact

dong.yang@tum.de

Supervisor:

Dong Yang, Xiao Xu

Generalized Robot Learning from Demonstration

Keywords:
LfD, imitation learning, Franka Arm

Description

In this work, we would like to design a generative model for robot learning from demonstration. We will consider a table-cleaning task, determine varying task conditions, and apply a generative model such that the robot can generalize to novel, unseen conditions. We aim to use both simulation and a Franka arm for testing our algorithms.

Prerequisites

Requirements:

Experience in C/C++

ROS is a plus

High motivation to learn and conduct research

 

Supervisor:

Robotic task learning from human demonstration

Description

Autonomous grasping and manipulation are complicated tasks which require precise planning and a high level of scene understanding. Although robot autonomy has been evolving for decades, there is still a need for improvement, especially for operating in unstructured environments like households. Human demonstration can further improve autonomous robot abilities to increase task success in different scenarios. In this thesis, we will work on learning from human demonstration to improve robot autonomy.

Prerequisites

Required background:

- Digital signal processing

- Computer vision

- Neural networks and other ML algorithms

 

Required abilities:

- Experience with Python or C++

- Experience with Tensorflow or PyTorch

- Motivation to produce a good thesis

 

Contact

furkan.kaynar@tum.de

 

(Please provide your CV and transcript in your application)

 

Supervisor:

Hasan Furkan Kaynar

PCB design based on Embedded TPU and Linux-based SoM

Keywords:
PCB Design, Embedded Linux

Description

In this project, we aim to design a multi-layer PCB integrating an embedded TPU with an ARM processor. After the electronic development, we will capture low-latency RGB data from a camera in sync with an inertial measurement unit (IMU) and use the TPU for AI processing. The first step of the project is to modify the provided embedded SoM and the embedded TPU board to test the communication with the TPU module; afterward, we will finalize the PCB design and develop the board.

Prerequisites

Good PCB design skills in KiCad.

Basic understanding of embedded systems.

Knowledge about ARM processors.

Contact

Please send your application to:

mojtaba.karimi@tum.de

Supervisor:

Mojtaba Karimi, Edwin Babaians

 

You can find important information about writing your thesis and giving a talk at LMT, as well as templates for PowerPoint and LaTeX, here.