Student Projects

Self-Supervised Depth Estimation with Uncertainty

Description

Understanding the three-dimensional world is crucial for operating autonomous systems, as it enables accurate navigation, object recognition, and interaction with the environment. Depth estimation plays a vital role in this context, providing essential information about the spatial structure of the surroundings. Depth completion, which estimates dense depth maps from sparse measurements, has received significant attention in recent years because it provides a more complete understanding of the environment from only a few measured data points.

This project focuses on depth completion with uncertainty estimation to improve the robustness and reliability of the generated depth maps. Students will build upon our current framework and findings and extend the research by exploring recent neural network architectures and experimenting with new training strategies and methods. This project provides an opportunity to advance state-of-the-art visual SLAM with current deep learning techniques.

Prerequisites

- Knowledge of computer vision

- Motivation to learn and conduct research

- Good coding skills in C++ and Python

- Experience in machine learning (PyTorch) is a plus

Contact

xin.su@tum.de

Supervisor:

Xin Su

HPF-SLAM2, Towards a Better Hybrid Point Features SLAM System

Keywords:
Machine Learning, Visual SLAM, Computer Vision

Description

In our previous research work [1], we demonstrated the effectiveness and benefits of using hybrid feature points and corresponding feature matching schemes for visual SLAM (Simultaneous Localization and Mapping). This thesis project aims to further extend this line of research, focusing on one or more of the following key tasks:

- Feature Matching with a Novel Neural Network: The project will involve designing, training, and validating a new neural network model to improve feature-matching performance in visual SLAM systems, aiming for enhanced robustness and accuracy.
- Loop Closure Detection: Development of efficient loop closure detection algorithms to reduce accumulated errors and improve the long-term stability and accuracy of the SLAM system.
- Research on New Feature Extraction Algorithms: Exploration and application of new feature extraction algorithms to study their versatility in the context of our previous work [1], aiming to improve the performance of SLAM systems in various challenging environments.

Through this project, students will have the opportunity to learn cutting-edge solutions in visual SLAM, including the design, training, and validation of neural networks. This experience will not only deepen their understanding of SLAM systems but also equip them with the skills to apply deep learning techniques to real-world robot navigation and localization tasks, laying a solid foundation for future research and engineering practice.

References

[1] Su, Xin; Eger, Sebastian; Misik, Adam; Yang, Dong; Pries, Rastin; Steinbach, Eckehard: HPF-SLAM: An Efficient Visual SLAM System Leveraging Hybrid Point Features. 2024 IEEE International Conference on Robotics and Automation (ICRA), 2024.

[2] Lindenberger, P.; Sarlin, P.-E.; Pollefeys, M.: LightGlue: Local Feature Matching at Light Speed. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2023, pp. 17581-17592.

Prerequisites

- Knowledge of computer vision

- Strong interest in visual SLAM and machine learning, and motivation to learn and conduct research

- Good coding skills in C++ and Python

- Experience in machine learning (PyTorch) is a plus

Contact

Xin Su

xin.su@tum.de

Supervisor:

Xin Su

Vibrotactile Dataset Analysis

Keywords:
Haptics, Sensors, Compression

Description

In this project, the student will analyze a dataset recorded with a vibrotactile glove. The goal is to characterize signal properties such as the correlation between sensors, the variance among samples, and the frequency content.

The context is vibrotactile signal compression. Vibrotactile signals represent vibrations that, in this case, occur while sliding a robot hand over a surface. With more information about the signal characteristics, we can better judge the requirements for codecs that aim to exploit redundancies in the signals.
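As a rough illustration of the three signal properties named above, the following sketch computes the correlation between two sensor channels, the per-channel variance, and the dominant frequency. It is only a minimal sketch under stated assumptions: the two channels are synthetic stand-ins (a hypothetical 50 Hz vibration sampled at 1 kHz), not data from the actual glove, and all function names are illustrative.

```python
import cmath
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Unbiased sample variance of one sensor channel."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long channels."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mx) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def magnitude_spectrum(xs, sample_rate_hz):
    """Naive O(n^2) DFT; returns (frequency_hz, magnitude) for bins 0..n//2."""
    n = len(xs)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(xs))
        spectrum.append((k * sample_rate_hz / n, abs(coeff) / n))
    return spectrum


def dominant_frequency(xs, sample_rate_hz):
    """Frequency of the strongest non-DC bin."""
    return max(magnitude_spectrum(xs, sample_rate_hz)[1:], key=lambda fm: fm[1])[0]


# Synthetic stand-in for two glove sensors (assumed: 1 kHz sampling, 50 Hz vibration).
SAMPLE_RATE_HZ = 1000
times = [i / SAMPLE_RATE_HZ for i in range(200)]
sensor_a = [math.sin(2 * math.pi * 50 * t) for t in times]
sensor_b = [0.5 * v for v in sensor_a]  # attenuated copy of sensor_a

print("correlation:", pearson_correlation(sensor_a, sensor_b))
print("variance a / b:", sample_variance(sensor_a), sample_variance(sensor_b))
print("dominant frequency [Hz]:", dominant_frequency(sensor_a, SAMPLE_RATE_HZ))
```

For real recordings, an FFT routine from NumPy or MATLAB would replace the naive DFT, which is used here only to keep the sketch dependency-free.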

Supervision is possible in German or English.

Prerequisites

Programming knowledge in MATLAB or Python

Supervisor:

Lars Nockenberg

Diffusion Model-based Imitation Learning for Robot Manipulation Task

Description

Diffusion models are powerful generative models that enable many successful applications, such as image, video, and 3D generation from text. They are inspired by non-equilibrium thermodynamics: a Markov chain of diffusion steps slowly adds random noise to the data, and the model learns to reverse this diffusion process to construct the desired data samples from noise.

In this work, we aim to explore the application of diffusion models or their variants to imitation learning and evaluate them on a real Franka robot arm.
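To make the forward (noising) half of the Markov chain described above concrete, here is a minimal sketch in the common DDPM-style formulation, where each step computes x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise. The linear noise schedule, the number of steps, and the three-dimensional "action" vector are assumptions chosen for illustration and are not taken from the project. The learned reverse process is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, linearly spaced (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s): remaining signal power."""
    result, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        result.append(product)
    return result


def forward_step(x_prev, beta, rng):
    """One Markov step: scale the sample down slightly and add Gaussian noise."""
    return [
        math.sqrt(1.0 - beta) * value + math.sqrt(beta) * rng.gauss(0.0, 1.0)
        for value in x_prev
    ]


def diffuse(x_0, betas, rng):
    """Run the full forward chain x_0 -> x_T."""
    x = list(x_0)
    for beta in betas:
        x = forward_step(x, beta, rng)
    return x


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bar = cumulative_alphas(betas)
x_0 = [0.5, -0.2, 0.1]  # hypothetical robot action vector
x_T = diffuse(x_0, betas, rng)

print("signal fraction left after T steps:", alpha_bar[-1])
print("fully noised sample:", x_T)
```

After enough steps almost none of the original signal remains, so x_T is close to pure Gaussian noise. In an imitation-learning setting, a network would be trained to undo these steps, typically conditioned on the robot's observations.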

Prerequisites

Good programming skills (Python, C++)
Knowledge of Ubuntu/Linux/ROS
Motivation to learn and conduct research

Contact

dong.yang@tum.de

(Please attach your CV and transcript)

Supervisor:

Dong Yang

Comparative study of various hand-tracking approaches for Hand-Object Interaction in VR

Description

This thesis evaluates and compares various hand-tracking approaches for hand-object interaction in virtual reality (VR).

Prerequisites

- C++ and Python

- Ideally: experience with the Blender 3D software and with game development using Unreal Engine.

Supervisor:

Rahul Chaudhari

text2anim: 3D Animation of Animal Figurines based on Natural Language Commands

Description

This topic is about translating natural-language user commands, e.g., "[animal] shakes off water as if it's just rained", into animations of animal figurines.

Prerequisites

Interest in and initial experience with computer graphics, Blender, and Python.

Supervisor:

Rahul Chaudhari

Radar-based Material Classification

Keywords:
signal processing, machine learning, material classification

Description

This work focuses on radar-based material classification. With the rapid development of autonomous driving, drones, home robots, and smart devices in recent years, material sensing has received growing attention. Millimeter-wave radar is widely installed on these platforms because of its low price and robustness in harsh environments. We will therefore study methods for classifying common indoor materials from millimeter-wave radar signals.

We will collect radar signals from common indoor materials such as wood, metal, and glass. After extracting the required features with radar signal processing methods, we will classify the materials with machine learning algorithms.

Prerequisites

Programming in Python

Knowledge of machine learning

Knowledge of signal processing, particularly radar signal processing

Contact

mengchen.xiong@tum.de

(Please attach your CV and transcript)

Supervisor:

Mengchen Xiong

Equivariant 3D Object Detection

Keywords:
3D Object Detection, Computer Vision, Deep Learning, Indoor Environments

Description

The thesis focuses on the application of equivariant deep learning techniques to 3D object detection in indoor scenes. Indoor environments, such as homes, offices, and industrial settings, present unique challenges for 3D object detection due to diverse object arrangements, varying lighting conditions, and occlusions. Traditional methods often struggle with these complexities, leading to suboptimal performance. The motivation for this research is to enhance the robustness and accuracy of 3D object detection in these environments by leveraging the inherent advantages of equivariant deep learning. This approach aims to improve the model's ability to recognize objects regardless of their orientation and position in the scene, which is crucial for applications in robotics or augmented reality.

The thesis proposes the development of a deep learning model that incorporates equivariant neural networks for 3D object detection, such as the equivariant framework proposed in [1]. The proposed model will be evaluated on a benchmark 3D indoor dataset, such as the Stanford 3D Indoor Spaces Dataset (S3DIS) or the ScanNet dataset [2, 3].
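To illustrate what rotation equivariance means in practice, the sketch below checks the property numerically for a linear layer that mixes channels of 3D vector features, in the spirit of the Vector Neurons framework [1]: rotating the input and then applying the layer gives the same result as applying the layer and then rotating the output. The feature values, weights, and helper names are made up for illustration, and the sketch is not the detection model proposed in this thesis. A layer with an additive bias is included as a counterexample that breaks the property.

```python
import math


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def rotation_z(theta):
    """3x3 rotation matrix about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def vector_linear(weights, features):
    """Mix channels only: (C_out x C_in) @ (C_in x 3) -> (C_out x 3)."""
    return matmul(weights, features)


def add_bias(features, bias):
    """Add the same 3D offset to every channel (not rotation-equivariant)."""
    return [[x + b for x, b in zip(row, bias)] for row in features]


def max_abs_diff(a, b):
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


features = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.3]]     # 2 channels, each a 3D vector
weights = [[0.2, -0.7], [1.5, 0.4], [-0.3, 0.9]]   # 3 output channels, 2 input channels
rot = rotation_z(0.8)

# Equivariant layer: rotate-then-apply equals apply-then-rotate.
rotate_then_layer = vector_linear(weights, matmul(features, rot))
layer_then_rotate = matmul(vector_linear(weights, features), rot)
equivariance_error = max_abs_diff(rotate_then_layer, layer_then_rotate)

# Counterexample: a fixed bias vector does not rotate with the input.
bias = [1.0, 0.0, 0.0]
biased_error = max_abs_diff(
    add_bias(matmul(features, rot), bias),
    matmul(add_bias(features, bias), rot),
)

print("linear layer error:", equivariance_error)
print("bias layer error:", biased_error)
```

The first error is at the level of floating-point round-off, while the second is clearly nonzero. This is why equivariant architectures restrict themselves to operations that commute with rotations.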

References

[1] Deng, Congyue, et al. "Vector Neurons: A General Framework for SO(3)-Equivariant Networks." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.

[2] Dai, Angela, et al. "ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.

[3] Straub, Julian, et al. "The Replica Dataset: A Digital Replica of Indoor Spaces." arXiv preprint arXiv:1906.05797 (2019).

Prerequisites

Python and Git
Experience with a deep learning framework (PyTorch, TensorFlow)
Interest in computer vision and machine learning

Supervisor:

Adam Misik

3D Hand-Object Reconstruction from monocular RGB images

Keywords:
Computer Vision, Hand-Object Interaction

Description

Understanding human hand and object interaction is fundamental for meaningfully interpreting human action and behavior.

With the advent of deep learning and RGB-D sensors, pose estimation of isolated hands or objects has made significant progress.

However, despite a strong link to real applications such as augmented and virtual reality, joint reconstruction of hand and object has received comparatively little attention.

This task focuses on accurately reconstructing hand-object interactions in three-dimensional space, given a single RGB image.

Prerequisites

Programming in Python
Knowledge of deep learning
Knowledge of PyTorch

Contact

xinguo.he@tum.de

Supervisor:

Xinguo He

Intuitive Teleoperation and Behavior Understanding

Description

You will build on our previous development and implement a demo for real-time human behavior understanding and prediction.

Task 1: dataset generation

Task 2: pipeline and demo implementation

Task 3: algorithm development

Requirements: knowledge of YOLO, OpenCV, and MediaPipe; programming in Python or C++

Supervisor:

Xiao Xu

HiWi Position Project Lab Human Activity Understanding

Keywords:
deep learning, ROS, RealSense, Python

Description

A HiWi (student assistant) position is available for the lab course Human Activity Understanding.

The position comes with a contract of 6 hours per week.

The lab involves:

Practical sessions, where the students collect data from a color/depth sensor setup.
Notebook sessions, where the students are introduced to a Jupyter notebook with brief theoretical content and homework.
Project sessions, where the students work on their own projects.

The main tasks of this position are:

Helping students with data collection in the practical and project sessions.
Assisting during the notebook sessions with the contents of the notebooks and the homework.

Prerequisites

Knowledge of ROS.
Knowledge of Python.
Basic knowledge of deep learning.

Contact

marsil.zakour@tum.de

Supervisor:

Marsil Zakour

Leveraging Deep Learning in Visual SLAM

Keywords:
Machine Learning, Computer Vision

Description

Vision-based Simultaneous Localization and Mapping (Visual SLAM) is an intensively researched topic in computer vision. It is an essential tool for many applications, such as extended reality, digital twins, and autonomous driving.

Existing methods rely on hand-crafted visual features such as ORB [1]. However, deep-learning-based features have been proposed that show significant improvements in computer vision tasks [2][3]. The main task of this work is to integrate deep features into the existing framework while addressing the challenge of efficiency. Students will develop new neural networks based on our previous work, integrate them into Visual SLAM, and evaluate their performance. A paper publication is possible and planned.

For further details, please do not hesitate to reach out to me.

Some References:

[1] https://arxiv.org/abs/1610.06475

[2] https://arxiv.org/abs/1712.07629

[3] https://arxiv.org/abs/1911.11763

Prerequisites

- Background in computer vision

- Strong interest in visual SLAM and motivation to learn and conduct research

- Good coding skills in C++ and Python

- Experience in machine learning (PyTorch) is a plus

Contact

Please send your CV and transcript of records to:

xin.su@tum.de

Supervisor:

Xin Su

Student Projects at the Chair of Media Technology

Open Projects

BA, MA, FP: Self-Supervised Depth Estimation with Uncertainty
MA, FP: HPF-SLAM2, Towards a Better Hybrid Point Features SLAM System
IP: Vibrotactile Dataset Analysis
MA, FP: Diffusion Model-based Imitation Learning for Robot Manipulation Task
BA, IDP: Comparative study of various hand-tracking approaches for Hand-Object Interaction in VR
BA, IDP: text2anim: 3D Animation of Animal Figurines based on Natural Language Commands
MA, FP: Radar-based Material Classification
MA: Equivariant 3D Object Detection
MA, FP: 3D Hand-Object Reconstruction from monocular RGB images
SHK: Intuitive Teleoperation and Behavior Understanding
SHK: HiWi Position Project Lab Human Activity Understanding
MA, FP: Leveraging Deep Learning in Visual SLAM