Ongoing Thesis

Bachelor's Theses

Development of a concept for fully automating the 3D scanning of all BMW sites

Description

Development of a concept for fully automating the 3D scanning of all BMW sites

Supervisor:

Michael Adam - Thomas Irrenhauser (BMW)

Evaluation of Inverse Rendering using Multi-View RGB-D data

Description

  • The core idea is to detect the illumination in a pre-defined scene (Digital Twin) and adapt the moving objects in the simulation accordingly.

    In this work, the student would have to:

    • Create a sensor setup
    • Perform inverse rendering
    • Demonstrate lighting changes in the room
    • Estimate novel views

Prerequisites

Preferable

  • Experience with Git
  • Python (Pytorch)
  • Nvidia Omniverse

Contact

driton.salihu@tum.de

Supervisor:

Driton Salihu

Multiband evaluation of passive signals for human activity recognition

Keywords:
CSI; HAR; AI
Short Description:
To obtain samples

Description

The student must use an RF system to collect samples for different activities and implement classification algorithms to distinguish the activities from CSI, either directly or using time-frequency (T-F) transforms.
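As an illustration of the T-F route, here is a minimal sketch (function names and parameters are hypothetical, not part of the topic): compute a magnitude spectrogram of a one-dimensional CSI amplitude stream and classify it against per-activity mean spectra. A real pipeline would operate on the full CSI matrix and use a learned classifier.

```python
import numpy as np

def spectrogram(x, win=64, hop=32):
    """Magnitude STFT of a 1-D CSI amplitude stream (frames x freq bins)."""
    frames = [x[i:i + win] * np.hanning(win)
              for i in range(0, len(x) - win + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1))

def nearest_centroid(train_specs, labels, query_spec):
    """Classify by Euclidean distance to the per-activity mean spectrum."""
    centroids = {}
    for lab in set(labels):
        feats = [s.mean(axis=0) for s, l in zip(train_specs, labels) if l == lab]
        centroids[lab] = np.mean(feats, axis=0)
    q = query_spec.mean(axis=0)
    return min(centroids, key=lambda lab: np.linalg.norm(centroids[lab] - q))
```

Activities with different motion rates concentrate energy in different Doppler/frequency bands, which is what the centroid comparison exploits.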

Prerequisites


Contact

fabian.seguel@tum.de

Office 2940

Supervisor:

Fabian Esteban Seguel Gonzalez

Multiband analysis of mm-Waves for passive patient monitoring

Keywords:
CSI; HAR; AI
Short Description:
To obtain samples

Description

The student must use a radar system to obtain the vital signs of a patient.
The system must be embedded in a hospital bed.
Vital signs such as breathing rate and heart rate (HR) will be targeted; other applications can be discussed.

Prerequisites


Contact

fabian.seguel@tum.de

Office 2940

Supervisor:

Fabian Esteban Seguel Gonzalez

Learning-based human-robot shared autonomy

Keywords:
robot learning, shared control

Description

In shared control teleoperation, robot intelligence and human input can be blended to achieve improved task performance and a reduced human workload. In this topic, we would like to investigate how to combine human input and robot intelligence effectively in order to eventually achieve full robot autonomy. We will employ robot learning from demonstration approaches, where task demonstrations are provided via teleoperation.

We aim to test the developed algorithms in simulation and on a Franka Emika robot arm.

Requirements:

Basic experience in C/C++

ROS is a plus

High motivation to learn and conduct research

Supervisor:

Maximizing success on the streaming platform YouTube

Description

...

Supervisor:

Haptic data reduction for a position-position teleoperation control architecture

Keywords:
teleoperation control, haptics

Description

Using a teleoperation system with haptic feedback, users can truly immerse themselves in a distant environment, i.e., modify it and execute tasks without being physically present but with the feeling of being there. A typical teleoperation system with haptic feedback (referred to as a teleoperation system) comprises three main parts: the human operator (OP)/master system, the teleoperator (TOP)/slave system, and the communication link/network in between. During teleoperation, the slave and master devices exchange multimodal sensor information over the communication link. This work aims to develop a haptic data reduction scheme based on a position-position (PP) teleoperation architecture and compare its performance with the classical position-force (PF) control architecture.

 

Your work:

(1) build up a teleoperation system that can switch between position-position and position-force architectures.

(2) integrate the existing haptic data reduction scheme with the PP architecture.

(3) introduce delays, and implement an existing passivity-based control scheme to ensure system stability

(4) compare the performance difference between the PF and PP architectures.
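A common starting point in the haptic data reduction literature is the perceptual deadband approach: a sample is transmitted only when it differs from the last transmitted value by more than a Weber-fraction threshold. A minimal sketch (function and parameter names are illustrative, not the scheme used at the chair):

```python
def deadband_reduce(samples, k=0.1):
    """Transmit a haptic sample only when it deviates from the last
    transmitted value by more than the Weber fraction k (perceptual deadband)."""
    transmitted = []
    last = None
    for s in samples:
        if last is None or abs(s - last) > k * abs(last):
            transmitted.append(s)
            last = s
    return transmitted
```

On the receiving side, untransmitted samples are typically reconstructed by holding the last received value or by a passivity-preserving interpolation.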

Prerequisites

C++, MATLAB/Simulink

Supervisor:

Optimization of Saliency Map Creation

Keywords:
Saliency maps, deep learning, computer vision

Description

Saliency maps can be interpreted as probability maps that assess a scene's attractiveness and highlight the regions a user is likely to look at. The objective of this thesis is to help create a novel dataset that records the head motions and gaze directions of participants watching 360° videos with varying dynamics in the scene. This dataset is then to be used to improve state-of-the-art saliency map creation algorithms and make them soft realtime-capable. Deep learning has proven to be a robust technique for creating saliency maps. The student is supposed to either use pruning techniques to boost the performance of state-of-the-art methods or to develop their own approach that delivers a trade-off between accuracy and computational complexity.

Prerequisites

Computer Vision, Machine Learning, C++, Python

Supervisor:

Master's Theses

Global Camera Localization in Lidar Maps

Keywords:
Contrastive Learning, Localization, Camera, Lidar, Point Clouds

Description

Visual localization is a fundamental problem in computer vision that applies to applications such as robotics, autonomous driving, or augmented reality. A common approach to visual localization is based on matching 2D features of an image query with points in a previously acquired 3D map [1]. A shortcoming of this approach is the viewpoint dependency of the query features, which leads to poor results when the viewing angle between the query and the map varies. Other effects, such as photometric inconsistencies, limit the potential of using 2D image features for localization in a 3D map.

 

Recently, localization based on point-cloud-to-point-cloud matching has been introduced [2,3]. Once a map is created using a 3D sensor such as lidar, the device can be localized by directly matching a query lidar point cloud with a previously created global map. The advantage of this approach is the higher robustness against viewpoint variations and the direct depth information available in both the 3D map and 3D query [1].

 

A common assumption in visual localization approaches is that the same modality is used for both query and map generation. However, this is often not the case, especially for devices used for robotics and augmented reality. In order to perform point cloud-based localization for these common cases, cross-source point cloud retrieval and registration must be addressed. In this work, such an approach is investigated.
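The rigid-registration step underlying such point cloud matching can be sketched with the Kabsch/SVD solution, here assuming known correspondences; in practice, ICP alternates this step with nearest-neighbor matching, and learned descriptors as in [2,3] provide the initial retrieval. All names below are illustrative.

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid transform (R, t) with Q ~ P @ R.T + t,
    given known point correspondences (the inner step of an ICP loop)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp
```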

 

References

 

[1] T. Caselitz, B. Steder, M. Ruhnke, and W. Burgard, “Monocular camera localization in 3D LiDAR maps,” in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Oct. 2016, pp. 1926–1931. doi: 10.1109/IROS.2016.7759304.

 

[2] J. Du, R. Wang, and D. Cremers, “DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization,” in Computer Vision – ECCV 2020, vol. 12349, A. Vedaldi, H. Bischof, T. Brox, and J.-M. Frahm, Eds. Cham: Springer International Publishing, 2020, pp. 744–762. doi: 10.1007/978-3-030-58548-8_43.

 

[3] J. Komorowski, M. Wysoczanska, and T. Trzcinski, “EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale.” arXiv, Oct. 24, 2021. Accessed: Oct. 31, 2022. [Online]. Available: http://arxiv.org/abs/2110.12486


Prerequisites

  • Python and Git
  • Experience with SLAM
  • Experience with a deep learning framework (Pytorch, Tensorflow)
  • Interest in Computer Vision and Machine Learning

Contact

Please send your CV and Transcript of Records to:

adam.misik@tum.de

Supervisor:

Adam Misik

Deep learning based collaborative task prediction and execution for the mobile Yumi robot

Keywords:
Human-robot collaboration, human-object interaction, Deep Learning, assistive robot

Description

The participation of assistive robots in current industrial, as well as daily life scenarios, is increasing significantly. To better enable robots to perform the tasks that people are performing, or intend to perform, we want to investigate a deep learning-based vision solution. 

Prerequisites

- Deep Learning 

- Python3, Pytorch

- Computer Vision

- C++ and ROS

 

Contact

- Yuankai Wu

- Arne-Christoph Hildebrandt (ABB (Asea Brown Boveri))

Supervisor:

Yuankai Wu - Arne-Christoph Hildebrandt (ABB (Asea Brown Boveri))

Modeling and Generation of Interactive Stories for Children

Description

Conventional stories for children of ages 3—6 years are static, independent of the medium (text, video, audio). We aim to make stories interactive, by giving the user control over characters, objects, scenes, and timing. This will lead to the construction of novel, unique, and personalized stories situated in (partially) familiar environments. We restrict this objective to specific domains consisting of a coherent body of works, such as the children’s book series “Meine Freundin Conni”. The challenges in this thesis include finding a suitable knowledge representation for the domain, learning that representation automatically, and inferring a novel storyline over that representation with active user interaction. In this direction, both neural as well as symbolic approaches should be explored.

Prerequisites

- Familiarity with AI incl. Machine Learning (university courses / practical experience)

- Programming in C++ / Java / Python

Supervisor:

Rahul Chaudhari

Real-time multiple human tracking with camera-radar fusion for safe autonomous transportation

Keywords:
Deep Learning, data fusion, multi-human tracking

Description

This work aims to develop low-cost real-time 3D multiple object tracking frameworks based on camera-radar fusion for multiple people tracking. The purpose is to ensure the safety of AGV navigations in industrial autonomous transportation scenarios, most importantly, to avoid collisions with people.
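One classical building block for camera-radar fusion is inverse-covariance (information-form) weighting of the two position measurements, exploiting that camera and radar are accurate in complementary directions. A minimal numpy sketch (names and covariance values are illustrative assumptions, not the project's final design):

```python
import numpy as np

def fuse(z_cam, R_cam, z_radar, R_radar):
    """Information-form fusion of two measurements of the same person's
    position: each sensor is weighted by its inverse measurement covariance."""
    W_cam, W_rad = np.linalg.inv(R_cam), np.linalg.inv(R_radar)
    P = np.linalg.inv(W_cam + W_rad)            # fused covariance
    z = P @ (W_cam @ z_cam + W_rad @ z_radar)   # fused position
    return z, P
```

In a full tracker, this update would sit inside a Kalman filter with per-person tracks and data association.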

Prerequisites

- Basic knowledge of computer vision

- Basic knowledge of deep learning-based tracker model

- Python 3, PyTorch

- C++

Contact

Yuankai Wu (yuankai.wu@tum.de)

Sophie Roscher (sophie.roscher@sick.de)

Supervisor:

Yuankai Wu - Sophie Roscher (SICK AG)

Semi-automatic Object Annotation for Human Activity Datasets

Keywords:
Object annotation, semi-automatic, object detection, object tracking, optical flow

Description

Most current human activity datasets focus on end-to-end models, i.e., they directly use RGB images as input and ignore the semantic information of human-object interaction. On the one hand, such models are easier to build; on the other hand, annotating the objects would require a huge amount of work and expense. This work focuses on annotating objects in a human activity dataset to the extent that our object detection models and human activity prediction models can provide valid semantic information.

Prerequisites

  • Very high motivation
  • Computer Vision Fundamentals
  • Programming in Python
  • Basic understanding of deep learning

Contact

yuankai.wu@tum.de

Supervisor:

Yuankai Wu

Incorporating AoA information to MPDC fingerprinting for indoor localization

Keywords:
indoor positioning; fingerprinting; ULA

Description

Fuse AoA information with multipath delay components for robust indoor localization.
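Fingerprinting of this kind often reduces to nearest-neighbor matching over a joint feature vector; a minimal sketch (function names and the scaling weight are illustrative assumptions):

```python
import numpy as np

def feature(delays, aoa_deg, w=0.05):
    """Stack the multipath delay profile with a scaled AoA estimate;
    w balances the different units of the two feature parts."""
    return np.concatenate([np.asarray(delays, float), [w * aoa_deg]])

def locate(db_features, db_positions, query):
    """1-NN fingerprint matching: return the reference position whose
    stored feature vector is closest to the query."""
    d = [np.linalg.norm(query - f) for f in db_features]
    return db_positions[int(np.argmin(d))]
```

In practice the weight would be tuned (or learned) so that neither the delay nor the AoA component dominates the distance.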

Contact

fabian.seguel@tum.de

Office 2940

Supervisor:

Fabian Esteban Seguel Gonzalez

Optimizing network deployment for AI-aided AoA estimation

Keywords:
AI; Network deployment; Indoor positioning

Description

The student will optimize the network architecture through simulations in order to obtain a robust positioning estimate in an indoor setup.


Prerequisites

MATLAB

AI/ML

 

Contact

fabian.seguel@tum.de

Office 2940

 

Supervisor:

Fabian Esteban Seguel Gonzalez

Reinforcement Learning for Estimating Virtual Fixture Geometry to Improve Robotic Manipulation

Description

Robotic teleoperation is often used to accomplish complex tasks remotely with human-in-the-loop. In cases, where the task requires very precise manipulation, virtual fixtures can be used to restrict and guide the motion of the end effector of the robot while the person teleoperates. In this thesis, we will analyze the geometry of virtual fixtures depending on the scene and task. We will use reinforcement learning to estimate ideal virtual fixture model parameters. At the end of the thesis, the performance can be evaluated with user experiments.

Prerequisites

Useful background:

- Machine learning (Reinforcement Learning)

- Robotic simulation

 

Requirements:

- Experience with Python & Deep learning frameworks (PyTorch / Tensorflow...)

- Experience with a RL framework

- Motivation to yield a good outcome

 

Contact

(Please provide your CV and transcript in your application)

 

furkan.kaynar@tum.de

diego.prado@tum.de

 

Supervisor:

Hasan Furkan Kaynar, Diego Fernandez Prado

3D ground truth capture for hand-object interactions

Description

The Dex-YCB dataset from NVIDIA and University of Washington provides visual data on several instances of human hands grasping physical objects. This topic is about augmenting the above dataset with ground-truth information from additional sensors, such as a VR hand-glove and contact sensors.

Apart from ground truth for benchmarking purely visual algorithms, the auxiliary sensors can be fused together with visual data for improved reconstruction of hand-object interactions.

Reference: https://dex-ycb.github.io/

Prerequisites

  • Interest and experience in working with hardware -- multiview cameras, VR gloves, self-made embedded sensors.
  • Familiarity with ROS (www.ros.org)
  • C++ for data acquisition and Python/C++ for data processing

Contact

https://www.ei.tum.de/lmt/team/mitarbeiter/chaudhari-rahul/

(Please provide your CV and transcript in your application)

Supervisor:

Rahul Chaudhari

Network Aware Imitation Learning

Keywords:
Teleoperation, Learning from Demonstration

Description

In this thesis, we would like to make the best of varying-quality demonstrations provided via teleoperation. We will use different teleoperation setups to test the developed robot learning approaches.

Prerequisites

Requirements:

Experience in C/C++

ROS is a plus

High motivation to learn and conduct research

 

Supervisor:

Depth data analysis for investigating robotic grasp estimates

Description

Many robotic applications are based on computer vision, which relies on the sensor output. A typical example is semantic scene analysis, with which the robot plans its motions. Many computer vision algorithms are trained in simulation, which may or may not represent the actual sensor data realistically. Physical sensors are imperfect, and their erroneous output data may deteriorate the performance of the required tasks. In this thesis, we will analyze the sensor data and estimate its effects on the final robotic task.

Prerequisites

Required background:

- Digital signal processing

- Image analysis / Computer vision

- Neural networks and other ML algorithms

 

Required abilities:

- Experience with Python or C++

- Experience with Tensorflow or PyTorch

- Motivation to yield a good thesis

 

Contact

furkan.kaynar@tum.de

 

(Please provide your CV and transcript in your application)

 

Supervisor:

Hasan Furkan Kaynar

Hand Pose Estimation Using Multi-View RGB-D Sequences

Keywords:
Hand Object Interaction, Pose Estimation, Deep Learning

Description

In this project, the task is to fit a parametric hand mesh model and a set of rigid objects to a sequence from multi-view RGB-D cameras. Models for hand key-point detection and 6DoF pose estimation of rigid objects have significantly evolved in recent years. Our goal is to utilize such models to estimate the hand and object poses.

Related Work

  1. https://dex-ycb.github.io/
  2. https://www.tugraz.at/institute/icg/research/team-lepetit/research-projects/hand-object-3d-pose-annotation/
  3. https://github.com/hassony2/obman
  4. https://github.com/ylabbe/cosypose

Prerequisites

  • Knowledge in computer vision.
  • Experience with segmentation models (e.g., Detectron2)
  • Experience with deep learning frameworks PyTorch or TensorFlow (2.x).
  • Experience with PyTorch3D is a plus.

Contact

marsil.zakour@tum.de

Supervisor:

Marsil Zakour

Attentive observation using intensity-assisted segmentation for SLAM in a dynamic environment

Keywords:
SLAM, ROS, Deep Learning, Segmentation

Description

Attentive observation using intensity-assisted segmentation for SLAM in a dynamic environment.

 

Supervisor:

Illumination of Augmented Reality Content using a Digital Environment Twin

Description

...

Supervisor:

Classification of wafer patterns using machine learning methods to automate pattern recognition

Description

...

Supervisor:

Solid-State LiDAR and Stereo-Camera based SLAM for unstructured planetary-like environments

Keywords:
Solid-State LiDAR; Stereo-Camera; SLAM

Description

New developments in solid-state LiDAR technology open up the possibility of integrating range sensors into space-qualifiable perception setups, thanks to mechanical designs with fewer movable parts. A hybrid stereo-camera/LiDAR sensor setup might thereby overcome the disadvantages each technology comes with, such as the limited range of stereo camera setups or the minimum range LiDARs need. This thesis investigates the possibilities of such a new solid-state LiDAR by incorporating it, along with a stereo camera setup and an IMU sensor, into a SLAM system. Foreseen activities might include, but are not limited to, the design and construction of a portable/handheld sensor setup for recording and testing in planetary-like environments, extrinsic calibration of the sensors, integration into a software pipeline, development of a ROS interface, and preliminary mapping tests.

Supervisor:

Mojtaba Karimi - (German Space Agency (DLR))

Closing the sim-to-real gap using haptic teleoperation

Keywords:
Liquid Pouring, Skill Refinement, Sim-to-Real Gap

Description

Simulation is a good way to investigate complex behaviors for service robots. Since we cannot consider all the characteristics of the real-world scenario, a sim-to-real adaptation step is needed to fine-tune the skill learned in simulation. In this project, the scenario is the liquid pouring task, which we have already learned in a simulation environment. We will tackle the sim-to-real adaptation problem using haptic communication: an expert demonstrates the skill for additional time, and the robot uses this expert knowledge to fine-tune the skill learned in simulation.

Prerequisites

Be familiar with skill refinement techniques

Be familiar with haptic teleoperation

Strong background in C++

Strong background in Python

 

 

Contact

edwin.babaians@tum.de

Supervisor:

Edwin Babaians

Deep Predictive Attention Controller for LiDAR-Inertial localization and mapping

Keywords:
SLAM, Sensor Fusion, Deep Learning

Description

Multidimensional sensory data is computationally expensive for localization algorithms in autonomous drone navigation. Research shows that not all sensory data is equally important during the SLAM process for producing a reliable output. An attention control scheme is one effective way to filter out the most valuable sensory data for such a system. A predictive attention model, for instance, can help us improve the result of sensor fusion algorithms by concentrating on the most valuable sensory data based on the dynamics of the vehicle motion or a semantic understanding of the environment. The aim of this work is to investigate state-of-the-art attention control models that can be adapted to the multidimensional sensory data acquisition system and compare them across different modalities.

Prerequisites

- Strong background in Python and C++ programming

- Solid background in robot control theory

- Be familiar with deep learning frameworks (Tensorflow)

- Be familiar with the robot operating system (ROS)

Contact

leox.karimi@tum.de

Supervisor:

Model based Collision Identification for Real-Time Jaco2 Robot Manipulation 

Keywords:
ROS, Haptics, Teleoperation, Jaco2

Description

By the advancement of robotics and communication networks such as 5G, telemedicine has become a critical application for remote diagnosis and treatment.

In this project, we want to perform robotic teleoperation using a Sigma 7 haptic master and a Jaco 2 robotic manipulator.

 Tasks:

  • State of the art review and mathematical modeling
  • Jaco2 haptic controller implementation
  • Fault-tolerant (delay, network disconnect) controller design 
  • System evaluation with external force-torque sensor

Prerequisites

  • Strong background in C++ programming
  • Solid background in control theory 
  • Be familiar with robot dynamics and kinematics 
  • Be familiar with the robot operating system (ROS) and ROS Control (Optional)

Contact

edwin.babaians@tum.de

Supervisor:

Edwin Babaians

Interdisciplinary Projects

Point Cloud Completion using Generative Adversarial Networks

Keywords:
GAN, Point Cloud Completion

Description

To improve the quality of sparse/occluded point clouds, the problem of scan completion needs to be addressed. In recent years, deep generative models have shown promising completion capabilities in the 2D domain [1].

For the 3D domain, generative adversarial networks have been introduced to address the scan completion problem [2, 3, 4, 5]. In this IDP, we will explore GAN architectures for the completion of point clouds. First, we will present a comprehensive literature review. Our review will outline the current state of the art of GAN architectures for point cloud completion along with the datasets and metrics used. Second, we will propose suggestions for improving the current state of the art. Finally, the proposals will be implemented and discussed.

 

References

[1] Haitian Zheng, Zhe Lin, Jingwan Lu, Scott Cohen, Eli Shechtman, Connelly Barnes, Jianming Zhang, Ning Xu, Sohrab Amirghodsi, & Jiebo Luo (2022). CM-GAN: Image Inpainting with Cascaded Modulation GAN and Object-Aware Training. arXiv preprint arXiv:2203.11947.

[2] Zhang, Junzhe, et al. "Unsupervised 3D Shape Completion through GAN Inversion." arXiv:2104.13366, arXiv, 29 Apr. 2021. http://arxiv.org/abs/2104.13366.

[3] Wang, Weiyue, et al. "Shape Inpainting Using 3D Generative Adversarial Network and Recurrent Convolutional Networks." 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2317-25. https://doi.org/10.1109/ICCV.2017.252.

[4] Smith, Edward J., and David Meger. "Improved Adversarial Systems for 3D Object Generation and Reconstruction." Proceedings of the 1st Annual Conference on Robot Learning, PMLR, 2017, pp. 87-96. https://proceedings.mlr.press/v78/smith17a.html.

[5] Ibing, Moritz, et al. "3D Shape Generation with Grid-Based Implicit Functions." 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2021, pp. 13554-63. https://doi.org/10.1109/CVPR46437.2021.01335.

Prerequisites

  • Python and Git
  • Experience with a deep learning framework (Pytorch, Tensorflow)
  • Interest in Computer Vision and Machine Learning

Contact

Please send your CV and Transcript of Records to:

adam.misik@tum.de

Supervisor:

Adam Misik

Impact of Instance Segmentation Modality on the Accuracy of Action Recognition Models from Ego-Perspective Views

Description

The goal of this project is to use interactive segmentation methods to collect data for instance segmentation models and then analyze the impact of the instance segmentation modality on the performance of action detection networks.

 

 

Contact

marsil.zakour@tum.de

Supervisor:

Marsil Zakour

Extension of an Open-source Autonomous Driving Simulation for German Autobahn Scenarios

Description

This work can be done in German or English in a team of 2-4 members.
Self-driving cars need to be safe in the interaction with other road users such as motorists, cyclists, and pedestrians. But how can car manufacturers ensure that their self-driving cars are safe around us humans? The only realistic and economical way to test this is simulation.
cogniBIT is a Munich-based startup founded by alumni of TUM and LMU that provides realistic models of all kinds of road users. These models are based on state-of-the-art neurocognitive and sensorimotor research and reproduce human perception, cognition, and action with all their limitations.
In this project, the objective is to extend the open-source simulator CARLA (www.carla.org) such that German Autobahn-like scenarios can be simulated.

Tasks:
•    Design an Autobahn scenario using the road description format OpenDRIVE.
•    Adapt the CARLA OpenDRIVE standalone mode (requires C++ knowledge).
•    Design an environment for the scenario using the Unreal Engine 4 Editor.
•    Perform a simulation-based experiment using the German Autobahn scenario and the cogniBIT driver model.

Prerequisites

•    C++ knowledge
•    experience with Python is helpful
•    experience with the UE4 editor is helpful
•    interest in autonomous driving and cognitive models

Supervisor:

Markus Hofbauer - Lukas Brostek (cogniBIT)


Research Internships (Forschungspraxis)

Human-robot interaction using vision-based human-object interaction prediction

Short Description:
Human-object interaction, human-robot interaction

Description

We use a vision-based solution to locate the target object for the robot and send the desired object back to the operator to complete the whole process of human-robot interaction.

Prerequisites

- Panda arm

- Computer vision

- Human-object interaction prediction

- Grasping

Supervisor:

Yuankai Wu

Generating 3D facial expressions from speech

Description

Under this topic, the student should investigate Deep Learning approaches for animating faces of characters from speech signals. This involves generating facial expressions as well as lip/mouth movements according to what's being said in the speech signal.

 

References:

[1] NVIDIA audio2face https://www.nvidia.com/de-de/omniverse/apps/audio2face/

[2] Emotional Voice Puppetry https://napier-repository.worktribe.com/preview/3033680/EmotionalVoicePuppetry.pdf

Prerequisites

  • Familiarity and first experiences with Computer Vision and Deep Learning
  • Strong Python programming skills
  • Strong interest in 3D Computer Graphics and Computer Vision

Contact

https://www.ce.cit.tum.de/lmt/team/mitarbeiter/chaudhari-rahul/

 

Supervisor:

Rahul Chaudhari

Multimodal Learning for Localization Tasks

Keywords:
Camera, Lidar, Point Clouds, Deep Learning

Description

The field of multimodal learning has gained attention given the multisensor configurations of robotic systems and the advantages of sensor fusion [1]. In this work, we focus on combining camera and lidar data. There are several existing approaches that combine both sensors for object detection, pedestrian classification, or lane segmentation [1,2,3].

Our goal is to combine RGB images and lidar point clouds for localization tasks. The idea is to use a global lidar point cloud as a map and perform localization based on a single RGB image. The localization can then be solved based on a common embedding space as presented in [4,5]. 
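Once image and point cloud embeddings live in a shared space as in [4,5], localization reduces to a nearest-neighbor search over map embeddings; a minimal sketch of the retrieval step (names hypothetical, embeddings assumed to come from trained encoders):

```python
import numpy as np

def retrieve(img_emb, map_embs):
    """Return the index of the lidar submap whose embedding has the
    highest cosine similarity to the query image embedding."""
    q = img_emb / np.linalg.norm(img_emb)
    M = map_embs / np.linalg.norm(map_embs, axis=1, keepdims=True)
    return int(np.argmax(M @ q))
```

The retrieved submap then provides a coarse pose, which can be refined with metric registration.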

References

[1] Melotti, Gledson, Cristiano Premebida, and Nuno Gonçalves. "Multimodal deep-learning for object recognition combining camera and LIDAR data." 2020 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC). IEEE, 2020.

[2] Zhang, Xinyu, et al. "Channel attention in LiDAR-camera fusion for lane line segmentation." Pattern Recognition 118 (2021): 108020.

[3] Asvadi, Alireza, et al. "Multimodal vehicle detection: fusing 3D-LIDAR and color camera data." Pattern Recognition Letters 115 (2018): 20-29.

[4] Cattaneo, D., et al. “Global Visual Localization in LiDAR-Maps through Shared 2D-3D Embedding Space.” 2020 IEEE International Conference on Robotics and Automation (ICRA), 2020, pp. 4365–71. IEEE Xplore, https://doi.org/10.1109/ICRA40945.2020.9196859.

[5] Jiang, Peng, and Srikanth Saripalli. "Contrastive Learning of Features between Images and LiDAR." 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE). IEEE, 2022.

 

 

Prerequisites

  • Python and Git
  • Experience with a deep learning framework (Pytorch, Tensorflow)
  • Interest in Computer Vision and Machine Learning

Contact

Please send your CV and Transcript of Records to:

adam.misik@tum.de

Supervisor:

Adam Misik

Improving Point Cloud-based Place Recognition and Re-localization using Point Cloud Completion Approaches

Keywords:
Point Cloud Completion, Place Recognition, Relocalization, Deep Learning

Description

The field of point cloud-based global place recognition and re-localization has become a trending research area given the recent advances in geometric deep learning [1, 2, 3]. The main advantage of point cloud-based place recognition and re-localization is its robustness to photometric perturbations (day/night, weather conditions) and the direct availability of depth information. However, if the queried point cloud is sparse (e.g., due to occlusions) and thus of lower quality, such localization approaches fail.

One approach to deal with the sparseness of point clouds is through point cloud completion. In this research internship, we will investigate the potential of current point cloud completion methods for improving point cloud-based place recognition and re-localization. The completion models investigated will fall into the following categories: generative adversarial networks, probabilistic diffusion models, and variational autoencoders [4, 5]. 

References

 

  1. Du, Juan, et al. "DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization." Computer Vision – ECCV 2020, edited by Andrea Vedaldi et al., vol. 12349, Springer International Publishing, 2020, pp. 744-62. https://doi.org/10.1007/978-3-030-58548-8_43.
  2. Komorowski, Jacek, et al. "EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale." arXiv:2110.12486, arXiv, 24 Oct. 2021. http://arxiv.org/abs/2110.12486.
  3. Uy, Mikaela Angelina, and Gim Hee Lee. "PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition." 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, 2018, pp. 4470-79. https://doi.org/10.1109/CVPR.2018.00470.
  4. Zhang, Junzhe, et al. "Unsupervised 3D Shape Completion through GAN Inversion." arXiv:2104.13366, arXiv, 29 Apr. 2021. http://arxiv.org/abs/2104.13366.
  5. Fei, Ben, et al. "Comprehensive Review of Deep Learning-Based 3D Point Cloud Completion Processing and Analysis." arXiv:2203.03311, arXiv, 9 Mar. 2022. http://arxiv.org/abs/2203.03311.

 

 

Prerequisites

  • Python and Git
  • Experience with a deep learning framework (Pytorch, Tensorflow)
  • Interest in Computer Vision and Machine Learning

Contact

Please send your CV and Transcript of Records to:

adam.misik@tum.de

Supervisor:

Adam Misik

GAN-based subjective haptic signal quality assessment database augmentation and enlargement methods

Description

This project requires the student to research and implement a novel GAN-based approach for augmenting and enlarging a subjective haptic signal quality assessment database. In addition, subjective experiments will be conducted to evaluate the result of the automatic data expansion.

Supervisor:

Zican Wang

Handshake for Plug-and-Play Haptic Interaction system

Keywords:
Teleoperation, GUI, Qt, JavaScript
Short Description:
Our project aims to implement handshake communication protocol for Plug-and-Play Haptic Interaction system according to the IEEE standard.

Description

Our project aims to implement the handshake communication protocol for a Plug-and-Play Haptic Interaction system according to the IEEE standard. The main goals of the system are:

 

1. Plug and play on the Leader side: When the Leader device disconnects from the system, the Follower device enters a waiting state and remains in the initial position it had when it was activated in the system, until the next re-insertion of the Leader device.

2. Automatic adjustment of device parameters according to the specific type of Leader device, to guarantee the quality of human perception: When connecting, the Leader device transmits its media and interface information (its metadata) to the Follower side and informs the Follower device of the specific model type it is using. If the Leader device has a different precision than the Follower device, the Follower adjusts its parameters according to the received information to adapt to the Leader, and transmits its own metadata back to the Leader.

   For the adjustments to the message transmission process:

1).   Achieve the PnP adjustment on the follower side.

 

2).   The message sending order, the format of the interface, the mode of pushing data packets into stacks, and the decoding function should obey the regulations of the IEEE standard.
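The follower-side behaviour described above can be sketched as a small state machine. The message fields and device names below are illustrative assumptions, not the actual IEEE P1918.1.1 message format:

```python
from dataclasses import dataclass
from enum import Enum

class State(Enum):
    WAITING = "waiting"
    ACTIVE = "active"

@dataclass
class Follower:
    position: tuple = (0.0, 0.0, 0.0)
    initial_position: tuple = (0.0, 0.0, 0.0)
    state: State = State.WAITING
    leader_type: str = ""

    def on_handshake(self, leader_metadata: dict) -> dict:
        """Leader announces its device type and media/interface metadata;
        the follower adapts its parameters and replies with its own metadata."""
        self.leader_type = leader_metadata["device_type"]
        self.initial_position = self.position  # held later while waiting
        self.state = State.ACTIVE
        return {"device_type": "follower-3dof",
                "precision": leader_metadata.get("precision", 1.0)}

    def on_leader_disconnect(self) -> None:
        """PnP behaviour from point 1 above: return to the initial position
        and wait for the next re-insertion of a Leader device."""
        self.position = self.initial_position
        self.state = State.WAITING
```

The real implementation would serialize these messages over the socket layer named in the prerequisites; the state transitions are the part fixed by the standard.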

 

Prerequisites

C/C++

socket programming

visual studio IDE

Contact

Email Address: siwen.liu@tum.de, xiao.xu@tum.de

Supervisor:

Siwen Liu, Xiao Xu

GUI for Plug-and-Play Haptic Interaction system

Keywords:
Teleoperation, GUI, Qt, JavaScript
Short Description:
Our project aims to build a GUI for Plug-and-Play Haptic Interaction system according to the IEEE standard.

Description

Our project aims to build a GUI for a Plug-and-Play Haptic Interaction system according to the IEEE standard. The main goals of the system are:

 

1. Plug and play on the Leader side: When the Leader device disconnects from the system, the Follower device enters a waiting state and remains in the initial position it had when it was activated in the system, until the next re-insertion of the Leader device.

2. Automatic adjustment of device parameters according to the specific type of Leader device, to guarantee the quality of human perception: When connecting, the Leader device transmits its media and interface information (its metadata) to the Follower side and informs the Follower device of the specific model type it is using. If the Leader device has a different precision than the Follower device, the Follower adjusts its parameters according to the received information to adapt to the Leader, and transmits its own metadata back to the Leader.

 

 

Prerequisites

The requirements of our project are as follows. For the GUI part:

1. The GUI should be implemented in either Qt or JavaScript (Qt preferred) on both the Leader and Follower sides.

2. For the Leader side, the GUI should include these functions:

1). Chooses the device on the Leader side.

2). Shows whether the handshaking is successful or not.

3). Shows the device type used on the Follower side after the handshake.

4). Shows when the Leader device is disconnected from the system.

3. For the Follower side, the GUI should include these functions:

1). Chooses the device on the Follower side.

2). Shows whether the handshaking is successful or not.

3). Shows the device type used on the Leader side, adjusts the parameters on the Follower side, and then shows the adjusted device type if the handshake is successful.

4). Shows when the Leader device is disconnected from the system, and then shows the initial position of the Follower device in the waiting state.

4.    For the adjustments to the message transmission process:

1).   Achieve the PnP adjustment on the follower side.

 

2).   The message sending order, the format of the interface, the mode of pushing data packets into stacks, and the decoding function should obey the regulations of the IEEE standard.

 

Contact

Email Address: siwen.liu@tum.de, xiao.xu@tum.de

Supervisor:

Siwen Liu, Xiao Xu

Learning the Layout of Large-Scale Indoor Point Clouds

Keywords:
Semantic Segmentation, Point Cloud, Layout Understanding

Description

Indoor localization approaches based on point clouds rely on global 3D maps of the target environment. Learning the layout of the 3D map by generating submaps representing scenes or rooms simplifies the localization process. In this work, we aim to develop an approach for the layout understanding of large-scale indoor point clouds. The approach can utilize methods from 3D semantic segmentation and structural element detection [1, 2]. 

The developed pipeline will be evaluated on indoor point clouds from different sources, e.g. LiDAR and SfM point clouds.
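As a simple non-learned baseline for the submap idea, spatially disconnected rooms can be separated by connected components on a 2-D occupancy grid. This is a crude geometric stand-in for the semantic methods of [1, 2], and the cell size is an assumption:

```python
import numpy as np
from collections import deque

def room_submaps(points: np.ndarray, cell: float = 0.5) -> list:
    """Split an indoor point cloud into submaps via connected components
    on a 2-D occupancy grid of the given cell size (metres)."""
    ij = np.floor(points[:, :2] / cell).astype(int)
    occupied = {tuple(c) for c in ij}
    labels, next_label = {}, 0
    for seed in occupied:
        if seed in labels:
            continue
        queue = deque([seed])
        labels[seed] = next_label
        while queue:  # flood-fill one connected component of occupied cells
            x, y = queue.popleft()
            for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if n in occupied and n not in labels:
                    labels[n] = next_label
                    queue.append(n)
        next_label += 1
    point_labels = np.array([labels[tuple(c)] for c in ij])
    return [points[point_labels == k] for k in range(next_label)]
```

A learned layout model would replace the occupancy heuristic, but the submap output format can stay the same for the downstream localization step.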

References

[1] I. Armeni et al., "3D Semantic Parsing of Large-Scale indoor Spaces," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 1534-1543, doi: 10.1109/CVPR.2016.170

[2] T. Zheng et al. "Building fusion: semantic aware structural building-scale 3d reconstruction." IEEE Transactions on Pattern Analysis and Machine Intelligence (2020).

Prerequisites

  • Python and Git
  • Experience with a deep learning framework (Pytorch, Tensorflow)
  • Interest in Computer Vision and Machine Learning

Contact

Please send your CV and Transcript of Records to:

adam.misik@tum.de

Supervisor:

Adam Misik

A 3D human-object interaction dataset for assistive robot

Keywords:
Human-robot interaction, human-object prediction, 3D features

Description

Collect a 3D human-object interaction dataset suitable for assistive robots. The dataset should support the robot's visual perception system for human-object interaction detection.

Prerequisites

- Basics of Realsense 

- Basics of ROS

- Basics of computer vision

Supervisor:

Yuankai Wu

Inverse Rendering in a Digital Twin for Augmented Reality

Keywords:
Digital Twin, Illumination, HDR

Description

The task is to build an end-to-end pipeline for illumination estimation inside a digital twin.

Finally, an AR application can also be created.

Possible References

[1] https://arxiv.org/pdf/1905.02722.pdf

[2] https://arxiv.org/pdf/1906.07370.pdf

[3] https://arxiv.org/pdf/2011.10687.pdf

Prerequisites

  • Python (Pytorch)
  • Experience with Git

Contact

driton.salihu@tum.de

Supervisor:

Driton Salihu

Network Aware Shared Control

Keywords:
Teleoperation, Learning from Demonstration

Description

In this thesis, we would like to make the best out of demonstrations of varying quality. We will test the developed approach with shared control.

Prerequisites

Requirements:

Experience in C/C++

ROS is a plus

High motivation to learn and conduct research

 

Supervisor:

Optimization of 3D Object Detection Procedures for Indoor Environments

Keywords:
3D Object Detection, 3D Point Clouds, Digital Twin, Optimization

Description

3D object detection is a major task in point cloud-based 3D reconstruction of indoor environments. Current research has focused on achieving low inference time for 3D object detection. While low inference time is preferable, many use cases do not profit from it: in particular, a pre-defined static Digital Twin for AR and robotics applications reduces the incentive to trade accuracy for low inference time.

As such, this thesis will follow the approach of [1] (in this work based only on point cloud data): generate proposals of the layout and objects in a scene, for example through [2] or [3], and use an optimization algorithm (reinforcement learning, a genetic algorithm) to converge to the correct solution.

Furthermore, for more geometrically plausible results, a relationship graph neural network, as in [4], would be applied in the pipeline.

References

[1] Hampali, Shreyas, et al. “Monte Carlo Scene Search for 3D Scene Understanding.” 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021): 13799-13808. https://arxiv.org/abs/2103.07969

[2] Chen, Xiaoxue, Hao Zhao, Guyue Zhou, and Ya-Qin Zhang. “PQ-Transformer: Jointly Parsing 3D Objects and Layouts From Point Clouds.” IEEE Robotics and Automation Letters  7 (2022): 2519-2526. https://arxiv.org/abs/2109.05566

[3] Qi, C., Or Litany, Kaiming He and Leonidas J. Guibas. “Deep Hough Voting for 3D Object Detection in Point Clouds.” 2019 IEEE/CVF International Conference on Computer Vision (ICCV)  (2019): 9276-9285. https://arxiv.org/abs/1904.09664

[4] Avetisyan, Armen, Tatiana Khanova, Christopher Bongsoo Choy, Denver Dash, Angela Dai and Matthias Nießner. “SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans.” ArXiv  abs/2003.12622 (2020): n. pag. https://arxiv.org/abs/2003.12622

 

Prerequisites

  • Python (Pytorch)
  • Experience with Git
  • Knowledge in working with 3D Point Clouds (preferable)
  • Knowledge about optimization methods (preferable)

Contact

driton.salihu@tum.de

Supervisor:

Driton Salihu

Learning Temporal Knowledge Graphs with Neural Ordinary Differential Equations

Description

...

Contact

zhen.han@campus.lmu.de

Supervisor:

Eckehard Steinbach - Zhen Han (LMU)

Sim-to-Real Gap in Liquid Pouring

Keywords:
sim-to-real

Description

We want to investigate the simulation bottlenecks in learning the pouring task, and how we can tackle them. This project is mostly paper reading; the field of research is skill refinement and domain adaptation. In addition, we will try to implement one of the state-of-the-art methods of teaching by demonstration in order to transfer the simulated skill to the real-world scenario.

Prerequisites

Creativity

Motivation

Strong C++ Background

Strong Python Background

Contact

edwin.babaians@tum.de

Supervisor:

Edwin Babaians

"Pouring Liquids" dataset development

Keywords:
Nvidia Flex, Unity3D, Nvidia PhysX 4.0
Short Description:
Using Unity3D and Nvidia Flex plugin, develop a learning environment and model different fluids for teaching pouring tasks to robots.

Description

The student will model different liquid characteristics using Nvidia Flex, and will add different containers and a particle collision checking system. In addition, a ground truth system will be developed for later use in robot teaching.

 

Reference:

https://developer.nvidia.com/flex

https://developer.nvidia.com/physx-sdk

Prerequisites

Strong Unity3D background

Familiar with the Nvidia PhysX and Nvidia Flex libraries.

 

Contact

edwin.babaians@tum.de

Supervisor:

Edwin Babaians

Analysis and evaluation of DynaSLAM for dynamic object detection

Description

Investigation of DynaSLAM in terms of real-time capabilities and dynamic object detection.

Supervisor:

Comparison of Driver Situation Awareness with an Eye Tracking based Decision Anticipation Model

Keywords:
Situation Awareness, Autonomous Driving, Region of Interest Prediction, Eye Tracking

Description

This work can be done in German or English

The transmission of control to the human driver in autonomous driving requires the observation of the human driver. The vehicle has to guarantee that the human driver is aware of the current driving situation. One input source for observing the human driver is based on the driver's gaze.

The objective of this project is to compare two existing approaches for driver observation [1,2]. While [1] measures the driver's situation awareness (SA), [2] anticipates the driver's decision. As part of a user study, [2] published a gaze dataset. An interesting cross-validation would be the comparison of the SA score generated by [1] and the predicted decision correctness of [2].

Tasks

  • Generate ROI predictions [3] from the dataset of [2]
  • Estimate the driver SA with the model of [1]
  • Compare [1] and [2]
  • (Optional) Extend driving experiments
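The comparison in the tasks above could be prototyped around a simple overlap measure between predicted ROIs and recorded gaze fixations. This is an illustrative proxy, not the actual SA metric of [1]:

```python
def sa_score(rois, fixations):
    """Fraction of predicted regions of interest covered by at least one
    gaze fixation. rois: (xmin, ymin, xmax, ymax) boxes in image
    coordinates; fixations: (x, y) gaze points."""
    def hit(roi):
        x0, y0, x1, y1 = roi
        return any(x0 <= x <= x1 and y0 <= y <= y1 for x, y in fixations)
    if not rois:
        return 1.0  # nothing to be aware of
    return sum(hit(r) for r in rois) / len(rois)
```

Such a score, computed over the dataset of [2], could then be correlated with the decision correctness predicted by [2].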

References

[1] Markus Hofbauer, Christopher Kuhn, Lukas Puettner, Goran Petrovic, and Eckehard Steinbach. Measuring driver situation awareness using region-of-interest prediction and eye tracking. In 22nd IEEE International Symposium on Multimedia (ISM), Naples, Italy, Dec 2020.
[2] Pierluigi Vito Amadori, Tobias Fischer, Ruohan Wang, and Yiannis Demiris. Decision Anticipation for Driving Assistance Systems. June 2020.
[3] Markus Hofbauer, Christopher Kuhn, Jiaming Meng, Goran Petrovic, and Eckehard Steinbach. Multi-view region of interest prediction for autonomous driving using semisupervised labeling. In IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, Sep 2020.

Prerequisites

  • Experience with ROS and Python
  • Basic knowledge of Linux

Supervisor:

3D object model reconstruction from RGB-D scenes

Description

Robots should be able to discover their environments and learn new objects in order to become a part of daily human life. There are still challenges in detecting and recognizing objects in unstructured environments such as households. For robotic grasping and manipulation, knowing the 3D models of the objects is beneficial; hence the robot needs to infer the 3D shape of an object upon observation. In this project, we will investigate methods that can infer or produce 3D models of novel objects by observing RGB-D scenes. We will analyze the methods to reconstruct 3D information with different arrangements of an RGB-D camera.
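A typical first step of such methods is back-projecting the depth image into a point cloud with the pinhole camera model; a minimal sketch, assuming the intrinsics are known from camera calibration:

```python
import numpy as np

def depth_to_cloud(depth: np.ndarray, fx: float, fy: float,
                   cx: float, cy: float) -> np.ndarray:
    """Back-project a depth image (metres) into a 3-D point cloud using
    the pinhole model; fx, fy, cx, cy are the RGB-D camera intrinsics."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                 # drop zero-depth pixels
```

Clouds obtained this way from different camera arrangements would then be fed to the shape-inference methods under investigation.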

Prerequisites

  • Basic knowledge of digital signal processing / computer vision
  • Experience with ROS, C++, Python.
  • Experience with Artificial Neural Network libraries or motivation to learn them
  • Motivation to yield a successful work

Contact

furkan.kaynar@tum.de

Supervisor:

Hasan Furkan Kaynar

Research on the implementation of an (automated) solution for the analysis of surface impurities on endoscope tubes

Description

...

Supervisor:

AI-Enhanced Tool for Desk Research – Smart Analytical Engine

Description

...

Supervisor:

Algorithm evaluation for robot grasping with compliant jaws

Keywords:
python, ROS, robot grasping
Short Description:
Apply state-of-the-art contact model for robot grasp planning with a customized physical setup including a KUKA robot arm and a parallel-jaw gripper with compliant materials.

Description

Model-based grasp planning algorithms depend on friction analysis, since the friction between objects and gripper jaws strongly affects grasp robustness. A state-of-the-art friction analysis algorithm for grasp planning has been evaluated with plastic robot fingers and achieved promising results. But will it still work, compared to more advanced contact models, if the gripper jaws are fitted with compliant materials such as rubber and silicone?

The task of this work is to create a new dataset and retrain an existing deep network by applying a state-of-the-art contact model for grasp planning.

 

 

Supervisor:

Adaptive LiDAR data update rate control based on motion estimation

Keywords:
SLAM, Sensor Fusion, ROS

Description

...

Supervisor:

Perceptual-oriented objective quality assessment for time-delayed teleoperation

Description

Recent advances in haptic communication cast light onto the promise of full immersion into remote real or virtual environments.

The quality of compressed haptic signals is crucial to fulfill this promise. Traditionally, the quality of haptic signals is evaluated through a series of subjective experiments. So far, only very limited attention was directed toward developing objective quality measures for haptic communication. In this work, we focus on the compression distortion and the delay compensation distortion that contaminate the force/velocity haptic signals generated by physical interaction with objects in the remote real or virtual environment.
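As a starting point, a purely signal-level fidelity measure such as the SNR between the reference and the distorted haptic trace can serve as a baseline objective metric. This is only a sketch; a perceptual measure would additionally weight errors by haptic sensitivity (e.g. deadband thresholds):

```python
import numpy as np

def haptic_snr_db(reference, distorted) -> float:
    """Signal-to-noise ratio (dB) between a reference force/velocity trace
    and its compressed or delay-compensated reconstruction."""
    reference = np.asarray(reference, dtype=float)
    noise = reference - np.asarray(distorted, dtype=float)
    return float(10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2)))
```

The work would then correlate such objective scores with subjective ratings for compression and delay-compensation distortions.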

 

Prerequisites

 

Contact

 

Supervisor:

UWB localization by Kalman filter and particle filter

Description

...

Supervisor:

Investigating the Potential of Machine Learning to Map Changes in Forest based on Earth Observation

Description

...

Supervisor:

Internships

Recording of Robotic Grasping Failures

Description

The aim of this project is to collect data from robotic grasping experiments and create a large-scale labeled dataset. We will conduct experiments attempting to grasp known or unknown objects autonomously. The complete pipeline includes:

- Estimating grasp poses via computer vision

- Robotic motion planning

- Executing the grasp physically

- Recording necessary data

- Organizing the recorded data into a well-structured dataset

 

Most of the data collection pipeline has already been developed; additions and modifications may be needed.
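The recording loop around the existing stages might look as follows; the three callables are hypothetical placeholders for the vision, planning, and execution components already in place:

```python
import csv
import time

def record_grasp_attempts(n_attempts, estimate_pose, plan_motion,
                          execute, out_path):
    """Skeleton of the data-collection loop described above. estimate_pose,
    plan_motion and execute stand in for the existing pipeline stages;
    execute returns True if the physical grasp succeeded."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["attempt", "timestamp", "pose", "success"])
        for i in range(n_attempts):
            pose = estimate_pose()          # computer-vision grasp pose
            plan = plan_motion(pose)        # robotic motion planning
            success = execute(plan)         # physical execution
            writer.writerow([i, time.time(), pose, int(success)])
```

In practice the row would also reference the recorded sensor streams (images, force traces) so that failures can be analyzed offline.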

Prerequisites

Useful background:

- Digital signal processing

- Computer vision

- Dataset handling

 

Requirements:

- Experience with Python and ROS

- Motivation to yield a good outcome

Contact

furkan.kaynar@tum.de

 

(Please provide your CV and transcript in your application)

 

Supervisor:

Hasan Furkan Kaynar

Student Assistant Jobs

Student Assistant Software Engineering Lab

Keywords:
Software Engineering, Unit Testing, TDD, C++

Description

We are looking for a teaching assistant for our new Software Engineering Lab. In this course we explain basic principles of software engineering, such as unit testing, test-driven development, and how to collaborate in teams [1].

You will act as a teaching assistant and supervise students during the lab sessions while they work on their practical homework. The homework tasks are generally C++ coding exercises in which the students contribute to a common codebase. This means you should have good experience with C++, unit testing, and git, as these will be an essential part of the homework.

References

[1] Winters, Titus, Tom Manshreck, and Hyrum Wright, eds. Software Engineering at Google: Lessons Learned from Programming Over Time. O'Reilly Media, Incorporated, 2020

Prerequisites

  • Very good knowledge in C++
  • Experience with unit testing
  • Good understanding of git and collaborative software development

Supervisor:

MATLAB tutor for Digital Signal Processing lecture in summer semester 2022

Description

Tasks:

  • Help students with the basics of MATLAB (e.g. matrix operations, filtering, image processing, runtime errors)
  • Correct some of the homework problems 
  • Understand the DSP coursework material

 

We offer:

  • Payment according to the working hours and academic qualification
  • The workload is approximately 6 hours per week from May 2022 to August 2022
  • Technische Universität München especially welcomes applications from female applicants

 

Application:

  • Please send your application with a CV and transcript per e-mail to basak.guelecyuez@tum.de 
  • Students who have taken the DSP course are preferred.

Supervisor:

Implementation of teleoperation systems using Raspberry Pi

Description

We already have a teleoperation framework running on Windows, in which two haptic devices are connected via the UDP protocol, one acting as the leader device and the other as the follower.

Your tasks are:

1. Move the framework to Linux.

2. Set up a ROS-based virtual teleoperation environment.

3. Port the framework to a Raspberry Pi.
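At the core of all three tasks is the leader-to-follower position channel over UDP. A minimal loopback sketch (the actual framework is in C/C++; the packet format of three network-order doubles here is an assumption, not the framework's real protocol):

```python
import socket
import struct

def make_follower(port: int = 0) -> socket.socket:
    """Bind a UDP socket for the follower; port 0 lets the OS pick one."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", port))
    return sock

def send_position(sock: socket.socket, addr, pos) -> None:
    """Leader side: pack a 3-DoF position as three network-order doubles."""
    sock.sendto(struct.pack("!3d", *pos), addr)

def recv_position(sock: socket.socket):
    """Follower side: receive and unpack one position sample."""
    data, _ = sock.recvfrom(24)
    return struct.unpack("!3d", data)
```

The same socket code runs unchanged on Linux and on a Raspberry Pi, which is what makes the porting tasks above tractable.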

Prerequisites

Linux, socket programming (e.g. UDP protocol), C and C++, ROS

Supervisor:


Student Assistant for distributed haptic training system

Keywords:
server-client, UDP, GUI programming

Description

Your tasks:

1. Build a server-client telehaptic training system based on the current code.

2. Develop a GUI for the client side.

Required skills:

- C++

- Knowledge of socket programming, e.g. UDP

- GUI programming, e.g. Qt

- Working environment: Windows + Visual Studio



This work has a tight connection with the project Teleoperation over 5G networks and the IEEE standardization P1918.1.1.

https://www.ei.tum.de/lmt/forschung/ausgewaehlte-projekte/dfg-teleoperation-over-5g/

https://www.ei.tum.de/lmt/forschung/ausgewaehlte-projekte/ieee-p191811-haptic-codecs

Supervisor:

Student Assistant for reference software of time-delayed teleoperation with control schemes and haptic codec

Short Description:
In this work, you need to extend the current teleoperation reference software to enable different control schemes and haptic data reduction approaches.

Description

Your tasks:

1.      Current code refactoring and optimization

2.      New algorithm implementation

 

This work has a tight connection with the project Teleoperation over 5G networks and the IEEE standardization P1918.1.1.

https://www.ei.tum.de/lmt/forschung/ausgewaehlte-projekte/dfg-teleoperation-over-5g/

https://www.ei.tum.de/lmt/forschung/ausgewaehlte-projekte/ieee-p191811-haptic-codecs

Prerequisites

 

C++; knowledge of control engineering and communication protocols

Supervisor: