Research Internships (Forschungspraxen) / MSCE Research Internships

Individual research internships (Forschungspraxen) or MSCE internships on offer can also be carried out as a task within the project lab Integrated Systems. Where this is the case, it is stated explicitly in the respective announcement.

Open Theses

Interested in a student research project or thesis?
Our working groups often have topics in preparation that are not yet listed here. In some cases, it is also possible to define a topic according to your particular interests. To do so, simply contact a staff member from the relevant research area. If you have any further general questions about carrying out a thesis at LIS, please contact Dr. Thomas Wild.

Setting up L4Re on Raspberry Pi

Description

In this thesis, you will lead the setup of the L4Re (L4 Runtime Environment) on Raspberry Pi devices. This is an exciting opportunity to work on cutting-edge technology, combining the flexibility of the L4 microkernel architecture with the capabilities of the Raspberry Pi to build secure and modular operating systems for embedded systems.

You will delve into the fascinating world of microkernel architecture. Unlike traditional monolithic kernels, where most functionality resides within the kernel, microkernels follow a minimalist approach. The microkernel serves as a lightweight core, providing only essential services such as interprocess communication (IPC) and memory management. Additional operating system services and functionalities, including device drivers and file systems, are implemented as separate user-level processes or servers that run outside the microkernel.

Responsibilities:

  • Set up and configure the development environment for L4Re on Raspberry Pi, including toolchain and cross-compilation environment.
  • Obtain the L4Re source code by downloading from the official repository or using a specific release.
  • Build the L4Re system for Raspberry Pi, ensuring successful compilation and linking of components.
  • Deploy the compiled L4Re system onto Raspberry Pi boards, including bootloader setup and configuration.
  • Test and verify the functionality and performance of the L4Re system on Raspberry Pi, conducting both functional and security evaluations.
  • Develop custom components and applications on top of L4Re, integrating them into the system and extending its functionality.
  • Document the setup process, configurations, and customizations, ensuring clear and concise documentation for future reference.

 

Prerequisites

  • Solid understanding of embedded systems development and experience with low-level programming.
  • Proficiency in C/C++ programming languages and familiarity with cross-compilation environments.
  • Strong knowledge of operating system concepts and microkernel architectures, preferably with experience working with L4-based systems.
  • Experience with Raspberry Pi development and configuration, including bootloader setup and deployment.
  • Familiarity with embedded Linux, device drivers, and system-level software development.
  • Excellent problem-solving skills and the ability to troubleshoot complex issues.

Advisor:

Lars Nolte

SystemC Model for Memory Preloading

Description

Since DRAM typically comes with much higher access latencies than SRAM, many approaches to reduce DRAM latency have already been explored, such as caching, access predictors, row buffers, etc.

In the CeCaS research project, we plan to employ an additional mechanism: preloading a certain fraction of the DRAM content into a small on-chip SRAM buffer. This requires predicting the cache lines likely to be accessed next, preloading them into the SRAM, and answering subsequent memory requests for this data from the SRAM instead of forwarding them to the DRAM itself.
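As an illustration of the intended behavior, the following Python sketch models such a preloading buffer with a simple sequential next-line predictor. The buffer size, FIFO eviction, and the predictor itself are illustrative assumptions, not the CeCaS design:

```python
CACHELINE = 64  # bytes per cache line (assumption)

class PreloadBuffer:
    """Small on-chip SRAM buffer fed by a next-line predictor."""
    def __init__(self, capacity_lines=8):
        self.capacity = capacity_lines
        self.sram = {}        # line address -> data
        self.order = []       # FIFO eviction order (illustrative policy)
        self.hits = 0
        self.misses = 0

    def _preload(self, line_addr, dram):
        if line_addr in self.sram or line_addr not in dram:
            return
        if len(self.order) == self.capacity:
            self.sram.pop(self.order.pop(0))   # evict oldest line
        self.sram[line_addr] = dram[line_addr]
        self.order.append(line_addr)

    def read(self, addr, dram):
        line = addr // CACHELINE
        if line in self.sram:                  # answered from SRAM
            self.hits += 1
            data = self.sram[line]
        else:                                  # forwarded to DRAM
            self.misses += 1
            data = dram[line]
        self._preload(line + 1, dram)          # sequential prediction
        return data

dram = {line: f"line{line}" for line in range(64)}
buf = PreloadBuffer()
for a in range(0, 16 * CACHELINE, CACHELINE):  # sequential access sweep
    buf.read(a, dram)
print(buf.hits, buf.misses)  # 15 1 -- all but the first access hit the SRAM
```

For a purely sequential stream, every access after the first is served from the SRAM buffer; the interesting design space is the predictor for non-sequential traffic.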

This functionality should be implemented as a TLM/SystemC model using Synopsys Platform Architect. A baseline system will be provided; the goal is to implement this functionality in its simplest form as a baseline. Depending on the progress, this can be extended or refined in subsequent steps.

Close supervision, especially during the initial phase, is guaranteed. Nevertheless, some experience with TLM modelling (e.g. the SystemC Lab of LIS) or C++ programming is required.

Prerequisites

  • Experience with TLM modelling (e.g. SystemC Lab of LIS)
  • B.Sc. in Electrical Engineering or similar

Advisor:

Oliver Lenke

Duckietown - Computer Vision Based Measurements for Performance Analysis

Description

At LIS, we want to use the Duckietown hardware and software ecosystem for experimenting with our reinforcement learning based learning classifier tables (LCTs) as part of the control system of the Duckiebots: https://www.ce.cit.tum.de/lis/forschung/aktuelle-projekte/duckietown-lab/

More information on Duckietown can be found on https://www.duckietown.org/.

Currently, we are developing the infrastructure for our LCT experiments.
While several students are improving the autonomous driving abilities of the Duckiebots, the goal of this work is to develop a system that provides insight into the bots' driving performance.
For this, we need objective measurements by a third party that allow us to compare the driving performance (average speed, deviation from the optimal line, …) between bots which might run different software.
To compare specific scenarios, these measurements should not only be a time series, but also be enhanced by the position of the respective bot at each point in time, similar to modern sports apps: https://www.outdooractive.com/de/route/wanderung/tegernsee-schliersee/auf-dem-prinzenweg-vom-schliersee-an-den-tegernsee/1374189/#dm=1.

Towards this goal, a camera-based mechanism is required that identifies bots on the field, extracts their exact position, and calculates metrics such as direction, velocity, on/off lane, spinning, deviation from the ideal line, ….
This data should then be logged for later analysis and visualization.
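As a rough sketch of the kind of post-processing involved, the Python snippet below derives an average speed and a lane-deviation metric from a hypothetical (timestamp, x, y) position log. The log format and the straight reference lane `y == lane_y` are assumptions for illustration:

```python
import math

def metrics(track, lane_y=0.0):
    """track: list of (t, x, y) samples from the camera system (assumed format)."""
    dist = 0.0
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dist += math.hypot(x1 - x0, y1 - y0)      # path length travelled
    duration = track[-1][0] - track[0][0]
    avg_speed = dist / duration                   # m/s, given t in seconds
    # mean absolute distance to the assumed straight lane center line
    deviation = sum(abs(y - lane_y) for _, _, y in track) / len(track)
    return avg_speed, deviation

log = [(0.0, 0.0, 0.02), (1.0, 0.3, -0.01), (2.0, 0.6, 0.03)]
speed, dev = metrics(log)
print(round(speed, 3), round(dev, 3))
```

On a real track, the reference would be the lane geometry rather than a straight line, but the per-timestamp position log is exactly what makes such metrics computable.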

Depending on the type of work (BA/FP vs. MA), this project might also include enhancing these measurements with internal data from the bots to compare internal and external perception, and developing tools for easy analysis.

Prerequisites

  • independent work style
  • problem solving skills
  • computer vision knowledge (openCV)
  • programming skills (we are open regarding the framework: Python, C++ (Qt), Matlab, ...)
  • Linux basics (permissions)

Contact

flo.maurer@tum.de

Advisor:

Bring-up and Evaluation of FPGA Network Accelerator Boards

Description

With the advent of research on the next generation of mobile communications, 6G, LIS is engaged in exploring architectures and architecture extensions for networking hardware. In order to research energy-efficient and low-latency network interfaces for next-generation networks, we are setting up a hardware testbed with FPGA-based Network Interface Accelerator Cards in the form of Xilinx Alveo and NetFPGA SUME FPGA boards. This requires the bring-up and evaluation of the HDL design, the Linux kernel drivers, and applications.

OpenNIC provides open-source code for an HDL design and a kernel driver as the basis for a custom hardware network interface on FPGAs. The goal of this work is to configure and set up the project on the hardware resources available at our chair. This includes analyzing the open-source designs, establishing a workflow to modify the given designs with proper hardware and software design tools, documenting and integrating them, and testing the correct functionality of both hardware and software. As an additional task, the setup could be evaluated concerning key performance indicators (KPIs) such as latency, bandwidth, memory footprint, etc. This can also involve the creation of custom tests in both hardware and software.

Prerequisites

- Good experience with Linux, Command Line Tools and Bash scripting
- Programming skills in VHDL/Verilog and C (and Python)
- Preferably experience with FPGA Design and Implementation

Contact

Marco Liess, M.Sc.

Phone: +49.89.289.23873
Room: N2139
Email: marco.liess@tum.de

 

Advisor:

Marco Liess

XCS with Dynamically Sized Experience Replay

Keywords:
XCS, Experience replay, machine learning, reinforcement learning, genetic algorithms

Description

XCS is a class of human-interpretable machine learning algorithms that combines concepts from reinforcement learning with the pattern-recognition properties of genetic algorithms. XCS operates by maintaining a population of classifiers. Experience replay (ER) is a popular strategy in machine learning to speed up the learning process and improve performance.

In this work, you will

1. Understand the XCS algorithm and its implementation in Python.

2. Implement a dynamically sized ER buffer for the XCS.

3. Evaluate your work on various benchmarks used in the machine learning domain.

4. Compare your results against an XCS without ER and an XCS with a fixed-size ER.
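As a starting point, a dynamically sized ER buffer could look like the following Python sketch. The FIFO eviction and the explicit resize call are illustrative assumptions; choosing the actual resizing strategy is part of the thesis:

```python
import random

class DynamicReplayBuffer:
    """Experience replay buffer whose capacity can change at runtime."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.buf = []

    def add(self, experience):
        self.buf.append(experience)
        while len(self.buf) > self.capacity:
            self.buf.pop(0)            # drop oldest experience (FIFO)

    def resize(self, new_capacity):
        """Grow or shrink the buffer; shrinking discards oldest entries."""
        self.capacity = new_capacity
        while len(self.buf) > self.capacity:
            self.buf.pop(0)

    def sample(self, k):
        return random.sample(self.buf, min(k, len(self.buf)))

rb = DynamicReplayBuffer(capacity=4)
for i in range(6):
    rb.add(i)
rb.resize(2)   # shrink at runtime: oldest experiences are discarded
print(rb.buf)  # [4, 5]
```

The interesting research question is when and by how much to resize, e.g. depending on learning progress or population dynamics.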

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

  • Good Python skills
  • Basic knowledge of machine learning
  • Self-motivated and structured work style

 

 

Contact

Anmol Surhonne

Technische Universität München
Department of Electrical and Computer Engineering
Chair of Integrated Systems
Arcisstr. 21
80290 München
Germany

Phone: +49.89.289.23872
Fax: +49.89.289.28323
Building: N1 (Theresienstr. 90)
Room: N2137
Email: anmol.surhonne at tum dot de

 

Advisor:

Anmol Prakash Surhonne

Implement a Neural Network based DVFS controller for runtime SoC performance-power optimization

Keywords:
Neural Networks, DVFS, Machine learning,

Description

Reinforcement learning (RL) has been widely used for run-time management on multi-core processors. RL-based controllers can adapt to varying emerging workloads, system goals, constraints and environment changes by learning from their experiences.

Neural networks are a class of ML methods inspired by the human brain, mimicking the way biological neurons signal to one another.

In this work, you will

1. Understand the working of neural networks and implement a neural network in C.

2. Understand the architecture of the Leon3-based SoC.

3. Use neural networks to learn and control the processor voltage and frequency at runtime to optimize performance and power.

4. Design, test, and implement the work on a Xilinx FPGA.
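To illustrate the controller structure (sketched here in Python rather than the targeted C implementation), the snippet below maps two observed inputs to a discrete DVFS level through a tiny fixed-weight feedforward network. The weights, the input features (utilization, power headroom), and the number of levels are made up for demonstration; in the actual work they would be learned and tied to the Leon3 platform:

```python
import math

def forward(x, W1, b1, W2, b2):
    """One hidden layer with tanh activation, linear output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(W2, b2)]

# Made-up weights for demonstration only (would be learned at runtime)
W1 = [[2.0, -1.0], [-1.0, 2.0]]; b1 = [0.0, 0.0]
W2 = [[1.5, -0.5]];              b2 = [1.0]

def dvfs_level(utilization, headroom, n_levels=4):
    """Map observations to one of n_levels voltage/frequency levels."""
    score = forward([utilization, headroom], W1, b1, W2, b2)[0]
    return max(0, min(n_levels - 1, round(score)))

print(dvfs_level(0.9, 0.2))  # high load -> high V/f level
print(dvfs_level(0.1, 0.9))  # low load  -> low V/f level
```

A fixed-point C port of exactly this forward pass is what would run on the SoC; the FPGA part concerns the sensors and actuators around it.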

 

Prerequisites

To successfully complete this project, you should already have the following skills and experiences:

  • Good VHDL and C programming skills
  • Good understanding of MPSoCs
  • Self-motivated and structured work style
  • Knowledge of machine learning algorithms

 

Contact

Anmol Surhonne

Technische Universität München
Department of Electrical and Computer Engineering
Chair of Integrated Systems
Arcisstr. 21
80290 München
Germany

Phone: +49.89.289.23872
Fax: +49.89.289.28323
Building: N1 (Theresienstr. 90)
Room: N2137
Email: anmol.surhonne at tum dot de

 

 

Advisor:

Anmol Prakash Surhonne

Simulation of an Intra-Vehicular Network

Description

Context:

Future cars have a wide variety of sensors, such as cameras, LiDARs, and RADARs, that generate a large amount of data. This data has to be sent via an intra-vehicular network (IVN) to further processing nodes and, ultimately, actuators have to react to the sensor input. Between the processing steps, the intra-vehicular network has to ensure that all data and control signals reach their destination in time. Hence, next to the large amount of data, there are also strict timing constraints that the intra-vehicular network has to cope with. Therefore, so-called time-sensitive networking (TSN) has been introduced. The functional safety of such networks plays an important role against the background of highly automated driving. Emerging errors have to be detected early, and potential countermeasures have to be taken to keep the vehicle in a safe state. Therefore, highly sophisticated monitoring and diagnosis algorithms are a key requirement for future cars. (See Project EMDRIVE)

Our approach for such diagnosis builds on non-intrusively monitoring the intra-vehicular network by snooping on data traffic at an interconnect in the car. An analysis of the traffic shall give information about anomalies that occur inside the network as symptoms of an error inside the electrical architecture.

THIS WORK:

The substance of this work is to first familiarize yourself with an existing simulation environment for an IVN with TSN in OMNET++. Based on the already existing work, several extensions have to be implemented (C++-based) in the simulation environment to mimic certain fault classes like delayed messages, broken links, etc. These fault classes can then later be injected into the IVN simulation. (FP or IDP)
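The fault classes named above can be sketched, outside OMNET++, as simple transformations on a stream of (timestamp, payload) messages. The following Python model is purely illustrative; the actual extensions would be C++ modules inside the simulator:

```python
import random

def inject(messages, fault, rng, delay=0.005, loss_prob=0.5):
    """Apply one fault class to a list of (timestamp, payload) messages."""
    out = []
    for t, payload in messages:
        if fault == "delayed":
            out.append((t + delay, payload))       # every message delayed
        elif fault == "broken_link" and rng.random() < loss_prob:
            continue                               # message lost on the link
        else:
            out.append((t, payload))
    return out

stream = [(i * 0.001, f"frame{i}") for i in range(5)]
delayed = inject(stream, "delayed", random.Random(0))
lossy = inject(stream, "broken_link", random.Random(0))
print(delayed[0], len(lossy))
```

In the simulator, such faults would instead be parameterized per link or per TSN stream, but the injected symptoms are the same.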

It is desired to combine the Forschungspraxis and the Master's thesis in the context of this project. In this way, the time during the Forschungspraxis can be used to become familiar with the OMNET++ simulation environment.

Prerequisites

OSI layer model
Basic knowledge of C++
Basic knowledge of simulations

Contact

matthias.ernst@tum.de

Advisor:

Matthias Ernst

AI based Automotive Ethernet Anomaly Detection

Description

CONTEXT:

Future cars have a wide variety of sensors, such as cameras, LiDARs, and RADARs, that generate a large amount of data. This data has to be sent via an intra-vehicular network (IVN) to further processing nodes and, ultimately, actuators have to react to the sensor input. Between the processing steps, the intra-vehicular network has to ensure that all data and control signals reach their destination in time. Hence, next to the large amount of data, there are also strict timing constraints that the intra-vehicular network has to cope with. Therefore, so-called time-sensitive networking (TSN) has been introduced. The functional safety of such networks plays an important role against the background of highly automated driving. Emerging errors have to be detected early, and potential countermeasures have to be taken to keep the vehicle in a safe state. Therefore, highly sophisticated monitoring and diagnosis algorithms are a key requirement for future cars. (See Project EMDRIVE)

Our approach for such diagnosis builds on non-intrusively monitoring the intra-vehicular network by snooping on data traffic at an interconnect in the car. An analysis of the traffic shall give information about anomalies that occur inside the network as symptoms of an error inside the electrical architecture.

FORSCHUNGSPRAXIS:

The substance of this work is to first familiarize yourself with an existing simulation environment for an IVN with TSN in OMNET++. Based on the already existing work, several extensions have to be implemented (C++-based) in the simulation environment to mimic certain fault classes like delayed messages, broken links, etc. These fault classes can then later be injected into the IVN simulation. (FP or IDP)

MASTER'S THESIS:

Subsequently, the IVN simulation can be used to generate a stream of Ethernet traffic that can be investigated for the aforementioned anomalous behavior. The challenge is to detect and classify anomalous behavior without any prior knowledge of its timing or nature. For this, several methods can be applied, ranging from static measurement methods to AI-based anomaly detection. The focus of this work could lie on the application of LSTM- or transformer-based models. The preferred development environment builds on Python and related libraries like TensorFlow and Keras. Comparing at least two different anomaly detection methods and exploring, as well as discussing, their respective advantages rounds off the master's thesis. (MA)
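As a baseline to compare the learned models against, a purely statistical detector over frame inter-arrival times might look like the Python sketch below. The periodic traffic, the training-window length, and the z-score threshold are illustrative assumptions:

```python
import statistics

def detect_anomalies(arrival_times, train_n=50, z_thresh=4.0):
    """Flag frames whose inter-arrival gap deviates strongly from the
    mean learned on a fault-free training window of train_n gaps."""
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    mu = statistics.mean(gaps[:train_n])
    sigma = statistics.stdev(gaps[:train_n])
    return [i + 1 for i, g in enumerate(gaps)
            if abs(g - mu) > z_thresh * max(sigma, 1e-9)]

# 100 periodic frames at 1 ms spacing, with frame 70 delayed by 0.5 ms
times = [i * 0.001 for i in range(100)]
times[70] += 0.0005
print(detect_anomalies(times))  # [70, 71] -- the delayed frame and its successor
```

A detector like this catches timing faults but not structural ones; that gap is exactly what motivates the LSTM/transformer models in the thesis.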

It is desired to combine the Forschungspraxis and the Master's thesis in the context of this project. In this way, the time during the Forschungspraxis can be used to become familiar with the OMNET++ simulation environment.

If you are interested, feel free to contact me! Please send your CV as well as a recent transcript.

Prerequisites

Over the course of both projects (FP+MA) there is enough time to work into the related technical fields. The main skills that will be developed and needed during this project are the following:

OSI layer model
Basics of automotive E/E architectures
Basics of simulations (OMNET++)
Basics of C++ and Python
Experience with AI (TensorFlow and Keras)

Contact

matthias.ernst@tum.de

Advisor:

Matthias Ernst

Porting of HDL Designs for Packet Processing to a New FPGA Board

Description

In order to research energy-efficient and low-latency network nodes for next-generation networks, we are setting up a hardware testbed with FPGA-based Network Interface Accelerator Cards in the form of Xilinx Alveo FPGA boards. Previous research in the area of Network Interface Cards (NICs) at our chair led to HDL designs of packet processing pipelines on older-generation FPGAs, which shall serve as the basis for further research on Xilinx Alveo FPGA boards.

Therefore, the goal of this work is to port these existing HDL designs to the new FPGAs. This includes analyzing the existing designs, extracting the required logic, and integrating it into the OpenNIC framework. Furthermore, the functionality of the designs has to be verified by testing.

Prerequisites

- Experience with Linux, Command Line Tools and Bash scripting
- Programming skills in VHDL/Verilog and C (and Python)
- Experience with FPGA Design and Implementation

Contact

Marco Liess, M.Sc. 

Phone: +49.89.289.23873
Building: N1 (Theresienstr. 90)
Room: N2139
Email: marco.liess@tum.de

 

Advisor:

Marco Liess

Ongoing Theses

Duckietown Autonomous Driving Pipeline - Assignment of Processing Resources

Description

At LIS we want to use the Duckietown hardware and software ecosystem for experimenting with our reinforcement learning based learning classifier tables (LCT) as part of the control system of the Duckiebots: https://www.ce.cit.tum.de/lis/forschung/aktuelle-projekte/duckietown-lab/

More information on Duckietown can be found on https://www.duckietown.org/.

In this student project, we want to investigate the different processing resources of the Duckiebots.
A first step is to analyze the HW and SW hierarchy of the bots in order to develop mechanisms to actively assign certain stages of the image processing pipeline to either the CPU or the GPU (in the future, also the FPGA). Developing this assignment mechanism is the next step. Finally, a certain pipeline stage should be ported to the newly used processing resource and the difference in performance compared to before should be measured.

Advisor:

Development and Evaluation of a Crossbar for Packet Processing Architectures

Description

The goal of this project is to design, implement and evaluate a crossbar. The crossbar should be compatible with AXI4-Stream and support at least two arbitration modes when multiple initiators want to access the same target: round-robin and priority-based. It should introduce no idle clock cycles when switching from a waiting state to a forwarding state and when switching between different initiators that want to access the same target.

The crossbar should be implemented in Verilog and integrated into an existing packet processing architecture. Furthermore, it should be compared to other interconnects in terms of throughput and latency via behavioral simulations in Vivado. Optionally, the design can be implemented and tested on an FPGA.
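The round-robin arbitration mode can be sketched behaviorally as follows (a Python model for illustration, not the Verilog implementation; the single-arbiter view per target is an assumption):

```python
class RoundRobinArbiter:
    """Grants one of n initiators per cycle; the search for the next
    grant starts after the most recently granted initiator."""
    def __init__(self, n_initiators):
        self.n = n_initiators
        self.last = self.n - 1   # so the first search starts at initiator 0

    def grant(self, requests):
        """requests: list of bools, one per initiator; returns the granted
        initiator index, or None if nobody requests."""
        for off in range(1, self.n + 1):
            i = (self.last + off) % self.n
            if requests[i]:
                self.last = i
                return i
        return None

arb = RoundRobinArbiter(3)
# initiators 0 and 1 keep requesting the same target; 2 is idle
grants = [arb.grant([True, True, False]) for _ in range(4)]
print(grants)  # [0, 1, 0, 1] -- fair alternation between the requesters
```

A priority-based mode would simply return the lowest (or highest) requesting index without updating `last`; in hardware, the same search collapses into combinational priority logic behind a rotate.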

Advisor:

Klajd Zyla

Analyzing Power Consumption in a Simulation Model

Description

Next to the raw computational performance of a system on chip (SoC), power consumption is an important trait. Especially in the mobile domain, where the efficiency of the SoC is critical, the power consumption of the whole system needs to be considered. Multiple factors contribute to the overall power consumption, such as the activity/state of the CPU, memory accesses including cache misses, and the use of potential hardware acceleration.

In this work, the power consumption of an SoC should be analyzed; a gem5 simulation model exists for this purpose.

Advisor:

Tim Twardzik

Bring-up and Evaluation of FPGA Network Accelerator Boards

Description

With the advent of research on the next generation of mobile communications, 6G, LIS is engaged in exploring architectures and architecture extensions for networking hardware. In order to research energy-efficient and low-latency network interfaces for next-generation networks, we are setting up a hardware testbed with FPGA-based Network Interface Accelerator Cards in the form of Xilinx Alveo and NetFPGA SUME FPGA boards. This requires the bring-up and evaluation of the HDL design, the Linux kernel drivers, and applications.

OpenNIC provides open-source code for an HDL design and a kernel driver as the basis for a custom hardware network interface on FPGAs. The goal of this work is to configure and set up the project on the hardware resources available at our chair. This includes analyzing the open-source designs, establishing a workflow to modify the given designs with proper hardware and software design tools, documenting and integrating them, and testing the correct functionality of both hardware and software. As an additional task, the setup could be evaluated concerning key performance indicators (KPIs) such as latency, bandwidth, memory footprint, etc. This can also involve the creation of custom tests in both hardware and software.

Prerequisites

- Good experience with Linux, Command Line Tools and Bash scripting
- Programming skills in VHDL/Verilog and C (and Python)
- Preferably experience with FPGA Design and Implementation

Contact

Marco Liess, M.Sc.

Technische Universität München
Chair of Integrated Systems
Arcisstr. 21, 80333 München

Phone: +49.89.289.23873
Room: N2139
Email: marco.liess@tum.de

 

Advisor:

Marco Liess

Investigation and Implementation of Approximate Comparators for FPGA

Description

Approximate computing is an emerging design paradigm that trades accuracy for resource consumption, i.e. a certain inaccuracy of the calculations is allowed with the goal of reducing the overall resource consumption of the implemented design. One branch of this research field focuses on the approximation of arithmetic units, such as adders, subtractors, multipliers, and dividers. In this research internship, approximate comparators suitable for implementation on FPGA should be investigated.

The research internship starts with literature research on state-of-the-art approximate comparators. Relevant literature must be searched and surveyed. Afterwards, the most promising approximate comparator designs have to be selected based on the literature research. These designs must then be implemented in VHDL, targeting an FPGA design. Finally, a rudimentary evaluation of the implemented comparators has to be performed.

Prerequisites

The student should have the following skills in order to successfully complete the research internship:

  • Good ability to understand technical and scientific literature (e.g. IEEE or ACM papers)
  • Analytical thinking
  • Good programming skills in VHDL
  • The ability to work independently
  • High motivation
  • Previous experience with approximate computing is helpful, but not strictly required

The student can work on the research internship remotely from home.

Contact

Arne Kreddig
Doctoral Candidate at LIS, TUM 
FPGA Design Engineer at SmartRay GmbH

arne.kreddig@smartray.com

Advisor:

Arne Kreddig

Investigation and Implementation of Approximate Dividers for FPGA

Description

Approximate computing is an emerging design paradigm that trades accuracy for resource consumption, i.e. a certain inaccuracy of the calculations is allowed with the goal of reducing the overall resource consumption of the implemented design. One branch of this research field focuses on the approximation of arithmetic units, such as adders, subtractors, multipliers, and dividers. In this research internship, approximate dividers suitable for implementation on FPGA should be investigated.

The research internship starts with literature research on state-of-the-art approximate dividers. Relevant literature must be searched and surveyed. Afterwards, the most promising approximate divider designs have to be selected based on the literature research. These designs must then be implemented in VHDL, targeting an FPGA design. Finally, a rudimentary evaluation of the implemented dividers has to be performed.
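One classic candidate family from the literature is logarithm-based (Mitchell-style) division; the Python reference model below illustrates the principle. The piecewise-linear log2 is the approximation; using exact exponentiation for the antilog is a simplification of this sketch:

```python
def mitchell_div(a, b):
    """Approximate a / b for positive integers via a piecewise-linear
    log2: log2(x) ~= k + (x - 2**k) / 2**k, with k the leading-one position."""
    def approx_log2(x):
        k = x.bit_length() - 1                   # position of the leading one
        return k + (x - (1 << k)) / (1 << k)     # linear mantissa approximation
    # division becomes a subtraction in the log domain
    return 2.0 ** (approx_log2(a) - approx_log2(b))

exact = 100 / 7
approx = mitchell_div(100, 7)
print(abs(approx - exact) / exact < 0.1)  # True: relative error well under 10 %
```

In hardware, the log and antilog stages reduce to leading-one detection, shifts, and one subtraction, which is what makes this family attractive for FPGA implementation.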

Prerequisites

The student should have the following skills in order to successfully complete the research internship:

  • Good ability to understand technical and scientific literature (e.g. IEEE or ACM papers)
  • Analytical thinking
  • Good programming skills in VHDL
  • The ability to work independently
  • High motivation
  • Previous experience with approximate computing is helpful, but not strictly required

The student can work on the research internship remotely from home.

Contact

Arne Kreddig
Doctoral Candidate at LIS, TUM 
FPGA Design Engineer at SmartRay GmbH

arne.kreddig@smartray.com

Advisor:

Arne Kreddig


Introspective Failure Prediction Algorithms on FPGAs

Description

Failure cases in autonomous driving are important to collect and investigate in order to improve the performance of the system before large-scale deployment. Additionally, disengaging the autonomous driving system before a failure takes place can allow the human to take over in good time and maintain safety in uncertain situations. It is important to accelerate these algorithms on low-power, low-latency hardware, as they must run alongside the more compute-intensive autonomous driving stack.

In this research internship, an image-classification convolutional neural network will be trained on a failure-prediction dataset, then deployed on a dataflow-based accelerator. The accelerator will be optimized for speed and efficiency, and hardware error cases will be investigated.

Prerequisites

To successfully complete this project, you should have the following skills and experiences:

  • Good programming skills in Python and PyTorch
  • Basic programming skills in HDL/HLS
  • Good knowledge of neural networks, particularly convolutional neural networks

The student is expected to be highly motivated and independent. By completing this project, you will be able to:

  • Optimize CNNs and their target hardware accelerator to improve overall system performance
  • Test and evaluate solutions for correctness and applicability
  • Present your work in the form of a scientific report

Contact

Nael Fasfous
Department of Electrical and Computer Engineering
Chair of Integrated Systems

Phone: +49.89.289.23858
Building: N1 (Theresienstr. 90)
Room: N2116
Email: nael.fasfous@tum.de

This project is in cooperation with BMW AG.

Advisor:

Nael Yousef Abdullah Al-Fasfous

Accelerating Object-Detection Algorithms on NVDLA

Description

Convolutional neural networks (CNNs) are the state of the art for most computer vision tasks. Although their accuracy is unrivaled when compared to classical segmentation and classification algorithms, they present many challenges for implementation on hardware platforms. Most performant CNNs tend to be computationally complex for low-power embedded applications. Finding a good trade-off between accuracy and efficiency can be critical when deciding the network architecture and the target hardware.

This work focuses on accelerating CNNs for object detection on the NVIDIA Deep Learning Accelerator (NVDLA). Different CNNs can be benchmarked, new layers must be added, and execution must be optimized to maintain minimum latency.

Prerequisites

To successfully complete this project, you should have the following skills and experiences:

  • Good programming skills in C++, Python, and TensorFlow
  • Good programming skills in HDL
  • Good knowledge of neural networks, particularly convolutional neural networks

The student is expected to be highly motivated and independent. By completing this project, you will be able to:

  • Implement object-detection CNNs on a state-of-the-art accelerator
  • Optimize CNNs through quantization and pruning to improve overall system performance
  • Test and evaluate solutions for correctness and applicability
  • Present your work in the form of a scientific report

Contact

Nael Fasfous
Department of Electrical and Computer Engineering
Chair of Integrated Systems
Arcisstr. 21
80333 Munich
Germany

Phone: +49.89.289.23858
Building: N1 (Theresienstr. 90)
Room: N2116
Email: nael.fasfous@tum.de

This project is in cooperation with BMW AG.

Advisor:

Nael Yousef Abdullah Al-Fasfous

Implementation of an Approximate FIR Filter on FPGA for Laser Line Extraction from Pixel Data

Description

Current 3D laser line scanners have precision in the micrometer range. These scanners work on the principle of laser triangulation and use a camera chip in the receive path. The captured pixel data is then processed on an FPGA to generate 3D profile data. In order to do this, the laser line, as seen by the camera, must be extracted from the pixel data. For this task, several methods have been proposed. One of these methods employs an FIR filter to calculate the derivative of the incoming pixel stream orthogonally to the laser line direction. Afterwards, the zero crossing of this derivative is detected. The position of the zero crossing marks the position of the laser line in the camera image. From this position, the distance of the laser scanner to the scanned object can be derived.
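The derivative-plus-zero-crossing step can be sketched as follows (a Python reference model with an illustrative [-1, 1] derivative kernel and synthetic pixel data; the real FIR kernel and sub-pixel scheme are part of the thesis):

```python
def laser_line_position(column):
    """Locate the intensity peak in one pixel column: differentiate,
    then find the + to - zero crossing of the derivative with linear
    sub-pixel interpolation."""
    deriv = [b - a for a, b in zip(column, column[1:])]  # [-1, 1] FIR kernel
    for i, (d0, d1) in enumerate(zip(deriv, deriv[1:])):
        if d0 > 0 and d1 <= 0:          # sign change: rising slope ends
            # deriv[i] sits at pixel position i + 0.5; interpolate to zero
            return i + 0.5 + d0 / (d0 - d1)
    return None                         # no laser line in this column

# synthetic laser spot centered between pixels 4 and 5
col = [0, 1, 4, 10, 18, 18, 10, 4, 1, 0]
print(laser_line_position(col))  # 4.5
```

Noise makes the raw derivative cross zero spuriously, which is exactly why the smoothing prefilter mentioned below may be needed before the zero-crossing detection.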

Approximate computing is an emerging design paradigm that trades accuracy for resource consumption, i.e. a certain inaccuracy of the calculations is allowed with the goal of reducing the overall resource consumption of the implemented design. In this thesis, such approximation methods should be integrated into the data processing pipeline and the results should be evaluated.

This thesis includes the implementation of a simple data processing pipeline for the extraction of the laser line from pixel data using an FIR filter-based approach. The implementation should be done in VHDL. Furthermore, the necessity for prefiltering (e.g. smoothing) of the pixel data should be assessed and implemented if necessary. Finally, the potential for the integration of approximate computing methods into the data processing pipeline should be evaluated.

Prerequisites

The student should have the following skills in order to successfully complete the thesis.

  • Good programming skills in VHDL
  • A basic understanding of FIR filter design
  • A basic understanding of image processing
  • The ability to work independently
  • Previous experience with approximate computing is helpful, but not strictly required

The student can work on the thesis remotely from home.

Contact

Arne Kreddig
Doctoral Candidate and FPGA Design Engineer
SmartRay GmbH


arne.kreddig@smartray.com

Advisor:

Arne Kreddig - (SmartRay GmbH)

Graph Neural Network-based Pruning

Description

Convolutional neural networks (CNNs) are the de facto standard for many computer vision (CV) applications, ranging from medical technology and robotics to autonomous driving. However, most modern CNNs are very memory- and compute-intensive, particularly when they are dimensioned for complex CV problems.


Compressing neural networks is essential for a variety of real-world applications. Pruning is a widely used technique for reducing the complexity of a neural network by removing redundant and superfluous parameters. One characteristic of this approach is the pruning granularity, which describes the substructures that should be removed from the neural network. Another aspect is the method for finding the redundant and unused structures, which plays a central role in effective pruning without loss of task-related accuracy. The optimization goal determines which elements (kernel, filter, channel) can be removed from the topology of the CNN.

The goal of this work is to learn the internal relationships between the channels, filters, and kernels of the layers by means of a graph neural network, and to identify their relevance to the classification task of the CNN. The learned relationships are then used to prune the neural network.
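For illustration, the pruning step itself can be sketched with a simple L1-magnitude score standing in for the GNN-learned relevance (Python, with filter shapes simplified to flat weight lists):

```python
def prune_filters(layer_filters, keep_ratio=0.5):
    """Keep only the keep_ratio most relevant filters of one layer.
    layer_filters: list of filters, each a flat list of weights.
    The L1 norm is a placeholder relevance score; in this work it would
    be replaced by the GNN-learned relevance per filter."""
    scores = [sum(abs(w) for w in f) for f in layer_filters]   # L1 norm
    n_keep = max(1, int(len(layer_filters) * keep_ratio))
    # indices of the n_keep highest-scoring filters, in original order
    keep = sorted(sorted(range(len(scores)),
                         key=lambda i: scores[i], reverse=True)[:n_keep])
    return [layer_filters[i] for i in keep]

filters = [[0.9, -0.8], [0.01, 0.02], [-0.5, 0.6], [0.0, 0.1]]
print(prune_filters(filters))  # keeps the two highest-magnitude filters
```

Structured pruning at filter granularity, as sketched here, shrinks the actual layer dimensions, which is what yields real speedups on hardware; the GNN's job is to provide a better relevance score than raw magnitude.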

Prerequisites

To successfully complete this project, you should have the following skills and experiences:

  • Good programming skills in Python and Tensorflow
  • Good knowledge of neural networks, particularly convolutional neural networks

The student is expected to be highly motivated and independent.

Contact

Nael Fasfous
Department of Electrical and Computer Engineering
Chair of Integrated Systems

Phone: +49.89.289.23858
Building: N1 (Theresienstr. 90)
Room: N2116
Email: nael.fasfous@tum.de

Advisor:

Alexander Frickenstein, Nael Yousef Abdullah Al-Fasfous