HSIS - Lehrstuhl für Integrierte Systeme

Scientific Seminar on Integrated Systems

Lecturer (assistant)	Shichen Huang [L], Andreas Herkersdorf, Thomas Wild
Type	Seminar
Duration	3 SWS
Term	Summer semester 2025
Language of instruction	German

Dates

24.04.2025 15:00-16:30 N2128, seminar room

Admission criteria

Note: Limited number of participants! Registration in TUMonline from 27.03.2025 to 23.04.2025. Each student must choose a seminar topic before the introductory session. To do so, the student must contact the supervisor of the respective topic. Topics are assigned in the order in which requests are received. The individual topics will be announced from 07.04.2025 at https://www.ce.cit.tum.de/lis/lehre/seminare/seminar-integrierte-systeme/.

Learning objectives

After participating in the module courses, students are able to analyze problems in the field of integrated systems and their applications and to evaluate solution approaches. They can work independently and scientifically on tasks from a current topic area of integrated systems, carry out their own literature research, and prepare a written report. In addition, students can present their findings to a specialist audience.

Description

Changing focus topics on integrated circuits and systems, as well as their applications.

Participants independently work through current scientific contributions, prepare a graded written report, present their contribution in a colloquium, and contribute to the colloquium discussion.

Prerequisites

Basic knowledge of integrated circuits and systems and their applications.

Teaching and learning methods

Each participant works on an individual technical assignment, primarily through independent work.
Depending on the individual topic, each participant is assigned a dedicated supervisor. The supervisor supports the student particularly at the start of the work by introducing the subject area, providing suitable literature, and giving helpful tips on the technical work as well as on preparing the written report and the presentation.

Coursework and examination

The final grade is composed of the following examination elements:
- 50 % written report (typically 4 pages)
- 50 % presentation of 15-20 minutes plus 5 minutes of discussion

Recommended literature

Topic-specific literature is recommended by the respective supervisor and should be supplemented by the student's own research.

Offered topics

Assigned topics

High Dynamic Range Camera Sensors for Advanced Driver Assistance Systems and Autonomous Drive

Description

Camera sensors are an important input to Advanced Driver Assistance Systems (ADAS) and Autonomous Drive (AD) functions in cars. A key challenge for these sensors is the very high dynamic range of the input signal, combined with the variation in ambient illumination. The candidate should work on understanding the principles of high dynamic range (HDR) image capturing, different pixel technologies for HDR sensing, exposure control for HDR images, and the relation to LED flicker mitigation. The work should also cover algorithms that create HDR images from the captured input data and algorithms that compress high dynamic range images for display to a human driver or for use by a vision processing system.
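As a rough illustration of two of the steps named above (creating an HDR value from several captures, then compressing the range for display), the following minimal sketch may help. It assumes a hypothetical sensor that delivers one long and one short exposure per pixel with a linear response, a 12-bit saturation level, and a simple global logarithmic tone curve. The function names and all constants are assumptions made for illustration and are not taken from the topic description.

```python
import math

SATURATION = 4095  # assumed 12-bit full-scale value (hypothetical sensor)


def merge_exposures(long_px, short_px, exposure_ratio):
    """Merge one pixel from a long and a short exposure into a linear HDR value.

    exposure_ratio = long exposure time / short exposure time.
    The long exposure is used while it is not saturated (less noise in dark
    regions); otherwise the short exposure is scaled to the same units.
    """
    if long_px < SATURATION:
        return float(long_px)
    return float(short_px) * exposure_ratio


def tone_map(hdr_value, hdr_max, out_max=255):
    """Compress a linear HDR value to an 8-bit value with a global log curve."""
    return round(out_max * math.log1p(hdr_value) / math.log1p(hdr_max))


ratio = 16
hdr_max = SATURATION * ratio
# (long, short) raw pixel pairs: a dark, a mid-tone, and a very bright point
pixels = [(40, 2), (3000, 187), (4095, 4000)]
hdr = [merge_exposures(l, s, ratio) for l, s in pixels]
ldr = [tone_map(v, hdr_max) for v in hdr]
print(hdr, ldr)
```

Real sensors and pipelines typically blend exposures smoothly near saturation and use local rather than purely global tone curves; the hard switch here is only meant to show the principle.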

Contact

Dr. Stephan Herrmann

NXP Semiconductors Germany, Munich

Email: stephan.herrmann@nxp.com

Supervisor:

Shichen Huang - Dr. Stephan Herrmann (NXP Semiconductors Germany)

CPU-GPU Heterogeneous Computing Techniques

Description

As computing demands grow increasingly complex, heterogeneous systems that combine CPUs and GPUs have emerged as a powerful solution for accelerating a wide range of workloads. These systems leverage the complementary strengths of CPUs (for control-intensive tasks) and GPUs (for data-parallel processing), enabling more efficient computation. However, coordinating execution and data sharing between these distinct architectures introduces challenges in workload partitioning, memory coherence, and performance optimization.

This seminar topic explores modern techniques for CPU-GPU heterogeneous computing, including task offloading strategies, unified memory models, and runtime scheduling mechanisms. Foundational and recent literature will be provided to support the study.

Through this seminar, participants will develop a deeper understanding of how to design, analyze, and optimize applications that harness both CPU and GPU resources effectively in high-performance and energy-efficient computing environments.

Contact

shichen.huang@tum.de

Supervisor:

Shichen Huang

Time-Division-Multiplexed Network-on-Chips

Description

Overview:

Modern MPSoCs are heavily reliant on efficient and scalable interconnects. However fast or numerous the processors may be, the system will not be able to take advantage of these compute resources unless data and messages can be shared effectively. For this reason, networks-on-chip (NoCs) are a vital part of the design of modern SoCs. NoCs are highly scalable, while still being able to achieve low latency and high bandwidth utilisation.

However, current NoCs are not always suited for time-sensitive applications. Standard NoC designs use a "best-effort" approach; this offers good average performance and can be used with the vast majority of workloads without requiring any modification of NoC components. But best-effort NoCs offer no guarantee that a given transaction is completed in a given timeframe, which makes them wholly unsuited for real-time systems with hard deadlines.

A well-known alternative to best effort is time-division multiplexing (TDM). In TDM NoCs, a global schedule is created and each node is allocated certain time slots in which it may transmit information. This approach allows the transmission times of a given program to be determined exactly at compile time.
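The compile-time predictability of TDM can be made concrete with a small sketch. It assumes a deliberately simplified model that is not taken from any of the cited papers: a single shared periodic slot table, one owner per slot, and waiting time counted in whole slots. The function name `worst_case_wait` and the example schedule are hypothetical.

```python
def worst_case_wait(slot_table, node):
    """Return the largest number of slots `node` may wait before it can send.

    slot_table is a periodic TDM schedule: slot_table[i] is the node that owns
    slot i, and the table repeats forever. A message that becomes ready just
    after one of the node's slots has started must wait for its next slot.
    """
    owned = [i for i, owner in enumerate(slot_table) if owner == node]
    if not owned:
        raise ValueError(f"node {node!r} has no slot in the schedule")
    period = len(slot_table)
    # Distance between consecutive owned slots, wrapping around the period.
    # A node with a single slot gets a distance equal to the full period.
    gaps = [
        (owned[(k + 1) % len(owned)] - owned[k] - 1) % period + 1
        for k in range(len(owned))
    ]
    return max(gaps)


schedule = ["A", "B", "A", "C"]
bounds = {n: worst_case_wait(schedule, n) for n in sorted(set(schedule))}
print(bounds)  # {'A': 2, 'B': 4, 'C': 4}
```

Because the table is fixed, the bound for each node follows from the schedule alone and does not depend on what the other nodes actually send. This is the property that a best-effort NoC cannot offer.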

Task:

For this seminar, the student will investigate TDM and mixed best-effort/TDM NoCs, with the goal of exploring and summarising state-of-the-art TDM NoC techniques, as well as the performance trade-offs of TDM NoCs compared to standard best-effort NoCs.

Relevant literature:

R. A. Stefan, A. Molnos and K. Goossens, "dAElite: A TDM NoC Supporting QoS, Multicast, and Fast Connection Set-Up," in IEEE Transactions on Computers, vol. 63, no. 3, pp. 583-594, March 2014

M. Schoeberl, F. Brandner, J. Sparsø and E. Kasapaki, "A Statically Scheduled Time-Division-Multiplexed Network-on-Chip for Real-Time Systems," 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, Lyngby, Denmark, 2012

S. Hesham, D. Goehringer and M. A. Abd El Ghany, "HPPT-NoC: A Dark-Silicon Inspired Hierarchical TDM NoC with Efficient Power-Performance Trading," in IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 3, pp. 675-694, 1 March 2020

Contact

William Wulff
william.wulff@tum.de

Supervisor:

William Wulff

On-The-Fly Lossless Data Compression Techniques

Description

As systems-on-chip (SoCs) consist of increasing numbers of processing elements, be it separate CPU cores in traditional SoCs or distinct processing dies in chiplet-based architectures, the requirements for transmission bandwidth between these and other system elements keep rising. This demand can be met by scaling the transfer rate and the number of parallel transmission lanes (see, for example, the evolution across PCIe generations), but the area and power consumption of the interconnect rise as well.

Another approach to manage the interconnect load is to reduce the amount of data to transfer in the first place. Besides optimizing the system architecture and applications to require fewer transfers between system components, on-the-fly data compression can be used in systems where area and power consumption are more critical than transmission latency. After data is generated or given as input, it can be compressed before transmitting it over particularly longer-distance links. It can be either decompressed or used as is on the receiving end, depending on the application and the destination element. Example use cases are compressing sensor values or camera images before being processed or stored in memory on the other side of the link or compressing (sparse) matrices into a certain format to be processed by AI applications.

Many algorithms for lossless data compression exist. Those focusing on compression and/or decompression speed instead of compression ratio are more relevant for a system as described. A potential candidate could be the Lempel-Ziv 4 algorithm. This seminar work should investigate the viability of this and other lossless compression algorithms, how data should be structured for efficient operation, and what applications could especially benefit from this approach compared to more classical methods to handle high interconnect bandwidth requirements. Further literature research could look into hardware implementations of the considered algorithms.
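To give a feel for the family of algorithms in question, the sketch below shows a greedy LZ77-style compressor and its decompressor. LZ4 belongs to this family, but the code is not LZ4: it does not produce the LZ4 block format, and it finds matches by exhaustive search over a small window where LZ4 uses a hash table for speed. The token representation, the function names, and the `window` and `min_match` parameters are assumptions chosen for readability.

```python
def lz_compress(data: bytes, window: int = 255, min_match: int = 4):
    """Greedy LZ77-style compression into a list of tokens.

    Tokens are either ("lit", byte_value) or ("match", offset, length), where
    offset counts bytes back from the current position in the output.
    """
    tokens, pos = [], 0
    while pos < len(data):
        best_len, best_off = 0, 0
        for start in range(max(0, pos - window), pos):
            length = 0
            # A match may run past `pos` and overlap the bytes it produces.
            while (pos + length < len(data)
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - start
        if best_len >= min_match:
            tokens.append(("match", best_off, best_len))
            pos += best_len
        else:
            tokens.append(("lit", data[pos]))
            pos += 1
    return tokens


def lz_decompress(tokens) -> bytes:
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, offset, length = token
            for _ in range(length):
                out.append(out[-offset])  # byte-wise copy handles overlaps
    return bytes(out)


sample = b"abcabcabcabcXYZ"
tokens = lz_compress(sample)
restored = lz_decompress(tokens)
print(len(sample), len(tokens), restored == sample)
```

Decompression is a sequence of literal writes and back-copies with no searching, which is one reason this family is attractive when decompression speed matters more than compression ratio.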

Potential starting points could be the following papers:
https://ieeexplore.ieee.org/abstract/document/1549812
https://ieeexplore.ieee.org/abstract/document/7818601
https://koreascience.kr/article/JAKO201313660603091.page
https://cdn.zeropoint-tech.com/f/174713/x/2ef77c7d31/ziptilion-memorycompression-ip-zeropoint-technology-whitepaper-2023-10-18-ver-2-6.pdf

Contact

michael.meidinger@tum.de

Supervisor:

Michael Meidinger

Modern GPU Synchronization Methods in Parallel Computing

Keywords:
GPU, multi-threading, synchronization

Description

As GPU architectures continue to evolve, their ability to execute thousands of parallel threads has become fundamental to accelerating workloads in fields such as deep learning, scientific computing, and real-time graphics. However, this massive parallelism introduces significant challenges in coordinating thread execution and data access across GPU cores and multiple GPUs. Effective synchronization is therefore critical to ensure correct program behaviour, maximize hardware utilization, and achieve optimal performance.

This seminar topic focuses on investigating modern GPU synchronization methods, which provide the necessary mechanisms to coordinate parallel execution while minimizing overhead. Starting-point literature will be provided.

Through this seminar, participants are expected to gain deeper insight into parallel execution and GPU synchronization, preparing them to tackle synchronization challenges in high-performance computing, heterogeneous system design, and GPU programming.

Prerequisites

A fundamental understanding of how GPUs work

Contact

shichen.huang@tum.de

Supervisor:

Shichen Huang

Categorization of Ethernet-Detected Anomalies Induced by Processing Unit Deviations

Description

Sporadic anomalies in automotive systems can degrade performance over time and may originate from various system components. In automotive applications, anomalies are often observed at the sensor and ECU levels, with potential propagation through the in-vehicle network via Ethernet. Such anomalies may be the result of deviations in electronic control units, highlighting the importance of monitoring these signals over Ethernet.

Not all processing anomalies are equally detectable over Ethernet due to inherent limitations in the monitoring techniques and the nature of the anomalies. This seminar will explore various anomaly categories, investigate their potential causes, and assess the likelihood of their propagation through the network.

The goal of this seminar is to provide a comprehensive analysis of these anomaly categories, evaluate the underlying causes, and discuss the potential for their detection and mitigation when monitored over Ethernet.

Contact

Zafer Attal

zafer.attal@tum.de

Supervisor:

Zafer Attal

Analysis Algorithms for Processor Traces and Instructions

Description

Modern CPUs execute a vast number of instructions while managing large volumes of data. On-chip debugging modules, located adjacent to the CPU, play a critical role in capturing valuable execution information. This data is essential for analyzing system behavior and detecting anomalies—such as timing issues or execution faults—that may occur in the processing unit.

Over time, various algorithms have been developed to analyze processor traces and instructions. These algorithms not only deepen our understanding of system behavior but also support the debugging of potential faults and anomalies.

The goal of this seminar is to explore and compare different trace analysis algorithms by evaluating their efficiency, performance, and potential applications in debugging and optimizing processor operations.

Contact

Zafer Attal

zafer.attal@tum.de

Supervisor:

Zafer Attal

Prefetching Techniques Based on Machine Learning

Description

Prefetching techniques are widely used in digital systems to enhance performance. A prefetcher predicts and fetches data before it is actually accessed, thereby hiding memory access latency.

Traditional prefetchers typically consider only one program context and work well with regular memory access patterns. Recently, machine learning techniques such as neural networks and reinforcement learning have been employed in prefetcher design. These machine learning based prefetchers take into account more program and system-level information, allowing them to make smarter decisions. As a result, they often achieve higher accuracy, coverage, and timeliness, leading to improved system performance.
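To make the contrast concrete, a traditional single-context prefetcher can be sketched as a toy stride predictor. The model below is an assumption-laden simplification: it tracks one global stride, treats every issued prefetch as arriving in time, and never evicts prefetched lines. The class name, the `coverage` helper, and both address traces are hypothetical and serve only as illustration.

```python
class StridePrefetcher:
    """Toy stride prefetcher: predicts addr + stride once a stride repeats."""

    def __init__(self):
        self.last_addr = None
        self.last_stride = None

    def access(self, addr):
        """Observe a demand access; return an address to prefetch or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prediction


def coverage(trace):
    """Fraction of accesses that an earlier prediction had already fetched."""
    pf, prefetched, hits = StridePrefetcher(), set(), 0
    for addr in trace:
        if addr in prefetched:
            hits += 1
        nxt = pf.access(addr)
        if nxt is not None:
            prefetched.add(nxt)
    return hits / len(trace)


regular = [0x1000 + 64 * i for i in range(16)]
irregular = [0x1000, 0x5040, 0x1200, 0x9000, 0x1040, 0x7780, 0x2000, 0x3FC0]
print(coverage(regular), coverage(irregular))  # 0.8125 0.0
```

The regular trace is covered after a short warm-up, while the irregular trace is not covered at all. Access patterns of the second kind are where prefetchers that use additional program and system-level information are expected to help.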

The goal of this seminar is to study and compare prefetching mechanisms based on different machine learning methodologies. After reading a selection of papers, you should understand the advantages of using machine learning in prefetching, as well as the challenges associated with its implementation. Starting-point literature will be provided.

Prerequisites

For MSCE/MSEI students

Contact

Yuanji Ye

yuanji.ye@tum.de

Supervisor:

Yuanji Ye

Lehrstuhl für Integrierte Systeme

Prof. Dr. sc.techn. Andreas Herkersdorf

Technische Universität München
Arcisstraße 21
80333 München

Tel.: +49.89.289.22515

Fax: +49.89.289.28323

E-Mail: lis(at)ei.tum.de
