Seminars
A Survey on Cloud-based Software Architectures
Description
- Scalability
- Complexity
- Serviceability
Supervisor:
Anomaly Detection in Video/Sequential Images
Description
In the context of rapid advancements in intelligent transportation and autonomous driving technologies, Anomaly Detection in Video/Sequential Images has emerged as a core technology for ensuring driving safety and enhancing system robustness. Unlike traditional static image recognition, vehicles operate in a highly dynamic, complex, and unpredictable three-dimensional world. Therefore, the ability to process continuous visual information in real time and identify "unusual" events is crucial for intelligent vehicles.
Here, we aim to investigate which algorithms are best suited to the application and what advantages and disadvantages they offer.
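As a toy illustration of the simplest family of such algorithms, the sketch below flags a frame transition as anomalous when the mean absolute pixel difference between consecutive frames exceeds a fixed threshold. All names and the threshold are invented for illustration; practical systems use learned spatio-temporal models rather than raw frame differencing.

```python
# Minimal sketch: score each frame transition by the mean absolute
# difference between consecutive frames, and flag transitions whose
# score exceeds a threshold. Frames are plain 2D grids (lists of lists
# of grayscale values) so the example stays dependency-free.

def frame_diff_scores(frames):
    """Return one anomaly score per frame transition."""
    scores = []
    for prev, cur in zip(frames, frames[1:]):
        total = sum(abs(a - b)
                    for row_p, row_c in zip(prev, cur)
                    for a, b in zip(row_p, row_c))
        n_pixels = len(prev) * len(prev[0])
        scores.append(total / n_pixels)
    return scores

def detect_anomalies(frames, threshold=50.0):
    """Indices of frames that differ 'too much' from their predecessor."""
    return [i + 1 for i, s in enumerate(frame_diff_scores(frames))
            if s > threshold]

# Three nearly identical frames, then one sudden change (the "anomaly").
static = [[10, 10], [10, 10]]
burst = [[200, 200], [200, 200]]
print(detect_anomalies([static, static, static, burst]))  # -> [3]
```

A survey would start from this baseline and compare it against reconstruction-based and prediction-based deep models, which handle camera motion and gradual scene changes far better.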
Supervisor:
Dynamic Resource Allocation in Heterogeneous Multi-core Systems using Reinforcement Learning
Description
With diminishing performance gains from node shrinking in recent years, hardware manufacturers have been gravitating towards architectural improvements to squeeze more performance out of hardware. This shift has pushed the industry increasingly towards heterogeneous multi-core systems. Prominent examples of this are Big.LITTLE cores on ARM CPUs [1] and P-cores and E-cores on Intel CPUs [2]: two types of CPU cores used on the same chip to balance performance and power efficiency. This trend extends beyond the use of multiple core types to include on-chip hardware accelerators.
Resource allocation refers to the assignment of a task to a specific hardware resource (e.g., a CPU core or an accelerator). A naive approach to resource allocation would be to statically assign each task to a specific resource at design time. This, however, can quickly lead to over-provisioning and, consequently, low resource utilization. Also, task execution time can vary widely depending on the input data. Consequently, dynamic resource allocation, in which task/resource mappings are determined at runtime, is explored as an alternative to static allocation. This is a difficult problem to solve, especially considering the hybrid nature of heterogeneous systems.
In this seminar work, we want to survey the literature for different approaches to the dynamic resource allocation problem in heterogeneous multi-core systems using Reinforcement Learning (RL). RL has proven to be a solid choice for solving similar problems where the environment dynamics are either stochastic or too complex to model. There are multiple directions this work can take, such as DAG-based, model-free, and energy-aware methods. The student is free to pursue the work as they please within the pre-defined scope of this description.
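To make the problem setting concrete, here is a hedged sketch of the core idea: a tabular Q-learning agent learns which of two core types each task type should be mapped to. The core types, task types, latency and energy numbers, and the reward shape are all invented for illustration and do not come from any of the cited papers; real approaches operate on far richer state.

```python
import random

# Toy Q-learning mapper: tasks are "heavy" or "light", resources are a
# fast "P" core and an efficient "E" core. The reward penalizes latency
# plus energy, so the agent should learn: heavy -> P, light -> E.
random.seed(0)

CORES = ["P", "E"]
TASKS = ["heavy", "light"]
LATENCY = {("heavy", "P"): 2, ("heavy", "E"): 10,
           ("light", "P"): 1, ("light", "E"): 2}
ENERGY = {"P": 5, "E": 1}        # invented per-task energy cost per core

Q = {(t, c): 0.0 for t in TASKS for c in CORES}
ALPHA, EPSILON = 0.3, 0.2        # learning rate, exploration rate

def reward(task, core):
    return -(LATENCY[(task, core)] + ENERGY[core])

for _ in range(2000):
    task = random.choice(TASKS)
    if random.random() < EPSILON:                       # explore
        core = random.choice(CORES)
    else:                                               # exploit
        core = max(CORES, key=lambda c: Q[(task, c)])
    # Stateless (bandit-style) Q update with a deterministic reward.
    Q[(task, core)] += ALPHA * (reward(task, core) - Q[(task, core)])

policy = {t: max(CORES, key=lambda c: Q[(t, c)]) for t in TASKS}
print(policy)  # -> {'heavy': 'P', 'light': 'E'}
```

The surveyed papers go well beyond this: they handle task dependencies (DAGs), non-stationary workloads, and function approximation instead of a table, which is precisely where the interesting trade-offs lie.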
Here are a couple of papers as a starting point. These are just examples; the student is free to include any papers they deem to fit the topic:
- Grinsztajn, N., Beaumont, O., Jeannot, E., & Preux, P. (2021, September). Readys: A reinforcement learning based strategy for heterogeneous dynamic scheduling. In 2021 IEEE International Conference on Cluster Computing (CLUSTER) (pp. 70-81). IEEE.
- Liu, Y., & Wu, J. (2025, October). Dynamic Scheduling in Heterogeneous Non-Preemptive Pipeline Systems: A Reinforcement Learning Approach. In 2025 5th International Conference on Electronic Information Engineering and Computer Technology (EIECT) (pp. 480-484). IEEE.
- Lin, Z., Li, C., Tian, L., & Zhang, B. (2022). A scheduling algorithm based on reinforcement learning for heterogeneous environments. Applied Soft Computing, 130, 109707.
- Mao, H., Schwarzkopf, M., Venkatakrishnan, S. B., Meng, Z., & Alizadeh, M. (2019). Learning scheduling algorithms for data processing clusters. In Proceedings of the ACM special interest group on data communication (pp. 270-288).
References:
[1] https://www.arm.com/technologies/big-little
[2] https://www.intel.com/content/www/us/en/gaming/resources/how-hybrid-design-works.html
Contact
mohamed.amine.kthiri@tum.de
Supervisor:
Prefetching the Translation Path: MMU-Prefetch Co-Design for I/O Devices and Accelerators
Description
Modern I/O devices and accelerators increasingly rely on virtual memory support in order to simplify programming, improve isolation, and enable shared virtual address spaces with CPUs. However, address translation on the device side is often expensive. When an IOMMU or device TLB misses, the accelerator may suffer significant stalls due to multi-level page table walks and limited translation locality. Prior work has shown that translation overhead can become a major bottleneck for accelerators and that hiding or restructuring translation latency is an important architectural problem.
At the same time, traditional prefetching research mainly focuses on data accesses, while the translation path itself can also be seen as a target for latency hiding. This raises an interesting question: instead of only prefetching data, can we also prefetch translation-related information, such as page-table entries, or redesign the translation path so that address translation and data access can overlap more effectively? Recent and prior studies suggest that this direction is promising for accelerator-centric systems.
The goal of this seminar is to build a clear architectural understanding of how prefetching and address translation can be combined for accelerators or I/O systems. The seminar will compare different approaches, identify their main design trade-offs, and discuss whether translation-path prefetching could become an important design direction for future heterogeneous systems.
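The potential benefit of overlapping translation with data access can be seen in a deliberately simplified simulation. This is not any real IOMMU design: the cycle costs are made-up constants, the "prefetcher" is a trivial next-page predictor, and the walk is assumed to be fully hidden behind the current access, which is the best case.

```python
# Toy model: a device streams through virtual pages. Each device-TLB
# miss costs a multi-level page-table walk; a translation prefetcher
# starts the walk for the predicted next page while the current data
# access is in flight, so the walk latency is (optimistically) hidden.

WALK_CYCLES = 100   # invented cost of a page-table walk on a TLB miss
ACCESS_CYCLES = 10  # invented cost of the data access itself

def run(pages, prefetch_next_translation):
    tlb, cycles = set(), 0
    for p in pages:
        if p not in tlb:
            cycles += WALK_CYCLES   # device stalls for the walk
            tlb.add(p)
        cycles += ACCESS_CYCLES
        if prefetch_next_translation:
            # Translate the (predicted) next page now; its walk
            # overlaps with the current data access in this model.
            tlb.add(p + 1)
    return cycles

stream = list(range(16))  # sequential pages: worst case for a cold TLB
print(run(stream, False), run(stream, True))  # -> 1760 260
```

A real comparison would add finite TLB capacity, mispredicted prefetches, and contention on the page-table walker, which is exactly where the surveyed designs differ.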
Prerequisites
- Basic Knowledge of Computer Architecture
- Good English Skills
Contact
Yuanji Ye
yuanji.ye@tum.de
Supervisor:
Prefetching for LLM Inference: KV Cache Movement
Description
Large language model (LLM) inference is increasingly limited by memory access rather than pure computation, especially in long-context and decoding scenarios. A major reason is the high cost of moving model weights and KV cache data across the memory hierarchy. Recent work shows that prefetching can be used to overlap memory movement with ongoing computation or communication.
Compared with traditional hardware prefetching, LLM inference introduces a different setting. The prefetched object is no longer a small cacheline but larger units such as KV blocks. Moreover, prefetch timing depends on token generation, layer execution order, and runtime scheduling. Some recent studies also suggest that temporal patterns in attention behavior can be exploited to guide KV cache management and cross-token prefetching more effectively.
The goal of this seminar is to build a clear understanding of how prefetching concepts can be extended to AI inference systems. The student will compare recent approaches for KV cache and weight prefetching, discuss and summarize their architectural trade-offs, and determine whether LLM-serving workloads require new prefetching principles beyond those in CPU memory hierarchies.
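The overlap idea at the heart of these approaches can be sketched in a few lines: while attention for layer i computes, the KV block for layer i+1 is fetched in the background. Here "fetch" and "compute" are just sleeps with invented durations; real systems move tensors between GPU, CPU, and storage tiers, and timing depends on the serving runtime.

```python
import time
from concurrent.futures import ThreadPoolExecutor

FETCH_S, COMPUTE_S = 0.02, 0.02   # invented, equal costs for clarity

def fetch_kv(layer):
    time.sleep(FETCH_S)           # stand-in for a KV block transfer
    return f"kv[{layer}]"

def attention(kv):
    time.sleep(COMPUTE_S)         # stand-in for the attention compute

def decode(num_layers, prefetch):
    """One decode step over all layers; returns wall-clock seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_kv, 0) if prefetch else None
        for layer in range(num_layers):
            kv = pending.result() if prefetch else fetch_kv(layer)
            if prefetch and layer + 1 < num_layers:
                # Start the next layer's KV transfer now, so it
                # overlaps with this layer's attention below.
                pending = pool.submit(fetch_kv, layer + 1)
            attention(kv)
    return time.perf_counter() - start

naive, overlapped = decode(8, False), decode(8, True)
print(f"serial {naive:.2f}s, prefetch {overlapped:.2f}s")
```

With equal fetch and compute times, the prefetched run hides almost all transfer time; the interesting open questions are what to prefetch (which KV blocks) and when, which is where the surveyed papers diverge.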
Prerequisites
- Basic Knowledge of Computer Architecture
- Good English Skills
Contact
Yuanji Ye
yuanji.ye@tum.de
Supervisor:
Software-Based Control Flow Integrity
Description
This seminar will examine software-based Control-Flow Integrity (CFI) methods, focusing on techniques such as static analysis, compiler instrumentation, and runtime checks to enforce control-flow security at the program level. It will evaluate their advantages, including wide applicability and easier integration with existing systems, as well as limitations such as higher runtime overhead and potential precision issues. The seminar will also briefly compare these approaches with hardware-based CFI methods, emphasizing trade-offs between performance, security guarantees, and deployment complexity.
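The forward-edge check that compiler instrumentation inserts can be modeled in a few lines. This Python sketch only mirrors the logic: a real CFI pass computes the set of valid targets per call site statically and emits the check in machine code; all names here are invented.

```python
# Model of a forward-edge CFI check: before an "indirect call", verify
# that the target is in the statically computed set of legal targets
# for this call site, and abort on a violation.

def safe_handler():
    return "handled"

def attacker_gadget():
    return "pwned"

# Stand-in for the static-analysis result for one call site.
VALID_TARGETS = {safe_handler}

def indirect_call(target):
    if target not in VALID_TARGETS:     # the inserted CFI check
        raise RuntimeError("CFI violation: illegal control transfer")
    return target()

print(indirect_call(safe_handler))       # legitimate call succeeds
try:
    indirect_call(attacker_gadget)       # hijacked pointer is rejected
except RuntimeError as e:
    print(e)
```

The precision issues mentioned above show up exactly here: if the analysis over-approximates `VALID_TARGETS`, an attacker may still reach gadgets inside the allowed set; the runtime overhead comes from executing this check on every indirect transfer.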
Contact
Technische Universität München
TUM School of Computation, Information and Technology
Lehrstuhl für Integrierte Systeme
Arcisstr. 21
80333 München
Tel.: +49.89.289.22963
Fax: +49.89.289.28323
Gebäude: N1 (Theresienstr. 90)
Raum: N2138
Email: ibai.irigoyen(at)tum.de
Supervisor:
Hardware-based Control-Flow Integrity
Description
This seminar will explore hardware-based Control-Flow Integrity (CFI) methods, focusing on techniques such as shadow stacks, branch monitoring, and specialized instructions to enforce control-flow security at the processor level. It will analyze their benefits, including strong security guarantees and low runtime overhead, as well as drawbacks like hardware complexity and limited compatibility with legacy systems. The seminar will also briefly compare these methods with software-based CFI approaches, highlighting trade-offs such as flexibility versus performance and ease of deployment.
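Of the techniques above, the shadow stack is the easiest to model. In the sketch below both stacks are plain Python lists; the point of a hardware shadow stack is that, unlike this model, it cannot be modified by ordinary stores, so only the call/return instructions touch it.

```python
# Backward-edge CFI via a shadow stack: every call pushes the return
# address to both the normal stack and a protected shadow stack; every
# return compares the two and traps on a mismatch.

class ShadowStackCPU:
    def __init__(self):
        self.stack = []     # normal, attacker-writable stack
        self.shadow = []    # protected copy, push/pop only in hardware

    def call(self, return_addr):
        self.stack.append(return_addr)
        self.shadow.append(return_addr)

    def ret(self):
        addr = self.stack.pop()
        if addr != self.shadow.pop():   # overwritten return address
            raise RuntimeError("shadow stack mismatch: return hijack blocked")
        return addr

cpu = ShadowStackCPU()
cpu.call(0x400123)
print(hex(cpu.ret()))               # benign return succeeds

cpu.call(0x400200)
cpu.stack[-1] = 0xdeadbeef          # simulated buffer-overflow overwrite
try:
    cpu.ret()
except RuntimeError as e:
    print(e)
```

The low runtime overhead claimed for hardware CFI follows from this structure: the comparison happens in the return path anyway, whereas a software shadow stack must emit extra loads and stores around every call.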
Contact
Email: ibai.irigoyen(at)tum.de
Supervisor:
Intrusion Detection Systems for AVTP
Description
As automotive architectures transition from legacy bus systems to high-speed Automotive Ethernet, the Audio Video Transport Protocol (AVTP) has become the standard for transporting time-sensitive data, including ADAS camera feeds, infotainment streams, and critical control traffic. However, this transition opens new attack vectors: an adversary could inject malicious frames to freeze a backup camera or spoof control messages within the AVTP stream, potentially leading to catastrophic failures.
This seminar focuses on the unique challenges and methodologies for implementing Intrusion Detection Systems (IDS) within the IEEE 1722 framework.
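One simple IDS signal available in this setting comes from the 8-bit sequence number that IEEE 1722 stream frames carry, which increments per frame modulo 256: injected or replayed frames typically break this pattern. The sketch below checks only this one field; real AVTP IDSs combine it with timing, stream-ID, and payload checks, and the resynchronization policy shown here is one invented choice among several.

```python
# Flag frames whose 8-bit AVTP sequence number does not follow its
# predecessor (mod 256). After an alert, resynchronize to the observed
# value so a single injected frame does not flag the whole stream.

def check_stream(seq_nums):
    """Return indices of frames with an unexpected sequence number."""
    alerts, expected = [], None
    for i, seq in enumerate(seq_nums):
        if expected is not None and seq != expected:
            alerts.append(i)
        expected = (seq + 1) % 256
    return alerts

# Normal counting with 8-bit wrap-around, then one injected frame (99).
stream = [253, 254, 255, 0, 1, 99, 2]
print(check_stream(stream))  # -> [5, 6]
```

Note that the legitimate frame after the injection (sequence 2) is also flagged, because the detector resynchronized to the attacker's value; handling such ambiguity robustly is one of the challenges this seminar examines.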
Supervisor:
Userspace Scheduling in Linux
Description
Efficient use of the available computing resources is important for achieving the best possible performance of a computing platform. A key component in such architectures is the operating system, which provides an interface for other software and applications to run and use the hardware.
As many applications can run concurrently and therefore compete for the available compute resources, the operating system is responsible for providing the desired scheduling. While this has historically been the job of the kernel, numerous approaches exist to move some functionality of task and thread scheduling into user space.
This seminar topic should analyze the different approaches to userspace scheduling, highlighting their differences in implementation, performance, and motivation.
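The essence of moving scheduling out of the kernel can be illustrated with a toy cooperative scheduler: tasks are generators that yield at voluntary preemption points, and a round-robin loop running in ordinary user code, not the kernel, decides which task runs next. All names here are invented; real userspace schedulers add preemption, blocking I/O handling, and kernel cooperation.

```python
from collections import deque

# Cooperative round-robin scheduler in user space: each task runs until
# its next `yield`, then goes to the back of the ready queue.

class RoundRobinScheduler:
    def __init__(self):
        self.ready = deque()

    def spawn(self, task):
        self.ready.append(task)

    def run(self):
        trace = []
        while self.ready:
            task = self.ready.popleft()
            try:
                trace.append(next(task))  # run until the next yield
                self.ready.append(task)   # still runnable: requeue
            except StopIteration:
                pass                      # task finished
        return trace

def worker(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"               # voluntary yield point

sched = RoundRobinScheduler()
sched.spawn(worker("A", 2))
sched.spawn(worker("B", 3))
print(sched.run())  # -> ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

Comparing this cooperative model against kernel-assisted mechanisms (where the kernel delegates scheduling decisions to a userspace agent) is a natural axis along which the surveyed approaches differ.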