
CeCaS
Mannheim CeCaS is a supra-regional research project funded by the BMBF to develop a "Central Car Server" for future automated, connected and electrified vehicles. The project network consists of numerous industrial partners, accompanied by several academic research groups.
Overarching objective: an automotive supercomputing platform, i.e. a powerful Central Car Server concept based on new automotive-qualified high-performance processors in FinFET technology, supported by application-specific accelerators and an adaptive automotive software stack for highly automated, connected vehicles.
At the Technical University of Munich, three chairs (TUM-AIR, TUM-LIS, TUM-SEC) are involved in the CeCaS project network, contributing in the areas of model-based development, requirements management, software architecture, memory technology, and security.
Contribution of LIS
TUM-LIS is developing approaches for intelligent pre-fetching and write-back of data by the memory controller to increase the performance of the automotive processor. In addition, a prediction model for future addresses and data accesses is being investigated using machine learning methods such as reinforcement learning.
The current approach provides a wrapper layer around the DDR controller that realizes this functionality. It reduces access latencies to external volatile and non-volatile main memory by adaptively prefetching data and instructions into fast on-chip SRAM and by intelligently writing modified data from the SRAM back to external main memory.
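The idea of prefetching pages into on-chip SRAM and writing modified pages back on eviction can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the page size, the LRU replacement, and the naive next-page prefetch policy are placeholders chosen for illustration, and `PrefetchBuffer` is a hypothetical name, not the actual wrapper layer.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # bytes per page; an assumed value for illustration


class PrefetchBuffer:
    """Toy model of an on-chip SRAM buffer that holds prefetched pages."""

    def __init__(self, capacity_pages):
        # At least two pages, so a prefetch never evicts the page just accessed.
        assert capacity_pages >= 2
        self.capacity = capacity_pages
        self.pages = OrderedDict()  # page number -> dirty flag, in LRU order
        self.hits = 0
        self.misses = 0
        self.writebacks = 0

    def _load(self, page):
        # Evict the least recently used page if the SRAM is full.
        if len(self.pages) >= self.capacity:
            _, dirty = self.pages.popitem(last=False)
            if dirty:
                self.writebacks += 1  # modified data goes back to main memory
        self.pages[page] = False

    def access(self, address, is_write=False):
        page = address // PAGE_SIZE
        if page in self.pages:
            self.hits += 1
            self.pages.move_to_end(page)
        else:
            self.misses += 1
            self._load(page)
        if is_write:
            self.pages[page] = True  # mark the page as modified
        # Naive next-page prefetch (assumed policy): fetch the following page.
        if page + 1 not in self.pages:
            self._load(page + 1)
```

With a sequential access stream, only the first access misses in this model, because each access triggers the prefetch of the following page. A dirty page costs one write-back when it is eventually evicted.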
In the work on the wrapper layer we cooperate with TUM-SEC, who investigate suitable lightweight techniques for transparent on-the-fly encryption and decryption of data stored in external memory to prevent unauthorized access, as well as error correction codes.

Workflow
In the CeCaS project we take a two-sided approach. On the one hand, we examine various implementation concepts and approaches with a SystemC-based simulation model together with our partners. On the other hand, we are working on an FPGA implementation, which operates at a lower level of abstraction and therefore allows even more precise analyses. Both areas regularly offer topics for student work.
Involved Researchers
Open Student Work
Current Student Work
FPGA System Extension with Encryption and Multicore Support
Description
As part of this working student position, an existing FPGA-based system architecture and its associated synthesis and configuration toolchain will be extended. The goal is to enhance the platform’s functionality and flexibility by integrating additional hardware components and improving multicore support.
One focus of this work is the integration of an already developed encryption unit into the existing data path architecture. The unit will be placed between the preload unit and the memory controller, enabling transparent encryption and decryption of memory accesses. In addition to the hardware integration itself, the synthesis toolchain must be adapted to properly include and configure the new component within the build process.
A second focus is the extension of the synthesis workflow to support all four instantiated CVA6 cores. The objective is to establish a consistent multicore workflow, including build process adjustments, memory initialization, and program deployment across all cores. As part of this task, a simple demonstration application should be developed that runs on all cores in parallel (e.g., a minimal “Hello World” program without shared variables) to validate correct system behavior.
The tasks include both hardware-related work (such as RTL-level integration and interface adaptation) and activities in the area of build and synthesis automation. The overall goal is to establish a stable and reproducible workflow that seamlessly integrates the new components into the existing infrastructure.
Prerequisites
- Good knowledge of MPSoCs
- Good C programming skills
- Very good VHDL programming skills
- High motivation
- Independent working style
Contact
Oliver Lenke
o.lenke@tum.de
Development and Integration of a 1G Ethernet Streaming Module for FPGA-Based Performance Analysis
Description
In this research internship (Forschungspraxis), an Ethernet-based communication module for efficiently transferring runtime and performance data out of an FPGA system will be developed and integrated. The goal is to extend an existing hardware prototype with a high-performance, non-intrusive streaming interface that enables continuous analysis of system behavior during execution.
The core of the work is the integration of a 1G Ethernet IP core on the Xilinx VCU118 FPGA board. Building on this, a stable point-to-point connection between the FPGA and a host PC will be established. Data is transferred in packets over Ethernet and serves to export metrics generated within the system.
The data to be transferred consists of a set of configurable performance metrics with different bit widths. These are passed to the Ethernet module via a FIFO interface. The data sources are an AXI-based traffic analysis unit and the performance counters of the existing preload unit.
In addition to hardware development, the work includes adapting and extending an existing Python-based GUI on the host PC, which visualizes and analyzes the received data in real time. The interface between hardware and software should be designed so that new metrics can be integrated easily.
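One way to keep the hardware/software interface easy to extend is to describe the metric layout in a single table that both the packer and the host-side parser share. The following Python sketch assumes a hypothetical layout (`METRICS`, with invented metric names and bit widths) and big-endian packing. It is not the project's actual packet format.

```python
# Hypothetical metric layout as (name, bit width) pairs.
# Names and widths are invented for illustration only.
METRICS = [
    ("read_count", 32),
    ("write_count", 32),
    ("hit_rate_pct", 7),
    ("core_id", 2),
]


def pack_metrics(values, layout=METRICS):
    """Pack metric values back to back, first field in the most significant bits."""
    word = 0
    for name, width in layout:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    total_bits = sum(width for _, width in layout)
    # Round up to whole bytes; unused padding bits sit at the top and are zero.
    return word.to_bytes((total_bits + 7) // 8, "big")


def unpack_metrics(payload, layout=METRICS):
    """Inverse of pack_metrics: recover the named fields from a payload."""
    word = int.from_bytes(payload, "big")
    values = {}
    # Fields were shifted in front to back, so extract them back to front.
    for name, width in reversed(layout):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

Under this scheme, adding a metric only requires appending one entry to the layout table, provided the hardware side emits the fields in the same order.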
Prerequisites
- Good knowledge of MPSoCs
- Good C programming skills
- Very good VHDL programming skills
- High motivation
- Independent working style
Contact
Oliver Lenke
o.lenke@tum.de
Generic AXI Traffic Analyzer for Characterizing Memory and Bus Accesses in Embedded Systems
Description
The goal of this bachelor thesis is the design and prototypical implementation of a generic AXI traffic analyzer that can be instantiated at any point of a system with an AXI interface. The analyzer should make it possible to capture and evaluate bus traffic flexibly at different observation points. The result should be a reusable hardware component for characterizing communication and memory access patterns.
In addition to classic figures such as data rate, access rate, or the ratio of read to write accesses, the thesis will investigate and implement more advanced metrics that describe access behavior. These include the distribution of strides between consecutive memory accesses, the stability of these strides over time, the dominance or distribution of individual cores or initiators on the bus, and an estimate of the current working set, for example via the number of memory pages in use at the same time. Furthermore, the thesis will analyze whether an observed system shows rather constant or strongly fluctuating access behavior.
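A few of these metrics can be prototyped offline on a recorded address trace before committing to a hardware architecture. The sketch below assumes a trace given as `(address, is_write)` tuples and a 4 KiB page size. The function name `analyze_trace` and the definition of stride stability as the share of the dominant stride are illustrative choices, not definitions taken from the thesis.

```python
from collections import Counter


def analyze_trace(trace, page_size=4096):
    """Compute simple access metrics for a list of (address, is_write) tuples."""
    reads = sum(1 for _, is_write in trace if not is_write)
    writes = len(trace) - reads
    addresses = [address for address, _ in trace]

    # Stride = address difference between consecutive accesses.
    strides = Counter(b - a for a, b in zip(addresses, addresses[1:]))
    if strides:
        dominant_stride, count = strides.most_common(1)[0]
        stride_stability = count / (len(addresses) - 1)
    else:
        dominant_stride, stride_stability = None, 0.0

    # Working-set estimate: number of distinct pages touched in the trace.
    working_set_pages = len({address // page_size for address in addresses})

    return {
        "reads": reads,
        "writes": writes,
        "stride_histogram": dict(strides),
        "dominant_stride": dominant_stride,
        "stride_stability": stride_stability,
        "working_set_pages": working_set_pages,
    }
```

Applying the function to successive windows of a longer trace and comparing the results gives a first indication of whether the access behavior is constant or strongly fluctuating.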
Initially, the AXI traffic analyzer should be designed so that its measurement data can be read out over the bus. However, the architecture and interface definition should already take into account a later extension to a non-intrusive Ethernet streaming variant. In the long term, the analyzer should thus not only provide locally readable statistics but also support continuous transmission of measurement data over Ethernet, as soon as a corresponding interface is available in the overall system.
As part of the thesis, the requirements for such an analyzer will be defined, a hardware architecture will be designed, and a first prototype will be implemented and evaluated by way of example.
Prerequisites
- Good knowledge of MPSoCs
- Good C programming skills
- Very good VHDL programming skills
- High motivation
- Independent working style
Contact
Oliver Lenke
o.lenke@tum.de
Evaluation of a Page-Based Memory Preload Architecture Using Standardized Embedded Benchmarks
Description
Modern MPSoC architectures are increasingly limited by off-chip memory latency. To mitigate this bottleneck, a page-based hardware preload unit has been developed that speculatively transfers DRAM pages upon last-level cache misses in order to hide memory access latency.
The goal of this bachelor thesis is to perform a systematic and scientifically sound evaluation of this architecture using internationally recognized embedded benchmark suites. The work will focus on identifying, porting, and executing suitable bare-metal benchmarks on an FPGA-based RISC-V platform (CVA6 architecture). Candidate benchmark suites include Embench, CoreMark, PolyBench/C, MiBench, and other memory-intensive workloads. The final selection will be made during the course of the thesis based on feasibility and relevance.
The thesis involves implementing the benchmarks in the existing hardware/software framework, conducting structured performance measurements, and comparing different system configurations (e.g., with and without the preload unit). Particular emphasis will be placed on analyzing memory behavior, working-set characteristics, and access patterns.
Beyond implementation, the thesis will provide a scientific evaluation of how different workload classes interact with page-based preloading. Results will be analyzed quantitatively and presented in a clear and reproducible manner using normalized speedups and workload classifications.
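As a sketch of how normalized speedups might be computed from measured cycle counts, the following Python snippet divides the baseline runtime by the runtime with the preload unit and summarizes the per-benchmark ratios with a geometric mean. The function names are hypothetical, and using the geometric mean is an assumption (it is a common choice for averaging ratios), not something prescribed by the thesis description.

```python
import math


def normalized_speedups(baseline_cycles, preload_cycles):
    """Speedup per benchmark: baseline runtime divided by runtime with preloading."""
    return {
        name: baseline_cycles[name] / preload_cycles[name]
        for name in baseline_cycles
    }


def geometric_mean(values):
    """Geometric mean of positive ratios."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

A speedup above 1.0 means the configuration with the preload unit finished in fewer cycles than the baseline. Any numbers fed into these functions must come from actual measurements on the platform.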
The outcome of this work will provide a solid experimental foundation for further research and potential publications in the area of memory-optimized MPSoC architectures.
Prerequisites
- Good knowledge of MPSoCs
- Good C programming skills
- Basic understanding of hardware-oriented programming style
- High motivation
- Independent working style
Contact
Oliver Lenke
o.lenke@tum.de
Balancing Preload Efficiency and Responsiveness through Adaptive Burst Lengths
Description
Page-based memory preloading typically relies on fixed burst lengths to transfer data efficiently from DRAM. While long bursts maximize preload throughput, they reduce responsiveness to demand-driven CPU memory accesses. Short bursts improve reactivity but underutilize available memory bandwidth.
This thesis builds on the existing page-based preload unit and investigates a hardware-based mechanism for dynamically adjusting the preload burst length according to current memory system utilization. The goal is to balance preload efficiency and fast reaction to demand accesses at runtime. The proposed mechanism adapts the burst length based on simple runtime indicators such as DRAM activity or the presence of competing CPU requests. The implementation extends the existing preload FSM and does not require any modifications to the CPU microarchitecture.
Evaluation on an FPGA-based platform analyzes execution time, interference with demand accesses, and bandwidth utilization under different memory-intensive workloads. The results aim to demonstrate that adaptive burst sizing is an effective and low-overhead technique to improve the robustness of memory-side preloading.
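The trade-off between long and short bursts can be captured in a very small control rule. The Python sketch below shows one conceivable policy: halve the burst length when a CPU request is pending or the DRAM is busy, and double it otherwise. The limits `MIN_BURST` and `MAX_BURST`, the utilization threshold, and the halve/double rule are all assumptions for illustration and do not describe the mechanism developed in the thesis.

```python
MIN_BURST = 4   # shortest burst in beats (assumed)
MAX_BURST = 64  # longest burst in beats (assumed)


def next_burst_length(current, cpu_request_pending, dram_utilization,
                      busy_threshold=0.75):
    """Return the burst length to use for the next preload transfer."""
    if cpu_request_pending or dram_utilization > busy_threshold:
        # Back off quickly so that demand accesses are served sooner.
        return max(MIN_BURST, current // 2)
    # Memory is idle enough: grow the burst to improve preload throughput.
    return min(MAX_BURST, current * 2)
```

In hardware, a rule of this kind would map to a small register plus shift logic inside the preload FSM, which is in line with the stated goal of low overhead.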
Prerequisites
- Good knowledge of MPSoCs
- Good C programming skills
- High motivation
- Independent working style
Contact
Oliver Lenke
o.lenke@tum.de
Self-Adaptive Control of Page-Based Preloading
Description
Hardware preloading mechanisms, as investigated in the current project, must balance latency hiding against bandwidth efficiency. While page-based preloading can effectively amortize memory access latency, it may also lead to unnecessary bandwidth consumption when memory access patterns change dynamically. This thesis builds upon an existing page-based preload unit that incrementally transfers memory pages. The current baseline design employs fixed policies for page switching and preload completion, independent of the observed usefulness of partially loaded pages.
The objective of this work is to enhance the preload unit with self-adaptive control mechanisms that dynamically adjust preloading behavior based on runtime feedback. In particular, the preload unit shall monitor the usefulness of partially preloaded pages, enabling early termination of preload operations when further progress is unlikely to produce additional cache hits. Furthermore, the student will extend the page switching logic to consider multiple factors, including page priority, observed reuse, preload progress, and interruption frequency.
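Usefulness-based early termination could, for example, compare the number of preloaded chunks of a page with the number of those chunks that were later hit. The following Python sketch is purely illustrative: `PagePreloadState`, the chunk granularity, and both thresholds are hypothetical and would have to be chosen and validated in the thesis itself.

```python
from dataclasses import dataclass


@dataclass
class PagePreloadState:
    """Per-page counters a preload unit might track (hypothetical)."""
    chunks_loaded: int = 0  # chunks of this page transferred so far
    chunks_hit: int = 0     # preloaded chunks that were accessed afterwards

    def usefulness(self):
        if self.chunks_loaded == 0:
            return 0.0
        return self.chunks_hit / self.chunks_loaded


def should_terminate(state, min_chunks=8, min_usefulness=0.25):
    """Stop preloading a page once enough chunks show too few hits."""
    # Wait for a minimum sample before judging the page.
    if state.chunks_loaded < min_chunks:
        return False
    return state.usefulness() < min_usefulness
```

The same counters could also feed a page-switching priority, for instance by weighting observed reuse against preload progress and interruption frequency.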
Experimental observations indicate that pages suspended due to page switching are often not accessed again in the future, raising the question under which conditions such pages should remain in the priority queue for continued preloading. In addition, the thesis will investigate more advanced priority strategies that augment the existing temporal and spatial locality scheme with runtime feedback and observed memory behavior.
All proposed strategies shall be developed and evaluated using RTL-based simulation as well as FPGA-based implementations. The overall goal of this thesis is to derive a self-adaptive and enhanced version of the existing preload unit. The enhanced design will be evaluated using synthetic microbenchmarks and representative benchmarks. The evaluation will quantify improvements in bandwidth efficiency and preload effectiveness under varying memory access characteristics.
Prerequisites
- Strong experience with VHDL coding
- Basic knowledge of C programming
- Basic knowledge of MPSoCs, cache hierarchies, etc.
Contact
Oliver Lenke
o.lenke@tum.de
Completed Student Work