Interested in an internship or a thesis?
New topics are often in preparation and not yet listed here. It is sometimes also possible to define a topic that matches your specific interests. If you are interested in contributing to our work, do not hesitate to contact our scientific staff. For further questions concerning a thesis at the institute, please contact Dr. Thomas Wild.
Design and Deployment of a Lightweight On-Device Classifier for ECU Anomaly Categorization
Description
About the Project
Modern vehicles rely on complex distributed systems and generate extensive runtime data from ECUs and in-vehicle networks. These data streams must be analyzed effectively to detect sporadic anomalies. The Diagnosis Unit (DU) currently has no integration with the cloud, which limits remote configuration and coordination of local DUs during runtime. In highly automated vehicles, real-time anomaly diagnosis is essential for safety, reliability, and early intervention. The current DU architecture detects anomalies via Ethernet snooping and trace monitoring but lacks the embedded intelligence to categorize them autonomously.
Project Description
This thesis aims to bridge that gap by developing and deploying a lightweight Machine Learning classifier capable of locally identifying the type of anomaly based on metadata (e.g., message rates, ID sequences) and trace-level indicators (e.g., control flow deviations, instruction durations, executed functions). The classifier must be tailored for low-power, runtime embedded systems like the ZCU102 board, ensuring it meets latency, memory, and CPU constraints.
The key tasks include:
- Build an anomaly classification dataset using real and synthetic traces.
- Design a minimal-overhead classifier suitable for embedded edge platforms.
- Compare classification techniques (e.g., decision trees, TinyML NNs, rule-based logic).
- Optimize the model for execution speed and memory footprint.
- Integrate and validate the classifier within the DU software stack.
- Quantitatively evaluate accuracy, timing, and resource utilization under realistic conditions.
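As a point of reference for the comparison task, a rule-based baseline could look like the sketch below. The feature names, thresholds, and category labels are hypothetical placeholders and are not part of the DU software; realistic values would have to come from the dataset built in the first task. The timing helper only illustrates how a mean inference time could be measured on the development host.

```python
import time
from dataclasses import dataclass


@dataclass
class AnomalyFeatures:
    """Hypothetical feature vector for one detected anomaly."""
    msg_rate_hz: float           # observed message rate on the snooped link
    expected_rate_hz: float      # nominal message rate for that stream
    unknown_id_ratio: float      # share of message IDs outside the known sequence
    max_func_duration_us: float  # longest function duration seen in the trace
    duration_budget_us: float    # nominal upper bound for that duration


def classify(f: AnomalyFeatures) -> str:
    """Rule-based baseline; all thresholds are illustrative assumptions."""
    if f.unknown_id_ratio > 0.2:
        return "id_sequence_violation"
    if f.msg_rate_hz > 1.5 * f.expected_rate_hz:
        return "message_flooding"
    if f.msg_rate_hz < 0.5 * f.expected_rate_hz:
        return "message_loss"
    if f.max_func_duration_us > f.duration_budget_us:
        return "timing_violation"
    return "unknown"


def mean_inference_time_us(samples, repeats=1000):
    """Average wall-clock time per classification in microseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        for s in samples:
            classify(s)
    elapsed = time.perf_counter() - start
    return elapsed * 1e6 / (repeats * len(samples))
```

A decision tree or a TinyML network would replace `classify` while keeping the same feature vector, so that accuracy, latency, and footprint can be compared on equal terms.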
Key Responsibilities:
- Dataset Generation: Create labeled datasets using synthetic trace injections and logged anomaly traces from Aurix boards.
- Model Development:
  - Design candidate classifiers using scikit-learn and/or TensorFlow Lite for Microcontrollers.
  - Evaluate trade-offs: accuracy vs. latency vs. footprint.
- Embedded Integration:
  - Port the final model to C/C++ for execution on the DU Processing System (Linux).
  - Interface the classifier with the DU anomaly metadata and trace analyzer.
- Evaluation:
  - Test the classifier on live or replayed data.
  - Measure detection latency, false positives/negatives, inference time, and CPU/RAM usage.
- Reporting & Documentation:
  - Document the training pipeline, performance evaluation, and embedded integration.
  - Prepare the thesis manuscript and possibly a conference/poster paper.
Prerequisites
Required Skills:
- Proficiency in Python and C/C++.
- Solid understanding of classification algorithms and ML evaluation metrics.
- Knowledge of real-time systems, SoC platforms, or embedded diagnostics.
- Familiarity with Linux-based systems, cross-compilation, and performance profiling.
- (Optional) Experience with Zynq boards, TinyML, or vehicle diagnostics.
Expected Deliverables:
- A functioning, embedded ML-based classification module for the DU.
- Labeled dataset and training pipeline.
- Comprehensive performance report (accuracy, timing, and system load).
- Integration with DU demonstrator showing real-time anomaly categorization.
- Final thesis manuscript and presentation.
Benefits:
- Direct impact on enhancing autonomous diagnosis in smart automotive systems.
- Hands-on deployment of real ML models in embedded systems.
- Contribute to the first intelligent self-assessing DU prototype.
- Potential for academic publication or continuation into research/industry projects.
Contact
Zafer Attal
zafer.attal@tum.de
Supervisor:
A Flexible Memory Preload & Security Extension for Multicore Architectures
Description
We are developing a dynamically configurable preload and crypto unit.
Specifically, this involves two features: dynamic activation and deactivation of the preload and/or decryption function via a software write, and a self-adaptive burst length when the decryptor is disabled (otherwise, the burst length is fixed at 64 B).
This requires routing the AXI ID to the memory controller, which enables the separation of memory access streams per core (the multicore aspect), and providing a memory-mapped interface to the CPU.
We will then use this framework to further develop the priority mechanism.
This mechanism has two scientifically interesting features: separate prioritization for each core, and arbitration between different cores, for example based on access frequency. The arbitration may also be externally configurable (e.g., when it is known in advance that one core runs a particularly important application).
This allows switching between a self-adaptive and a user-configurable operation mode.
Optionally, additional pages can be extrapolated and included in the prioritization as well.
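One way to picture the arbitration between cores is the behavioral sketch below. It is only a model under stated assumptions: the class name `PreloadArbiter`, the use of a per-core access counter as the self-adaptive priority, and the override table for user-configured priorities are all hypothetical and do not describe the actual hardware.

```python
from collections import Counter


class PreloadArbiter:
    """Behavioral model: picks which core's preload request is served next."""

    def __init__(self):
        self.access_count = Counter()  # self-adaptive input: accesses per core
        self.user_priority = {}        # optional externally configured priorities

    def record_access(self, core_id):
        """Count one memory access, identified for example by its AXI ID."""
        self.access_count[core_id] += 1

    def set_user_priority(self, core_id, priority):
        """Switch a core to the user-configured mode (higher value wins)."""
        self.user_priority[core_id] = priority

    def clear_user_priority(self, core_id):
        """Return a core to the self-adaptive mode."""
        self.user_priority.pop(core_id, None)

    def grant(self, requesting_cores):
        """Return the core to serve, or None if no core is requesting."""
        if not requesting_cores:
            return None

        def key(core_id):
            # Assumption: user-configured cores outrank self-adaptive ones.
            # Ties are broken by access frequency, then by the lower core ID.
            return (
                core_id in self.user_priority,
                self.user_priority.get(core_id, 0),
                self.access_count[core_id],
                -core_id,
            )

        return max(requesting_cores, key=key)
```

In this model, a core with three recorded accesses wins over a core with one until the latter receives a user priority; clearing that priority returns the decision to the access counters.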
Prerequisites
- Strong experience with VHDL coding
- Basic knowledge of C programming
- Basic knowledge of MPSoCs, cache hierarchies, etc.
- B.Sc. in Electrical Engineering or similar
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Extended SystemC Model for Design Space Exploration of a Chiplet-Based System
Description
In the BCDC project, a working group at TUM collaborates on designing a RISC-V-based chiplet demonstration chip, of which at least two will be connected via an interposer to simulate a system of interconnected chiplets. At LIS, we work on a high-performance, low-latency chiplet interconnect with additional application-specific features managed by a smart protocol controller. It closes the gap between the underlying physical layer that takes care of data transmission across the interposer and the system bus that attaches the inter-chiplet interface to the other components of the demonstration chip.
In previous work, a high-level simulation of our system has been set up using SystemC Transaction-Level Modeling (TLM). The model represents chiplets, an additional FPGA, and their interconnect in a configurable manner. The user can record a wide range of statistics for the evaluation of different system configurations and design details. This follow-up Master’s thesis serves to extend the model to bring it closer to a real system and to enhance its functionality as a design exploration tool.
As a first step, the overall architecture of the chiplet model is to be reworked. A more realistic system bus modeled on AXI4, including bursts and congestion handling mechanisms, is to be implemented. Other parts of the TUM demonstration chip, like hardware accelerators with local memory, are to be added.
The capabilities of the interconnect protocol should be extended, e.g., by read/write operations to other memories besides the main chiplet RAM, DMA support, or application-specific extra features as provided by the smart protocol controller. This process will be based on a more formal specification of the interconnect protocol. The modeling of the interconnect itself should also involve further details specific to the investigated standards and custom solutions.
With a more complex chiplet model, more configuration options should be offered. An example could be the instantiation of a pure memory chiplet. The type of interconnect between individual chiplets should also be configurable beyond the currently supported parameters. All of these added possibilities will rely on an overhauled inter-chiplet routing mechanism.
Furthermore, the existing user code applications should be supplemented by ones utilizing the added hardware accelerators or benefiting from multi-core operation.
On the side of usability improvements, further statistics should be collected, automatically processed, and visualized to simplify the evaluation of the explored system configurations. Ultimately, the simulation should help identify the benefits and drawbacks of these configurations and support a future HDL implementation.
Prerequisites
- Understanding of chiplet architectures, especially their interconnect
- Experience with SystemC TLM
- Structured and independent way of working and strong problem-solving skills
Contact
michael.meidinger@tum.de
Supervisor:
Non-Intrusive Performance Monitoring Framework for a Memory Preloading Module
Description
A non-invasive performance monitoring module will be developed to evaluate memory prefetcher behavior at the last-level cache. It will observe prefetch operations without altering system functionality or timing, ensuring accurate measurement of workload characteristics. Configurable counters will capture metrics such as bandwidth usage, prefetch coverage, prefetch accuracy, and hit/miss ratios in real time. The module will be implemented on an FPGA together with the rest of the system, and an available communication interface will be used to send the collected data to an attached PC. On the host side, a dedicated application will receive the metric streams and update interactive dashboards that render live plots of performance trends. This visualization will support intuitive analysis and real-time insights during live demonstrations under realistic workload conditions and will help inform future architectural decisions. The module will be designed with future expansion in mind, leaving room for additional performance metrics and advanced analysis capabilities. By combining hardware-based data collection with a flexible host application, the project will deliver a robust tool for cache prefetcher evaluation.
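On the host side, the derived metrics could be computed from raw counter values roughly as sketched below. The counter names and the snapshot layout are hypothetical; the formulas follow commonly used definitions (accuracy as useful prefetches over issued prefetches, coverage as useful prefetches over useful prefetches plus remaining demand misses), which the project may define differently.

```python
from dataclasses import dataclass


@dataclass
class CounterSnapshot:
    """Hypothetical raw counter values read from the monitoring module."""
    prefetches_issued: int  # prefetch requests sent to memory
    prefetches_useful: int  # prefetched lines later hit by a demand access
    demand_hits: int        # demand accesses that hit in the last-level cache
    demand_misses: int      # demand accesses that still missed
    bytes_transferred: int  # bytes moved on the memory interface
    cycles: int             # length of the measurement window in clock cycles


def ratio(numerator, denominator):
    """Safe division: an empty window yields 0.0 instead of an exception."""
    return numerator / denominator if denominator else 0.0


def derive_metrics(s: CounterSnapshot, clock_hz: float) -> dict:
    """Turn one counter snapshot into the metrics shown on the dashboard."""
    return {
        "accuracy": ratio(s.prefetches_useful, s.prefetches_issued),
        "coverage": ratio(s.prefetches_useful,
                          s.prefetches_useful + s.demand_misses),
        "hit_ratio": ratio(s.demand_hits, s.demand_hits + s.demand_misses),
        "bandwidth_bytes_per_s": ratio(s.bytes_transferred * clock_hz, s.cycles),
    }
```

Keeping the division on the host means the FPGA only needs plain event counters, which fits the goal of not altering system timing.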
Prerequisites
- Strong experience with VHDL coding
- Basic knowledge of C programming
- Basic knowledge of MPSoCs, cache hierarchies, etc.
- B.Sc. in Electrical Engineering or similar
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Student
SmartNIC Hardware Extensions for Server State Tracking
Description
With the advent of research on the next generation of mobile communications (6G), we are engaged in exploring architecture extensions for Smart Network Interface Cards (SmartNICs). To enable adaptive, energy-efficient and low-latency network interfaces, we are prototyping a custom packet processing pipeline on FPGA-based NICs, partially based on the open-nic project (https://github.com/Xilinx/open-nic).
To improve the performance and energy efficiency of a modern server, SmartNICs can be used to preprocess incoming packets and gather characteristics on traffic and processing requirements. This information can be used to change the processing behavior of the server and react to the dynamic network and processing requirements. A decisive task here is the efficient and accurate tracking of key metrics that characterize the current state of the server, regarding both the incoming traffic and the computational load on the processing resources of the server.
The goal of this work is to implement monitoring logic for key metrics, such as packet arrival rate, pipeline utilization, and queue fill levels, as hardware extensions in the SmartNIC using HDL. A key research question is how accurately the CPU utilization can be tracked in the SmartNIC with minimal software-side intrusiveness. A useful tool for future-proof, software-defined network processing in the SmartNIC is the P4 framework. Therefore, a possible integration of the developed monitoring output into the P4 framework (as "externs") should be evaluated further. This should be accompanied by a P4 runtime in software that controls the hardware pipeline asynchronously over longer timespans.
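To illustrate what tracking a metric such as the packet arrival rate might involve, the sketch below models a windowed packet counter with exponential smoothing using integer arithmetic only. It is a behavioral model under assumptions, not the project's design: the class name, the window length, and the shift-based smoothing factor are hypothetical, chosen because such an update maps to a counter, an adder, and a shifter in HDL.

```python
class ArrivalRateTracker:
    """Integer-only model of a windowed packet counter with EWMA smoothing."""

    def __init__(self, window_cycles, shift=3):
        self.window_cycles = window_cycles  # window length in clock cycles
        self.shift = shift                  # smoothing factor alpha = 1 / 2**shift
        self.count = 0                      # packets seen in the current window
        self.cycle = 0                      # cycle position inside the window
        self.avg_scaled = 0                 # smoothed packets per window * 2**shift

    def tick(self, packet_valid):
        """Advance one clock cycle; packet_valid marks a packet arrival."""
        if packet_valid:
            self.count += 1
        self.cycle += 1
        if self.cycle == self.window_cycles:
            # avg += (count - avg) * alpha, kept in scaled integers so the
            # update needs no multiplier or divider.
            self.avg_scaled += self.count - (self.avg_scaled >> self.shift)
            self.count = 0
            self.cycle = 0

    def packets_per_window(self):
        """Smoothed number of packets per window, rounded down."""
        return self.avg_scaled >> self.shift
```

A larger `shift` smooths more strongly but reacts more slowly to traffic changes, which is one of the trade-offs such monitoring logic would need to evaluate.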
Prerequisites
- Programming skills in VHDL/Verilog, C and preferably P4 (and Python)
- Practical experience with FPGA design and implementation
- Good knowledge of computer architecture, low-level software, and the OSI network model
- Comfortable with the Linux command line and bash
Contact
Marco Liess, M. Sc.
Tel.: +49.89.289.23873
Email: marco.liess@tum.de