Lars Nolte, M.Sc.
Research
Information about my research area can be found on the PASSTA project page.
Offered Theses
Interested in a student research project or a thesis? Feel free to send me an e-mail.
The stated type of work serves as a guideline and can be adjusted by arrangement.
There are also regularly topics that have not been advertised yet. Just ask!
Ongoing Theses
Non intrusive hardware tracing over ethernet
Description
Tracing events in hardware components is a powerful way to monitor, debug, and improve existing designs. It provides the detailed insights needed to reach peak performance, but integrating tracing without degrading performance is challenging. One of the major challenges is to collect as much information as possible with ideally no impact on the system under analysis. Only then are the gained insights representative of an execution without tracing enabled.
In this work, a hardware tracing component should be designed that takes an arbitrary data input and sends it over an Ethernet connection to a separate PC, which post-processes the data. To minimize the impact on the traced system, the component must send the data over Ethernet without any CPU involvement. The tracing component should be integrated into a hardware platform based on a Xilinx Zynq board. The Zynq chip features a heterogeneous ARM multicore setup integrated directly into the ASIC, combined with programmable logic in the FPGA part of the chip. A hardware accelerator is already implemented in the FPGA and should be traced with the new component.
Prerequisites
To successfully complete this work, you should have:
- good HDL programming skills,
- experience with microcontroller programming,
- basic knowledge about Git,
- first experience with the Linux environment.
The student is expected to be highly motivated and independent.
Contact
Email: lars.nolte@tum.de
Supervisor:
Comparing DPDK with traditional Linux based networking
Description
With the ever-increasing network speeds of physical links, the processing of packets on network nodes is becoming more and more of a bottleneck. Packet processing on a standard Linux-based network node traditionally involves the operating system (OS). Since an OS is usually optimized for a range of tasks rather than a specific task, using conventional Linux kernel functionalities for packet processing can degrade performance. For this reason, approaches to bypass the kernel have been proposed to perform network processing in user space.
One kernel-bypass approach that has attracted growing interest in recent years is the Data Plane Development Kit (DPDK). By processing packets entirely in user space, DPDK avoids time-consuming context switches between user space and kernel space. This comes at the cost of a CPU core actively polling for new packets, instead of the network interface card (NIC) raising interrupts for incoming packets. In addition, DPDK itself mainly provides poll mode drivers for selected NICs; processing the packets is left to the application using DPDK. Thus, while DPDK suits certain application scenarios, numerous use cases are better served by the Linux networking stack. For example, to establish a Transmission Control Protocol (TCP) connection, a user space TCP/IP stack must be implemented or taken from an open-source project. Such stacks are generally not as feature-rich as the Linux networking stack and do not necessarily improve performance.
This work aims to find a method to compare applications using DPDK with applications using the Linux network stack. The envisioned setup is a client-server application that uses iperf3 to generate data traffic.
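One possible starting point for such a comparison, sketched here under assumptions, is to collect iperf3's JSON reports for both variants and compare the receiver-side throughput. The function names below are hypothetical. The sketch also leaves open how the DPDK-based side is attached to iperf3, which itself uses the kernel socket API; answering that is part of the thesis.

```python
import json
import subprocess


def run_iperf3_client(server: str, seconds: int = 10) -> dict:
    """Run an iperf3 TCP client against `server` and return its JSON report.

    Assumes an iperf3 server is already listening on `server`.
    """
    result = subprocess.run(
        ["iperf3", "-c", server, "-t", str(seconds), "-J"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def received_gbps(report: dict) -> float:
    """Extract the receiver-side throughput in Gbit/s from a TCP report."""
    return report["end"]["sum_received"]["bits_per_second"] / 1e9


def speedup(candidate: dict, baseline: dict) -> float:
    """Throughput of `candidate` relative to `baseline` (values > 1 are faster)."""
    return received_gbps(candidate) / received_gbps(baseline)
```

Given one report per variant, `speedup(dpdk_report, kernel_report)` yields a single ratio. A fair comparison would additionally need repeated runs and CPU utilization figures, since the polling core is part of DPDK's cost.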
Prerequisites
To successfully complete this work, you should have:
- very good programming skills in Python and C/C++,
- basic knowledge about Git,
- first experience with the Linux environment.
The student is expected to be highly motivated and independent.
Contact
Email: lars.nolte@tum.de
Supervisor:
Hardware Accelerating the Linux Epoll Mechanism
Description
A clear trend has emerged in computer architecture over the last few years: alongside the ever-increasing number of cores per system, dedicated hardware accelerators for specific tasks are becoming increasingly widespread. On the software side, multithreaded applications are gaining popularity as one way to maximize the utilization of the underlying hardware. In these systems, Intra-/Inter-Process Communication (IPC) is increasingly becoming a limiting factor. Besides the data transfer, which is primarily used in inter-process communication, the notification itself can be a time-consuming operation in IPC mechanisms.
This work focuses on the epoll mechanism in Linux. An epoll instance can be attached to different file descriptors to learn whether a certain event has occurred on them. If no events are available, the waiting thread blocks until one occurs, which means it has to be notified by the thread that triggers the event. Hence, epoll is purely a notification mechanism that can be combined with the different IPC mechanisms used to transfer data. This work should focus on eliminating or minimizing the cost of the notification by developing a hardware accelerator that relieves the threads of the notification work. The envisioned solution improves the performance of both the thread sending the notification and the thread waiting for events. The proof of concept should be accomplished by integrating the framework, in both software and hardware, into a heterogeneous architecture simulated in full-system simulation using Gem5.
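The split between data transfer and notification can be illustrated with a minimal sketch that uses Python's `select.epoll` wrapper (Linux only). A pipe is assumed here as the IPC mechanism that carries the data, and the function name is hypothetical. In C, the corresponding calls are `epoll_create1`, `epoll_ctl`, and `epoll_wait`.

```python
import os
import select
import threading


def wait_for_notification(timeout_s: float = 2.0) -> bytes:
    """Block in epoll until a producer thread writes to a pipe, then read it."""
    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    # Register interest in "readable" events on the read end of the pipe.
    ep.register(read_fd, select.EPOLLIN)

    def producer() -> None:
        # The write transfers the data; as a side effect the kernel
        # wakes up any thread waiting in epoll on the read end.
        os.write(write_fd, b"event")

    worker = threading.Thread(target=producer)
    worker.start()
    try:
        received = b""
        # poll() returns a list of (fd, event_mask) tuples.
        for fd, mask in ep.poll(timeout_s):
            if fd == read_fd and mask & select.EPOLLIN:
                received = os.read(fd, 16)
        return received
    finally:
        worker.join()
        ep.close()
        os.close(read_fd)
        os.close(write_fd)
```

The wake-up triggered by the write is the notification step that the thesis aims to offload to hardware.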
To achieve this, the existing mechanism in Linux has to be analyzed. For this purpose, microbenchmarks have to be developed that identify the time-consuming operations to be offloaded to hardware. This analysis should result in a concept that is integrated into the simulated system to demonstrate the achievable performance improvement. In addition to the development, an analysis of related work on similar approaches is part of this work.
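A microbenchmark of this kind could, as a rough sketch, timestamp the moment just before a write and the moment the epoll waiter returns. The version below is only an illustration with hypothetical names: it runs both sides as Python threads, so interpreter overhead dominates the numbers, and the waiter is not guaranteed to be blocked when the write happens. A real measurement would be written in C and would separate the sender and waiter costs.

```python
import os
import select
import threading
import time


def measure_wakeup_latencies(iterations: int = 100) -> list:
    """Return latencies in ns between a pipe write and the epoll wake-up."""
    read_fd, write_fd = os.pipe()
    ep = select.epoll()
    ep.register(read_fd, select.EPOLLIN)
    ready = threading.Event()  # set when the waiter is about to poll
    send_times = []            # timestamp taken just before each write

    def notifier() -> None:
        for _ in range(iterations):
            ready.wait()
            ready.clear()
            send_times.append(time.perf_counter_ns())
            os.write(write_fd, b"x")

    worker = threading.Thread(target=notifier, daemon=True)
    worker.start()
    latencies = []
    try:
        for _ in range(iterations):
            ready.set()
            if not ep.poll(5.0):
                raise TimeoutError("no notification within 5 s")
            woke_at = time.perf_counter_ns()
            os.read(read_fd, 1)  # drain the pipe for the next round
            latencies.append(woke_at - send_times[-1])
    finally:
        worker.join(timeout=1.0)
        ep.close()
        os.close(read_fd)
        os.close(write_fd)
    return latencies
```

Reporting the median and tail of the returned list, rather than the mean alone, helps to spot scheduling outliers.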
Prerequisites
To successfully complete this work, you should have:
- first experience with embedded programming,
- very good programming skills in Python and C/C++,
- basic knowledge about Git,
- first experience with the Linux environment.
The student is expected to be highly motivated and independent.
Contact
Email: lars.nolte@tum.de
Supervisor:
Setting up L4Re on Raspberry Pi
Description
In this thesis, you will lead the setup of L4Re (L4 Runtime Environment) on Raspberry Pi devices. This is an opportunity to work on cutting-edge technology, combining the flexibility of the L4 microkernel architecture with the capabilities of the Raspberry Pi to build secure and modular operating systems for embedded systems.
You will delve into the fascinating world of microkernel architecture. Unlike traditional monolithic kernels, where most functionality resides within the kernel, microkernels follow a minimalist approach. The microkernel serves as a lightweight core, providing only essential services such as interprocess communication (IPC) and memory management. Additional operating system services and functionalities, including device drivers and file systems, are implemented as separate user-level processes or servers that run outside the microkernel.
Responsibilities:
- Set up and configure the development environment for L4Re on Raspberry Pi, including toolchain and cross-compilation environment.
- Obtain the L4Re source code by downloading from the official repository or using a specific release.
- Build the L4Re system for Raspberry Pi, ensuring successful compilation and linking of components.
- Deploy the compiled L4Re system onto Raspberry Pi boards, including bootloader setup and configuration.
- Test and verify the functionality and performance of the L4Re system on Raspberry Pi, conducting both functional and security evaluations.
- Develop custom components and applications on top of L4Re, integrating them into the system and extending its functionality.
- Document the setup process, configurations, and customizations, ensuring clear and concise documentation for future reference.
Prerequisites
- Solid understanding of embedded systems development and experience with low-level programming.
- Proficiency in C/C++ programming languages and familiarity with cross-compilation environments.
- Strong knowledge of operating system concepts and microkernel architectures, preferably with experience working with L4-based systems.
- Experience with Raspberry Pi development and configuration, including bootloader setup and deployment.
- Familiarity with embedded Linux, device drivers, and system-level software development.
- Excellent problem-solving skills and the ability to troubleshoot complex issues.
Supervisor:
Completed Theses
2023
Bachelor's Theses
- 30.10.2023: Nichtinvasives integriertes Event-Tracing von FPGA über Ethernet. Supervisor: Lars Nolte
- 08.09.2023: Non intrusive hardware tracing over ethernet. Supervisor: Lars Nolte
- 20.06.2023: Optimization of Hardware Assisted Futex Implementation on Zynq Ultrascale+. Supervisor: Lars Nolte
- 20.03.2023, Maximilian Grözinger: Digital Design and Validation of Hardware Assisted Futex - Implementation on Zynq Ultrascale+. Supervisor: Lars Nolte
Master's Theses
- 22.05.2023: Hardware-assisting the User-Epoll mechanism in Linux. Supervisor: Lars Nolte
- 30.03.2023: Optimizing high-speed network packet processing in Linux. Supervisor: Lars Nolte
Research Internships (Forschungspraxis)
- 15.01.2023: Implementation of a Finite State Machine for Hardware Managed Futexes on Zynq Ultrascale+. Supervisor: Lars Nolte
2022
Bachelor's Theses
- 09.03.2022: Reduction of the Simulation Time of the Gem5 Simulator. Supervisor: Lars Nolte
Master's Theses
- 13.12.2022: Digital Design and Validation of a Futex Hardware Accelerator – Emulation on Zynq Ultrascale+. Supervisor: Lars Nolte
Research Internships (Forschungspraxis)
- 30.09.2022: Cache Coherent Hardware Accelerator Integration into an ARM Multicore Platform with a FPGA extension. Supervisor: Lars Nolte
- 12.06.2022: Performance Improvement Evaluation of Hardware Accelerated Linux Thread Wake-ups. Supervisor: Lars Nolte
Seminars
- 20.07.2022: [MSEI] A survey on asynchronous event notification mechanisms in Linux systems. Supervisor: Lars Nolte
- 28.01.2022: Survey on Linux Scheduler and Options to tweak an Application’s Performance. Supervisor: Lars Nolte
Student Assistants
- 31.07.2022: Hardware Accelerator Integration into an ARM Multicore Platform with a FPGA extension. Supervisor: Lars Nolte
2021
Bachelor's Theses
- 01.12.2021: Integration of Performance Counter into a simulation model of a hardware accelerator in Gem5. Supervisor: Lars Nolte
- 13.09.2021: Setup of an ARM Multicore Platform with a FPGA extension using a Xilinx Zynq Board and a Linux OS. Supervisor: Lars Nolte
- 06.07.2021: Low-intrusive Software Tracing and Profiling using a Gem5 Simulator. Supervisor: Lars Nolte
- 06.07.2021: Low-intrusive Software Tracing and Profiling using a Gem5 Simulator. Supervisor: Lars Nolte
Master's Theses
- 13.12.2021: Development of a Generic Framework for Linux Task Offloading to Hardware on a Multicore Architecture. Supervisor: Lars Nolte
- 13.12.2021: Development of a Generic Framework for Linux Task Offloading to Hardware on a Multicore Architecture. Supervisor: Lars Nolte
Research Internships (Forschungspraxis)
- 12.05.2021: Continuous Integration set up for a Gem5 Simulator project. Supervisor: Lars Nolte
Seminars
- 02.02.2021: Application Profiling Tools in Linux. Supervisor: Lars Nolte
- 02.02.2021, Jonas Kirf: A Survey on Inter-Process Communication. Supervisor: Lars Nolte
Publications
2023
- HAWEN: Hardware Accelerator for Thread Wake-Ups in Linux Event Notification. 2023 60th ACM/IEEE Design Automation Conference (DAC), 2023
- HW-FUTEX: Hardware-Assisted Futex Syscall. IEEE Transactions on Very Large Scale Integration Systems, 2023
2020
- X-CEL: A Method to Estimate Near-Memory Acceleration Potential in Tile-based MPSoCs. ARCS 2020 - 33rd International Conference on Architecture of Computing Systems, 2020
- DySHARQ: Dynamic Software-Defined Hardware-Managed Queues for Tile-Based Architectures. International Journal of Parallel Programming, 2020