Lars Nolte, M.Sc.

Research

Information about my research area can be found on the PASSTA project page.

Thesis Offers

Interested in an internship or a thesis? Please send me an email.
The listed type of work is only a guideline and can be changed if needed.
From time to time, there may be topics that are not announced yet. Feel free to ask!

Ongoing Theses

Comparison of existing Inter-Process Communication mechanisms in Linux

Description

This thesis entails a comprehensive study and comparison of the various Inter-Process Communication (IPC) mechanisms provided by the Linux operating system. IPC is a fundamental concept in operating systems, enabling processes to communicate, synchronize, and share data. This is especially crucial in multi-process and distributed computing environments, where seamless data exchange and coordination are key to system efficiency. The project begins by exploring several widely used IPC mechanisms in Linux, such as futexes, epoll, semaphores, sockets, and io_uring. Each of these mechanisms serves distinct purposes and use cases, and this study will investigate their specific roles, functionality, and how they can be employed in various application contexts.

A key part of the work will involve evaluating these mechanisms based on several critical factors:

  • Performance:  Detailed analysis of each mechanism's speed, efficiency, and resource consumption, focusing on latency and throughput under different conditions.
  • Ease of Implementation:  A review of how straightforward or complex it is to implement these mechanisms, considering factors such as the required setup, programming effort, and maintainability.
  • Use-Case Suitability:  Examining the appropriateness of each IPC mechanism for specific scenarios.

The practical component of the project will include developing sample applications that utilize each IPC mechanism. This will be followed by benchmarking tests to measure performance in real-world-like conditions. The project will offer concrete insights into the trade-offs between different mechanisms by executing these tests. The final deliverable will provide a thorough comparative analysis, synthesizing the results of the experiments and assessments. Based on this analysis, recommendations will be made regarding the most appropriate IPC mechanisms for specific Linux-based applications, such as server-client architectures, parallel computing environments, or embedded systems. This project aims to serve as a valuable reference for developers, system administrators, and architects who need to make informed choices about IPC in their systems.
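A benchmarking test of the kind described above could start from a simple ping-pong measurement. The following is a minimal sketch, not the thesis methodology: it assumes a Linux host, compares only pipes and a Unix socket pair, and all function names (`round_trip_times`, `bench_pipe`, `bench_socketpair`) are hypothetical.

```python
import os
import socket
import statistics
import time


def round_trip_times(parent_fds, child_fds, rounds=1000):
    """Return per-round latencies (seconds) of a 1-byte ping-pong.

    parent_fds and child_fds are (read_fd, write_fd) tuples.
    """
    parent_r, parent_w = parent_fds
    child_r, child_w = child_fds
    pid = os.fork()
    if pid == 0:
        # Child process: echo every byte back, then exit immediately.
        try:
            for _ in range(rounds):
                os.write(child_w, os.read(child_r, 1))
        finally:
            os._exit(0)
    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        os.write(parent_w, b"x")   # ping
        os.read(parent_r, 1)       # wait for pong
        samples.append(time.perf_counter() - start)
    os.waitpid(pid, 0)
    return samples


def bench_pipe(rounds=1000):
    """Ping-pong over two unidirectional pipes."""
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    try:
        return round_trip_times((to_parent_r, to_child_w),
                                (to_child_r, to_parent_w), rounds)
    finally:
        for fd in (to_child_r, to_child_w, to_parent_r, to_parent_w):
            os.close(fd)


def bench_socketpair(rounds=1000):
    """Ping-pong over one bidirectional Unix socket pair."""
    a, b = socket.socketpair()
    with a, b:
        return round_trip_times((a.fileno(), a.fileno()),
                                (b.fileno(), b.fileno()), rounds)


if __name__ == "__main__":
    for name, bench in (("pipe", bench_pipe), ("socketpair", bench_socketpair)):
        print(name, "median round trip [s]:", statistics.median(bench()))
```

Reporting the median rather than the mean keeps a few scheduling outliers from dominating the result. A real comparison would also pin processes to cores, vary the message size, and cover the other mechanisms named above.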

Contact

lars.nolte@tum.de

Supervisor:

Lars Nolte

Implementation of an FPGA-based Intersatellite Network Switch for High-Speed Traffic

Description

The demand for high-speed communication links has significantly increased in recent years. Additionally, satellite telecom constellations can extend connectivity to the most remote areas and assist in handling higher payloads associated with emerging technologies like 6G. Each satellite node within these constellations requires routing capabilities to manage such data traffic. In this context, MPLS (Multi-Protocol Label Switching) and high-speed switching hardware are crucial for supporting this growth. MPLS enhances network performance by directing data based on short path labels instead of long network addresses, and advanced hardware enables the efficient handling of this data traffic.
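To illustrate forwarding on short labels, the sketch below parses a 32-bit MPLS label stack entry (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL) and performs a label swap. It is only an illustration in software, not the Switch IP: the table `LFIB`, its entries, the port names, and the function names are assumptions made for this example.

```python
import struct


def unpack_entry(raw):
    """Decode one 4-byte MPLS label stack entry (network byte order)."""
    (word,) = struct.unpack("!I", raw)
    return {
        "label": word >> 12,        # 20-bit label
        "tc": (word >> 9) & 0x7,    # 3-bit traffic class
        "s": (word >> 8) & 0x1,     # bottom-of-stack flag
        "ttl": word & 0xFF,         # 8-bit time to live
    }


def pack_entry(label, tc, s, ttl):
    """Encode one 4-byte MPLS label stack entry."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


# Hypothetical forwarding table: incoming label -> (outgoing label, output port)
LFIB = {
    100: (200, "port1"),
    101: (300, "port2"),
}


def switch(raw):
    """Swap the top label; return (port, new_entry), or None if dropped."""
    entry = unpack_entry(raw)
    if entry["label"] not in LFIB or entry["ttl"] <= 1:
        return None  # unknown label or expired TTL: drop
    out_label, port = LFIB[entry["label"]]
    return port, pack_entry(out_label, entry["tc"], entry["s"], entry["ttl"] - 1)
```

For example, `switch(pack_entry(100, 0, 1, 64))` returns `"port1"` together with an entry carrying label 200 and TTL 63. The forwarding decision is a single exact-match lookup on a 20-bit key, which is what makes label switching attractive for hardware.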

The goal of this work is to validate a high-speed (~600 Gb/s) switch that uses MPLS technology. The system under evaluation is composed of a Multi-Rate MAC, a traffic generator, an MPLS Switch IP, and a PetaLinux build to manage MPLS traffic. The Multi-Rate MAC, traffic generator, and MPLS Switch are implemented on the FPGA embedded in the Versal ACAP, while PetaLinux operates on one of its ARM cores. The project employs Vivado, Vitis, and XSDB (Xilinx System Debugger) as the primary software tools, and the Versal ACAP and a DVB-S2X/RCS2 Native Modem as the hardware to integrate and implement the different parts and functionalities of the project.

Specific Tasks

  • Log project telemetry data to the PetaLinux filesystem
  • Implement a script that runs on the Versal to trace the system
  • Show metrics in a GUI using HTML or a Python library (e.g. Tkinter)
  • Define a meaningful test case for the system
  • Run the defined tests, adapt the system if necessary and debug

 

Prerequisites

  • Proficient in VHDL/Verilog, C, and Python programming languages
  • Strong understanding of computer networks, including the OSI model and various protocols (with a focus on MPLS and IP)
  • Comfortable with using the Linux command line and writing bash scripts
  • Practical experience in FPGA and ACAP design and implementation

Supervisor:

Lars Nolte - Michael Hanh (Airbus)

Hardware-Assisted Event Notification for NIC-Generated Events

Description

Over the last few years, a clear trend has emerged in computer architecture. Alongside the ever-increasing number of cores in one system, dedicated hardware accelerators for specific tasks are becoming increasingly widespread. On the software side, multithreaded applications are gaining popularity as one approach to maximize the utilization of the underlying hardware architectures. As a result, Intra-/Inter-Process Communication (IPC) is increasingly becoming a limiting factor in these systems. Besides the data transfer, which is primarily used in Inter-Process Communication, the notification can be a time-consuming operation in IPC mechanisms.

The notification in an IPC mechanism signals that some kind of event has occurred. In general, the event source can be either in software (user space or kernel space) or in hardware. However, in current systems, it is not possible to wait directly on events that occur in hardware with traditional notification mechanisms such as epoll; instead, a helper software construct has to be used. For instance, if an application wants to be notified about an Ethernet packet being received by a Network Interface Card (NIC), the application waits on a socket. This socket is a kernel construct that is filled by the NIC device driver after an IRQ is received, which in turn results in a notification of the application because the socket has new data.
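The software path described above can be sketched from the application's point of view. This is a minimal illustration, assuming a Linux host: a UDP packet sent over the loopback interface stands in for a packet arriving at a NIC, and the function name `wait_for_packet` is hypothetical.

```python
import select
import socket


def wait_for_packet(timeout=1.0):
    """Wait on a socket via epoll and return the received payload, or None."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    ep = select.epoll()
    try:
        rx.bind(("127.0.0.1", 0))          # let the kernel pick a free port
        ep.register(rx.fileno(), select.EPOLLIN)

        # Stand-in for a packet arriving from the network.
        tx.sendto(b"ping", rx.getsockname())

        # The application never sees the hardware event itself. It is woken
        # only after the kernel has queued the data on the socket.
        for fd, mask in ep.poll(timeout):
            if fd == rx.fileno() and mask & select.EPOLLIN:
                return rx.recv(2048)
        return None
    finally:
        ep.close()
        rx.close()
        tx.close()


if __name__ == "__main__":
    print(wait_for_packet())
```

The socket here plays the role of the helper construct: epoll watches the file descriptor of the socket, not the device that produced the data.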

This work focuses on extending the capabilities of the epoll mechanism present in Linux to also support being notified of events that occur in hardware devices. Epoll can be attached to different file descriptors to be informed of whether a certain event occurred. In case no events are available, the thread waits until events occur, which implies being notified by the thread that performs this event. As an example, this work will focus on a hardware event source, namely an Ethernet packet being received by a NIC. The proof of concept should be implemented on a development platform based on the ZCU102 equipped with a 10G Ethernet connection routed through the FPGA part of the Zynq MPSoC.
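The blocking behaviour of epoll described above can be shown with two threads. The sketch below assumes a Linux host and uses a pipe as the event source purely for illustration. It shows the existing software-only mechanism, not the hardware extension proposed here, and the function names are hypothetical.

```python
import os
import select
import threading


def wait_for_event(read_fd, result):
    """Block in epoll until read_fd becomes readable, then record the events."""
    ep = select.epoll()
    try:
        ep.register(read_fd, select.EPOLLIN)
        result.extend(ep.poll())  # no timeout: sleeps until an event occurs
    finally:
        ep.close()


def demo():
    """Start a waiting thread and wake it from the calling thread."""
    read_fd, write_fd = os.pipe()
    result = []
    waiter = threading.Thread(target=wait_for_event, args=(read_fd, result))
    waiter.start()
    try:
        # The notifying thread performs the event; the kernel wakes the waiter.
        os.write(write_fd, b"\x01")
        waiter.join(timeout=5)
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return read_fd, result


if __name__ == "__main__":
    fd, events = demo()
    print("woken by fd", fd, "with events", events)
```

Because epoll is level-triggered by default, the waiter is woken even if the write happens before it has registered the descriptor.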

Prerequisites

To successfully complete this work, you should have:

  • first experience with embedded programming,
  • very good programming skills in SystemVerilog,
  • basic knowledge about Git,
  • first experience with the Linux environment.

The student is expected to be highly motivated and independent.

Supervisor:

Lars Nolte

Hardware-Accelerated Linux Kernel Tracing

Description

Tracing events with hardware components is a powerful tool to monitor, debug, and improve existing designs. This approach yields detailed insights that help achieve peak performance, but integrating it with good performance is a challenging task. One of the major challenges of tracing is to collect as much information as possible with ideally no impact on the system under analysis. This ensures that the gained insights are representative of an execution without any tracing enabled. In this work, a hardware tracing component should be leveraged to reduce the intrusiveness of existing software tracing mechanisms in the Linux kernel.

This should be integrated and tested on a hardware platform based on a Xilinx Zynq board. The platform features a heterogeneous ARM multicore setup directly integrated into the ASIC, combined with programmable logic in the FPGA part of the chip. A hardware accelerator is already implemented in the FPGA and should be traced with the new component.

Prerequisites

To successfully complete this work, you should have:

  • experience with microcontroller programming,
  • basic knowledge about Git,
  • first experience with the Linux environment.

The student is expected to be highly motivated and independent.

Supervisor:

Lars Nolte

Completed Theses

2024

Bachelor's Theses

  • 24.01.2024
    Interprocess Communication: Signal events in user space with ueventfd and upipe
    Supervisor: Lars Nolte

Research Internships (Forschungspraxis)

  • 15.02.2024
    Multithreaded UDP Server and Parser for Tracing Data in Rust
    Supervisor: Lars Nolte

Interdisciplinary Projects

  • 17.07.2024
    Development of a web application to control a hardware demonstration platform
    Supervisor: Lars Nolte

2023

Bachelor's Theses

  • 30.10.2023
    Non-invasive integrated event tracing of FPGA via Ethernet
    Supervisor: Lars Nolte
  • 08.09.2023
    Non intrusive hardware tracing over ethernet
    Supervisor: Lars Nolte
  • 20.06.2023
    Optimization of Hardware Assisted Futex Implementation on Zynq Ultrascale+
    Supervisor: Lars Nolte
  • 20.03.2023 Maximilian Grözinger
    Digital Design and Validation of Hardware Assisted Futex - Implementation on Zynq Ultrascale+
    Supervisor: Lars Nolte

Master's Theses

  • 22.05.2023
    Hardware-assisting the User-Epoll mechanism in Linux
    Supervisor: Lars Nolte
  • 30.03.2023
    Optimizing high-speed network packet processing in Linux
    Supervisor: Lars Nolte

Research Internships (Forschungspraxis)

  • 20.12.2023
    Setting up L4Re on a Raspberry Pi
    Supervisor: Lars Nolte
  • 15.01.2023
    Implementation of a Finite State Machine for Hardware Managed Futexes on Zynq Ultrascale+
    Supervisor: Lars Nolte

2022

Bachelor's Theses

  • 09.03.2022
    Reduction of the Simulation Time of the Gem5 Simulator
    Supervisor: Lars Nolte

Master's Theses

  • 13.12.2022
    Digital Design and Validation of a Futex Hardware Accelerator – Emulation on Zynq Ultrascale+
    Supervisor: Lars Nolte

Research Internships (Forschungspraxis)

  • 30.09.2022
    Cache Coherent Hardware Accelerator Integration into an ARM Multicore Platform with a FPGA extension
    Supervisor: Lars Nolte
  • 12.06.2022
    Performance Improvement Evaluation of Hardware Accelerated Linux Thread Wake-ups
    Supervisor: Lars Nolte

Seminars

  • 20.07.2022
    [MSEI] A survey on asynchronous event notification mechanisms in Linux systems.
    Supervisor: Lars Nolte
  • 28.01.2022
    Survey on Linux Scheduler and Options to tweak an Application’s Performance
    Supervisor: Lars Nolte

Student Assistant Jobs

  • 31.07.2022
    Hardware Accelerator Integration into an ARM Multicore Platform with a FPGA extension
    Supervisor: Lars Nolte

2021

Bachelor's Theses

  • 01.12.2021
    Integration of Performance Counter into a simulation model of a hardware accelerator in Gem5.
    Supervisor: Lars Nolte
  • 13.09.2021
    Setup of an ARM Multicore Platform with a FPGA extension using a Xilinx Zynq Board and a Linux OS.
    Supervisor: Lars Nolte
  • 06.07.2021
    Low-intrusive Software Tracing and Profiling using a Gem5 Simulator
    Supervisor: Lars Nolte

Master's Theses

  • 13.12.2021
    Development of a Generic Framework for Linux Task Offloading to Hardware on a Multicore Architecture.
    Supervisor: Lars Nolte

Research Internships (Forschungspraxis)

  • 12.05.2021
    Continuous Integration set up for a Gem5 Simulator project
    Supervisor: Lars Nolte

Seminars

  • 02.02.2021
    Application Profiling Tools in Linux
    Supervisor: Lars Nolte
  • 02.02.2021 Jonas Kirf
    A Survey on Inter-Process Communication
    Supervisor: Lars Nolte

Publications

2024

  • Lars Nolte, Tim Twardzik, Camille Jalier, Jiyuan Shi, Thomas Wild, Andreas Herkersdorf: HW-EPOLL: Hardware-Assisted User Space Event Notification for Epoll Syscall. International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation 2024, 2024
  • Lars Nolte, Tim Twardzik, Camille Jalier, Jiyuan Shi, Thomas Wild, Andreas Herkersdorf: POSTER: Hardware Assist for Linux IPC on an FPGA Platform. 21st ACM International Conference on Computing Frontiers, 2024
  • Tim Twardzik, Lars Nolte, Camille Jalier, Jiyuan Shi, Thomas Wild, Andreas Herkersdorf: HASIIL: Hardware-Assisted Scheduling to Improve IPC Latency in Linux. 21st ACM International Conference on Computing Frontiers, 2024

2023

  • Lars Nolte, Tim Twardzik, Camille Jalier, Zhigang Huang, Jiyuan Shi, Clara Kowalsky, Thomas Wild, Andreas Herkersdorf: HAWEN: Hardware Accelerator for Thread Wake-Ups in Linux Event Notification. 2023 60th ACM/IEEE Design Automation Conference (DAC), 2023
  • Lars Nolte, Tim Twardzik, Camille Jalier, Zhigang Huang, Jiyuan Shi, Thomas Wild, Andreas Herkersdorf: HW-FUTEX: Hardware-Assisted Futex Syscall. IEEE Transactions on Very Large Scale Integration Systems, 2023

2022

  • Lars Nolte, Tim Twardzik, Camille Jalier, Zhigang Huang, Jiyuan Shi, Thomas Wild, Andreas Herkersdorf: GLS Tracing: Gem5-based Low-intrusive Software Tracing. 2022 IEEE Nordic Circuits and Systems Conference (NorCAS), 2022

2021

  • Sven Rheindt, Akshay Srivatsa, Oliver Lenke, Lars Nolte, Thomas Wild, Andreas Herkersdorf: Tackling the MPSoC Data Locality Challenge – Part 2 / Chapter 5. In: Multi-Processor System-on-Chip 1. Wiley Online Library, 2021, 87-114

2020

  • Sven Rheindt, Andreas Fried, Oliver Lenke, Lars Nolte, Temur Sabirov, Tim Twardzik, Thomas Wild, Andreas Herkersdorf: X-CEL: A Method to Estimate Near-Memory Acceleration Potential in Tile-based MPSoCs. ARCS 2020 - 33rd International Conference on Architecture of Computing Systems, 2020
  • Sven Rheindt, Sebastian Maier, Nora Pohle, Lars Nolte, Oliver Lenke, Florian Schmaus, Thomas Wild, Wolfgang Schröder-Preikschat, Andreas Herkersdorf: DySHARQ: Dynamic Software-Defined Hardware-Managed Queues for Tile-Based Architectures. International Journal of Parallel Programming, 2020

2019

  • Sven Rheindt, Andreas Fried, Oliver Lenke, Lars Nolte, Thomas Wild, Andreas Herkersdorf: NEMESYS: Near-Memory Graph Copy Enhanced System-Software. MEMSYS 19: The International Symposium on Memory Systems, 2019