Seminars
Weight Clustering and Sparsity Acceleration Techniques for Efficient Deep Learning Inference on Low-Power Processors
ML, Machine learning, Sparsity Acceleration, Weight Clustering
Description
This survey explores algorithmic and architectural techniques for improving the efficiency of deep neural network (DNN) inference on resource-constrained processors such as embedded CPUs and microcontrollers. Students will survey recent research on weight clustering (quantization-aware weight grouping to reduce compute and memory costs) and sparsity acceleration (exploiting zero-valued weights or activations to skip computations). The survey paper may cover software-level optimizations, compiler-assisted techniques, and hardware extensions, with optional emphasis on RISC-V CPUs or other low-power architectures; both techniques are illustrated in the sketch after the references below.
Sample References:
- https://ieeexplore.ieee.org/abstract/document/10546844
- https://ieeexplore.ieee.org/document/11113397
- https://ieeexplore.ieee.org/ielx7/43/9716240/09380321.pdf?tp=&arnumber=9380321&isnumber=9716240&ref=aHR0cHM6Ly9zY2hvbGFyLmdvb2dsZS5jb20v
- https://upcommons.upc.edu/server/api/core/bitstreams/bcfd5ee6-208e-42ba-bf98-e040809f4443/content
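To make the two techniques concrete, here is a minimal, self-contained sketch in plain NumPy: the weights of one layer are grouped into a small codebook via 1-D k-means (weight clustering), and a dot product skips zero-valued weights (sparsity exploitation). The layer shapes, cluster count k, and pruning threshold are illustrative choices, not taken from the referenced papers.

```python
import numpy as np

def cluster_weights(w, k=16, iters=20):
    """1-D k-means over the nonzero weights. Returns a small codebook plus
    per-weight indices, so each weight needs only log2(k) bits of storage.
    Codebook slot 0 is reserved for exact zeros so pruning survives clustering."""
    flat = w.ravel()
    nz = flat[flat != 0]
    centroids = np.linspace(nz.min(), nz.max(), k - 1)   # simple init
    for _ in range(iters):
        assign = np.abs(nz[:, None] - centroids[None, :]).argmin(axis=1)
        for c in range(k - 1):
            if np.any(assign == c):
                centroids[c] = nz[assign == c].mean()
    codebook = np.concatenate(([0.0], centroids))
    idx = np.where(flat == 0, 0,
                   1 + np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1))
    return codebook, idx.reshape(w.shape)

def sparse_matvec(w, x):
    """Dot product that skips zero weights: the software analogue of
    hardware zero-skipping."""
    out = np.zeros(w.shape[0])
    rows, cols = np.nonzero(w)
    for r, c in zip(rows, cols):
        out[r] += w[r, c] * x[c]
    return out

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
w[np.abs(w) < 0.5] = 0.0                 # pretend the layer was magnitude-pruned
codebook, idx = cluster_weights(w)
w_q = codebook[idx]                      # dequantized, clustered weights
x = rng.normal(size=64)
print(np.allclose(sparse_matvec(w_q, x), w_q @ x))   # True
```

On real low-power hardware, the codebook/index representation and the zero-skipping loop would be realized by dedicated decoders, compiler-generated kernels, or ISA extensions rather than Python loops.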
Contact
Philipp van Kempen
Multi-DNN Scheduling and Mapping
DNN, scheduling, mapping
Description
A significant portion of DNN inference has shifted from cloud to edge execution, driven by data-privacy concerns and the wish to avoid constant connectivity to the cloud. This shift brings its own challenges, since edge devices are resource-constrained. Good performance can still be achieved by using multiple distributed devices, but this necessitates novel scheduling and mapping approaches that coordinate the execution of tasks across devices. Within the scope of this topic, the student will familiarize themselves with the problem and learn about various mathematical formulations designed to optimally synchronize the execution of DNNs across multiple devices; a toy formulation is sketched below.
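As a flavour of such formulations, here is a hedged toy example using the PuLP ILP package: tasks are mapped to devices so that the makespan is minimized. The task set, device count, and latency table are invented for illustration; real formulations additionally model precedence edges between layers and communication costs.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, PULP_CBC_CMD

tasks, devices = range(3), range(2)
lat = [[4, 6],          # lat[t][d]: runtime of task t on device d (ms)
       [3, 2],
       [5, 5]]

prob = LpProblem("multi_dnn_mapping", LpMinimize)
x = {(t, d): LpVariable(f"x_{t}_{d}", cat=LpBinary) for t in tasks for d in devices}
makespan = LpVariable("makespan", lowBound=0)
prob += makespan                                        # objective: minimize makespan
for t in tasks:                                         # every task mapped exactly once
    prob += lpSum(x[t, d] for d in devices) == 1
for d in devices:                                       # device load bounds the makespan
    prob += lpSum(lat[t][d] * x[t, d] for t in tasks) <= makespan

prob.solve(PULP_CBC_CMD(msg=False))
for (t, d), v in x.items():
    if v.value() > 0.5:
        print(f"task {t} -> device {d}")
print("makespan:", makespan.value())
```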
Prerequisites
- Constrained Optimization
- Interest in scheduling and mapping
- Integer Linear Programming knowledge is beneficial
Contact
Leonidas Kontopoulos, M.Sc.
leonidas.kontopoulos@tum.de
Physical/Analog Foundation Models
Description
Physical neural networks (PNNs) are a class of neural-like networks that use analogue physical systems to perform computation. Although currently confined to small-scale laboratory demonstrations, PNNs could one day transform how artificial intelligence (AI) calculations are performed. Could we train AI models many orders of magnitude larger than present ones? Could we perform model inference locally and privately on edge devices? Research over the past few years suggests that the answer to these questions is probably "yes, with enough research". Because PNNs can exploit analogue physical computation more directly, flexibly, and opportunistically than traditional computing hardware, they could change what is possible and practical for AI systems. Achieving this, however, will require notable progress, rethinking both how AI models work and how they are trained, primarily by considering these problems through the constraints of the underlying hardware physics.

To train PNNs, both backpropagation-based and backpropagation-free approaches are being explored. These methods come with various trade-offs, and so far no method has been shown to scale to large models with the same performance as the backpropagation algorithm widely used in deep learning today. The field is evolving rapidly, however, and a diverse ecosystem of training techniques provides clues for how PNNs may one day enable both more efficient and larger-scale realizations of present-day AI models.
Based on: Momeni, Ali, et al. "Training of physical neural networks." Nature 645.8079 (2025): 53-61.
Also see: Büchel, Julian, et al. "Analog Foundation Models." arXiv preprint arXiv:2505.09663 (2025).
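As a minimal illustration of one backpropagation-free direction mentioned above, the sketch below trains a stand-in "physical system" with perturbation-based (SPSA-style) gradient estimates that treat the forward pass as a black box. The toy system, noise level, and hyperparameters are assumptions for illustration; on real hardware the function would be replaced by physical measurements.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 4, 2
x = rng.normal(size=n_in)
target = np.array([0.5, -0.3])

def physical_system(theta, x):
    """Stand-in for an analogue forward pass: a nonlinear map with readout
    noise, as a real physical substrate would have."""
    W = theta.reshape(n_out, n_in)
    return np.tanh(W @ x) + 1e-3 * rng.normal(size=n_out)

def loss(theta):
    return float(np.mean((physical_system(theta, x) - target) ** 2))

theta = 0.1 * rng.normal(size=n_in * n_out)
c, lr = 0.01, 0.2                        # perturbation size and learning rate
for step in range(500):
    delta = rng.choice([-1.0, 1.0], size=theta.shape)          # random sign vector
    g = (loss(theta + c * delta) - loss(theta - c * delta)) / (2 * c) * delta
    theta -= lr * g                      # descend along the estimated gradient
print("final loss:", loss(theta))
```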
Contact
ch.wolters@tum.de
Equality Saturation for Tensor Graph Optimizations
Equality Saturation, Graph Optimization, Intermediate Representation
Description
Low latency and high accuracy are of great importance for deep neural network inference. To achieve this, ML compilers can apply transformations to the network's graph. Traditionally, such transformations are applied sequentially, which introduces the phase-ordering problem: a transformation may yield a better result if it is applied at a later stage. Equality saturation tackles this issue in two phases: it first builds an intermediate representation of the network in which many equivalent, rewritten versions are stored side by side, and it then extracts the best version according to a cost model. For this topic, the student will familiarize themselves with equality saturation and with techniques that lead to optimized neural network compilation.
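The following toy sketch illustrates the two phases on tuple-encoded arithmetic expressions: a saturation phase that keeps all rewritten variants instead of committing to one rewrite order, and an extraction phase that picks the cheapest variant under a cost model. A real engine such as egg uses an e-graph with sharing rather than an explicit set of terms; the rewrite rules and costs here are invented for illustration.

```python
def rewrites(e):
    """Yield all one-step rewrites of a tuple-encoded expression, applied at
    the root and recursively inside both operands."""
    if isinstance(e, tuple):
        op, a, b = e
        if op == '*' and b == 2:
            yield ('<<', a, 1)           # strength reduction: x * 2 -> x << 1
        if op == '*':
            yield (op, b, a)             # commutativity of *
        for a2 in rewrites(a):
            yield (op, a2, b)
        for b2 in rewrites(b):
            yield (op, a, b2)

def saturate(e):
    """Phase 1: grow the set of equivalent expressions until no rule fires."""
    seen, frontier = {e}, {e}
    while frontier:
        new = {r for t in frontier for r in rewrites(t)} - seen
        seen |= new
        frontier = new
    return seen

def cost(e):
    """Phase 2 cost model: a shift is much cheaper than a multiply."""
    if not isinstance(e, tuple):
        return 0
    op, a, b = e
    return {'*': 4, '+': 1, '<<': 1}[op] + cost(a) + cost(b)

expr = ('*', 2, ('+', 'x', 'x'))         # 2 * (x + x)
print(min(saturate(expr), key=cost))     # ('<<', ('+', 'x', 'x'), 1)
```

Because all variants are kept until extraction, the commuted and strength-reduced forms never have to be ordered against each other, which is exactly what sidesteps the phase-ordering problem.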
Prerequisites
- Interest in ML Compilers
- Good understanding of ML architectures
- Very good math skills
Contact
Leonidas Kontopoulos, M.Sc.
leonidas.kontopoulos@tum.de
Modeling and Simulation of Silicon Photonics Systems in SystemVerilog/XMODEL
Description
Silicon photonics integrates photonic and electronic components on the same silicon chip and promises ultra-dense, high-bandwidth interconnects via wavelength-division multiplexing (WDM). However, when verifying such silicon photonic systems, existing IC simulators face challenges, because WDM signals contain multiple frequency tones around 200 THz with roughly 50 GHz spacing. In this seminar, the student will investigate how to model silicon photonic elements and devices as equivalent multi-port transmission lines using XMODEL primitives, and how to simulate the resulting WDM link models in an efficient, event-driven fashion in SystemVerilog.
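To illustrate numerically why such multi-tone signals are demanding, the sketch below evaluates the textbook all-pass ring-resonator through-port response at 16 WDM channels on a 50-GHz grid near 193.5 THz. The device parameters are illustrative and XMODEL itself is not used here; the point is that every channel sees a different transfer value that a simulator must track simultaneously.

```python
import numpy as np

c0 = 3e8                        # speed of light (m/s)
n_g, L = 4.2, 60e-6             # group index and ring circumference (illustrative)
T_rt = n_g * L / c0             # round-trip delay of the ring
r, a = 0.98, 0.99               # self-coupling and round-trip amplitude

f = 193.5e12 + 50e9 * np.arange(16)      # 16 WDM channels on a 50-GHz grid
phi = 2 * np.pi * f * T_rt               # per-channel round-trip phase
t = (r - a * np.exp(-1j * phi)) / (1 - r * a * np.exp(-1j * phi))
for ch, p_db in enumerate(10 * np.log10(np.abs(t) ** 2)):
    print(f"channel {ch:2d}: {p_db:7.2f} dB")
```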
Contact
liaoyuan.cheng@tum.de
SPICE-Compatible Modeling and Design for Electronic-Photonic Integrated Circuits
Description
Electronic-photonic integrated circuit (EPIC) technologies are revolutionizing computing systems by improving their performance and energy efficiency. However, simulating EPICs is challenging and time-consuming. In this seminar, the student will investigate SPICE-compatible modeling and design methods for EPICs.
Contact
liaoyuan.cheng@tum.de
Percolation on complex networks: Theory and application
Description
In the last two decades, network science has blossomed and influenced various fields, such as statistical physics, computer science, biology, and sociology, through its perspective on the heterogeneous interaction patterns of the components composing complex systems. As a paradigm for random and semi-random connectivity, the percolation model plays a key role in the development of network science and its applications.

On the one hand, concepts and analytical methods intimately related to percolation theory, such as the emergence of the giant cluster, finite-size scaling, and the mean-field method, are employed to quantify and solve core problems in networks. On the other hand, insights from percolation theory facilitate the understanding of networked systems, including robustness, epidemic spreading, vital-node identification, and community detection. Meanwhile, network science also poses new questions for percolation theory itself, such as percolation in strongly heterogeneous systems, topological transitions of networks beyond pairwise interactions, and the emergence of a giant cluster with mutual connections.

Percolation theory has thus permeated research on structural analysis and dynamic modeling in network science, and understanding it should help with many open questions at the frontiers of the field, such as networks beyond pairwise interactions, temporal networks, and networks of networks. The aim of this topic is to offer an overview of these applications as well as of the basic theory of percolation transitions on network systems; a minimal simulation is sketched below.
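As a minimal numerical companion, the sketch below performs bond percolation on an Erdős–Rényi graph using the networkx package and prints the giant-cluster fraction as the edge-occupation probability sweeps through the percolation threshold (about 1/3 for mean degree 3). The graph size and sweep grid are illustrative.

```python
import random
import networkx as nx

random.seed(0)
G = nx.erdos_renyi_graph(2000, 3 / 2000, seed=1)       # mean degree ~ 3

for occ in [0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 1.0]:
    H = G.copy()                                       # occupy each edge with prob. occ
    H.remove_edges_from([e for e in G.edges if random.random() > occ])
    giant = max(nx.connected_components(H), key=len)
    print(f"occupation {occ:.1f}: giant-cluster fraction "
          f"{len(giant) / G.number_of_nodes():.3f}")
```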
Contact
m.lian@tum.de