Seminar on Topics in Integrated Systems
Lecturer (contributor) | |
---|---|
Type | Seminar |
Scope | 3 SWS |
Semester | Winter semester 2023/24 |
Language of instruction | English |
Dates
Participation criteria
Note: Limited number of participants! Registration in TUMonline from 25.09. to 22.10.2023. Each student must choose a seminar topic before the introductory session by contacting the respective topic supervisor. Topics are assigned in the order in which requests arrive. The individual topics will be announced at <a href="https://www.ei.tum.de/lis/lehre/seminare/seminar-on-topics-in-integrated-system-design/"> https://www.ei.tum.de/lis/lehre/seminare/seminar-on-topics-in-integrated-system-design/</a> from 09.10.2023.
Learning objectives
Description
Content prerequisites
Teaching and learning methods
Coursework and examination
Recommended literature
Links
Offered topics
Assigned topics
Seminars
Topic on Caches and Memories
Description
To be discussed individually with the student
Prerequisites
B.Sc. in Electrical Engineering or a similar degree
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
DRAM Controller with built-in Caches
Description
DRAM modules are indispensable in modern computer architectures. Their main advantages are a simple design with only one transistor per bit and a high memory density.
However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles.
To reduce DRAM access latency, DRAM controllers provide sophisticated mechanisms such as access predictors or built-in caches. The goal of this seminar is to study and compare DRAM controller designs with built-in caches and to present their benefits and use cases. A starting point for the literature will be provided.
Prerequisites
B.Sc. in Electrical Engineering or a similar degree
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Hardware-Aware Layer Fusion of Deep Neural Networks
Description
Deep Neural Network (DNN) workloads can be scheduled onto DNN accelerators in different ways, from layer-by-layer scheduling to cross-layer depth-first scheduling (a.k.a. layer fusion or cascaded execution). The dataflow and mapping of a DNN influence its compute and energy efficiency on edge accelerators. Layer fusion is a concept that enables the processing of consecutive CNN layers by exploiting data reuse and layer dependencies without resorting to costly off-chip memory accesses. To implement layer fusion optimally, different combinations of mapping and scheduling parameters need to be evaluated. This seminar explores the broad search space for scheduling a given workload on a wide variety of DNN architectures through analytical cost models and estimates the scheduling effects on the target architectures.
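The core idea can be illustrated with a minimal sketch of our own (not from the seminar material), using two 1D 3-tap convolutions: layer-by-layer execution materializes the full intermediate feature map, while the fused (depth-first) schedule computes just enough of layer 1 per output tile of layer 2, so intermediates can stay on-chip.

```python
# Hypothetical sketch: depth-first "layer fusion" of two 1D convolutions.

def conv1d(x, w):
    """'Valid' 1D convolution (correlation) with a short kernel."""
    return [sum(x[i + k] * w[k] for k in range(len(w)))
            for i in range(len(x) - len(w) + 1)]

def layer_by_layer(x, w1, w2):
    inter = conv1d(x, w1)          # full intermediate map materialized
    return conv1d(inter, w2)

def fused(x, w1, w2, tile=4):
    """Depth-first: per output tile of layer 2, compute only the needed
    slice of layer 1 (a halo of len(w1)+len(w2)-2 extra inputs)."""
    out = []
    n_out = len(x) - len(w1) - len(w2) + 2
    for t in range(0, n_out, tile):
        hi = min(t + tile, n_out)
        # input window needed for layer-2 outputs [t, hi)
        x_slice = x[t : hi + len(w1) + len(w2) - 2]
        out.extend(conv1d(conv1d(x_slice, w1), w2))
    return out
```

Both schedules produce identical results; they differ only in how much intermediate data is alive at once, which is exactly the trade-off the mapping search space explores.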
Contact
Shambhavi.Balamuthu-Sampath@bmw.de
Supervisor:
Automation of efficient Multi-task learning for Computer Vision applications
Description
AI-powered applications increasingly adopt Convolutional Neural Networks (CNNs) for solving many vision-related tasks (e.g., semantic segmentation, object detection), leading to more than one CNN running on resource-constrained devices. Supporting many models simultaneously on a device is challenging due to the increased computation, energy, and storage costs. An effective approach to address this problem is multi-task learning (MTL), where a set of tasks is learned jointly to allow some parameter sharing among tasks. MTL creates multi-task models based on CNN architectures called backbone models and has shown significantly reduced inference costs and improved generalization performance. The challenge, however, lies in the resource-efficient architecture design that determines which parameters of a backbone model to share across tasks. Manual architecture design calls for significant domain expertise when tweaking neural network architectures for every possible combination of learning tasks. The goal of this seminar is to investigate advances in automating the design of MTL models for computer vision tasks.
Contact
Shambhavi.Balamuthu-Sampath@bmw.de
Supervisor:
Neural Network Approaches for Camera Image Signal Processors
Description
Today, embedded camera applications are becoming more and more popular, performing vision processing for autonomous driving, surveillance, or user customization of appliance devices.
This seminar presentation shall give an overview of utilizing neural-network methods for ISP processing (turning raw camera data into RGB): either the full ISP pipeline, selected ISP functions, ISP post-processing (image enhancement), or ISP control functions and tuning.
Contact
stephan.herrmann@nxp.com
Supervisor:
Building reliable latency predictors for DNNs on target accelerators
Description
Fast energy and latency estimation is crucial during the early design stages of Deep Neural Networks (DNNs) and their hardware accelerators to assess the optimality of various design choices with regard to algorithms, hardware, and algorithm-to-hardware mapping. It particularly helps to measure the latency breakdown of the workload and pinpoint the bottlenecks that are responsible for performance degradation. To obtain reliable precision, various memory characteristics, potential memory-sharing scenarios for operands, as well as the effects on dataflow have to be considered. This seminar aims to investigate fast methods to build a reliable latency predictor for efficiently deploying DNNs on embedded hardware.
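To make the idea of an analytical latency predictor concrete, here is a hedged, first-order (roofline-style) sketch. The peak-performance numbers and the layer shape below are invented for illustration, not measured values from any real accelerator.

```python
# Hypothetical first-order latency model: latency is bounded by whichever
# resource saturates first, compute (MACs) or DRAM traffic, and the max()
# also tells us which one is the bottleneck.

def conv_layer_latency(macs, bytes_moved,
                       peak_macs_per_s=2e12,     # assumed accelerator peak
                       dram_bytes_per_s=25e9):   # assumed DRAM bandwidth
    t_compute = macs / peak_macs_per_s
    t_memory = bytes_moved / dram_bytes_per_s
    bottleneck = "compute" if t_compute >= t_memory else "memory"
    return max(t_compute, t_memory), bottleneck

# Example: a 3x3 convolution, 64 -> 64 channels on a 56x56 map (fp16).
macs = 56 * 56 * 64 * 64 * 3 * 3
weight_bytes = 64 * 64 * 3 * 3 * 2
act_bytes = (56 * 56 * 64 + 56 * 56 * 64) * 2    # input + output maps
latency, bottleneck = conv_layer_latency(macs, weight_bytes + act_bytes)
```

Real predictors refine exactly this model with memory-hierarchy detail (tiling, operand sharing, dataflow), which is what the seminar investigates.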
Contact
Shambhavi.Balamuthu-Sampath@bmw.de
Supervisor:
Efficient hardware acceleration of convolutional variants in Deep Neural Networks
Description
Deep Neural Networks (DNNs) have found their way into real-time safety-critical applications like autonomous driving for a wide range of computer vision tasks such as object detection and semantic segmentation. Convolution layers have dominated DNN architectures, and a variety of domain-specific accelerator architectures have been proposed to maximize their energy efficiency and performance. Recent research advocates the use of convolutional variants like dilated, grouped, depth-wise, and point-wise convolutions, and it has been challenging to maximize the inference performance for these variants on resource-constrained embedded hardware. This seminar offers an exciting opportunity to investigate the impact of the DNN accelerator's memory hierarchy, its dataflows, and optimal algorithms for efficient DNN inference on the edge.
Contact
Shambhavi.Balamuthu-Sampath@bmw.de
Supervisor:
Innovations of Binary Neural Networks
Description
Convolutional neural networks have been shown to be effective for many computer vision applications, such as in the field of autonomous driving. However, their recent advancements in prediction capability were brought about by employing more complex architectures, making them extremely difficult to implement on embedded accelerators. Binarization of neural networks, i.e., the quantization of neural network parameters and activations to 1 bit, shows compelling efficiency benefits, albeit with drawbacks in terms of prediction capability. Recent works focus on improving the prediction capabilities by introducing binarization-friendly architectures.
This work aims to investigate and compare novel binarization-friendly architectures in terms of their computational complexity and their prediction capabilities.
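The efficiency benefit of binarization comes from a well-known trick worth sketching here: with weights and activations constrained to {-1, +1} and packed as bits, a dot product reduces to XNOR plus popcount. A minimal self-contained illustration:

```python
# Sketch of the XNOR-popcount dot product behind binarized convolutions.
import random

def binary_dot(a_bits, w_bits, n):
    """Dot product of two {-1,+1} vectors of length n, each packed as an
    integer bit mask (bit = 1 encodes +1, bit = 0 encodes -1)."""
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)   # 1 where the signs agree
    matches = bin(xnor).count("1")
    return 2 * matches - n                        # agreements minus disagreements

# Reference check against the plain +-1 dot product:
n = 16
a = [random.choice([-1, 1]) for _ in range(n)]
w = [random.choice([-1, 1]) for _ in range(n)]
pack = lambda v: sum(1 << i for i, x in enumerate(v) if x == 1)
assert binary_dot(pack(a), pack(w), n) == sum(x * y for x, y in zip(a, w))
```

A single machine word thus replaces dozens of multiply-accumulates, which is exactly why accelerator-friendly BNN architectures are attractive despite their accuracy drawbacks.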
Contact
lukas.frickenstein@bmw.de
Supervisor:
Algorithms for Memory Prefetching
Description
DRAM modules are indispensable in modern computer architectures. Their main advantages are a simple design with only one transistor per bit and a high memory density.
However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles.
To reduce DRAM access latency, memory prefetching is a common technique to access data prior to its actual use. However, this requires sophisticated prediction algorithms in order to prefetch the right data at the right time.
The goal of this seminar is to study and compare several memory prefetching algorithms and to present their benefits and use cases. A starting point for the literature will be provided.
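As a flavor of the kind of algorithm covered here, the following is an illustrative sketch (our own simplification, not a specific published design) of a per-PC stride prefetcher: it learns the address stride of each load instruction and, once the stride is confirmed twice, prefetches the next address in the pattern.

```python
# Illustrative per-PC stride prefetcher.

class StridePrefetcher:
    def __init__(self):
        self.table = {}  # pc -> (last_addr, stride, confirmed)

    def access(self, pc, addr):
        """Called on every load; returns an address to prefetch, or None."""
        if pc not in self.table:
            self.table[pc] = (addr, 0, False)
            return None
        last, stride, _ = self.table[pc]
        new_stride = addr - last
        confirmed = new_stride == stride and stride != 0
        self.table[pc] = (addr, new_stride, confirmed)
        return addr + new_stride if confirmed else None

pf = StridePrefetcher()
# A load at pc=0x40 streaming through memory with stride 64:
issued = [pf.access(0x40, a) for a in range(0x1000, 0x1200, 64)]
# The first two accesses train the predictor; from the third on, it
# prefetches one stride ahead of the demand stream.
```

Real designs add confidence counters, prefetch distance/degree tuning, and throttling, which are natural comparison axes for the seminar.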
Prerequisites
B.Sc. in Electrical Engineering or a similar degree
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Analysis and Comparison of Interconnects for Packet-Processing Designs
Description
Switches and network interface cards receive traffic at hundreds of Gbit/s and process packets with stringent latency requirements. The bandwidth and forwarding latency of the interconnect highly affect the performance of the whole design. The goal of this seminar work is to investigate and compare state-of-the-art interconnects in terms of metrics such as throughput, latency, priority awareness, scalability, and chip area.
[1] Kurth, Andreas, et al. "An open-source platform for high-performance non-coherent on-chip communication." IEEE Transactions on Computers 71.8 (2021): 1794-1809.
[2] Lin, Jiaxin, et al. "PANIC: A high-performance programmable NIC for multi-tenant networks." Proceedings of the 14th USENIX Conference on Operating Systems Design and Implementation. 2020.
Contact
Email: klajd.zyla@tum.de
Supervisor:
Power Management for Network Packet Processing
Description
The incoming traffic load, and with it the computing requirements on network processing nodes such as edge servers, can span multiple orders of magnitude in a matter of milliseconds or less. The systems are usually dimensioned to cope with peak loads, resulting in low resource utilization when traffic is low. To ensure highly energy-efficient resource utilization, dynamic power management with respect to fluctuating processing requirements is necessary to reduce the power consumption in times of low traffic load (e.g., by dynamic voltage and frequency scaling (DVFS) or power gating).
The goal of this seminar is to establish an overview of current approaches to power management in network processing nodes such as edge servers. The focus lies on the employed methods, the timescales on which they operate, and their impact on processing performance indicators such as throughput and latency.
A starting point for the literature research can be:
https://dl.acm.org/doi/pdf/10.1145/3466752.3480098
Contact
marco.liess@tum.de
Supervisor:
Cache Prefetching Mechanisms
Description
Caches suffer by design from compulsory misses: the first access to a given cache line typically results in a miss, since the data is not yet present in the cache hierarchy.
To reduce these misses, caches can be extended with prefetching mechanisms that speculatively fetch cache lines before they are first accessed.
The goal of this seminar is to study and compare different cache prefetcher designs and to present their benefits and use cases. A starting point for the literature will be provided.
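The effect of even the simplest prefetcher on compulsory misses can be demonstrated with a toy experiment (our own sketch, not taken from the seminar literature): a direct-mapped cache scanned sequentially, with and without a next-line prefetcher.

```python
# Toy miss-count simulation: sequential scan over a direct-mapped cache.

LINE = 64          # bytes per cache line
SETS = 256         # direct-mapped: one line per set

def simulate(addresses, prefetch_next_line):
    cache = [None] * SETS
    misses = 0
    for addr in addresses:
        line = addr // LINE
        if cache[line % SETS] != line:
            misses += 1                   # compulsory/conflict miss
            cache[line % SETS] = line
        if prefetch_next_line:
            nxt = line + 1
            cache[nxt % SETS] = nxt       # speculative fill, no miss counted

    return misses

scan = range(0, 64 * 1024, 4)             # sequential 4-byte loads over 64 KiB
cold = simulate(scan, prefetch_next_line=False)
warm = simulate(scan, prefetch_next_line=True)
# cold == 1024 (one compulsory miss per line); warm == 1 (only the very
# first line misses, every later line was prefetched just in time).
```

More sophisticated designs (stride, stream, or correlation prefetchers) trade accuracy, coverage, and timeliness differently, which is the comparison this seminar targets.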
Prerequisites
B.Sc. in Electrical Engineering or a similar degree
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Prefetching Mechanisms for DRAM Systems
Description
DRAM modules are indispensable in modern computer architectures. Their main advantages are a simple design with only one transistor per bit and a high memory density.
However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles.
To reduce DRAM access latency, DRAM systems are often combined with prefetching mechanisms. The goal of this seminar is to study and compare prefetching mechanisms for DRAM systems and to present their benefits and use cases. A starting point for the literature will be provided.
Prerequisites
B.Sc. in Electrical Engineering or a similar degree
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Dynamic Voltage and Frequency Scaling (DVFS) Algorithms for Modern Multi-Core CPUs
Description
Multi-core processors are finding their way into more and more application domains and allow for the parallel execution of multiple tasks that are independent of each other. Especially in areas like vehicle networks or wearable computing, not only processing performance but also energy efficiency is of high importance. To save power during periods of low processing demand, Dynamic Voltage and Frequency Scaling (DVFS) adaptively scales the clock frequency and supply voltage. Modern CPUs even allow for individual frequency and voltage configurations per CPU core. And while the discrete voltage and clock levels may be fixed by the CPU design, the algorithm deciding when to increase or decrease them can sometimes be freely customized by the user.
The goal of this seminar topic is to investigate state-of-the-art approaches for highly efficient DVFS algorithms.
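A hedged sketch of what such a customizable policy can look like: a simple utilization-threshold governor with hysteresis. The operating points below are invented for illustration; real CPUs expose their own frequency/voltage tables.

```python
# Simple threshold-based DVFS governor sketch (assumed operating points).

OPPS = [            # (frequency in MHz, voltage in V), invented values
    (400, 0.70),
    (800, 0.85),
    (1200, 1.00),
    (1600, 1.15),
]

def next_opp(level, utilization, up=0.80, down=0.30):
    """Raise the operating point when the core is busy, lower it when
    mostly idle; the gap between `up` and `down` avoids oscillation."""
    if utilization > up and level < len(OPPS) - 1:
        return level + 1
    if utilization < down and level > 0:
        return level - 1
    return level

# A bursty utilization trace: ramp up under load, settle back down.
level = 0
trace = [0.95, 0.95, 0.90, 0.50, 0.10, 0.05]
history = []
for u in trace:
    level = next_opp(level, u)
    history.append(OPPS[level][0])
# history == [800, 1200, 1600, 1600, 1200, 800]
```

State-of-the-art algorithms replace the fixed thresholds with predictive or learning-based decisions and per-core coordination, which is where the seminar's literature survey would start.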
Supervisor:
Hyperdimensional Computing and Integer Echo State Networks
Hyperdimensional Computing, Reservoir Computing, Echo State Networks
In this seminar work, the student is asked to discuss Hyperdimensional Computing and its application in Reservoir Computing.
Description
Echo State Networks (ESNs) are a recent approach to adaptive, AI-based time-series processing and analysis. In contrast to classical deep neural networks, they consist of only three layers: an input layer, the reservoir, and an output layer. ESNs are therefore a promising architecture for efficient hardware implementations.
Hyperdimensional Computing is a mathematical framework for computing in distributed representations with high-dimensional random vectors and neural symbolic representation.
Recently, the reservoir of an ESN has been realized as a hypervector of n-bit integers based on hyperdimensional computing, resulting in an approximation of ESNs called Integer Echo State Networks (intESN) [1].
In this seminar work, the student shall summarize the hyperdimensional computing framework, as well as the application of hyperdimensional computing to ESNs in the form of the intESN.
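To give an intuition for the intESN reservoir, here is a hedged sketch following the recipe in [1] as we understand it: the recurrence is a cyclic shift of the integer hypervector, the input enters as a random bipolar hypervector, and clipping keeps entries in a small integer range. Input quantization and readout training are omitted for brevity.

```python
# Hedged sketch of an intESN-style integer reservoir update.
import random

N = 1000        # hypervector dimensionality
KAPPA = 3       # clipping threshold (small-integer reservoir states)

def clip(x, kappa=KAPPA):
    return [max(-kappa, min(kappa, v)) for v in x]

def shift(x):
    return x[-1:] + x[:-1]        # cyclic shift acts as the recurrence

def random_bipolar(rng):
    return [rng.choice((-1, 1)) for _ in range(N)]

rng = random.Random(0)
# One random bipolar "item memory" vector per quantized input level:
item_memory = {lvl: random_bipolar(rng) for lvl in range(-3, 4)}

state = [0] * N
for u in [1, -2, 3, 0, 2]:                 # a toy quantized input sequence
    enc = item_memory[u]
    state = clip([s + e for s, e in zip(shift(state), enc)])
# `state` now holds a bounded integer summary of the recent input history.
```

Because the state is a vector of small integers updated with shifts and additions only, the appeal for digital hardware is immediate.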
Relevant Paper:
[1] Kleyko, D., Frady, E. P., Kheffache, M., & Osipov, E. (2017). Integer Echo State Networks: Efficient Reservoir Computing for Digital Hardware. arXiv. https://doi.org/10.1109/TNNLS.2020.3043309
Prerequisites
To successfully carry out this seminar work, you should:
- work independently and self-organized
- have strong reading and writing comprehension of scientific papers
- perform structured literature research
Contact
Supervisor:
Types of Caching
Description
When we talk about "caches", it seems clear that we mean dedicated hardware.
But many other types of caching exist in the context of IT. Just have a look at the search results here: https://scholar.google.de/scholar?q=caching
This seminar should investigate in which contexts computer scientists talk about caching, what purpose it serves, and what the hardware and software requirements are.
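One software-side example of the same idea: memoization of a pure function. Python's standard library even ships a ready-made LRU cache, illustrating how pervasive the caching pattern is outside hardware.

```python
# Software caching: memoizing a pure function with functools.lru_cache.
from functools import lru_cache

calls = 0

@lru_cache(maxsize=128)
def slow_square(x):
    global calls
    calls += 1          # count how often the real computation runs
    return x * x

results = [slow_square(v) for v in (3, 3, 3, 4)]
# Four lookups, but only two actual computations: the repeats hit the cache.
```

The same questions that matter for hardware caches (capacity, eviction policy, invalidation) reappear here in software form.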
Contact
flo.maurer@tum.de
Supervisor:
Cache Coherency Between Compute Nodes
Description
Despite its age, cache coherency is the subject of much ongoing research.
With the development of networks-on-chip (NoCs) to place more CPUs on a single compute node, directory-based coherency protocols have come into wider use.
This seminar should look at coherency not within the scope of a single compute node (chip) using NoCs, but across interconnected compute nodes.
A first idea for coherency between connected compute nodes was already developed in the 1990s: https://ieeexplore.ieee.org/document/6182605
Starting from there, we want to see how this topic has developed and which types of architectures have been targeted.
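The directory idea underlying these protocols can be sketched in a few lines (our own minimal illustration, not any specific protocol): instead of broadcasting, a directory tracks which nodes hold a copy of each line and invalidates exactly those sharers on a write.

```python
# Minimal directory-based coherency sketch.

class Directory:
    def __init__(self):
        self.sharers = {}       # line -> set of node ids holding a copy
        self.invalidations = [] # (line, node) invalidation messages sent

    def read(self, node, line):
        # A read adds the node to the sharer set for the line.
        self.sharers.setdefault(line, set()).add(node)

    def write(self, node, line):
        # Invalidate every other sharer, then record exclusive ownership.
        for other in self.sharers.get(line, set()) - {node}:
            self.invalidations.append((line, other))
        self.sharers[line] = {node}

d = Directory()
d.read(0, 0x80)
d.read(1, 0x80)
d.read(2, 0x80)
d.write(1, 0x80)
# Only nodes 0 and 2 receive invalidations; node 1 becomes the sole owner.
```

Scaling this bookkeeping from cores on one chip to interconnected compute nodes, with their much higher message latencies, is exactly the challenge the seminar literature addresses.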
Contact
flo.maurer@tum.de