Seminars
Weight Clustering and Sparsity Acceleration Techniques for Efficient Deep Learning Inference on Low-Power Processors
ML, Machine learning, Sparsity Acceleration, Weight Clustering
Description
This survey explores algorithmic and architectural techniques for improving the efficiency of deep neural network (DNN) inference on resource-constrained processors such as embedded CPUs and microcontrollers. Students will survey recent research on weight clustering (quantization-aware weight grouping to reduce compute and memory costs) and sparsity acceleration (exploiting zero-valued weights or activations to skip computations). The survey paper may cover software-level optimizations, compiler-assisted techniques, and hardware extensions — with optional emphasis on RISC-V CPUs or other low-power architectures.
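To make the two techniques concrete, the following sketch (a toy illustration using only NumPy; the function names are hypothetical, and real deployments rely on toolchains such as TensorFlow Lite or TVM) clusters a weight vector into a small shared codebook and skips zero weights in a dot product:
```python
# Illustrative sketch only: weight clustering via k-means and a sparse
# dot product that skips zero weights.
import numpy as np

def cluster_weights(weights, n_clusters=16, n_iters=25):
    """Group weights into n_clusters shared values (the codebook).
    Each weight is then stored as a small index into the codebook."""
    centroids = np.linspace(weights.min(), weights.max(), n_clusters)
    for _ in range(n_iters):
        # Assign every weight to its nearest centroid.
        idx = np.abs(weights[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each centroid to the mean of its assigned weights.
        for k in range(n_clusters):
            if np.any(idx == k):
                centroids[k] = weights[idx == k].mean()
    return centroids, idx  # 4-bit indices instead of 32-bit floats

def sparse_dot(weights, activations, threshold=0.0):
    """Skip multiply-accumulates for (near-)zero weights."""
    nz = np.abs(weights) > threshold
    return float(weights[nz] @ activations[nz])

w = np.random.randn(256) * (np.random.rand(256) > 0.7)  # ~70% zeros
codebook, indices = cluster_weights(w)
w_clustered = codebook[indices]              # decompressed weights
y = sparse_dot(w_clustered, np.random.randn(256))
```
Storing 4-bit codebook indices instead of 32-bit floats cuts weight memory by roughly 8x, while the zero-skipping loop saves one multiply-accumulate per pruned weight.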
Sample References:
- https://ieeexplore.ieee.org/abstract/document/10546844
- https://ieeexplore.ieee.org/document/11113397
- https://ieeexplore.ieee.org/ielx7/43/9716240/09380321.pdf?tp=&arnumber=9380321&isnumber=9716240&ref=aHR0cHM6Ly9zY2hvbGFyLmdvb2dsZS5jb20v
- https://upcommons.upc.edu/server/api/core/bitstreams/bcfd5ee6-208e-42ba-bf98-e040809f4443/content
Contact
Philipp van Kempen
Staying Relevant: Investigation of Embedded Software Project Requirements to Ensure Easy, Safe, and Secure OTA-Update Ability
Description
To keep modern embedded systems always up to date and relevant, one needs to ensure that the software executed on the ECU is current and secure. A common method is updating the software after deployment via networks, a process called Over-the-Air (OTA) updates. However, achieving this in a safe and secure manner requires fulfilling certain requirements. This seminar aims to identify the requirements a software project and operating system must fulfill to successfully support OTA updates.
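As a rough illustration of the "safe and secure" aspect, the following sketch shows one common pattern (A/B slots with image verification before an atomic switch). All names and the flow are hypothetical simplifications of what frameworks such as Mender or SWUpdate provide:
```python
# Toy sketch of a verify-then-apply OTA update with A/B slots.
# Hypothetical names; real systems use signed manifests, secure boot,
# and a bootloader-managed slot switch.
import hashlib

slots = {"A": b"firmware-v1", "B": None}   # two firmware slots
active = "A"
trusted_digests = {hashlib.sha256(b"firmware-v2").hexdigest()}  # vendor-approved

def apply_update(image: bytes) -> str:
    """Write the image to the inactive slot and switch only if it verifies."""
    global active
    if hashlib.sha256(image).hexdigest() not in trusted_digests:
        raise ValueError("rejected: image failed verification")
    inactive = "B" if active == "A" else "A"
    slots[inactive] = image      # old image stays intact for rollback
    active = inactive            # on real hardware: an atomic bootloader flag
    return active

apply_update(b"firmware-v2")     # boots from slot B; slot A remains as fallback
```
A boot counter or watchdog would revert to the old slot if the new image fails its self-test, which is the essence of a safe OTA update.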
Research suggests that certain patterns during project setup can ensure that the compiled binary remains easily maintainable and updatable later on. The work in [1] provides a good introduction to the topic, and its sections "Analysing Update Requirements in IoT Operating Systems" and "Software Module Management" offer a good starting point for analyzing the requirements of the embedded software system. The works presented in [2] and [3] provide further background and can help to dive deeper into the topic, allowing for a more detailed view of the requirements of the embedded software running on the distributed ECUs.
This seminar aims to investigate the current state of the art in embedded software realizing OTA update strategies. Through a critical analysis of existing literature and case studies, participants will explore the requirements and high-level concepts behind over-the-air updates and how they are realized and ensured in current embedded software projects. By examining the intricacies of OTA updates, students will gain a deeper, holistic understanding of the aftermarket maintenance of modern embedded systems.
Bibliography
[1] J. Bauwens, P. Ruckebusch, S. Giannoulis, I. Moerman and E. D. Poorter, "Over-the-Air Software Updates in the Internet of Things: An Overview of Key Principles," in IEEE Communications Magazine, vol. 58, no. 2, pp. 35-41, February 2020, doi: 10.1109/MCOM.001.1900125, https://ieeexplore.ieee.org/abstract/document/8999425
[2] Lethaby, Nick. "A more secure and reliable OTA update architecture for IoT devices." Texas Instruments (2018). https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/968/SecureOTA_5F00_TI.pdf
[3] G. Jurkovic and V. Sruk, "Remote firmware update for constrained embedded systems," 2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), Opatija, Croatia, 2014, pp. 1019-1023, doi: 10.1109/MIPRO.2014.6859718. https://ieeexplore.ieee.org/abstract/document/6859718
Placement of Systolic Arrays for Neural Network Accelerators
Description
Systolic arrays are a proven architecture for parallel processing across various applications, offering design flexibility, scalability, and high efficiency. With the growing importance of neural networks in many areas, there is a need for efficient processing of the underlying computations, such as matrix multiplications and convolutions. These computations can be executed with a high degree of parallelism on neural network accelerators utilizing systolic arrays.
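As a mental model of how such an array computes a matrix product, consider this toy cycle-level sketch (illustrative only; the names are hypothetical), in which processing element (i, j) performs one multiply-accumulate per cycle on skewed input streams:
```python
# Toy cycle-level sketch of an output-stationary N x N systolic array
# computing C = A @ B.
import numpy as np

def systolic_matmul(A, B):
    N = A.shape[0]                      # assume square N x N operands
    C = np.zeros((N, N))
    # Input skewing: PE (i, j) sees A[i, k] and B[k, j] at cycle i + j + k.
    for cycle in range(3 * N - 2):
        for i in range(N):
            for j in range(N):
                k = cycle - i - j       # operand pair arriving this cycle
                if 0 <= k < N:
                    C[i, j] += A[i, k] * B[k, j]   # one MAC per PE per cycle
    return C

A, B = np.random.randn(4, 4), np.random.randn(4, 4)
assert np.allclose(systolic_matmul(A, B), A @ B)
```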
Like any application-specific integrated circuit (ASIC) or field-programmable gate array (FPGA) design, neural network accelerators go through the standard phases of chip design. However, treating systolic array hardware designs the same way as any other design may lead to suboptimal results, since exploiting the regular structure of systolic arrays can yield better solution quality [1].
Relevant works for this seminar topic include that of Fang et al. [2], where a regular placement is used as an initial solution and then iteratively improved using the RePlAce [3] placement algorithm. The placement of systolic arrays on FPGAs is discussed by Hu et al. [4], where the processing elements of the systolic array are placed on the DSP columns more efficiently than by the default placement of commercial tools.
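The "regular placement as an initial solution" idea can be pictured with a small sketch (hypothetical names; a real flow would hand this result to an iterative global placer such as RePlAce):
```python
# Sketch: lay out the PEs of a rows x cols systolic array on a uniform
# grid before iterative refinement, preserving the array's 2D regularity.
def regular_initial_placement(rows, cols, pe_width, pe_height, spacing=0.0):
    """Return {(row, col): (x, y)} coordinates for each processing element."""
    placement = {}
    for r in range(rows):
        for c in range(cols):
            x = c * (pe_width + spacing)
            y = r * (pe_height + spacing)
            placement[(r, c)] = (x, y)
    return placement

# Neighbouring PEs start out adjacent, so the short nearest-neighbour
# wires of the systolic dataflow tend to stay short after refinement.
initial = regular_initial_placement(rows=8, cols=8, pe_width=10, pe_height=10)
```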
In this seminar, you will investigate different macro and cell placement approaches, focusing on methods that specifically consider systolic array placement. If you have questions regarding this topic, please feel free to contact me.
[1] S. I. Ward et al., "Structure-Aware Placement Techniques for Designs With Datapaths," in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 32, no. 2, pp. 228-241, Feb. 2013, doi: https://doi.org/10.1109/TCAD.2012.2233862
[2] D. Fang, B. Zhang, H. Hu, W. Li, B. Yuan and J. Hu, "Global Placement Exploiting Soft 2D Regularity," in ACM Transactions on Design Automation of Electronic Systems, vol. 30, no. 2, pp. 1-21, Jan. 2025, doi: https://doi.org/10.1145/3705729
[3] C. -K. Cheng, A. B. Kahng, I. Kang and L. Wang, "RePlAce: Advancing Solution Quality and Routability Validation in Global Placement," in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 9, pp. 1717-1730, Sept. 2019, doi: https://doi.org/10.1109/TCAD.2018.2859220
[4] H. Hu, D. Fang, W. Li, B. Yuan and J. Hu, "Systolic Array Placement on FPGAs," 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), San Francisco, CA, USA, 2023, pp. 1-9, doi: https://doi.org/10.1109/ICCAD57390.2023.10323742
Contact
benedikt.schaible@tum.de
The ZSim Performance Simulator
Description
Performance simulation is a crucial step in modern design space exploration, enabling the identification of optimal systems. ZSim enables fast and accurate simulation at the microarchitectural level and targets huge, thousand-core systems.
"Architectural simulation is time-consuming, and the trend
towards hundreds of cores is making sequential simulation
even slower. Existing parallel simulation techniques either
scale poorly due to excessive synchronization, or sacrifice ac-
curacy by allowing event reordering and using simplistic con-
tention models. As a result, most researchers use sequential
simulators and model small-scale systems with 16-32 cores.
With 100-core chips already available, developing simulators
that scale to thousands of cores is crucial.
We present three novel techniques that, together, make
thousand-core simulation practical. First, we speed up de-
tailed core models (including OOO cores) with instruction-
driven timing models that leverage dynamic binary trans-
lation. Second, we introduce bound-weave, a two-phase
parallelization technique that scales parallel simulation on
multicore hosts efficiently with minimal loss of accuracy.
Third, we implement lightweight user-level virtualization
to support complex workloads, including multiprogrammed,
client-server, and managed-runtime applications, without
the need for full-system simulation, sidestepping the lack
of scalable OSs and ISAs that support thousands of cores.
We use these techniques to build zsim, a fast, scalable,
and accurate simulator. On a 16-core host, zsim models a
1024-core chip at speeds of up to 1,500 MIPS using simple
cores and up to 300 MIPS using detailed OOO cores, 2-3 or-
ders of magnitude faster than existing parallel simulators.
Simulator performance scales well with both the number
of modeled cores and the number of host cores. We vali-
date zsim against a real Westmere system on a wide variety
of workloads, and find performance and microarchitectural
events to be within a narrow range of the real system." - Daniel Sanchez and Christos Kozyrakis: "ZSim: Fast and Accurate Microarchitectural Simulation of Thousand-Core Sytsems" 2013
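To give an intuition for the bound-weave technique named in the abstract, here is a heavily simplified trace-level sketch (all names are hypothetical; the real simulator works at microarchitectural detail): cores are first simulated in parallel with optimistic latencies, then their recorded accesses are replayed in order to account for contention.
```python
# Grossly simplified illustration of a two-phase (bound-weave-like)
# parallel simulation on memory-access traces.
from concurrent.futures import ThreadPoolExecutor

def bound_phase(core_id, trace):
    """Phase 1: run one core alone, assuming zero contention."""
    events, cycle = [], 0
    for addr in trace:
        cycle += 1                       # optimistic 1-cycle access
        events.append((cycle, core_id, addr))
    return events

def weave_phase(all_events):
    """Phase 2: merge per-core events and serialize conflicting accesses."""
    busy_until, corrected = 0, []
    for cycle, core_id, addr in sorted(all_events):
        start = max(cycle, busy_until)   # wait if the shared resource is busy
        busy_until = start + 1
        corrected.append((start, core_id, addr))
    return corrected

traces = {0: [0x100, 0x104], 1: [0x100, 0x108]}
with ThreadPoolExecutor() as pool:
    per_core = list(pool.map(lambda c: bound_phase(c, traces[c]), traces))
events = [e for core_events in per_core for e in core_events]
print(weave_phase(events))
```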
Contact
conrad.foik@tum.de
Innovative Memory Architectures in DNN Accelerators
Description
With the growing complexity of neural networks, more efficient and faster processing solutions are vital to enable the widespread use of artificial intelligence. Systolic arrays are among the most popular architectures for energy-efficient and high-throughput DNN hardware accelerators.
While many works implement DNN accelerators using systolic arrays on FPGAs, several application-specific integrated circuit (ASIC) designs from industry and academia have also been presented [1-3]. Such accelerators place high demands on the memory system, both in terms of data availability and latency hiding; innovative memory architectures can enable more efficient data access, reducing latency and bridging the gap towards even more powerful DNN accelerators.
One example is the Eyeriss v2 ASIC [1], which uses a distributed Global Buffer (GB) layout tailored to the demands of their row-stationary systolic array dataflow.
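As a back-of-the-envelope illustration of why a distributed buffer helps (the latencies and hit rate below are invented for illustration, not Eyeriss v2's actual parameters):
```python
# Toy model: expected buffer access latency when a fraction of accesses
# hits a bank local to the PE cluster instead of a far, shared buffer.
def avg_access_latency(local_hit_rate, local_latency=1, global_latency=8):
    return (local_hit_rate * local_latency
            + (1 - local_hit_rate) * global_latency)

monolithic = avg_access_latency(local_hit_rate=0.0)   # every access is far: 8.0
distributed = avg_access_latency(local_hit_rate=0.9)  # mostly in-cluster: 1.7
```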
In this seminar, a survey of state-of-the-art DNN accelerator designs and design frameworks shall be created, focusing on their memory hierarchies.
References and Further Resources:
[1] Y.-H. Chen, T.-J. Yang, J. Emer and V. Sze, "Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices," in IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 9, no. 2, pp. 292-308, June 2019, doi: https://doi.org/10.1109/JETCAS.2019.2910232
[2] Yunji Chen, Tianshi Chen, Zhiwei Xu, Ninghui Sun, and Olivier Temam. 2016. "DianNao family: energy-efficient hardware accelerators for machine learning." In Commun. ACM 59, 11 (November 2016), 105–112. https://doi.org/10.1145/2996864
[3] Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, et al. 2017. "In-Datacenter Performance Analysis of a Tensor Processing Unit." In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA '17). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/10.1145/3079856.3080246
[4] Rui Xu, Sheng Ma, Yang Guo, and Dongsheng Li. 2023. A Survey of Design and Optimization for Systolic Array-based DNN Accelerators. ACM Comput. Surv. 56, 1, Article 20 (January 2024), 37 pages. https://doi.org/10.1145/3604802
[5] Bo Wang, Sheng Ma, Shengbai Luo, Lizhou Wu, Jianmin Zhang, Chunyuan Zhang, and Tiejun Li. 2024. "SparGD: A Sparse GEMM Accelerator with Dynamic Dataflow." ACM Trans. Des. Autom. Electron. Syst. 29, 2, Article 26 (March 2024), 32 pages. https://doi.org/10.1145/3634703
Contact
benedikt.schaible@tum.de