Seminar on Next Generation AI Hardware (IN 2107)

Dr. Juan Durillo (LRZ), Dr. Josef Weidendorfer (LRZ, TUM), Dr. Nicolay Hammer (LRZ)

Dates:	TBD (possibly in block format towards the end of the semester)
Planning meeting:	No separate planning meeting will be held.
First meeting:
ECTS:	5
Language:	English
Type:	Seminar, 2S
Moodle course:
Registration:	Matching System

Motivation

Over the last decade, Artificial Intelligence (AI), and particularly Machine Learning (ML), has changed the way many industries address their challenges. It also promises to improve many aspects of our lives (e.g., AI-assisted medicine and autonomous driving). ML methods are mathematical expressions whose structure is determined by a so-called neural architecture (e.g., convolutional neural networks, recurrent neural networks, or transformers) and whose coefficients are sought to best describe the structures in a dataset. The resulting computation defines a workload whose characteristics also depend on the neural architecture. Although CPUs and GPUs are used extensively to run these workloads, they are usually optimized for other tasks, which results in inefficiencies (in wall time, energy consumption, etc.). This has led to the emergence of numerous specialized hardware designs for ML: the Google TPU, the Apple M1 Neural Engine, Graphcore IPUs, the Cerebras Wafer-Scale Engine, and SambaNova, among others. The rapid changes that the ML field is experiencing make it difficult to design a single architecture that is optimal for all workloads. Understanding the benefits of different hardware architectures under different constraints (size, latency, power, etc.) is important to sustain and accelerate progress in ML.

The goal of the seminar is to explore the current landscape of hardware for AI by analyzing the trade-offs between performance and functionality. Potential topics include (but are not limited to):

ML on the Edge
ML on FPGAs
Review of state-of-the-art AI HW (ASICs, GPU and CPU special instructions, etc.)
Neural architectures and Hardware Accelerators
Streaming Execution Models
Sparsity Opportunities and Hardware Support
Instruction Set Architecture for ML
ML Workloads Bottleneck Analysis

Students will select material on their topic, write a report, and give a presentation.


Informatik 10 - Chair of Computer Architecture and Parallel Systems (Lehrstuhl für Rechnerarchitektur & Parallele Systeme)

Prof. Dr. Martin Schulz
schulzm(at)in.tum.de

Prof. Dr. Michael Gerndt
gerndt(at)in.tum.de

Prof. Dr.-Ing. Carsten Trinitis
Carsten.Trinitis(at)tum.de

Address:
Technische Universität München
Boltzmannstraße 3
85748 Garching
Germany

Secretary's office:
Room 01.04.40
Tel.: +49 89 289-17659
Fax: +49 89 289-17662
