Motivation

Artificial Intelligence (AI), and particularly Machine Learning (ML), have changed the way many industries address numerous challenges over the last decade. They also aim to revolutionize (for the better) many aspects of our lives (e.g., AI-assisted medicine, autonomous driving, etc.). ML methods are mathematical expressions, determined by a so-called neural architecture (e.g., convolutional neural networks, recurrent neural networks, transformers, etc.), whose coefficients are fitted to best describe the structure in a dataset. This fitting process defines a workload whose characteristics also depend on the neural architecture. Although CPUs and GPUs are extensively used to run these workloads, they are usually optimized for other tasks, which results in inefficiencies (wall time, energy consumption, etc.). This has led to the appearance of numerous specialized hardware designs for ML: the Google TPU, the Apple Neural Engine, GraphCore IPU processors, the Cerebras Wafer-Scale Engine, and SambaNova systems, among others. The rapid transformations that the ML field is experiencing make it difficult to design a single architecture that is optimal for all workloads. Understanding the benefits of different hardware architectures under different constraints (size, latency, power, etc.) is important to sustain and accelerate ML progress.

The goal of this seminar is to explore the existing panorama of hardware for AI by analyzing trade-offs between performance and functionality. The list of potential topics to be covered includes (but is not limited to):

  • ML on the Edge    
  • ML on FPGAs    
  • Review of state-of-the-art AI HW (ASICs, GPU and CPU special instructions, etc.)    
  • Neural architectures and Hardware Accelerators    
  • Streaming Execution Models    
  • Sparsity Opportunities and Hardware Support    
  • Instruction Set Architecture for ML    
  • ML Workloads Bottleneck Analysis  

Students will select material on their chosen topic, write a report, and give a presentation.