Accelerating Convolutional Neural Networks using Programmable Logic

Dates:

Tuesday 10:00-12:00 (Lecture) (Room: 01.06.020)

Friday 14:00-16:00 (Lab Session) (Room: 01.06.020)

First meeting:

Tuesday 16.04 at 10:00-12:00 (Room: 01.06.020)

Preliminary Meeting:

slides (Additional Information)
ECTS: 10
Language: English
Type: Bachelor/Master lab course (IN0012, IN2106)
Moodle course: t.b.d

Registration:

Registration is through the matching system

Please sign up to the course: "Practical: Accelerating Convolutional Neural Networks using Programmable Logic"

Questions? 

Contact dirk.stober(at)tum.de

 

This course is part of the BB-KI (Brandenburg / Bayern Aktion für KI-Hardware) chips project, aimed at offering practical courses in the area of dedicated AI Hardware.

It is offered separately for bachelor and master students, but carried out as one event.


Content

The course consists of a weekly lecture, which teaches the required concepts, introduces the practical exercises and hosts the student presentations. In addition, a weekly lab slot is offered where students can ask questions and get help with the practical exercises. The course will cover the following:

  • Introduction to Convolutional Neural Networks (CNNs) and implementation of CNN inference 
  • Understanding of the building blocks of FPGAs and their purpose using SystemVerilog
  • Project in simulation and synthesis, co-designing your own CNN accelerator using HLS
  • Implementation of the accelerator on an FPGA and integration with a CPU using the Pynq Z2 board
  • Evaluation of key performance metrics and comparison of SW/HW implementations

The main focus of the course is on the acceleration of algorithms using FPGAs, not on AI!


Grading

The lab will be done in small groups (max. 3 students) and consists of minor ungraded labs as well as a mid-term report. The final grade will be based on a project (HW/SW co-design of CNN inference), including a report, a presentation and an individual discussion of the implementation.


Learning Outcomes

  • Basic understanding of Convolutional Neural Networks (mainly Inference)
  • Basic knowledge of existing AI accelerators
  • Understanding the challenges of using programmable logic (PL) to accelerate workloads
  • Ability to design simple digital circuits using RTL and HLS languages
  • Implementation and Integration of both SW and PL on a SoC platform (Pynq Z2)
  • Co-design of SW and HW
  • Ability to reason about the performance of different implementations

Prerequisites

  • Programming experience in C/C++ required
  • Basic knowledge of Microcontrollers recommended
  • Basic knowledge of an RTL language (Verilog/VHDL) recommended, or willingness to learn one on your own
  • Knowledge of Machine Learning not required