M.Sc. Dirk Stober
(PhD Student)
dirk.stober(at)tum.de
Boltzmannstr. 3
85748 Garching b. München
Room: 5606.01.033
Research Interests
- Memory Accesses on FPGAs
  - Benchmarking different access patterns and FPGA devices
  - Optimizing memory accesses
  - Configuring memory systems on FPGAs to match the bitstream
- High-Level Synthesis (HLS)
  - Developing accelerators using HLS
  - Extracting a memory access graph from HLS
- Integration of FPGAs
  - Simplifying communication between host and FPGA for both SoC and datacenter devices
- Open-Source EDA tools
Student Thesis (GR/BA/MA)
If you are interested in one of the following topics, feel free to contact me about a thesis opportunity (please attach your Transcript of Records (ToR) and describe any relevant experience). Unless annotated otherwise, every topic can be adapted to a Guided Research (GR), Bachelor Thesis (BA), or Master Thesis (MA). Most of the topics require previous experience in FPGA development. In addition to the listed topics, I can also supervise external theses if they are related to FPGAs, EDA, or embedded systems.
Open-Source FPGA/EDA tools
I want to explore the usability, availability, and performance of open-source tools as an alternative to the commonly used proprietary toolchains (Vivado, Vitis HLS, etc.). You would use existing designs or implement simple example designs to evaluate a single open-source tool, for example:
- HLS: PandA Bambu
- Synthesis: Yosys
- Place+Route: VTR, nextpnr
or evaluate a complete flow from either C++ or Verilog to a post-P&R representation that can be converted to a bitstream. You should be interested in understanding EDA tools and have experience in FPGA development. Because these projects are still incomplete, the toolchains will require extensive manual configuration.
Extracting memory access patterns from Vitis HLS LLVM frontend
High-Level Synthesis (HLS) tools simplify FPGA design for software engineers. HLS provides fast feedback on the on-chip performance of functions; however, it cannot model the performance of off-chip memory accesses. As a result, designers of memory-bound algorithms have to implement their circuits to get feedback, which can take multiple hours for larger chips. In this thesis, you would start by implementing a custom pass for the Vitis HLS LLVM toolchain that extracts information about accesses to off-chip arguments (top-level arguments) and how they connect to each other. Depending on the type of thesis, you would then work either on visualizing the memory access graph or on converting the graph to instructions that can be simulated on the FPGA using an existing traffic generator.
This thesis does not require previous FPGA knowledge. You should be interested in compilers, and some experience with LLVM would be helpful.
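To give a rough idea of what a memory access graph could look like as a data structure, here is a minimal C++ sketch. It is not taken from the Vitis HLS toolchain or from any existing pass: the names `AccessNode`, `MemoryAccessGraph`, `add_access`, and `add_dependency` are hypothetical, and the sketch assumes each node is one access to a top-level argument and each edge records that one access depends on another.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: whether an access reads from or writes to off-chip memory.
enum class AccessKind { Read, Write };

// One access to a top-level (off-chip) argument.
struct AccessNode {
    std::string argument;  // name of the top-level argument (illustrative)
    AccessKind kind;
};

// Directed graph stored as adjacency lists: an edge from -> to means
// that access `to` depends on access `from`.
struct MemoryAccessGraph {
    std::vector<AccessNode> nodes;
    std::vector<std::vector<std::size_t>> successors;

    // Adds an access and returns its node index.
    std::size_t add_access(const std::string& argument, AccessKind kind) {
        nodes.push_back({argument, kind});
        successors.emplace_back();
        return nodes.size() - 1;
    }

    // Records that access `to` depends on access `from`.
    void add_dependency(std::size_t from, std::size_t to) {
        assert(from < nodes.size() && to < nodes.size());
        successors[from].push_back(to);
    }

    // Counts how many accesses touch each top-level argument.
    std::map<std::string, std::size_t> accesses_per_argument() const {
        std::map<std::string, std::size_t> counts;
        for (const AccessNode& node : nodes) {
            ++counts[node.argument];
        }
        return counts;
    }
};
```

In an actual thesis, the nodes and edges would be filled in by the custom LLVM pass. A graph like this could then be fed to a visualization or translated into traffic generator instructions.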
Implementing benchmarks for FPGAs using HLS
In this thesis you will implement one or more kernels that are mostly memory bound on FPGAs. Potential kernels include SpMV, KNN, BFS, SSSP, NW, K-means, and Pathfinder. You can select the kernel according to your preference; for some of them you can rely on existing implementations. This thesis requires in-depth knowledge of FPGA development and, ideally, experience with HLS tools.
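As an illustration of why such kernels tend to be memory bound, the following is a minimal plain C++ sketch of SpMV (sparse matrix-vector multiplication) over a matrix in CSR format. It is a software reference only, not an HLS-optimized design. The struct and field names are assumptions made for this example, and a real HLS kernel would add tool-specific interface and pipelining directives.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sparse matrix in CSR (compressed sparse row) format.
// Field names are illustrative.
struct CsrMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // rows + 1 entries; row r spans [row_ptr[r], row_ptr[r + 1])
    std::vector<std::size_t> col_idx;  // column index of each non-zero
    std::vector<float> values;         // value of each non-zero
};

// Computes y = A * x.
std::vector<float> spmv(const CsrMatrix& a, const std::vector<float>& x) {
    assert(a.row_ptr.size() == a.rows + 1);
    std::vector<float> y(a.rows, 0.0f);
    for (std::size_t r = 0; r < a.rows; ++r) {
        float acc = 0.0f;
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k) {
            // x is read through col_idx, so the address depends on the data.
            acc += a.values[k] * x[a.col_idx[k]];
        }
        y[r] = acc;
    }
    return y;
}
```

Each non-zero costs one multiply-add but three memory reads (`values`, `col_idx`, and an indirect read of `x`), so performance is usually limited by memory accesses rather than by arithmetic.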
Teaching
During the summer semester I give a practical course that I designed as part of the BB-KI project, and during the winter semester I give a seminar. You can find more information about the courses on the Chair’s website or by sending me an email.
Practical: Accelerating CNNs using PL (SS)
The course teaches the relevant skills for students to develop their own FPGA designs using HLS. It consists of a weekly lecture and accompanying labs. It is roughly split into the following blocks:
- Digital Design and FPGAs: Recaps digital design basics and introduces the building blocks of FPGAs. Students use SystemVerilog and Vivado to synthesize simple designs.
- High-Level Synthesis: The main part of the course covers writing HLS code, simulation, and implementation, as well as integration on a Zynq board using XRT.
- Project Accelerating CNNs: Students accelerate a simple CNN, starting from a provided C++ implementation. We use the PYNQ-Z2 board to implement the design.
Seminar: Integration and Development of HW Accelerators (WS)
In the seminar, students can pick from different topics or suggest their own topics related to the area. Students prepare a literature survey or replicate parts of existing research, and present their results in the form of a report and a presentation.