Seminar on Dataflow-based Computing System for Irregular Parallelism
Yicheng Zhang and Kun Qin
Dates: Kick-off meeting mid October
Kick-off meeting: 17.10.2025 (tentative)
Presentations: t.b.d.
ECTS: 5
Language: English
Type: Seminar, 2 SWS
Registration: Matching system
Questions: Contact yicheng.zhang(at)tum.de
Requirements: Sufficient understanding of computer architecture
Planning meeting: Jul 7, 2025, 02:30 PM (Amsterdam, Berlin, Rome, Stockholm, Vienna); Meeting ID: 676 2535 5451
Motivation
Irregular parallelism characterizes a critical class of workloads, such as sparse linear algebra and graph analytics, that is pervasive in scientific computing, simulation, and machine learning. These workloads feature data-dependent branches and pointer-based data structures, which challenge traditional computer systems with frequent synchronization, irregular memory accesses, and the dynamic generation of non-uniform tasks. Dataflow computing, in contrast to control flow, triggers execution by the availability of data, so only true dependencies block processing; this makes it inherently suitable for exploiting irregular parallelism.
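The dataflow firing rule described above can be illustrated with a minimal sketch in plain Python (the graph and node names here are hypothetical, purely for illustration): a node executes as soon as all of its operands have arrived, so independent nodes may fire in any order and only true data dependencies serialize work.

```python
# Minimal dataflow-graph interpreter: a node fires as soon as all of
# its input tokens are available; independent nodes may fire in any order.
from collections import deque

def run_dataflow(nodes, edges, inputs):
    """nodes: {name: function}, edges: {name: [producers of its operands]},
    inputs: {name: value} for the initial source tokens."""
    values = dict(inputs)                     # tokens produced so far
    pending = deque(nodes)                    # worklist of unfired nodes
    while pending:
        n = pending.popleft()
        deps = edges.get(n, [])
        if all(d in values for d in deps):    # firing rule: all operands ready
            values[n] = nodes[n](*(values[d] for d in deps))
        else:
            pending.append(n)                 # operands missing: retry later
    return values

# Example DFG computing (a + b) * (a - b): 'add' and 'sub' have no mutual
# dependency and could fire in parallel; 'mul' blocks only on its true
# dependencies, with no global synchronization.
nodes = {"add": lambda x, y: x + y,
         "sub": lambda x, y: x - y,
         "mul": lambda x, y: x * y}
edges = {"add": ["a", "b"], "sub": ["a", "b"], "mul": ["add", "sub"]}
print(run_dataflow(nodes, edges, {"a": 5, "b": 3})["mul"])  # (5+3)*(5-3) = 16
```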
Despite the limitations of historic pure dataflow ISAs, in recent years industry and academia have proposed various modern computing systems that absorb the idea of dataflow processing at multiple levels of computation, from the microarchitecture to the programming model. These systems are optimized both in hardware, ranging from general-purpose manycore architectures and coarse-grained reconfigurable arrays (CGRAs) to specialized accelerators, and in software infrastructure such as programming models, compilation toolchains, and runtime systems. However, a trade-off arises: improving one metric, such as flexibility, performance, or area/power efficiency, often comes at the expense of the others. In addition, to scale the system out at the node level and handle larger datasets, programmable network interface cards are an emerging contributor: they accelerate packet-processing pipelines and offload tasks such as packet classification, encryption, and telemetry, thereby reducing host load and synchronization overhead.
The goal of the seminar is to dive into state-of-the-art dataflow-inspired computing systems for irregular parallelism, covering hardware architecture, software infrastructure, and benchmarks. Potential topics to be investigated include (but are not limited to):
Manycore architectures
(Cerebras WSE, Graphcore IPU, Tesla Dojo, Manticore, HammerBlade, Dalorex)
- Microarchitecture: memory hierarchy (scratchpad vs. cache), intra-core parallelism, prefetching, speculation, thread-level parallelism, pipeline parallelism;
- System organization: clustering vs flat, network-on-chip;
- Dynamic scheduling: hardware task schedulers, runtime systems (Intel TBB, OpenCilk, StarPU, PaRSEC)
- Infrastructure for efficient scale-out: in-network computing, smart switches, smart network interface cards (NICs)
Coarse-Grained Reconfigurable Arrays (CGRAs)
- Architecture, programming, and dynamic mapping of dataflow graphs (DFGs) (DynMap, Fifer, OpenCGRA, CGRA-ME, LISA, TaskStream, DIPRS)
Specialized hardware accelerators for
- Sparse linear algebra kernels: SpMV, SpMM (inner/outer product, Gustavson), LU and Cholesky factorization, etc.
- Graph algorithms: SSSP, PageRank, BFS, WCC, etc.
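As a brief illustration of why these kernels stress conventional memory systems, the following plain-Python sketch of SpMV over CSR storage (a standard formulation, not tied to any particular system above) shows the indirect gather that makes the access pattern data-dependent:

```python
# Sparse matrix-vector multiply (SpMV) over CSR storage. The gather
# x[col_idx[k]] is an indirect, data-dependent load whose address is
# known only at run time, which defeats caches and simple prefetchers.
def spmv_csr(row_ptr, col_idx, vals, x):
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):                      # one output row at a time
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]      # irregular gather from x
    return y

# 3x3 example matrix:  [[2, 0, 1],
#                       [0, 3, 0],
#                       [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals    = [2.0, 1.0, 3.0, 4.0, 5.0]
print(spmv_csr(row_ptr, col_idx, vals, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]
```

Note that the rows are mutually independent, so a dataflow-style system can process them as soon as their operands arrive, with no barrier between rows of very different lengths.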
Organization
Kick-off meeting
- date: tentatively on 17.10.2025
- Topic overview - Dataflow-based computing systems: history and trend
Mid-term meetings
- date: By appointment with your supervisor
- one mandatory meeting
Presentation
- date: t.b.d.
- hand in your draft report one week in advance
Final report submission
- date: by the end of the semester
Evaluation
- Attendance at the kick-off meeting and at one mandatory mid-term meeting
- Presentation, with the draft report handed in one week in advance
- Final report