Stepan Vanecek

Doctoral Candidate & Scientific Employee
Lehrstuhl für Rechnerarchitektur & Parallele Systeme (Prof. Schulz)
Technische Universität München
85748 Garching b. München, Boltzmannstr. 3/I
Raum 01.04.039
Tel: +49 (89) 289 - 17683
Email: stepan.vanecek (at) tum.de
Research interests
- Modern HPC architectures and topologies
- Topological details in HPC and their effect on performance
- Performance modelling on heterogeneous memory systems
- Benchmarking fine-grained system characteristics
- Performance analysis of parallel applications
- Memory and performance optimizations on CPUs and GPUs
Publications
- Stepan Vanecek, Manuel Walter Mußbacher, Dominik Größler, Urvij Saroliya and Martin Schulz, MT4G: A Tool for Reliable Auto-Discovery of NVIDIA and AMD GPU Compute and Memory Topologies (SC ProTools 2025, to appear)
- Stepan Vanecek, Managing Heterogeneous Topologies and Understanding Their Impact on Performance (SC'25 Doctoral Showcase Poster, to appear)
- Durganshu Mishra, Stepan Vanecek, Jorge Echavarria, Xiaolong Deng, Burak Mete, Laura Schulz and Martin Schulz, Towards a Unified Architectural Representation in HPCQC: Extending sys-sage for Quantum Technologies (ISC'25, Hans Meuer Award winner) paper
- Stepan Vanecek and Martin Schulz, sys-sage: A Unified Representation of Dynamic Topologies & Attributes on HPC Systems (ICS'24) paper slides
- Soumya Sen, Stepan Vanecek and Martin Schulz, GPUscout: Locating Data Movement-related Bottlenecks on GPUs (ProTools 2023) paper slides
- Stepan Vanecek and Martin Schulz, sys-sage: A Fresh View on Dynamic Topologies & Attributes of HPC Systems (SC'23 Research Poster) poster extended abstract
Tools
- sys-sage: Library for capturing and manipulating the hardware topology of compute systems ( https://github.com/caps-tum/sys-sage )
- Mitos: Memory access sample collection tool ( https://github.com/caps-tum/mitos )
- MemAxes: Memory access visualisation tool ( https://github.com/caps-tum/MemAxes )
- GPUscout: Tool for memory-related bottleneck analysis on GPUs ( https://github.com/caps-tum/GPUscout )
- MT4G: Tool for reverse-engineering GPU memory topologies ( https://github.com/caps-tum/mt4g )
Projects
- DEEP-SEA ( www.deep-projects.eu )
- Plasma-PEPSC ( https://plasma-pepsc.eu )
- SEANERGYS ( https://www.ce.cit.tum.de/caps/laufende-projekte/seanergys-eurohpc/ )
Teaching & Theses
Available Student Theses:
This list may be out of date (last updated 09/2025). Please refer to the Open Theses page of CAPS or email me directly.
- MA: Exploring the Effect of System Configuration on Performance Variability in HPC systems (co-advised with LRZ)
- BA/GR: Extending Mitos PEBS sample collection to Icelake architecture
- BA/MA/GR: Reverse-engineering GPU architectures and capabilities (AMD CDNA3)
- MA/GR: Reverse-engineering GPU compute capabilities with MT4G
- MA/GR: Extending MPI-over-CXL model with collectives
Please reach out to me to ask about the currently offered topics; there are usually more topics in the pipeline than are published. I am looking for motivated students interested in writing a thesis or Guided Research project with me. Many of the topics can be wrapped up by co-writing a conference or workshop paper.
Student Theses (completed):
- [BA, 2025] Jonathan Wieser: Design, Implementation, and Test of a Python API for sys-sage
- [MA, 2025] Tobias Stuckenberger: Visual Presentation of GPUscout GPU Bottleneck Analysis
- [MA, 2025] Benedikt Deike: Extending GPUscout analysis to AMD GPUs
- [MA, 2024] Durganshu Mishra: Representing Quantum System Information and Topologies in sys-sage
- [BA, 2024] Finn Romaneessen: Transferring Hardware-related Information in sys-sage Across Processes and HPC Nodes
- [MA, 2024] Philip Sändig: Exploring Performance Variability in CPUs with sys-sage
- [BA, 2024] Jazmin Dojcsak: Integration of PAPI Metrics Collection Functionality Into the sys-sage Library
- [BA, 2023] Mohammad Khadem (co-advised with Eishi Arima)
- [BA, 2023] James Sutanto: Ranking Nodes by Energy Efficiency Using sys-sage (co-advised with Eishi Arima)
- [BA, 2023] Matthias Süß: Exploration of Performance Variations of GPU Cores
- [BA, 2023] Felix Sandmair: Collection of HPC Network Topology Information in sys-sage
- [BA, 2023] Daniel Schenk: Extension of MemAxes sample visualization to process AMD IBS samples
- [BA, 2023] Maksym Azatian: Capturing Memory Topology Across Different GPU Architectures
- [MA, 2023] Maximilian Geitner: An Analysis and Implementation of AMD IBS Sample Collection for Performance Analysis Purposes
- [MA, 2023] Soumya Sen: Discovering data movement-related bottlenecks on NVidia GPUs
- [BA, 2023] Jim Eckerlein: An Environment for Continuous Integration and Software Testing for sys-sage
- [MA, 2023] Ke Wang: Reverse Engineering Hardware Prefetching Mechanisms for Use in sys-sage (co-advised with Eishi Arima)
- [BA, 2022] Mark Pirkowski: A visual presentation of an HPC system state as stored in sys-sage
- [BA, 2022] Dominik Grössler: Capturing the memory topology of GPUs
- [BA, 2022] Ruilin Qi: Analysis of data movements over the PCIe bus in heterogeneous systems
Lectures/Seminars:
- [SS25] Seminar: Performance Modeling for Parallel Applications and Emerging Architectures
- [SS23] Seminar: History of Computer Architecture (Geschichte der Rechnerarchitektur)
- [SS22] Seminar: History of Computer Architecture (Geschichte der Rechnerarchitektur)