The ACM Student Research Competition (SRC) provides a platform for undergraduate and graduate students to showcase their original research before a panel of judges and conference attendees at top ACM-sponsored events. At SC25, our work advanced through two rounds of evaluation by an expert panel of judges and was ultimately selected as the First Place winner. First-place undergraduate and graduate winners from SRC events throughout the year progress to the SRC Grand Finals, where they compete against first-place winners from other ACM conferences across different areas of computer science.
Full Title: Unified Performance Modeling Stack for Distributed GPU Applications: Complementing Analytical Insights with Machine Learning
Abstract: Modern HPC applications increasingly use GPUs to solve larger problems with higher accuracy and speed. However, committing resources to these large-scale systems is often costly and time-consuming. Performance modeling addresses this by enabling developers to estimate runtime, analyze scalability, and identify resource bottlenecks in advance. In this work, we propose a unified software ecosystem for end-to-end performance modeling of distributed GPU applications. To this end, we combine analytical and machine-learning-based modeling methodologies and design a comprehensive software stack that integrates the components needed to implement this approach. We validate the proposed framework using two real-life applications and provide performance estimates for GPU kernels and inter-GPU MPI communication.
Link(s):
https://sc25.supercomputing.org/program/awards/
sc25.supercomputing.org/proceedings/posters/poster_pages/post143.html