Efficient Programming of Multicore Processors and Supercomputers (IN2106)
Prof. Dr. Michael Gerndt, Prof. Dr. Martin Schulz
Dates: Monday, 12:30 - 14:00, in-person meetings
Type: Master lab course, 6P
Registration: Registration is through the matching system
All manufacturers are currently offering multicore microprocessors, which are used in every PC and laptop. To speed up applications, programmers now face the problem of parallelization. Only programs that can efficiently use multiple cores will benefit from the current trend of increasing the number of cores every year. Therefore, computer experts will need deep knowledge of parallel programming interfaces, performance analysis, and optimization tools.
Parallel programming has been done on supercomputers for many decades. Supercomputers (www.top500.org) are used in science and research but also in commercial computing. Many of today's services on the web are also based on big parallel clusters, e.g., at Amazon and Google. Based on hundreds of thousands of cores, today's supercomputers are able to execute applications that would have taken months of execution time only a few years ago.
High performance architectures are not only special purpose parallel systems like the SuperMUC at LRZ, but also clusters of desktop workstations that are connected by a fast interconnection network, e.g., Infiniband. To obtain efficient programs, the design decisions and the actual coding of parallel applications have to be based on application properties and on the characteristics of the target architecture.
Multicore processors are programmed either with explicit threads or, more conveniently, with OpenMP. OpenMP is a standard programming interface for shared memory systems based on C/C++ or Fortran. The cores communicate via a shared memory. This shared memory concept is also offered by small scale HPC systems. Large systems are typically programmed in MPI (Message Passing Interface). This interface offers more control over the communication among the processors of the system through explicit message passing. Both interfaces, OpenMP and MPI, are standardized and are thus available on all parallel systems. For analyzing the performance of parallel applications, several graphical tools are available. These tools help in program tuning since they point to inefficient program regions.
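To give a first impression of the OpenMP style described above, the following is a minimal sketch and not course material. The function name `sum_of_squares` is hypothetical. A single directive asks the compiler to distribute the loop iterations over the available threads, and the `reduction` clause combines the per-thread partial sums through shared memory.

```c
#include <assert.h>

/* Hypothetical example: computes 1^2 + 2^2 + ... + n^2.
 * With OpenMP enabled, the iterations are divided among the threads and
 * each thread accumulates a private partial sum that is added to `sum`
 * when the loop ends. Without OpenMP the pragma is ignored and the loop
 * simply runs sequentially with the same result. */
double sum_of_squares(long n)
{
    assert(n >= 0);
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (long i = 1; i <= n; ++i) {
        sum += (double)i * (double)i;
    }
    return sum;
}
```

With GCC or Clang the directive is typically enabled with the `-fopenmp` compiler flag. In MPI the same computation would instead require each process to compute a partial sum and exchange it through explicit messages.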
Within this lab course you will learn about OpenMP and MPI programming on multicore processors and supercomputers. We will use the SuperMUC at LRZ, one of the fastest computers in Germany, which has a peak performance of 3 petaflops.
The first project you will work on is the parallelization of a heat simulation in a solid body. The picture shows the resulting heat distribution where a single heat source is placed in the lower right corner. This application will first be optimized for sequential execution, then be parallelized with OpenMP and finally with MPI.
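The course page does not specify the numerical method used in the heat simulation. As an illustration only, the sketch below assumes a Jacobi relaxation on a square two-dimensional grid, which is a common choice for this kind of problem. The function name `jacobi_sweep`, the row-major grid layout, and the fixed boundary cells are all assumptions made for this example.

```c
#include <assert.h>

/* Hypothetical sketch: one Jacobi sweep over an n x n grid stored
 * row-major in `u`. Every interior cell of `unew` becomes the average of
 * its four neighbours in `u`. Boundary cells are not written, so the
 * caller has to provide them in both arrays. The return value is the
 * largest absolute change of any interior cell, which can serve as a
 * convergence criterion. */
double jacobi_sweep(const double *u, double *unew, int n)
{
    assert(n >= 3);
    double max_diff = 0.0;

    /* Rows are independent, so they can be shared among the threads.
     * The max reduction needs OpenMP 3.1 or later. */
    #pragma omp parallel for reduction(max:max_diff)
    for (int i = 1; i < n - 1; ++i) {
        for (int j = 1; j < n - 1; ++j) {
            double v = 0.25 * (u[(i - 1) * n + j] + u[(i + 1) * n + j] +
                               u[i * n + (j - 1)] + u[i * n + (j + 1)]);
            double d = v - u[i * n + j];
            if (d < 0.0) {
                d = -d;
            }
            if (d > max_diff) {
                max_diff = d;
            }
            unew[i * n + j] = v;
        }
    }
    return max_diff;
}
```

A caller would typically swap the two arrays after each sweep and repeat until the returned change falls below a chosen tolerance. Reading from one array and writing to the other is what makes the rows independent and the loop safe to parallelize.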
The second project is the parallelization of the game Abalone. You will parallelize the given sequential search algorithm, a so-called Alpha/Beta search strategy. This strategy determines the best next move for a player. To be able to compare the implementations of the different groups, we will run a competition between the groups.
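The sequential Abalone code provided in the course is not shown here. The following is only a minimal sketch of alpha-beta pruning on a small, explicitly stored game tree, to illustrate the idea behind the search strategy. The `Node` type and the function `alphabeta` are hypothetical and unrelated to the actual course code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical game tree node. `value` is only meaningful for leaves. */
typedef struct Node {
    int value;
    size_t num_children;
    const struct Node *children;
} Node;

/* Minimax search with alpha-beta pruning. `alpha` is the best score the
 * maximizing player can already guarantee, `beta` the best score the
 * minimizing player can already guarantee. Once alpha >= beta, the
 * remaining children cannot change the result and are skipped. */
int alphabeta(const Node *node, int alpha, int beta, int maximizing)
{
    assert(node != NULL);
    if (node->num_children == 0) {
        return node->value;
    }

    int best = maximizing ? INT_MIN : INT_MAX;
    for (size_t i = 0; i < node->num_children; ++i) {
        int score = alphabeta(&node->children[i], alpha, beta, !maximizing);
        if (maximizing) {
            if (score > best) best = score;
            if (best > alpha) alpha = best;
        } else {
            if (score < best) best = score;
            if (best < beta) beta = best;
        }
        if (alpha >= beta) {
            break; /* prune the remaining siblings */
        }
    }
    return best;
}
```

The search is started at the root with `alphabeta(&root, INT_MIN, INT_MAX, 1)`. The pruning step is what makes parallelization non-trivial: subtrees searched at the same time cannot benefit from the bounds that a sequential search would already have tightened.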
Both applications will be programmed in C/C++.
We will have weekly meetings where the projects are introduced, presentations are given, and results are presented by the students. The projects will be carried out in groups of two students.
More information: Michael Gerndt