Semantics driven dynamic partial-order reduction of MPI-based parallel programs
Most distributed parallel programs in the high performance computing (HPC) arena are written using the MPI library. There is growing interest in using model checking for debugging these MPI programs. In this context, partial-order reduction has considerable potential for containing state explosion, given the distributed memory nature of MPI programs. This potential is largely unmet. In this paper, we first define the formal semantics for a non-trivial subset of MPI. We then prove independence theorems based on the formal semantics, paving the way to a semantically clear and general partial-order reduction approach for MPI. Our work describes, for the first time, the exact dependencies between MPI non-blocking send operations and their tests for completion, namely wait and test. We also offer a cleaner solution than previous work for MPI wildcard receives, the proper handling of which requires knowledge of the future course of computations. We show that Flanagan and Godefroid's dynamic partial-order reduction algorithm offers a natural way to handle the need for future information. Our initial experimental results are encouraging.