Evaluating static analysis techniques to accelerate data race detection for MPI RMA

Oraji, Yussur Mustafa; Müller, Matthias S.; Schwitanski, Simon; Noll, Thomas
doi:10.18154/RWTH-2023-05106
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@MASTERSTHESIS{Oraji:958085,
      author       = {Oraji, Yussur Mustafa},
      othercontributors = {Müller, Matthias S. and Noll, Thomas and Schwitanski,
                          Simon},
      title        = {{E}valuating static analysis techniques to accelerate data
                      race detection for {MPI} {RMA}},
      school       = {RWTH Aachen University},
      type         = {Bachelor's thesis},
      address      = {Aachen},
      publisher    = {RWTH Aachen University},
      reportid     = {RWTH-2023-05106},
      pages        = {1 online resource : diagrams},
      year         = {2023},
      note         = {Published on the publication server of RWTH Aachen
                      University; Bachelor's thesis, RWTH Aachen University, 2023},
      abstract     = {Most high-performance computing systems utilize a
                      distributed memory system, where a message-passing
                      specification such as MPI is required for data communication
                      across processes. In particular, MPI allows for one-sided
                      communication, where message passing requires only one
                      process to start the communication while the other is not
                      required to perform a corresponding MPI call. However, both
                      standard MPI and MPI RMA are prone to data races, which
                      require significant effort to find and fix. While MPI RMA
                      data race
                      detectors exist, they often significantly slow down program
                      execution. This is especially the case for dynamic analysis
                      tools which perform race detection at runtime. MUST-RMA, one
                      such tool, can cause a slowdown of up to a factor of 16. In
                      contrast, static tools can run cheaply at compile time with
                      minimal overhead. The combination of both dynamic and static
                      analysis may therefore prove useful: This thesis presents
                      three static optimization approaches for MPI RMA data race
                      detection based on MUST-RMA. The first approach generates a
                      whitelist of relevant values and instructions for the
                      dynamic tool to inspect, while all others may be ignored.
                      Though similar to the approach used in MC-Checker, the
                      implementation is more generally applicable and extensible,
                      for example, to additional programming languages such as
                      Fortran. This whitelist may also be extended with additional
                      information, specifically the type that each stored value
                      corresponds to. By checking whether the code performs only
                      remote reads, only remote writes, or both, the whitelist
                      can be filtered further for a potential speed gain. Finally,
                      the race detection itself may simply be delayed until it is
                      required, namely when the MPI RMA window is created, and
                      turned off again when this window is destroyed. These
                      optimization approaches were built on top of the LLVM
                      framework as compile-time passes, with the
                      implementation general enough to support both C and C++ at
                      this time. All optimizations used support interprocedural
                      analysis, and, through the use of a modified compilation
                      pipeline, may also be used across translation units. While
                      introducing some false negatives, applying these
                      optimizations provides a 2x speedup compared to normal MUST
                      execution in most cases, with best case scenarios reaching a
                      speedup of 4x.},
      cin          = {123010 / 120000},
      ddc          = {004},
      cid          = {$I:(DE-82)123010_20140620$ / $I:(DE-82)120000_20140620$},
      typ          = {PUB:(DE-HGF)2},
      doi          = {10.18154/RWTH-2023-05106},
      url          = {https://publications.rwth-aachen.de/record/958085},
}