2018
Online
DOI: 10.18154/RWTH-2017-10493
URL: https://publications.rwth-aachen.de/record/710311/files/data.zip
Institutions
Projects
Abstract
Accelerator devices are increasingly used to build large supercomputers, and current installations usually include more than one accelerator per system node. To keep all devices busy, kernels have to be executed concurrently, which can be achieved via asynchronous kernel launches. Our work compares the performance of an implementation of the Conjugate Gradient method with CUDA, OpenCL, and OpenACC on NVIDIA Pascal GPUs. Furthermore, it takes a look at Intel Xeon Phi coprocessors when programmed with OpenCL and OpenMP. In doing so, it tries to answer the question of whether the higher abstraction level of directive-based models is inferior to lower-level paradigms in terms of performance. This archive contains the modifications to liboffload, all binaries and libraries including their respective commit IDs, and the raw data of our measurements.
OpenAccess:
ZIP
Document type
Dataset
Format
online
Language
English
Internal identifiers
RWTH-2017-10493
Record ID: 710311
Participating countries
Germany, UK
Contribution to a book/Contribution to a conference proceedings
Evaluation of Asynchronous Offloading Capabilities of Accelerator Programming Models for Multiple Devices
Accelerator Programming Using Directives : 4th International Workshop, WACCPD 2017, Held in Conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017, Denver, CO, USA, November 13, 2017, Proceedings / edited by Sunita Chandrasekaran, Guido Juckeland
4th International Workshop on Accelerator Programming Using Directives, WACCPD 2017, Denver, CO, USA, 13 Nov 2017
Cham : Springer International Publishing, Lecture notes in computer science 10732, 160-182 (2018) [10.1007/978-3-319-74896-2_9]