TY - GEN
T1 - Quantifying differences between OpenMP and MPI using a large-scale application suite
AU - Armstrong, Brian
AU - Kim, Seon Wook
AU - Eigenmann, Rudolf
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2000.
PY - 2000
Y1 - 2000
N2 - In this paper we provide quantitative information about the performance differences between the OpenMP and the MPI version of a large-scale application benchmark suite, SPECseis. We have gathered extensive performance data using hardware counters on a 4-processor Sun Enterprise system. For the presentation of this information we use a Speedup Component Model, which is able to precisely show the impact of various overheads on the program speedup. We have found that overall, the performance figures of both program versions match closely. However, our analysis also shows interesting differences in individual program phases and in overhead categories incurred. Our work gives initial answers to a largely unanswered research question: what are the sources of inefficiencies of OpenMP programs relative to other programming paradigms on large, realistic applications. Our results indicate that the OpenMP and MPI models are basically performance-equivalent on shared-memory architectures. However, we also found interesting differences in behavioral details, such as the number of instructions executed, and the incurred memory latencies and processor stalls.
AB - In this paper we provide quantitative information about the performance differences between the OpenMP and the MPI version of a large-scale application benchmark suite, SPECseis. We have gathered extensive performance data using hardware counters on a 4-processor Sun Enterprise system. For the presentation of this information we use a Speedup Component Model, which is able to precisely show the impact of various overheads on the program speedup. We have found that overall, the performance figures of both program versions match closely. However, our analysis also shows interesting differences in individual program phases and in overhead categories incurred. Our work gives initial answers to a largely unanswered research question: what are the sources of inefficiencies of OpenMP programs relative to other programming paradigms on large, realistic applications. Our results indicate that the OpenMP and MPI models are basically performance-equivalent on shared-memory architectures. However, we also found interesting differences in behavioral details, such as the number of instructions executed, and the incurred memory latencies and processor stalls.
UR - http://www.scopus.com/inward/record.url?scp=77957065596&partnerID=8YFLogxK
U2 - 10.1007/3-540-39999-2_45
DO - 10.1007/3-540-39999-2_45
M3 - Conference contribution
AN - SCOPUS:77957065596
SN - 9783540411284
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 482
EP - 493
BT - High Performance Computing - 3rd International Symposium, ISHPC 2000, Proceedings
A2 - Valero, Mateo
A2 - Joe, Kazuki
A2 - Kitsuregawa, Masaru
A2 - Tanaka, Hidehiko
PB - Springer Verlag
T2 - 3rd International Symposium on High Performance Computing, ISHPC 2000
Y2 - 16 October 2000 through 18 October 2000
ER -