TY - GEN
T1 - Parallelizing merge sort onto distributed memory parallel computers
AU - Jeon, Minsoo
AU - Kim, Dongseung
PY - 2002
Y1 - 2002
N2 - Merge sort is useful for sorting large amounts of data progressively, especially when the data can be partitioned and easily collected onto a few processors. Merge sort can be parallelized; however, conventional algorithms on distributed memory computers perform poorly because the number of participating processors is successively reduced by half, down to one in the last merging stage. This paper presents a load-balanced parallel merge sort in which all processors do the merging throughout the computation. Data are evenly distributed to all processors, and every processor is forced to work in all merging phases. An analysis shows the upper bound of the speedup of the merge time to be (P - 1)/log P, where P is the number of processors. We have reached a speedup of 8.2 (upper bound is 10.5) on a 32-processor Cray T3E in sorting 4M 32-bit integers.
AB - Merge sort is useful for sorting large amounts of data progressively, especially when the data can be partitioned and easily collected onto a few processors. Merge sort can be parallelized; however, conventional algorithms on distributed memory computers perform poorly because the number of participating processors is successively reduced by half, down to one in the last merging stage. This paper presents a load-balanced parallel merge sort in which all processors do the merging throughout the computation. Data are evenly distributed to all processors, and every processor is forced to work in all merging phases. An analysis shows the upper bound of the speedup of the merge time to be (P - 1)/log P, where P is the number of processors. We have reached a speedup of 8.2 (upper bound is 10.5) on a 32-processor Cray T3E in sorting 4M 32-bit integers.
UR - http://www.scopus.com/inward/record.url?scp=68749103290&partnerID=8YFLogxK
U2 - 10.1007/3-540-47847-7_5
DO - 10.1007/3-540-47847-7_5
M3 - Conference contribution
AN - SCOPUS:68749103290
SN - 354043674X
SN - 9783540436744
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 25
EP - 34
BT - High Performance Computing - 4th International Symposium, ISHPC 2002, Proceedings
T2 - 4th International Symposium on High Performance Computing, ISHPC 2002
Y2 - 15 May 2002 through 17 May 2002
ER -