TY - GEN
T1 - Load-balanced parallel merge sort on distributed memory parallel computers
AU - Jeon, Minsoo
AU - Kim, Dongseung
N1 - Publisher Copyright: © 2002 IEEE.
PY - 2002
Y1 - 2002
N2 - Sorting can be sped up on parallel computers by dividing the data and processing the parts in parallel. Merge sort can be parallelized; however, the conventional algorithm implemented on distributed memory computers performs poorly because the number of active (non-idling) processors is halved at each successive stage, down to one in the last merging stage. This paper presents a load-balanced parallel merge sort algorithm in which all processors participate in merging throughout the computation. Data are evenly distributed to all processors, and every processor is forced to work in the merging phase. A significant performance improvement has been achieved. Our analysis shows the upper bound of the speedup of the merge time to be (P - 1)/log P. We achieved a speedup of 9.6 (the upper bound is 10.5) on a 32-processor Cray T3E when sorting 4M 32-bit integers. The same idea can be applied to parallelize other sorting algorithms.
AB - Sorting can be sped up on parallel computers by dividing the data and processing the parts in parallel. Merge sort can be parallelized; however, the conventional algorithm implemented on distributed memory computers performs poorly because the number of active (non-idling) processors is halved at each successive stage, down to one in the last merging stage. This paper presents a load-balanced parallel merge sort algorithm in which all processors participate in merging throughout the computation. Data are evenly distributed to all processors, and every processor is forced to work in the merging phase. A significant performance improvement has been achieved. Our analysis shows the upper bound of the speedup of the merge time to be (P - 1)/log P. We achieved a speedup of 9.6 (the upper bound is 10.5) on a 32-processor Cray T3E when sorting 4M 32-bit integers. The same idea can be applied to parallelize other sorting algorithms.
UR - http://www.scopus.com/inward/record.url?scp=84966648411&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2002.1016670
DO - 10.1109/IPDPS.2002.1016670
M3 - Conference contribution
AN - SCOPUS:84966648411
T3 - Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2002
SP - 248
BT - Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2002
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th International Parallel and Distributed Processing Symposium, IPDPS 2002
Y2 - 15 April 2002 through 19 April 2002
ER -