Parallel merge sort is useful for sorting a large quantity of data progressively. The merge sort should be parallelized carefully since the conventional algorithm has poor performance due to the successive reduction of the number of participating processors by half, and down to one in the last merging stage. The proposed load-balanced merge sort utilizes all processors throughout the computation. It evenly distributes data to all processors in each stage. Thus every processor is forced to work in all phases. Significant performance enhancement has been achieved up to a speedup of (P-1)/log P where P is the number of processors. Experimental results demonstrate a speedup of 9. 6 (upper bound of 10.7) on 32-processor Cray T3E when sorting 4M 32-bit integers, and a speed up of 2.3 (upper bound of 2.8) on an 8-node PC cluster.
Bibliographical noteFunding Information:
This research was supported by KOSEF Grant (No. R01-2001-00341).
- Load balancing
- Merge sort
- Parallel algorithm
ASJC Scopus subject areas
- Theoretical Computer Science
- Information Systems