Distribution-insensitive parallel external sorting on PC clusters

Minsoo Jeon, Dongseung Kim

Research output: Chapter in Book/Report/Conference proceedingChapter

Abstract

There have been many parallel external sorting algorithms reported such as NOW-Sort, SPsort, and hill sort, etc. They are for sorting large-scale data stored in the disk, but they differ in the speed, throughput, and cost-effectiveness. Mostly they deal with data that are uniformly distributed in their value range. Few research results have been yet reported for parallel external sort for data with arbitrary distribution. In this paper, we present two distribution-insensitive parallel external sorting algorithms that use sampling technique and histogram counts to achieve even distribution of data among processors, which eventually contribute to achieve superb performance. Experimental results on a cluster of Linux workstations show up to 63% reduction in the execution time compared to previous NOW-sort.

Original languageEnglish
Title of host publicationLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
EditorsAlex Veidenbaum, Kazuki Joe, Hideharu Amano, Hideo Aiso
PublisherSpringer Verlag
Pages202-213
Number of pages12
ISBN (Print)3540203591, 9783540397076
DOIs
Publication statusPublished - 2003

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume2858
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint

Dive into the research topics of 'Distribution-insensitive parallel external sorting on PC clusters'. Together they form a unique fingerprint.

Cite this