Hardware-based job queue management for manycore architectures and OpenMP environments

Junghee Lee, Chrysostomos Nicopoulos, Yongjae Lee, Hyung Gyu Lee, Jongman Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

The seemingly interminable shrinking of technology feature sizes well into the nano-scale regime has afforded computer architects an abundance of computational resources on a single chip. The Chip Multi-Processor (CMP) paradigm is now seen as the de facto architecture for years to come. However, to exploit the increasing number of on-chip processing cores, it is imperative to achieve and maintain efficient utilization of the resources at run time. Uneven and skewed distribution of workloads misuses the CMP resources and may even lead to undesired effects such as traffic and temperature hotspots. While existing techniques rely mostly on software for load balancing and exploit hardware mainly for synchronization, we demonstrate that there are wider opportunities for hardware support of load balancing in CMP systems. Based on this observation, this paper proposes IsoNet, a conflict-free dynamic load distribution engine that exploits hardware aggressively to reinforce massively parallel computation in manycore settings. Moreover, the proposed architecture provides extensive fault tolerance against both CPU faults and intra-IsoNet faults. The hardware takes charge of both (1) the management of the list of jobs to be executed, and (2) the transfer of jobs between processing elements to maintain load balance. Experimental results show that, unlike the existing popular techniques of blocking and job stealing, IsoNet is scalable with as many as 1024 processing cores.

Original language: English
Title of host publication: Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
Pages: 407-418
Number of pages: 12
Publication status: Published - 2011
Externally published: Yes
Event: 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011 - Anchorage, AK, United States
Duration: 2011 May 16 to 2011 May 20

Keywords

  • OpenMP
  • fault-tolerant
  • job queue
  • manycore

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
