Performance impact of JobTracker failure in Hadoop

Young Pil Kim, Cheol Ho Hong, Chuck Yoo

    Research output: Contribution to journal › Article › peer-review

    6 Citations (Scopus)

    Abstract

    In this paper, we analyze the performance impact of JobTracker failure in Hadoop. A JobTracker failure is a serious problem that affects overall job processing performance. We describe the cause of the failure and how the system behaves when job processing fails in Hadoop. On the basis of this analysis, we build a job completion time model that reflects failure effects. Our model is based on a stochastic process with a node crash probability. Using the model, we simulate the performance impact with credible failure data from the computer failure data repository, available from USENIX, which has been collected over the past 9 years. The results show that the performance impact is severe: the job completion time typically increases about four times and, in the worst case, up to 68 times.

    Original language: English
    Pages (from-to): 1265-1281
    Number of pages: 17
    Journal: International Journal of Communication Systems
    Volume: 28
    Issue number: 7
    Publication status: Published - 2015 May 10

    Bibliographical note

    Publisher Copyright:
    Copyright © 2014 John Wiley & Sons, Ltd.

    Keywords

    • Hadoop
    • JobTracker
    • failure analysis
    • large-scale data processing

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Electrical and Electronic Engineering
