Abstract
To process large data sets efficiently in a distributed system, Hadoop MapReduce schedules tasks with data locality in mind. Data locality is exploited only to a limited extent, however, because the scheduler pursues it one node at a time without considering global optimality. In this paper, we propose a novel task scheduling algorithm that considers data locality globally. Through experiments, we show that our algorithm improves the performance of MapReduce in various situations.
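The difference between node-at-a-time and global locality decisions can be illustrated with a small sketch. It is not the algorithm proposed in the paper, which the abstract does not describe. As a stand-in for a global strategy it uses maximum bipartite matching between nodes and the tasks whose input blocks they hold. The function names, the `replicas` mapping (task to the set of nodes storing its input), and the one-task-per-node assumption are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def greedy_schedule(nodes: List[str], replicas: Dict[str, Set[str]]) -> Dict[str, str]:
    """Node-at-a-time scheduling: each node takes a local task if one remains,
    otherwise the first remaining task. Returns a node -> task mapping."""
    remaining = list(replicas)
    assignment: Dict[str, str] = {}
    for node in nodes:
        if not remaining:
            break
        local = [t for t in remaining if node in replicas[t]]
        task = local[0] if local else remaining[0]
        assignment[node] = task
        remaining.remove(task)
    return assignment


def global_schedule(nodes: List[str], replicas: Dict[str, Set[str]]) -> Dict[str, str]:
    """Illustrative global strategy (an assumption, not the paper's method):
    maximize the number of local assignments with augmenting-path bipartite
    matching, then hand leftover tasks to nodes that are still idle."""
    owner: Dict[str, str] = {}  # task -> node currently matched to it

    def try_assign(node: str, seen: Set[str]) -> bool:
        for task in replicas:
            if node in replicas[task] and task not in seen:
                seen.add(task)
                # Take a free task, or move its current owner to another local task.
                if task not in owner or try_assign(owner[task], seen):
                    owner[task] = node
                    return True
        return False

    for node in nodes:
        try_assign(node, set())

    assignment = {node: task for task, node in owner.items()}
    leftovers = [t for t in replicas if t not in owner]
    for node in nodes:
        if node not in assignment and leftovers:
            assignment[node] = leftovers.pop(0)
    return assignment


def local_count(assignment: Dict[str, str], replicas: Dict[str, Set[str]]) -> int:
    """Number of tasks that run on a node holding their input data."""
    return sum(1 for node, task in assignment.items() if node in replicas[task])


# Hypothetical cluster: t1 is stored on both nodes, t2 only on n1.
nodes = ["n1", "n2"]
replicas = {"t1": {"n1", "n2"}, "t2": {"n1"}}

greedy = greedy_schedule(nodes, replicas)
matched = global_schedule(nodes, replicas)
print(local_count(greedy, replicas), local_count(matched, replicas))
```

In this toy cluster the node-at-a-time scheduler gives `t1` to `n1`, which leaves `n2` with the non-local `t2`. The matching-based variant swaps the two so that both tasks run where their data is stored.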
Original language | English |
---|---|
Pages (from-to) | 2377-2380 |
Number of pages | 4 |
Journal | IEICE Transactions on Information and Systems |
Volume | E99D |
Issue number | 9 |
Publication status | Published - Sept 2016 |
Keywords
- Data locality
- MapReduce
- Task scheduling algorithm
ASJC Scopus subject areas
- Software
- Hardware and Architecture
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering
- Artificial Intelligence