A data locality based scheduler to enhance MapReduce performance in heterogeneous environments

dc.contributor.author Naik, Nenavath Srinivas
dc.contributor.author Negi, Atul
dc.contributor.author Tapas, Tapas Bapu
dc.contributor.author Anitha, R.
dc.date.accessioned 2022-03-27T05:52:44Z
dc.date.available 2022-03-27T05:52:44Z
dc.date.issued 2019-01-01
dc.description.abstract MapReduce is an essential framework, proposed in recent times, for the distributed storage and parallel processing of large-scale data-intensive jobs. The default Hadoop scheduler assumes a homogeneous environment. This assumption does not always hold in practice and limits the performance of MapReduce. Data locality essentially means moving computation closer to the input data for faster access. Fundamentally, MapReduce does not always account for heterogeneity from a data locality perspective. Improving data locality in the MapReduce framework is important for improving the performance of large-scale Hadoop clusters. This paper proposes a novel data locality based scheduler that allocates input data blocks to nodes based on their processing capacity. It also schedules map and reduce tasks to nodes based on their computing ability in the heterogeneous Hadoop cluster. We evaluate the proposed scheduler using different workloads from the Hi-Bench benchmark suite. The experimental results show that the proposed scheduler enhances MapReduce performance in heterogeneous environments: it minimizes job execution time and improves data locality for different parameters compared with the Hadoop default scheduler, the Matchmaking scheduler, and the Delay scheduler.
dc.identifier.citation Future Generation Computer Systems. v.90
dc.identifier.issn 0167739X
dc.identifier.uri 10.1016/j.future.2018.07.043
dc.identifier.uri https://www.sciencedirect.com/science/article/abs/pii/S0167739X18308379
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/8559
dc.subject Data locality
dc.subject Heterogeneous environments
dc.subject MapReduce
dc.subject Task scheduler
dc.title A data locality based scheduler to enhance MapReduce performance in heterogeneous environments
dc.type Journal. Article
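
The abstract describes allocating input data blocks to nodes according to their processing capacity. The record does not give the paper's actual algorithm, so the following is only a minimal sketch of one plausible reading: blocks are split in proportion to a per-node capacity score, with largest-remainder rounding so the counts sum to the total. The function name `allocate_blocks`, the capacity scores, and the node names are all hypothetical and are not taken from the paper or from Hadoop.

```python
from typing import Dict


def allocate_blocks(num_blocks: int, capacity: Dict[str, float]) -> Dict[str, int]:
    """Split num_blocks across nodes in proportion to their capacity scores.

    Illustrative only: proportional split with largest-remainder rounding,
    not the scheduler described in the paper.
    """
    if num_blocks < 0:
        raise ValueError("num_blocks must be non-negative")
    if any(c < 0 for c in capacity.values()):
        raise ValueError("capacity scores must be non-negative")
    total = sum(capacity.values())
    if total <= 0:
        raise ValueError("at least one node needs a positive capacity")

    # Ideal (fractional) share of blocks for each node.
    exact = {node: num_blocks * c / total for node, c in capacity.items()}
    # Start from the integer part of each share.
    alloc = {node: int(share) for node, share in exact.items()}
    # Hand the blocks lost to truncation to the largest fractional remainders.
    leftover = num_blocks - sum(alloc.values())
    by_remainder = sorted(capacity, key=lambda n: exact[n] - alloc[n], reverse=True)
    for node in by_remainder[:leftover]:
        alloc[node] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical cluster: one node twice as capable as the other.
    print(allocate_blocks(12, {"fast": 2.0, "slow": 1.0}))
```

In this sketch a node with twice the capacity score receives roughly twice as many blocks, which is the intuition behind capacity-aware placement; how capacity is actually measured is left open here.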