e-ISSN:0976-5166
p-ISSN:2231-3850


INDIAN JOURNAL OF COMPUTER SCIENCE AND ENGINEERING


ABSTRACT

Title : Efficient Processing and recouping of data using combiners in Map Reduce framework
Authors : V HEMANTH KUMAR, M PURNA CHANDRA RAO, CH NARAYANARAO, P HARIBABU
Keywords : Map Reduce; Cluster; HDFS; Yarn; Combiners; Hadoop
Issue Date : Dec 2017-Jan 2018
Abstract :
Consider a data structure such as an array, whose size is declared either statically or dynamically. This is not a general solution for large text files, because it requires very large memory allocations for the data structure, and processing becomes increasingly time-consuming as the data size grows. Existing structures such as lists and heaps process large text files effectively only up to a limit set by the available RAM. For data volumes beyond that limit, a single node is not sufficient, and the data has to be spread across a cluster and stored on disk. Hadoop addresses these big data problems with the Map Reduce technique, in which processing is done in parallel. Map Reduce is a functional programming model with two functions, map and reduce, that performs distributed parallel processing. To make retrieval faster, we introduce combiners between the mapper and the reducer. A combiner function is applied to the mapper's output as it is generated. The combined data is then passed to the shuffle and sort stage, and from there to the reduce function, which produces the final output. Retrieving the data after Map Reduce processing takes longer without combiners than with them. We use computation time and data transfer time as the measures that support this claim. This paper presents an effective approach to processing big data using combiners, which are also known as map-side reducers or mini reducers.
Page(s) : 674-678
ISSN : 0976-5166
Source : Vol. 8, No.6
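
The pipeline described in the abstract (map, combine, shuffle and sort, reduce) can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: it is a word-count toy written in plain Python, not the authors' implementation and not Hadoop code, and the function names and the sample input `SPLITS` are hypothetical. It counts how many key-value pairs reach the shuffle stage with and without a combiner, which is a stand-in for the data transfer cost the abstract refers to.

```python
from collections import Counter, defaultdict


def map_phase(split):
    """Mapper: emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for line in split for word in line.split()]


def combine(pairs):
    """Combiner: sum counts per key locally, before anything is shuffled."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def shuffle_and_sort(mapper_outputs):
    """Group values by key across all mappers, with keys in sorted order."""
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Reducer: sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def run_job(splits, use_combiner):
    """Run the toy job; return (result, number of pairs sent to the shuffle)."""
    outputs = [map_phase(split) for split in splits]
    if use_combiner:
        outputs = [combine(pairs) for pairs in outputs]
    shuffled = sum(len(pairs) for pairs in outputs)
    return reduce_phase(shuffle_and_sort(outputs)), shuffled


# Hypothetical input: two splits, each handled by its own mapper.
SPLITS = [
    ["big data big cluster", "big data"],
    ["data cluster data"],
]

if __name__ == "__main__":
    for flag in (False, True):
        result, shuffled = run_job(SPLITS, use_combiner=flag)
        print(f"combiner={flag}: shuffled pairs={shuffled}, result={result}")
```

Both runs produce the same word counts, but the run with the combiner sends fewer pairs to the shuffle stage. This works because summation is associative and commutative, so partial sums computed on the map side do not change the final result; an operation without that property could not reuse the reducer logic as a combiner in this way.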