ABSTRACT
Title: Efficient Processing and recouping of data using combiners in Map Reduce framework
Authors: V HEMANTH KUMAR, M PURNA CHANDRA RAO, CH NARAYANARAO, P HARIBABU
Keywords: Map Reduce; Cluster; HDFS; Yarn; Combiners; Hadoop
Issue Date: Dec 2017-Jan 2018
Abstract: Consider a data structure such as an array, whose size is declared either statically or dynamically. This is not a general solution for large text files, because it requires huge memory allocations for the data structure, and processing becomes increasingly time-consuming as the data size grows. Existing solutions such as lists and heaps process large text files effectively only up to a certain limit, which depends on the available RAM. For these volumes of data, a single-node solution does not work; the data has to be spread across a cluster and stored on disk. Hadoop addresses these big data problems with the MapReduce technique, in which processing is done in parallel. MapReduce is a functional programming model with two functions, map and reduce, that performs distributed parallel processing. To make retrieval faster, we introduce combiners between the mapper and the reducer. The combiner function is applied to the mapper's output as it is generated. The combined data is then sent to the shuffle-and-sort phase, and from there to the reduce function to obtain the final output. Retrieving the data after MapReduce processing takes longer without combiners than with them. We use computation time and data transfer time as the measures that support this claim. This paper presents an effective approach for processing big data using combiners, which are also known as map-side reducers or mini reducers.
Page(s): 674-678
ISSN: 0976-5166
Source: Vol. 8, No. 6
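
The pipeline described in the abstract (map, combine, shuffle and sort, reduce) can be illustrated with a small in-memory word count. This is only a sketch under stated assumptions: it is not the authors' implementation and does not use the Hadoop API. The function names (`map_phase`, `combine`, `shuffle_and_sort`, `reduce_phase`, `run_job`) and the sample input are hypothetical, and the number of records handed to the shuffle stands in for data transfer cost. A combiner is safe here because summing counts is associative and commutative.

```python
from collections import Counter, defaultdict

def map_phase(split):
    """Mapper: emit (word, 1) for every word in one input split."""
    return [(word, 1) for line in split for word in line.split()]

def combine(pairs):
    """Combiner (hypothetical helper): sum counts per key locally,
    on the mapper side, before anything is sent to the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def shuffle_and_sort(mapper_outputs):
    """Group values by key across all mappers and sort by key."""
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return sorted(grouped.items())

def reduce_phase(grouped):
    """Reducer: sum all values received for each key."""
    return {key: sum(values) for key, values in grouped}

def run_job(splits, use_combiner):
    """Run the simulated job; also report how many records reach the shuffle."""
    outputs = [map_phase(split) for split in splits]
    if use_combiner:
        outputs = [combine(pairs) for pairs in outputs]
    shuffled_records = sum(len(pairs) for pairs in outputs)
    return reduce_phase(shuffle_and_sort(outputs)), shuffled_records

# Two input splits, one per simulated mapper (made-up sample data).
splits = [
    ["big data big cluster", "big data"],
    ["data cluster data"],
]
plain_result, plain_records = run_job(splits, use_combiner=False)
combined_result, combined_records = run_job(splits, use_combiner=True)
print(plain_records, combined_records)
```

Both runs produce the same word counts, but the run with the combiner passes fewer intermediate records to the shuffle, which is the effect the abstract attributes to combiners. In a real Hadoop job the framework decides whether and how often to call the combiner, so the reducer must not depend on it having run.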
|