Hadoop setup() and cleanup() Method Example in MapReduce

A Hadoop MapReduce program has two lifecycle methods, setup() and cleanup(), that you can use depending on your requirements. Both can be overridden in the Mapper and in the Reducer, wherever your MapReduce code needs them.

Hadoop setup method

During setup() you typically read parameters from the configuration object to customize your processing logic. The lifecycle of a map/reduce task is (from a programmer’s point of view):

setup -> map -> cleanup

setup -> reduce -> cleanup
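
The ordering above can be sketched in plain Java. This is only a stand-in that mimics the lifecycle and is not Hadoop's actual Mapper class; the class name LifecycleDemo and its calls list are hypothetical names used for illustration. In a real job the framework invokes these hooks for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for a map task (assumption: not Hadoop's real Mapper).
class LifecycleDemo {
    // Records the order in which the hooks fire.
    final List<String> calls = new ArrayList<>();

    void setup() { calls.add("setup"); }                    // once, before any record
    void map(String record) { calls.add("map:" + record); } // once per input record
    void cleanup() { calls.add("cleanup"); }                // once, after the last record

    // Drives one task: setup once, map per record, cleanup once.
    void run(List<String> records) {
        setup();
        try {
            for (String record : records) {
                map(record);
            }
        } finally {
            cleanup();
        }
    }

    public static void main(String[] args) {
        LifecycleDemo task = new LifecycleDemo();
        task.run(Arrays.asList("a", "b", "c"));
        System.out.println(task.calls); // prints [setup, map:a, map:b, map:c, cleanup]
    }
}
```

A reduce task follows the same shape, with reduce() called once per key in place of map().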

The main source of confusion is where exactly to use the setup() and cleanup() methods, and in which situations a MapReduce program needs them. Put simply, setup() runs before the first call to map() in a map task, or before the first call to reduce() in a reduce task.

setup: Called once at the beginning of the task.

Use setup() when you want to pass user-defined parameters to your map or reduce task, when you need a list of words to compare against the words in the MapReduce input file, or when you want to use the Hadoop distributed cache in your MapReduce program.

Hadoop cleanup method

cleanup: Called once at the end of the task.


During cleanup() you typically release any resources you allocated earlier in the task. Another common use is to flush out aggregate results that were accumulated while the task ran.
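
As a rough illustration of the "flush accumulated results" use, the plain-Java sketch below keeps word totals in memory during map() and only emits them in cleanup(). It is a stand-in, not Hadoop code: the class BufferingWordCounter is hypothetical, and its emitted list plays the role that context.write(...) would play in a real Mapper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for a mapper that aggregates in memory (assumption: not Hadoop's API).
class BufferingWordCounter {
    private Map<String, Integer> counts;             // allocated in setup()
    final List<String> emitted = new ArrayList<>();  // stands in for context.write(...)

    void setup() {
        counts = new TreeMap<>();
    }

    // Called once per input line: only updates the in-memory totals.
    void map(String line) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
    }

    // Called once at the end: flush the accumulated totals, then release the buffer.
    void cleanup() {
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            emitted.add(e.getKey() + "\t" + e.getValue());
        }
        counts = null;
    }

    public static void main(String[] args) {
        BufferingWordCounter task = new BufferingWordCounter();
        task.setup();
        task.map("to be or not to be");
        task.map("to do");
        task.cleanup();
        task.emitted.forEach(System.out::println);
    }
}
```

Nothing is emitted until cleanup() runs, so each distinct word leaves the task once with its total rather than once per occurrence. The trade-off is that the totals must fit in the task's memory.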

As already mentioned, setup() and cleanup() are methods you can override, if you choose, and they are there for you to initialize and clean up your map/reduce tasks. You don’t have direct access to any data from the input split during these phases. So, for each map or reduce task, setup() is called first, then map() or reduce() is called for the task's input, and finally cleanup() is called before the task exits.

For example, in the canonical word count example, let’s say you want to exclude certain words from being counted (e.g. stop words such as “the”, “is”, etc.). When you configure your MapReduce job, you can pass a comma-delimited list of these words as a parameter (key-value pair) into the configuration object. Then in your map code, during setup(), you can read the stop words and store them in a field of the mapper (a variable global to the map task) so that your map logic can skip them when counting.
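
The stop-word idea can be sketched in plain Java as follows. This is a minimal stand-in under stated assumptions, not Hadoop code: a java.util.Map plays the role of the job configuration, the class StopWordMapper and the parameter name "wordcount.stop.words" are hypothetical, and the emitted list stands in for the mapper's output. In a real Mapper the value would be read from the job configuration via context.getConfiguration().get(...).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java stand-in for a word count mapper (assumption: not Hadoop's API).
class StopWordMapper {
    // Hypothetical parameter name chosen for this sketch.
    static final String STOP_WORDS_KEY = "wordcount.stop.words";

    private final Set<String> stopWords = new HashSet<>(); // task-level state filled in setup()
    final List<String> emitted = new ArrayList<>();        // stands in for context.write(...)

    // Runs once per task: parse the comma-delimited list from the configuration.
    void setup(Map<String, String> conf) {
        String raw = conf.getOrDefault(STOP_WORDS_KEY, "");
        for (String word : raw.split(",")) {
            String trimmed = word.trim().toLowerCase();
            if (!trimmed.isEmpty()) {
                stopWords.add(trimmed);
            }
        }
    }

    // Runs once per input line: emit (word, 1) for every word that is not a stop word.
    void map(String line) {
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                emitted.add(token + "\t1");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(STOP_WORDS_KEY, "the, is"); // what the driver would set on the job
        StopWordMapper mapper = new StopWordMapper();
        mapper.setup(conf);
        mapper.map("The sky is blue");
        mapper.emitted.forEach(System.out::println);
    }
}
```

Parsing the list once in setup() rather than on every map() call keeps the per-record work down to a set lookup.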



Hadoop setup method Driver Code

Hadoop setup method Mapper code

Hadoop setup method Reducer code


This is a basic example of how to write the setup() and cleanup() methods in your MapReduce program.
