Hadoop MapReduce Counters

Hadoop MapReduce Features:
Hadoop MapReduce offers several more advanced features, such as counters, sorting, and joining datasets.

Hadoop MapReduce Counters:
Counters are a useful channel for gathering statistics about a job, whether for quality control or for application-level statistics. They are also useful for problem diagnosis. If you are tempted to put a log message into your map or reduce task, it is often better to see whether you can use a counter instead to record that a particular condition occurred. For large distributed jobs, counter values are also much easier to retrieve than log output.

Built-in Counters:

Hadoop maintains a set of built-in counters for every job, which report various metrics for that job. For example, there are counters for the number of bytes and records processed.

Built-in counters fall into two broad categories:
1) Task counters
2) Job counters

The built-in counters are organized into the following groups:
1) MapReduce Task Counters
2) Filesystem Counters
3) FileInputFormat Counters
4) FileOutputFormat Counters
5) Job Counters

Task counters:
Task counters gather information about tasks over the course of their execution, and the results are aggregated over all the tasks in a job. Task counters are maintained by each task attempt and periodically sent to the tasktracker and then to the jobtracker. Counter values are definitive only once a job has successfully completed. However, some counters provide useful diagnostic information while a task is still progressing, and it can be useful to monitor them with the web UI.
For example, PHYSICAL_MEMORY_BYTES, VIRTUAL_MEMORY_BYTES, and COMMITTED_HEAP_BYTES provide an indication of how memory usage varies over the course of a particular task attempt.
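
The aggregation step described above can be pictured with a small plain-Java model. This is only a sketch and does not use the Hadoop API: the class TaskCounterAggregation, the method aggregate, and the sample values are hypothetical. It assumes that each successful task attempt reports a map of counter names to values, and that the job-level figure is the per-name sum. MAP_INPUT_RECORDS and MAP_OUTPUT_RECORDS are used as counter names because they are built-in task counters.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaskCounterAggregation {

    // Sums per-task counter values into job-wide totals, keyed by counter name.
    static Map<String, Long> aggregate(List<Map<String, Long>> perTaskCounters) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> taskCounters : perTaskCounters) {
            for (Map.Entry<String, Long> e : taskCounters.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up values, as if reported by two successful task attempts.
        Map<String, Long> task0 =
            Map.of("MAP_INPUT_RECORDS", 1200L, "MAP_OUTPUT_RECORDS", 1150L);
        Map<String, Long> task1 =
            Map.of("MAP_INPUT_RECORDS", 800L, "MAP_OUTPUT_RECORDS", 790L);

        // Print the job-wide totals, one entry per counter name.
        System.out.println(aggregate(List.of(task0, task1)));
    }
}
```

In a real job this summation is done by the framework; the model only illustrates why a job-level total is not final until every task has reported.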

Job counters:

Job counters are maintained by the jobtracker and measure job-level statistics. For example, TOTAL_LAUNCHED_MAPS counts the number of map tasks that were launched over the course of a job.

User-Defined Java Counters:

MapReduce allows user code to define its own counters using Java's "enum" keyword. A job may define an arbitrary number of enums, each with an arbitrary number of fields. The name of the enum is the group name, and the enum's fields are the counter names.
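public class MaxTemperatureWithCounters extends Configured implements Tool {
  // The enum name is the counter group; its fields are the counter names.
  enum Temperature {
    MISSING,
    MALFORMED
  }
  // ... inside the mapper's map() method, in the branch for a malformed record:
        System.err.println("Ignoring possibly corrupt input: " + value);
        reporter.incrCounter(Temperature.MALFORMED, 1);
      } else if (parser.isMissingTemperature()) {
        reporter.incrCounter(Temperature.MISSING, 1);
      }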

The job client output then lists the user-defined counters:
2014/11/20 06:33:36 INFO mapred.JobClient: Air Temperature Records
2014/11/20 06:33:36 INFO mapred.JobClient: Malformed=3        <-- user-defined counter
2014/11/20 06:33:36 INFO mapred.JobClient: Missing=661368     <-- user-defined counter
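
To make the enum-to-counter mapping concrete, here is a minimal plain-Java sketch. It is not the Hadoop API: EnumCounterSketch, increment, get, and classify are hypothetical stand-ins, and the record format (a blank line means a missing reading, a non-integer line means a malformed one) is an assumption made purely for illustration. It only shows the idea that the enum names the group and its constants name the counters.

```java
import java.util.EnumMap;
import java.util.Map;

public class EnumCounterSketch {

    // The enum name plays the role of the counter group; constants are counter names.
    enum Temperature { MISSING, MALFORMED }

    private final Map<Temperature, Long> counters = new EnumMap<>(Temperature.class);

    // Adds `amount` to the given counter, starting from zero on first use.
    void increment(Temperature counter, long amount) {
        counters.merge(counter, amount, Long::sum);
    }

    // Returns the current value, or zero if the counter was never incremented.
    long get(Temperature counter) {
        return counters.getOrDefault(counter, 0L);
    }

    // Hypothetical record check: blank means missing, non-integer means malformed.
    void classify(String record) {
        String trimmed = record.trim();
        if (trimmed.isEmpty()) {
            increment(Temperature.MISSING, 1);
        } else {
            try {
                Integer.parseInt(trimmed);
            } catch (NumberFormatException e) {
                increment(Temperature.MALFORMED, 1);
            }
        }
    }

    public static void main(String[] args) {
        EnumCounterSketch sketch = new EnumCounterSketch();
        for (String record : new String[] {"21", "", "abc", "-5", "  "}) {
            sketch.classify(record);
        }
        // Report each counter as Group.NAME=value.
        String group = Temperature.class.getSimpleName();
        for (Temperature t : Temperature.values()) {
            System.out.println(group + "." + t.name() + "=" + sketch.get(t));
        }
    }
}
```

In the excerpt above, the role of increment is played by reporter.incrCounter, and the framework rather than user code keeps and reports the totals.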