Hadoop Falcon Tutorial

Apache Falcon is a data lifecycle management framework for Hadoop clusters. Put simply, Falcon simplifies the definition, scheduling, and management of data pipelines in Hadoop. It covers data processing pipelines, replication, workflows, and compliance use cases, and it integrates easily with YARN.

Falcon sits at the center of the Hadoop stack, where it centrally manages the cluster's data governance, maximizes data pipeline reuse, and enforces consistent data lifecycles.

Advantages of Hadoop Falcon

The Apache Falcon community is working to enhance operations, add support for transactional applications, and improve tooling.

What Hadoop Falcon Does

Falcon simplifies the development and management of data processing pipelines with a higher layer of abstraction, taking the complex coding out of data processing applications by providing out-of-the-box data management services. This simplifies the configuration and orchestration of data motion, disaster recovery and data retention workflows.
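As an illustration of such a retention workflow, a Falcon feed entity declares where a dataset lives, how often new instances arrive, and how long each cluster keeps them. The sketch below assumes a cluster entity named primaryCluster has already been submitted; the feed name and paths are hypothetical:

```xml
<!-- Hypothetical feed entity: hourly data retained for 90 days on primaryCluster -->
<feed name="rawLogFeed" description="Hourly raw logs" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <cluster name="primaryCluster" type="source">
      <validity start="2016-01-01T00:00Z" end="2017-01-01T00:00Z"/>
      <!-- Falcon evicts feed instances older than this limit -->
      <retention limit="days(90)" action="delete"/>
    </cluster>
  </clusters>
  <locations>
    <!-- One directory per hourly instance -->
    <location type="data" path="/data/logs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
  <ACL owner="falcon" group="users" permission="0755"/>
  <schema location="/none" provider="none"/>
</feed>
```

Adding a second cluster element with type="target" to the same feed is how Falcon expresses replication to another cluster, which is the basis of its disaster recovery workflows.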

Apache Falcon meets enterprise data governance needs in three areas: centralized data lifecycle management, business continuity and disaster recovery, and audit and compliance.


How Hadoop Falcon Works Internally 

A user creates entity specifications and submits them to Falcon using the command-line interface (CLI) or the REST API. Falcon transforms the entity specifications into repeated actions through a Hadoop workflow scheduler; all workflow and state management functions are delegated to that scheduler. By default, Falcon uses Apache Oozie as the scheduler.
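For example, the CLI round trip for submitting and scheduling entities looks roughly like the following sketch. It assumes a running Falcon server; the entity names and file paths are placeholders:

```shell
# Submit the cluster and process definitions (validated and stored; nothing runs yet)
falcon entity -type cluster -submit -file primaryCluster.xml
falcon entity -type process -submit -file cleanseProcess.xml

# Schedule the process: Falcon hands it to the workflow scheduler (Oozie by default)
falcon entity -type process -schedule -name cleanseProcess

# Check the status of a scheduled entity
falcon entity -type process -status -name cleanseProcess
```

The same operations are available over the REST API, which is what the CLI calls under the hood.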


