Apache Spark Tutorial for beginners

Apache Spark is an open-source processing engine: a fast, general-purpose cluster computing engine for large-scale data processing. Streaming data: a key use case for Apache Spark is processing streaming data. With so much data being processed on a daily basis, it […]

Apache Spark Supported File Systems

Apache Spark is one of the trending technologies in today's IT industry, and it supports many file systems for processing data. The file systems supported by Apache Spark are described below. Spark supports a large number of file systems for […]

Sqoop Import

Sqoop import: The sqoop import tool imports individual tables from an RDBMS into HDFS, using four mappers by default. During the import, Sqoop generates Java classes and a JAR file. The source of a Sqoop import is either a database (tables) or a mainframe (datasets). The Sqoop ‘import’ tool is used to import table data from the […]
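To make the four-mapper default concrete, here is a minimal sketch that assembles a `sqoop import` command line in Python without running it. The helper `build_sqoop_import` is hypothetical (it is not part of Sqoop), and the JDBC URL, table name, and target directory are placeholder values chosen only for illustration.

```python
import shlex

def build_sqoop_import(connect, table, target_dir, num_mappers=4):
    """Return the argument list for importing one table with sqoop import.

    num_mappers defaults to 4, mirroring the default number of mappers.
    This is a hypothetical helper that only builds the command.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", connect,          # JDBC URL of the source RDBMS
        "--table", table,              # single table to import
        "--target-dir", target_dir,    # destination directory in HDFS
        "--num-mappers", str(num_mappers),
    ]

# Placeholder values for illustration only
cmd = build_sqoop_import(
    "jdbc:mysql://localhost/testdb", "employees", "/user/hadoop/employees"
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing a different `num_mappers` value changes how many parallel map tasks the import would use; the command itself would still have to be run on a machine where Sqoop and the JDBC driver are installed.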

Apache Sqoop Introduction

An RDBMS is one of the sources that generate gigabytes of data. When the Big Data storage and analysis tools of the Hadoop ecosystem, such as MapReduce, Hive, HBase, Cassandra, Pig, etc., came into the picture, they required a tool to interact with relational database servers for importing and exporting the Big Data residing in them. Here is a description […]

Installation of Sqoop

Installation of Sqoop: Installing Sqoop is very easy. For basic Sqoop commands there is no need to update sqoop-site.xml. Just download the Sqoop installation tar file from the Apache mirrors, copy it into the chosen directory, extract it, and finally update the bashrc file. Sqoop is used to transfer data between Hadoop and relational databases or […]
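The final step, updating the bashrc file, can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not say which lines to add, so the `SQOOP_HOME` and `PATH` exports below follow the usual convention, the install path `/usr/local/sqoop` is a placeholder, and both helper functions are hypothetical. The demo writes to a throwaway file rather than a real `~/.bashrc`.

```python
import tempfile
from pathlib import Path

def sqoop_env_lines(install_dir):
    """Return export lines that put Sqoop's bin directory on PATH (assumed convention)."""
    return [
        f"export SQOOP_HOME={install_dir}",
        "export PATH=$PATH:$SQOOP_HOME/bin",
    ]

def update_bashrc(bashrc_path, install_dir):
    """Append any missing Sqoop export lines and return the lines that were added."""
    path = Path(bashrc_path)
    existing = path.read_text() if path.exists() else ""
    present = set(existing.splitlines())
    missing = [line for line in sqoop_env_lines(install_dir) if line not in present]
    if missing:
        # Start on a fresh line if the file does not end with a newline
        prefix = "" if existing == "" or existing.endswith("\n") else "\n"
        with path.open("a") as fh:
            fh.write(prefix + "\n".join(missing) + "\n")
    return missing

# Demonstrate on a temporary file; the install path is a placeholder
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "bashrc"
    demo.write_text("# existing settings")
    first = update_bashrc(demo, "/usr/local/sqoop")
    second = update_bashrc(demo, "/usr/local/sqoop")  # nothing left to add
    demo_content = demo.read_text()
    print(demo_content)
```

Running the update a second time adds nothing, so the sketch is safe to repeat; after editing the real file, a new shell (or `source ~/.bashrc`) is needed for the variables to take effect.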