Big Data Hadoop Interview Questions

1) What is Big Data? Nowadays, data comes from many different sources, such as Facebook, Twitter, Gmail, supermarkets, sensors, e-commerce sites, hospitals, and offices, and it is available in both structured and unstructured formats. Big data may be important to business and society. The real issue is not that you are acquiring large amounts of data. It's what you do with the […]

Hadoop Distributed File System


HDFS was based on a paper Google published about its Google File System. The Hadoop Distributed File System (HDFS) is a Java-based file system that provides scalable and reliable data storage and is designed to span large clusters of commodity servers. HDFS runs on top of the existing file systems on each node […]

How Facebook uses Hadoop and Hive

Many IT companies use Hadoop because it can both store and process large datasets. The Hadoop ecosystem includes a database (HBase) and a data warehouse (Hive); these two components are useful for storing transactional data in HBase and generating reports with Hive. A traditional RDBMS supports only up to a certain limit of […]

NoSQL Apache HBase Data Model Design

Data Model in HBase: In HBase, data is stored in tables with rows and columns, similar to an RDBMS, but this is not a helpful analogy. Instead, it can be helpful to think of an HBase table as a multi-dimensional map. HBase data model terminology: Table (an HBase table consists of rows), Row (Row […]
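The "multi-dimensional map" view can be made concrete with a small sketch. This is only a toy in-memory model in Python, not HBase client code. The nesting used here (row key, then column family, then column qualifier, then timestamp) follows HBase's general data model rather than anything visible in the truncated excerpt, and the helper names `make_table`, `put`, and `get_latest` are hypothetical.

```python
from collections import defaultdict


def make_table():
    """Toy model of one table as nested maps (hypothetical helper).

    Layout: row key -> column family -> column qualifier -> {timestamp: value}
    """
    return defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))


def put(table, row, family, qualifier, value, ts):
    # Each write stores a new version of the cell under its timestamp.
    table[row][family][qualifier][ts] = value


def get_latest(table, row, family, qualifier):
    # Use .get() at every level so a read never creates empty entries.
    versions = table.get(row, {}).get(family, {}).get(qualifier, {})
    if not versions:
        return None
    # The newest version is the one with the largest timestamp.
    return versions[max(versions)]


table = make_table()
put(table, "user1", "info", "name", "Alice", ts=1)
put(table, "user1", "info", "name", "Alicia", ts=2)
latest_name = get_latest(table, "user1", "info", "name")
```

Reading a cell means walking the map one key at a time, which is why the RDBMS analogy of fixed rows and columns is less helpful: different rows can hold different qualifiers, and each cell can hold several timestamped versions.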

Key Features in HBase

The key features of HBase are: 1) HBase is not an "eventually consistent" data store. This makes it very suitable for tasks such as high-speed counter aggregation (strongly consistent reads/writes). 2) HBase tables are distributed across the cluster via regions, and regions are automatically split and redistributed as your data grows (automatic sharding). 3) Automatic RegionServer failover. 4) HBase […]
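To illustrate the automatic sharding point, here is a rough Python sketch of regions as contiguous, sorted ranges of row keys that split once they grow too large. It is a toy model under stated assumptions: the class `ToyTable`, the row-count threshold `max_rows_per_region`, and the split-at-the-middle rule are all hypothetical. Real HBase decides when to split based on configured region size and split policy, and it also reassigns regions across RegionServers, which this sketch does not model.

```python
import bisect


class ToyTable:
    """Toy model of key-range regions (hypothetical, not HBase code)."""

    def __init__(self, max_rows_per_region=4):
        self.max_rows = max_rows_per_region
        # Each region is a sorted list of row keys; regions are kept in key order.
        self.regions = [[]]

    def _region_index(self, row_key):
        # A region's start key is its first row key; the first region has
        # no lower bound, so it is left out of the start-key list.
        starts = [region[0] for region in self.regions[1:]]
        return bisect.bisect_right(starts, row_key)

    def put(self, row_key):
        i = self._region_index(row_key)
        region = self.regions[i]
        pos = bisect.bisect_left(region, row_key)
        if pos < len(region) and region[pos] == row_key:
            return  # Row already present; nothing to do in this toy model.
        region.insert(pos, row_key)
        if len(region) > self.max_rows:
            # Split the oversized region at its middle key into two regions.
            mid = len(region) // 2
            self.regions[i:i + 1] = [region[:mid], region[mid:]]


toy = ToyTable(max_rows_per_region=4)
keys = ["r%02d" % n for n in range(10)]
for key in keys:
    toy.put(key)
```

Because every region covers a contiguous key range, a lookup only needs to find the one region whose range contains the row key, and growth is handled by splitting that region rather than by rebalancing the whole table.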