Hadoop Hive Interview Questions

What is Hive ?

Hive is data warehouse software that facilitates querying and managing large datasets residing in distributed storage. The Hive query language, called HiveQL, looks very much like SQL. Hive also allows traditional MapReduce programmers to plug in custom mappers and reducers when it is inconvenient or inefficient to express the logic in HiveQL (via user-defined functions, UDFs).
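
As a sketch, HiveQL reads much like SQL; the table and column names below are hypothetical:

```sql
-- Hypothetical managed table for page-view logs
CREATE TABLE page_views (
  user_id   BIGINT,
  page_url  STRING,
  view_time TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- A familiar SQL-style aggregation; Hive compiles it into MapReduce jobs
SELECT page_url, COUNT(*) AS views
FROM page_views
GROUP BY page_url;
```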

What is Hive Metastore ?

The Hive metastore is a database that stores metadata about your Hive tables, such as the table name, column names, data types, table location, and number of buckets in the table.
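
Most of this metadata can be inspected from the Hive shell; for example (assuming a table named page_views exists):

```sql
-- Prints column names and types, table location, number of buckets,
-- SerDe, and other properties recorded in the metastore
DESCRIBE FORMATTED page_views;
```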

What is the current version of Hive ?

The latest release is listed on the Apache Hive downloads page.

What is the stable version of Hive ?

Stable releases are marked as such on the same Apache Hive downloads page.

Which Hadoop versions does the new Hive release support ?

The release works with Hadoop 0.20.x, 0.23.x.y, 1.x.y, and 2.x.y.

Where do we have to set the Hive installation path ?

We can set the Hive path in the ~/.bashrc file or the hadoop-env.sh file.

Which is better for setting the path: the ~/.bashrc file or the hadoop-env.sh file ?


Why is ~/.bashrc better than hadoop-env.sh ?

~/.bashrc is executed at every login, so its settings take effect as soon as you log in to the system, whereas hadoop-env.sh is executed only when Hadoop starts.

What is the Hive installation path ?

export HIVE_HOME=/home/hadoop/work/hive-x.y.z

export PATH=$PATH:$HIVE_HOME/bin

How to Install Hive ?

See the Hive installation tab in the menu above.

Which companies are mostly using Hive ?

Facebook (where Hive originated), Netflix, and Amazon are among the best-known companies using Hive.

Which company initially developed Hive ?

Hive was initially developed by Facebook.

How does Facebook use Hadoop, Hive, and HBase ?

Facebook stores its data on HDFS; millions of photos are uploaded to Facebook every day, and Hadoop helps store and process them.

  1. Facebook Messages, Likes, and status updates run on top of HBase.
  2. Hive is used to generate reports for third-party developers and advertisers who need to track the success of their applications or campaigns.

What is Apache HCatalog ?

HCatalog is a table and data management layer for Hadoop, built on top of the Hive metastore, and it incorporates Hive's DDL. Data registered in HCatalog can be processed with Apache Pig, Apache MapReduce, and Apache Hive without worrying about where the data is stored or what format it is in. HCatalog displays data from RCFile format, text files, or sequence files in a tabular view, and it provides REST APIs so that external systems can access these tables' metadata.

What is the work of Hive/HCatalog ?

Hive/HCatalog also enables sharing of data structures with external systems, including traditional data management tools.

What is WebHCat Server ?

The WebHCat server provides a REST-like web API for HCatalog. Applications make HTTP requests to run Pig, Hive, and HCatalog DDL operations from within applications.

What is SerDe in Apache Hive ?

SerDe is short for Serializer/Deserializer. Hive uses a SerDe to read rows from, and write rows to, the files backing a Hive table. The important point is that Hive does not have its own storage format: the data lives in HDFS in whatever format the user chooses. Users point Hive at data on HDFS (for example with CREATE EXTERNAL TABLE or LOAD DATA INPATH), and the SerDe tells Hive how to parse that file format into rows and columns.
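
As a sketch, a SerDe is named in the table's DDL. The table name, path, and schema below are hypothetical; OpenCSVSerde ships with recent Hive releases and reads every column as STRING:

```sql
-- Hypothetical external table over CSV files already sitting in HDFS;
-- the SerDe tells Hive how to parse each line into columns.
CREATE EXTERNAL TABLE sales_csv (
  order_id STRING,
  amount   STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE
LOCATION '/data/sales';
```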

Is it possible for multiple users to share the same metastore, in the case of the embedded metastore ?

No, it is not possible to share the embedded metastore (which uses a Derby database). It is recommended to use a standalone "real" database such as MySQL or PostgreSQL instead.

Is multiline comment supported in Hive Script ?
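
No. HiveQL scripts support only single-line comments, which begin with two dashes; a multi-line note simply repeats the dashes on each line. A short illustration (table name hypothetical):

```sql
-- This is a comment: everything after the two dashes is ignored.
-- There is no /* ... */ block-comment syntax in Hive scripts,
-- so multi-line notes need '--' at the start of every line.
SELECT COUNT(*) FROM page_views;
```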


Difference between SQL and HiveQL ?

[Table: Difference between SQL and HiveQL]

Hive Data types ?

Hive data types fall into two groups:

Primitive types: TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, BOOLEAN, STRING, VARCHAR, CHAR, BINARY, TIMESTAMP, DATE

Complex types: ARRAY, MAP, STRUCT, UNIONTYPE
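
As a sketch, Hive's complex types (ARRAY, MAP, STRUCT) can be combined with the primitive types in a table definition; the names below are hypothetical:

```sql
-- Hypothetical table mixing primitive and complex types
CREATE TABLE employees (
  name    STRING,
  salary  DOUBLE,
  skills  ARRAY<STRING>,
  perks   MAP<STRING, FLOAT>,
  address STRUCT<street:STRING, city:STRING, zip:INT>
);

-- Complex fields are accessed with indexes, map keys, and dot notation
SELECT name, skills[0], perks['bonus'], address.city
FROM employees;
```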


If I have 100 nodes already, how can I add another 100 nodes ?

Adding nodes to a running cluster is known as commissioning. For each new node:

1) Create the required user name and password on the new system
2) Install SSH and set up an SSH connection with the master node
3) Add the new node's SSH public RSA key to the authorized_keys file
4) Add the new datanode's hostname, IP address, and other details to the /etc/hosts and slaves files (e.g., slave3.in slave3)
5) Log in to the new node (e.g., su hadoop or ssh -X hadoop@)
6) Start the DataNode on the newly added slave node with the following command:
./bin/hadoop-daemon.sh start datanode
7) Check the output of the jps command on the new node
