
Hive Introduction and Installation


What is Apache Hive?

Apache Hive is a data-warehousing framework built on top of Hadoop. It facilitates querying and managing large datasets in Hadoop storage using a SQL-like language called HiveQL.
Hive is useful for data mining, log processing, document indexing, and many other applications.

Installing Hive:

You can download a stable Hive release from the official website, https://hive.apache.org/downloads.html. Then proceed with the steps below:

1) Open your terminal and go to the directory where you copied the stable Hive release

For instance,

cd /home/sireesh

2) Extract the Hive tar file

tar xzf apache-hive-1.1.0-bin.tar.gz

3) Set the HIVE_HOME and PATH variables in your .bashrc file

Open .bashrc from your home directory ( vi .bashrc ) and add the following lines:

export HIVE_HOME=/path/to/apache-hive-1.1.0-bin   # replace with your Hive directory path
export PATH=$PATH:$HIVE_HOME/bin

4) Exit the terminal and open a fresh terminal instance, or just type bash to load the environment variables set in .bashrc
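
As a concrete sketch of step 3, suppose the archive was extracted in /home/sireesh as in steps 1 and 2, so the Hive directory is /home/sireesh/apache-hive-1.1.0-bin. That path is only an assumption for illustration; substitute the directory where you extracted Hive.

```shell
# Assumed install location: the archive from step 2 extracted in /home/sireesh.
# Replace this path with your own Hive directory.
export HIVE_HOME=/home/sireesh/apache-hive-1.1.0-bin

# Append Hive's bin directory so the hive command can be found.
export PATH="$PATH:$HIVE_HOME/bin"

# Print the value to confirm that the variable is set in this shell.
echo "HIVE_HOME is $HIVE_HOME"
```

Running these lines directly in a terminal only affects that session; placing them in .bashrc makes them apply to every new terminal.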

Hurray! You are done with the Hive installation; just typing ‘hive’ will take you to the Hive shell.
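
Before launching the shell, it can help to confirm that the new PATH entry is actually visible. The following is a minimal sketch; check_hive is a hypothetical helper name, not part of Hive.

```shell
# check_hive is a hypothetical helper: it reports whether the hive
# command can be found through the current PATH.
check_hive() {
    if command -v hive >/dev/null 2>&1; then
        echo "hive found at $(command -v hive)"
    else
        echo "hive not found on PATH"
        return 1
    fi
}

# "|| true" keeps the script going even when hive is missing.
check_hive || true
```

If it reports that hive is not found, recheck the HIVE_HOME and PATH lines in .bashrc and open a fresh terminal.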

Note: The above steps set Hive up for use on a local machine, since all the metadata (table schemas) is stored in an embedded Derby metastore, which cannot be shared with other Hadoop users.
I will discuss how to configure a shared Hive metastore in another post.
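
To see why the Derby metastore is local, note that with Hive's default embedded Derby settings the metadata lives in a metastore_db directory created wherever the hive command was started. The sketch below assumes that default configuration; has_local_metastore is a hypothetical helper name.

```shell
# has_local_metastore is a hypothetical helper. Assuming Hive's default
# embedded Derby configuration, the metastore is a metastore_db directory
# created in the directory where hive was launched.
has_local_metastore() {
    dir="${1:-.}"   # directory to inspect; defaults to the current one
    if [ -d "$dir/metastore_db" ]; then
        echo "Derby metastore found in $dir"
    else
        echo "no Derby metastore in $dir"
    fi
}

has_local_metastore "$PWD"
```

Because the directory is tied to where Hive was started, launching hive from a different directory starts with an empty metastore, which is one reason this setup suits only single-user, local work.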

I will discuss the Hive schema in the next post.