Hive Introduction and Installation


What is Apache Hive?

Apache Hive is a data-warehousing framework built on top of Hadoop. It makes it easier to query and manage large datasets in Hadoop storage using an SQL-like language called HiveQL.
Hive is useful for data mining, log processing, document indexing, and many other applications.

Installing Hive:

You can download a stable Hive release from the official website, https://hive.apache.org/downloads.html. Then proceed with the steps below:

1) Open your terminal and go to the directory where you copied the stable Hive release.

For instance:

cd /home/sireesh

2) Extract the Hive tar file:

tar xzf apache-hive-1.1.0-bin.tar.gz
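If you want to know which directory the extraction created (you will need it for HIVE_HOME in the next step), something like the sketch below may help. The function name extract_hive is made up for illustration, and it assumes the archive has a single top-level directory, which I believe is how the Hive binary tarballs are packaged.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hive): extract a tarball into a
# destination directory and print the directory that was created.
extract_hive() {
  local tarball="$1" dest="$2"
  local top
  # The first archive entry is assumed to be the top-level directory,
  # e.g. apache-hive-1.1.0-bin/
  top="$(tar tzf "$tarball" | head -n 1 | cut -d/ -f1)"
  tar xzf "$tarball" -C "$dest" && printf '%s/%s\n' "$dest" "$top"
}

# Paths taken from the example above; adjust them to your own machine.
tarball=/home/sireesh/apache-hive-1.1.0-bin.tar.gz
if [ -f "$tarball" ]; then
  extract_hive "$tarball" /home/sireesh
else
  echo "tarball not found: $tarball" >&2
fi
```

The path it prints is the value to use for HIVE_HOME.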

3) Set the HIVE_HOME and PATH variables in your .bashrc file.

Open .bashrc from your home directory (vi .bashrc) and add the following lines:

export HIVE_HOME=enter hive directory path here
export PATH=$PATH:$HIVE_HOME/bin
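If you would rather not edit the file by hand, the two lines can also be appended from the shell. This is only a sketch: add_hive_env is a made-up helper name, and the Hive path in the demo assumes the tarball from step 2 was extracted under /home/sireesh. The demo writes to a scratch file so your real .bashrc is left alone.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hive): append the Hive variables
# to an rc file, but only if HIVE_HOME is not already exported there.
add_hive_env() {
  local hive_dir="$1" rc_file="$2"
  if grep -q '^export HIVE_HOME=' "$rc_file" 2>/dev/null; then
    return 0   # already configured, nothing to do
  fi
  {
    printf 'export HIVE_HOME=%s\n' "$hive_dir"
    # Single quotes keep $PATH and $HIVE_HOME literal in the rc file.
    printf 'export PATH=$PATH:$HIVE_HOME/bin\n'
  } >> "$rc_file"
}

# Demo against a scratch file; pass "$HOME/.bashrc" for real use.
rc_demo="$(mktemp)"
add_hive_env /home/sireesh/apache-hive-1.1.0-bin "$rc_demo"   # assumed path
add_hive_env /home/sireesh/apache-hive-1.1.0-bin "$rc_demo"   # second call is a no-op
cat "$rc_demo"
```

Running it twice does not duplicate the lines, which is handy if you repeat the setup later.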

4) Exit the terminal and open a fresh one, or just type bash again, to load the environment variables set in .bashrc.

Hurray! You are done with the Hive installation; just typing 'hive' will take you to the Hive shell.
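If typing 'hive' gives you "command not found" instead, a quick check like the one below can point to the step that went wrong. It is a rough sketch and check_hive is a made-up name; it only looks at HIVE_HOME and PATH and does not start Hive.

```shell
#!/usr/bin/env bash
# Hypothetical check (not part of Hive): confirm that HIVE_HOME is set
# and that the hive launcher can be found through PATH.
check_hive() {
  if [ -z "${HIVE_HOME:-}" ]; then
    echo "HIVE_HOME is not set"; return 1
  fi
  if [ ! -x "$HIVE_HOME/bin/hive" ]; then
    echo "no executable at $HIVE_HOME/bin/hive"; return 1
  fi
  if ! command -v hive >/dev/null 2>&1; then
    echo "hive is not on PATH"; return 1
  fi
  echo "hive found at $(command -v hive)"
}

check_hive || echo "fix the settings in .bashrc, then open a new terminal"
```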

Note: The above steps set Hive up for use on a local machine, because all the metadata (table schemas) is stored in a Derby metastore, which cannot be shared with other Hadoop users.
I will discuss how to configure a shared Hive metastore in another post.

I will discuss Hive schemas in the next post.
