What is Apache Hive?
Apache Hive is a data-warehousing framework built on top of Hadoop. It facilitates querying and managing large datasets in Hadoop storage using a SQL-like language called HiveQL.
Hive is very useful for data mining, log processing, document indexing, and many other applications.
You can download a stable Hive version from the official website (https://hive.apache.org/downloads.html). Then proceed with the steps below:
1) Open your terminal and go to the directory where you saved the stable Hive release.
For instance, use cd to change into that directory; the exact path depends on where you saved the download.
2) Extract the Hive tar file
tar xzf apache-hive-1.1.0-bin.tar.gz
3) Set the HIVE_HOME and PATH variables in your .bashrc file
Open .bashrc from your home directory (vi ~/.bashrc), point HIVE_HOME at the extracted Hive directory, and add its bin directory to PATH.
4) Exit the terminal and open a fresh terminal instance, or just type bash again to load the environment variables set in .bashrc.
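Steps 1 to 3 can be sketched as one short shell session. This is only an illustration under assumptions: the ~/Downloads location is hypothetical (use wherever you saved the tarball), and the extracted directory is assumed to be named apache-hive-1.1.0-bin, after the tarball. The two export lines are the ones that would go into .bashrc.

```shell
# Hypothetical location of the downloaded tarball; adjust to your machine.
INSTALL_DIR="$HOME/Downloads"
HIVE_TARBALL="$INSTALL_DIR/apache-hive-1.1.0-bin.tar.gz"

# Steps 1 and 2: go to the directory and extract, only if the tarball is present.
if [ -f "$HIVE_TARBALL" ]; then
    cd "$INSTALL_DIR" && tar xzf "$HIVE_TARBALL"
fi

# Step 3: the lines to add to ~/.bashrc.
# Assumes the tarball extracts to a directory named apache-hive-1.1.0-bin.
export HIVE_HOME="$INSTALL_DIR/apache-hive-1.1.0-bin"
export PATH="$PATH:$HIVE_HOME/bin"
```

Once the variables are loaded in a new shell, running command -v hive is a quick way to check that the hive launcher is found on your PATH.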
Hurray! You are done with the Hive installation, and just typing ‘hive’ will take you to the Hive shell.
Note: The above steps set up Hive for use on a local machine only, because all the metadata (table schemas) is stored in an embedded Derby metastore, which cannot be shared with other Hadoop users.
I will discuss how to configure a shared Hive metastore in another post.