Hive is a data warehouse system for Hadoop that facilitates easy data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems.
In short, Hive lets you run Hadoop MapReduce jobs using SQL-like statements.
Here is a WordCount example I did using Hive. It first shows how to run it on your local machine, and then how to run it on Amazon EMR.
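If WordCount is new to you, here is roughly what the job computes, written as a plain Python sketch. This is not Hive code and not what Hive runs internally; the function names (map_phase, reduce_phase, word_count) and the sample input are made up for illustration. It only mimics the two MapReduce phases: emit a (word, 1) pair for each word, then sum the pairs per word.

```python
from collections import Counter


def map_phase(lines):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reduce_phase(pairs):
    """Sum the counts emitted for each distinct word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count(lines):
    """Run the map phase, then the reduce phase, over the input lines."""
    return reduce_phase(map_phase(lines))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a text file.
    sample = ["hello hive", "hello hadoop"]
    print(word_count(sample))
```

With Hive, the same grouping and counting is expressed as a query, and Hive takes care of turning it into MapReduce jobs.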
Local
1. Install Hive.
First you need to install Hadoop on your local machine; here is a post on how to do it. After you have installed Hadoop, you can follow this official tutorial to install Hive.