Build Local Single Node Hadoop Cluster on Linux

This post shows how to build a local single node Hadoop cluster on Linux.

Prerequisites:

(1) Install the JDK.

(2) Install Ant.

Use the binary distribution of Ant, and add the following lines to your ~/.bash_profile:

export ANT_HOME=YOUR_ANT_PATH
export JAVA_HOME=YOUR_JDK_PATH
export PATH=${PATH}:${ANT_HOME}/bin
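To confirm the variables took effect, open a new shell (or source the file) and check the versions; a quick sanity check, assuming a bash login shell:

source ~/.bash_profile
java -version
ant -version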


Install Hadoop:

(1) Download a Hadoop release.

(2) Uncompress the archive and move the folder wherever you want; I use ~/hadoop/.

(3) Enter ~/hadoop/ and run:

ant

(4) Put localhost in two files, conf/masters and conf/slaves:

echo localhost > conf/masters;
echo localhost > conf/slaves;

(5) In conf/core-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://127.0.0.1:9000</value>
  </property>
</configuration>
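Once the daemons are up (step (8) below), you can verify that clients reach the NameNode at this address; a quick check, using the -fs generic option to set the filesystem explicitly for one command:

bin/hadoop fs -fs hdfs://127.0.0.1:9000 -ls /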

(6) In conf/mapred-site.xml (setting dfs.datanode.socket.write.timeout to 0 disables the DataNode socket write timeout, a common workaround for write-timeout errors on slow or loaded machines):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>

  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>0</value>
  </property>
</configuration>
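Similarly, once everything is running you can ask the JobTracker at this address for its job list; a quick check, using the -jt generic option to point the client at a specific JobTracker:

bin/hadoop job -jt localhost:9001 -list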

(7) Edit conf/hadoop-env.sh, uncomment the JAVA_HOME line, and set it to your JDK path:

export JAVA_HOME=YOUR_JDK_PATH
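If you prefer to script the edit, something like the following works; a sketch, assuming your JDK lives at /usr/lib/jvm/java-6-sun (a hypothetical path, adjust to your system):

sed -i 's|^# export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-6-sun|' conf/hadoop-env.sh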

(8) Run Hadoop! On the very first run, format the NameNode before starting the daemons:

bin/hadoop namenode -format
./bin/start-all.sh
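Note that start-all.sh launches the daemons over ssh, so localhost must accept password-less logins. A minimal sketch, assuming OpenSSH and no existing key you want to preserve:

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost

The last command should log you in without a password prompt.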


After Running:

(1) Check that the daemons are running using jps (which lists Java processes):

jps

It should show five Hadoop daemons plus Jps itself (and possibly more if you are running, e.g., Eclipse):

17400 SecondaryNameNode
17172 NameNode
17599 TaskTracker
17279 DataNode
17493 JobTracker
17699 Jps

Check the local NameNode: http://localhost:50070/dfshealth.jsp should show one live node.

Check the local JobTracker: http://localhost:50030/jobtracker.jsp should show ‘State: RUNNING’, not ‘INITIALIZING’.
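If you prefer the command line, curl can poke both pages; a rough check, assuming curl is installed (the grep patterns just look for the phrases mentioned above):

curl -s http://localhost:50070/dfshealth.jsp | grep -i 'live'
curl -s http://localhost:50030/jobtracker.jsp | grep -i 'state'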

If the NameNode is not running, wipe the temporary HDFS data, format the NameNode, and run start-all.sh again (note that the rm deletes anything already stored in HDFS):

rm -r /tmp/hadoop-*;
bin/hadoop namenode -format;
./bin/start-all.sh

This can be caused by switching Hadoop versions.

(2) Sanity test

bin/hadoop fs -ls /
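For a slightly stronger sanity check, round-trip a small file through HDFS; a sketch using a throwaway file:

echo hello hadoop > /tmp/hello.txt
bin/hadoop fs -put /tmp/hello.txt /hello.txt
bin/hadoop fs -cat /hello.txt
bin/hadoop fs -rm /hello.txt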

(3) Word Count Example:

See the separate post on running a word count job.
