Hadoop Installation Overview – Part 1 – Single Node
Overview
This is Part 1 of the Hadoop installation series. We will cover installing Hadoop in a single-VM environment.
Both namenode and datanode will run on the same VM.
Assumption
This document assumes that:
- VMware Workstation is already installed on your machine; check the link below to install it.
install vmware workstation
- Ubuntu 12.04 LTS is installed; download the .iso file and install it.
install ubuntu
Installation Steps
Step 1
Install VMware Workstation, as mentioned above.
Step 2
Install Ubuntu, as mentioned above.
**Note** – Make sure you select the 32-bit or 64-bit version based on your Windows version.
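Once Ubuntu is installed, you can double-check which architecture you ended up with from a terminal:
$ uname -m
# x86_64 means 64-bit; i686 means 32-bit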
Step 3
Install OS updates for Ubuntu.
Go to the Terminal and type
$ sudo apt-get update
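Note that apt-get update only refreshes the package lists; to actually install the pending updates, follow it with:
$ sudo apt-get upgrade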
Step 4
Install OpenJDK 6. Since Hadoop is written in Java, we need the JDK.
$ sudo apt-get install openjdk-6-jdk
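To confirm the JDK was installed, check the version; the exact build number in your output will vary:
$ java -version
# expect something like: java version "1.6.0_27", OpenJDK Runtime Environment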
Step 5
Install Eclipse.
From the UI, go to the Software Center and install Eclipse.
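Alternatively, if you prefer the terminal, Eclipse is also packaged in the Ubuntu repositories (the package name is assumed to be eclipse on 12.04):
$ sudo apt-get install eclipse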
Step 6
Install OpenSSH Server.
$ sudo apt-get install openssh-server
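To check that the SSH daemon came up (the service name is assumed to be ssh, as on stock Ubuntu), you can run:
$ sudo service ssh status
# should report something like: ssh start/running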
Step 7
Download hadoop-1.2.1.tar.gz.
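If you prefer the command line, one way to fetch the tarball (assuming the Apache archive URL, which may change over time) is:
$ wget http://archive.apache.org/dist/hadoop/core/hadoop-1.2.1/hadoop-1.2.1.tar.gz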
Step 8
Go to the terminal and run:
$ cp hadoop-1.2.1.tar.gz ~    # copy to home dir
$ tar -xvf hadoop-1.2.1.tar.gz
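Note that Step 9 below points HADOOP_PREFIX at /softwares/hadoop-1.2.1, so if you want to follow it verbatim, move the extracted folder there (the /softwares directory is just this guide's choice; any path works as long as HADOOP_PREFIX matches):
$ sudo mkdir -p /softwares
$ sudo mv ~/hadoop-1.2.1 /softwares/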
Step 9
# Open ~/.bashrc and add the following settings
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64    # make sure this points to your JDK install
export HADOOP_PREFIX=/softwares/hadoop-1.2.1          # the path where you placed Hadoop (see Step 8)
export PATH=$PATH:$HADOOP_PREFIX/bin
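After saving .bashrc, reload it so the new variables take effect in the current shell, and confirm the hadoop binary is on your PATH:
$ source ~/.bashrc
$ hadoop version    # should print Hadoop 1.2.1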
Step 10 – CONFIGURATION FILES
Configuration settings: go to the conf folder of your Hadoop installation and add the following properties inside the <configuration> tags of each file.
- core-site.xml
$ sudo gedit core-site.xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:8020</value>
</property>
- hdfs-site.xml
$ sudo gedit hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
- mapred-site.xml
$ sudo gedit mapred-site.xml
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:8021</value>
</property>
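Depending on your setup, the Hadoop start scripts may not inherit JAVA_HOME from .bashrc; a common extra step is to set it in conf/hadoop-env.sh as well (same JDK path as Step 9):
$ sudo gedit hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64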
- /etc/hosts
Make an entry for the machine's IP address in the hosts file:
$ifconfig
$sudo gedit /etc/hosts
192.168.1.129 mynode1
192.168.1.129 localhost
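To verify the entry works, ping the hostname you just added (mynode1 is the example name used above; substitute your own):
$ ping -c 1 mynode1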
Step 11 – FINAL STEP
Generate SSH keys and add the public key to authorized_keys.
$ ssh-keygen
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# Note: this step is needed because the datanode and namenode are on the same machine; the namenode communicates with the datanode over SSH, so on a single-node installation the key must be in authorized_keys.
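To confirm passwordless SSH works, log in to localhost; it should not prompt for a password (you may see a one-time host-key confirmation, answer yes):
$ ssh localhost
$ exit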
Step 12 – RUN HADOOP AND VERIFY
With Step 11, the installation is complete; let's run Hadoop.
$ hadoop namenode -format
# Note – this formats the HDFS filesystem; you should see a "successfully formatted" message
$ start-dfs.sh
# This will start the namenode, datanode, and secondary namenode daemons
$ start-mapred.sh
# This will start the jobtracker and tasktracker daemons
$ jps
# Verify all the daemons are running; you should see 5 processes: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker
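For reference, the jps output should look roughly like this (the PIDs will differ, and Jps itself also shows up in the list):
12345 NameNode
12346 DataNode
12347 SecondaryNameNode
12348 JobTracker
12349 TaskTracker
12350 Jps
You can also verify through the web UIs: the NameNode at http://localhost:50070 and the JobTracker at http://localhost:50030.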
YOUR HADOOP IS RUNNING NOW. If you run into any issues, post them in the comments below.