# Happy Path: Set Up HDFS 2.6.5 in Fully-Distributed Operation
nodes | ip | NameNode | DataNode | Secondary NameNode
---|---|---|---|---
node01 | 192.168.157.11 | ✅ | |
node02 | 192.168.157.12 | | ✅ | ✅
node03 | 192.168.157.13 | | ✅ |
node04 | 192.168.157.14 | | ✅ |
## Prerequisites
- 4 Linux nodes that can ping each other
- map hostnames on every node via `vi /etc/hosts`:

  ```
  192.168.157.11 node01
  192.168.157.12 node02
  192.168.157.13 node03
  192.168.157.14 node04
  ```
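  To confirm the mapping works, a quick loop can ping every node by hostname (a minimal sketch, assuming the `/etc/hosts` entries above are already in place on the node you run it from):

  ```sh
  # Sanity check: ping each node once by hostname.
  # Assumes the /etc/hosts entries above are already present on this node.
  for h in node01 node02 node03 node04; do
    ping -c 1 "$h" > /dev/null && echo "$h reachable" || echo "$h UNREACHABLE"
  done
  ```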
## Setup
- copy files from one node to another with:

  ```sh
  scp <path_to_file> node02:/path_to_store
  ```
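  For example (the file name here is purely illustrative, not from the original guide):

  ```sh
  # Hypothetical example: push a JDK package from node01 to node02's /opt.
  scp ./jdk-8u181-linux-x64.rpm node02:/opt/
  ```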
- edit `/etc/profile` on each node (typical contents sketched below)
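  The guide doesn't show the `/etc/profile` contents; a minimal sketch, assuming Hadoop lives at `/opt/bigdata/hadoop-2.6.5` (the path the startup logs below use) and an illustrative JDK path:

  ```sh
  # Illustrative /etc/profile additions -- the JDK path is an assumption;
  # the Hadoop path matches the install dir used later in this guide.
  export JAVA_HOME=/usr/java/jdk1.8.0_181
  export HADOOP=/opt/bigdata/hadoop-2.6.5
  export PATH=$PATH:$JAVA_HOME/bin:$HADOOP/bin:$HADOOP/sbin
  ```

  Run `source /etc/profile` afterwards so the current shell picks the variables up.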
- on each node, run `ssh localhost` once to generate the `.ssh` folder; a key pair is also needed for the next step (sketch below)
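  The next step copies `~/.ssh/id_dsa.pub`, so a DSA key pair must exist first; one common way to create it:

  ```sh
  # Generate a passphrase-less DSA key pair; ~/.ssh/id_dsa.pub is what
  # gets copied to the other nodes in the next step.
  ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
  ```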
- on node01, copy the public key over:

  ```sh
  scp ~/.ssh/id_dsa.pub node02:/root/.ssh/node01.pub
  ```
- on node02, append it to the authorized keys:

  ```sh
  cat ~/.ssh/node01.pub >> ~/.ssh/authorized_keys
  ```
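  If the login still prompts for a password, sshd may be rejecting the key over file permissions; tightening them is a common fix (sketch, run on node02):

  ```sh
  # sshd (with StrictModes, the default) refuses keys whose files are
  # group- or world-writable.
  chmod 700 ~/.ssh
  chmod 600 ~/.ssh/authorized_keys
  ```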
- on node01, run `ssh node02` and check that it logs in without a password prompt
- do the same for node03 & node04
## Node01
```sh
cd $HADOOP/etc/hadoop
```
- `vi core-site.xml` to set where the NameNode runs:

  ```xml
  <property>
      <name>fs.defaultFS</name>
      <value>hdfs://node01:9000</value>
  </property>
  ```
- `vi hdfs-site.xml` to set how many replicas each block gets (with three DataNodes and a replication factor of 2, every block lives on two of the three):

  ```xml
  <property>
      <name>dfs.replication</name>
      <value>2</value>
  </property>
  ```
  and to set where the Secondary NameNode runs and where it writes its checkpoints:

  ```xml
  <property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>node02:50090</value>
  </property>
  <property>
      <name>dfs.namenode.checkpoint.dir</name>
      <value>/var/bigdata/hadoop/full/dfs/secondary</value>
  </property>
  ```
- `vi slaves` to list the hosts that run a DataNode (one per line):

  ```
  node02
  node03
  node04
  ```
- distribute the configured install to the other nodes:

  ```sh
  cd /opt
  scp -r ./bigdata/ node02:`pwd`
  ```

  then the same for node03 & node04 (loop sketch below)
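  The remaining two copies can be scripted the same way; a small sketch run from node01:

  ```sh
  # Copy the configured install to the remaining nodes (node02 was done above).
  cd /opt
  for h in node03 node04; do
    scp -r ./bigdata/ "$h":`pwd`
  done
  ```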
## Run
- on node01, format the NameNode:

  ```sh
  hdfs namenode -format
  ```
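  To confirm the format succeeded, the NameNode metadata directory should now contain a `VERSION` file with a fresh `clusterID`. A sketch, assuming the default location `${hadoop.tmp.dir}/dfs/name` for the root user (adjust the path if `dfs.namenode.name.dir` is configured differently on your cluster):

  ```sh
  # Default metadata layout for user root; your dfs.namenode.name.dir may differ.
  cat /tmp/hadoop-root/dfs/name/current/VERSION
  ```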
- on node01, start HDFS:

  ```sh
  start-dfs.sh
  ```
  ```
  [root@node01 hadoop]# start-dfs.sh
  Starting namenodes on [node01]
  node01: starting namenode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-namenode-node01.out
  node04: starting datanode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-datanode-localhost.localdomain.out
  node03: starting datanode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-datanode-localhost.localdomain.out
  node02: starting datanode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-datanode-localhost.localdomain.out
  Starting secondary namenodes [node02]
  node02: starting secondarynamenode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-secondarynamenode-localhost.localdomain.out
  ```
  The NN starts on node01, the SNN on node02, and DNs start on node02, node03, and node04.
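  Another quick check from node01 is `hdfs dfsadmin -report`, which lists the DataNodes the NameNode can see; with the setup above it should report 3 live DataNodes:

  ```sh
  # Prints overall cluster capacity plus one section per DataNode the NN knows about.
  hdfs dfsadmin -report
  ```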
- verify with `jps` & check the data directories on each node:
  - node01 has only NN-related processes
  - node02 has SNN & DN
  - node03 & node04 have DN only
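  Since node01 can now ssh to every node without a password, one loop can run the `jps` check everywhere (a sketch; `jps` must be on the remote shells' PATH, e.g. via the `/etc/profile` changes made earlier):

  ```sh
  # Expected processes: NameNode on node01; DataNode on node02-04;
  # SecondaryNameNode on node02. PIDs will differ per run.
  for h in node01 node02 node03 node04; do
    echo "== $h =="
    ssh "$h" jps
  done
  ```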