
nodes     ip               NameNode   DataNode   Secondary NameNode
node01    192.168.157.11   ✓
node02    192.168.157.12              ✓          ✓
node03    192.168.157.13              ✓
node04    192.168.157.14              ✓

Prerequisite

  • 4 Linux nodes that can ping each other by hostname
  • on every node, add the hostname mappings with vi /etc/hosts:

    192.168.157.11 node01
    192.168.157.12 node02
    192.168.157.13 node03
    192.168.157.14 node04
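
A quick way to confirm the nodes reach each other (a minimal check, assuming the /etc/hosts entries above are in place on every node):

    # run on each node; every hostname should answer
    for n in node01 node02 node03 node04; do
        ping -c 1 $n
    done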
    
    

Setup

  1. copy the required packages (e.g. the JDK and Hadoop tarballs) to each node: scp <path_to_file> node02:<path_to_store>
  2. edit /etc/profile on each node to export JAVA_HOME and HADOOP_HOME and add their bin directories to PATH
  3. on each node, run ssh localhost once to generate the .ssh folder
  4. on node01, generate a key pair with ssh-keygen, then run scp ~/.ssh/id_dsa.pub node02:/root/.ssh/node01.pub
  5. on node02, run cat ~/.ssh/node01.pub >> ~/.ssh/authorized_keys
  6. on node01, run ssh node02 and confirm it logs in without a password
  7. do the same for node03 & node04 (the commands are condensed in the sketch after this list)
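
Steps 2 through 6 condensed into commands (a minimal sketch; JAVA_HOME's path is an assumption, while the Hadoop path matches the /opt/bigdata/hadoop-2.6.5 layout seen in the startup log below):

    # step 2, on every node: append to /etc/profile, then source it
    # (JAVA_HOME path assumed; adjust to your JDK install)
    export JAVA_HOME=/usr/java/default
    export HADOOP_HOME=/opt/bigdata/hadoop-2.6.5
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

    # steps 4-5, on node01: create a key pair, push the public key to node02,
    # and append it there to authorized_keys
    ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    scp ~/.ssh/id_dsa.pub node02:/root/.ssh/node01.pub
    ssh node02 'cat ~/.ssh/node01.pub >> ~/.ssh/authorized_keys'

    # step 6: this should now log in without a password prompt
    ssh node02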

Node01

  1. cd $HADOOP_HOME/etc/hadoop
  2. vi core-site.xml

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node01:9000</value>
    </property>
    
    

    sets fs.defaultFS, i.e. which node the NameNode runs on and the default filesystem URI clients connect to
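
    With fs.defaultFS set, relative client paths resolve against node01; once the cluster is up, these two commands are equivalent (standard hdfs shell, nothing assumed beyond the config above):

        hdfs dfs -ls hdfs://node01:9000/
        hdfs dfs -ls /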

  3. vi hdfs-site.xml

    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>


    sets how many replicas HDFS keeps for each block of a file (a per-file default; see the example at the end of this step)

    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node02:50090</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>/var/bigdata/hadoop/full/dfs/secondary</value>
    </property>
    
    

    sets where the Secondary NameNode (SNN) runs and the directory it uses for its checkpoints
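
    The dfs.replication value above is only a default; it can be overridden or changed per file from the shell (standard hdfs commands; /data.txt is a hypothetical path):

        # upload a file with a non-default replication factor (path is illustrative)
        hdfs dfs -D dfs.replication=3 -put ./data.txt /data.txt

        # change the replication factor of an existing file
        hdfs dfs -setrep 2 /data.txt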

  4. vi slaves

    node02
    node03
    node04
    

    sets which nodes run DataNodes (DN)

  5. cd /opt
  6. scp -r ./bigdata/ node02:`pwd`, then do the same for node03 & node04 (or use the loop below)
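
    The whole distribution in one loop (run from /opt on node01, after the config above is finished):

        # copy the configured Hadoop tree to each slave node, into the same path
        for n in node02 node03 node04; do
            scp -r ./bigdata/ $n:`pwd`
        done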

Run

  1. on Node01, run hdfs namenode -format
  2. on Node01, run start-dfs.sh
    [root@node01 hadoop]# start-dfs.sh
    Starting namenodes on [node01]
    node01: starting namenode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-namenode-node01.out
    node04: starting datanode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-datanode-localhost.localdomain.out
    node03: starting datanode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-datanode-localhost.localdomain.out
    node02: starting datanode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-datanode-localhost.localdomain.out
    Starting secondary namenodes [node02]
    node02: starting secondarynamenode, logging to /opt/bigdata/hadoop-2.6.5/logs/hadoop-root-secondarynamenode-localhost.localdomain.out

The NN starts on node01, the SNN on node02, and DNs start on node02, node03 and node04, matching the role table at the top.

  3. verify

run jps and check the data directories on each node: node01 should only have NN-related entries, node02 has both SNN- and DN-related entries, and node03 & node04 have only DN-related entries
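
One way to check all four nodes in a single pass (a sketch; it assumes passwordless ssh from node01 and that jps is on the PATH for non-interactive shells):

    # compare each node's process list against the role table at the top of this post
    for n in node01 node02 node03 node04; do
        echo "== $n =="
        ssh $n jps
    done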