
This article covers a high-availability (HA) solution for Hadoop HDFS running in fully distributed mode.

source link

Prerequisite

a Hadoop cluster already running in fully distributed mode

Abstract

  • Active & Standby NNs
  • JournalNodes (an odd number, at least 3) to sync the edit log between the NNs
  • ZKFC (ZKFailoverController) to coordinate the Active and Standby NNs
  • DNs send block reports to both the Active and Standby NNs simultaneously

Cluster Design

| nodes  | ip             | NN | JN | DN | ZKFC | ZK |
| ------ | -------------- | -- | -- | -- | ---- | -- |
| node01 | 192.168.157.11 | ✔  | ✔  |    | ✔    |    |
| node02 | 192.168.157.12 | ✔  | ✔  | ✔  | ✔    | ✔  |
| node03 | 192.168.157.13 |    | ✔  | ✔  |      | ✔  |
| node04 | 192.168.157.14 |    |    | ✔  |      | ✔  |

Setup

Java Setup

  • make sure each node has the JDK installed
  • vi /etc/profile to set JAVA_HOME and put $JAVA_HOME/bin on the PATH
  • . /etc/profile to load the new environment
  • jps to verify the JDK tools are reachable
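A minimal sketch of the /etc/profile additions, assuming the JDK lives under /usr/java/default (a hypothetical path; adjust to your actual install):

```shell
# Hypothetical JDK location; replace with your actual install path
export JAVA_HOME=/usr/java/default
export PATH=$PATH:$JAVA_HOME/bin
```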

Passwordless SSH Setup

  • set up passwordless SSH between node01 and node02, in both directions
    • because each ZKFC needs to check (and, for fencing, log in to) the NN on the other node
  • on each node, in ~/.ssh, run ssh-keygen -t dsa -P '' -f ./id_dsa
  • cat id_dsa.pub >> authorized_keys
  • ssh localhost to verify no password is needed
  • on node02, scp id_dsa.pub node01:`pwd`/node02.pub
  • on node01, cat node02.pub >> authorized_keys, then repeat the exchange in the opposite direction
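The key-exchange steps above can be sketched as follows. This sketch writes to a scratch directory instead of ~/.ssh so it is safe to run anywhere, and it uses RSA because recent OpenSSH releases no longer accept DSA keys:

```shell
# Scratch directory standing in for ~/.ssh (sketch only)
SSH_DIR=$(mktemp -d)

# Generate a key pair with an empty passphrase
# (the article uses -t dsa; RSA is shown here for modern OpenSSH)
ssh-keygen -t rsa -P '' -f "$SSH_DIR/id_rsa" -q

# Authorize our own key, which is what makes `ssh localhost` password-free
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"

# On the real cluster the cross-node exchange would then be, from node02:
#   scp ~/.ssh/id_rsa.pub node01:`pwd`/node02.pub
# and on node01:
#   cat node02.pub >> ~/.ssh/authorized_keys
```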

Zookeeper Setup

  • download the release from the Apache ZooKeeper site
  • upload the downloaded tar.gz to node02 first
  • in node02's ~ folder, run tar -zxvf apache-zookeeper.... -C /opt/bigdata
  • cd /opt/bigdata/zookeeper../conf
  • cp zoo_sample.cfg zoo.cfg
  • vi zoo.cfg
    • dataDir=/var/bigdata/hadoop/zk
    • server.1=node02:2888:3888
      server.2=node03:2888:3888
      server.3=node04:2888:3888
      
  • mkdir -p /var/bigdata/hadoop/zk
  • in the zk folder, run echo <weight> > myid
    • the weight is the N in this host's server.N entry in zoo.cfg
  • vi /etc/profile
    • add zookeeper bin into PATH variables
  • spread the zookeeper folders to node03 and node04
  • scp -r apache-zooke... node03:`pwd`
  • on node03 and node04
    • mkdir -p /var/bigdata/hadoop/zk
    • echo <weight> > myid
    • vi /etc/profile
      • add zookeeper into path variable
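The zoo.cfg and myid steps can be sketched end to end. The dataDir below is a scratch directory for illustration (the article uses /var/bigdata/hadoop/zk); the timing values come from the shipped zoo_sample.cfg and the server list follows the cluster design above:

```shell
# Scratch dataDir standing in for /var/bigdata/hadoop/zk
ZK_DATA=$(mktemp -d)

# Minimal zoo.cfg: timing values per zoo_sample.cfg,
# quorum members per the article's cluster design
cat > "$ZK_DATA/zoo.cfg" <<EOF
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=$ZK_DATA
server.1=node02:2888:3888
server.2=node03:2888:3888
server.3=node04:2888:3888
EOF

# myid must hold this host's N from its server.N line, e.g. 1 on node02
echo 1 > "$ZK_DATA/myid"
```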

ZK Setup Verification

  • verify ZK

    • on node02
      • zkServer.sh start
        • Starting zookeeper ... STARTED
      • zkServer.sh status
        • It is probably not running.
      • jps
        • 7384 QuorumPeerMain
          • the process has started, but with only one of the three quorum members up it cannot serve requests yet
    • on node03
      • zkServer.sh start
    • on each node, make sure firewall is disabled
      • systemctl stop firewalld && systemctl disable firewalld
    • on node02
      • zkServer.sh status
        • Mode: follower
    • on node03
      • zkServer.sh status
        • Mode: leader
    • on node04
      • zkServer.sh start
      • zkServer.sh status

Hadoop Setup

  • on node01

    • vi $HADOOP_HOME/etc/hadoop/core-site.xml
    • <property>
          <name>fs.defaultFS</name>
          <value>hdfs://mycluster</value>
      </property>
      <property>
          <name>ha.zookeeper.quorum</name>
          <value>node02:2181,node03:2181,node04:2181</value>
      </property>
          
      
    • vi hdfs-site.xml
    • <property>
          <name>dfs.nameservices</name>
          <value>mycluster</value>
      </property>
      <property>
          <name>dfs.ha.namenodes.mycluster</name>
          <value>nn1,nn2</value>
      </property>
      <property>
          <name>dfs.namenode.rpc-address.mycluster.nn1</name>
          <value>node01:8020</value>
      </property>
      <property>
          <name>dfs.namenode.rpc-address.mycluster.nn2</name>
          <value>node02:8020</value>
      </property>
      <property>
          <name>dfs.namenode.http-address.mycluster.nn1</name>
          <value>node01:50070</value>
      </property>
      <property>
          <name>dfs.namenode.http-address.mycluster.nn2</name>
          <value>node02:50070</value>
      </property>
      <property>
          <name>dfs.namenode.shared.edits.dir</name>
          <value>qjournal://node01:8485;node02:8485;node03:8485/mycluster</value>
      </property>
      <property>
          <name>dfs.journalnode.edits.dir</name>
          <value>/var/bigdata/hadoop/ha/dfs/jn</value>
      </property>
      <property>
          <name>dfs.client.failover.proxy.provider.mycluster</name>
          <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
      <property>
          <name>dfs.ha.fencing.methods</name>
          <value>sshfence</value>
      </property>
      <property>
          <name>dfs.ha.fencing.ssh.private-key-files</name>
          <value>/root/.ssh/id_dsa</value>
      </property>
      <property>
          <name>dfs.ha.automatic-failover.enabled</name>
          <value>true</value>
      </property>
    • scp core-site.xml hdfs-site.xml node02:`pwd`

    repeat the scp for node03 and node04 so every node has the same configuration

Experiment Steps

  • start JournalNode

    • on node01
      • hadoop-daemon.sh start journalnode
      • jps
        • 8543 JournalNode
    • on node02
      • hadoop-daemon.sh start journalnode
    • on node03
      • hadoop-daemon.sh start journalnode
      • cd $HADOOP_HOME/logs
        • tail -f hadoop-root-journalnode-node03.log
          • INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8485: starting
  • Format the NameNode

    • on node01
      • hdfs namenode -format
        • Storage directory /var/bigdata/hadoop/ha/dfs/name has been successfully formatted.
        • then every JournalNode creates /var/bigdata/hadoop/ha/dfs/jn/mycluster/current on its own machine; the VERSION file there matches the NameNode's
    • start NameNode and its Standby
      • on node01
        • hadoop-daemon.sh start namenode
        • note the contents of its VERSION file
      • on node02
        • hdfs namenode -bootstrapStandby
          • a new folder /var/bigdata/hadoop/ha/dfs/name/current is created, with the same VERSION as the active NameNode
  • Format ZK

    • on node04
      • zkCli.sh
        • ls /
          • [zookeeper]
    • on node01
      • hdfs zkfc -formatZK
    • back to node04
      • zkCli.sh
        • ls /
          • [hadoop-ha, zookeeper]
  • on node01

    • start-dfs.sh
  • open 192.168.157.11:50070
  • open 192.168.157.12:50070
  • on node04
    • ls /hadoop-ha/mycluster
      • [ActiveBreadCrumb, ActiveStandbyElectorLock]
    • get /hadoop-ha/mycluster/ActiveStandbyElectorLock
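The steps above stop at inspecting the ZooKeeper lock node. As an optional extra check (not part of the original steps, and assuming nn1 on node01 is currently active), automatic failover can be exercised with hdfs haadmin:

```shell
# Ask each NameNode for its HA state: one should report active, one standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# On node01, stop the active NameNode ...
hadoop-daemon.sh stop namenode

# ... the standby should be promoted within seconds
hdfs haadmin -getServiceState nn2   # expect: active

# Restart the stopped NameNode; it rejoins the cluster as standby
hadoop-daemon.sh start namenode
```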