Happy Path: Set Up an HA Solution for HDFS under Fully Distributed Operation
This article covers setting up an HA solution for HDFS on a Hadoop cluster running under fully distributed operation.
source link
Prerequisite
- Have a Hadoop cluster already running under fully distributed operation
Abstract
- Active & Standby NameNodes (NN)
- JournalNodes (JN, at least 3 and an odd number) to sync the edit log between the NNs
- ZKFC (ZKFailoverController) to coordinate the Active and Standby NNs
- DataNodes (DN) report to the Active and Standby NNs simultaneously
Cluster Design
nodes | ip | NN | JN | DN | ZKFC | ZK |
---|---|---|---|---|---|---|
node01 | 192.168.157.11 | ✅ | ✅ | | ✅ | |
node02 | 192.168.157.12 | ✅ | ✅ | ✅ | ✅ | ✅ |
node03 | 192.168.157.13 | | ✅ | ✅ | | ✅ |
node04 | 192.168.157.14 | | | ✅ | | ✅ |
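All of the steps below assume these hostnames resolve to the IPs in the table on every node. If you rely on /etc/hosts rather than DNS, a minimal sketch (run as root on each node; skip entries you already have):

```bash
# Append the cluster hostnames from the design table above
cat >> /etc/hosts <<'EOF'
192.168.157.11 node01
192.168.157.12 node02
192.168.157.13 node03
192.168.157.14 node04
EOF
```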
Setup
Java Setup
- Make sure each node has the JDK installed and JAVA_HOME exported in /etc/profile, then reload the profile and verify:
. /etc/profile
jps
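A quick way to check all four nodes at once; a minimal sketch, assuming root SSH access to node01..node04 from the node you run it on:

```bash
# Confirm the JDK and PATH are in place on every node
for n in node01 node02 node03 node04; do
  echo "== $n =="
  ssh "$n" 'source /etc/profile; java -version 2>&1 | head -n 1; jps'
done
```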
Passwordless SSH Setup
- Set up passwordless SSH between node01 and node02
- This is needed because the ZKFC on each NN host must be able to check on (and fence) the other NN
- On each of the two nodes, in the ~/.ssh directory, run:
ssh-keygen -t dsa -P '' -f ./id_dsa
cat id_dsa.pub >> authorized_keys
ssh localhost
- to verify that no password is needed
- On node02, copy its public key over to node01:
scp id_dsa.pub node01:`pwd`/node02.pub
- On node01:
cat node02.pub >> authorized_keys
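A quick check that the key exchange worked; a minimal sketch, run from node02:

```bash
# Both commands should print the remote hostname without prompting for a password
ssh node01 hostname
ssh localhost hostname
```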
ZooKeeper Setup
- download the ZooKeeper release (tar.gz) from the Apache ZooKeeper site
- upload the downloaded tar.gz to node02 first
- in node02's home (~) folder, run:
tar -zxvf apache-zookeeper.... -C /opt/bigdata
cd /opt/bigdata/zookeeper../conf
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
- set the data directory and the quorum members:
dataDir=/var/bigdata/hadoop/zk
server.1=node02:2888:3888
server.2=node03:2888:3888
server.3=node04:2888:3888
- create the data directory:
mkdir -p /var/bigdata/hadoop/zk
- in that zk folder, write the myid file:
echo <id> >> myid
- the id must match the server.<id> line for this host in zoo.cfg (1 on node02); see the sketch after this list
vi /etc/profile
- add the ZooKeeper bin directory to the PATH variable
- spread the ZooKeeper folder to node03 and node04:
scp -r apache-zooke... node03:`pwd`
- (repeat the scp for node04)
- on node03 and node04:
mkdir -p /var/bigdata/hadoop/zk
echo <id> > myid
- 2 on node03, 3 on node04
vi /etc/profile
- add the ZooKeeper bin directory to the PATH variable
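The myid files on the three ZooKeeper nodes must mirror the server.<id> lines in zoo.cfg (node02=1, node03=2, node04=3). A minimal sketch, assuming passwordless SSH to the three nodes and identical paths everywhere:

```bash
# Create the data dir and write the myid that matches each server.<id> entry in zoo.cfg
id=1
for n in node02 node03 node04; do
  ssh "$n" "mkdir -p /var/bigdata/hadoop/zk && echo $id > /var/bigdata/hadoop/zk/myid"
  id=$((id + 1))
done
```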
ZK Setup Verification
- verify that the ZooKeeper quorum comes up
- on node02:
zkServer.sh start
Starting zookeeper ... STARTED
zkServer.sh status
It is probably not running.
jps
7384 QuorumPeerMain
- the process has started, but the service is not available yet: with only one of the three servers up there is no quorum
- on node03:
zkServer.sh start
- on each node, make sure the firewall is stopped and disabled:
systemctl stop firewalld
systemctl disable firewalld
- on node02:
zkServer.sh status
Mode: follower
- on node03:
zkServer.sh status
Mode: leader
- on node04:
zkServer.sh start
zkServer.sh status
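To check the whole quorum at once; a minimal sketch, assuming passwordless SSH and the ZooKeeper bin directory on each node's PATH:

```bash
# Expect one "Mode: leader" and two "Mode: follower"
for n in node02 node03 node04; do
  echo "== $n =="
  ssh "$n" 'source /etc/profile && zkServer.sh status 2>&1 | grep Mode'
done
```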
Hadoop Setup
- on node01:
vi $HADOOP_HOME/etc/hadoop/core-site.xml
<property> <name>fs.defaultFS</name> <value>hdfs://mycluster</value> </property>
<property> <name>ha.zookeeper.quorum</name> <value>node02:2181,node03:2181,node04:2181</value> </property>
vi hdfs-site.xml
```
<property> <name>dfs.nameservices</name> <value>mycluster</value> </property>
<property> <name>dfs.ha.namenodes.mycluster</name> <value>nn1,nn2</value> </property>
<property> <name>dfs.namenode.rpc-address.mycluster.nn1</name> <value>node01:8020</value> </property>
<property> <name>dfs.namenode.rpc-address.mycluster.nn2</name> <value>node02:8020</value> </property>
<property> <name>dfs.namenode.http-address.mycluster.nn1</name> <value>node01:50070</value> </property>
<property> <name>dfs.namenode.http-address.mycluster.nn2</name> <value>node02:50070</value> </property>
<property> <name>dfs.namenode.shared.edits.dir</name> <value>qjournal://node01:8485;node02:8485;node03:8485/mycluster</value> </property>
<property> <name>dfs.journalnode.edits.dir</name> <value>/var/bigdata/hadoop/ha/dfs/jn</value> </property>
<property> <name>dfs.client.failover.proxy.provider.mycluster</name> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value> </property>
<property> <name>dfs.ha.fencing.methods</name> <value>sshfence</value> </property>
<property> <name>dfs.ha.fencing.ssh.private-key-files</name> <value>/root/.ssh/id_dsa</value> </property>
<property> <name>dfs.ha.automatic-failover.enabled</name> <value>true</value> </property>
```
scp core-site.xml hdfs-site.xml node02:`pwd`
- transfer the two config files to the other nodes as well (a sketch follows below)
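A minimal sketch of that distribution step, assuming $HADOOP_HOME points to the same path on every node and passwordless SSH from node01:

```bash
# Push the updated HA configs from node01 to the other nodes
cd $HADOOP_HOME/etc/hadoop
for n in node02 node03 node04; do
  scp core-site.xml hdfs-site.xml "$n:$(pwd)/"
done
```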
Experiment Steps
- start the JournalNodes (they must be running before the NameNode is formatted)
- on node01:
hadoop-daemon.sh start journalnode
jps
8543 JournalNode
- on node02:
hadoop-daemon.sh start journalnode
- on node03:
hadoop-daemon.sh start journalnode
cd $HADOOP_HOME/logs
tail -f hadoop-root-journalnode-node03.log
INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8485: starting
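To confirm all three JournalNodes are up in one pass; a minimal sketch, assuming passwordless SSH from node01 (each node's log should show the "IPC Server listener on 8485: starting" line above):

```bash
# Every node should print a JournalNode process id
for n in node01 node02 node03; do
  echo "== $n =="
  ssh "$n" 'source /etc/profile && jps | grep JournalNode'
done
```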
- Reformat the NameNode
- on node01:
hdfs namenode -format
Storage directory /var/bigdata/hadoop/ha/dfs/name has been successfully formatted.
- after formatting, every JournalNode creates the folder /var/bigdata/hadoop/ha/dfs/jn/mycluster/current on its own machine; the VERSION file there matches the NameNode's
- start the NameNode and its Standby
- on node01:
hadoop-daemon.sh start namenode
- the NameNode must be running first so the standby knows which VERSION (metadata) to copy
- on node02:
hdfs namenode -bootstrapStandby
- a new folder /var/bigdata/hadoop/ha/dfs/name/current is created on node02, with the same VERSION as node01
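A quick sketch to confirm the metadata really is in sync (run from node01, assuming passwordless SSH): the clusterID written by the format should appear unchanged in the NameNode's and the JournalNodes' VERSION files.

```bash
# clusterID should be identical across the NameNode and JournalNode storage dirs
grep clusterID /var/bigdata/hadoop/ha/dfs/name/current/VERSION
grep clusterID /var/bigdata/hadoop/ha/dfs/jn/mycluster/current/VERSION
ssh node02 'grep clusterID /var/bigdata/hadoop/ha/dfs/jn/mycluster/current/VERSION'
ssh node03 'grep clusterID /var/bigdata/hadoop/ha/dfs/jn/mycluster/current/VERSION'
```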
- Format ZK (initialize the HA znode in ZooKeeper)
- on node04:
zkCli.sh
ls /
[zookeeper]
- on node01:
hdfs zkfc -formatZK
- back on node04:
zkCli.sh
ls /
[hadoop-ha, zookeeper]
- start HDFS
- on node01:
start-dfs.sh
- open 192.168.157.11:50070 and 192.168.157.12:50070 in a browser; one NameNode should show as active and the other as standby
- on node04, in zkCli.sh:
ls /hadoop-ha/mycluster
[ActiveBreadCrumb, ActiveStandbyElectorLock]
get /hadoop-ha/mycluster/ActiveStandbyElectorLock
- the data in the lock znode identifies which NameNode currently holds the active role
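The HA roles can also be confirmed from the command line; a quick sketch (nn1/nn2 are the NameNode IDs defined in hdfs-site.xml above):

```bash
# One NameNode should report "active", the other "standby"
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```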