
This article covers how to set up Hadoop in Pseudo-Distributed Operation mode (a single-node cluster where each Hadoop daemon runs in its own Java process).

reference

Setup

  1. download hadoop-2.6.5 from this link
  2. transfer the tar.gz file into the VM
  3. mkdir /opt/bigdata
  4. tar -zxvf hadoop-2.6.5.tar.gz -C /opt/bigdata
  5. vi /etc/profile

    export HADOOP_HOME=/opt/bigdata/hadoop-2.6.5
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
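    to make the new variables visible in the current shell, reload the profile (not listed as an explicit step above, but usually needed before the hadoop command is found):

    # reload the profile and sanity-check the variables
    source /etc/profile
    echo $HADOOP_HOME    # should print /opt/bigdata/hadoop-2.6.5
    hadoop version       # should report Hadoop 2.6.5
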
  6. cd $HADOOP_HOME/etc/hadoop
  7. vi hadoop-env.sh

    replace JAVA_HOME with the value you get from the following command (a hard-coded path is needed because daemons started over ssh do not inherit your shell environment)

    dirname $(dirname $(readlink -f $(which javac)))

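    for example, with OpenJDK 8 the command usually prints something like /usr/lib/jvm/java-1.8.0-openjdk (the exact path depends on your distribution and JDK version), so the edited line in hadoop-env.sh would look like:

    # example only -- use whatever path the command above printed on your VM
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
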
  8. vi core-site.xml

    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://node01:9000</value>
    </property>
    

    this sets fs.defaultFS, the address and port the NameNode listens on (hdfs://node01:9000)
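
    note that fs.defaultFS uses the hostname node01, so node01 must resolve on the VM itself; an /etc/hosts entry along these lines does the job (the IP is only an example, use your VM's real address):

    echo '192.168.56.101 node01' >> /etc/hosts   # example IP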

  9. vi hdfs-site.xml

    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/data</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node01:50090</value>
    </property>
    <property>
        <name>dfs.namenode.checkpoint.dir</name>
        <value>/var/bigdata/hadoop/local/dfs/secondary</value>
    </property>
    

    this sets the replication factor to 1 (there is only one DataNode) and moves the NameNode, DataNode, and SecondaryNameNode directories under /var/bigdata, since the default location is under /tmp, which the OS may clean out at any time.
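
    a quick way to confirm Hadoop is actually picking these values up (a small sketch, assuming the config files above are in place):

    hdfs getconf -confKey dfs.replication         # expect 1
    hdfs getconf -confKey dfs.namenode.name.dir   # expect /var/bigdata/hadoop/local/dfs/name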

  10. vi slaves

    localhost
    
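    one thing worth checking before the Run section: start-dfs.sh launches the daemons over ssh for every host listed in slaves, so ssh localhost should work without a password. If it prompts for one, a key pair along these lines is the usual fix (a sketch, not part of the original steps):

    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
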

Run

  1. hdfs namenode -format

    1. creates the configured directories if they do not exist
    2. writes an empty fsimage
    3. the VERSION file records the clusterID
    Image file /var/bigdata/hadoop/local/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 321 bytes saved in 0 seconds.
       
    
  2. cd /var/bigdata/hadoop/local/dfs

    only the name folder has been created at this point; data and secondary appear once the daemons are running

  3. cat name/current/VERSION

    pay attention to the clusterID; the DataNode must report the same clusterID to be accepted by this NameNode
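
    the file typically looks something like this (every value below is made up; only the layout matters):

    namespaceID=1234567890
    clusterID=CID-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    cTime=0
    storageType=NAME_NODE
    blockpoolID=BP-xxxxxxxxxx-127.0.0.1-1500000000000
    layoutVersion=-60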

  4. start-dfs.sh
  5. jps

    [root@localhost dfs]# jps
    7760 NameNode
    8114 Jps
    8009 SecondaryNameNode
    7870 DataNode
    
  6. ls /var/bigdata/hadoop/local/dfs

    data  name  secondary
    
    1. each of these folders has its own VERSION file, too
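    a handy check is that the DataNode carries the same clusterID as the NameNode (a mismatch is the classic symptom of reformatting the NameNode while an old data folder is still around):

    # run from /var/bigdata/hadoop/local/dfs
    grep clusterID name/current/VERSION data/current/VERSION
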
  7. on the Windows host, add a node01 entry to C:\Windows\System32\drivers\etc\hosts so the browser can resolve the hostname
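
    the entry simply maps the VM's IP address to the hostname, for example (example IP only):

    192.168.56.101  node01
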
  8. in the browser, open <ip of node01>:50070 to reach the NameNode web UI
  9. hdfs dfs -mkdir -p /user/root

    creates the home directory for root in HDFS; relative HDFS paths (like the uploads below) resolve against /user/root
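
    a quick check that it exists (not in the original steps):

    hdfs dfs -ls /user    # should list /user/root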

  10. hdfs dfs -put <file_to_upload> <target folder in hdfs>

    uploads a local file to HDFS

  11. cd /var/.../current/finalized/subdir0/subdir0

    inspect the block files of the uploaded file on the DataNode's local disk
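
    the exact subdirectory layout is easy to forget; searching the data dir configured earlier for block files also works (a sketch):

    find /var/bigdata/hadoop/local/dfs/data -name 'blk_*'
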
  12. for i in `seq 100000`; do echo "hello hadoop $i" >> data.txt; done
  13. hdfs dfs -D dfs.blocksize=1048576 -put data.txt

    with no destination given, the file is uploaded to /user/root/ by default, using a 1 MB block size instead of the default 128 MB
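
    to confirm the smaller block size really split the file into several blocks, fsck can list them (a sketch; the path assumes the data.txt generated in step 12 was uploaded):

    hdfs fsck /user/root/data.txt -files -blocks -locations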