Hadoop Basics Tutorial - Chapter 3 HDFS: Distributed File System (3.4 HDFS Cluster Mode)

2022-05-06 18:33:04

Chapter 3 HDFS: Distributed File System

3.4 HDFS Cluster Mode

Node     IP                Roles
node1    192.168.80.131    NameNode, DataNode
node2    192.168.80.132    SecondaryNameNode, DataNode
node3    192.168.80.133    DataNode

3.4.1 Hadoop Environment Variables

[root@node1 ~]# vi /etc/profile.d/custom.sh

# Hadoop path
export HADOOP_HOME=/opt/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

[root@node1 ~]# source /etc/profile.d/custom.sh

[Added 2018-01-27] Thanks to 吴家行hang for the reminder: node2 and node3 need the same environment variable configuration.
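As a quick sanity check after sourcing the file, the sketch below verifies that HADOOP_HOME is set and that its bin and sbin directories are on PATH. The function name check_hadoop_env is hypothetical and not part of Hadoop; it only inspects the current shell's variables.

```shell
# check_hadoop_env is a hypothetical helper, not a Hadoop command.
# It returns 0 when HADOOP_HOME is set and both its bin and sbin
# directories appear in PATH; otherwise it prints what is missing.
check_hadoop_env() {
    if [ -z "${HADOOP_HOME:-}" ]; then
        echo "HADOOP_HOME is not set"
        return 1
    fi
    rc=0
    for dir in "$HADOOP_HOME/bin" "$HADOOP_HOME/sbin"; do
        case ":$PATH:" in
            *":$dir:"*) ;;                              # directory is on PATH
            *) echo "$dir is not in PATH"; rc=1 ;;
        esac
    done
    return "$rc"
}
```

On a configured node, running check_hadoop_env followed by hadoop version should print the Hadoop version banner; the same check can be repeated on node2 and node3 once the file is in place there.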

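3.4.2 Preparation

Because node1 was previously set up in Hadoop single-node mode, stop all Hadoop services and clear the data directory before continuing. This is also a good moment to verify the Hadoop environment variables set above.

Clear the Hadoop data directory: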

[root@node1 ~]# rm -rf /tmp/hadoop-root/
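The stop-and-clear step can be sketched as one function. This is only an illustration: reset_local_hadoop is a hypothetical name, and the sketch assumes that only HDFS was started earlier (if YARN was also running, stop-yarn.sh would be needed as well).

```shell
# reset_local_hadoop is a hypothetical helper. It refuses to touch "/",
# stops HDFS if the stop-dfs.sh script is on PATH, then removes the
# given data directory (default: /tmp/hadoop-root, as cleared above).
reset_local_hadoop() {
    data_dir="${1:-/tmp/hadoop-root}"
    if [ "$data_dir" = "/" ]; then
        echo "refusing to remove '$data_dir'"
        return 1
    fi
    if command -v stop-dfs.sh >/dev/null 2>&1; then
        stop-dfs.sh        # stops NameNode, DataNode and SecondaryNameNode
    fi
    rm -rf -- "$data_dir"
}
```

Calling reset_local_hadoop with no argument targets /tmp/hadoop-root, the directory removed in the command above.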

3.4.2 core-site.xml

[root@node1 ~]# cd /opt/hadoop-2.7.3/etc/hadoop/
[root@node1 hadoop]# vi core-site.xml

The contents of core-site.xml are as follows:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/data/hadoop</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>65536</value>
    </property>
</configuration>
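To confirm what a configuration file actually contains before starting any daemon, a value can be read back from it. On a node with the environment set up, hdfs getconf -confKey fs.defaultFS reports the effective value. The sketch below is a file-only alternative; get_prop is a hypothetical helper that assumes each <name> and <value> element sits on its own line, as in the listing above.

```shell
# get_prop is a hypothetical helper, not part of Hadoop.
# Usage: get_prop <config-file> <property-name>
# It is not a general XML parser: it expects one <name> or <value>
# element per line and prints the value that follows a matching name.
get_prop() {
    awk -v key="$2" '
        /<name>/  { name = $0; gsub(/.*<name>|<\/name>.*/, "", name) }
        /<value>/ { value = $0; gsub(/.*<value>|<\/value>.*/, "", value)
                    if (name == key) print value }
    ' "$1"
}
```

For example, get_prop /opt/hadoop-2.7.3/etc/hadoop/core-site.xml fs.defaultFS should print hdfs://node1:9000 for the file shown above.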

3.4.3 hdfs-site.xml

[root@node1 hadoop]# vi hdfs-site.xml

The contents of hdfs-site.xml are as follows:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node2:50090</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>node2:50091</value>
    </property>    
</configuration>

3.4.4 slaves

Edit the slaves file:

[root@node1 hadoop]# vi slaves

Set the contents of the slaves file to:

node1
node2
node3
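With dfs.replication set to 3 and three hosts in the slaves file, every block can get its full set of replicas. If the replication factor exceeded the number of DataNodes, blocks would stay under-replicated. The sketch below compares the two numbers; count_workers and check_replication are hypothetical helper names.

```shell
# Both functions are hypothetical helpers, not Hadoop commands.

# Count the lines in the slaves file that are not empty or blank.
count_workers() {
    grep -c '[^[:space:]]' "$1" || true   # grep -c exits 1 on a count of 0
}

# $1 = slaves file, $2 = dfs.replication value.
# Returns non-zero when the replication factor exceeds the host count.
check_replication() {
    workers=$(count_workers "$1")
    if [ "$2" -gt "$workers" ]; then
        echo "dfs.replication=$2 exceeds the $workers DataNode(s) listed in $1"
        return 1
    fi
    echo "dfs.replication=$2 fits $workers DataNode(s)"
}
```

For the files in this section, check_replication /opt/hadoop-2.7.3/etc/hadoop/slaves 3 should report that the value fits.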

3.4.5 Distributing Files

Copy the Hadoop installation to node2 and node3:

[root@node1 ~]# scp -r /opt/hadoop-2.7.3/ node2:/opt
[root@node1 ~]# scp -r /opt/hadoop-2.7.3/ node3:/opt

Copy the environment variable file to node2 and node3:

[root@node1 ~]# scp /etc/profile.d/custom.sh node2:/etc/profile.d
[root@node1 ~]# scp /etc/profile.d/custom.sh node3:/etc/profile.d

Finally, source the file on each node:

[root@node2 ~]# source /etc/profile.d/custom.sh
[root@node3 ~]# source /etc/profile.d/custom.sh
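The copy steps above repeat the same two commands per host, so they can be sketched as a loop. The helper names run and distribute and the DRY_RUN switch are hypothetical; the sketch assumes, as the scp commands above do, that node1 can reach the other hosts by name over SSH.

```shell
# run and distribute are hypothetical helpers.
# With DRY_RUN=1 the commands are only printed; otherwise they execute.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy the Hadoop installation and the environment file to each host
# given as an argument, using the paths from this section.
distribute() {
    for host in "$@"; do
        run scp -r /opt/hadoop-2.7.3/ "$host:/opt"
        run scp /etc/profile.d/custom.sh "$host:/etc/profile.d"
    done
}
```

Running DRY_RUN=1 distribute node2 node3 prints the four scp commands without copying anything, which makes it easy to review them before the real run.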

3.4.6 Formatting the NameNode

[root@node1 ~]# hdfs namenode -format
************************************************************/
17/05/14 09:17:28 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
17/05/14 09:17:28 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-29bae3d3-1786-4428-8359-077976fe15e5
17/05/14 09:17:30 INFO namenode.FSNamesystem: No KeyProvider found.
17/05/14 09:17:30 INFO namenode.FSNamesystem: fsLock is fair:true
17/05/14 09:17:30 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
17/05/14 09:17:30 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
17/05/14 09:17:30 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
17/05/14 09:17:30 INFO blockmanagement.BlockManager: The block deletion will start around 2017 May 14 09:17:30
17/05/14 09:17:30 INFO util.GSet: Computing capacity for map BlocksMap
17/05/14 09:17:30 INFO util.GSet: VM type       = 64-bit
17/05/14 09:17:30 INFO util.GSet: 2.0% max memory 966.7 MB = 19.3 MB
17/05/14 09:17:30 INFO util.GSet: capacity      = 2^21 = 2097152 entries
17/05/14 09:17:30 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
17/05/14 09:17:30 INFO blockmanagement.BlockManager: defaultReplication         = 3
17/05/14 09:17:30 INFO blockmanagement.BlockManager: maxReplication             = 512
17/05/14 09:17:30 INFO blockmanagement.BlockManager: minReplication             = 1
17/05/14 09:17:30 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
17/05/14 09:17:30 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
17/05/14 09:17:30 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
17/05/14 09:17:30 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
17/05/14 09:17:30 INFO namenode.FSNamesystem: fsOwner             = root (auth:SIMPLE)
17/05/14 09:17:30 INFO namenode.FSNamesystem: supergroup          = supergroup
17/05/14 09:17:30 INFO namenode.FSNamesystem: isPermissionEnabled = true
17/05/14 09:17:30 INFO namenode.FSNamesystem: HA Enabled: false
17/05/14 09:17:30 INFO namenode.FSNamesystem: Append Enabled: true
17/05/14 09:17:31 INFO util.GSet: Computing capacity for map INodeMap
17/05/14 09:17:31 INFO util.GSet: VM type       = 64-bit
17/05/14 09:17:31 INFO util.GSet: 1.0% max memory 966.7 MB = 9.7 MB
17/05/14 09:17:31 INFO util.GSet: capacity      = 2^20 = 1048576 entries
17/05/14 09:17:31 INFO namenode.FSDirectory: ACLs enabled? false
17/05/14 09:17:31 INFO namenode.FSDirectory: XAttrs enabled? true
17/05/14 09:17:31 INFO namenode.FSDirectory: Maximum size of an xattr: 16384
17/05/14 09:17:31 INFO namenode.NameNode: Caching file names occuring more than 10 times
17/05/14 09:17:31 INFO util.GSet: Computing capacity for map cachedBlocks
17/05/14 09:17:31 INFO util.GSet: VM type       = 64-bit
17/05/14 09:17:31 INFO util.GSet: 0.25% max memory 966.7 MB = 2.4 MB
17/05/14 09:17:31 INFO util.GSet: capacity      = 2^18 = 262144 entries
17/05/14 09:17:31 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
17/05/14 09:17:31 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
17/05/14 09:17:31 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
17/05/14 09:17:31 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
17/05/14 09:17:31 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
17/05/14 09:17:31 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
17/05/14 09:17:31 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
17/05/14 09:17:31 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
17/05/14 09:17:31 INFO util.GSet: Computing capacity for map NameNodeRetryCache
17/05/14 09:17:31 INFO util.GSet: VM type       = 64-bit
17/05/14 09:17:31 INFO util.GSet: 0.029999999329447746% max memory 966.7 MB = 297.0 KB
17/05/14 09:17:31 INFO util.GSet: capacity      = 2^15 = 32768 entries
17/05/14 09:17:31 INFO namenode.FSImage: Allocated new BlockPoolId: BP-698786385-192.168.80.131-1494767851416
17/05/14 09:17:31 INFO common.Storage: Storage directory /var/data/hadoop/dfs/name has been successfully formatted.
17/05/14 09:17:31 INFO namenode.FSImageFormatProtobuf: Saving image file /var/data/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
17/05/14 09:17:31 INFO namenode.FSImageFormatProtobuf: Image file /var/data/hadoop/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 351 bytes saved in 0 seconds.
17/05/14 09:17:31 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/05/14 09:17:31 INFO util.ExitUtil: Exiting with status 0
17/05/14 09:17:31 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at node1/192.168.80.131
************************************************************/

[root@node1 ~]# ll /var/data/hadoop/dfs/name/current/
total 16
-rw-r--r-- 1 root root 351 May 14 09:17 fsimage_0000000000000000000
-rw-r--r-- 1 root root  62 May 14 09:17 fsimage_0000000000000000000.md5
-rw-r--r-- 1 root root   2 May 14 09:17 seen_txid
-rw-r--r-- 1 root root 206 May 14 09:17 VERSION
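Each format run generates a new cluster ID (see the clusterid line in the log), so it can be useful to check whether a name directory is already formatted before running the command again. A minimal sketch, assuming the directory layout shown in the listing above; is_formatted is a hypothetical helper name.

```shell
# is_formatted is a hypothetical helper. It returns 0 when the given
# NameNode directory contains current/VERSION and at least one
# current/fsimage_* file, and 1 otherwise.
is_formatted() {
    current="$1/current"
    [ -f "$current/VERSION" ] || return 1
    for f in "$current"/fsimage_*; do
        [ -f "$f" ] && return 0
    done
    return 1
}
```

With the configuration in this section, is_formatted /var/data/hadoop/dfs/name should succeed after the format step and fail on a fresh machine.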

3.4.7 Starting HDFS

[root@node1 ~]# start-dfs.sh
Starting namenodes on [node1]
node1: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-node1.out
node2: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-node2.out
node3: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-node3.out
node1: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-node1.out
Starting secondary namenodes [node2]
node2: starting secondarynamenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-secondaryn
[root@node1 ~]# 

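Check the Java processes running on each of the three nodes. According to the role table at the start of this section, node1 should run NameNode and DataNode, node2 should run SecondaryNameNode and DataNode, and node3 should run DataNode.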
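The JDK's jps tool lists running Java processes as "<pid> <ClassName>" lines. The sketch below compares such output against the daemons expected on a node; missing_daemons is a hypothetical helper, and the expected names come from the role table at the start of this section.

```shell
# missing_daemons is a hypothetical helper. $1 is the text printed by
# jps; the remaining arguments are the daemon names expected on that
# node. It prints every expected name that is absent and returns
# non-zero if at least one is missing.
missing_daemons() {
    jps_output="$1"
    shift
    rc=0
    for daemon in "$@"; do
        # Compare the second field exactly, so "NameNode" does not
        # match a "SecondaryNameNode" line.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
            rc=1
        fi
    done
    return "$rc"
}
```

On node1, for instance, missing_daemons "$(jps)" NameNode DataNode should print nothing when both daemons are up. For remote nodes the jps output could be collected over SSH, assuming jps is on the PATH of non-interactive sessions.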

3.4.8 HDFS Web UI

Open http://192.168.80.131:50070 in a browser.

Under "Datanodes" you can see information for all three DataNode nodes.
