I. Preparation
1. Download
https://www.apache.org/dyn/closer.cgi/hadoop/common
2. Reference documentation
https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/ClusterSetup.html
3. Passwordless SSH setup
https://cloud.tencent.com/developer/article/1818611
4. ZooKeeper installation
https://cloud.tencent.com/developer/article/1818643
5. Cluster role assignment
Hadoop role | Cluster nodes |
---|---|
NameNode | hadoop001, hadoop002 |
DataNode | hadoop003, hadoop004, hadoop005 |
JournalNode | hadoop001, hadoop002, hadoop003 |
ResourceManager | hadoop001, hadoop002 |
NodeManager | hadoop003, hadoop004, hadoop005 |
DFSZKFailoverController | hadoop001, hadoop002 |
II. Extract and install
Extract the archive:
    cd /usr/local/hadoop
    tar zxpf hadoop-3.0.0.tar.gz
Create a symbolic link, so that later paths do not depend on the version number:
    ln -s hadoop-3.0.0 hadoop
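The two steps above can be wrapped in one repeatable function. This is only a sketch: `install_hadoop` is a hypothetical helper name, and it assumes the release tarball has already been downloaded into the base directory.

```shell
# Sketch: unpack a Hadoop release and point a stable "hadoop" symlink at it.
# install_hadoop is a hypothetical helper, not something Hadoop ships.
# The body runs in a subshell, so the caller's working directory is untouched.
install_hadoop() (
    base_dir=$1      # /usr/local/hadoop in this article
    version=$2       # 3.0.0 in this article
    cd "$base_dir" || exit 1
    tar zxpf "hadoop-$version.tar.gz" || exit 1
    # -n replaces an existing link instead of creating a new link inside its target
    ln -sfn "hadoop-$version" hadoop
)

# Usage:
#   install_hadoop /usr/local/hadoop 3.0.0
```

Using `ln -sfn` rather than plain `ln -s` lets the function be run again, for example after an upgrade to a newer version, without failing on the existing link.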
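III. Environment variables
Edit /etc/profile:
    vim /etc/profile
Add the following lines, then run source /etc/profile so that they take effect in the current shell. The sbin directory is included because start-all.sh, used later, lives there:
    export HADOOP_HOME=/usr/local/hadoop/hadoop
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin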
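If the same profile change has to be applied on all five nodes, a small idempotent function avoids duplicate entries. The sketch below uses a hypothetical helper name, `add_hadoop_env`, and adds `sbin` to `PATH` on the assumption that the start scripts should be callable without a full path.

```shell
# Sketch: append the Hadoop variables to a profile file exactly once.
# add_hadoop_env is a hypothetical helper name.
add_hadoop_env() {
    profile=$1        # /etc/profile in this article
    hadoop_home=$2    # /usr/local/hadoop/hadoop in this article
    # Do nothing if an earlier run already wrote HADOOP_HOME
    grep -q '^export HADOOP_HOME=' "$profile" 2>/dev/null && return 0
    {
        echo "export HADOOP_HOME=$hadoop_home"
        # Single quotes keep the variables unexpanded in the file
        echo 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin'
    } >> "$profile"
}

# Usage:
#   add_hadoop_env /etc/profile /usr/local/hadoop/hadoop && . /etc/profile
#   hadoop version
```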
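IV. Edit the configuration files
1. Check disk space
First check how much space each mount point has, so that Hadoop data does not end up on a small volume:
    df -h
On this machine the disk totals 800 GB, of which 741 GB is mounted at /home, so every data directory configured below starts with /home.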
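The same check can be scripted when it has to be repeated on several nodes. The sketch below reads `df -Pk` output and prints the mount point with the most available space; `roomiest_mount` is a hypothetical helper name, and the parsing assumes mount points contain no spaces.

```shell
# Sketch: pick the mount point with the most free space.
# Expects POSIX-format df output on stdin (df -Pk), where column 4 is the
# available space in KB and column 6 is the mount point.
roomiest_mount() {
    awk 'NR > 1 && $4 + 0 > max { max = $4 + 0; mount = $6 } END { print mount }'
}

# Usage:
#   df -Pk | roomiest_mount
```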
2. Edit the configuration files
The files below live in $HADOOP_HOME/etc/hadoop.
workers
    hadoop003
    hadoop004
    hadoop005
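The workers file is a plain list of hostnames, one per line, which the start scripts read to decide where the DataNode and NodeManager daemons run. A minimal sketch for generating it, using a hypothetical helper named `write_workers`:

```shell
# Sketch: write one hostname per line into the workers file.
# write_workers is a hypothetical helper name.
write_workers() {
    workers_file=$1
    shift
    # Refuse to write an empty list
    [ $# -gt 0 ] || return 1
    printf '%s\n' "$@" > "$workers_file"
}

# Usage:
#   write_workers "$HADOOP_HOME/etc/hadoop/workers" hadoop003 hadoop004 hadoop005
```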
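core-site.xml
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/cluster/hadoop/data/tmp</value>
      </property>
      <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
        <description>Size of read/write buffer used in SequenceFiles</description>
      </property>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
        <description>ZooKeeper quorum used by DFSZKFailoverController</description>
      </property>
    </configuration>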
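After editing core-site.xml it can help to confirm that Hadoop actually resolves the values you expect. `hdfs getconf -confKey` prints the effective value of a key; the wrapper below is only a sketch, and `check_conf` is a hypothetical helper name.

```shell
# Sketch: compare the effective value of a configuration key with an
# expected value. check_conf is a hypothetical helper name.
check_conf() {
    actual=$(hdfs getconf -confKey "$1") || return 1
    if [ "$actual" = "$2" ]; then
        echo "OK   $1=$actual"
    else
        echo "DIFF $1=$actual (expected $2)"
        return 1
    fi
}

# Usage:
#   check_conf fs.defaultFS hdfs://ns1
#   check_conf ha.zookeeper.quorum hadoop001:2181,hadoop002:2181,hadoop003:2181
```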
hadoop-env.sh
    export HDFS_NAMENODE_OPTS="-XX:+UseParallelGC -Xmx4g"
    export HDFS_NAMENODE_USER=root
    export HDFS_DATANODE_USER=root
    export HDFS_JOURNALNODE_USER=root
    export HDFS_SECONDARYNAMENODE_USER=root
    export YARN_RESOURCEMANAGER_USER=root
    export YARN_NODEMANAGER_USER=root
    export JAVA_HOME=/usr/java/jdk1.8
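hdfs-site.xml
    <configuration>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/cluster/hadoop/data/nn</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/cluster/hadoop/data/dn</value>
      </property>
      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/cluster/hadoop/data/jn</value>
      </property>
      <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
      </property>
      <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>hadoop001,hadoop002</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.ns1.hadoop001</name>
        <value>hadoop001:9000</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.ns1.hadoop001</name>
        <value>hadoop001:50070</value>
      </property>
      <property>
        <name>dfs.namenode.rpc-address.ns1.hadoop002</name>
        <value>hadoop002:9000</value>
      </property>
      <property>
        <name>dfs.namenode.http-address.ns1.hadoop002</name>
        <value>hadoop002:50070</value>
      </property>
      <property>
        <name>dfs.ha.automatic-failover.enabled.ns1</name>
        <value>true</value>
      </property>
      <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
      <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
      <property>
        <name>dfs.blocksize</name>
        <value>64M</value>
        <description>HDFS block size of 64 MB (the Hadoop 3 default is 128 MB)</description>
      </property>
      <property>
        <name>dfs.namenode.handler.count</name>
        <value>100</value>
        <description>More NameNode server threads to handle RPCs from large number of DataNodes.</description>
      </property>
      <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/ns1</value>
      </property>
      <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
      </property>
      <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
      </property>
    </configuration>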
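mapred-site.xml
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>Execution framework set to Hadoop YARN.</description>
      </property>
      <property>
        <name>mapreduce.map.memory.mb</name>
        <value>4096</value>
        <description>Larger resource limit for maps.</description>
      </property>
      <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx4096M</value>
        <description>Larger heap-size for child jvms of maps.</description>
      </property>
      <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>4096</value>
        <description>Larger resource limit for reduces.</description>
      </property>
      <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx4096M</value>
        <description>Larger heap-size for child jvms of reduces.</description>
      </property>
      <property>
        <name>mapreduce.task.io.sort.mb</name>
        <value>2047</value>
        <description>Higher memory-limit while sorting data for efficiency. Must be below 2048; the map task rejects larger values.</description>
      </property>
      <property>
        <name>mapreduce.task.io.sort.factor</name>
        <value>400</value>
        <description>More streams merged at once while sorting files.</description>
      </property>
      <property>
        <name>mapreduce.reduce.shuffle.parallelcopies</name>
        <value>200</value>
        <description>Higher number of parallel copies run by reduces to fetch outputs from very large number of maps.</description>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop001:10020</value>
        <description>MapReduce JobHistory Server host:port. Default port is 10020.</description>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop001:19888</value>
        <description>MapReduce JobHistory Server Web UI host:port. Default port is 19888.</description>
      </property>
      <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/tmp/mr-history/tmp</value>
        <description>Directory where history files are written by MapReduce jobs.</description>
      </property>
      <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/tmp/mr-history/done</value>
        <description>Directory where history files are managed by the MR JobHistory Server.</description>
      </property>
    </configuration>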
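yarn-site.xml
    <configuration>
      <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.resourcemanager.ha.automatic-failover.embedded</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yarn-rm-cluster</value>
      </property>
      <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop001</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop002</value>
      </property>
      <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.resourcemanager.zk.state-store.address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
      </property>
      <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>hadoop001:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address.rm2</name>
        <value>hadoop002:8032</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>hadoop001:8034</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop001:8088</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address.rm2</name>
        <value>hadoop002:8034</value>
      </property>
      <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop002:8088</value>
      </property>
      <property>
        <name>yarn.acl.enable</name>
        <value>true</value>
        <description>Enable ACLs? Defaults to false.</description>
      </property>
      <property>
        <name>yarn.admin.acl</name>
        <value>*</value>
      </property>
      <property>
        <name>yarn.log-aggregation-enable</name>
        <value>false</value>
        <description>Configuration to enable or disable log aggregation</description>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop001</value>
        <description>Single hostname that can be set in place of setting all yarn.resourcemanager*address resources. Results in default ports for ResourceManager components.</description>
      </property>
      <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>20480</value>
      </property>
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>28672</value>
      </property>
      <property>
        <name>yarn.nodemanager.log.retain-seconds</name>
        <value>10800</value>
      </property>
      <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/home/cluster/yarn/log/1,/home/cluster/yarn/log/2,/home/cluster/yarn/log/3</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>Shuffle service that needs to be set for Map Reduce applications.</description>
      </property>
      <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>-1</value>
      </property>
      <property>
        <name>yarn.log-aggregation.retain-check-interval-seconds</name>
        <value>-1</value>
      </property>
    </configuration>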
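The memory settings in mapred-site.xml and yarn-site.xml interact, so a quick arithmetic check is worthwhile. The sketch below uses two hypothetical helpers and counts memory only; it ignores vcores and the ApplicationMaster container, so the real number of concurrent tasks will be somewhat lower.

```shell
# Sketch: rough memory arithmetic for the values configured above.
# Both function names are hypothetical helpers.

# Whole containers of a given size that fit into one NodeManager
containers_per_node() {
    echo $(( $1 / $2 ))
}

# True when a single request of REQUEST_MB can be placed on a node of NODE_MB
fits_on_node() {
    [ "$1" -le "$2" ]
}

# yarn.nodemanager.resource.memory-mb / mapreduce.map.memory.mb
containers_per_node 28672 4096    # prints 7

# yarn.scheduler.maximum-allocation-mb against the NodeManager memory
fits_on_node 20480 28672 && echo "largest allowed request fits on one node"
```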
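V. Initialize the cluster
1. Start ZooKeeper
Hadoop's HA mechanism depends on ZooKeeper, so start the ZooKeeper ensemble first.
If the ensemble has not been set up yet, see: https://cloud.tencent.com/developer/article/1818643
Run on every ZooKeeper node, then check its status:
    zkServer.sh start
    zkServer.sh status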
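`zkServer.sh status` reports the node's role on a line beginning with `Mode:`. A healthy three-node ensemble should show one leader and two followers. The sketch below extracts that role so it can be used in a script; `zk_mode` is a hypothetical helper name.

```shell
# Sketch: extract the role (leader, follower, or standalone) from the
# output of zkServer.sh status. Prints nothing if the server is not running.
zk_mode() {
    sed -n 's/^Mode: //p'
}

# Usage:
#   zkServer.sh status 2>/dev/null | zk_mode
```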
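2. Initialize the HA metadata in ZooKeeper
Run once, on one of the NameNode hosts:
    hdfs zkfc -formatZK
3. Start ZKFC
Run on both NameNode hosts (hadoop001 and hadoop002):
    hdfs --daemon start zkfc
4. Start the JournalNodes
The JournalNodes must be running before the NameNode is formatted, otherwise the format fails.
Three JournalNodes are configured here: hadoop001, hadoop002, and hadoop003. Run on each of them:
    hdfs --daemon start journalnode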
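The JournalNode hosts are exactly the ones listed in `dfs.namenode.shared.edits.dir`, so the list can be derived from that URI instead of being typed by hand. The sketch below only prints the commands (a dry run); both function names are hypothetical, and it assumes `hdfs` is on the `PATH` of a non-interactive SSH session.

```shell
# Sketch: list the JournalNode hosts named in a qjournal:// URI.
journal_hosts() {
    # Strip the scheme and the journal ID, split on ';', drop the :port suffix
    printf '%s\n' "$1" \
        | sed -e 's|^qjournal://||' -e 's|/.*$||' \
        | tr ';' '\n' \
        | sed 's/:[0-9]*$//'
}

# Dry run: print one start command per JournalNode host.
jn_start_commands() {
    for jn_host in $(journal_hosts "$1"); do
        echo "ssh $jn_host hdfs --daemon start journalnode"
    done
}

# Usage:
#   jn_start_commands 'qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/ns1'
```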
5. Format the NameNode
Run on the first NameNode host:
    hdfs namenode -format
6. Start HDFS
start-all.sh starts the HDFS daemons and the YARN daemons together:
    start-all.sh
7. Synchronize the standby NameNode
Once HDFS has finished initializing (about 20 seconds), run on the other NameNode host:
    hdfs namenode -bootstrapStandby
If formatting fails, or the following error appears:
    Directory is in an inconsistent state: Can't format the storage directory because the current directory is not empty.
delete the directory on the affected node and format again:
    rm -rf /home/cluster/hadoop/data/jn/ns1/*
    hdfs namenode -format
8. Start the standby NameNode
After the synchronization, start the NameNode process on the other NameNode host:
    hdfs --daemon start namenode
9. Check the cluster status
    hdfs dfsadmin -report
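The report is long, and the line that matters most right after setup is the live DataNode count, which should be 3 for this cluster. A minimal sketch for pulling that number out, using a hypothetical helper named `live_datanodes`:

```shell
# Sketch: print the number from the "Live datanodes (N):" line of
# hdfs dfsadmin -report. Prints nothing if the line is absent.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

# Usage:
#   hdfs dfsadmin -report | live_datanodes
```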
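10. Open the web UI
http://hadoop001:50070/
http://hadoop002:50070/
VI. High-availability test
1. Stop the active NameNode
Run on the host whose NameNode is active (hadoop001 here):
    hdfs --daemon stop namenode
2. Check the standby NameNode
http://hadoop002:50070/ now shows that hadoop002 has switched from standby to active.
3. Restart the stopped NameNode
While the NameNode is stopped, its web UI cannot be reached from the browser; starting it again restores access:
    hdfs --daemon start namenode
4. Check the state of both NameNodes
http://hadoop001:50070/
http://hadoop002:50070/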
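The same failover test can be followed from the command line with `hdfs haadmin -getServiceState`, which prints `active` or `standby` for a NameNode ID. The sketch below uses two hypothetical helpers and assumes the NameNode IDs `hadoop001` and `hadoop002` from `dfs.ha.namenodes.ns1`.

```shell
# Sketch: print the HA state of each NameNode ID given as an argument.
# A NameNode that cannot be contacted is reported as "unreachable".
nn_states() {
    for nn_id in "$@"; do
        printf '%s: %s\n' "$nn_id" \
            "$(hdfs haadmin -getServiceState "$nn_id" 2>/dev/null || echo unreachable)"
    done
}

# Sketch: poll once per second until the given NameNode reports "active".
# The second argument is the number of attempts (default 30).
wait_for_active() {
    wait_id=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        [ "$(hdfs haadmin -getServiceState "$wait_id" 2>/dev/null)" = "active" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage, after stopping the NameNode on hadoop001:
#   wait_for_active hadoop002 && nn_states hadoop001 hadoop002
```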