Building a Highly Available Hadoop 3.1.3 Cluster with Ansible
I. Node information
- Kernel version:
3.10.0-1062.el7.x86_64
- OS version:
Red Hat Enterprise Linux Server release 7.7 (Maipo)
Node | IP | Memory | JDK | Hadoop | ZK | NN | DN | RM | NM | JN | ZKFC |
---|---|---|---|---|---|---|---|---|---|---|---|
hdp-01 | 192.186.10.11 | 1G | √ | √ | | √ | | | | | √ |
hdp-02 | 192.186.10.12 | 1G | √ | √ | | √ | | | | | √ |
hdp-03 | 192.186.10.13 | 1G | √ | √ | | | | √ | | | |
hdp-04 | 192.186.10.14 | 1G | √ | √ | | | | √ | | | |
hdp-05 | 192.186.10.15 | 4G | √ | √ | √ | | √ | | √ | √ | |
hdp-06 | 192.186.10.16 | 4G | √ | √ | √ | | √ | | √ | √ | |
hdp-07 | 192.186.10.17 | 4G | √ | √ | √ | | √ | | √ | √ | |
II. Preparation
1. Login environment
Boot the system into text (multi-user) mode:
systemctl set-default multi-user.target &&
systemctl isolate multi-user.target
2. Network interfaces
ens33
Connects to the external network and is used to download software.
TYPE="Ethernet"
BOOTPROTO="dhcp"
NAME="ens33"
DEVICE="ens33"
ONBOOT="yes"
ens34
Connects to the internal network and carries traffic between cluster nodes.
BOOTPROTO=static
NAME=ens34
DEVICE=ens34
ONBOOT=yes
IPADDR=192.186.10.13
PREFIX=24
3. Firewall and SELinux
Disable the firewall and SELinux:
yum install -y iptables-services
iptables -F &&
service iptables save &&
systemctl stop firewalld &&
systemctl disable firewalld &&
setenforce 0 &&
sed -ri 's#(SELINUX=)(enforcing)#\1disabled#' /etc/selinux/config
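The SELinux edit depends on a sed back-reference (`\1`) that keeps the `SELINUX=` prefix while the value is swapped. As a minimal sketch, the same substitution can be tried on sample text rather than on the real /etc/selinux/config. The function name `selinux_disabled_config` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: apply the SELinux substitution to text on stdin (nothing under /etc is touched).
selinux_disabled_config() {
  # \1 re-emits the captured "SELINUX=" prefix; only the value changes.
  sed -E 's#^(SELINUX=)enforcing#\1disabled#'
}

# Sample input in the format of /etc/selinux/config.
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' | selinux_disabled_config
```

Without the backslash the replacement would start with a literal `1`, so it is worth checking the result before rebooting a node.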
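4. Passwordless SSH
Because hdp-01 and hdp-02 form the HDFS HA pair, each of them must be able to log in over SSH without a password, both to itself and to the other node.
The playbook already configures the following logins:
- hdp-01 -> hdp-01
- hdp-01 -> hdp-02
- hdp-01 -> all other hosts
- hdp-02 -> hdp-02
- hdp-02 -> hdp-01
- hdp-02 -> all other hosts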
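5. Software installation
Installing the JDK, Hadoop, and ZooKeeper, and setting their environment variables, is handled by the playbook.
6. hosts configuration
The /etc/hosts entries are also written by the playbook.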
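III. Configuration file
- When writing ansible.cfg, keep every section header listed below, even if the section is empty; otherwise ansible reports an error when it runs.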
[defaults]
inventory = /root/ansible/inventory
roles_path = /root/ansible/roles
remote_user = root
ask_pass = False
forks = 10
[inventory]
[privilege_escalation]
[paramiko_connection]
[ssh_connection]
[persistent_connection]
[accelerate]
[selinux]
[colors]
[diff]
IV. Directory layout
[root@hdp-01 ~]# tree ansible/
ansible/
├── hadoop_ha.yml                       # playbook that applies the role
├── inventory                           # host inventory
└── roles
    └── hadoop_ha
        ├── defaults
        ├── files
        ├── handlers
        ├── meta
        ├── README.md                   # documentation
        ├── tasks
        │   ├── 01-ssh.yml              # generate /etc/hosts, set hostnames, passwordless SSH from the NameNodes to the cluster
        │   ├── 02-install-soft.yml     # install the JDK, Hadoop, ZooKeeper and set environment variables
        │   ├── 03-config_zk.yml        # configure the ZooKeeper ensemble
        │   ├── 04-copy_conf_file.yml   # copy configuration files to all hosts
        │   ├── 05-init_ha.yml          # initialize the cluster
        │   ├── 06-start-cluster.yml    # start the cluster
        │   └── main.yml                # task entry point
        ├── templates
        │   ├── core-site.xml.j2        # core-site.xml template
        │   ├── hadoop-env.sh.j2        # hadoop-env.sh template
        │   ├── hdfs-site.xml.j2        # hdfs-site.xml template
        │   ├── mapred-site.xml.j2      # mapred-site.xml template
        │   ├── workers.j2              # workers template
        │   └── yarn-site.xml.j2        # yarn-site.xml template
        ├── tests
        └── vars
            ├── core.yml                # core-site.xml variables
            ├── hdfs.yml                # hdfs-site.xml variables
            ├── soft.yml                # software, environment, and network variables
            └── yarn.yml                # yarn-site.xml variables
V. Inventory
[hdp]
hdp-0[1:7] ansible_user=root ansible_ssh_pass="123456"
[nn]
hdp-0[1:2]
[rm]
hdp-0[3:4]
[zk]
hdp-0[5:7]
[jn]
hdp-0[5:7]
[dn]
hdp-0[5:7]
[nm]
hdp-0[5:7]
[nn1]
hdp-01
[nn2]
hdp-02
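The inventory uses Ansible's numeric range patterns, so `hdp-0[1:7]` stands for hdp-01 through hdp-07. The following sketch imitates that expansion in plain shell to make the group membership explicit. `expand_range` is a hypothetical helper, not an Ansible command.

```shell
#!/usr/bin/env bash
# Sketch: imitate how a pattern such as hdp-0[5:7] expands into host names.
expand_range() {
  # $1: literal prefix before the bracket, $2: first number, $3: last number
  local i
  for i in $(seq "$2" "$3"); do
    printf '%s%d\n' "$1" "$i"
  done
}

echo "[nn]"; expand_range hdp-0 1 2
echo "[zk]"; expand_range hdp-0 5 7
```

On a host with Ansible installed, `ansible-inventory -i inventory --list` shows the real expansion.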
VI. Role
tasks
main.yml
- name: include vars
  include_vars:
    dir: vars/
    depth: 1
  tags: "always"
- name: config ssh yml
  import_tasks: "01-ssh.yml"
  tags: "config-ssh"
- name: install soft yml
  import_tasks: "02-install-soft.yml"
  tags: "install-soft"
- name: config zk
  import_tasks: "03-config_zk.yml"
  tags: "config-zk"
- name: copy config file
  import_tasks: "04-copy_conf_file.yml"
  tags: "copy-con-file"
- name: init ha
  import_tasks: "05-init_ha.yml"
  tags: "ini-ha"
- name: start cluster
  import_tasks: "06-start-cluster.yml"
  tags: "start-cluster"
01-ssh.yml
# 1. Run the script that generates the hosts entries
- name: 1. make hosts
  script: hosts.sh
  register: r
  when: ansible_hostname in groups['nn1']
# 2. Write the generated entries to /etc/hosts
- name: 2. out vars
  lineinfile:
    path: /etc/hosts
    line: "{{ hostname }}"
    regexp: '^{{ hostname }}'
    owner: root
    group: root
    mode: '0644'
  with_items: "{{ r.stdout_lines }}"
  loop_control:
    loop_var: hostname
  when: ansible_hostname in groups['nn1']
# 3. Generate a key pair on the NameNode hosts
- name: gen-pub-key
  shell: echo 'y' |ssh-keygen -t rsa -P "" -f /root/.ssh/id_rsa
  when: ansible_hostname in groups['nn']
# 4. Copy the hosts file from hdp-01 to all hosts
- name: copy-hosts
  copy:
    src: /etc/hosts
    dest: /etc/hosts
    mode: '0644'
    force: yes
  when: ansible_hostname in groups['nn1']
# 5. Set the hostname on every host
- name: set-hostname
  shell: hostnamectl set-hostname $(cat /etc/hosts|grep `ifconfig |grep "inet "|awk '{print $2}'|grep "{{ network }}"`|cut -d " " -f2)
# 6. Copy the public key from the NameNode hosts to all hosts
- name: ssh-pub-key-copy
  shell: sshpass -p "{{ ansible_ssh_pass }}" ssh-copy-id -i ~/.ssh/id_rsa.pub "{{ ansible_user }}"@"{{ host }}" -o StrictHostKeyChecking=no
  with_items: "{{ groups['hdp'] }}"
  loop_control:
    loop_var: host
  when: ansible_hostname in groups['nn']
# 7. Flush the iptables rules and disable SELinux on all hosts
- name: clean
  shell: 'source /etc/profile ; iptables -F ; setenforce 0 ; sed -ri "s#(SELINUX=)(enforcing)#\1disabled#" /etc/selinux/config'
  ignore_errors: true
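The hosts.sh script called by the first task is not shown in this article. The following is a minimal sketch of what such a script might look like, under two assumptions: it prints one `<ip> <hostname>` line per node separated by a single space (the form that the set-hostname task's `cut -d " " -f2` expects), and node N maps to address .1N as in the node table. The variable and function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical hosts.sh: print one "<ip> <hostname>" line per cluster node.
network="192.186.10."   # assumed to match the `network` variable in vars
first=1                 # hdp-01
last=7                  # hdp-07

gen_hosts() {
  local i
  for i in $(seq "$first" "$last"); do
    # Node i gets address <network>1i and the name hdp-0i.
    printf '%s%d hdp-%02d\n' "$network" "$((10 + i))" "$i"
  done
}

gen_hosts
```

Because the lineinfile task matches on the whole line, rerunning the playbook with unchanged output from the script should not duplicate entries.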
02-install-soft.yml
# 1. Create the software installation directory
- name: create apps directory
  file:
    path: "{{ soft_install_path }}"
    state: directory
    mode: '0755'
# 2. Install the JDK and Hadoop on all hosts
- name: install-jdk-hadoop
  unarchive:
    src: "{{ soft }}"
    dest: "{{ soft_install_path }}"
  with_items:
    - [ "{{ hadoop_soft }}", "{{ jdk_soft }}" ]
  loop_control:
    loop_var: soft
  tags: install-ha-jdk
# 3. Remove any existing JDK, Hadoop, and ZooKeeper environment variables
- name: clean jdk,hadoop env
  shell: sed -ri '/HADOOP_HOME/d;/JAVA_HOME/d;/ZOOKEEPER_HOME/d' "{{ env_file }}"
  tags: set-env
# 4. Set the JDK and Hadoop environment variables for the user
- name: set jdk hadoop env
  lineinfile:
    dest: "{{ env_file }}"
    line: "{{ soft_env.env }}"
    regexp: "{{ soft_env.reg }}"
    state: present
  with_items:
    - { env: 'export JAVA_HOME={{ jdk_home }}', reg: '^export JAVA_HOME=' }
    - { env: 'export HADOOP_HOME={{ hdp_home }}', reg: '^export HADOOP_HOME' }
  loop_control:
    loop_var: soft_env
  tags: set-env
# 5. Install ZooKeeper on the hosts in the zk group
- name: install zookeeper
  unarchive:
    src: "{{ zookeeper_soft }}"
    dest: "{{ soft_install_path }}"
  when: ansible_hostname in groups['zk']
  tags: install-zookeeper
# 6. Set the ZooKeeper environment variable for the user
- name: set zookeeper env
  lineinfile:
    dest: "{{ env_file }}"
    line: "{{ zk_env.env }}"
    regexp: "{{ zk_env.reg }}"
    state: present
  with_items:
    - { env: 'export ZOOKEEPER_HOME={{ zk_home }}', reg: '^export ZOOKEEPER_HOME=' }
  loop_control:
    loop_var: zk_env
  when: ansible_hostname in groups['zk']
  tags: set-env
# 7. Add the JDK and Hadoop directories to PATH on all hosts
- name: export jdk hadoop env
  lineinfile:
    dest: "{{ env_file }}"
    line: 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin'
    regexp: "^export PATH"
    state: present
  tags: set-env
# 8. Append the ZooKeeper bin directory to PATH on the zk hosts
- name: export zookeeper env
  replace:
    path: "{{ env_file }}"
    regexp: "^(export PATH=)(.+)$"
    replace: '\1\2:$ZOOKEEPER_HOME/bin'
  when: ansible_hostname in groups['zk']
  tags: set-env
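The last task appends the ZooKeeper bin directory to the existing `export PATH=` line by capturing the line in two groups and re-emitting both. A rough shell equivalent, using sed on sample text instead of the real env file, shows the intended result. `append_zk_path` is a name invented for this sketch.

```shell
#!/usr/bin/env bash
# Sketch: the same capture-and-append idea as the replace task, using sed on stdin.
append_zk_path() {
  # \1 = "export PATH=", \2 = the current value; the suffix is appended literally.
  sed -E 's#^(export PATH=)(.+)$#\1\2:$ZOOKEEPER_HOME/bin#'
}

# Single quotes keep $PATH and $HADOOP_HOME literal, as they are in the env file.
printf '%s\n' 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin' | append_zk_path
```

Applied twice, the substitution appends the suffix twice. In the role this appears to be avoided because step 3 deletes every line mentioning HADOOP_HOME, including the PATH line, before step 7 recreates it.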
03-config_zk.yml
# 1. Create zoo.cfg from the sample configuration
- name: copy config file
  copy:
    src: "{{ zk_home }}/conf/zoo_sample.cfg"
    dest: "{{ zk_home }}/conf/zoo.cfg"
    remote_src: yes
  when: ansible_hostname in groups['zk']
# 2. Create the ZooKeeper runtime data directory
- name: create zk data directory
  file:
    path: "{{ zk_data_dir }}"
    state: directory
    mode: '0755'
  when: ansible_hostname in groups['zk']
# 3. Point dataDir in the configuration file at that directory
- name: set zookeeper dataDir
  lineinfile:
    dest: "{{ zk_home }}/conf/zoo.cfg"
    line: "dataDir={{ zk_data_dir }}"
    regexp: "^dataDir="
    state: present
  when: ansible_hostname in groups['zk']
# 4. Add one server.N line per ensemble member
- name: set cluster info
  lineinfile:
    dest: "{{ zk_home }}/conf/zoo.cfg"
    line: "server.{{ item.0 + 1 }}={{ item.1 }}:2888:3888"
    regexp: '^server\.{{ item.0 + 1 }}='
  with_indexed_items: "{{ groups['zk'] }}"
  when: ansible_hostname in groups['zk']
# 5. Create the myid file that matches this host's server.N line
- name: make server id
  shell: 'cat {{ zk_home }}/conf/zoo.cfg |grep {{ ansible_hostname }}|cut -d "." -f2|head -c1 > {{ zk_data_dir }}/myid'
  when: ansible_hostname in groups['zk']
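The myid value is derived from the `server.N=host:2888:3888` line that names the current host. The sketch below runs the same pipeline on sample text, and adds a variant that keeps multi-digit ids, since `head -c1` keeps only the first character. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: derive a ZooKeeper myid from zoo.cfg text on stdin.

# Same pipeline as the "make server id" task; keeps only the first digit.
zk_myid() {
  # $1: host name to look up
  grep "$1" | cut -d "." -f2 | head -c1
}

# Variant that captures the whole number between "server." and "=".
zk_myid_full() {
  sed -nE "s/^server\.([0-9]+)=$1:.*/\1/p"
}

sample_cfg='dataDir=/root/zkdata
server.1=hdp-05:2888:3888
server.2=hdp-06:2888:3888
server.3=hdp-07:2888:3888'

printf '%s\n' "$sample_cfg" | zk_myid hdp-06; echo
printf '%s\n' "$sample_cfg" | zk_myid_full hdp-07
```

For the three-node ensemble in this article both forms give the same ids. The difference only matters for an ensemble with ten or more servers.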
04-copy_conf_file.yml
# 1. Capture the Hadoop classpath
- name: hadoopath
  shell: 'source {{ env_file }} ; hadoop classpath'
  register: r
# 2. Render the configuration templates on every host
- name: template
  template:
    src: "{{ item }}"
    dest: "{{ hdp_conf }}/{{ item | replace('.j2','') }}"
    mode: '0644'
  vars:
    hdp_classpath: "{{ r.stdout }}"
  with_items: ["core-site.xml.j2","hdfs-site.xml.j2","mapred-site.xml.j2","yarn-site.xml.j2","hadoop-env.sh.j2","workers.j2"]
05-init_ha.yml
# 1. On the zk hosts, first delete everything under the Hadoop data directory
- name: delete hdp data
  shell: "rm -rf {{ hdp_data }}/*"
  when: ansible_hostname in groups['zk']
# 2. Start the ZooKeeper server
- name: start zookeeper
  shell: 'source {{ env_file }} && nohup zkServer.sh restart'
  when: ansible_hostname in groups['zk']
# 3. Start the JournalNodes
- name: start journalnode
  shell: 'source {{ env_file }} ; nohup hdfs --daemon stop journalnode ; nohup hdfs --daemon start journalnode'
  when: ansible_hostname in groups['jn']
# 4. On the NameNode hosts, delete everything under the Hadoop data directory
- name: delete hdp data
  shell: "rm -rf {{ hdp_data }}/*"
  when: ansible_hostname in groups['nn']
# 5. Formatting requires reachable JournalNodes whose directories are empty
- name: format namenode
  shell: 'source {{ env_file }} && nohup echo y | hdfs namenode -format'
  when: ansible_hostname in groups['nn1']
# 6. Start the NameNode on nn1
- name: start namenode
  shell: 'source {{ env_file }} ; nohup hdfs --daemon stop namenode ; nohup hdfs --daemon start namenode'
  when: ansible_hostname in groups['nn1']
# 7. nn2 copies the metadata from nn1, so the NameNode on nn1 must already be running
- name: copy meta data
  shell: 'source {{ env_file }} && nohup hdfs namenode -bootstrapStandby'
  when: ansible_hostname in groups['nn2']
# 8. Format the ZKFC state in ZooKeeper from nn1
- name: format zkfc
  shell: 'source {{ env_file }} && nohup echo y |hdfs zkfc -formatZK'
  when: ansible_hostname in groups['nn1']
06-start-cluster.yml
# Start ZooKeeper
- name: start zookeeper
  shell: "source {{ env_file }} ; zkServer.sh restart"
  when: ansible_hostname in groups['zk']
# Start HDFS
- name: start dfs
  shell: "source {{ env_file }} ; nohup stop-dfs.sh ; nohup start-dfs.sh"
  when: ansible_hostname in groups['nn1']
# Start YARN
- name: start yarn
  shell: "source {{ env_file }} ; nohup stop-yarn.sh ; nohup start-yarn.sh"
  when: ansible_hostname in groups['nn1']
vars
soft.yml
# Network prefix of the hosts
network: "192.186.10."
# Software installation path
soft_install_path: "/root/apps"
# Hadoop archive
hadoop_soft: "/root/soft/hadoop-3.1.3.tar.gz"
# Hadoop home directory
hdp_home: "{{ soft_install_path }}/hadoop-3.1.3"
# Hadoop configuration directory
hdp_conf: "{{ hdp_home }}/etc/hadoop"
# Hadoop data directory
hdp_data: "/root/hdpdata"
# User that runs Hadoop
hdp_user: "root"
# JDK archive
jdk_soft: "/root/soft/jdk1.8.0.tar.gz"
# JDK home directory
jdk_home: "{{ soft_install_path }}/jdk1.8.0"
# ZooKeeper archive
zookeeper_soft: "/root/soft/apache-zookeeper-3.5.8-bin.tar.gz"
# ZooKeeper installation directory
zk_home: "{{ soft_install_path }}/apache-zookeeper-3.5.8-bin"
# ZooKeeper runtime data directory
zk_data_dir: "/root/zkdata"
# File that holds the environment variables
env_file: "/root/.bashrc"
core.yml
# HDFS cluster (nameservice) name
dfs_cluster_name: "mycluster"
# Hadoop temporary directory
tmp_dir: "/root/hdpdata/tmp"
# ZooKeeper ensemble address
zk_cluster: "hdp-05:2181,hdp-06:2181,hdp-07:2181"
hdfs.yml
# NameNode name directory
name_dir: "/root/hdpdata/name"
# DataNode data directory
data_dir: "/root/hdpdata/data"
# NameNode IDs
nn_names: ["nn1","nn2"]
# NameNode RPC addresses
nn_rpc_address: ["hdp-01:9000","hdp-02:9000"]
# NameNode HTTP addresses
nn_http_address: ["hdp-01:9870","hdp-02:9870"]
# Location of the NameNodes' shared edits metadata
edits_dir: "qjournal://hdp-05:8485;hdp-06:8485;hdp-07:8485/{{ dfs_cluster_name }}"
# Local directory where the JournalNodes store their data
jn_data_dir: "/root/hdpdata/journaldata"
# Location of the SSH private key
pri_key: /root/.ssh/id_rsa
# Connection timeout for the sshfence fencing method
ssh_fen_con_timeout: 3000
yarn.yml
# YARN cluster ID
yarn_cluster_id: yrc
# ResourceManager IDs
rm_names: ["rm1","rm2"]
# ResourceManager hostnames
rm_hostnames: ["hdp-03","hdp-04"]
# ResourceManager web addresses
rm_webapp_address: ["hdp-03:8088","hdp-04:8088"]
# Environment variable whitelist
env_whitelist: ["JAVA_HOME","HADOOP_HOME"]
templates
1.hadoop-env.sh.j2
export HADOOP_OS_TYPE=${HADOOP_OS_TYPE:-$(uname -s)}
export HADOOP_HOME={{ hdp_home }}
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export JAVA_LIBRARY_PATH=$HADOOP_COMMON_LIB_NATIVE_DIR:$JAVA_LIBRARY_PATH
export HDFS_NAMENODE_USER={{ hdp_user }}
export HDFS_DATANODE_USER={{ hdp_user }}
export YARN_NODEMANAGER_USER={{ hdp_user }}
export YARN_RESOURCEMANAGER_USER={{ hdp_user }}
export HDFS_JOURNALNODE_USER={{ hdp_user }}
export HDFS_ZKFC_USER={{ hdp_user }}
export JAVA_HOME={{ jdk_home }}
2.core-site.xml.j2
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Cluster address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://{{ dfs_cluster_name }}/</value>
  </property>
  <!-- Hadoop temporary directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>{{ tmp_dir }}</value>
  </property>
  <!-- ZooKeeper address -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>{{ zk_cluster }}</value>
  </property>
</configuration>
3.hdfs-site.xml.j2
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- HDFS nameservice; must match the value used in core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>{{ dfs_cluster_name }}</value>
  </property>
  <!-- NameNode IDs -->
  <property>
    <name>dfs.ha.namenodes.{{ dfs_cluster_name }}</name>
    <value>{{ nn_names | join(',') }}</value>
  </property>
{% for nn in nn_names %}
  <!-- RPC address of {{ nn }} -->
  <property>
    <name>dfs.namenode.rpc-address.{{ dfs_cluster_name }}.{{ nn }}</name>
    <value>{{ nn_rpc_address[loop.index0] }}</value>
  </property>
{% endfor %}
{% for nn in nn_names %}
  <!-- HTTP address of {{ nn }} -->
  <property>
    <name>dfs.namenode.http-address.{{ dfs_cluster_name }}.{{ nn }}</name>
    <value>{{ nn_http_address[loop.index0] }}</value>
  </property>
{% endfor %}
  <!-- Name directory location -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>{{ name_dir }}</value>
  </property>
  <!-- Data directory location -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>{{ data_dir }}</value>
  </property>
  <!-- Where the NameNodes' shared edits metadata is stored on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>{{ edits_dir }}</value>
  </property>
  <!-- Local directory where the JournalNodes store their data -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>{{ jn_data_dir }}</value>
  </property>
  <!-- Enable automatic NameNode failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Client-side failover proxy provider -->
  <property>
    <name>dfs.client.failover.proxy.provider.{{ dfs_cluster_name }}</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing methods; separate multiple methods with newlines, one method per line -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
sshfence
shell(/bin/true)
    </value>
  </property>
  <!-- The sshfence method requires passwordless SSH -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>{{ pri_key }}</value>
  </property>
  <!-- Connection timeout for the sshfence method -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>{{ ssh_fen_con_timeout }}</value>
  </property>
</configuration>
4.yarn-site.xml.j2
<?xml version="1.0"?>
<configuration>
  <!-- Enable ResourceManager high availability -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- ResourceManager cluster ID -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>{{ yarn_cluster_id }}</value>
  </property>
  <!-- Logical ResourceManager IDs -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>{{ rm_names | join(',') }}</value>
  </property>
  {%- for rm in rm_names -%}
  <!-- Hostname of {{ rm }} -->
  <property>
    <name>yarn.resourcemanager.hostname.{{ rm }}</name>
    <value>{{ rm_hostnames[loop.index0] }}</value>
  </property>
  {%- endfor -%}
  <!-- Essential: set the webapp addresses explicitly even though defaults exist -->
  {%- for rm in rm_names -%}
  <!-- Webapp address of {{ rm }} -->
  <property>
    <name>yarn.resourcemanager.webapp.address.{{ rm }}</name>
    <value>{{ rm_webapp_address[loop.index0] }}</value>
  </property>
  {%- endfor -%}
  <!-- ZooKeeper ensemble address -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>{{ zk_cluster }}</value>
  </property>
  <!-- Enable automatic recovery -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- Enable automatic failover -->
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Store ResourceManager state in the ZooKeeper ensemble -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
  <!-- Auxiliary service on the NodeManagers; must be mapreduce_shuffle for MapReduce jobs to run -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Environment variable whitelist for the NodeManagers -->
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>{{ env_whitelist | join(',') }}</value>
  </property>
  <!-- Classpath for YARN applications -->
  <property>
    <name>yarn.application.classpath</name>
    <value>{{ hdp_classpath }}</value>
  </property>
  <!-- Let the NodeManagers detect memory and CPU automatically -->
  <property>
    <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
    <value>true</value>
  </property>
</configuration>
5.mapred-site.xml.j2
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
6.workers.j2
{% for host in groups['dn'] %}
{{ host }}
{% endfor %}
VII. Usage
1. Run everything
- View the hadoop_ha playbook:
[root@hdp-01 ansible]# cat hadoop_ha.yml
- hosts: all
  roles:
    - { role: hadoop_ha }
- Run every step from the beginning; suitable for a freshly initialized environment:
[root@hdp-01 ansible]# ansible-playbook hadoop_ha.yml
2. Run selected steps
- List all tags defined in the role's tasks:
[root@hdp-01 ~]# ansible-playbook --list-tags hadoop_ha.yml
[always, config-ssh, config-zk, copy-con-file, ini-ha, install-ha-jdk, install-soft, install-zookeeper, set-env, start-cluster]
- Pass a tag to run only the matching step, for example:
ansible-playbook -t config-ssh hadoop_ha.yml
VIII. Testing the cluster
1. Check the cluster processes
[root@hdp-01 ~]# ansible -m shell -a 'jps' hdp
hdp-02 | CHANGED | rc=0 >>
13909 Jps
11597 NameNode
11663 DFSZKFailoverController
hdp-04 | CHANGED | rc=0 >>
11219 Jps
9802 ResourceManager
hdp-03 | CHANGED | rc=0 >>
9827 ResourceManager
11436 Jps
hdp-01 | CHANGED | rc=0 >>
2882 Jps
1829 NameNode
1957 DFSZKFailoverController
hdp-05 | CHANGED | rc=0 >>
12560 Jps
10281 JournalNode
10026 QuorumPeerMain
10219 DataNode
10475 NodeManager
hdp-06 | CHANGED | rc=0 >>
10197 JournalNode
9942 QuorumPeerMain
10135 DataNode
12430 Jps
10399 NodeManager
hdp-07 | CHANGED | rc=0 >>
10112 DataNode
12518 Jps
9927 QuorumPeerMain
10375 NodeManager
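Reading seven jps listings by eye is error-prone. For instance, the hdp-07 listing above shows no JournalNode line. As a rough sketch, a small helper can check jps output for an expected daemon. `has_daemon` is a hypothetical function and the sample text is copied from the hdp-02 listing.

```shell
#!/usr/bin/env bash
# Sketch: report whether a daemon appears in jps output read from stdin.
has_daemon() {
  # $1: class name exactly as jps prints it, e.g. NameNode
  # Exit status 0 if a line's second field equals $1, otherwise 1.
  awk -v d="$1" '$2 == d { found = 1 } END { exit !found }'
}

jps_sample='13909 Jps
11597 NameNode
11663 DFSZKFailoverController'

for daemon in NameNode DFSZKFailoverController; do
  if printf '%s\n' "$jps_sample" | has_daemon "$daemon"; then
    echo "$daemon: running"
  else
    echo "$daemon: missing"
  fi
done
```

On a live cluster the input would come from running jps on the node in question rather than from a sample string.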
2. Test MapReduce
1). Check the YARN cluster state
[root@hdp-02 ~]# yarn rmadmin -getAllServiceState
hdp-03:8033 active
hdp-04:8033 standby
2). Change to the examples directory
[root@hdp-01 ~]# cd /root/apps/hadoop-3.1.3/share/hadoop/mapreduce
3). Run the pi MapReduce example
[root@hdp-01 mapreduce]# hadoop jar hadoop-mapreduce-examples-3.1.3.jar pi 3 5
4). Result
Estimated value of Pi is 3.73333333333333333333
3. Test HDFS high availability
1). Upload a file to HDFS
[root@hdp-01 ~]# hadoop fs -put /var/log/messages /
2). Find the host whose NameNode is active and kill that NameNode process
[root@hdp-01 ~]# hdfs haadmin -getAllServiceState
hdp-01:9000 standby
hdp-02:9000 active
[root@hdp-02 ~]# jps
14020 Jps
11597 NameNode
11663 DFSZKFailoverController
[root@hdp-02 ~]# kill -9 11597
3). Check the NameNode state of nn1 (hdp-01)
[root@hdp-01 ~]# hdfs haadmin -getServiceState nn1
active
4). List the file in HDFS again; it is still accessible, so the failover succeeded
[root@hdp-01 ~]# hadoop fs -ls /messages
-rw-r--r-- 3 root supergroup 684483 2020-08-10 14:48 /messages
5). Restart the NameNode that was just killed and check the cluster state again; hdp-02 is now standby
[root@hdp-02 ~]# hdfs --daemon start namenode
[root@hdp-02 ~]# hdfs haadmin -getAllServiceState
hdp-01:9000 active
hdp-02:9000 standby
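When repeating the failover test, the active NameNode has to be looked up each time. A minimal sketch that extracts the active host from text in the format printed by `hdfs haadmin -getAllServiceState` is shown below. `active_namenode` is a hypothetical helper and the sample input is the state listed in step 2.

```shell
#!/usr/bin/env bash
# Sketch: print the host whose NameNode is reported as active.
active_namenode() {
  # stdin: lines of the form "<host>:<port> <state>"
  awk '$2 == "active" { split($1, a, ":"); print a[1] }'
}

printf 'hdp-01:9000 standby\nhdp-02:9000 active\n' | active_namenode
```

On a cluster node the input would be piped from the haadmin command itself, for example `hdfs haadmin -getAllServiceState | active_namenode`.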