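Introduction
As a company grows and its business data volume increases, more and more companies turn to big data platforms to analyze their business. Building such a platform is fairly complex, however, and a fully manual installation is very inconvenient because every component and version has to match. Is there an easier way to install one? Yes: the mainstream integrated distributions for big data platforms are CDH and Ambari. This article covers installing CDH.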
I. Software packages:
Link: https://pan.baidu.com/s/1kjgKuk5gKvSSBYWM4lCoKw Password: wagf
II. Initialization
- 1. Host list (public IP, internal IP, hostname)
119.101.166.253 172.27.55.96 sc01
119.98.60.170 172.27.177.58 sc02
119.98.64.72 172.27.177.56 sc03
119.98.65.245 172.27.177.57 sc04
119.92.227.70 172.27.177.55 sc05
- 2. Disable the firewall on all hosts
systemctl stop firewalld
systemctl disable firewalld
iptables -F
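- 3. Configure the hosts file on all hosts
Extract the internal IP and hostname from 1.txt and generate the echo commands: cat 1.txt | awk '{print "echo \""$2,$3,"\"",">>/etc/hosts"}'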
[root@sc01 ~]# more 1.txt
119.101.166.253 172.27.55.96 sc01
119.98.60.170 172.27.177.58 sc02
119.98.64.72 172.27.177.56 sc03
119.98.65.245 172.27.177.57 sc04
119.92.227.70 172.27.177.55 sc05
[root@sc01 ~]# cat 1.txt | awk '{print "echo \""$2,$3,"\"",">>/etc/hosts"}'
echo "172.27.55.96 sc01 " >>/etc/hosts
echo "172.27.177.58 sc02 " >>/etc/hosts
echo "172.27.177.56 sc03 " >>/etc/hosts
echo "172.27.177.57 sc04 " >>/etc/hosts
echo "172.27.177.55 sc05 " >>/etc/hosts
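The generated echo commands append unconditionally, so running them twice duplicates the entries. As a minimal sketch, assuming the same three-column 1.txt layout, the function below appends only hostnames that are not yet present. The name `add_cluster_hosts` and its second argument are illustrative and not part of the original procedure.

```shell
# Append "internal-IP hostname" pairs from a host list to a hosts file,
# skipping hostnames that already have an entry (safe to re-run).
# $1 = host list (public IP, internal IP, hostname per line)
# $2 = hosts file to update (defaults to /etc/hosts)
add_cluster_hosts() {
    list_file=$1
    hosts_file=${2:-/etc/hosts}
    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
    while read -r _public_ip internal_ip name || [ -n "$_public_ip" ]; do
        # Skip blank or incomplete lines.
        [ -n "$name" ] || continue
        # Skip hostnames that are already listed as a whole word.
        if grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
            continue
        fi
        printf '%s %s\n' "$internal_ip" "$name" >> "$hosts_file"
    done < "$list_file"
}
```

Calling `add_cluster_hosts 1.txt` on each node would update /etc/hosts; passing a scratch file as the second argument lets the result be inspected first.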
- 4. Disable SELinux on all nodes
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
setenforce 0
- 5. Synchronize clocks on all nodes
echo '*/30 * * * * (/usr/sbin/ntpdate cn.pool.ntp.org && /usr/sbin/hwclock --systohc) >/dev/null 2>&1' >>/var/spool/cron/root
- 6. Install the JDK on all nodes
mkdir /usr/java
ln -s /usr/local/jdk1.8 /usr/java/default
systemctl stop cloudera-scm-agent
systemctl start cloudera-scm-agent
mkdir -p /usr/share/java/
Download the MySQL JDBC driver from https://dev.mysql.com/downloads/connector/j/
Rename the jar so that the file name carries no version number:
cp mysql-connector-java-5.1.47.jar /usr/share/java/mysql-connector-java.jar
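Because a JDK at the wrong path later stops the server from starting, it may be worth checking both Java prerequisites before going further. The sketch below only tests that the files exist at the paths used in this step. The function name `check_java_prereqs` is illustrative.

```shell
# Check the Java prerequisites from step 6.
# $1 = JDK home (defaults to /usr/java/default)
# $2 = JDBC driver jar (defaults to /usr/share/java/mysql-connector-java.jar)
check_java_prereqs() {
    jdk_home=${1:-/usr/java/default}
    driver_jar=${2:-/usr/share/java/mysql-connector-java.jar}
    status=0
    # -x follows the /usr/java/default symlink to the real JDK directory.
    if [ -x "$jdk_home/bin/java" ]; then
        echo "OK: $jdk_home/bin/java"
    else
        echo "MISSING: executable $jdk_home/bin/java" >&2
        status=1
    fi
    if [ -f "$driver_jar" ]; then
        echo "OK: $driver_jar"
    else
        echo "MISSING: $driver_jar" >&2
        status=1
    fi
    return "$status"
}
```

Running `check_java_prereqs` with no arguments checks the default paths and returns a non-zero status if either file is missing.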
III. Install MySQL
- 1. Install MySQL
mysql -uroot -pxxxx
- 2. Character encoding (utf8mb4 must not be used)
Use UTF8 encoding for all custom databases. MySQL and MariaDB must use the MySQL utf8 encoding, not utf8mb4.
- 3. Create the CDH metadata database (cmf) and its user, plus the databases and users for the amon and hive services
create database cmf DEFAULT CHARACTER SET utf8;
create database amon DEFAULT CHARACTER SET utf8;
create database hive DEFAULT CHARACTER SET utf8;
grant all on cmf.* TO 'cmf'@'%' IDENTIFIED BY '123Aa123';
grant all on hive.* TO 'hive'@'%' IDENTIFIED BY '123Aa123';
grant all on amon.* TO 'amon'@'%' IDENTIFIED BY '123Aa123';
flush privileges;
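The statements above repeat the same pattern for each service, and the `GRANT ... IDENTIFIED BY` form is no longer accepted by MySQL 8.0. As a sketch under those assumptions, the helper below prints equivalent SQL with a separate `CREATE USER` statement. The function name `gen_cdh_sql` is made up for illustration, and the password argument is a placeholder.

```shell
# Print SQL that creates one utf8 database and one same-named user per service.
# $1 = password (placeholder; choose your own), remaining args = service names
gen_cdh_sql() {
    password=$1
    shift
    for name in "$@"; do
        printf 'CREATE DATABASE IF NOT EXISTS %s DEFAULT CHARACTER SET utf8;\n' "$name"
        # %% prints a literal % (the "any host" wildcard).
        printf "CREATE USER IF NOT EXISTS '%s'@'%%' IDENTIFIED BY '%s';\n" "$name" "$password"
        printf "GRANT ALL ON %s.* TO '%s'@'%%';\n" "$name" "$name"
    done
    printf 'FLUSH PRIVILEGES;\n'
}
```

The output can be reviewed first and then piped into the client, for example `gen_cdh_sql '123Aa123' cmf amon hive | mysql -uroot -p`.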
IV. CDH deployment
1. Offline deployment of the CM server and agents
- 1.1. [All nodes] Create the directory and extract the archive
mkdir /opt/cloudera-manager
tar -xzvf cm6.3.1-redhat7.tar.gz -C /opt/cloudera-manager/
- 1.2. Use sc01 as the CM server and install the RPMs directly, without downloading dependency packages (server node)
cd /opt/cloudera-manager/cm6.3.1/RPMS/x86_64
rpm -ivh cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
rpm -ivh cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
- 1.3. Every node (including sc01) is a CM agent; install the RPMs directly, without downloading dependency packages (agent nodes)
cd /opt/cloudera-manager/cm6.3.1/RPMS/x86_64
rpm -ivh cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
rpm -ivh cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm --nodeps --force
- 1.4. On all nodes, edit the agent configuration so that it points at the server node sc01
sed -i "s/server_host=localhost/server_host=sc01/g" /etc/cloudera-scm-agent/config.ini
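The sed command above only matches while the value is still `localhost`, so it silently does nothing on a second run with a different server. A small sketch that replaces whatever value is there and prints the result for confirmation follows. The function name `set_agent_server_host` is illustrative, and GNU sed (as shipped with CentOS) is assumed for `-i`.

```shell
# Point a Cloudera Manager agent config at the given server host.
# $1 = server hostname, $2 = config file (defaults to the agent's config.ini)
set_agent_server_host() {
    cfg=${2:-/etc/cloudera-scm-agent/config.ini}
    # Match any current value, not only "localhost", so re-runs still work.
    sed -i "s/^server_host=.*/server_host=$1/" "$cfg"
    # Show the resulting line so the change can be confirmed.
    grep '^server_host=' "$cfg"
}
```

For the cluster in this article the call would be `set_agent_server_host sc01` on every node.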
- 1.5. On the master node, edit the server configuration:
vi /etc/cloudera-scm-server/db.properties
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=sc01
com.cloudera.cmf.db.name=cmf
com.cloudera.cmf.db.user=cmf
com.cloudera.cmf.db.password=123Aa123
com.cloudera.cmf.db.setupType=EXTERNAL
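Editing the file by hand in vi works, but the same six settings can also be written non-interactively. The sketch below is one possible way to do that, keeping a backup of any existing file. The function name `write_db_properties` and the `.bak` suffix are assumptions made for illustration.

```shell
# Write Cloudera Manager database settings to a properties file.
# $1 = db host, $2 = db name, $3 = db user, $4 = db password,
# $5 = output file (defaults to /etc/cloudera-scm-server/db.properties)
write_db_properties() {
    out_file=${5:-/etc/cloudera-scm-server/db.properties}
    # Keep a backup of any existing file before overwriting it.
    if [ -f "$out_file" ]; then
        cp -p "$out_file" "$out_file.bak"
    fi
    # Redirecting into an existing file keeps its owner and mode.
    cat > "$out_file" <<EOF
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=$1
com.cloudera.cmf.db.name=$2
com.cloudera.cmf.db.user=$3
com.cloudera.cmf.db.password=$4
com.cloudera.cmf.db.setupType=EXTERNAL
EOF
}
```

With the values from this article the call would be `write_db_properties sc01 cmf cmf 123Aa123`.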
2. Deploy the offline parcel repository on sc01
- 2.1. Install the httpd service
yum install -y httpd
- 2.2. Deploy the offline parcel repository (parcels come from https://archive.cloudera.com/cdh6/6.3.2/parcels/)
mkdir -p /var/www/html/cdh6_parcel
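The step shows only the directory being created; the parcel files still have to be copied into it. Since parcels are large downloads, a checksum comparison after copying can catch a truncated file. The sketch below assumes a checksum file whose first field is the expected SHA-1 hex digest. The function name `verify_parcel_sha1` and the file names in the usage note are placeholders, not names confirmed by the article.

```shell
# Compare a file's SHA-1 digest with the digest stored in a checksum file.
# $1 = parcel file, $2 = checksum file whose first field is the expected hex digest
verify_parcel_sha1() {
    expected=$(awk 'NR==1 {print tolower($1)}' "$2")
    actual=$(sha1sum "$1" | awk '{print $1}')
    if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (expected $expected, got $actual)" >&2
        return 1
    fi
}
```

A call might look like `verify_parcel_sha1 /var/www/html/cdh6_parcel/example.parcel /var/www/html/cdh6_parcel/example.parcel.sha`, with the real file names substituted.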
- 2.3. Start httpd and check the repository from a browser on Windows
systemctl start httpd
URL: http://sc01/cdh6_parcel
3. Start the server on sc01
- 3.1. Start the server
systemctl start cloudera-scm-server
- View the logs:
cd /var/log/cloudera-scm-server/
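- 3.2. Open port 7180
- 3.3. Wait about 1 minute, then open http://sc01:7180 (username/password: admin/admin)
- 3.4. If the page does not open, check the server log and troubleshoot based on the errors it reports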
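Instead of waiting a fixed minute, the web port can be polled until it answers. This is a rough sketch that assumes curl is installed. The function name `wait_for_url` and its default retry values are arbitrary choices for illustration.

```shell
# Poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 5)
wait_for_url() {
    url=$1
    attempts=${2:-30}
    delay=${3:-5}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s silent, -o discard the body, --max-time caps each try at 5 seconds.
        # Any HTTP response, even an error page, counts as "answering".
        if curl -s -o /dev/null --max-time 5 "$url"; then
            echo "up after $i attempt(s): $url"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "still unreachable after $attempts attempt(s): $url" >&2
    return 1
}
```

For example, `wait_for_url http://sc01:7180 || tail -n 50 /var/log/cloudera-scm-server/*.log` would show recent log lines if the server never comes up.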
4. Start the agent on all nodes
systemctl start cloudera-scm-agent
5. Web login
http://sc01:7180/cmf
Username/password: admin/admin
Fix the transparent huge pages warning (master and worker nodes)
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
Set swappiness (master and worker nodes)
sysctl vm.swappiness=10
echo 'vm.swappiness=10'>> /etc/sysctl.conf
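Values written under /sys do not survive a reboot, so it can help to re-check the transparent huge page settings later. The sketch below reads the two sysfs files and reports whether `never` is the active choice. The function names `thp_is_never` and `check_thp` are illustrative.

```shell
# Report whether a transparent-hugepage sysfs file has "never" selected.
# The kernel marks the active choice with brackets, e.g. "always madvise [never]".
# $1 = sysfs file to inspect
thp_is_never() {
    grep -q '\[never\]' "$1"
}

# Check both THP settings under a base directory.
# $1 = base directory (defaults to /sys/kernel/mm/transparent_hugepage)
check_thp() {
    base=${1:-/sys/kernel/mm/transparent_hugepage}
    status=0
    for f in defrag enabled; do
        if thp_is_never "$base/$f"; then
            echo "OK: $f is never"
        else
            echo "WARN: $f is not never" >&2
            status=1
        fi
    done
    return "$status"
}
```

Running `check_thp` with no arguments inspects the live kernel settings and returns a non-zero status if either one needs to be set again.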
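6. HDFS file operations (note: if a file sits directly in /root/, uploading it to HDFS fails with a "file does not exist" error, most likely because the hdfs user cannot read /root; move it to another directory first, for example /tmp)
sudo -u hdfs hadoop fs -mkdir -p /home/spark_conf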
sudo -u hdfs hadoop fs -put /opt/cloudera/parcels/CDH/etc/hive/conf.dist/hive-site.xml /home/spark_conf/
V. CDH error log handling
1. The error below is mainly caused by the JDK. Installing the JDK at the path used earlier (/usr/java/default) avoids it.
[root@sc01 cdh]# journalctl -xe
Sep 16 10:44:02 sc01 cm-server[20855]: | - a supported version of the Oracle JDK from the Oracle Java web |
Sep 16 10:44:02 sc01 cm-server[20855]: | site: |
Sep 16 10:44:02 sc01 cm-server[20855]: | > http://www.oracle.com/technetwork/java/javase/index.html < |
Sep 16 10:44:02 sc01 cm-server[20855]: | OR |
Sep 16 10:44:02 sc01 cm-server[20855]: | - a supported version of the OpenJDK from your OS vendor. Help for |
Sep 16 10:44:02 sc01 cm-server[20855]: | some OSes are available at: |
Sep 16 10:44:02 sc01 cm-server[20855]: | > http://openjdk.java.net/install/ < |
Sep 16 10:44:02 sc01 cm-server[20855]: | |
Sep 16 10:44:02 sc01 cm-server[20855]: | Cloudera Manager requires Oracle JDK or OpenJDK 1.8 or later. |
Sep 16 10:44:02 sc01 cm-server[20855]: | NOTE: Cloudera Manager will find the Oracle JDK when starting, |
Sep 16 10:44:02 sc01 cm-server[20855]: | regardless of whether you installed the JDK using a binary |
Sep 16 10:44:02 sc01 cm-server[20855]: | installer or the RPM-based installer. |
Sep 16 10:44:02 sc01 cm-server[20855]: ======================================================================
Sep 16 10:44:02 sc01 systemd[1]: cloudera-scm-server.service: main process exited, code=exited, status=1/FAILURE
Sep 16 10:44:02 sc01 systemd[1]: Unit cloudera-scm-server.service entered failed state.
Sep 16 10:44:02 sc01 systemd[1]: cloudera-scm-server.service failed.
Sep 16 10:44:02 sc01 systemd[1]: cloudera-scm-server.service holdoff time over, scheduling restart.
Sep 16 10:44:02 sc01 systemd[1]: Stopped Cloudera CM Server Service.
-- Subject: Unit cloudera-scm-server.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit cloudera-scm-server.service has finished shutting down.
Sep 16 10:44:02 sc01 systemd[1]: start request repeated too quickly for cloudera-scm-server.service
Sep 16 10:44:02 sc01 systemd[1]: Failed to start Cloudera CM Server Service.
-- Subject: Unit cloudera-scm-server.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit cloudera-scm-server.service has failed.
--
-- The result is failed.
Sep 16 10:44:02 sc01 systemd[1]: Unit cloudera-scm-server.service entered failed state.
Sep 16 10:44:02 sc01 systemd[1]: cloudera-scm-server.service failed.
2. Cluster installation error (host health is bad)
Fix: delete the cm_guid file under the agent directory, then restart the agent service on the failed node.
rm -rvf /var/lib/cloudera-scm-agent/cm_guid
service cloudera-scm-agent restart