0818-7.1.1-How to Uninstall CDP

2020-11-30 10:53:17

Author: 刘元强

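Data Backup

1.1 Back up HDFS data

Common ways to back up HDFS data include:

1. Use distcp to copy the data to another Hadoop cluster.

2. Copy the data to another storage device.

3. Export the data in batches to the disks of the individual hosts.

Any of these three methods can also be applied to critical data only. Which one to choose depends on the size of your cluster and the volume of data.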
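As an illustration of the first method, the sketch below prints a distcp command for one HDFS path so that it can be reviewed before it is run. The function name and the NameNode addresses are hypothetical placeholders, not values from this cluster.

```shell
#!/bin/sh
# Sketch: print the distcp command for one path instead of running it.
# build_distcp_cmd and the example addresses are illustrative assumptions.
build_distcp_cmd() {
    src_nn="$1"   # source NameNode URI, e.g. hdfs://source-nn:8020 (placeholder)
    dst_nn="$2"   # backup NameNode URI, e.g. hdfs://backup-nn:8020 (placeholder)
    path="$3"     # HDFS path to copy, e.g. /user/hive/warehouse
    # -update copies only files that are missing or differ at the destination;
    # -p preserves file attributes such as ownership and permissions.
    echo "hadoop distcp -update -p ${src_nn}${path} ${dst_nn}${path}"
}

# Usage: review the printed command, then run it as a user with HDFS access.
# build_distcp_cmd hdfs://source-nn:8020 hdfs://backup-nn:8020 /user/hive/warehouse
```

Printing the command first makes it easy to back up only selected critical paths, one call per path.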

1.2 Back up NameNode metadata

1. Log in to the Active NameNode node, put HDFS into safe mode, and flush all pending edits into the fsimage:

sudo -u hdfs hdfs dfsadmin -safemode enter
sudo -u hdfs hdfs dfsadmin -saveNamespace

2. Back up the NameNode metadata. Adjust the path below to the NameNode directory of your own cluster:

mkdir namenode_back             
cd namenode_back/               
tar czvf nn_bak.tar.gz /dfs/nn/*
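A backup is only useful if the archive can be read back. The following is a minimal sketch, assuming the NameNode directory is /dfs/nn as in the command above; the function name is hypothetical. It archives the directory and then lists the archive to confirm it is intact.

```shell
#!/bin/sh
# Sketch: archive a NameNode metadata directory and verify the archive.
# backup_dir_to_tar is an illustrative name, not part of any Hadoop tool.
backup_dir_to_tar() {
    src_dir="$1"    # NameNode directory, e.g. /dfs/nn
    out_file="$2"   # archive to create, e.g. namenode_back/nn_bak.tar.gz
    [ -d "$src_dir" ] || { echo "missing directory: $src_dir" >&2; return 1; }
    # -C stores paths relative to src_dir instead of absolute paths.
    tar czf "$out_file" -C "$src_dir" . || return 1
    # Listing the archive fails if it is truncated or corrupt.
    tar tzf "$out_file" > /dev/null || return 1
    echo "backup verified: $out_file"
}

# Usage:
# backup_dir_to_tar /dfs/nn namenode_back/nn_bak.tar.gz
```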

1.3 Back up MySQL metadata

mkdir mysql_back
cd mysql_back/

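# -u is followed by the MySQL user name; the single-quoted string after -p is that user's password; the next argument is the database name, and the file after > receives the dump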
mysqldump -uroot -p'Password&123' hive > hive.sql
mysqldump -uroot -p'Password&123' cm > cm.sql    
mysqldump -uroot -p'Password&123' rman > rman.sql
mysqldump -uroot -p'Password&123' hue > hue.sql  
mysqldump -uroot -p'Password&123' ranger > ranger.sql   

Note: the last command applies only if a Ranger database exists; it is backed up in the same way as the others.
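The per-database commands above can be folded into a loop that stops at the first failed dump. This is a sketch only: the function name is hypothetical, and the --single-transaction option assumes the metadata tables use InnoDB.

```shell
#!/bin/sh
# Sketch: dump each metadata database to <name>.sql, stopping on the first failure.
# dump_databases is an illustrative helper name.
dump_databases() {
    db_user="$1"
    db_pass="$2"
    shift 2
    for db in "$@"; do
        # --single-transaction takes a consistent snapshot of InnoDB tables
        # without locking them for the whole dump.
        if ! mysqldump -u"$db_user" -p"$db_pass" --single-transaction "$db" > "$db.sql"; then
            echo "dump failed: $db" >&2
            return 1
        fi
        echo "dumped $db -> $db.sql"
    done
}

# Usage (database names taken from the commands above):
# dump_databases root 'Password&123' hive cm rman hue ranger
```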

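1.4 Back up cluster configuration data

The Cloudera Manager API can export a JSON file that contains all deployment-related information held by Cloudera Manager, such as hosts, clusters, services, roles, users, and settings. This JSON file can be used to back up or restore the entire Cloudera Manager deployment.

To back up the cluster configuration, log in to the server that hosts Cloudera Manager and run: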

curl -u admin:admin "http://192.168.0.159:7180/api/v31/cm/deployment" > ./cm-deployment.json 
ll cm-deployment.json

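The command uses four values; replace each with the one for your own cluster:

admin (before the colon): the user name for logging in to Cloudera Manager
admin (after the colon): the password of that user
192.168.0.159: the IP address of the Cloudera Manager server host
./cm-deployment.json: the path and file name where the configuration is saved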
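With a plain redirect, a failed login still produces a file, only with an error page in it. The sketch below wraps the same API call so that HTTP errors and empty output are reported. The function name is hypothetical; port 7180 and the API path are taken from the command above.

```shell
#!/bin/sh
# Sketch: export the CM deployment and fail loudly on HTTP errors or empty output.
# backup_cm_deployment is an illustrative helper name.
backup_cm_deployment() {
    cm_host="$1"      # Cloudera Manager host or IP
    cm_user="$2"
    cm_pass="$3"
    api_version="$4"  # e.g. v31, as used above
    out_file="$5"
    # -f makes curl return non-zero on HTTP errors (such as 401) instead of
    # saving the error page; -sS hides progress output but still prints errors.
    curl -sS -f -u "$cm_user:$cm_pass" \
        "http://$cm_host:7180/api/$api_version/cm/deployment" \
        -o "$out_file" || return 1
    # -s is true only if the file exists and is not empty.
    [ -s "$out_file" ] || { echo "empty export: $out_file" >&2; return 1; }
    echo "saved $out_file"
}

# Usage:
# backup_cm_deployment 192.168.0.159 admin admin v31 ./cm-deployment.json
```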

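1.5 Record user data directories

The uninstall steps in the later sections delete the user data directories of each component. By default these are /var/lib/flume-ng, /var/lib/hadoop*, /var/lib/hue, /var/lib/navigator, /var/lib/oozie, /var/lib/solr, /var/lib/sqoop*, /var/lib/zookeeper, data_drive_path/dfs, data_drive_path/mapred, and data_drive_path/yarn. You may, however, have changed some of these locations through Cloudera Manager. If so, check the directory settings in Cloudera Manager carefully and record them, both so that the data can be removed completely and so that a reinstall immediately after the uninstall succeeds.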
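One way to keep such a record is to write every candidate directory that actually exists on a host into a text file. The sketch below does this for whatever paths it is given; the function name and the output file are hypothetical, and customized paths from Cloudera Manager still have to be added by hand.

```shell
#!/bin/sh
# Sketch: write every listed directory that exists on this host to a file.
# record_existing_dirs is an illustrative helper name.
record_existing_dirs() {
    out_file="$1"
    shift
    : > "$out_file"            # create or truncate the record file
    for dir in "$@"; do
        # Glob patterns that match nothing stay literal and fail the -e test.
        [ -e "$dir" ] && echo "$dir" >> "$out_file"
    done
    return 0
}

# Usage: unquoted globs are expanded by the shell before the call.
# record_existing_dirs /root/cdp_data_dirs.txt /var/lib/hadoop* /var/lib/hue /var/lib/zookeeper
```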

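Delete the Cluster

2.1 Stop cluster services

1. Stop the cluster

On the Cloudera Manager home page, open the Cluster 1 menu and choose "Actions -> Stop".

In the dialog that appears, choose Stop.

Wait for the cluster services to finish stopping.

2. Stop the Cloudera Management Service

Open the Cloudera Management Service menu and choose Stop.

Choose Stop to confirm.

Wait until the Cloudera Management Service has stopped.

The CM home page now shows all services as stopped.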

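2.2 Deactivate and remove Parcels

1. Deactivate Parcels

On the Cloudera Manager home page, click the Parcel icon on the left.

On the Parcels page, click the Deactivate button on the right.

Select "Deactivate Only" and confirm.

The button on the right now reads "Activate".

2. Remove Parcels

Open the menu under "Activate" and choose "Remove From Hosts".

Confirm the removal.

When this completes, the button reads "Distribute".

Open the menu under it and choose "Delete".

After the deletion succeeds, the button reads "Download".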

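2.3 Delete the cluster

On the Cloudera Manager home page, open the menu to the right of Cluster 1 and choose "Delete".

Confirm the deletion.

After the deletion succeeds, the home page no longer lists the cluster.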

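Software Uninstallation and Directory Removal

3.1 Stop and uninstall cloudera-scm-server

1. On the CM node, stop the cloudera-scm-server service and confirm that it is no longer active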

systemctl stop cloudera-scm-server
systemctl status cloudera-scm-server | grep Active

2. Remove the cloudera-manager-server package

yum -y remove cloudera-manager-server

3.2 Stop and uninstall cloudera-scm-agent

1. Use a script to stop the cloudera-scm-agent service on all nodes in one batch

sh batch_cmd.sh node.list "systemctl stop cloudera-scm-supervisord"
sh batch_cmd.sh node.list "systemctl stop cloudera-scm-agent"

Run the following through the script to confirm that the cloudera-scm-agent service has stopped on every node

sh batch_cmd.sh node.list "systemctl status cloudera-scm-agent | grep Active"

Also check on all nodes that the supervisord service has stopped.

2. Uninstall cloudera-manager-agent on all nodes

yum -y remove 'cloudera-manager-*'
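The article calls batch_cmd.sh with a node list and a command string but does not show the script itself. The following is a hypothetical stand-in, not the author's script: it assumes node.list holds one hostname or IP per line and that passwordless SSH to every node is already set up. SSH_BIN is an invented override variable for swapping in a different remote runner.

```shell
#!/bin/sh
# Hypothetical stand-in for batch_cmd.sh: run one command on every host in a list.
# SSH_BIN is an assumed override hook; it defaults to ssh.
batch_cmd() {
    node_list="$1"   # file with one hostname or IP per line
    remote_cmd="$2"  # command string to run on each host
    failed=0
    while IFS= read -r host; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        echo "== $host =="
        # Reading from /dev/null stops ssh from consuming the rest of the node list.
        "${SSH_BIN:-ssh}" "$host" "$remote_cmd" < /dev/null || failed=1
    done < "$node_list"
    return "$failed"
}

# To call it the way the article does (sh batch_cmd.sh node.list "cmd"),
# save this as batch_cmd.sh and add a final line:  batch_cmd "$@"
```

The function returns non-zero if the command failed on any node, which makes it easier to notice a host that was unreachable during a batch step.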

3.3 Uninstall cluster software

1. Uninstall the software on all nodes

yum -y remove avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-core spark-master spark-worker spark-history-server spark-python sqoop sqoop2 whirr hue-common oozie-client solr solr-doc sqoop2-client zookeeper

2. Clear the yum cache

yum clean all

Delete Cloudera Manager and User Data

4.1 Delete Cloudera Manager data

1. Unmount cm_processes

sh batch_cmd.sh node.list "umount cm_processes"
sh batch_cmd.sh node.list "df -hl"

2. Delete the Cloudera Manager data on all nodes

sh batch_cmd.sh node.list "rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /var/log/cloudera* /var/run/cloudera*"

3. Delete the .scm_prepare_node.lock file on all nodes

sh batch_cmd.sh node.list "rm -rf /tmp/.scm_prepare_node.lock"
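Reading through `df -hl` output by eye after the umount step is easy to get wrong on a large cluster. As a sketch, the check can be scripted by looking for the cm_processes device name in /proc/mounts. The function name is hypothetical, and the optional second argument exists only so that a different mounts file can be supplied.

```shell
#!/bin/sh
# Sketch: report whether a mount with the given device name is still present.
# is_mounted is an illustrative name; /proc/mounts is the Linux default source.
is_mounted() {
    mount_name="$1"
    mounts_file="${2:-/proc/mounts}"
    # The first field of each line in /proc/mounts is the device name.
    awk -v name="$mount_name" '$1 == name { found = 1 } END { exit !found }' "$mounts_file"
}

# Usage on a node after the umount:
# if is_mounted cm_processes; then echo "cm_processes is still mounted"; fi
```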

4.2 Remove user data (all nodes)

1. Cluster service configuration files under /etc

sh batch_cmd.sh node.list "rm -rf /etc/cloudera* /etc/flume-ng /etc/hadoop* /etc/hbase* /etc/hive* /etc/hue /etc/impala /etc/kafka /etc/kudu /etc/ranger /etc/sentry /etc/solr /etc/spark /etc/sqoop /etc/tez /etc/zeppelin /etc/zookeeper"
sh batch_cmd.sh node.list "rm -rf /etc/alternatives/avro-tools /etc/alternatives/beeline /etc/alternatives/bigtop-detect-javahome /etc/alternatives/catalogd /etc/alternatives/cli_mt /etc/alternatives/cli_st /etc/alternatives/flume* /etc/alternatives/hadoop* /etc/alternatives/hbase* /etc/alternatives/hcat /etc/alternatives/hdfs /etc/alternatives/hive* /etc/alternatives/hiveserver2 /etc/alternatives/hue-conf /etc/alternatives/impala* /etc/alternatives/impalad /etc/alternatives/kafka* /etc/alternatives/kudu* /etc/alternatives/load_gen /etc/alternatives/mapred /etc/alternatives/oozie /etc/alternatives/ozone /etc/alternatives/parquet-tools /etc/alternatives/phoenix* /etc/alternatives/pyspark /etc/alternatives/sentry* /etc/alternatives/solr* /etc/alternatives/solrctl /etc/alternatives/spark* /etc/alternatives/sqoop* /etc/alternatives/statestored /etc/alternatives/tez-conf /etc/alternatives/yarn /etc/alternatives/zeppelin-conf /etc/alternatives/zookeeper*"

2. Executable command scripts of the services under /usr/bin/

sh batch_cmd.sh node.list "rm -rf /usr/bin/avro-tools /usr/bin/beeline /usr/bin/bigtop-detect-javahome /usr/bin/catalogd /usr/bin/cli_mt /usr/bin/cli_st /usr/bin/flume-ng /usr/bin/hadoop* /usr/bin/hbase* /usr/bin/hcat /usr/bin/hdfs /usr/bin/hive /usr/bin/hiveserver2 /usr/bin/impala* /usr/bin/impalad /usr/bin/kafka* /usr/bin/kudu* /usr/bin/load_gen /usr/bin/mapred /usr/bin/oozie /usr/bin/ozone /usr/bin/parquet-tools /usr/bin/phoenix* /usr/bin/pyspark /usr/bin/sentry /usr/bin/solrctl /usr/bin/spark* /usr/bin/sqoop* /usr/bin/statestored /usr/bin/yarn /usr/bin/zookeeper*"

3. Service data directories under /var/lib/

sh batch_cmd.sh node.list "rm -rf /var/lib/accumulo /var/lib/atlas /var/lib/cloudera* /var/lib/druid /var/lib/flink /var/lib/flume-ng /var/lib/hadoop* /var/lib/hbase /var/lib/hive /var/lib/hue /var/lib/impala /var/lib/kafka /var/lib/knox /var/lib/kudu /var/lib/livy /var/lib/llama /var/lib/oozie /var/lib/phoenix /var/lib/ranger /var/lib/solr /var/lib/spark /var/lib/sqoop* /var/lib/superset /var/lib/yarn-ce /var/lib/zeppelin /var/lib/zookeeper"
sh batch_cmd.sh node.list "rm -rf /var/lib/alternatives/avro-tools /var/lib/alternatives/beeline /var/lib/alternatives/bigtop-detect-javahome /var/lib/alternatives/catalogd /var/lib/alternatives/cli_mt /var/lib/alternatives/cli_st /var/lib/alternatives/flume* /var/lib/alternatives/hadoop* /var/lib/alternatives/hbase* /var/lib/alternatives/hcat /var/lib/alternatives/hdfs /var/lib/alternatives/hive* /var/lib/alternatives/hue-conf /var/lib/alternatives/impala* /var/lib/alternatives/impalad /var/lib/alternatives/kafka* /var/lib/alternatives/kudu* /var/lib/alternatives/load_gen /var/lib/alternatives/mapred /var/lib/alternatives/oozie /var/lib/alternatives/ozone /var/lib/alternatives/parquet-tools /var/lib/alternatives/phoenix* /var/lib/alternatives/pyspark /var/lib/alternatives/sentry* /var/lib/alternatives/solr* /var/lib/alternatives/solrctl /var/lib/alternatives/spark* /var/lib/alternatives/sqoop* /var/lib/alternatives/statestored /var/lib/alternatives/tez-conf /var/lib/alternatives/yarn /var/lib/alternatives/zeppelin-conf /var/lib/alternatives/zookeeper*"

4. Service data directories under /var/run/

sh batch_cmd.sh node.list "rm -rf /var/run/cloudera* /var/run/flume-ng /var/run/hadoop* /var/run/hbase /var/run/hdfs-sockets /var/run/hive /var/run/hue /var/run/impala /var/run/oozie /var/run/sqoop2 /var/run/zookeeper"

5. Log files under /var/log/

sh batch_cmd.sh node.list "rm -rf /var/log/atlas /var/log/catalogd /var/log/cloudera* /var/log/hadoop* /var/log/hbase /var/log/hive /var/log/hue* /var/log/impalad /var/log/impala* /var/log/kafka /var/log/oozie /var/log/phoenix /var/log/ranger /var/log/spark /var/log/statestore /var/log/yarn /var/log/zookeeper /var/local/kafka"

6. Temporary files under /tmp/

sh batch_cmd.sh node.list "rm -rf /tmp/*_resources /tmp/cmflistener* /tmp/ehcache* /tmp/embedded /tmp/hadoop* /tmp/hbase* /tmp/hive* /tmp/hsperfdata* /tmp/jetty* /tmp/oozie /tmp/scm_prepare_node* /tmp/start_* /tmp/tmp*"
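Because the commands in this section pass long path lists to rm -rf on every node, a typo such as a stray space can turn a specific path into a system directory. The sketch below is one possible guard, with a hypothetical function name and an illustrative, incomplete list of refused targets. It rejects empty, relative, and top-level paths before a list is used.

```shell
#!/bin/sh
# Sketch: reject a removal list that contains an obviously dangerous target.
# check_rm_targets and its refusal list are illustrative assumptions.
check_rm_targets() {
    for p in "$@"; do
        case "$p" in
            ''|/|'/*'|/etc|/usr|/usr/bin|/var|/var/lib|/var/log|/var/run|/tmp|/opt)
                echo "refusing dangerous target: '$p'" >&2
                return 1 ;;
            /*) ;;   # absolute path below a system directory: accepted
            *)
                echo "refusing relative path: '$p'" >&2
                return 1 ;;
        esac
    done
    return 0
}

# Usage: quote the patterns so they are checked as written, not expanded.
# check_rm_targets '/tmp/hadoop*' '/tmp/hive*' && echo "list looks sane"
```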

4.3 Delete installation directories

1. Delete /etc/yum.repos.d/cloudera*

sh batch_cmd.sh node.list "rm -rf /etc/yum.repos.d/cloudera*"

2. Delete the data directories of nn, dn, jn, yarn, impala, kudu, and so on

sh batch_cmd.sh node.list "rm -rf /dfs/* /data0/* /data1/* /data/* /impala /yarn /kudu*"

Finally, decide based on your own situation whether to remove the MySQL metadata database. This completes the uninstallation of CDP.
