Fayson's GitHub: https://github.com/fayson/cdhproject
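1. Purpose of This Document
In an earlier article, Fayson described "How to Install CDH6.0.0_beta1 on Redhat7.4"; here we install Kerberos on top of that environment. Fayson has also covered enabling Kerberos on CDH before, in "How to Enable Kerberos on a CDH Cluster" and "How to Enable Kerberos in CDH5.14 on Redhat7.3", so this article is also a chance to see what differs when enabling Kerberos on CDH6.
- Overview:
1. How to install and configure the KDC service
2. How to enable Kerberos through CDH
3. How to log in to Kerberos and access Hadoop services
4. Summary
- Test environment:
1. Operating system: Redhat7.4
2. CDH6.0.0-beta1
3. All operations are performed as the root user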
2. Installing and Configuring the KDC Service
In this document the KDC service is installed on the server that hosts Cloudera Manager Server (the KDC can be installed on a different server if you prefer).
1. Install the KDC service on the Cloudera Manager server
[root@ip-172-31-0-131 ~]# yum -y install krb5-server krb5-libs krb5-auth-dialog krb5-workstation
2. Edit /etc/krb5.conf
[root@ip-172-31-0-131 ~]# vim /etc/krb5.conf
# Configuration snippets may be placed in this directory as well
includedir /etc/krb5.conf.d/
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
dns_lookup_realm = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
default_realm = FAYSON.COM
#default_ccache_name = KEYRING:persistent:%{uid}
[realms]
FAYSON.COM = {
kdc = ip-172-31-0-131.ap-southeast-1.compute.internal
admin_server = ip-172-31-0-131.ap-southeast-1.compute.internal
}
[domain_realm]
.ap-southeast-1.compute.internal = FAYSON.COM
ap-southeast-1.compute.internal = FAYSON.COM
The entries marked in red in the original article are the ones that must be changed for your environment.
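When the same file has to be produced for a different realm, it can help to render it from a few variables rather than edit it by hand. The following is only a minimal sketch: write_krb5_conf is a hypothetical helper (not part of the krb5 packages), its four arguments are assumptions about what usually varies, and the output goes to a scratch file so that it can be reviewed before it replaces /etc/krb5.conf.

```shell
#!/bin/sh
# Sketch only: write_krb5_conf is a hypothetical helper, not a krb5 tool.
# Usage: write_krb5_conf REALM KDC_HOST DNS_DOMAIN OUTPUT_FILE
write_krb5_conf() (
    realm=$1 kdc_host=$2 dns_domain=$3 out=$4
    # The settings mirror the krb5.conf shown above; only the realm,
    # the KDC host and the DNS domain are parameterised.
    cat > "$out" <<EOF
includedir /etc/krb5.conf.d/

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 dns_lookup_realm = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 rdns = false
 default_realm = $realm
 #default_ccache_name = KEYRING:persistent:%{uid}

[realms]
 $realm = {
  kdc = $kdc_host
  admin_server = $kdc_host
 }

[domain_realm]
 .$dns_domain = $realm
 $dns_domain = $realm
EOF
)

# Render a preview with the values used in this article.
write_krb5_conf FAYSON.COM \
    ip-172-31-0-131.ap-southeast-1.compute.internal \
    ap-southeast-1.compute.internal \
    "${TMPDIR:-/tmp}/krb5.conf.preview"
```

Writing to a preview file first keeps a typo in the realm or hostname from breaking Kerberos on the running host.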
3. Edit /var/kerberos/krb5kdc/kadm5.acl
[root@ip-172-31-0-131 ~]# vim /var/kerberos/krb5kdc/kadm5.acl
*/admin@FAYSON.COM *
4. Edit /var/kerberos/krb5kdc/kdc.conf
[root@ip-172-31-0-131 ~]# vim /var/kerberos/krb5kdc/kdc.conf
[root@ip-172-31-0-131 ~]# cat /var/kerberos/krb5kdc/kdc.conf
[kdcdefaults]
kdc_ports = 88
kdc_tcp_ports = 88
[realms]
FAYSON.COM = {
#master_key_type = aes256-cts
max_renewable_life= 7d 0h 0m 0s
acl_file = /var/kerberos/krb5kdc/kadm5.acl
dict_file = /usr/share/dict/words
admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal camellia256-cts:normal camellia128-cts:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
}
The entries marked in red in the original article are the settings that need to be changed.
5. Create the Kerberos database
[root@ip-172-31-0-131 ~]# kdb5_util create -r FAYSON.COM -s
Loading random data
Initializing database '/var/kerberos/krb5kdc/principal' for realm 'FAYSON.COM',
master key name 'K/M@FAYSON.COM'
You will be prompted for the database Master Password.
It is important that you NOT FORGET this password.
Enter KDC database master key:
Re-enter KDC database master key to verify:
You are prompted here for the master password of the Kerberos database.
6. Create the Kerberos administrator account
[root@ip-172-31-0-131 ~]# kadmin.local
Authenticating as principal root/admin@FAYSON.COM with password.
kadmin.local: addprinc admin/admin@FAYSON.COM
WARNING: no policy specified for admin/admin@FAYSON.COM; defaulting to no policy
Enter password for principal "admin/admin@FAYSON.COM":
Re-enter password for principal "admin/admin@FAYSON.COM":
Principal "admin/admin@FAYSON.COM" created.
kadmin.local: exit
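admin/admin@FAYSON.COM is the Kerberos administrator account (marked in red in the original article); you are prompted to set its password.
7. Enable the Kerberos services at boot, then start the krb5kdc and kadmin services
[root@ip-172-31-0-131 ~]# systemctl enable krb5kdc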
Created symlink from /etc/systemd/system/multi-user.target.wants/krb5kdc.service to /usr/lib/systemd/system/krb5kdc.service.
[root@ip-172-31-0-131 ~]# systemctl enable kadmin
Created symlink from /etc/systemd/system/multi-user.target.wants/kadmin.service to /usr/lib/systemd/system/kadmin.service.
[root@ip-172-31-0-131 ~]# systemctl start krb5kdc
[root@ip-172-31-0-131 ~]# systemctl start kadmin
8. Test the Kerberos administrator account
[root@ip-172-31-0-131 ~]# kinit admin/admin@FAYSON.COM
Password for admin/admin@FAYSON.COM:
[root@ip-172-31-0-131 ~]#
[root@ip-172-31-0-131 ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: admin/admin@FAYSON.COM
Valid starting Expires Service principal
05/19/2018 09:19:08 05/20/2018 09:19:08 krbtgt/FAYSON.COM@FAYSON.COM
renew until 05/26/2018 09:19:08
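In scripts it is usually more convenient to test for a ticket through an exit status than to read the klist table. A minimal sketch, relying on klist -s (silent mode, which exits with status 0 when the credential cache holds unexpired credentials and non-zero otherwise); the function names has_valid_ticket and report_ticket are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the current credential cache
# holds a ticket that has not expired.
has_valid_ticket() {
    klist -s
}

# Print a one-line status that a wrapper script could log.
report_ticket() {
    if has_valid_ticket; then
        echo "ticket OK"
    else
        echo "no valid ticket; run kinit first"
    fi
}
```

A job wrapper could call report_ticket before submitting work, so that a missing or expired ticket shows up as a clear message instead of a GSS error from Hadoop.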
9. Install the Kerberos client on every cluster node, including the Cloudera Manager host
Use the batch script to install the Kerberos client on all nodes of the cluster:
[root@ip-172-31-0-131 shell]# sh ssh_do_all.sh node.list 'yum -y install krb5-libs krb5-workstation'
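The ssh_do_all.sh script itself is not shown in this article. As a rough idea of what such a helper could look like, here is a minimal sketch. It assumes that node.list holds one hostname per line and that root can ssh to every node without a password; the function name run_on_all_nodes and the DRY_RUN switch are invented for illustration and are not taken from Fayson's script.

```shell
#!/bin/sh
# Hypothetical stand-in for a "run this on every node" helper.
# Usage: run_on_all_nodes NODE_LIST_FILE 'remote command'
# Set DRY_RUN=1 to print the ssh commands instead of running them.
run_on_all_nodes() (
    node_list=$1 remote_cmd=$2
    # "|| [ -n "$host" ]" also handles a last line without a newline.
    while read -r host _rest || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh root@$host $remote_cmd"
        else
            # -n stops ssh from consuming the rest of the node list on stdin.
            ssh -n "root@$host" "$remote_cmd" || echo "FAILED on $host" >&2
        fi
    done < "$node_list"
)
```

Running it once with DRY_RUN=1 shows exactly which hosts would be touched before any package is installed.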
10. Install an additional package on the Cloudera Manager Server host
[root@ip-172-31-0-131 shell]# yum -y install openldap-clients
11. Copy krb5.conf from the KDC server to all Kerberos clients
Use the batch script to copy the Kerberos server's krb5.conf into the /etc directory on every cluster node:
[root@ip-172-31-0-131 shell]# sh bk_cp.sh node.list /etc/krb5.conf /etc/
3. Enabling Kerberos on the CDH Cluster
1. Add an administrator account for Cloudera Manager in the KDC
[root@ip-172-31-0-131 shell]# kadmin.local
Authenticating as principal admin/admin@FAYSON.COM with password.
kadmin.local: addprinc cloudera-scm/admin@FAYSON.COM
WARNING: no policy specified for cloudera-scm/admin@FAYSON.COM; defaulting to no policy
Enter password for principal "cloudera-scm/admin@FAYSON.COM":
Re-enter password for principal "cloudera-scm/admin@FAYSON.COM":
Principal "cloudera-scm/admin@FAYSON.COM" created.
kadmin.local: exit
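Before moving on to the Cloudera Manager wizard, it can be worth confirming that the principal really exists. A minimal sketch: kadmin.local -q "listprincs" prints the principals one per line, and the hypothetical helper principal_in_list (a name invented here) looks for an exact match in that list on standard input.

```shell
#!/bin/sh
# Hypothetical helper: reads a principal list on stdin (one per line)
# and succeeds if PRINCIPAL appears as a whole line.
# Usage: principal_in_list PRINCIPAL
principal_in_list() {
    # -F: fixed string, -x: match the whole line, -q: no output.
    grep -Fxq -- "$1"
}

# On the KDC host this could be used as:
#   kadmin.local -q "listprincs" | principal_in_list cloudera-scm/admin@FAYSON.COM \
#       && echo "principal exists"
```

Matching the whole line avoids a false positive from a longer principal that merely contains the same text.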
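2. In Cloudera Manager, go to "Administration" -> "Security"
3. Select "Enable Kerberos" to open the wizard
4. Make sure every item on the wizard's checklist has been completed
5. Click "Continue" and configure the KDC information: the KDC type, KDC server, KDC realm, encryption types, and the renewable lifetime of the service principals to be created (hdfs, yarn, hbase, hive, and so on)
6. Letting Cloudera Manager manage krb5.conf is not recommended; click "Continue"
7. Enter the Kerberos administrator account for Cloudera Manager. It must match the account created earlier (cloudera-scm/admin@FAYSON.COM). Click "Continue"
8. Click "Continue" to enable Kerberos
9. Once Kerberos has been enabled, click "Continue"
10. Select the option to restart the cluster, then click "Continue"
11. After the cluster restart completes, click "Continue"
12. Click "Continue"
Click "Finish". Kerberos is now enabled.
4. Using Kerberos
To run MapReduce jobs and work with Hive as the fayson user, the fayson user must exist on every node of the cluster.
1. Create a fayson principal with kadmin
[root@ip-172-31-0-131 shell]# kadmin.local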
Authenticating as principal admin/admin@FAYSON.COM with password.
kadmin.local: addprinc fayson@FAYSON.COM
WARNING: no policy specified for fayson@FAYSON.COM; defaulting to no policy
Enter password for principal "fayson@FAYSON.COM":
Re-enter password for principal "fayson@FAYSON.COM":
Principal "fayson@FAYSON.COM" created.
kadmin.local: exit
You have new mail in /var/spool/mail/root
2. Log in to Kerberos as the fayson user
[root@ip-172-31-0-131 shell]# kdestroy
[root@ip-172-31-0-131 shell]# klist
klist: No credentials cache found (filename: /tmp/krb5cc_0)
[root@ip-172-31-0-131 shell]# kinit fayson
Password for fayson@FAYSON.COM:
[root@ip-172-31-0-131 shell]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: fayson@FAYSON.COM
Valid starting Expires Service principal
05/19/2018 11:50:13 05/20/2018 11:50:13 krbtgt/FAYSON.COM@FAYSON.COM
renew until 05/26/2018 11:50:13
3. Add the fayson user on all cluster nodes
Use the batch script to add the fayson user on every node:
[root@ip-172-31-0-131 shell]# sh ssh_do_all.sh node.list "useradd fayson"
4. Run a MapReduce job
[root@ip-172-31-0-131 hadoop-mapreduce]# hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 1
5. Test the Hive connection with beeline
[root@ip-172-31-0-131 hadoop-mapreduce]# beeline
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.339140/jars/log4j-slf4j-impl-2.8.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-6.0.0-1.cdh6.0.0.p0.339140/jars/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 2.1.1-cdh6.0.0-beta1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-0-131.ap-southeast-1.compute.internal@FAYSON.COM
Connecting to jdbc:hive2://localhost:10000/;principal=hive/ip-172-31-0-131.ap-southeast-1.compute.internal@FAYSON.COM
Connected to: Apache Hive (version 2.1.1-cdh6.0.0-beta1)
Driver: Hive JDBC (version 2.1.1-cdh6.0.0-beta1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000/> show tables;
INFO : Compiling command(queryId=hive_20180519115652_3f96406a-3d1b-46ef-841b-2ff75844b8f9): show tables
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20180519115652_3f96406a-3d1b-46ef-841b-2ff75844b8f9); Time taken: 1.228 seconds
INFO : Executing command(queryId=hive_20180519115652_3f96406a-3d1b-46ef-841b-2ff75844b8f9): show tables
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20180519115652_3f96406a-3d1b-46ef-841b-2ff75844b8f9); Time taken: 0.04 seconds
INFO : OK
-----------
| tab_name |
-----------
| test |
-----------
1 row selected (2.674 seconds)
Insert a row into the test table:
0: jdbc:hive2://localhost:10000/> insert into test values(2, 'fayson1');
Run a count query:
0: jdbc:hive2://localhost:10000/> select count(*) from test;
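The beeline connection string from step 5 is easy to mistype, because the principal= part must name the Hive service principal (hive/<HiveServer2 host>@<realm>) and not the user who ran kinit. A small sketch that assembles the URL from its parts; hive_jdbc_url is a hypothetical helper name, and the argument order is an arbitrary choice made for this example.

```shell
#!/bin/sh
# Hypothetical helper: build a Kerberos-enabled HiveServer2 JDBC URL.
# Usage: hive_jdbc_url CONNECT_HOST PORT HS2_PRINCIPAL_HOST REALM
hive_jdbc_url() {
    printf 'jdbc:hive2://%s:%s/;principal=hive/%s@%s\n' "$1" "$2" "$3" "$4"
}

# Rebuild the URL used in this article.
url=$(hive_jdbc_url localhost 10000 \
    ip-172-31-0-131.ap-southeast-1.compute.internal FAYSON.COM)
echo "$url"

# With a valid ticket, a non-interactive check could then look like:
#   beeline -u "$url" -e 'show tables;'
```

Keeping the connect host and the principal host as separate arguments reflects the session above, which connects to localhost while the principal carries the fully qualified hostname.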
5. Common Problems
1. Running a MapReduce job as a Kerberos user fails with the following error
main : run as user is fayson
main : requested yarn user is fayson
Requested user fayson is not whitelisted and has id 501,which is below the minimum allowed 1000
Failing this attempt. Failing the application.
17/09/02 20:05:04 INFO mapreduce.Job: Counters: 0
Job Finished in 6.184 seconds
java.io.FileNotFoundException: File does not exist: hdfs://ip-172-31-6-148:8020/user/fayson/QuasiMonteCarlo_1504382696029_1308422444/out/reduce-out
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1266)
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1258)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1258)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1820)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1844)
at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
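Cause: YARN refuses jobs submitted by users whose user id is below 1000.
Fix: lower YARN's min.user.id setting to a value at or below the user's id (501 in the log above).
2. After kinit, running an MR job reports "User fayson not found"
Cause: the fayson user does not exist on the cluster nodes.
Fix: add the fayson user on every node of the cluster.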
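Both problems above can be checked on a node before a job is submitted. A minimal sketch under stated assumptions: check_submit_user is a hypothetical helper, it takes the output of id -u (empty when the OS user does not exist) and the configured minimum, and it does not model YARN's whitelist of allowed users that the error message also mentions.

```shell
#!/bin/sh
# Hypothetical pre-flight check for a job-submitting user on one node.
# Usage: check_submit_user UID_OR_EMPTY MIN_USER_ID
#   UID_OR_EMPTY is the output of `id -u USER`, or empty if USER is missing.
check_submit_user() {
    if [ -z "$1" ]; then
        echo "missing: create the OS user on this node"
    elif [ "$1" -lt "$2" ]; then
        echo "blocked: uid $1 is below min.user.id $2"
    else
        echo "ok"
    fi
}

# Typical use on a node (1000 is the minimum quoted in the error above):
#   check_submit_user "$(id -u fayson 2>/dev/null)" 1000
```

Run across all nodes, this distinguishes the "User fayson not found" case from the min.user.id case without having to wait for a job to fail.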
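6. Summary
- Enabling Kerberos on CDH6 is essentially the same process as on CDH5; the only differences are minor changes in the CDH6 user interface.
- Before Kerberos can be enabled on a CDH cluster, the Kerberos services (krb5kdc and kadmin) must be installed.
- The Kerberos client must be installed on every cluster node so that the node can communicate with the KDC.
- The Cloudera Manager Server node additionally needs the openldap-clients package.
- Once Kerberos is enabled, submitting jobs as a custom user such as fayson requires that the user exist in the operating system of every cluster node; otherwise the job fails.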
Original article. Reposting is welcome; please credit the source: reposted from the WeChat public account Hadoop实操.