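Impala currently supports two authentication methods: username/password and Kerberos. Because Impala table data usually lives on HDFS, Impala clusters are often run with Kerberos authentication enabled. If you are new to Impala, Kerberos can be a headache, so this post uses a simple example to show how to access a Kerberos-enabled Impala cluster from code. Without further ado, here is the code: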
package com.netease.impala;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * @Author: Sheng Wang
 * @Description:
 * @Date: Created in 2019/2/22
 * @Modified By:
 */
public class ImpalaJdbc {
    public static final String KRB5_CONF = "/Users/project/keytab/krb5.conf";
    public static final String PRINCIPAL = "impala/dev@xxx";
    public static final String KEYTAB = "/Users/project/keytab/impala.keytab";
    public static final String URL = "jdbc:hive2://localhost:21050/default;principal=impala/_HOST@xxx";
    private static final String HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) {
        // Tell the JVM which krb5.conf to use before any Kerberos call is made.
        System.setProperty("java.security.krb5.conf", KRB5_CONF);
        try {
            // Switch Hadoop security to Kerberos and log in with the keytab.
            Configuration conf = new Configuration();
            conf.set("hadoop.security.authentication", "Kerberos");
            UserGroupInformation.setConfiguration(conf);
            UserGroupInformation.loginUserFromKeytab(PRINCIPAL, KEYTAB);
            System.out.println("Login from keytab " + KEYTAB + " successful");

            Class.forName(HIVE_DRIVER);
            // try-with-resources closes the result set, statement and connection.
            try (Connection conn = DriverManager.getConnection(URL);
                 Statement state = conn.createStatement();
                 ResultSet rs = state.executeQuery("show databases;")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        } catch (IOException e) {
            System.out.println("Login from keytab " + KEYTAB + " failed.");
            e.printStackTrace();
        } catch (ClassNotFoundException ee) {
            System.out.println("Cannot find driver " + HIVE_DRIVER);
            ee.printStackTrace();
        } catch (SQLException eee) {
            System.out.println("SQL execute failed.");
            eee.printStackTrace();
        }
    }
}
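The example connects to Impala through the Hive JDBC driver. A few of the constants need explaining:
- KRB5_CONF: the Kerberos krb5.conf configuration file, usually found at /etc/krb5.conf on the server. If you are unsure where it is, ask the relevant technical staff.
- KEYTAB: the keytab file used for authentication. Each business team usually has its own keytab for accessing the corresponding HDFS/Hive/Spark services.
- PRINCIPAL: the principal that corresponds to the keytab file. On a Linux machine, run klist -kt xxx.keytab to list the principals stored in a keytab.
- URL: the connection address of the Impala cluster. Every Impala cluster has a different address, so ask your contact for the details.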
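As a minimal sketch of how these constants fit together, the helper below assembles a URL in the same shape as the URL constant and checks that the krb5.conf and keytab paths are readable before any login is attempted. The class name ImpalaKerberosSettings, both method names, and the realm EXAMPLE.COM are hypothetical and exist only for illustration; they are not part of Impala or the Hive JDBC driver. Note that the principal inside the URL is the Impala service principal, not the client principal from the keytab, and, as far as I know, the Hive JDBC driver replaces _HOST with the host name being connected to.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Impala or hive-jdbc. */
final class ImpalaKerberosSettings {

    private ImpalaKerberosSettings() {
    }

    /**
     * Builds a URL shaped like the URL constant in the example:
     * jdbc:hive2://host:port/database;principal=service/_HOST@REALM
     */
    static String buildUrl(String host, int port, String database,
                           String serviceName, String realm) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database
                + ";principal=" + serviceName + "/_HOST@" + realm;
    }

    /** Returns every given path that is not a readable regular file. */
    static List<String> findUnreadable(String... paths) {
        List<String> problems = new ArrayList<>();
        for (String p : paths) {
            Path path = Paths.get(p);
            if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
                problems.add(p);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // EXAMPLE.COM is a placeholder realm; use the realm of your own cluster.
        System.out.println(buildUrl("localhost", 21050, "default", "impala", "EXAMPLE.COM"));

        // Paths copied from the example; an empty list means both files are readable.
        System.out.println(findUnreadable(
                "/Users/project/keytab/krb5.conf",
                "/Users/project/keytab/impala.keytab"));
    }
}
```

Running a check like findUnreadable first can make a wrong path easier to spot than the Kerberos login failure it would otherwise cause.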
The following dependencies are required:
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.3</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies>
Once the Connection is established, you can use it just like any ordinary SQL query engine. I hope this helps.