Connecting to a Kerberos-Enabled Impala Cluster from Java

2022-05-20 08:15:28

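Impala currently supports two authentication methods: username/password and Kerberos. Because Impala table data usually lives on HDFS, Impala clusters are often run with Kerberos authentication enabled. Kerberos can be a headache for anyone connecting to Impala for the first time, so this post uses a simple example to show how to access a Kerberos-enabled Impala cluster from code. Without further ado, here is the code: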

package com.netease.impala;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * @Author: Sheng Wang
 * @Description:
 * @Date: Created in 2019/2/22
 * @Modified By:
 */
public class ImpalaJdbc {
    public static final String KRB5_CONF = "/Users/project/keytab/krb5.conf";
    public static final String PRINCIPAL = "impala/dev@xxx";
    public static final String KEYTAB = "/Users/project/keytab/impala.keytab";
    public static final String URL = "jdbc:hive2://localhost:21050/default;principal=impala/_HOST@xxx";

    private static final String HIVE_DRIVER = "org.apache.hive.jdbc.HiveDriver";

    public static void main(String[] args) {
        System.setProperty("java.security.krb5.conf", KRB5_CONF);
        try {
            Configuration conf = new Configuration();
            conf.set("hadoop.security.authentication", "Kerberos");
            UserGroupInformation.setConfiguration(conf);
            UserGroupInformation.loginUserFromKeytab(PRINCIPAL, KEYTAB);
            System.out.println("Login from keytab " + KEYTAB + " successful");

            Class.forName(HIVE_DRIVER);
            // try-with-resources closes the connection, statement, and result set
            try (Connection conn = DriverManager.getConnection(URL);
                 Statement state = conn.createStatement();
                 ResultSet rs = state.executeQuery("show databases;")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        } catch (IOException e) {
            System.out.println("Login from keytab " + KEYTAB + " failed.");
            e.printStackTrace();
        } catch (ClassNotFoundException ee) {
            System.out.println("Cannot find driver " + HIVE_DRIVER);
            ee.printStackTrace();
        } catch (SQLException eee) {
            System.out.println("SQL execute failed.");
            eee.printStackTrace();
        }
    }
}

The code connects to Impala through the Hive JDBC driver. A few of the constants need explaining:

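  • KRB5_CONF: the Kerberos krb5.conf configuration file, usually found at /etc/krb5.conf on the server. If you are not sure where yours is, ask the people who maintain the cluster;
  • KEYTAB: the keytab file used for authentication. Each business team usually has its own keytab for accessing the corresponding HDFS/Hive/Spark services;
  • PRINCIPAL: the principal that corresponds to the keytab file. On a Linux machine you can list the principals in a keytab with klist -kt xxx.keytab;
  • URL: the connection address of the Impala cluster. It differs for every Impala cluster, so ask your contact for the specifics. Note that the principal parameter inside the URL is the Kerberos principal of the Impala service itself, not the client PRINCIPAL from the keytab.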
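Since these four values are usually the first thing to go wrong, it can help to validate them before attempting the Kerberos login. The following is a minimal sketch using only the JDK; the class name ImpalaKerberosSettings and its methods buildUrl and requireReadable are hypothetical helpers invented for illustration, and the host, port, and realm in the usage note are placeholders rather than values from a real cluster.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Hadoop, Hive, or Impala).
final class ImpalaKerberosSettings {
    private ImpalaKerberosSettings() {}

    /**
     * Assembles a hive2-style JDBC URL of the same shape as the URL constant.
     * servicePrincipal is the Impala service principal, e.g. impala/_HOST@REALM.
     */
    static String buildUrl(String host, int port, String database, String servicePrincipal) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database
                + ";principal=" + servicePrincipal;
    }

    /** Fails early with a readable message if krb5.conf or the keytab is missing. */
    static Path requireReadable(String label, String path) {
        Path p = Paths.get(path);
        if (!Files.isRegularFile(p) || !Files.isReadable(p)) {
            throw new IllegalArgumentException(label + " is missing or not readable: " + path);
        }
        return p;
    }
}
```

For example, calling ImpalaKerberosSettings.requireReadable("keytab", KEYTAB) before the login step reports a wrong path directly instead of leaving it to surface as a login failure, and buildUrl("localhost", 21050, "default", "impala/_HOST@EXAMPLE.COM") produces a URL in the same format as the constant above.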

The following dependencies are required:

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.3</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>1.2.1</version>
    </dependency>
</dependencies>

Once the Connection is established, you can use it just as you would with any ordinary SQL query engine. I hope this helps.
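As one illustration of using the connection like any other JDBC source, the sketch below wraps a query in a small helper that returns the first column of every row. It relies only on the standard java.sql interfaces; the class name ImpalaQueries and the method firstColumn are hypothetical, and it assumes the caller passes in a Connection that was already opened after the Kerberos login.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper built on plain JDBC; nothing here is Impala-specific.
final class ImpalaQueries {
    private ImpalaQueries() {}

    /** Runs the query and returns the first column of each row as a string. */
    static List<String> firstColumn(Connection conn, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        // The statement and result set are closed automatically; the caller
        // stays responsible for closing the connection itself.
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                values.add(rs.getString(1));
            }
        }
        return values;
    }
}
```

With the connection from the example, ImpalaQueries.firstColumn(conn, "show databases") would return the database names as a list instead of printing them.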
