Using the HBase Java API, Advanced: Working with Filters

2021-01-26 10:56:05

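In this post I cover more advanced HBase usage. For the basic operations, see my earlier post 《HBase的JavaAPI使用–基础篇》 (Using the HBase Java API: Basics).

Before showing any code, let me introduce filters, which are the main subject of this post.

There are many filter types, but they fall into two broad groups: comparison filters and dedicated filters. A filter runs on the server side, checks whether the data meets the condition, and returns only the matching data to the client.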

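Comparison operators for HBase filters:

LESS: <
LESS_OR_EQUAL: <=
EQUAL: =
NOT_EQUAL: <>
GREATER_OR_EQUAL: >=
GREATER: >
NO_OP: excludes everything

HBase comparators (these specify the comparison mechanism):

BinaryComparator: compares the given byte array in lexicographic byte order, using Bytes.compareTo(byte[])
BinaryPrefixComparator: same as above, but only compares the leading (left-hand) bytes
NullComparator: checks whether the given value is null
BitComparator: compares bit by bit
RegexStringComparator: matches against a regular expression; supports only EQUAL and NOT_EQUAL
SubstringComparator: checks whether the given substring occurs in the value; supports only EQUAL and NOT_EQUAL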
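Because BinaryComparator orders values byte by byte rather than numerically, a row key such as "10" sorts before "9". The sketch below illustrates that ordering in plain Java with no HBase dependency. The class ByteOrderDemo and its methods are hypothetical names for illustration only, and Arrays.compareUnsigned (Java 9+) is used as a stand-in that is assumed to behave like the lexicographic unsigned-byte comparison described above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: models byte-wise comparison semantics.
class ByteOrderDemo {
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Full comparison: every byte takes part, and a proper prefix sorts first.
    static int compareBinary(String a, String b) {
        return Arrays.compareUnsigned(bytes(a), bytes(b));
    }

    // Prefix comparison: only the first prefix.length bytes of the candidate take part.
    static int comparePrefix(String candidate, String prefix) {
        byte[] c = bytes(candidate);
        byte[] p = bytes(prefix);
        int end = Math.min(c.length, p.length);
        return Arrays.compareUnsigned(c, 0, end, p, 0, p.length);
    }

    public static void main(String[] args) {
        System.out.println(compareBinary("0002", "0003") < 0);     // zero-padded keys keep numeric order
        System.out.println(compareBinary("10", "9") < 0);          // unpadded keys do not
        System.out.println(comparePrefix("0003abc", "0003") == 0); // equal on the leading bytes
        System.out.println(compareBinary("0003abc", "0003") > 0);  // but not equal as a whole
    }
}
```

This is why the row keys in the examples that follow are zero-padded ("0001", "0002", and so on): padding keeps the byte order aligned with the numeric order.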

Now on to the code!

I. Comparison filters

1. Row key filter: RowFilter

Use RowFilter to return every row whose row key is less than 0003.

    /**
     * HBase row key filter: RowFilter
     * [Use RowFilter to return every row whose row key is less than 0003]
     * Requires: import static org.apache.hadoop.hbase.filter.CompareFilter.CompareOp.*;
     * @throws Exception
     */
    @Test
    public void rowKeyFilter() throws Exception {

        // Connect to the cluster
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(conf);

        // Get the table
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));

        Scan scan = new Scan();

        // Create a filter (row key < "0003") and attach it to the scan
        RowFilter rowFilter = new RowFilter(LESS, new BinaryComparator(Bytes.toBytes("0003")));
        scan.setFilter(rowFilter);

        // The scanner iterates over Result objects, one per row
        ResultScanner scanner = mytest1.getScanner(scan);

        for (Result result : scanner) {
            // Row key
            System.out.println("rowkey:" + Bytes.toString(result.getRow()));

            // Print the values of the given column family and columns
            System.out.println("id:" + Bytes.toInt(result.getValue("f1".getBytes(), "id".getBytes())));
            System.out.println("age:" + Bytes.toInt(result.getValue("f1".getBytes(), "age".getBytes())));
            System.out.println("name:" + Bytes.toString(result.getValue("f1".getBytes(), "name".getBytes())));
        }

        // Release resources
        scanner.close();
        mytest1.close();
        connection.close();
    }
2. Column family filter: FamilyFilter

Return the data in every column family whose name is less than f2.

    /**
     * Column family filter: FamilyFilter
     * [Return the data in every column family whose name is less than f2]
     * Requires: import static org.apache.hadoop.hbase.filter.CompareFilter.CompareOp.*;
     * @throws Exception
     */
    @Test
    public void familyFilter() throws Exception {

        // Get a connection
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(conf);

        // Get the table
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));

        Scan scan = new Scan();

        // Create a filter (family name < "f2") and attach it to the scan
        FamilyFilter familyFilter = new FamilyFilter(LESS, new BinaryComparator("f2".getBytes()));
        scan.setFilter(familyFilter);

        ResultScanner resultScanner = mytest1.getScanner(scan);

        for (Result result : resultScanner) {
            // Row key
            System.out.println(Bytes.toString(result.getRow()));

            System.out.println(Bytes.toString(result.getValue("f1".getBytes(), "name".getBytes())));
        }

        // Release resources
        resultScanner.close();
        mytest1.close();
        connection.close();
    }
3. Column qualifier filter: QualifierFilter

Query only the values of the name column.

    /**
     * HBase column qualifier filter
     * [Query only the name column]
     * Requires: import static org.apache.hadoop.hbase.filter.CompareFilter.CompareOp.*;
     * @throws Exception
     */
    @Test
    public void qualifier() throws Exception {

        // Get a connection
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(conf);

        // Get the table
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));

        // Full table scan
        Scan scan = new Scan();

        // Keep only cells whose qualifier contains "name"
        // (SubstringComparator matches substrings, so a qualifier such as "username" would also pass)
        QualifierFilter qualifierFilter = new QualifierFilter(EQUAL, new SubstringComparator("name"));
        scan.setFilter(qualifierFilter);

        ResultScanner scanner = mytest1.getScanner(scan);

        // Each Result is one row
        for (Result result : scanner) {
            // Row key
            System.out.println(Bytes.toString(result.getRow()));

            // The filter drops every column except name, so f1:id prints as null here
            System.out.println(Bytes.toString(result.getValue("f1".getBytes(), "id".getBytes())));

            System.out.println(Bytes.toString(result.getValue("f1".getBytes(), "name".getBytes())));
        }

        // Release resources
        scanner.close();
        mytest1.close();
        connection.close();
    }
4. Column value filter: ValueFilter

Return the data, across all columns, whose value contains 8.

    /**
     * Column value filter: ValueFilter
     * [Return the data, across all columns, whose value contains 8]
     * Requires: import static org.apache.hadoop.hbase.filter.CompareFilter.CompareOp.*;
     * @throws Exception
     */
    @Test
    public void valueFilter() throws Exception {

        // Get a connection
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(conf);

        // Get the table
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));

        // Full table scan
        Scan scan = new Scan();

        // Keep only cells whose value contains the substring "8"
        ValueFilter valueFilter = new ValueFilter(EQUAL, new SubstringComparator("8"));
        scan.setFilter(valueFilter);

        ResultScanner scanner = mytest1.getScanner(scan);

        for (Result result : scanner) {
            // Row key
            System.out.println(Bytes.toString(result.getRow()));

            // Print the phone column; this is null when that cell did not match the filter
            System.out.println(Bytes.toString(result.getValue("f2".getBytes(), "phone".getBytes())));
        }

        // Release resources
        scanner.close();
        mytest1.close();
        connection.close();
    }

II. Dedicated filters

1. Single column value filter: SingleColumnValueFilter

SingleColumnValueFilter returns every column of each row that satisfies the condition.

    /**
     * Single column value filter: returns the whole row when the condition is met
     */
    @Test
    public void singleColumnFilter() throws IOException {
        // Get a connection
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(configuration);
        Table myuser = connection.getTable(TableName.valueOf("myuser"));
        Scan scan = new Scan();
        // Match rows whose f1:name equals "刘备"
        SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("f1".getBytes(), "name".getBytes(), CompareFilter.CompareOp.EQUAL, "刘备".getBytes());
        scan.setFilter(singleColumnValueFilter);
        ResultScanner resultScanner = myuser.getScanner(scan);
        for (Result result : resultScanner) {
            // Row key
            System.out.println(Bytes.toString(result.getRow()));
            // Print the values of the given column families and columns
            System.out.println(Bytes.toInt(result.getValue("f1".getBytes(), "id".getBytes())));
            System.out.println(Bytes.toString(result.getValue("f1".getBytes(), "name".getBytes())));
            System.out.println(Bytes.toString(result.getValue("f2".getBytes(), "phone".getBytes())));
        }
        // Release resources
        resultScanner.close();
        myuser.close();
        connection.close();
    }
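2. Column value exclusion filter: SingleColumnValueExcludeFilter

The counterpart of SingleColumnValueFilter: rows are selected by the same condition, but the column that was tested is excluded from the result and all other columns are returned.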
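To make the difference between the two filters concrete, here is a minimal plain-Java model. It is not HBase code: the class SingleColumnModel, its methods, and the "family:qualifier" string keys are hypothetical and exist only to illustrate the semantics. The filterIfMissing flag models the behavior that a row lacking the tested column passes by default.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory model of the two filters; a row is a map of "family:qualifier" to value.
class SingleColumnModel {

    // Keeps the whole row when the tested column equals the expected value.
    static Optional<Map<String, String>> singleColumnValue(
            Map<String, String> row, String column, String expected, boolean filterIfMissing) {
        String actual = row.get(column);
        if (actual == null) {
            // A row without the tested column passes unless filterIfMissing is set
            return filterIfMissing ? Optional.empty() : Optional.of(row);
        }
        return actual.equals(expected) ? Optional.of(row) : Optional.empty();
    }

    // Same row selection, but the tested column is removed from the returned row.
    static Optional<Map<String, String>> singleColumnValueExclude(
            Map<String, String> row, String column, String expected, boolean filterIfMissing) {
        return singleColumnValue(row, column, expected, filterIfMissing).map(r -> {
            Map<String, String> copy = new LinkedHashMap<>(r);
            copy.remove(column);
            return copy;
        });
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("f1:id", "1");
        row.put("f1:name", "Liu Bei");
        row.put("f2:phone", "123");
        System.out.println(singleColumnValue(row, "f1:name", "Liu Bei", false));
        System.out.println(singleColumnValueExclude(row, "f1:name", "Liu Bei", false));
    }
}
```

In an actual scan, SingleColumnValueExcludeFilter takes the same family, qualifier, operator, and value arguments as the SingleColumnValueFilter example above, so swapping the class name in that test method should be enough to try it.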

3. Row key prefix filter: PrefixFilter

Return every row whose row key starts with 00.

    /**
     * Row key prefix filter: PrefixFilter
     * [Return every row whose row key starts with 00]
     *
     * @throws IOException
     */
    @Test
    public void preFilter() throws IOException {

        // Get a connection
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(configuration);

        // Get the table
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));

        // Full table scan
        Scan scan = new Scan();

        // Keep only rows whose row key starts with "00"
        PrefixFilter prefixFilter = new PrefixFilter("00".getBytes());
        scan.setFilter(prefixFilter);

        ResultScanner resultScanner = mytest1.getScanner(scan);

        // Each Result is one row
        for (Result result : resultScanner) {
            // Row key
            System.out.println(Bytes.toString(result.getRow()));

            // Print the values of the given column families and columns
            System.out.println(Bytes.toInt(result.getValue("f1".getBytes(), "id".getBytes())));
            System.out.println(Bytes.toString(result.getValue("f1".getBytes(), "name".getBytes())));
            System.out.println(Bytes.toString(result.getValue("f2".getBytes(), "phone".getBytes())));
        }

        // Release resources
        resultScanner.close();
        mytest1.close();
        connection.close();
    }
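A PrefixFilter on its own does not move the start of the scan, so on a large table the scan may read many rows before it reaches the prefix. A common complement is to bound the scan to the key range that the prefix covers. The sketch below shows one way to compute the exclusive upper bound of that range in plain Java; PrefixRange and stopRowForPrefix are hypothetical names, not HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of HBase: computes the key range covered by a prefix.
class PrefixRange {

    /**
     * Returns the smallest key that sorts after every key starting with the prefix.
     * An empty array means "no upper bound" (empty prefix, or a prefix of only 0xFF bytes).
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Trailing 0xFF bytes cannot be incremented, so drop them
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }

    public static void main(String[] args) {
        byte[] stop = stopRowForPrefix("00".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(stop, StandardCharsets.US_ASCII));
    }
}
```

With the 1.x-style API used in this post, the prefix itself could then be passed to scan.setStartRow and the computed value to scan.setStopRow, with the PrefixFilter left in place or dropped.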
4. Paging filter: PageFilter

Use PageFilter to implement paging.

    /**
     * Paging filter: PageFilter
     * [Use PageFilter to implement paging]
     * @throws IOException
     */
    @Test
    public void pageFilter2() throws IOException {
        // Get a connection
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(configuration);

        // Get the table
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));
        int pageNum = 3;
        int pageSize = 2;

        // Note: the setMaxResultSize calls were removed because that setting is a size in bytes,
        // not a row count; PageFilter is what limits the number of rows.
        Scan scan = new Scan();
        if (pageNum == 1) {
            // First page: scan from the start of the table and keep pageSize rows
            PageFilter filter = new PageFilter(pageSize);
            scan.setStartRow(Bytes.toBytes(""));
            scan.setFilter(filter);
            ResultScanner scanner = mytest1.getScanner(scan);
            for (Result result : scanner) {
                // Row key
                System.out.println(Bytes.toString(result.getRow()));
                // Print the value of the given column family and column
                System.out.println(Bytes.toString(result.getValue("f1".getBytes(), "name".getBytes())));
            }
            scanner.close();
        } else {
            // Pass 1: read (pageNum - 1) * pageSize + 1 rows;
            // the last row read is the first row of the requested page
            String startRowKey = "";
            PageFilter filter = new PageFilter((pageNum - 1) * pageSize + 1);
            scan.setStartRow(Bytes.toBytes(startRowKey));
            scan.setFilter(filter);
            ResultScanner scanner = mytest1.getScanner(scan);
            for (Result result : scanner) {
                startRowKey = Bytes.toString(result.getRow());
            }
            scanner.close();

            // Pass 2: start at that row key and read pageSize rows
            Scan scan2 = new Scan();
            scan2.setStartRow(Bytes.toBytes(startRowKey));
            PageFilter filter2 = new PageFilter(pageSize);
            scan2.setFilter(filter2);

            ResultScanner scanner1 = mytest1.getScanner(scan2);
            for (Result result : scanner1) {
                System.out.println(Bytes.toString(result.getRow()));
            }
            scanner1.close();
        }

        // Release resources
        mytest1.close();
        connection.close();
    }
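The two-pass approach in the else branch can be easier to follow on an in-memory list of sorted row keys. The sketch below is a plain-Java model under that assumption; PagingModel and page are hypothetical names. Unlike the test method above, the model returns an empty page when the requested page lies past the end of the data. Also keep in mind that PageFilter is applied separately on each region server, so on a multi-region table a scan may return more than pageSize rows and the client may need to enforce the limit itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of two-pass paging over sorted row keys.
class PagingModel {

    static List<String> page(List<String> sortedRowKeys, int pageNum, int pageSize) {
        // Pass 1: read (pageNum - 1) * pageSize + 1 rows; the last one starts the requested page
        int limit = (pageNum - 1) * pageSize + 1;
        String startRowKey = null;
        int seen = 0;
        for (String key : sortedRowKeys) {
            if (seen == limit) {
                break;
            }
            startRowKey = key;
            seen++;
        }

        List<String> result = new ArrayList<>();
        if (seen < limit) {
            return result; // requested page is past the end of the data
        }

        // Pass 2: start at startRowKey and collect pageSize rows
        for (String key : sortedRowKeys) {
            if (key.compareTo(startRowKey) < 0) {
                continue;
            }
            if (result.size() == pageSize) {
                break;
            }
            result.add(key);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("0001", "0002", "0003", "0004", "0005", "0006", "0007");
        System.out.println(page(keys, 3, 2));
    }
}
```

The first pass costs a scan over all preceding pages, so this pattern suits small page numbers; for deep paging it is usually cheaper to remember the last row key of the previous page and start the next scan from there.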

III. Combining multiple filters: FilterList

Requirement: use SingleColumnValueFilter to find the rows whose name in column family f1 is 刘备, and that also have a row key starting with 00 (PrefixFilter).

    /**
     * Combining multiple filters: FilterList
     * Requirement: use SingleColumnValueFilter to find the rows whose name in column family f1 is 刘备,
     * and that also have a row key starting with 00 (PrefixFilter)
     * Requires: import static org.apache.hadoop.hbase.filter.CompareFilter.CompareOp.*;
     * @throws Exception
     */
    @Test
    public void manyFilter() throws Exception {

        // Get a connection
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(configuration);

        // Get the table
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));

        // Full table scan
        Scan scan = new Scan();

        // A FilterList created with the no-arg constructor requires every filter to pass
        FilterList filterList = new FilterList();

        // Single column value filter
        SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter("f1".getBytes(), "name".getBytes(), EQUAL, "刘备".getBytes());

        // Row key prefix filter
        PrefixFilter prefixFilter = new PrefixFilter("00".getBytes());

        filterList.addFilter(singleColumnValueFilter);
        filterList.addFilter(prefixFilter);

        // Attach the filter list to the scan
        scan.setFilter(filterList);

        ResultScanner scanner = mytest1.getScanner(scan);

        for (Result result : scanner) {
            // Row key
            System.out.println(Bytes.toString(result.getRow()));

            // Print the values of the given column families and columns
            System.out.println(Bytes.toInt(result.getValue("f1".getBytes(), "id".getBytes())));
            System.out.println(Bytes.toString(result.getValue("f1".getBytes(), "name".getBytes())));
            System.out.println(Bytes.toString(result.getValue("f2".getBytes(), "phone".getBytes())));
        }

        // Release resources
        scanner.close();
        mytest1.close();
        connection.close();
    }
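The example above relies on the default FilterList behavior, where a row must pass every filter (MUST_PASS_ALL). FilterList can also be constructed with FilterList.Operator.MUST_PASS_ONE, in which case passing any one filter is enough. The plain-Java sketch below models the two modes with predicates; FilterListModel, its nested Operator enum, and the helper methods are hypothetical names that only mirror the idea, and the row is reduced to a (row key, name) pair.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of AND/OR filter composition; a row is a (rowKey, name) pair.
class FilterListModel {

    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

    // Combines filters so that all of them (AND) or any one of them (OR) must accept the row.
    static <T> Predicate<T> combine(Operator op, List<Predicate<T>> filters) {
        return row -> op == Operator.MUST_PASS_ALL
                ? filters.stream().allMatch(f -> f.test(row))
                : filters.stream().anyMatch(f -> f.test(row));
    }

    // Stands in for the single column value check on the name column
    static Predicate<Map.Entry<String, String>> nameEquals(String name) {
        return row -> row.getValue().equals(name);
    }

    // Stands in for the row key prefix check
    static Predicate<Map.Entry<String, String>> rowKeyHasPrefix(String prefix) {
        return row -> row.getKey().startsWith(prefix);
    }

    public static void main(String[] args) {
        List<Predicate<Map.Entry<String, String>>> filters =
                Arrays.asList(nameEquals("Liu Bei"), rowKeyHasPrefix("00"));
        Map.Entry<String, String> row = Map.entry("1002", "Liu Bei");
        System.out.println(combine(Operator.MUST_PASS_ALL, filters).test(row));
        System.out.println(combine(Operator.MUST_PASS_ONE, filters).test(row));
    }
}
```

For the sample row, the name matches but the row key does not start with 00, so only the MUST_PASS_ONE combination accepts it.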

That wraps up filters. Next, let's see how to call the API to delete table data and tables.

IV. Deleting data

Delete data by row key.

    /**
     * Delete data by row key
     * @throws Exception
     */
    @Test
    public void deleteByRowKey() throws Exception {

        // Get a connection
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.quorum", "node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(configuration);

        // Get the table
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));

        // Identify the row to delete by its row key
        Delete delete = new Delete("0001".getBytes());

        mytest1.delete(delete);

        // Release resources
        mytest1.close();
        connection.close();
    }

V. Deleting a table

Delete a given table.

代码语言:javascript复制
 @Test
    public void deleteTable() throws Exception{

        //获取连接
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.quorum","node01:2181,node02:2181,node03:2181");
        Connection connection = ConnectionFactory.createConnection(configuration);

        //获取表
        Table mytest1 = connection.getTable(TableName.valueOf("mytest1"));

        //创建一个管理员对象
        Admin admin = connection.getAdmin();
        admin.disableTable(TableName.valueOf("myuser"));

        admin.deleteTable(TableName.valueOf("myuser"));


        // 关闭资源
        admin.close();

        mytest1.close();
        connection.close();

    }

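Note that a table must be disabled before it can be deleted.

Well done for reading this far. Here's a thumbs-up for all you keen learners!

I have tested all of the code above myself and it works. If anything is unclear, feel free to message me (^U^)ノ~YO

That's all for this post. If you found it helpful, don't forget to like and follow ٩(๑❛ᴗ❛๑)۶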
