ES: Aggregations Based on Query Results

2022-09-21 09:26:08

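Before reading this article, you should already be familiar with ES DSL queries and ES aggregation queries. Aggregations based on query results in ES fall into two kinds: the first is similar to the HAVING clause in relational databases, and the second is similar to applying WHERE first and then GROUP BY. This article mainly covers the query-first, aggregate-second scenario.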

The demo data is taken from the ES aggregation queries article.

1. Query first, then aggregate

Suppose we need all foods whose price is between 50 and 500, aggregated by tag. The request is:

GET food/_search
{
  "query": {
    "range": {
      "Price": {
        "gte": 50,
        "lte": 500
      }
    }
  },
  "aggs": {
    "tags_bucket": {
      "terms": {
        "field": "Tags.keyword",
        "order": {
          "_count": "asc"
        }
      }
    }
  }
}

The search result is as follows:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "food",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "CreateTime" : "2022-07-09 13:11:11",
          "Desc" : "榴莲 非常好吃 很贵 吃一个相当于吃一只老母鸡",
          "Level" : "高级水果",
          "Name" : "榴莲",
          "Price" : 100.11,
          "Tags" : [
            "贵",
            "水果",
            "营养"
          ],
          "Type" : "水果"
        }
      },
      {
        "_index" : "food",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "CreateTime" : "2022-06-07 13:11:11",
          "Desc" : "芦笋来自国外进口的蔬菜,西餐标配",
          "Level" : "中等蔬菜",
          "Name" : "芦笋",
          "Price" : 66.11,
          "Tags" : [
            "有点贵",
            "国外",
            "绿色蔬菜",
            "营养价值高"
          ],
          "Type" : "蔬菜"
        }
      },
      {
        "_index" : "food",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "CreateTime" : "2022-07-08 13:11:11",
          "Desc" : "猫砂王榴莲 榴莲中的战斗机",
          "Level" : "高级水果",
          "Name" : "猫砂王榴莲",
          "Price" : 300.11,
          "Tags" : [
            "超级贵",
            "进口",
            "水果",
            "非常好吃"
          ],
          "Type" : "水果"
        }
      }
    ]
  },
  "aggregations" : {
    "tags_bucket" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "国外",
          "doc_count" : 1
        },
        {
          "key" : "有点贵",
          "doc_count" : 1
        },
        {
          "key" : "绿色蔬菜",
          "doc_count" : 1
        },
        {
          "key" : "营养",
          "doc_count" : 1
        },
        {
          "key" : "营养价值高",
          "doc_count" : 1
        },
        {
          "key" : "贵",
          "doc_count" : 1
        },
        {
          "key" : "超级贵",
          "doc_count" : 1
        },
        {
          "key" : "进口",
          "doc_count" : 1
        },
        {
          "key" : "非常好吃",
          "doc_count" : 1
        },
        {
          "key" : "水果",
          "doc_count" : 2
        }
      ]
    }
  }
}

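The hits section contains the result set matched by the query, and the aggregations section below it is computed over that same result set. Because the request orders buckets by "_count" ascending, the tag 水果 (doc_count 2) comes last.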
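To make the order of operations concrete, here is a minimal plain-Python sketch of the same idea: filter first, then count tags over whatever survived the filter. It is not how Elasticsearch works internally. The first three documents are copied from the hits above; the fourth document, the helper names range_query and terms_agg, and the tie-break on the tag text are assumptions added for illustration.

```python
from collections import Counter

# First three documents come from the hits above (only the fields used here).
# The last one is a HYPOTHETICAL out-of-range document, added so the range
# filter has something to exclude.
docs = [
    {"Name": "榴莲", "Price": 100.11, "Tags": ["贵", "水果", "营养"]},
    {"Name": "芦笋", "Price": 66.11, "Tags": ["有点贵", "国外", "绿色蔬菜", "营养价值高"]},
    {"Name": "猫砂王榴莲", "Price": 300.11, "Tags": ["超级贵", "进口", "水果", "非常好吃"]},
    {"Name": "placeholder", "Price": 5.0, "Tags": ["placeholder-tag"]},
]


def range_query(documents, field, gte, lte):
    """Keep documents whose field value lies in the closed range [gte, lte]."""
    return [d for d in documents if gte <= d[field] <= lte]


def terms_agg(documents, field):
    """Count documents per distinct tag, ordered by count ascending.

    Ties are broken by the tag text, an assumption chosen to mirror the
    bucket order shown in the response above.
    """
    counts = Counter(tag for d in documents for tag in set(d[field]))
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))


hits = range_query(docs, "Price", 50, 500)   # the "query" part
buckets = terms_agg(hits, "Tags")            # the "aggs" part sees only the hits

for key, doc_count in buckets:
    print(key, doc_count)
```

The placeholder tag never shows up in the buckets, because the aggregation only ever sees documents that passed the query.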

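2. Aggregate first, then filter (note that this is not HAVING semantics: it filters the detail records returned alongside the aggregation), implemented with post_filter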

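Suppose we need the foods priced between 50 and 500, grouped by tag, and then only the records whose tags include 营养 ("nutritious"). The request is: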

GET food/_search
{
  "query": {
    "range": {
      "Price": {
        "gte": 50,
        "lte": 500
      }
    }
  },
  "aggs": {
    "tags_bucket":{
      "terms": {
        "field": "Tags.keyword"
      }
    }
  },
  "post_filter": {
    "term": {
      "Tags.keyword": "营养"
    }
  }
}

The search result is as follows:

{
  "took" : 41,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "food",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "CreateTime" : "2022-07-09 13:11:11",
          "Desc" : "榴莲 非常好吃 很贵 吃一个相当于吃一只老母鸡",
          "Level" : "高级水果",
          "Name" : "榴莲",
          "Price" : 100.11,
          "Tags" : [
            "贵",
            "水果",
            "营养"
          ],
          "Type" : "水果"
        }
      }
    ]
  },
  "aggregations" : {
    "tags_bucket" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "水果",
          "doc_count" : 2
        },
        {
          "key" : "国外",
          "doc_count" : 1
        },
        {
          "key" : "有点贵",
          "doc_count" : 1
        },
        {
          "key" : "绿色蔬菜",
          "doc_count" : 1
        },
        {
          "key" : "营养",
          "doc_count" : 1
        },
        {
          "key" : "营养价值高",
          "doc_count" : 1
        },
        {
          "key" : "贵",
          "doc_count" : 1
        },
        {
          "key" : "超级贵",
          "doc_count" : 1
        },
        {
          "key" : "进口",
          "doc_count" : 1
        },
        {
          "key" : "非常好吃",
          "doc_count" : 1
        }
      ]
    }
  }
}
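In the response above, the buckets still cover all three documents in the price range (水果 has doc_count 2), while hits holds only the one document tagged 营养. A rough plain-Python sketch of that ordering follows; the search function and its parameters are hypothetical names for illustration, not an Elasticsearch API.

```python
from collections import Counter

# The three documents that match the price range in the example above
# (only the fields used here).
docs = [
    {"Name": "榴莲", "Price": 100.11, "Tags": ["贵", "水果", "营养"]},
    {"Name": "芦笋", "Price": 66.11, "Tags": ["有点贵", "国外", "绿色蔬菜", "营养价值高"]},
    {"Name": "猫砂王榴莲", "Price": 300.11, "Tags": ["超级贵", "进口", "水果", "非常好吃"]},
]


def search(documents, query, post_filter):
    """Hypothetical helper mimicking query -> aggs -> post_filter ordering."""
    # 1. The query decides which documents are in scope at all.
    matched = [d for d in documents if query(d)]
    # 2. The aggregation is computed over every matched document.
    tag_counts = Counter(tag for d in matched for tag in set(d["Tags"]))
    # 3. The post filter trims only the returned hits, not the aggregation.
    hits = [d for d in matched if post_filter(d)]
    return hits, tag_counts


hits, tag_counts = search(
    docs,
    query=lambda d: 50 <= d["Price"] <= 500,
    # Exact match on a tag value, like a term query on a keyword field:
    # "营养价值高" does not match "营养".
    post_filter=lambda d: "营养" in d["Tags"],
)

print([d["Name"] for d in hits])
print(tag_counts["水果"])
```

If the tag condition were moved into the query instead, the aggregation in this sketch would only see the single 榴莲 document, which is the difference post_filter is meant to avoid.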

3. Ignoring the query conditions with a nested aggregation

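Suppose we need the average price, maximum price, and so on for foods within a given price range, and also the average price of all foods. The all-foods average must not be restricted by the query conditions. This can be done as follows: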

GET food/_search
{
  "query": {
    "range": {
      "Price": {
        "gte": 50,
        "lte": 500
      }
    }
  },
  "aggs": {
    "price_avg":{
      "avg": {
        "field": "Price"
      }
    },
    "price_max":{
      "max": {
        "field": "Price"
      }
    },
    "price_min":{
      "min": {
        "field": "Price"
      }
    },
    "all_price_avg":{
      "global": {},
      "aggs": {
        "price_avg":{
          "avg": {
            "field": "Price"
          }
        }
      }
    }
  }
}

The search result is as follows:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "food",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "CreateTime" : "2022-07-09 13:11:11",
          "Desc" : "榴莲 非常好吃 很贵 吃一个相当于吃一只老母鸡",
          "Level" : "高级水果",
          "Name" : "榴莲",
          "Price" : 100.11,
          "Tags" : [
            "贵",
            "水果",
            "营养"
          ],
          "Type" : "水果"
        }
      },
      {
        "_index" : "food",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "CreateTime" : "2022-06-07 13:11:11",
          "Desc" : "芦笋来自国外进口的蔬菜,西餐标配",
          "Level" : "中等蔬菜",
          "Name" : "芦笋",
          "Price" : 66.11,
          "Tags" : [
            "有点贵",
            "国外",
            "绿色蔬菜",
            "营养价值高"
          ],
          "Type" : "蔬菜"
        }
      },
      {
        "_index" : "food",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "CreateTime" : "2022-07-08 13:11:11",
          "Desc" : "猫砂王榴莲 榴莲中的战斗机",
          "Level" : "高级水果",
          "Name" : "猫砂王榴莲",
          "Price" : 300.11,
          "Tags" : [
            "超级贵",
            "进口",
            "水果",
            "非常好吃"
          ],
          "Type" : "水果"
        }
      }
    ]
  },
  "aggregations" : {
    "all_price_avg" : {
      "doc_count" : 6,
      "price_avg" : {
        "value" : 83.44333092371623
      }
    },
    "price_min" : {
      "value" : 66.11000061035156
    },
    "price_avg" : {
      "value" : 155.44332885742188
    },
    "price_max" : {
      "value" : 300.1099853515625
    }
  }
}

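Here "global": {} is what lifts the query conditions: all_price_avg is computed over all 6 documents in the index (doc_count 6), while price_avg, price_max, and price_min only cover the 3 documents matched by the query.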
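The difference between the scoped metrics and the global one can be sketched in plain Python as below. The three in-range prices come from the hits above. The three out-of-range prices are hypothetical placeholders (the article does not list those documents), so the global average printed here will not match the 83.44 in the response. The aggregate helper is likewise an illustrative name.

```python
from statistics import mean

docs = [
    # Prices taken from the hits above.
    {"Name": "榴莲", "Price": 100.11},
    {"Name": "芦笋", "Price": 66.11},
    {"Name": "猫砂王榴莲", "Price": 300.11},
    # HYPOTHETICAL placeholders standing in for the three documents
    # outside the price range; their real prices are not shown in the article.
    {"Name": "placeholder-1", "Price": 5.0},
    {"Name": "placeholder-2", "Price": 10.0},
    {"Name": "placeholder-3", "Price": 20.0},
]


def aggregate(documents, query):
    """Hypothetical helper: scoped metrics plus one metric that ignores the query."""
    scoped = [d["Price"] for d in documents if query(d)]
    everything = [d["Price"] for d in documents]  # global scope: query not applied
    return {
        "price_avg": mean(scoped),
        "price_max": max(scoped),
        "price_min": min(scoped),
        "all_price_avg": {
            "doc_count": len(everything),
            "price_avg": mean(everything),
        },
    }


result = aggregate(docs, lambda d: 50 <= d["Price"] <= 500)
print(result)
```

With the real index data, the scoped average is the 155.44 shown above; only the global figure depends on the documents this sketch had to make up.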
