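Before reading this article, you should be familiar with ES DSL queries and ES aggregation queries. In ES, aggregation based on query results comes in two forms: the first resembles the HAVING clause in a relational database, and the second resembles a WHERE followed by a GROUP BY. This article mainly covers the second case, where the query runs first and the aggregation is applied to its results.
The demo data is taken from the ES aggregation queries article.
1. Query first, then aggregate
Suppose we need all foods priced between 50 and 500 (inclusive), aggregated by tag. The code is as follows: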
GET food/_search
{
  "query": {
    "range": {
      "Price": {
        "gte": 50,
        "lte": 500
      }
    }
  },
  "aggs": {
    "tags_bucket": {
      "terms": {
        "field": "Tags.keyword",
        "order": {
          "_count": "asc"
        }
      }
    }
  }
}
The search result is as follows:
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "food",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"CreateTime" : "2022-07-09 13:11:11",
"Desc" : "榴莲 非常好吃 很贵 吃一个相当于吃一只老母鸡",
"Level" : "高级水果",
"Name" : "榴莲",
"Price" : 100.11,
"Tags" : [
"贵",
"水果",
"营养"
],
"Type" : "水果"
}
},
{
"_index" : "food",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"CreateTime" : "2022-06-07 13:11:11",
"Desc" : "芦笋来自国外进口的蔬菜,西餐标配",
"Level" : "中等蔬菜",
"Name" : "芦笋",
"Price" : 66.11,
"Tags" : [
"有点贵",
"国外",
"绿色蔬菜",
"营养价值高"
],
"Type" : "蔬菜"
}
},
{
"_index" : "food",
"_id" : "6",
"_score" : 1.0,
"_source" : {
"CreateTime" : "2022-07-08 13:11:11",
"Desc" : "猫砂王榴莲 榴莲中的战斗机",
"Level" : "高级水果",
"Name" : "猫砂王榴莲",
"Price" : 300.11,
"Tags" : [
"超级贵",
"进口",
"水果",
"非常好吃"
],
"Type" : "水果"
}
}
]
},
"aggregations" : {
"tags_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "国外",
"doc_count" : 1
},
{
"key" : "有点贵",
"doc_count" : 1
},
{
"key" : "绿色蔬菜",
"doc_count" : 1
},
{
"key" : "营养",
"doc_count" : 1
},
{
"key" : "营养价值高",
"doc_count" : 1
},
{
"key" : "贵",
"doc_count" : 1
},
{
"key" : "超级贵",
"doc_count" : 1
},
{
"key" : "进口",
"doc_count" : 1
},
{
"key" : "非常好吃",
"doc_count" : 1
},
{
"key" : "水果",
"doc_count" : 2
}
]
}
}
}
hits contains the result set of the query, and the aggregations section below it is computed over that same result set.
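To make the order of operations concrete, here is a minimal plain-Python model of the request above. It is only a sketch of the semantics, not of how Elasticsearch executes the search. The first three documents are copied from the hits above; the fourth is a hypothetical out-of-range item added so that the range step has something to exclude, and the helper names `range_query` and `terms_agg` are made up for this illustration.

```python
from collections import Counter

docs = [
    {"Name": "榴莲", "Price": 100.11, "Tags": ["贵", "水果", "营养"]},
    {"Name": "芦笋", "Price": 66.11, "Tags": ["有点贵", "国外", "绿色蔬菜", "营养价值高"]},
    {"Name": "猫砂王榴莲", "Price": 300.11, "Tags": ["超级贵", "进口", "水果", "非常好吃"]},
    # HYPOTHETICAL document, not from the article's data set.
    {"Name": "hypothetical cheap item", "Price": 10.0, "Tags": ["hypothetical"]},
]


def range_query(documents, field, gte, lte):
    """The "query" step: keep documents whose field lies in [gte, lte]."""
    return [d for d in documents if gte <= d[field] <= lte]


def terms_agg(documents, field):
    """The "aggs" step: count documents per distinct value of a multi-valued field."""
    counts = Counter()
    for d in documents:
        counts.update(set(d[field]))  # each document counts once per tag
    # Ascending by count, like "order": {"_count": "asc"}; ties broken by key in this sketch.
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))


hits = range_query(docs, "Price", 50, 500)
buckets = terms_agg(hits, "Tags")  # aggregates only what the query matched

print(len(hits))     # number of matching documents
print(buckets[-1])   # the most frequent tag comes last in ascending order
```

Because the aggregation only sees the documents that passed the range step, the hypothetical cheap item contributes no bucket, and 水果 ends up last with a count of 2, as in the response above.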
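2. Aggregate first, then filter (note that this is not HAVING syntax; it filters the detail records behind the aggregation), implemented with post_filter
Suppose we need foods priced between 50 and 500, grouped by tag, and then only the records whose tags include 营养. post_filter is applied after the aggregations are computed, so it narrows hits without changing the buckets. The code is as follows: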
GET food/_search
{
  "query": {
    "range": {
      "Price": {
        "gte": 50,
        "lte": 500
      }
    }
  },
  "aggs": {
    "tags_bucket": {
      "terms": {
        "field": "Tags.keyword"
      }
    }
  },
  "post_filter": {
    "term": {
      "Tags.keyword": "营养"
    }
  }
}
The search result is as follows:
{
"took" : 41,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "food",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"CreateTime" : "2022-07-09 13:11:11",
"Desc" : "榴莲 非常好吃 很贵 吃一个相当于吃一只老母鸡",
"Level" : "高级水果",
"Name" : "榴莲",
"Price" : 100.11,
"Tags" : [
"贵",
"水果",
"营养"
],
"Type" : "水果"
}
}
]
},
"aggregations" : {
"tags_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "水果",
"doc_count" : 2
},
{
"key" : "国外",
"doc_count" : 1
},
{
"key" : "有点贵",
"doc_count" : 1
},
{
"key" : "绿色蔬菜",
"doc_count" : 1
},
{
"key" : "营养",
"doc_count" : 1
},
{
"key" : "营养价值高",
"doc_count" : 1
},
{
"key" : "贵",
"doc_count" : 1
},
{
"key" : "超级贵",
"doc_count" : 1
},
{
"key" : "进口",
"doc_count" : 1
},
{
"key" : "非常好吃",
"doc_count" : 1
}
]
}
}
}
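The response above shows a single hit but still ten tag buckets. The following plain-Python sketch models why, under the assumption that the three in-range documents from the earlier hits are the whole working set; the function name `search` and its parameters are hypothetical and only mirror the three parts of the request.

```python
from collections import Counter

# The three in-range documents shown in the article (only the fields used here).
docs = [
    {"Name": "榴莲", "Price": 100.11, "Tags": ["贵", "水果", "营养"]},
    {"Name": "芦笋", "Price": 66.11, "Tags": ["有点贵", "国外", "绿色蔬菜", "营养价值高"]},
    {"Name": "猫砂王榴莲", "Price": 300.11, "Tags": ["超级贵", "进口", "水果", "非常好吃"]},
]


def search(documents, gte, lte, post_filter_tag):
    # 1. query: restrict the working set by price.
    matched = [d for d in documents if gte <= d["Price"] <= lte]
    # 2. aggs: computed on the query result, before post_filter runs.
    buckets = Counter(tag for d in matched for tag in set(d["Tags"]))
    # 3. post_filter: exact match on a tag, applied to the returned hits only.
    hits = [d for d in matched if post_filter_tag in d["Tags"]]
    return hits, buckets


hits, buckets = search(docs, 50, 500, "营养")
print([d["Name"] for d in hits])
print(len(buckets), buckets["水果"])
```

Note that the match is exact, as with a term query on a keyword field: the tag 营养价值高 does not satisfy a filter on 营养, which is why 芦笋 is not among the hits.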
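3. Ignoring the query conditions with a nested aggregation
Suppose we need the average, maximum, and minimum price of the foods within a given range, plus the average price of all foods. The average over all foods must not be restricted by the query conditions. This can be implemented as follows: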
GET food/_search
{
  "query": {
    "range": {
      "Price": {
        "gte": 50,
        "lte": 500
      }
    }
  },
  "aggs": {
    "price_avg": {
      "avg": {
        "field": "Price"
      }
    },
    "price_max": {
      "max": {
        "field": "Price"
      }
    },
    "price_min": {
      "min": {
        "field": "Price"
      }
    },
    "all_price_avg": {
      "global": {},
      "aggs": {
        "price_avg": {
          "avg": {
            "field": "Price"
          }
        }
      }
    }
  }
}
The search result is as follows:
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "food",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"CreateTime" : "2022-07-09 13:11:11",
"Desc" : "榴莲 非常好吃 很贵 吃一个相当于吃一只老母鸡",
"Level" : "高级水果",
"Name" : "榴莲",
"Price" : 100.11,
"Tags" : [
"贵",
"水果",
"营养"
],
"Type" : "水果"
}
},
{
"_index" : "food",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"CreateTime" : "2022-06-07 13:11:11",
"Desc" : "芦笋来自国外进口的蔬菜,西餐标配",
"Level" : "中等蔬菜",
"Name" : "芦笋",
"Price" : 66.11,
"Tags" : [
"有点贵",
"国外",
"绿色蔬菜",
"营养价值高"
],
"Type" : "蔬菜"
}
},
{
"_index" : "food",
"_id" : "6",
"_score" : 1.0,
"_source" : {
"CreateTime" : "2022-07-08 13:11:11",
"Desc" : "猫砂王榴莲 榴莲中的战斗机",
"Level" : "高级水果",
"Name" : "猫砂王榴莲",
"Price" : 300.11,
"Tags" : [
"超级贵",
"进口",
"水果",
"非常好吃"
],
"Type" : "水果"
}
}
]
},
"aggregations" : {
"all_price_avg" : {
"doc_count" : 6,
"price_avg" : {
"value" : 83.44333092371623
}
},
"price_min" : {
"value" : 66.11000061035156
},
"price_avg" : {
"value" : 155.44332885742188
},
"price_max" : {
"value" : 300.1099853515625
}
}
}
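Here "global": {} is what removes the query conditions: the price_avg nested under all_price_avg is computed over all 6 documents in the index, while the top-level price_avg, price_max, and price_min only see the 3 documents matched by the query.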
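The difference between the scoped metrics and the global one can be modelled in a few lines of plain Python. This is a sketch under stated assumptions: the in-range prices are copied from the hits above, but the article does not list the three documents outside the range, so `other_prices` holds hypothetical placeholder values and the global average computed here will not equal the 83.44 in the response. The function name `run` is also made up.

```python
from statistics import mean

# Prices of the three documents matched by the range query (from the hits above).
in_range_prices = [100.11, 66.11, 300.11]
# HYPOTHETICAL prices for the three documents outside the range; the article does not show them.
other_prices = [5.0, 10.0, 20.0]

all_prices = in_range_prices + other_prices


def run(prices, gte, lte):
    """Model the request: scoped metrics use the query result, the global one ignores it."""
    scoped = [p for p in prices if gte <= p <= lte]
    return {
        "price_avg": mean(scoped),
        "price_max": max(scoped),
        "price_min": min(scoped),
        # Equivalent of "global": {} with a nested avg: every document is counted.
        "all_price_avg": {"doc_count": len(prices), "price_avg": mean(prices)},
    }


result = run(all_prices, 50, 500)
print(round(result["price_avg"], 2))
print(result["all_price_avg"]["doc_count"])
```

The scoped average comes out at about 155.44, in line with price_avg in the response, and the global bucket counts all six documents regardless of the price range.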