概述
继续跟中华石杉老师学习ES,第34篇
课程地址: https://www.roncoo.com/view/55
案例
原始数据:
需求
在白话Elasticsearch33-深入聚合数据分析之案例实战Terms Aggregation Avg Aggregation ( bucket metrics ) 中,我们演示了 对颜色进行bucket操作以后,再计算每种颜色的平均价格的metrics操作。
假设 又来了个新需求: 从颜色到品牌进行下钻分析,每种颜色的平均价格,以及找到每种颜色每个品牌的平均价格
那就需要进行多层次的下钻分析了
举个例子:比如说,我们现在的索引中红色的电视有4台,同时这4台电视中,有3台是属于长虹的,1台是属于小米的
那如何计算出 : 红色电视中的3台长虹的平均价格是多少? 红色电视中的1台小米的平均价格是多少?
解决
Step1.对每种颜色进行bucket分组
代码语言:javascript复制GET /tvs/sales/_search
{
"size": 0 ,
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
}
}
}
}
返回
Step2.对每种颜色进行bucket分组 , 然后对每个分组再次计算平均价格
代码语言:javascript复制GET /tvs/sales/_search
{
"size": 0,
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
},
"aggs": {
"color_avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
返回:
Step3.对每种颜色进行bucket分组 , 然后对每个分组再次计算平均价格 , 紧接再对每种颜色按照brand分组,直接写到和 color_avg_price 并列的地方就可以了
代码语言:javascript复制GET /tvs/sales/_search
{
"size": 0,
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
},
"aggs": {
"color_avg_price": {
"avg": {
"field": "price"
}
},
"group_by_brand": {
"terms": {
"field": "brand"
}
}
}
}
}
}
返回
代码语言:javascript复制{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 8,
"max_score": 0,
"hits": []
},
"aggregations": {
"group_by_color": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "红色",
"doc_count": 4,
"color_avg_price": {
"value": 3250
},
"group_by_brand": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "长虹",
"doc_count": 3
},
{
"key": "三星",
"doc_count": 1
}
]
}
},
{
"key": "绿色",
"doc_count": 2,
"color_avg_price": {
"value": 2100
},
"group_by_brand": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "TCL",
"doc_count": 1
},
{
"key": "小米",
"doc_count": 1
}
]
}
},
{
"key": "蓝色",
"doc_count": 2,
"color_avg_price": {
"value": 2000
},
"group_by_brand": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "TCL",
"doc_count": 1
},
{
"key": "小米",
"doc_count": 1
}
]
}
}
]
}
}
}
Step4.对每种颜色进行bucket分组 , 然后对每个分组再次计算平均价格 , 紧接再对每种颜色按照brand分组,直接写到和 color_avg_price 并列的地方就可以了。 最后对品牌进行metrics操作,即求每种品牌的平均价格,所以要在 brand 这个bucket中,再次aggs
代码语言:javascript复制GET /tvs/sales/_search
{
"size": 0 ,
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
},
"aggs": {
"color_avg_price": {
"avg": {
"field": "price"
}
},
"group_by_brand":{
"terms": {
"field": "brand"
},
"aggs": {
"brand_avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
}
}
到这里,就一步步完成了该需求,来看下返回结果吧
返回:
代码语言:javascript复制{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 8,
"max_score": 0,
"hits": []
},
"aggregations": {
"group_by_color": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "红色",
"doc_count": 4,
"color_avg_price": {
"value": 3250
},
"group_by_brand": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "长虹",
"doc_count": 3,
"brand_avg_price": {
"value": 1666.6666666666667
}
},
{
"key": "三星",
"doc_count": 1,
"brand_avg_price": {
"value": 8000
}
}
]
}
},
{
"key": "绿色",
"doc_count": 2,
"color_avg_price": {
"value": 2100
},
"group_by_brand": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "TCL",
"doc_count": 1,
"brand_avg_price": {
"value": 1200
}
},
{
"key": "小米",
"doc_count": 1,
"brand_avg_price": {
"value": 3000
}
}
]
}
},
{
"key": "蓝色",
"doc_count": 2,
"color_avg_price": {
"value": 2000
},
"group_by_brand": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "TCL",
"doc_count": 1,
"brand_avg_price": {
"value": 1500
}
},
{
"key": "小米",
"doc_count": 1,
"brand_avg_price": {
"value": 2500
}
}
]
}
}
]
}
}
}
校验下
原始数据:
我们通过ES算出来的数据:
对比下原始数据,符合预期,至此实现了该需求的DSL 。