Note
The problem and fix described in this article also apply to Tencent Cloud Elasticsearch Service (ES).
Background
Kibana dashboards present data visually, and in day-to-day work they are often used for reporting. Kibana can also generate the reports we need directly.
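Problem
Generating a report fails with the error:
Can't reach the server. Please try again.
The browser developer tools (F12) show that the request returned an internal server error.
This is odd enough to warrant a closer look.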
Root cause
1. Analyzing the Kibana error log
Going through the Kibana log, we found the abnormal entry:
"message":"[illegal_argument_exception] Rejecting mapping update to [.reporting-2021.10.24] as the final mapping would have more than 1 type: [esqueue, doc]"}
{"type":"response","@timestamp":"2021-10-27T05:53:12Z","tags":["api"],"pid":14595,"method":"post","statusCode":500,"req":{"url":"/api/reporting/generate/csv?jobParams=(conflictedTypesFields:!(kfext,kfuin,requestId),fields:!('@timestamp',text),indexPatternId:'21fe4820-8916-11ea-8b39-a19e11c4dfcb',metaFields:!(_source,_id,_type,_index,_score),searchRequest:(body:(_source:(excludes:!(),includes:!('@timestamp',text)),docvalue_fields:!(),query:(bool:(filter:!(),must:!((query_string:(analyze_wildcard:!t,default_field:'*',query:'"high risky with req"')),(range:('@timestamp':(format:epoch_millis,gte:1635310298396,lte:1635313898396)))),must_not:!(),should:!())),script_fields:(),sort:!(('@timestamp':(order:desc,unmapped_type:boolean))),stored_fields:!('@timestamp',text),version:!t),index:'account-admin-ol-*'),title:'high risky with req',type:search)","method":"post","headers":{"host":"kibana","connection":"close","content-length":"0","x-stgw-time":"1635313992.732","x-client-proto":"https","x-forwarded-proto":"https","x-client-proto-ver":"HTTP/2.0","x-real-ip":"116.233.19.162","x-forwarded-for":"116.233.19.162","sec-ch-ua":""Chromium";v="92", " Not A;Brand";v="99", "Google Chrome";v="92"","sec-ch-ua-mobile":"?0","user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36","kbn-version":"6.8.2","content-type":"application/json","accept":"*/*","origin":"https://es-3ktojklt.kibana.tencentelasticsearch.com:5601","sec-fetch-site":"same-origin","sec-fetch-mode":"cors","sec-fetch-dest":"empty","referer":"https://es-3ktojklt.kibana.tencentelasticsearch.com:5601/app/kibana","accept-encoding":"gzip, deflate, br","accept-language":"zh-CN,zh;q=0.9,en;q=0.8"},"remoteAddress":"10.0.130.254","userAgent":"10.0.130.254","referer":"https://es-3ktojklt.kibana.tencentelasticsearch.com:5601/app/kibana"},"res":{"statusCode":500,"responseTime":4695,"contentLength":9},"message":"POST 
/api/reporting/generate/csv?jobParams=(conflictedTypesFields:!(kfext,kfuin,requestId),fields:!('@timestamp',text),indexPatternId:'21fe4820-8916-11ea-8b39-a19e11c4dfcb',metaFields:!(_source,_id,_type,_index,_score),searchRequest:(body:(_source:(excludes:!(),includes:!('@timestamp',text)),docvalue_fields:!(),query:(bool:(filter:!(),must:!((query_string:(analyze_wildcard:!t,default_field:'*',query:'"high risky with req"')),(range:('@timestamp':(format:epoch_millis,gte:1635310298396,lte:1635313898396)))),must_not:!(),should:!())),script_fields:(),sort:!(('@timestamp':(order:desc,unmapped_type:boolean))),stored_fields:!('@timestamp',text),version:!t),index:'account-admin-ol-*'),title:'high risky with req',type:search) 500 4695ms - 9.0B"}
The core error is:
[.reporting-2021.10.24] as the final mapping would have more than 1 type: [esqueue, doc]
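The rejection message names both the affected index and the conflicting mapping types. As a minimal sketch (the function name `parse_mapping_rejection` is hypothetical, and the regular expression is written against the message text quoted above, not taken from any Elasticsearch API), those two pieces can be pulled out of a log line like this:

```python
import re

# Pattern written against the rejection message quoted in the Kibana log above.
_REJECT_RE = re.compile(
    r"Rejecting mapping update to \[(?P<index>[^\]]+)\] as the final mapping "
    r"would have more than 1 type: \[(?P<types>[^\]]+)\]"
)


def parse_mapping_rejection(message):
    """Return (index_name, [type, ...]), or None if the message does not match."""
    match = _REJECT_RE.search(message)
    if match is None:
        return None
    types = [t.strip() for t in match.group("types").split(",")]
    return match.group("index"), types
```

Knowing the index name (`.reporting-2021.10.24`) and the two types (`esqueue`, `doc`) is what makes it possible to search the Elasticsearch log for the same index later on.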
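A version mismatch?
Why would this happen? Failures on a system index are commonly caused by a version mismatch between Kibana and Elasticsearch, so we checked: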
[root@VM_130_254_centos /usr/local/service/kibana]# more version.md
6.8.2.2019121001
[root@VM_130_254_centos /usr/local/service/kibana]# curl localhost:9200
{
  "name" : "1620648141000429932",
  "cluster_name" : "es-3ktojklt",
  "cluster_uuid" : "zH1tb_eUS5uHJf5edamMAg",
  "version" : {
    "number" : "6.8.2",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "f1ae577",
    "build_date" : "2019-11-25T13:31:48.079152Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
The versions match exactly, so a version mismatch is ruled out.
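The comparison above can be sketched in a few lines. This assumes that the fourth component of the Kibana version string (`2019121001`) is a build stamp rather than part of the release number, which is a guess based on its shape; the function names are hypothetical.

```python
def core_version(version):
    """Keep only major.minor.patch, dropping any extra build suffix (assumed)."""
    return tuple(int(part) for part in version.split(".")[:3])


def versions_match(kibana_version, es_version):
    """True when Kibana and Elasticsearch report the same major.minor.patch."""
    return core_version(kibana_version) == core_version(es_version)
```

For example, `versions_match("6.8.2.2019121001", "6.8.2")` compares `(6, 8, 2)` with `(6, 8, 2)`.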
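2. Analyzing the Elasticsearch log
At this point the mapping itself is the most likely culprit. Nobody normally edits the mapping of a system index, so my suspicion was that the mapping from a custom index template was interfering with it.
Searching the Elasticsearch log for the system index name confirmed the suspicion: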
[.reporting-2021.10.24] creating index, cause [auto(bulk api)], templates [qidian_default, default@template, qd-template, outerBoss-template, hand-nginx-template, hand-template, *, beeflow-java-template, zhiku-template, beeflow-template, test-template, $zhiku-template]
The creation of a single system index matched this many custom templates, which is clearly wrong.
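One plausible reading of the two log entries together: a catch-all template gives the new index a mapping type `doc`, Kibana then tries to add its own type `esqueue`, and Elasticsearch 6.x rejects the second type. The sketch below illustrates that reasoning only. The template names and patterns are hypothetical (the actual patterns of the templates listed in the log are not shown in this article), and the matcher assumes `*` is the only wildcard in an index pattern.

```python
import re


def pattern_matches(pattern, index_name):
    """Match an index pattern, treating '*' as the only wildcard (assumed)."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None


def matching_templates(index_name, templates):
    """Return the names of templates whose index_patterns match index_name."""
    return [
        name
        for name, body in templates.items()
        if any(pattern_matches(p, index_name) for p in body.get("index_patterns", []))
    ]


def mapping_types(templates, names, request_type):
    """Collect the mapping types the index would end up with."""
    types = {request_type}
    for name in names:
        types.update(templates[name].get("mappings", {}).keys())
    return sorted(types)


# Hypothetical templates, for illustration only.
example_templates = {
    "catch-all-template": {"index_patterns": ["*"], "mappings": {"doc": {}}},
    "business-template": {"index_patterns": ["account-*"], "mappings": {"doc": {}}},
}
```

With these inputs, `matching_templates(".reporting-2021.10.24", example_templates)` selects only the catch-all template, and `mapping_types(...)` with `"esqueue"` as the request type yields two types, which is the condition the error message describes.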
Solution
We temporarily neutralized the custom templates that were affecting system indices by changing their index patterns from:
"index_patterns": [
  "*"
]
to:
"index_patterns": [
  "xxx*"
]
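To find which templates need this change, one option is to scan the template definitions for a bare `*` pattern. The sketch below assumes the input is a dictionary keyed by template name whose values contain an `index_patterns` list, which is the shape returned by `GET _template` in Elasticsearch 6.x; the function name and the sample data are hypothetical.

```python
def find_catch_all_templates(templates):
    """Return the names of templates whose index_patterns include a bare '*'."""
    return sorted(
        name
        for name, body in templates.items()
        if "*" in body.get("index_patterns", [])
    )


# Hypothetical sample in the shape of a GET _template response.
sample = {
    "template-a": {"index_patterns": ["*"]},
    "template-b": {"index_patterns": ["xxx*"]},
    "template-c": {"index_patterns": ["logs-*", "*"]},
}
```

Here `find_catch_all_templates(sample)` flags `template-a` and `template-c`, the two candidates whose patterns would also apply to system indices such as `.reporting-*`.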
We then deleted the system reporting index and generated the report again. This time it completed normally.
Problem solved.
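Summary
Defining custom templates that match your actual business indices is perfectly fine in normal use. But never match everything with * just for convenience: a catch-all pattern also applies to system indices, which is a hidden hazard.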