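1. Introduction
By default the automatic refresh process creates a new segment every second, so the number of segments explodes in a short time. Too many segments seriously hurt Elasticsearch (ES) performance and query efficiency: every segment consumes file handles, memory, and CPU, and, more importantly, every search request has to check each segment in turn, so the more segments there are, the slower the search. Knowing how to set a sensible merge policy and how to automate force merge is therefore an essential skill for anyone operating ES.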
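2. What is a merge?
Before discussing merging in ES, it helps to look at how Lucene works (see Figure P 1.1). In Lucene, a single inverted index file is called a segment. A segment is self-contained and immutable. A collection of segments makes up a Lucene index, which corresponds to a shard in ES. When new documents are written and a refresh runs, a new segment is created. Lucene keeps a file called the commit point that records all segments. A query searches all segments and combines the results. Information about deleted documents is stored in ".del" files and is used to filter those documents out of query results. Segments are merged periodically into larger ones, and documents marked as deleted are purged in the process. Merging a large segment consumes a lot of CPU and I/O, and if handled carelessly it can degrade search performance. By default, ES throttles the resources used by the merge process so that searches still have enough resources while a merge is running.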
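To make the segment model concrete, here is a minimal Python sketch of the behavior described above. It is a toy illustration, not Lucene's implementation: the `Segment` and `Shard` classes, their method names, and the in-memory `deleted` set standing in for the ".del" files are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """An immutable mini inverted index: term -> set of doc ids."""
    name: str
    postings: dict


@dataclass
class Shard:
    """A Lucene index (one ES shard): a list of segments plus delete markers."""
    segments: list = field(default_factory=list)
    deleted: set = field(default_factory=set)  # stands in for the .del files

    def refresh(self, name, docs):
        """Turn buffered docs ({doc_id: text}) into a new searchable segment."""
        postings = {}
        for doc_id, text in docs.items():
            for term in text.lower().split():
                postings.setdefault(term, set()).add(doc_id)
        self.segments.append(Segment(name, postings))

    def delete(self, doc_id):
        """Deleting only records a marker; no existing segment is rewritten."""
        self.deleted.add(doc_id)

    def search(self, term):
        """Query every segment, combine the hits, then filter deleted docs."""
        hits = set()
        for seg in self.segments:
            hits |= seg.postings.get(term, set())
        return sorted(hits - self.deleted)
```

Each call to `refresh` adds one more segment that every later `search` has to visit, which is why the segment count matters for query speed.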
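3. The merge workflow
A refresh produces many small segment files and writes them to the filesystem cache. At that point the filesystem holds both fully committed segments and segments that are not yet committed but are already searchable.
ES can merge these scattered small segments, both the fully committed ones and the searchable-only ones, into a larger segment. When a merge completes:
- The new segment is flushed to disk.
- The commit point is updated to include the new segment and exclude the old ones.
- The new segment is opened for search.
- The old segments, together with their .del markers, are deleted.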
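The steps above can be sketched in a few lines of Python. This is only a simplified model under stated assumptions: the function name `merge_segments`, the plain dict/set data structures, and the `merged_` naming scheme are hypothetical and do not reflect how Lucene stores segments on disk.

```python
def merge_segments(commit_point, segments, deleted, to_merge):
    """Merge the named segments and return (new_commit_point, new_segments).

    commit_point: list of segment names that are durably committed
    segments:     dict of segment name -> set of doc ids it holds
    deleted:      set of doc ids marked as deleted (the .del information)
    to_merge:     names of the segments chosen for this merge
    """
    merged_name = "merged_" + "_".join(sorted(to_merge))

    # Step 1: write a new, larger segment that keeps live documents only.
    merged_docs = set()
    for name in to_merge:
        merged_docs |= segments[name] - deleted

    # Step 2: update the commit point to list the new segment, not the old ones.
    new_commit_point = [n for n in commit_point if n not in to_merge]
    new_commit_point.append(merged_name)

    # Step 3: make the new segment visible to searches.
    # Step 4: drop the old segments (and with them the deleted documents).
    new_segments = {n: d for n, d in segments.items() if n not in to_merge}
    new_segments[merged_name] = merged_docs

    return new_commit_point, new_segments
```

Note that the inputs are never modified in place: like real segments, the old ones stay readable until the merged segment is ready to replace them.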
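4. Merge policy
The merge policy selects segments for merging according to a set of rules. The main settings are:
- index.merge.policy.floor_segment: default 2 MB. Segments smaller than this are merged first.
- index.merge.policy.max_merge_at_once: default 10. The maximum number of segments merged at once.
- index.merge.policy.max_merge_at_once_explicit: default 30. The maximum number of segments merged at once during a force merge.
- index.merge.policy.max_merged_segment: default 5 GB. Segments larger than this are excluded from merging, except during a force merge.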
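As a rough illustration of how these limits interact, the sketch below picks candidates for one natural merge. It is deliberately much simpler than the real tiered merge policy in Lucene, which also scores candidate merges. The function name `pick_merge_candidates` and the smallest-first strategy are assumptions for this example; only the default values come from the list above.

```python
MB = 1024 * 1024
GB = 1024 * MB


def pick_merge_candidates(segment_sizes, floor_segment=2 * MB,
                          max_merge_at_once=10, max_merged_segment=5 * GB):
    """Return the sizes (in bytes) of the segments chosen for one merge."""
    # Segments already above max_merged_segment are left alone.
    eligible = sorted(s for s in segment_sizes if s <= max_merged_segment)

    picked, total = [], 0
    for size in eligible:  # smallest first, so tiny segments go first
        if len(picked) == max_merge_at_once:
            break
        # Assumption: tiny segments are budgeted as floor_segment bytes.
        budgeted = max(size, floor_segment)
        if total + budgeted > max_merged_segment:
            break
        picked.append(size)
        total += budgeted

    # Merging fewer than two segments would accomplish nothing.
    return picked if len(picked) >= 2 else []
```

For example, with segments of 1 MB, 1 MB, 3 MB, and 6 GB, this sketch merges the three small segments and leaves the 6 GB segment untouched.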
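These settings also suggest ways to reduce the cost of segment merging and improve responsiveness: 1. For workloads that do not need near-real-time search, increase the refresh interval so that fewer segments are created. 2. Increase the flush interval so that each newly created segment is larger to begin with. 3. Run force merge ahead of time to reduce merge activity during peak business hours.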
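The first two suggestions map onto index settings. The sketch below builds a request body that could be sent with PUT /<index>/_settings. The helper name `tuning_settings` and the values "30s" and "1gb" are illustrative assumptions, not recommendations from the original text; `index.refresh_interval` and `index.translog.flush_threshold_size` are the settings being adjusted.

```python
import json


def tuning_settings(refresh_interval="30s", flush_threshold_size="1gb"):
    """Build a settings body that refreshes and flushes less often."""
    return {
        "index": {
            # Fewer refreshes means fewer, larger segments.
            "refresh_interval": refresh_interval,
            # A larger translog threshold delays size-triggered flushes.
            "translog": {"flush_threshold_size": flush_threshold_size},
        }
    }


print(json.dumps(tuning_settings(), indent=2))
```

Longer intervals trade search freshness for fewer segments, so the right values depend on how quickly new documents must become visible.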
5. Force merge and automated force merge configuration
5.1 Force merge
Force merge accepts three parameters. The endpoint and its parameters are as follows:
POST /kimchy/_forcemerge?only_expunge_deletes=false&max_num_segments=100&flush=true
Parameter | Description |
---|---|
max_num_segments | The number of segments to merge to. To fully merge the index, set it to 1. Defaults to simply checking if a merge needs to execute, and if so, executes it. |
only_expunge_deletes | Should the merge process only expunge segments with deletes in it. In Lucene, a document is not deleted from a segment, just marked as deleted. During a merge process of segments, a new segment is created that does not have those deletes. This flag allows to only merge segments that have deletes. Defaults to false. Note that this won’t override the index.merge.policy.expunge_deletes_allowed threshold. |
flush | Should a flush be performed after the forced merge. Defaults to true. |
ES can also force merge several indices in one request, and it is even possible to force merge every index in the cluster at once, but watch the disk I/O when doing so. The syntax is:
POST /shakespeare,blogs_analyzed/_forcemerge?max_num_segments=1
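When these requests are generated from a script, a small helper keeps the parameters consistent. The following is a sketch only: `forcemerge_path` is a hypothetical helper, and it merely builds the request path shown above without sending anything. The guard against combining `only_expunge_deletes=true` with `max_num_segments` reflects my understanding that newer ES releases reject that combination; treat it as an assumption and check it against your version.

```python
from urllib.parse import urlencode


def forcemerge_path(indices, max_num_segments=None,
                    only_expunge_deletes=False, flush=True):
    """Return the path and query string for a _forcemerge request."""
    if only_expunge_deletes and max_num_segments is not None:
        # Assumption: the two options are treated as mutually exclusive.
        raise ValueError("use only_expunge_deletes or max_num_segments, not both")

    # An empty list targets every index in the cluster via _all.
    target = ",".join(indices) if indices else "_all"
    params = {
        "only_expunge_deletes": str(only_expunge_deletes).lower(),
        "flush": str(flush).lower(),
    }
    if max_num_segments is not None:
        params["max_num_segments"] = max_num_segments
    return f"/{target}/_forcemerge?{urlencode(params)}"
```

Calling `forcemerge_path(["shakespeare", "blogs_analyzed"], max_num_segments=1)` yields the multi-index request above with the default parameters spelled out explicitly.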
The following request shows the segment status after a force merge:
GET /_cat/indices/?s=segmentsCount:desc&v&h=index,segmentsCount,segmentsMemory,memoryTotal,mergesCurrent,mergesCurrentDocs,storeSize,p,r
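The segment count reported per index covers all shard copies, so a fully merged index with several primaries and replicas still reports more than one segment. The sketch below flags indices that exceed the expected count. It assumes the same request is issued with `format=json` added, that the JSON keys match the requested column names (`index`, `segmentsCount`, `p`, `r`), and that all replicas are assigned; the function name `not_fully_merged` is hypothetical.

```python
import json


def not_fully_merged(cat_json, max_num_segments=1):
    """Return (index, segment_count) pairs that still exceed the target."""
    pending = []
    for row in json.loads(cat_json):
        # Every primary and every replica copy holds its own segments.
        shard_copies = int(row["p"]) * (1 + int(row["r"]))
        segments = int(row["segmentsCount"])
        if segments > shard_copies * max_num_segments:
            pending.append((row["index"], segments))
    # Worst offenders first, mirroring s=segmentsCount:desc.
    return sorted(pending, key=lambda item: item[1], reverse=True)
```

An index with 5 primaries and 1 replica, for instance, is considered fully merged at 10 segments when `max_num_segments` is 1.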
5.2 Automated force merge configuration
Curator can be used to force merge indices automatically.
Installing Curator
wget https://packages.elastic.co/curator/5/centos/7/Packages/elasticsearch-curator-5.8.3-1.x86_64.rpm
rpm -ivh elasticsearch-curator-5.8.3-1.x86_64.rpm
mkdir -p /appdata/curator-5.8.3/logs && mkdir -p /appdata/curator-5.8.3/actions
forcemerge.yml configuration
actions:
  1:
    action: forcemerge
    description: >-
      forceMerge logs- prefixed indices older than 2 days (based on the date
      in the index name) to 1 segment per shard. Delay 120 seconds between each
      forceMerge operation to allow the cluster to quiesce. Skip indices that
      have already been forcemerged to the minimum number of segments to avoid
      reprocessing.
    options:
      max_num_segments: 1
      # Seconds to wait between indices so the cluster can settle
      delay: 120
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    # Keep only indices whose name starts with logs-
    - filtertype: pattern
      kind: prefix
      value: logs-
      exclude:
    # Keep only indices whose name carries a date more than 2 days old
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 2
      exclude:
    # Skip indices that are already merged down to 1 segment per shard
    - filtertype: forcemerged
      max_num_segments: 1
      exclude: true
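The pattern and age filters in this action file can be approximated in Python to check which indices a run would touch. This is a simplified sketch, not Curator's implementation: `select_for_forcemerge` is a hypothetical function, it assumes index names consist of exactly the prefix followed by the date, and it ignores the forcemerged filter, which needs live segment counts from the cluster.

```python
from datetime import datetime, timedelta


def select_for_forcemerge(index_names, now, prefix="logs-",
                          timestring="%Y-%m-%d", unit_count=2):
    """Return the indices that pass the prefix and age filters."""
    cutoff = now - timedelta(days=unit_count)
    selected = []
    for name in index_names:
        # Pattern filter: keep only names with the expected prefix.
        if not name.startswith(prefix):
            continue
        # Age filter (source: name): read the date out of the index name.
        try:
            stamp = datetime.strptime(name[len(prefix):], timestring)
        except ValueError:
            continue  # no parsable date, so the index is skipped
        if stamp < cutoff:
            selected.append(name)
    return selected
```

Running it with a fixed `now` against a list of index names gives a quick offline preview before scheduling the real job.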
curator.yml configuration
client:
  hosts: ["192.168.248.115:9200"]
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  aws_key:
  aws_secret_key:
  aws_region:
  ssl_no_validate: False
  http_auth: elastic:elastic
  timeout: 30
  master_only: False
logging:
  loglevel: DEBUG
  logfile: /appdata/curator-5.8.3/logs/log.log
  logformat: default
  blacklist: []
crontab deployment
0 2 * * * curator --config /appdata/curator-5.8.3/curator.yml /appdata/curator-5.8.3/actions/forcemerge.yml
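6. Summary
Merge and force merge are among the most common optimizations in day-to-day ES operations. Applied sensibly, they can make your cluster more stable.
7. References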
https://www.elastic.co/guide/en/elasticsearch/guide/master/merge-process.html#img-merge