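1 Introduction
Curator has been around in the Elasticsearch community for a long time. It helps you manage your indices and fits many scenarios. This article covers the tool from two angles. The first is the operations engineer's view: using Curator for routine index maintenance such as force merge, close, and delete, as well as scheduled index backups. The second is the architect's view: using Curator to separate hot and cold data so that Elasticsearch data migrates automatically from hot nodes to cold nodes.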
2 Version notes
2.1 Curator and ES version compatibility chart
2.2 Test environment
Linux version | Elasticsearch version | Curator version |
---|---|---|
Red Hat 7.6 | Elasticsearch 7.2 | Curator 5.8.3 |
3 Setting up the test environment
3.1 Elasticsearch nodes
Hot node | Warm node | Cold node |
---|---|---|
192.168.248.116:9200 | 192.168.248.117:9200 | 192.168.248.115:9200 |
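3.2 Installing Curator
# Create the working directory, including the logs directory used later by the logging config and the cron jobs
mkdir -p /appdata/curator-5.8.3/logs && cd /appdata/curator-5.8.3
wget https://packages.elastic.co/curator/5/centos/7/Packages/elasticsearch-curator-5.8.3-1.x86_64.rpm && yum install -y ./elasticsearch-curator-5.8.3-1.x86_64.rpm
That completes the Curator installation. Now for the main part.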
4 Creating the Curator configuration file
cd /appdata/curator-5.8.3
vim curator.yml
######################################
client:
  hosts: ["192.168.248.115:9200"]
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  aws_key:
  aws_secret_key:
  aws_region:
  ssl_no_validate: False
  http_auth: elastic:xxx
  timeout: 30
  master_only: False
logging:
  loglevel: INFO
  logfile: /appdata/curator-5.8.3/logs/log.log
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']
#########################################
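I will only discuss two of these parameters; the rest are left at their defaults.
- timeout: The default is 30 (seconds), and in general it should not be set very high. If a particular operation needs a longer timeout, such as snapshot, restore, or forcemerge, set timeout_override in that action's options to override the client timeout for that action alone. Some long-running actions already have default override values. (In practice it is better to raise this value. Many of our data migrations failed because the operation timed out; after we set this parameter to 300, the problem never recurred.)
- http_auth: the Elasticsearch username and password, in the form username:password.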
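The precedence between the client-level timeout and a per-action timeout_override can be sketched in a few lines of Python. This only illustrates the rule described above and is not Curator's actual code; the function name `effective_timeout` and the example values are assumptions made for the illustration.

```python
from typing import Optional


def effective_timeout(client_timeout: int, timeout_override: Optional[int]) -> int:
    """Return the timeout (in seconds) that applies to a single action.

    Hypothetical helper: a timeout_override set in the action file wins
    over the client-wide timeout from curator.yml.
    """
    if timeout_override is not None:
        return timeout_override
    return client_timeout


# timeout: 30 in curator.yml, timeout_override left empty in the action file
print(effective_timeout(30, None))
# same client timeout, but the action file sets timeout_override: 300
print(effective_timeout(30, 300))
```

Raising the client timeout affects every action, while timeout_override lets a single slow action wait longer without changing the others.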
5 Creating action templates
5.1 Automatic force merge template
cd /appdata/curator-5.8.3 && mkdir actions && cd actions
vim forcemerge.yml
########################
actions:
  1:
    action: forcemerge
    description: >-
      forceMerge log_ prefixed indices older than 10 days (based on index
      name) to 1 segment per shard. Delay 120 seconds between each
      forceMerge operation to allow the cluster to quiesce. Skip indices that
      have already been forcemerged to the minimum number of segments to avoid
      reprocessing.
    options:
      max_num_segments: 1
      delay: 120
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_
      # exclude: False keeps the matching indices; True would drop them
      exclude: False
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 10
      exclude: False
    - filtertype: forcemerged
      max_num_segments: 1
      # exclude: True skips indices that are already merged to 1 segment
      exclude: True
######################
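How the filter chain narrows the index list, and what the exclude flag does to each filter, can be sketched in Python. This is a simplified model and not Curator's implementation: the helper names (`date_from_name`, `keep`, `select`) and the sample index names are hypothetical, only the %Y, %m and %d directives are handled, and names without a date are assumed never to be selected.

```python
import re
from datetime import datetime, timedelta


def date_from_name(name, timestring):
    """Parse the date embedded in an index name, or return None if absent."""
    pattern = re.escape(timestring)
    for directive, regex in (("%Y", r"\d{4}"), ("%m", r"\d{2}"), ("%d", r"\d{2}")):
        pattern = pattern.replace(re.escape(directive), regex)
    match = re.search(pattern, name)
    if match is None:
        return None
    return datetime.strptime(match.group(0), timestring)


def keep(matched, exclude):
    """exclude=False keeps matching indices; exclude=True drops them instead."""
    return matched != exclude


def select(indices, prefix, timestring, days, now,
           exclude_prefix=False, exclude_age=False):
    """Apply a prefix filter and an 'older than N days' filter in sequence."""
    selected = []
    for name in indices:
        stamp = date_from_name(name, timestring)
        if stamp is None:
            continue  # assumption: names without a date are never selected
        is_old = now - stamp > timedelta(days=days)
        if keep(name.startswith(prefix), exclude_prefix) and keep(is_old, exclude_age):
            selected.append(name)
    return selected


now = datetime(2024, 1, 20, 3, 0)
indices = [
    "log_app_2024.01.05",
    "log_app_2024.01.15",
    "metrics_2024.01.05",
    "metrics_2024.01.15",
]

# exclude False on both filters: only old log_ indices are selected
print(select(indices, "log_", "%Y.%m.%d", 10, now))
# exclude True on both filters: the selection is inverted
print(select(indices, "log_", "%Y.%m.%d", 10, now,
             exclude_prefix=True, exclude_age=True))
```

With exclude set to True on the pattern and age filters, the action would target recent indices that do not start with log_, which is the opposite of what the description asks for. Only the forcemerged filter should exclude its matches.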
5.2 Automatic close template
cd /appdata/curator-5.8.3/actions/
vim close.yml
########################
actions:
  1:
    action: close
    description: >-
      Close indices older than 7 days (based on index name) for log_
      prefixed indices.
    options:
      delete_aliases: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
########################
5.3 Automatic delete template
cd /appdata/curator-5.8.3/actions/
vim delete.yml
######################################
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name) for log_
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
      # exclude: False keeps the indices older than 30 days as deletion targets
      exclude: False
######################################
5.4 Automatic snapshot template
cd /appdata/curator-5.8.3/actions/
vim snapshot.yml
---
# Remember, leave a key empty if there is no value. None will be a string,
# not a Python "NoneType"
#
# The stock Curator examples ship with 'disable_action' set to True. It is
# set to False here so that the scheduled job actually takes the snapshot.
actions:
  1:
    action: snapshot
    description: >-
      Snapshot logstash- prefixed indices older than 1 day (based on index
      creation_date) with the default snapshot name pattern of
      'curator-%Y%m%d%H%M%S'. Wait for the snapshot to complete. Do not skip
      the repository filesystem access check. Use the other options to create
      the snapshot.
    options:
      # Required: the name of a snapshot repository already registered in Elasticsearch
      repository:
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name:
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 1
Scheduling the cron jobs
0 1 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/delete.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/delete.log 2>&1
0 2 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/close.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/close.log 2>&1
0 3 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/forcemerge.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/forcemerge.log 2>&1
0 4 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/snapshot.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/snapshot.log 2>&1
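Before a job goes into crontab, it can help to run the same action file once with Curator's --dry-run flag, which logs what would be done without changing any index. The sketch below only assembles the command line; the helper name `curator_command` is hypothetical, and the paths are the ones used in this article.

```python
import shlex


def curator_command(action_file, config_file, dry_run=False,
                    binary="/usr/bin/curator"):
    """Build the argument list for one Curator run (hypothetical helper)."""
    cmd = [binary, "--config", config_file]
    if dry_run:
        # --dry-run makes Curator report the actions without executing them
        cmd.append("--dry-run")
    cmd.append(action_file)
    return cmd


cmd = curator_command(
    "/appdata/curator-5.8.3/actions/delete.yml",
    "/appdata/curator-5.8.3/curator.yml",
    dry_run=True,
)
# Print a shell-safe version of the command for copy and paste
print(" ".join(shlex.quote(part) for part in cmd))
```

On a host where Curator is installed, the list can be passed to `subprocess.run(cmd, check=True)`; the Curator log then shows which indices each filter selected.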
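5.5 Hot-cold architecture and automatic data migration
5.5.1 Hot-cold architecture overview
As shown in the diagram above, an ES cluster consists of Master Nodes, Coordinating Nodes, Ingest Nodes, and Data Nodes.
- Master Node
  - Manages and distributes the cluster metadata (cluster state)
  - The brain of the cluster; it decides how data is allocated
- Data Node
  - Stores data and handles read and write requests
  - The worker that does the real heavy lifting
- Coordinating Node
  - Forwards requests
  - The traffic cop that routes read and write traffic to the right data nodes
- Ingest Node
  - Pre-processes and transforms data
  - Acts like a filter: it strips abnormal strings, tokenizes, converts characters, and so on, so that documents are analyzed more cleanly, much like a Logstash filter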
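With the architecture covered, let's look at how the Data Node layer is implemented. In our architecture diagram the Data Nodes come in three types, hot, warm, and cold, which hold the most recent 3 days, days 3-15, and days 16-30 of data respectively.
Assuming indices are named log_transaction_YYYY-MM-DD, the data is distributed across the data nodes as follows:
Node type | log_transaction_YYYY-MM-DD |
---|---|
Hot | Data from the last 3 days |
Warm | Data 3-15 days old |
Cold | Data 16-30 days old |
Archived to NBU or HDFS | Data older than 30 days |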
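The tier boundaries in the table can be expressed as a small Python function that maps an index name to its tier on a given day. This is a sketch under stated assumptions: the function name `tier_for` is hypothetical, the date is parsed with the `%Y-%m-%d` timestring used in the action files, and an index exactly 3, 15, or 30 days old is assumed to still belong to the younger tier.

```python
from datetime import date, datetime


def tier_for(index_name, today, prefix="log_transaction_", timestring="%Y-%m-%d"):
    """Return the tier an index belongs to under the 3/15/30-day scheme."""
    if not index_name.startswith(prefix):
        raise ValueError(f"unexpected index name: {index_name}")
    stamp = datetime.strptime(index_name[len(prefix):], timestring).date()
    age_days = (today - stamp).days
    if age_days <= 3:
        return "hot"
    if age_days <= 15:
        return "warm"
    if age_days <= 30:
        return "cold"
    return "archive"


today = date(2024, 1, 31)
for name in (
    "log_transaction_2024-01-30",
    "log_transaction_2024-01-21",
    "log_transaction_2024-01-11",
    "log_transaction_2023-12-01",
):
    print(name, tier_for(name, today))
```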
5.5.2 Implementing automatic migration
1. Hot to Warm: writing the action file
cd /appdata/curator-5.8.3/actions/
vim Allocation_Warm.yml
actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to the specified indices, Hot to Warm"
    options:
      key: box_type
      value: warm
      allocation_type: require
      wait_for_completion: true
      timeout_override:
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      # Move indices to the warm nodes once they are older than 3 days
      unit: days
      unit_count: 3
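In effect, the allocation action updates one index setting, and Elasticsearch then relocates the shards to nodes whose custom attribute matches. The Python sketch below builds that setting from the key, value, and allocation_type options. The helper name `allocation_setting` is hypothetical, and the sketch assumes each data node was started with a matching `node.attr.box_type` value (hot, warm, or cold) in its elasticsearch.yml, which this article does not show.

```python
import json


def allocation_setting(key, value, allocation_type="require"):
    """Return the index setting that pins shards to nodes tagged key=value."""
    return {f"index.routing.allocation.{allocation_type}.{key}": value}


# Options from Allocation_Warm.yml: key box_type, value warm, allocation_type require
body = allocation_setting("box_type", "warm")
print(json.dumps(body))
```

Sending this body to the update index settings API for a single index (`PUT /<index>/_settings`) should have the same effect by hand, which is a convenient way to test the node attributes before automating the move.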
2. Warm to Cold: writing the action file
cd /appdata/curator-5.8.3/actions/
vim Allocation_Cold.yml
actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to the specified indices, Warm to Cold"
    options:
      key: box_type
      value: cold
      allocation_type: require
      wait_for_completion: true
      timeout_override:
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      # Move indices to the cold nodes once they are older than 15 days
      unit: days
      unit_count: 15
3. Deleting data older than 30 days
cd /appdata/curator-5.8.3/actions/
vim delete.yml
######################################
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name) for log_transaction_
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 30
      # exclude: False keeps the indices older than 30 days as deletion targets
      exclude: False
######################################
4. Scheduling the cron jobs
The jobs run every day so that indices move between tiers as soon as they cross the day-based thresholds.
0 1 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/Allocation_Warm.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/Allocation_Warm.log 2>&1
0 3 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/Allocation_Cold.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/Allocation_Cold.log 2>&1
0 5 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/delete.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/delete.log 2>&1
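6 Conclusion
This article does not cover how to archive data older than 30 days, but that part is also easy to implement. My own approach is: 1. use Curator to take snapshot backups of data older than 25 days; 2. run a daily crontab job that checks whether the backup succeeded and, if it did, deletes the data automatically via delete.yml. To learn how to set up the backup environment, see the article "Elasticsearch基于nfs的备份环境搭建" (Setting up an NFS-based backup environment for Elasticsearch).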
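The "check the backup, then delete" step can be sketched as a small Python function. It is an illustration and not the author's actual script: the function name `indices_safe_to_delete` and the sample data are hypothetical, and the input is assumed to be one entry of the `snapshots` array returned by the Elasticsearch get snapshot API, with its `state` and `indices` fields.

```python
def indices_safe_to_delete(snapshot_info, candidates):
    """Return the candidate indices contained in a successfully completed snapshot."""
    if snapshot_info.get("state") != "SUCCESS":
        return []  # never delete anything when the backup did not fully succeed
    backed_up = set(snapshot_info.get("indices", []))
    return [name for name in candidates if name in backed_up]


# Hypothetical snapshot entry and deletion candidates
snapshot_info = {
    "snapshot": "curator-20240131040000",
    "state": "SUCCESS",
    "indices": ["log_transaction_2024-01-01", "log_transaction_2024-01-02"],
}
candidates = ["log_transaction_2024-01-01", "log_transaction_2024-01-03"]
print(indices_safe_to_delete(snapshot_info, candidates))
```

Only indices that appear in a successful snapshot are returned, so an index that was never backed up stays in the cluster even when it is past the retention period.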
7 References
https://www.elastic.co/guide/en/elasticsearch/client/curator/5.8/installation.html
Note: If you have any questions or suggestions, please send them to 13580480392@163.com. I will reply promptly. Thank you for your support!