How to Manage Your Elasticsearch Indices

1 Introduction

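Curator has been around in the community for a long time, and it makes index management much easier across a wide range of scenarios. This article looks at the tool from two angles. The first is the operations engineer's: using curator for routine index maintenance such as force merge, close, delete, and scheduled index backups. The second is the architect's: using curator to separate hot and cold data and to migrate data between hot and cold ES nodes automatically.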

2 Version Notes

2.1 Curator and ES version compatibility chart

2.2 Test environment

Linux version	Elasticsearch version	curator version
Redhat 7.6	Elasticsearch 7.2	curator 5.8.3

3 Setting up the test environment

3.1 Elasticsearch nodes

hot node	warm node	cold node
192.168.248.116:9200	192.168.248.117:9200	192.168.248.115:9200

3.2 Installing curator


mkdir -p /appdata/curator-5.8.3 && cd  /appdata/curator-5.8.3
wget https://packages.elastic.co/curator/5/centos/7/Packages/elasticsearch-curator-5.8.3-1.x86_64.rpm && yum install -y ./elasticsearch-curator-5.8.3-1.x86_64.rpm

That completes the curator installation. Now for the main part.

4 Creating the curator configuration file


cd /appdata/curator-5.8.3
vim curator.yml
######################################
client:  
   hosts: ["192.168.248.115:9200"]
   url_prefix:
   use_ssl: False
   certificate:
   client_cert:
   client_key:
   aws_key:
   aws_secret_key:
   aws_region:
   ssl_no_validate: False
   http_auth: elastic:xxx
   timeout: 30
   master_only: False

logging:
   loglevel: INFO
   logfile: /appdata/curator-5.8.3/logs/log.log
   logformat: default
   blacklist: ['elasticsearch', 'urllib3']
#########################################

Only two of these parameters need explanation; the rest are left at their defaults.

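timeout: defaults to 30 (seconds) and generally should not be set very high. If a given operation needs a longer timeout, such as snapshot, restore, or forcemerge, you can override the client timeout for that action by setting timeout_override in its options. Some long-running actions also have default override values. (In practice we found a larger value safer: several of our data migrations failed because the operation timed out, and after we raised this parameter to 300 the problem did not recur.)
http_auth: the ES username and password, in the form user:password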
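
The relationship between the client timeout and timeout_override can be sketched as below. This is a simplified model for illustration only, not curator's code: the function name effective_timeout is hypothetical, and the sketch deliberately ignores the built-in default overrides that curator applies to some long-running actions.

```python
from typing import Optional


def effective_timeout(client_timeout: int, timeout_override: Optional[int]) -> int:
    """Return the timeout in seconds an action runs with in this simplified model.

    An empty `timeout_override:` key in an action file is represented as None.
    Hypothetical helper: curator's own defaults for long-running actions are
    intentionally left out.
    """
    if timeout_override is None:
        # Nothing set on the action, so the client-level timeout applies.
        return client_timeout
    # A value set on the action replaces the client timeout for that action only.
    return timeout_override


print(effective_timeout(30, None))
print(effective_timeout(30, 600))
```

Under this model, keeping the client timeout modest and raising timeout_override only on slow actions limits the longer wait to the operations that need it.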

5 Creating action templates

5.1 Force merge template


cd /appdata/curator-5.8.3 && mkdir actions && cd actions
vim forcemerge.yml
########################
actions:
  1:
    action: forcemerge
    description: >-
      forceMerge log_ prefixed indices older than 10 days (based on index
      name) to 1 segment per shard.  Delay 120 seconds between each
      forceMerge operation to allow the cluster to quiesce. Skip indices that
      have already been forcemerged to the minimum number of segments to avoid
      reprocessing.
    options:
      max_num_segments: 1
      delay: 120
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_
      exclude: False
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 10
      exclude: False
    - filtertype: forcemerged
      max_num_segments: 1
      exclude: True
######################
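
How a chain of pattern and age filters narrows the list of indices, and what the exclude flag does, can be sketched in a few lines. This is a simplified model rather than curator's implementation: the function names and the sample index names are hypothetical, and the age check compares whole dates instead of epoch seconds.

```python
from datetime import datetime, timedelta
import re


def matches_prefix(index: str, prefix: str) -> bool:
    """Model of filtertype: pattern with kind: prefix."""
    return index.startswith(prefix)


def older_than_by_name(index: str, days: int, now: datetime) -> bool:
    """Model of filtertype: age with source: name and timestring '%Y.%m.%d'."""
    found = re.search(r"\d{4}\.\d{2}\.\d{2}", index)
    if found is None:
        return False  # no date in the name, so the age filter cannot match
    index_date = datetime.strptime(found.group(0), "%Y.%m.%d")
    return index_date < now - timedelta(days=days)


def apply_filter(indices, predicate, exclude=False):
    """Keep the indices that match, or drop them when exclude is True."""
    return [i for i in indices if predicate(i) != exclude]


def select(indices, prefix, days, now):
    """Filters are applied in order, each one narrowing the previous result."""
    by_prefix = apply_filter(indices, lambda i: matches_prefix(i, prefix))
    return apply_filter(by_prefix, lambda i: older_than_by_name(i, days, now))


# Hypothetical index names, evaluated at a fixed "now" so the result is stable.
now = datetime(2020, 6, 30)
names = ["log_app-2020.06.01", "log_app-2020.06.29", "other-2020.06.01"]
print(select(names, "log_", 10, now))  # ['log_app-2020.06.01']
print(apply_filter(names, lambda i: matches_prefix(i, "log_"), exclude=True))  # ['other-2020.06.01']
```

The second call shows why the exclude flag deserves attention: with exclude set to True, the filter keeps everything that does not match the pattern.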

5.2 Close template


cd /appdata/curator-5.8.3/actions/
vim close.yml
########################
actions:
  1:
    action: close
    description: >-
      Close indices older than 7 days (based on index name) for log_
      prefixed indices.
    options:
      delete_aliases: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 7
########################

5.3 Delete template


cd /appdata/curator-5.8.3/actions/
vim delete.yml
######################################
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name) for log_
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 30
      exclude: False

5.4 Snapshot template


cd /appdata/curator-5.8.3/actions/
vim snapshot.yml
---
# Remember, leave a key empty if there is no value.  None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True.  If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: snapshot
    description: >-
      Snapshot logstash- prefixed indices older than 1 day (based on index
      creation_date) with the default snapshot name pattern of
      'curator-%Y%m%d%H%M%S'.  Wait for the snapshot to complete.  Do not skip
      the repository filesystem access check.  Use the other options to create
      the snapshot.
    options:
      repository:
      # Leaving name blank will result in the default 'curator-%Y%m%d%H%M%S'
      name:
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      disable_action: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: logstash-
    - filtertype: age
      source: creation_date
      direction: older
      unit: days
      unit_count: 1

Scheduling the jobs


0 1 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/delete.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/delete.log 2>&1
0 2 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/close.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/close.log 2>&1
0 3 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/forcemerge.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/forcemerge.log 2>&1
0 4 */1 * * /usr/bin/curator /appdata/curator-5.8.3/actions/snapshot.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/snapshot.log 2>&1
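
Before putting a new action file into crontab, it helps to run it once with curator's --dry-run flag, which logs what would be done without changing anything. The wrapper below is a minimal sketch: the function names are hypothetical, and the curator binary path is taken from the crontab entries above.

```python
import subprocess
from typing import List

CURATOR_BIN = "/usr/bin/curator"  # path used in the crontab entries


def curator_command(action_file: str, config: str, dry_run: bool = True) -> List[str]:
    """Build the curator command line; dry-run is on by default as a safety net."""
    cmd = [CURATOR_BIN, "--config", config]
    if dry_run:
        cmd.append("--dry-run")
    cmd.append(action_file)
    return cmd


def run_curator(action_file: str, config: str, dry_run: bool = True) -> int:
    """Run curator and return its exit code (hypothetical helper)."""
    completed = subprocess.run(curator_command(action_file, config, dry_run), check=False)
    return completed.returncode
```

For example, run_curator("/appdata/curator-5.8.3/actions/delete.yml", "/appdata/curator-5.8.3/curator.yml") would perform a dry run of the delete action; passing dry_run=False runs it for real.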

5.5 Hot-warm-cold architecture and automated data migration

5.5.1 Hot-warm-cold architecture overview

As the figure above shows, an ES cluster is made up of four node roles: Master Node, Coordinate Node, Ingest Node, and Data Node.

Master Node

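The master node is mainly responsible for managing and distributing cluster metadata (the cluster state).
It is the brain of the cluster and decides things such as how data is allocated.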

Data Node

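Data nodes are mainly responsible for storing data and handling read and write requests.
They are the workers that do the actual heavy lifting.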

Coordinate Node

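Coordinating nodes are mainly responsible for forwarding requests.
They are the traffic police, directing read and write traffic to the right data nodes.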

Ingest Node

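Ingest nodes mainly pre-process and transform data.
They act as a filter: by removing malformed strings, tokenizing, converting characters and so on, they prepare documents for better tokenization, much like a logstash filter.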

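With the hot-warm-cold architecture introduced, let's look at how to implement the Data Node part. In our architecture diagram the data nodes come in three types, hot, warm, and cold, which hold data from the last 3 days, data 3-15 days old, and data 16-30 days old respectively.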

Assume our indices are named log_transaction_YYYY-MM-DD. They are then distributed across the data nodes as follows:

Node type	log_transaction_YYYY-MM-DD
Hot	Data from the last 3 days
Warm	Data 3-15 days old
Cold	Data 16-30 days old
Archived to NBU or HDFS	Data older than 30 days
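
The table can be read as a simple mapping from index age to storage tier. The sketch below encodes it directly; the function name tier_for_age is hypothetical, and the exact handling of the boundary days (3, 15, and 30) is an assumption based on how the ranges are written in the table.

```python
def tier_for_age(age_days: int) -> str:
    """Map an index's age in days to the tier it should live on.

    Boundaries follow the table: under 3 days is hot, 3-15 is warm,
    16-30 is cold, and anything older is archived.
    """
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    if age_days < 3:
        return "hot"
    if age_days <= 15:
        return "warm"
    if age_days <= 30:
        return "cold"
    return "archive"


for age in (0, 3, 15, 16, 30, 31):
    print(age, tier_for_age(age))
```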

5.5.2 Implementing automated migration

1. Hot to Warm: writing the action file


cd /appdata/curator-5.8.3/actions/
vim Allocation_Warm.yml
actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to the specified indices,Hot to Warm"
    options:
      key: box_type
      value: warm
      allocation_type: require
      wait_for_completion: true
      timeout_override:
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 3

2. Warm to Cold: writing the action file


cd /appdata/curator-5.8.3/actions/
vim Allocation_Cold.yml
actions:
  1:
    action: allocation
    description: "Apply shard allocation filtering rules to the specified indices,Warm to Cold"
    options:
      key: box_type
      value: cold
      allocation_type: require
      wait_for_completion: true
      timeout_override:
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 15
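
Under the hood, the allocation action updates an index-level shard allocation filtering setting whose name is built from allocation_type and key. The sketch below shows the setting that the two action files above amount to; the function name is hypothetical. It also assumes each data node has been tagged with a matching custom attribute (for example node.attr.box_type in elasticsearch.yml), which this article does not show.

```python
def allocation_setting(key: str, value: str, allocation_type: str = "require") -> dict:
    """Build the index setting that corresponds to an allocation action."""
    if allocation_type not in ("require", "include", "exclude"):
        raise ValueError("allocation_type must be require, include, or exclude")
    return {f"index.routing.allocation.{allocation_type}.{key}": value}


print(allocation_setting("box_type", "warm"))
print(allocation_setting("box_type", "cold"))
```

Once the setting changes, Elasticsearch relocates the index's shards to nodes whose attribute matches; wait_for_completion: true makes curator wait until that relocation has finished.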

3. Deleting data older than 30 days


cd /appdata/curator-5.8.3/actions/
vim delete.yml
######################################
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days (based on index name) for log_transaction_
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: log_transaction_
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 30
      exclude: False

4. Scheduling the jobs


0 1 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/Allocation_Warm.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/Allocation_Warm.log 2>&1
0 3 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/Allocation_Cold.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/Allocation_Cold.log 2>&1
0 5 * * * /usr/bin/curator /appdata/curator-5.8.3/actions/delete.yml --config /appdata/curator-5.8.3/curator.yml > /appdata/curator-5.8.3/logs/delete.log 2>&1

6 Conclusion

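This article does not cover how to archive data older than 30 days, but that part is also easy to implement. What I do locally is: 1. use curator to take a snapshot backup of data older than 25 days; 2. run a daily crontab job that checks whether the backup succeeded, and if it did, delete the data automatically through delete.yml. If you want to know how to set up the backup environment, see the article "Elasticsearch基于nfs的备份环境搭建" (setting up an NFS-based backup environment for Elasticsearch).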
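
The "check that the backup succeeded before deleting" step can be sketched as a small check on the body returned by the snapshot API (GET _snapshot/<repository>/<snapshot>), which reports a state for each snapshot. This is only an illustration of the idea, not the script I run: the function name and the sample response are hypothetical, and fetching the response from the cluster is left out.

```python
def snapshots_succeeded(response: dict) -> bool:
    """Return True only if the response lists at least one snapshot and all are SUCCESS."""
    snapshots = response.get("snapshots", [])
    return bool(snapshots) and all(s.get("state") == "SUCCESS" for s in snapshots)


# Hypothetical, trimmed-down response bodies for illustration.
ok = {"snapshots": [{"snapshot": "curator-20200630010000", "state": "SUCCESS"}]}
bad = {"snapshots": [{"snapshot": "curator-20200630010000", "state": "PARTIAL"}]}

print(snapshots_succeeded(ok))
print(snapshots_succeeded(bad))
```

A daily job could run the delete action only when this check returns True, so that indices are never removed before their backup is confirmed.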

7 References

https://www.elastic.co/guide/en/elasticsearch/client/curator/5.8/installation.html

Note: if you have questions or suggestions, please send them to 13580480392@163.com. I will reply promptly. Thank you for your support!
