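The Complete Guide to Upgrading Elasticsearch from 1.7 to 7.x

Upgrading ES 5.3 to ES 5.5.x (minor version upgrade)

Installing ES 5.3

Any specific ES version can be downloaded from the address below. Install ES 5.3.0 first, then upgrade it to 5.5.x (the download commands in this guide fetch 5.5.0).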

www.elastic.co/cn/download…

Then run the following commands:

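# Change to the working directory
cd /opt
# Download the archive
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.3.0.zip
# Unpack it
unzip elasticsearch-5.3.0.zip
# Copy the config directory outside the installation (-r is needed to copy a directory)
mkdir -p ./temp
cp -r elasticsearch-5.3.0/config ./temp/config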

Create the data and log directories, then run vi ./temp/config/elasticsearch.yml and set the two path entries as follows:

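# Create the directories (shell commands; run these before editing the file)
mkdir -p /var/data/elasticsearch
mkdir -p /var/log/elasticsearch
# Data directory (entry in elasticsearch.yml)
path.data: /var/data/elasticsearch
# Log directory (entry in elasticsearch.yml)
path.logs: /var/log/elasticsearch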

Elasticsearch cannot be started as root, so create a dedicated user named elasticsearch to run it.

# Create the elasticsearch user (as root)
adduser elasticsearch
passwd elasticsearch

# Give the elasticsearch user ownership of the files (still as root)
chown -R elasticsearch /opt/elasticsearch-5.3.0
chown -R elasticsearch /var/log/elasticsearch
chown -R elasticsearch /var/data/elasticsearch
chown -R elasticsearch /opt/temp/config

# Switch to the elasticsearch user
su elasticsearch

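Go to the bin directory of elasticsearch and run ./elasticsearch -d -Epath.conf=/opt/temp/config. The -Epath.conf option starts elasticsearch with the configuration that was copied outside the installation directory.

Follow the startup log with tail -f /var/log/elasticsearch/my-application.log to check whether the node started successfully. The log file is named after cluster.name, so this path assumes cluster.name: my-application is set in elasticsearch.yml; with the default cluster name the file is elasticsearch.log.

Run curl -XGET 'http://localhost:9200/_cat/nodes?v' to check the state of the ES nodes.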

# Insert one document
curl -XPUT 'http://localhost:9200/forum/article/1?pretty' -d '
{
  "title": "first article",
  "content": "this is my first article"
}'
# Read the document back
curl -XGET 'http://localhost:9200/forum/article/1?pretty'

Upgrading by installing ES 5.5.x

Installing ES 5.5.x

cd /opt
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.0.zip
unzip elasticsearch-5.5.0.zip

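The archive unpacks into a directory named elasticsearch-5.5.0. You could replace the ES 5.3 installation with it (remove the old elasticsearch-5.3.0 directory first, because mv onto an existing directory moves the new one inside it instead of overwriting it), or simply use the new directory as it is. Here we leave the original version in place and run 5.5.0 from its own directory.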

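Preparation

When a node is stopped, every shard on it becomes unavailable. Elasticsearch promotes replica shards to replace any lost primary shards, and after waiting one minute (the default delayed-allocation timeout) it starts shard recovery, creating new replica shards on the remaining nodes to restore the configured replica count. This recovery causes a large amount of I/O that is unnecessary, because the node is about to come back. So before shutting down a node to upgrade it, disable shard allocation:
curl -XPUT 'http://localhost:9200/_cluster/settings?pretty' -d ' { "persistent": { "cluster.routing.allocation.enable": "none" } }'
Stop non-essential indexing and perform a flush. Writes may continue during the upgrade, but continuous writes can lengthen shard recovery when the node restarts, because much of the data is still in the translog and has not been flushed to disk. If writes are paused and a flush is performed, almost nothing has to be replayed from the translog on restart, so shard recovery, and therefore the restart, is fast. The flush request is POST _flush/synced. A synced flush is best-effort and may fail while heavy indexing is in progress, so it may have to be repeated until it succeeds:
curl -XPOST 'http://localhost:9200/_flush/synced?pretty'
If you have installed plugins or customized jvm.options, first copy the plugins directory (for example /usr/local/elasticsearch/plugins) and jvm.options somewhere safe as a backup. Then delete the old ES installation directory and unpack the new version. Make absolutely sure that the config, data, and log directories are not overwritten, otherwise data, logs, configuration files, and installed plugins will be lost.
The backed-up plugins directory and jvm.options can then be copied back into the freshly unpacked ES installation directory. Check first that every plugin is compatible with the target ES version: each plugin's git repository, reachable from the official site, lists which plugin version matches which ES version. If a plugin is not compatible, reinstall its latest version with the elasticsearch-plugin script.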
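
The first two preparation steps can be scripted. The following is a minimal sketch rather than a definitive procedure: the function names (allocation_body, set_allocation, retry, synced_flush), the ES_URL and RETRY_SLEEP variables, and the retry count are all assumptions made for illustration. It relies on curl -f turning an HTTP error status into a non-zero exit code, and on the synced flush endpoint answering with 409 Conflict when some shards could not be flushed, which is the documented behavior as far as I am aware.

```shell
#!/bin/sh
# Sketch only: ES_URL and all function names here are illustrative assumptions.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the cluster-settings body that sets shard allocation to "none" or "all".
allocation_body() {
  printf '{"persistent":{"cluster.routing.allocation.enable":"%s"}}' "$1"
}

# Apply the allocation setting to the cluster.
# The Content-Type header is optional on 5.x and required from 6.0 onwards.
set_allocation() {
  curl -sf -XPUT "$ES_URL/_cluster/settings?pretty" \
    -H 'Content-Type: application/json' -d "$(allocation_body "$1")"
}

# Run a command up to N times, pausing RETRY_SLEEP seconds (default 5) between attempts.
retry() {
  attempts="$1"; shift
  i=1
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "${RETRY_SLEEP:-5}"
  done
}

# One synced flush attempt; curl -f fails on an HTTP error such as 409 Conflict.
synced_flush() {
  curl -sf -XPOST "$ES_URL/_flush/synced?pretty"
}

# Typical order before stopping a node (left commented out so nothing runs on load):
#   set_allocation none && retry 5 synced_flush
```

After the upgraded node is back, set_allocation all re-enables allocation with the same helper.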

Stopping the old ES 5.3 service

# Find the process ID of ES
ps -ef | grep Elasticsearch
# Stop the running ES 5.3 process (replace [PID] with the ID found above)
kill [PID]

Performing the upgrade

# Give the elasticsearch user ownership of the new installation
chown -R elasticsearch /opt/elasticsearch-5.5.0

cd /opt/elasticsearch-5.5.0/bin

# -Epath.conf points at the configuration copied out of the original 5.3.0 installation
# -Enetwork.host=0.0.0.0 accepts connections from any address; it can also be set in elasticsearch.yml
./elasticsearch -d  -Epath.conf=/opt/temp/config -Enetwork.host=0.0.0.0

# Alternatively, start it like this to set the JVM heap size
ES_JAVA_OPTS="-Xms2g -Xmx2g" ./elasticsearch -d  -Epath.conf=/opt/temp/config -Enetwork.host=0.0.0.0

# Follow the ES startup log
tail -f /var/log/elasticsearch/my-application.log

# Re-enable shard allocation once the node is up
curl -XPUT 'http://localhost:9200/_cluster/settings?pretty' -d '
{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  }
}'

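Wait for the node to finish shard recovery. The cluster has to complete shard allocation, and progress can be checked with GET _cat/health. Be sure to wait until the cluster status changes from yellow to green; green means that all primary and replica shards are available.
curl -XGET 'http://localhost:9200/_cat/health?pretty'
During a rolling upgrade, a primary shard allocated to a node on the newer version never copies its replicas to a node on the older version, because the newer data format is not compatible with the older version. If replicas cannot be copied to any other node, for example when only one node in the cluster runs the new version, some replica shards stay unassigned and the cluster status stays yellow. In that case simply continue upgrading the other nodes; once they run the new version, the replicas are copied and the status turns green.
Shards that were not flushed need some time to recover, because data has to be replayed from the translog. Recovery progress can be checked with GET _cat/recovery.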
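
Waiting for green can be automated with a small polling loop. This is a sketch under stated assumptions: health_status and wait_for_green are made-up helper names, ES_URL and POLL_SECONDS are hypothetical variables, and the parsing assumes the default _cat/health column order (epoch, timestamp, cluster, status, ...) with no header row.

```shell
#!/bin/sh
# Sketch only: ES_URL, POLL_SECONDS and the function names are illustrative assumptions.
ES_URL="${ES_URL:-http://localhost:9200}"

# Read one line of `_cat/health` output on stdin and print its status column.
# Without ?v the output has no header; the columns are: epoch timestamp cluster status ...
health_status() {
  awk 'NR == 1 { print $4 }'
}

# Block until the cluster reports green, printing the current status while waiting.
wait_for_green() {
  while :; do
    status="$(curl -s "$ES_URL/_cat/health" | health_status)"
    [ "$status" = "green" ] && return 0
    echo "cluster status: ${status:-unreachable}; waiting" >&2
    sleep "${POLL_SECONDS:-10}"
  done
}

# Usage (commented out so that loading this file does not touch the cluster):
#   wait_for_green && echo "all primary and replica shards are allocated"
```

On a single upgraded node in a mixed-version cluster the loop would keep reporting yellow, as described above, so it is best run after the last node has been upgraded.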

Once the upgraded node has started successfully, run the following command to check that the data is still there:

curl -XGET 'http://localhost:9200/forum/article/1?pretty'

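The upgrade succeeded and the data was carried over!

Upgrading ES 2.4.3 to ES 5.5.x (upgrade across one major version)

Tip: ES can only use indices created in the previous major version. For example, ES 5.x can use indices created in ES 2.x but not indices created in ES 1.x. If ES 5.x is started on indices from a version that is too old, it fails to start.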
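
The rule in the tip can be written down as a small check. This is only an illustrative sketch: previous_major and can_open_index are made-up helper names, not part of any Elasticsearch tool. The one subtlety it encodes is that Elasticsearch skipped major versions 3 and 4, so the major version before 5 is 2.

```shell
#!/bin/sh
# Sketch only: the function names are illustrative, not part of any Elasticsearch tool.

# Print the major version that precedes the given one.
# Elasticsearch jumped from 2.x straight to 5.x, so the predecessor of 5 is 2.
previous_major() {
  case "$1" in
    5) echo 2 ;;
    *) echo $(( $1 - 1 )) ;;
  esac
}

# Succeed when a node of major version $1 can open an index created on major version $2.
can_open_index() {
  prev="$(previous_major "$1")"
  [ "$2" -eq "$1" ] || [ "$2" -eq "$prev" ]
}

# Usage:
#   can_open_index 5 1 || echo "reindex required before upgrading"
```

By this rule a 1.x index on a 5.x node, or a 2.x index on a 7.x node, cannot be opened directly, which is why the later chapters migrate the data with reindex instead of upgrading in place.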

Installing ES 2.4.3

cd /opt
wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/zip/elasticsearch/2.4.3/elasticsearch-2.4.3.zip
unzip elasticsearch-2.4.3.zip
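# ES 2.x takes settings as -Des.* system properties; the -E syntax only exists from 5.0 onwards
./elasticsearch-2.4.3/bin/elasticsearch -d -Des.path.conf=/opt/temp/config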

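The remaining steps are similar to the previous chapter and are not repeated here.

Upgrading ES 1.7.4 to ES 5.5.x (upgrade across multiple major versions)

As in the chapters above, install ES 1.7.4 first. If another ES version is already installed, it can simply be removed; downgrading in place is not supported, because index data written by a newer version cannot be read by an older one. To be safe, back up the data first.

Installing ES 1.7.4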

cd /opt
wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.7.4.zip
unzip elasticsearch-1.7.4.zip

# Change the owner of the files (as root, before switching users)
chown -R elasticsearch elasticsearch-1.7.4

# Switch to the elasticsearch user
su elasticsearch

# Start elasticsearch-1.7.4
cd elasticsearch-1.7.4
./bin/elasticsearch -d

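To keep things simple, the configuration is not copied outside the installation this time. After a successful start, the data and logs folders are created automatically inside the elasticsearch directory.

Checking the startup log with tail -f ./logs/elasticsearch.log shows that the node started successfully; the 0.0.0.0 bind address in the log means connections are accepted from any address.

Opening the address on port 9200 in a browser also confirms that the node is up.

Inserting test data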

# Insert a document
curl -XPUT 'http://192.168.105.81:9200/forum/article/1?pretty' -d '
{
  "title": "first article",
  "content": "this is my first article"
}'
# Read the document back
curl -XGET 'http://192.168.105.81:9200/forum/article/1?pretty'

Installing and starting elasticsearch-5.5.x

cd /opt/
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.0.zip
unzip elasticsearch-5.5.0.zip
cd elasticsearch-5.5.0

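Edit the ES 5.5.x configuration file with vi ./config/elasticsearch.yml. (The original screenshot of the settings is not reproduced here; as the following steps show, this node serves HTTP on port 9201.)

Start ES 5.5.x with ./bin/elasticsearch. The node on port 9201 is the 5.5.x version of elasticsearch.

Query ES 5.5.x at http://192.168.105.81:9201/forum/article/1?pretty and you will find that the forum index data inserted into 1.7 earlier is not there.

Migrating data from ES 1.7 to ES 5.5.x

Preparation

Disable shard allocation on ES 1.7:
curl -XPUT 'http://localhost:9200/_cluster/settings?pretty' -d ' { "persistent": { "cluster.routing.allocation.enable": "none" } }'
Run a synced flush on ES 1.7 so that pending writes are on disk; it may have to be repeated until it succeeds:
curl -XPOST 'http://localhost:9200/_flush/synced?pretty'
Back up plugins and jvm.options if needed.
Do not overwrite the config, data, or log directories.

Starting the migration

The remote host must be explicitly whitelisted in the elasticsearch.yml of the node that runs the reindex (here the 5.5.x node), using the reindex.remote.whitelist setting:

reindex.remote.whitelist: ["127.0.0.1:9200","localhost:9200"]

Reindexing from a remote cluster uses an on-heap buffer whose maximum size defaults to 100mb. If the data being migrated is large, in particular if individual documents are very large, set a small batch size with the size parameter so that each batch fetches less data. socket_timeout and connect_timeout can also be set, as in the following example: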

# reindex with a batch size and timeouts set
curl -XPOST 'http://localhost:9201/_reindex?pretty' -d '
{
  "source": {
    "remote": {
      "host": "http://localhost:9200",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "source",
    "size": 10,
    "query": {
      "match": {
        "test": "data"
      }
    }
  },
  "dest": {
    "index": "dest"
  }
}'

# reindex with the default settings
curl -XPOST 'http://localhost:9201/_reindex?pretty' -d '
{
  "source": {
    "remote": {
      "host": "http://127.0.0.1:9200"
    },
    "index": "forum"
  },
  "dest": {
    "index": "forum"
  }
}'
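
When several indices have to be migrated, typing the request body by hand for each one gets tedious. The sketch below builds the body from arguments; reindex_body and its argument order are assumptions made for illustration, and it only covers the host, index, and size fields used above (no query or timeouts).

```shell
#!/bin/sh
# Sketch only: reindex_body and its argument order are illustrative assumptions.

# Print a reindex-from-remote request body.
#   $1 remote host URL, $2 source index, $3 destination index, $4 optional batch size
reindex_body() {
  if [ -n "${4:-}" ]; then
    printf '{"source":{"remote":{"host":"%s"},"index":"%s","size":%s},"dest":{"index":"%s"}}' \
      "$1" "$2" "$4" "$3"
  else
    printf '{"source":{"remote":{"host":"%s"},"index":"%s"},"dest":{"index":"%s"}}' \
      "$1" "$2" "$3"
  fi
}

# Usage against the new node on port 9201 (commented out so nothing runs on load):
#   curl -XPOST 'http://localhost:9201/_reindex?pretty' \
#     -H 'Content-Type: application/json' \
#     -d "$(reindex_body http://127.0.0.1:9200 forum forum 100)"
```

Index names are inserted as-is, so the helper is only suitable for names that need no JSON escaping.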

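For more reindex techniques, see => elasticsearch 基础 —— ReIndex

Upgrading ES 2.4.6 to ES 7.5.x (upgrade across multiple major versions)

Installing ES 7.5.0

For convenience, elasticsearch.yml is left at its defaults here. If needed, adjust it based on the existing production configuration or your own requirements.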

cd /opt
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.5.0-linux-x86_64.tar.gz
tar -zvxf elasticsearch-7.5.0-linux-x86_64.tar.gz
cd elasticsearch-7.5.0
./bin/elasticsearch

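The node starts successfully.

Preparation

Disable shard allocation on the old cluster:
curl -XPUT 'http://localhost:9200/_cluster/settings?pretty' -d ' { "persistent": { "cluster.routing.allocation.enable": "none" } }'
Run a synced flush on the old cluster so that pending writes are on disk; it may have to be repeated until it succeeds:
curl -XPOST 'http://localhost:9200/_flush/synced?pretty'
Back up plugins and jvm.options if needed.
Do not overwrite the config, data, or log directories.

Data migration

As with the migration from ES 1.7 to ES 5.5.x, the data is moved with reindex. The old node's address must likewise be listed under reindex.remote.whitelist in the elasticsearch.yml of the 7.5 node.

The same command as before now fails with an error: ES 7.5.x requires the request to carry a Content-Type header.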

# Run in the foreground
curl -XPOST 'http://127.0.0.1:9201/_reindex?pretty' -H 'Content-Type: application/json' -d '
{
  "source": {
    "remote": {
      "host": "http://127.0.0.1:9200"
    },
    "index": "forum",
    "type": "article"
  },
  "dest": {
    "index": "forum_jp1"
  }
}'

# Run in the background; returns a task ID
curl -XPOST 'http://127.0.0.1:9201/_reindex?pretty&wait_for_completion=false' -H 'Content-Type: application/json' -d '
{
  "source": {
    "remote": {
      "host": "http://127.0.0.1:9200"
    },
    "index": "forum",
    "type": "article"
  },
  "dest": {
    "index": "forum_jp1"
  }
}'

# List all reindex tasks
curl -XGET 'http://127.0.0.1:9201/_tasks?detailed=true&actions=*reindex'
# Look up a specific task by its task ID
curl -XGET 'http://127.0.0.1:9201/_tasks/99zZQSV6ROmXDvZX3fSoPQ:491?pretty'
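
The task ID does not have to be copied by hand between the background request and the task lookup. The following sketch extracts it from the response and polls until the task is done. task_id and task_completed are hypothetical helper names, and reindex.json is an assumed file holding the request body. No JSON parser is used, so the helpers depend on the simple shape of these two responses.

```shell
#!/bin/sh
# Sketch only: the function names are illustrative assumptions; no JSON parser is used,
# so this relies on the simple shape of the two responses.

# Read the response of a reindex started with wait_for_completion=false and print the task ID.
task_id() {
  sed -n 's/.*"task"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Read a GET _tasks/<task ID> response and succeed once it reports "completed": true.
task_completed() {
  grep -Eq '"completed"[[:space:]]*:[[:space:]]*true'
}

# Usage (commented out so nothing runs on load); reindex.json is a hypothetical file
# holding the request body:
#   id="$(curl -s -XPOST 'http://127.0.0.1:9201/_reindex?wait_for_completion=false' \
#          -H 'Content-Type: application/json' -d @reindex.json | task_id)"
#   until curl -s "http://127.0.0.1:9201/_tasks/$id" | task_completed; do sleep 5; done
```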

For more on the Task API, see => Elasticsearch Task management API

Fields in the reindex response

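{
  "took" : 639,    // milliseconds taken by the whole operation
  "updated": 0,    // number of documents successfully updated
  "created": 123,  // number of documents successfully created
  "batches": 1,    // number of batches processed
  "version_conflicts": 2, // number of version conflicts
  "retries": {     // retry counts
    "bulk": 0,     // number of bulk actions retried
    "search": 0    // number of search actions retried
  },
  "throttled_millis": 0, // milliseconds spent sleeping because of the requests_per_second setting
  "failures" : [ ]  // documents that failed
}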

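Fixing startup errors

[1]: max number of threads [2048] for user [elasticsearch] is too low, increase to at least [4096]

Switch to the root user and edit the configuration file in the limits.d directory: vi /etc/security/limits.d/90-nproc.conf. Change the nproc line to * soft nproc 4096, or restrict the change to one user with elasticsearch soft nproc 4096.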
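
Whether the new limit is in effect can be checked before starting ES again. A minimal sketch, assuming a hypothetical helper named nproc_limit_ok; the value is taken from ulimit -u, which is the bash spelling for the per-user process limit.

```shell
#!/bin/sh
# Sketch only: nproc_limit_ok is an illustrative helper, not an Elasticsearch tool.

# Succeed when the given process limit meets the minimum Elasticsearch asks for.
#   $1 current limit (a number or "unlimited"), $2 required minimum (default 4096)
nproc_limit_ok() {
  required="${2:-4096}"
  [ "$1" = "unlimited" ] && return 0
  [ "$1" -ge "$required" ]
}

# Usage in bash, run as the elasticsearch user (commented out so nothing runs on load):
#   nproc_limit_ok "$(ulimit -u)" 4096 || echo "raise nproc in /etc/security/limits.d/90-nproc.conf"
```

The limit only applies to new login sessions, so log in again as the elasticsearch user before checking.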

[2]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

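Cause: CentOS 6 does not support SecComp, while ES 5.2.0 sets bootstrap.system_call_filter to true by default and checks for it at startup. The check fails, and that failure prevents ES from starting. Fix: set bootstrap.system_call_filter to false in elasticsearch.yml, placing it below the Memory section: bootstrap.memory_lock: false bootstrap.system_call_filter: false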

[3]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured

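Edit elasticsearch.yml, uncomment the cluster.initial_master_nodes line, and keep a single node in it: cluster.initial_master_nodes: ["node-1"]. Here node-1 is the default node name that appears further up in the same file (node.name); make sure that line is uncommented as well, then restart and the node starts normally.

Further notes and summary

How to make reindex faster

Create the new index on the new ES version first, with a suitable mapping and settings defined by hand. Set refresh_interval to -1 and number_of_replicas to 0 so that the reindex runs faster. When the reindex has finished, restore refresh_interval and number_of_replicas to the values used on the original ES. number_of_shards cannot be changed on an existing index, so choose it when creating the index.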
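
The two request bodies involved can be sketched as follows. The helper names and the example values (5 shards, 1 replica, 1s) are assumptions for illustration, and the mapping is left out for brevity; a real index would define it in the same create-index request. The shard count is passed when the index is created, and only the replica count and refresh interval are adjusted afterwards.

```shell
#!/bin/sh
# Sketch only: the function names and the example values are illustrative assumptions.

# Print a create-index body tuned for bulk loading: no replicas, refresh disabled.
#   $1 number of primary shards
bulk_load_index_body() {
  printf '{"settings":{"number_of_shards":%s,"number_of_replicas":0,"refresh_interval":"-1"}}' "$1"
}

# Print an update-settings body that restores replicas and the refresh interval.
#   $1 number of replicas, $2 refresh interval such as 1s
restore_settings_body() {
  printf '{"index":{"number_of_replicas":%s,"refresh_interval":"%s"}}' "$1" "$2"
}

# Usage (commented out so nothing runs on load):
#   curl -XPUT 'http://localhost:9201/forum' -H 'Content-Type: application/json' \
#     -d "$(bulk_load_index_body 5)"
#   ... run the reindex ...
#   curl -XPUT 'http://localhost:9201/forum/_settings' -H 'Content-Type: application/json' \
#     -d "$(restore_settings_body 1 1s)"
```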

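Official reference

Upgrade Elasticsearch

How to upgrade a cluster

With the rolling upgrade strategy, a cluster with multiple nodes is restarted and upgraded one node at a time.

For upgrades between major versions, use the full cluster restart strategy and stop the whole cluster first. A rolling upgrade could leave a cluster in which some nodes run ES 5.5 and others run ES 2.4.3, which can cause problems.

The upgrade steps themselves are exactly the same as described earlier.

When ES goes through a major version upgrade, it generally requires a full cluster restart: the whole cluster is restarted to perform the upgrade. A rolling upgrade is not suitable for major version upgrades.

What happens if shard allocation is not re-enabled

Shard allocation is what assigns replica shards to data nodes, so while it is disabled the replicas stay unassigned and the cluster cannot reach green. After it is re-enabled, indexing and searching can resume, but it is better to wait until all replica shards have been allocated before resuming reads and writes.

How to tell that the cluster upgrade went well

The process can be monitored with the APIs below. Once the status column of _cat/health turns green, all primary and replica shards have been allocated successfully.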

GET _cat/health
GET _cat/recovery

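What if you do not want to use reindex-from-remote?

There is another approach, reindex in place, which uses the Elasticsearch Migration Plugin to do the reindex. For more, see => Reindex in place

Data cannot be found after reindex

To refresh immediately, run POST /indexName/_refresh

The refresh interval can be set through the API: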

PUT /index/_settings
{"refresh_interval": "10s"}

For large-scale indexing jobs, it is best to turn refresh off.

PUT /index/_settings
{"refresh_interval": "-1"}

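The default refresh interval in ES is 1s, which is why ES can offer near-real-time search.

See => ES中Refresh和Flush的区别 (The difference between Refresh and Flush in ES)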
