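Notes
The problems and fixes described in this article also apply to Tencent Cloud Elasticsearch Service (ES).
Also used: Tencent Cloud Cloud Virtual Machine (CVM).
This article continues from the previous one, "Elasticsearch压测工具esrally部署之踩坑实录(上)" (esrally deployment pitfalls, part 1).
Follow-up articles:
"Elasticsearch压力测试" (Elasticsearch stress testing), Tencent Cloud community (tencent.com)
"Elasticsearch压测工具esrally部署指南(推荐)" (esrally deployment guide, recommended)
A friendly reminder
- This article records, in full, the pitfalls hit during deployment. It is not meant to be followed step by step as a deployment guide; read both part 1 and part 2 of the pitfall record in full before deploying.
- A separate, pitfall-free deployment guide is also available and can be followed directly.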
Environment
Note: these are the environment and versions that this article was verified against. To avoid pitfalls, match them as closely as you can.
esrally client environment
- Versions
Linux: CentOS 7.9
Python: 3.8.7
pip: pip 20.2.3 from pip (python 3.8)
Java: openjdk version 1.8.0_302 (build 1.8.0_302-b08)
Git: 2.7.5
esrally: 2.3.0
- Configuration
Memory: 32 GB
Disk: SSD cloud disk, 100 GB
CPUs: 1
CPU cores: 16
Elasticsearch server environment
- Versions
Linux: CentOS 7.2
Java: openjdk version 11.0.9.1-ga (build 11.0.9.1-ga 1, mixed mode)
Elasticsearch: 7.10.1 (Tencent Cloud Elasticsearch Service, Platinum edition)
- Configuration
Nodes: 3
Memory: 16 GB
Disk: SSD cloud disk, 1 TB
CPUs: 1
CPU cores: 4
CPU model: AMD EPYC 7K62 48-Core Processor
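Background
In today's big-data era, business volume keeps growing, and producing hundreds of GB or even several TB of data per day is common, so a high-performance Elasticsearch cluster matters a great deal. Before a service goes live, we need to simulate the reads and writes of large volumes of network logs and user-behavior logs, measure the relevant performance metrics, and find the cluster's bottlenecks, so that we know what hardware configuration and what application-level tuning the current workload requires.
In the previous article we ran into several hard problems and solved them one by one. Little did we know that the pits never end.
Benchmarking
esrally terminology and parameters
Rally means a car rally, so its terminology is borrowed from rally racing.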
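- track: the race track, i.e. the sample data and benchmarking strategy used in a run. List the available tracks with esrally list tracks. The tracks that ship with Rally can be browsed at https://github.com/elastic/rally-tracks; each track's directory contains a README.md that describes the data set and the track parameters in detail. If no track is specified, the geonames track is used by default;
- target-hosts: the IP and port of the remote Elasticsearch, given as ip:port;
- pipeline: a benchmarking workflow. List them with esrally list pipelines. One of them, benchmark-only, leaves the management of Elasticsearch to the user and uses Rally only to generate load; use it to benchmark an existing cluster;
- track-params: overrides the track's default parameters;
- user-tag: a tag that labels this run;
- client-options: client connection options, such as the user name and password.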
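To see how these parameters combine on one command line, here is a minimal sketch. The function name rally_flags and its argument order are made up for illustration and are not part of esrally; the function only prints the flags described above, one per line, and does not run anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of esrally): print the flags described above,
# one per line, so the final command line can be reviewed before it is run.
rally_flags() {
  # $1 = track, $2 = ip:port, $3 = track-params, $4 = user-tag
  printf '%s\n' \
    "--track=$1" \
    "--target-hosts=$2" \
    "--pipeline=benchmark-only" \
    "--track-params=$3" \
    "--user-tag=$4"
}

# Values taken from the run in this article.
rally_flags geonames 10.0.10.4:9200 \
  "number_of_shards:3, number_of_replicas:1" \
  "version:AMD_4C16G_1T*3"
```

Printing instead of executing keeps the sketch safe to try on a machine without esrally, and makes it easy to change only the user tag when comparing several cluster configurations.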
[dy@VM-10-15-centos ~]$ esrally \
> --track=geonames \
> --target-hosts=10.0.10.4:9200 \
> --pipeline=benchmark-only \
> --track-params="number_of_shards:3, number_of_replicas:1" \
> --user-tag="version:AMD_4C16G_1T*3" \
> --client-options="basic_auth_user:'elastic', basic_auth_password:'your_password'"
    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/
[INFO] You did not provide an explicit timeout in the client options. Assuming default of 10 seconds.
************************************************************************
************** WARNING: A dark dungeon lies ahead of you **************
************************************************************************
Rally does not have control over the configuration of the benchmarked
Elasticsearch cluster.
Be aware that results may be misleading due to problems with the setup.
Rally is also not able to gather lots of metrics at all (like CPU usage
of the benchmarked cluster) or may even produce misleading metrics (like
the index size).
************************************************************************
****** Use this pipeline only if you are aware of the tradeoffs. ******
*************************** Watch your step! ***************************
************************************************************************
[WARNING] Could not update tracks. Continuing with your locally available state.
[INFO] Racing on track [geonames], challenge [append-no-conflicts] and car ['external'] with version [7.10.1].
[WARNING] merges_total_time is 320911 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] merges_total_throttled_time is 41844 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_total_time is 1131447 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 148274 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 19577 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running delete-index [100% done]
Running create-index [100% done]
Running check-cluster-health [100% done]
Running index-append [100% done]
Running refresh-after-index [100% done]
Running force-merge [100% done]
Running refresh-after-force-merge [100% done]
Running wait-until-merges-finish [100% done]
Running index-stats [100% done]
Running node-stats [100% done]
Running default [100% done]
Running term [100% done]
Running phrase [100% done]
Running country_agg_uncached [100% done]
Running country_agg_cached [100% done]
Running scroll [100% done]
Running expression [100% done]
Running painless_static [100% done]
Running painless_dynamic [100% done]
Running decay_geo_gauss_function_score [100% done]
Running decay_geo_gauss_script_score [100% done]
Running field_value_function_score [100% done]
Running field_value_script_score [100% done]
Running large_terms [100% done]
Running large_filtered_terms [100% done]
Running large_prohibited_terms [100% done]
Running desc_sort_population [100% done]
Running asc_sort_population [100% done]
Running asc_sort_with_after_population [100% done]
Running desc_sort_geonameid [100% done]
Running desc_sort_with_after_geonameid [ 0% done]
Running desc_sort_with_after_geonameid [ 0% done][ERROR] Cannot race. Error in load generator [0]
("Cannot execute [user-defined context-manager enabled runner for [query]]. Provided parameters are: ['index', 'type', 'cache', 'request-params', 'body']. Error: ['total'].", None)
Getting further help:
*********************
* Check the log files in /home/dy/.rally/logs for errors.
* Read the documentation at https://esrally.readthedocs.io/en/1.4.1/
* Ask a question on the forum at https://discuss.elastic.co/c/elasticsearch/rally
* Raise an issue at https://github.com/elastic/rally/issues and include the log files in /home/dy/.rally/logs.
----------------------------------
[INFO] FAILURE (took 4335 seconds)
----------------------------------
Sure enough, the pits never end: the run failed again. The error message was:
Cannot execute [user-defined context-manager enabled runner for [query]]. Provided parameters are: ['index', 'type', 'cache', 'request-params', 'body']. Error: ['total'].
The query runner fails on total, which looks like a version problem, so let's upgrade esrally.
First, let's see which esrally versions are available:
[root@VM-10-15-centos ~]# esrally --version
esrally 1.4.1
[root@VM-10-15-centos ~]# pip3 list | grep esrally
esrally 1.4.1
[root@VM-10-15-centos ~]# pip3 install esrally==
Looking in indexes: http://mirrors.tencentyun.com/pypi/simple
Collecting esrally==
Could not find a version that satisfies the requirement esrally== (from versions: 0.2.0, 0.2.1, 0.3.0, 0.3.1, 0.3.2, 0.3.3, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.4.4, 0.4.5, 0.4.6, 0.4.7, 0.4.8, 0.5.0, 0.5.1, 0.5.2, 0.5.3, 0.6.0, 0.6.1, 0.6.2, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.9.2, 0.9.3, 0.9.4, 0.10.0, 0.10.1, 0.11.0, 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.1.0, 1.2.1, 1.3.0, 1.4.0, 1.4.1)
No matching distribution found for esrally==
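On this host, pip offers nothing newer than 1.4.1, so pip cannot be used for the upgrade as things stand. (A likely reason is that pip only lists releases compatible with the running interpreter, which is still Python 3.6.7 at this point, while the 2.3.0 installer further down demands Python >=3.8.)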
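Reading the newest version out of that long list by eye is error-prone. A minimal sketch of doing it mechanically follows; the function name latest_listed_version is hypothetical, the message format is the one shown in the transcript above (other pip versions may word it differently), and GNU sort is assumed for its -V option.

```shell
#!/bin/sh
# Hypothetical helper: read pip's "(from versions: ...)" message on stdin
# and print the highest version listed. Assumes GNU sort (for -V).
latest_listed_version() {
  sed -n 's/.*(from versions: \(.*\)).*/\1/p' \
    | tr -d ' ' \
    | tr ',' '\n' \
    | sort -V \
    | tail -n 1
}

# Shortened sample of the message shown above.
echo "Could not find a version that satisfies the requirement esrally== (from versions: 0.2.0, 0.11.0, 1.0.0, 1.4.0, 1.4.1)" \
  | latest_listed_version
```

Version sort matters here: a plain lexical sort would place 0.9.4 after 0.10.0.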
Reinstalling esrally
1. Get the installation package
For now, the newer release has to be downloaded from esrally's official GitHub project:
https://github.com/elastic/rally/releases/
Screenshot: the official esrally GitHub project
2. Extract and install
Upload the package to the server and extract it:
[root@VM-10-15-centos dy]# ll
total 5860
-rw-r--r-- 1 root root 5999548 Oct 21 12:32 esrally-dist-linux-2.3.0.tar.gz
[root@VM-10-15-centos dy]# tar -zxf esrally-dist-linux-2.3.0.tar.gz
[root@VM-10-15-centos dy]# ll
total 5864
drwxrwxr-x 3 1001 1002 4096 Oct 6 15:18 esrally-dist-2.3.0
-rw-r--r-- 1 root root 5999548 Oct 21 12:32 esrally-dist-linux-2.3.0.tar.gz
Run the installer:
[root@VM-10-15-centos dy]# cd esrally-dist-2.3.0/
[root@VM-10-15-centos esrally-dist-2.3.0]# ll
total 8
drwxrwxr-x 2 1001 1002 4096 Oct 6 15:18 bin
-rwxrw-r-- 1 1001 1002 1193 Oct 6 15:18 install.sh
[root@VM-10-15-centos esrally-dist-2.3.0]# bash install.sh
Installing Rally 2.3.0...
Looking in links: file:///home/dy/esrally-dist-2.3.0/bin
Collecting esrally==2.3.0
esrally requires Python '>=3.8,<3.10' but the running Python is 3.6.7
Another error: the installer reports that the Python version must be >=3.8 and <3.10, while this host runs 3.6.7:
[root@VM-10-15-centos esrally-dist-2.3.0]# python3 -V
Python 3.6.7
Clearly, we need a different Python version.
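The constraint from the installer message (>=3.8,<3.10) can be checked before running install.sh. The following is a minimal sketch; python_ok_for_rally is a hypothetical name, and the bounds are hard-coded from the esrally 2.3.0 message above, so other esrally releases may need different bounds.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a version string such as "3.8.7" satisfies
# the constraint reported by the esrally 2.3.0 installer: >=3.8,<3.10.
python_ok_for_rally() {
  major=${1%%.*}          # "3.8.7" -> "3"
  rest=${1#*.}            # "3.8.7" -> "8.7"
  minor=${rest%%.*}       # "8.7"   -> "8"
  [ "$major" -eq 3 ] && [ "$minor" -ge 8 ] && [ "$minor" -lt 10 ]
}

for v in 3.6.7 3.8.7 3.10.0; do
  if python_ok_for_rally "$v"; then
    echo "$v: ok"
  else
    echo "$v: not supported"
  fi
done
```

To test the interpreter on the current host, the version can be taken from the second field of the python3 -V output, for example python_ok_for_rally "$(python3 -V | cut -d' ' -f2)".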
2.1 Download the Python 3.8.7 source and extract it
[root@VM-10-15-centos dy]# wget https://www.python.org/ftp/python/3.8.7/Python-3.8.7.tgz
Saving to: ‘Python-3.8.7.tgz’
100%[====================================================================================================================================================================================================================================>] 24,468,684 4.74MB/s in 19s
2021-10-21 12:44:45 (1.20 MB/s) - ‘Python-3.8.7.tgz’ saved [24468684/24468684]
[root@VM-10-15-centos dy]# tar -zxf Python-3.8.7.tgz
2.2 Compile and install
[root@VM-10-15-centos dy]# mv /usr/local/python3/ /usr/local/python3.6.7
[root@VM-10-15-centos dy]# cd Python-3.8.7/
[root@VM-10-15-centos Python-3.8.7]# ./configure prefix=/usr/local/python3
configure: creating ./config.status
config.status: creating Makefile.pre
config.status: creating Misc/python.pc
config.status: creating Misc/python-embed.pc
config.status: creating Misc/python-config.sh
config.status: creating Modules/ld_so_aix
config.status: creating pyconfig.h
creating Modules/Setup.local
creating Makefile
If you want a release build with all stable optimizations active (PGO, etc),
please run ./configure --enable-optimizations
[root@VM-10-15-centos Python-3.8.7]# make && make install
Looking in links: /tmp/tmp6t_hcj5i
Processing /tmp/tmp6t_hcj5i/setuptools-49.2.1-py3-none-any.whl
Processing /tmp/tmp6t_hcj5i/pip-20.2.3-py2.py3-none-any.whl
Installing collected packages: setuptools, pip
Successfully installed pip-20.2.3 setuptools-49.2.1
2.3 Configure the python3 environment variables
[root@VM-10-15-centos Python-3.8.7]# echo 'export PYTHON3_HOME=/usr/local/python3' >> /etc/profile
[root@VM-10-15-centos Python-3.8.7]# echo 'export PATH=$PATH:$PYTHON3_HOME/bin' >> /etc/profile
[root@VM-10-15-centos Python-3.8.7]# tail -2 /etc/profile
export PYTHON3_HOME=/usr/local/python3
export PATH=$PATH:$PYTHON3_HOME/bin
[root@VM-10-15-centos Python-3.8.7]# source /etc/profile
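Running the two echo commands a second time would append the same lines to /etc/profile again. A minimal sketch of an idempotent variant follows; append_once is a hypothetical helper, and the demo writes to a scratch file instead of the real /etc/profile.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so re-running the setup does not duplicate it.
append_once() {
  # $1 = line to add, $2 = target file
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Demo against a scratch file rather than the real /etc/profile.
profile=$(mktemp)
append_once 'export PYTHON3_HOME=/usr/local/python3' "$profile"
append_once 'export PATH=$PATH:$PYTHON3_HOME/bin' "$profile"
append_once 'export PATH=$PATH:$PYTHON3_HOME/bin' "$profile"   # no-op
cat "$profile"
```

Note also that the new directory is appended to the end of PATH, so any python3 found in an earlier PATH entry still takes precedence; command -v python3 shows which one the shell resolves.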
2.4 Verify
[root@VM-10-15-centos Python-3.8.7]# python3 -V
Python 3.8.7
[root@VM-10-15-centos Python-3.8.7]# pip3 -V
pip 20.2.3 from /usr/local/python3/lib/python3.8/site-packages/pip (python 3.8)
2.5 Install esrally
[root@VM-10-15-centos Python-3.8.7]# cd ../esrally-dist-2.3.0/
[root@VM-10-15-centos esrally-dist-2.3.0]# bash install.sh
Installing Rally 2.3.0...
Looking in links: file:///home/dy/esrally-dist-2.3.0/bin
Processing ./bin/esrally-2.3.0-py3-none-any.whl
Processing ./bin/yappi-1.2.3.tar.gz
Processing ./bin/psutil-5.8.0-cp38-cp38-manylinux2010_x86_64.whl
Processing ./bin/certifi-2021.5.30-py2.py3-none-any.whl
Processing ./bin/ijson-2.6.1.tar.gz
Processing ./bin/elasticsearch-7.14.0-py2.py3-none-any.whl
Processing ./bin/jsonschema-3.1.1-py2.py3-none-any.whl
Processing ./bin/Jinja2-2.11.3-py2.py3-none-any.whl
Processing ./bin/tabulate-0.8.7-py3-none-any.whl
Processing ./bin/py-cpuinfo-7.0.0.tar.gz
Processing ./bin/thespian-3.10.1.zip
Processing ./bin/google_resumable_media-1.1.0-py2.py3-none-any.whl
Processing ./bin/google_auth-1.22.1-py2.py3-none-any.whl
Processing ./bin/urllib3-1.26.7-py2.py3-none-any.whl
Processing ./bin/aiohttp-3.7.4.post0-cp38-cp38-manylinux2014_x86_64.whl
Requirement already satisfied: setuptools in /usr/local/python3/lib/python3.8/site-packages (from jsonschema==3.1.1->esrally==2.3.0) (49.2.1)
Processing ./bin/importlib_metadata-4.8.1-py3-none-any.whl
Processing ./bin/pyrsistent-0.18.0-cp38-cp38-manylinux1_x86_64.whl
Processing ./bin/six-1.16.0-py2.py3-none-any.whl
Processing ./bin/attrs-21.2.0-py2.py3-none-any.whl
Processing ./bin/MarkupSafe-2.0.1.tar.gz
Processing ./bin/google_crc32c-1.3.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Processing ./bin/requests-2.26.0-py2.py3-none-any.whl
Processing ./bin/rsa-4.7.2-py3-none-any.whl
Processing ./bin/pyasn1_modules-0.2.8-py2.py3-none-any.whl
Processing ./bin/cachetools-4.2.4-py3-none-any.whl
Processing ./bin/async_timeout-3.0.1-py3-none-any.whl
Processing ./bin/multidict-5.2.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Processing ./bin/chardet-4.0.0-py2.py3-none-any.whl
Processing ./bin/typing_extensions-3.10.0.2-py3-none-any.whl
Processing ./bin/yarl-1.6.3-cp38-cp38-manylinux2014_x86_64.whl
Processing ./bin/zipp-3.6.0-py3-none-any.whl
Processing ./bin/charset_normalizer-2.0.6-py3-none-any.whl
Processing ./bin/idna-3.2-py3-none-any.whl
Processing ./bin/pyasn1-0.4.8-py2.py3-none-any.whl
Using legacy 'setup.py install' for yappi, since package 'wheel' is not installed.
Using legacy 'setup.py install' for ijson, since package 'wheel' is not installed.
Using legacy 'setup.py install' for py-cpuinfo, since package 'wheel' is not installed.
Using legacy 'setup.py install' for thespian, since package 'wheel' is not installed.
Using legacy 'setup.py install' for MarkupSafe, since package 'wheel' is not installed.
Installing collected packages: yappi, psutil, certifi, ijson, urllib3, async-timeout, multidict, attrs, chardet, typing-extensions, idna, yarl, aiohttp, elasticsearch, zipp, importlib-metadata, pyrsistent, six, jsonschema, MarkupSafe, Jinja2, tabulate, py-cpuinfo, thespian, google-crc32c, charset-normalizer, requests, google-resumable-media, pyasn1, rsa, pyasn1-modules, cachetools, google-auth, esrally
Running setup.py install for yappi ... done
Running setup.py install for ijson ... done
Running setup.py install for MarkupSafe ... done
Running setup.py install for py-cpuinfo ... done
Running setup.py install for thespian ... done
Successfully installed Jinja2-2.11.3 MarkupSafe-2.0.1 aiohttp-3.7.4.post0 async-timeout-3.0.1 attrs-21.2.0 cachetools-4.2.4 certifi-2021.5.30 chardet-4.0.0 charset-normalizer-2.0.6 elasticsearch-7.14.0 esrally-2.3.0 google-auth-1.22.1 google-crc32c-1.3.0 google-resumable-media-1.1.0 idna-3.2 ijson-2.6.1 importlib-metadata-4.8.1 jsonschema-3.1.1 multidict-5.2.0 psutil-5.8.0 py-cpuinfo-7.0.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyrsistent-0.18.0 requests-2.26.0 rsa-4.7.2 six-1.16.0 tabulate-0.8.7 thespian-3.10.1 typing-extensions-3.10.0.2 urllib3-1.26.7 yappi-1.2.3 yarl-1.6.3 zipp-3.6.0
[root@VM-10-15-centos esrally-dist-2.3.0]# esrally --version
esrally 2.3.0
As shown, esrally is now at version 2.3.0.
Retrying the benchmark
Run the benchmark:
[dy@VM-10-15-centos ~]$ esrally \
> --track=geonames \
> --target-hosts=10.0.10.4:9200 \
> --pipeline=benchmark-only \
> --track-params="number_of_shards:3, number_of_replicas:1" \
> --user-tag="version:AMD_4C16G_1T*3" \
> --client-options="basic_auth_user:'elastic', basic_auth_password:'your_password'"
usage: esrally [-h] [--version] {race,list,info,create-track,generate,compare,download,install,start,stop} ...
esrally: error: argument subcommand: invalid choice: '--track-params=number_of_shards:3, number_of_replicas:1' (choose from 'race', 'list', 'info', 'create-track', 'generate', 'compare', 'download', 'install', 'start', 'stop')
The command that used to work now fails.
It turns out that the newer esrally requires the race subcommand, which tells it to run a benchmark:
[dy@VM-10-15-centos ~]$ esrally race \
> --track=geonames \
> --target-hosts=10.0.10.4:9200 \
> --pipeline=benchmark-only \
> --track-params="number_of_shards:3, number_of_replicas:1" \
> --user-tag="version:AMD_4C16G_1T*3" \
> --client-options="basic_auth_user:'elastic', basic_auth_password:'your_password'"
    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/
[INFO] Race id is [162db385-39c1-469a-bbd4-38c24dee3fd5]
[WARNING] Could not update tracks. Continuing with your locally available state.
[INFO] Racing on track [geonames], challenge [append-no-conflicts] and car ['external'] with version [7.10.1].
[WARNING] merges_total_time is 335897 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] merges_total_throttled_time is 52093 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] indexing_total_time is 1049075 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] refresh_total_time is 169466 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
[WARNING] flush_total_time is 13683 ms indicating that the cluster is not in a defined clean state. Recorded index time metrics may be misleading.
Running delete-index [100% done]
Running create-index [100% done]
Running check-cluster-health [100% done]
Running index-append [100% done]
Running refresh-after-index [100% done]
Running force-merge [100% done]
Running refresh-after-force-merge [100% done]
Running wait-until-merges-finish [100% done]
Running index-stats [100% done]
Running node-stats [100% done]
Running default [100% done]
Running term [100% done]
Running phrase [100% done]
Running country_agg_uncached [100% done]
Running country_agg_cached [100% done]
Running scroll [100% done]
Running expression [100% done]
Running painless_static [100% done]
Running painless_dynamic [100% done]
Running decay_geo_gauss_function_score [100% done]
Running decay_geo_gauss_script_score [100% done]
Running field_value_function_score [100% done]
Running field_value_script_score [100% done]
Running large_terms [100% done]
Running large_filtered_terms [100% done]
Running large_prohibited_terms [100% done]
Running desc_sort_population [100% done]
Running asc_sort_population [100% done]
Running asc_sort_with_after_population [100% done]
Running desc_sort_geonameid [100% done]
Running desc_sort_with_after_geonameid [100% done]
Running asc_sort_geonameid [100% done]
Running asc_sort_with_after_geonameid [100% done]
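A benchmark report is printed at the end, but it is too long to include here.
If you are interested, see: "Elasticsearch集群3节点4核16G压测报告(AMD)" (benchmark report for a 3-node, 4-core, 16 GB Elasticsearch cluster, AMD)
Screenshot: "Elasticsearch集群3节点4核16G压测报告(AMD)"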
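The difference in invocation between the two esrally versions used in this article (no subcommand under 1.4.1, race required under 2.3.0) can be captured in a small wrapper. This is only a sketch: rally_prefix is a hypothetical name, and placing the cutoff at major version 2 is an assumption, since the article confirms the behaviour only for 1.4.1 and 2.3.0.

```shell
#!/bin/sh
# Hypothetical helper: given a version string as printed by "esrally --version"
# (e.g. "esrally 2.3.0"), print the command prefix to use.
# Assumption: the cutoff is major version 2; this article only confirms the
# behaviour for 1.4.1 (no subcommand needed) and 2.3.0 (race required).
rally_prefix() {
  version=${1#esrally }      # "esrally 2.3.0" -> "2.3.0"
  major=${version%%.*}       # "2.3.0" -> "2"
  if [ "$major" -ge 2 ]; then
    echo "esrally race"
  else
    echo "esrally"
  fi
}

rally_prefix "esrally 1.4.1"
rally_prefix "esrally 2.3.0"
```

On a host with esrally installed, the argument would come from the tool itself, for example rally_prefix "$(esrally --version)".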
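Summary
That concludes the installation of esrally and the benchmark verification. I will keep using this esrally client to benchmark the mainstream Elasticsearch machine configurations currently on the market, and will share the results.
A small Easter egg
With the new esrally, pressing Ctrl+C in the middle of a benchmark makes esrally print a skull, which gave me quite a start the first time ☺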