github地址: https://github.com/ankane/dexter
安装高版本ruby
代码语言:shell复制安装scl源:
yum install scl-utils centos-release-scl scl-utils-build
查看从 SCL 中安装的包的列表:
scl --list
列出scl源有哪些包可以用:
yum list all --enablerepo='centos-sclo-rh' | grep ruby
查看从 SCL 中安装的包的列表:
scl --list
安装ruby3.0版本
yum install rh-ruby30 rh-ruby30-ruby-devel
切到ruby3.0环境下
scl enable rh-ruby30 bash
# ruby --version
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
安装pgdexter组件
代码语言:shell复制$ scl enable rh-ruby30 bash
# export PATH=/usr/pgsql-15/bin/:/usr/pgsql-15/lib/:$PATH
# export https_proxy=http://192.168.31.1:7890 && export http_proxy=http://192.168.31.1:7890
# gem install pgdexter
Successfully installed google-protobuf-3.25.1-x86_64-linux
Building native extensions. This could take a while...
Successfully installed pg-1.5.4
Fetching pgdexter-0.5.1.gem
Successfully installed pgdexter-0.5.1
Parsing documentation for google-protobuf-3.25.1-x86_64-linux
Installing ri documentation for google-protobuf-3.25.1-x86_64-linux
Parsing documentation for pg-1.5.4
Installing ri documentation for pg-1.5.4
Parsing documentation for pgdexter-0.5.1
Installing ri documentation for pgdexter-0.5.1
Done installing documentation for google-protobuf, pg, pgdexter after 13 seconds
3 gems installed
对pg启用2个组件
插件不是本文重点,安装步骤忽略
代码语言:sql复制CREATE EXTENSION hypopg;
CREATE EXTENSION pg_stat_statements
造点测试数据,并进行压测
代码语言:shell复制$ pgbench postgres --initialize --fillfactor=10 --scale=10
dropping old tables...
creating tables...
generating data (client-side)...
1000000 of 1000000 tuples (100%) done (elapsed 14.24 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 23.42 s (drop tables 0.01 s, create tables 0.01 s, client-side generate 21.56 s, vacuum 0.44 s, primary keys 1.41 s).
postgres=# dt
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
-------- ------------------------- ---------- ---------- ------------- --------------- ------------ -------------
public | pgbench_accounts | table | postgres | permanent | heap | 1302 MB |
public | pgbench_branches | table | postgres | permanent | heap | 40 kB |
public | pgbench_history | table | postgres | permanent | heap | 0 bytes |
public | pgbench_tellers | table | postgres | permanent | heap | 80 kB |
然后我们把原表的主键索引删掉,制造出全表扫描的场景
postgres=# alter table pgbench_accounts drop constraint pgbench_accounts_pkey ;
ALTER TABLE
Time: 9.403 ms
另外开一个窗口,执行压测
$ pgbench -c 10 -j 10
pgbench (16.0 - Percona Distribution, server 15.4 - Percona Distribution)
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 10
number of threads: 10
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 100/100
number of failed transactions: 0 (0.000%)
latency average = 2140.750 ms
initial connection time = 74.566 ms
tps = 4.671260 (without initial connection time)
将dexer连接到pg并读取pg-stat-statements信息进行分析【推荐】
代码语言:shell复制# dexter -d postgres -Udts -h192.168.31.181 --pg-stat-statements --enable-hypopg
返回结果:
Processing 269 new query fingerprints
Index found: public.pgbench_accounts (aid)
如果没有发现需要优化的sql,则返回类似如下:
Processing 192 new query fingerprints
No new indexes found
另外,如果需要直接创建索引,可以加参数 --create 【生产上不推荐这种】
# dexter -d postgres -Udts -h192.168.31.181 --pg-stat-statements --enable-hypopg --create
Processing 269 new query fingerprints
Index found: public.pgbench_accounts (aid)
Creating index: CREATE INDEX CONCURRENTLY ON "public"."pgbench_accounts" ("aid")
Index created: 1967 ms
另外,也可以使用docker方式
代码语言:shell复制docker pull ankane/dexter
docker run --rm ankane/dexter dexter -d postgres -Udts -h192.168.31.181 --pg-stat-statements --enable-hypopg
Processing 44 new query fingerprints
Index found: public.pgbench_accounts (aid)
dexter的帮助
代码语言:shell复制# dexter --help
Usage:
dexter [options]
Input options:
--input-format input format
--pg-stat-activity use pg_stat_activity
--stdin use stdin
-s, --statement process a single statement
Connection options:
-d, --dbname database name
-h, --host database host
-p, --port database port
-U, --username database user
Processing options:
--interval time to wait between processing queries, in seconds
--min-calls only process queries that have been called a certain number of times
--min-time only process queries that have consumed a certain amount of DB time, in minutes
Indexing options:
--analyze analyze tables that haven't been analyzed in the past hour
--create create indexes
--enable-hypopg enable the HypoPG extension
--exclude prevent specific tables from being indexed
--include only include specific tables
--tablespace tablespace to create indexes
Logging options:
--log-level log level
--log-sql log sql
Other options:
-v, --version print the version
--help prints help
其它
直接从pg当前会话进行分析【推荐】
代码语言:shell复制# dexter -d postgres -Udts -h192.168.31.181 --pg-stat-activity --enable-hypopg
Started
Processing 2 new query fingerprints
Index found: public.pgbench_accounts (aid)
从pg日志文件进行分析【不推荐】
代码语言:shell复制先调低pg慢日志时间
log_min_duration_statement = 10 # ms
然后使用
# dexter -d postgres -Udts -h192.168.31.181 postgresql-Wed.log
Processing 276 new query fingerprints
Index found: public.pgbench_accounts (aid)
支持stderr、csvlog、 和jsonlog格式。
对于实时索引,请通过管道传输日志文件:
tail -F -n 1 postgresql.csv | dexter -d postgres -Udts -h192.168.31.181 --enable-hypopg --stdin
从sql文件进行分析【不推荐】
代码语言:shell复制dexter -d postgres -Udts -h192.168.31.181 --enable-hypopg queries.sql
dexter -d postgres -Udts -h192.168.31.181 --enable-hypopg -s "SELECT * FROM xxx where xx=x"
收集选项
代码语言:shell复制可以忽略某些极少量的一次性的查询被索引,只分析运行了至少100次的sql
dexter --min-calls 100
只分析运行的总时长超过10分钟的sql
dexter --min-time 10 # minutes
流式传输日志时,指定处理查询之间等待的时间
dexter --interval 60 # seconds
分析
为了获得最佳结果,请确保您的表最近已被分析过,以便统计数据是最新的。您可以要求 Dexter 分析它遇到的在过去一小时内尚未分析过的表:
代码语言:shell复制# dexter -d postgres -Udts -h192.168.31.181 --pg-stat-statements --analyze
Processing 276 new query fingerprints
Index found: public.pgbench_accounts (aid)
表格的过滤
代码语言:shell复制您可以使用以下方法从索引中排除大型表或写入量大的表:
dexter --exclude table1,table2
或者,您可以指定要索引的表:
dexter --include table3,table4
例:
# dexter -d postgres -Udts -h192.168.31.181 --pg-stat-statements --enable-hypopg --exclude=sm1.t1 --include=public.pgbench_accounts
Processing 44 new query fingerprints
Index found: public.pgbench_accounts (aid)
注意: --exclude=sm1.t1 和 --include 其中的表名不支持通配符的写法
调试
查看 Dexter 如何处理查询:
代码语言:shell复制dexter --log-sql --log-level debug2
dexter --log-sql --log-level debug3
发现的bug
update: 2024-01-07 ,作者已修复该问题并合并到master分支,详见 https://github.com/ankane/dexter/issues/48
代码语言:shell复制1、如果pg启用了auto_explain插件,且设置 auto_explain.log_min_duration = '0'
则执行dexter --pg-stat-statements 的时候会报错,如下是加了参数 --log-level debug2 打出的日志情况
# dexter -d postgres -Udts -h192.168.31.181 --pg-stat-statements --enable-hypopg --log-level debug2
Processing 5 new query fingerprints
Last analyze: public.pgbench_accounts : 2024-01-03T10:49:16 08:00
/opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:310:in `block in determine_indexes': undefined method `<' for nil:NilClass (NoMethodError)
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:306:in `each'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:306:in `determine_indexes'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:102:in `process_queries'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:32:in `process_stat_statements'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/client.rb:27:in `perform'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/client.rb:9:in `start'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/exe/dexter:7:in `<top (required)>'
from /opt/rh/rh-ruby30/root/usr/local/bin/dexter:23:in `load'
from /opt/rh/rh-ruby30/root/usr/local/bin/dexter:23:in `<main>'
根据报错日志,加些log输出
vim /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb 310 , 加一些 puts 输出变量到控制台
savings_ratio = (1 - @min_cost_savings_pct / 100.0)
puts "savings_ratio ---> ",savings_ratio
queries.each do |query|
if query.explainable? && query.high_cost?
puts "query.costs -->", query.costs
puts "query.costs.length -->", query.costs.length
new_cost, new_cost2 = query.costs[1..2]
puts "new_cost -->", new_cost
puts "new_cost2 -->", new_cost2
puts "query.initial_cost -->", query.initial_cost
puts "savings_ratio -->",savings_ratio
再次执行上面的dexter命令,可以看到输出变量情况如下
# dexter -d postgres -Udts -h192.168.31.181 --pg-stat-statements --enable-hypopg --log-level debug2
Processing 25 new query fingerprints
Last analyze: public.pgbench_accounts : 2024-01-03T10:49:16 08:00
savings_ratio --->
0.5
query.costs -->
172875.43
query.costs.length --> 注意这里数组长度为1,造成 query.costs[1..2] 解包失败,导致 new_cost和new_cost2都为nil
1
new_cost -->
new_cost2 -->
query.initial_cost -->
172875.43
savings_ratio -->
0.5
/opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:318:in `block in determine_indexes': undefined method `<' for nil:NilClass (NoMethodError)
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:308:in `each'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:308:in `determine_indexes'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:102:in `process_queries'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/indexer.rb:32:in `process_stat_statements'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/client.rb:27:in `perform'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/lib/dexter/client.rb:9:in `start'
from /opt/rh/rh-ruby30/root/usr/local/share/gems/gems/pgdexter-0.5.1/exe/dexter:7:in `<top (required)>'
from /opt/rh/rh-ruby30/root/usr/local/bin/dexter:23:in `load'
from /opt/rh/rh-ruby30/root/usr/local/bin/dexter:23:in `<main>'
2、如果pg启用了auto_explain插件,但是设置 auto_explain.log_min_duration不为0,例如设置为 100 ,则不会有报错
postgres=# select pg_stat_statements_reset() ;
pg_stat_statements_reset
--------------------------
(1 row)
Time: 2.176 ms
postgres=# select * from pgbench_accounts where aid=1000;
aid | bid | abalance | filler
------ ----- ---------- --------------------------------------------------------------------------------------
1000 | 1 | 0 |
(1 row)
# dexter -d postgres -Udts -h192.168.31.181 --pg-stat-statements --enable-hypopg --log-level debug2
Processing 5 new query fingerprints
Last analyze: public.pgbench_accounts : 2024-01-03T10:49:16 08:00
savings_ratio --->
0.5
query.costs -->
172875.43
8.07
8.07
query.costs.length -->
3
new_cost -->
8.07
new_cost2 -->
8.07
query.initial_cost -->
172875.43
savings_ratio -->
0.5
Index found: public.pgbench_accounts (aid)
--------------------------------------------------------------------------------
Query 1edaea087b983cab
Start: 172875.43
Pass1: 8.07 : public.pgbench_accounts (aid)
Pass2: 8.07 : public.pgbench_accounts (aid)
Final: 8.07 : public.pgbench_accounts (aid)
select * from pgbench_accounts where aid=$1
--------------------------------------------------------------------------------
Query 300ac36fac258016
No tables
select pg_stat_statements_reset()
--------------------------------------------------------------------------------
Query 8d6f8ae553bbb2a4
No candidate tables for indexes
SELECT installed_version FROM pg_available_extensions WHERE name = $1 /*dexter*/
--------------------------------------------------------------------------------
Query c43bbb6237d6cbc6
No tables
SET lock_timeout = '5s' /*dexter*/
--------------------------------------------------------------------------------
Query ce4b1350f91bbc5f
No tables
SHOW server_version_num /*dexter*/
大致看了下indexer.rb的代码,正常情况下 calculate_plan 会执行3次,计算3次的costs存到数组里面,流程就不会抛错。 具体是哪个逻辑里面出问题的,ruby代码不想去深入了。
简单看了下dexter的逻辑(不到800行的代码),发现和我们目前生产在用的MySQL的sql自动化索引推荐服务有点类似,但PG有hypopg加持,MySQL我只能抽样数据做成沙箱环境。