pg_rewind 概述
pg_rewind 是 postgresql 主从数据库之同步数据目录
的工具。
pg_rewind 相比 pg_basebackup 和 rsync 这样的工具来说,优势是它不需要从源目录拷贝所有的数据文件,而是会对比时间线发生偏离的点,只拷贝变化过的文件,这样对于数据量很大的情况下速度更快。
pg_rewind对服务端的要求
数据库开启 checksums 或者设置wal_log_hints参数为on,一般采用后者。
代码语言:javascript复制wal_log_hints=on
pg_rewind对权限的要求
pg_rewind工具其实只依赖如下以下四个文件读取系统函数的权限:
代码语言:javascript复制pg_ls_dir()
pg_read_file()
pg_read_binary_file()
pg_stat_file()
这四个函数在PostgreSQL 11版本之前只能超级用户才有权限使用。从PostgreSQL 11开始,pg_rewind可以不依赖超级用户只需要分配这几个系统函数权限。
pg_rewind对流复制的功能改进
代码语言:javascript复制-R / --write-recovery-conf
使用这个选项可以让pg_rewind
帮我们自动创建流复制相关的恢复配置文件,并将指定选项–source-server
里的连接字符串附加到postgresql.auto.conf
中的primary_conninfo
参数里,这个选项可以用来将原主库快速恢复为备库。
-c / --restore-target-wal
在进行pg_rewind
恢复时,源库pg_wal
目录下的wal文件可能因为某些原因不存在,因此会出现下面的报错提示:
pg_rewind: error: could not open file “/var/lib/pgsql/data/pg_wal/000000010000000000000002”: No such file or directory
pg_rewind 原理
- 1.确定新Master和旧Master数据一致性的Checkpoint位置.在该位置上,新Master和旧Master数据完全一致.这可以通过读取新旧Master节点时间线历史文件可以获得,该文件位于$PGDATA/pg_wal/目录下,文件名称为XX.history
- 2.旧Master节点根据上一步获取的Checkpoint读取本机日志文件WAL Record,获取在此Checkpoint之后出现变化的Block,并以链表的方式存储Block编号等信息
- 3.根据第2步获取的Block信息从新Master节点拷贝相应的Block,替换旧Master节点相应的Block
- 4.拷贝新Master节点上除数据文件外的所有其他文件,包括配置文件等(如果拷贝数据文件,与备份方式搭建区别不大)
- 5.旧Master启动数据库,应用从Checkpoint开始后的WAL Record.
- pg_rewind 语法
pg_rewind [option...] { -D | --target-pgdata } directory { --source-pgdata=directory | --source-server=connstr }
参数说明
代码语言:javascript复制-D directory 或 --target-pgdata=directory:#此选项指定与源同步的目标数据目录。在运行pg_rewind之前,必须干净关闭目标服务器。
–source-pgdata=directory:#指定要与之同步的源服务器的数据目录的文件系统路径。此选项要求干净关闭源服务器。
–source-server=connstr:#指定要连接到源 PostgreSQL 服务器的 libpq 连接字符串。连接必须是具有超级用户访问权限的正常(非复制)连接。此选项要求源服务器正在运行,而不是处于恢复模式。
-R 或 --write-recovery-conf:#创建standby.signal并将连接设置附加到输出目录中的 postgresql.auto.conf 中。–source-server对于此选项是必需的。
-n 或 --dry-run:#除了实际修改目标目录之外,执行所有操作。
-N 或 --no-sync:#默认情况下,pg_rewind 将等待所有文件安全地写入磁盘。 此选项会导致 pg_rewind 不等待即可返回,这更快,但意味着后续操作系统崩溃会使同步数据目录损坏。通常情况,此选项可用于测试,但不应使用于生产安装。
-P 或 --progress:#启用进度报告。在从源集簇拷贝数据时,打开这个选项将会发送一个近似的进度报告。
-c 或 --restore-target-wal:#如果在 pg_wal 目录中不再可用这些文件,请使用在目标群集配置中定义的 restore_command 从WAL存档中检索WAL文件。
–debug:#打印冗长的调试输出,这主要对于调试pg_rewind的开发者有用。
–no-ensure-shutdown:#pg_rewind 要求目标服务器在重放之前彻底关闭。默认情况下,如果目标服务器没有完全关闭,pg_rewind 会以单用户模式启动目标服务器,先完成崩溃恢复,然后将其停止。通过传递这个选项,如果服务器没有完全关闭,pg_rewind 会跳过这个并立即出错。 在这种情况下,用户应该自己处理这种情况。
-V --version:#显示版本信息然后退出。
pg_rewind 测试
测试环境
- 主库:192.168.13.31
- 备库:192.168.13.32
- 测试数据库版本:PostgreSQL 14.5
两节点配置好流复制
测试过程
配置测试环境,主备一致:
代码语言:javascript复制 /*192.168.13.31*/
[postgres@duqk01 ~]$ psql
psql (14.5)
Type "help" for help.
postgres=# show wal_log_hints;
wal_log_hints
---------------
on
(1 row)
postgres=# show full_page_writes;
full_page_writes
------------------
on
(1 row)
postgres=# show max_standby_streaming_delay;
max_standby_streaming_delay
-----------------------------
-1
(1 row)
postgres=# show restore_command;
restore_command
----------------------
cp /data/pgarc/%f %p
(1 row)
postgres=# show archive_mode;
archive_mode
--------------
always
(1 row)
postgres=# show archive_command;
archive_command
----------------------
cp %p /data/pgarc/%f
(1 row)
postgres=#
postgres=# select * from pg_replication_slotsgx
-[ RECORD 1 ]------- ----------
slot_name | rep
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 9888
xmin | 742
catalog_xmin |
restart_lsn | 0/6000110
confirmed_flush_lsn |
wal_status | reserved
safe_wal_size |
two_phase | f
postgres=#
创建测试数据,并在备库上开启长事物:
代码语言:javascript复制 /*192.168.13.31*/
postgres=# create table t(id int);
CREATE TABLE
postgres=# insert into t values(1);
INSERT 0 1
postgres=# create table tt(like t);
CREATE TABLE
postgres=# insert into tt values(1);
INSERT 0 1
postgres=#
* /192.168.13.32/ *
[postgres@duqk02 ~]$ psql
psql (14.5)
Type "help" for help.
postgres=# begin;
BEGIN
postgres=*# select * from tt;
id
----
1
(1 row)
postgres=*# select * from t;
id
----
1
(1 row)
postgres=*#
备库开启长事物,主库对数据库修改
代码语言:javascript复制* /192.168.13.31/ *
终端1
postgres=# drop table tt;
DROP TABLE
postgres=# insert into t select generate_series(1,10000);
INSERT 0 10000
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/60C1480
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/70000F0
(1 row)
postgres=# insert into t select generate_series(1,10000);
INSERT 0 10000
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/809D4D8
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/90000F0
(1 row)
postgres=# insert into t select generate_series(1,10000);
INSERT 0 10000
postgres=# insert into t select generate_series(1,10000);
INSERT 0 10000
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/A13A848
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/B16E5C0
(1 row)
postgres=# insert into t select generate_series(1,10000);
INSERT 0 10000
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/C0A4D48
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/D0000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/E0000F0
(1 row)
postgres=#
终端2
[postgres@duqk01 pg_wal]$ ls -trl
total 65540
-rw------- 1 postgres postgres 338 Sep 7 16:34 000000010000000000000004.00000028.backup
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000C
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000D
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000E
drwx------ 2 postgres postgres 170 Jan 13 08:56 archive_status
-rw------- 1 postgres postgres 16777216 Jan 13 08:57 00000001000000000000000F
[postgres@duqk01 pg_wal]$
---当前主库的最新的wal日志为:00000001000000000000000F
[postgres@duqk01 pg_wal]$ ls -trl /data/pgarc/
total 229380
-rw------- 1 postgres postgres 16777216 Sep 7 16:13 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Sep 7 16:29 000000010000000000000002
-rw------- 1 postgres postgres 16777216 Sep 7 16:34 000000010000000000000003
-rw------- 1 postgres postgres 16777216 Sep 7 16:34 000000010000000000000004
-rw------- 1 postgres postgres 338 Sep 7 16:34 000000010000000000000004.00000028.backup
-rw------- 1 postgres postgres 16777216 Sep 7 16:44 000000010000000000000005
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000007
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000008
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000009
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000A
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000B
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000C
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000D
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000E
[postgres@duqk01 pg_wal]$
备库确认复制延迟,但是此时wal日志和归档日志都已经传到备库上了的
代码语言:javascript复制* /192.168.13.32/ *
终端1
postgres=*# select * from tt;
id
----
1
(1 row)
postgres=*# select * from t;
id
----
1
(1 row)
postgres=*#
终端2
[postgres@duqk02 pg_wal]$ ls -trl
total 196612
-rw------- 1 postgres postgres 16777216 Sep 7 16:34 000000010000000000000004
-rw------- 1 postgres postgres 16777216 Jan 13 08:54 000000010000000000000005
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000007
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000008
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000009
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000A
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000B
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000C
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000D
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000E
drwx------ 2 postgres postgres 4096 Jan 13 08:56 archive_status
-rw------- 1 postgres postgres 16777216 Jan 13 08:57 00000001000000000000000F
[postgres@duqk02 pg_wal]$
---备库的最新的wal日志也为:00000001000000000000000F
[postgres@duqk02 pg_wal]$ ls -trl /data/pgarc/
total 163840
-rw------- 1 postgres postgres 16777216 Jan 13 08:54 000000010000000000000005
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000007
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000008
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000009
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000A
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000B
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000C
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000D
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000E
[postgres@duqk02 pg_wal]$
在主库上通过SQL查询,再次确认复制延迟
代码语言:javascript复制 /*192.168.13.31*/
postgres=# select pg_wal_lsn_diff(pg_current_wal_lsn(),sent_lsn) sent_wal,pg_wal_lsn_diff(pg_current_wal_lsn(),write_lsn) write_wal,pg_wal_lsn_diff(pg_current_wal_lsn(),flush_lsn) flush_wal,pg_wal_lsn_diff(pg_current_wal_lsn(),replay_lsn) replay_wal from pg_stat_replication;
sent_wal | write_wal | flush_wal | replay_wal
---------- ----------- ----------- ------------
0 | 0 | 0 | 151253816
(1 row)
postgres=#
在备库有复制延迟的前提下,提升备库为主库:
代码语言:javascript复制* /192.168.13.32/ *
终端2
[postgres@duqk02 ~]$ pg_ctl promote -D /data/pgdata
waiting for server to promote............................................................... stopped waiting
pg_ctl: server did not promote in time
[postgres@duqk02 ~]$
---原因是因为备库上有长事物在运行,所以promote会失败。
[postgres@duqk02 ~]$ pg_ctl promote -D /data/pgdata
waiting for server to promote..... done
server promoted
[postgres@duqk02 ~]$
---在运行的过程中结束终端1的长事物
原备库提升为主库之后,创建流复制所需的复制槽,并模拟数据变动
代码语言:javascript复制 /*192.168.13.32*/
终端1
postgres=# select count(*) from t;
count
-------
50001
(1 row)
postgres=# select count(*) from tt;
ERROR: relation "tt" does not exist
LINE 1: select count(*) from tt;
^
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
----------- -------- ----------- -------- ---------- ----------- -------- ------------ ------ -------------- ------------- --------------------- ------------ --------------- -----------
(0 rows)
postgres=# select * from pg_create_physical_replication_slot('rep');
slot_name | lsn
----------- -----
rep |
(1 row)
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
----------- -------- ----------- -------- ---------- ----------- -------- ------------ ------ -------------- ------------- --------------------- ------------ --------------- -----------
rep | | physical | | | f | f | | | | | | | | f
(1 row)
postgres=#
postgres=# truncate table t;
TRUNCATE TABLE
postgres=#
postgres=# insert into t select generate_series(1,100);
INSERT 0 100
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/F225258
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/100000F0
(1 row)
postgres=#
postgres=# insert into t select generate_series(1,100);
INSERT 0 100
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/110028B0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/120000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/130000F0
(1 row)
postgres=# insert into t select generate_series(1,100);
INSERT 0 100
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/1400E128
(1 row)
postgres=#
终端2
[postgres@duqk02 pg_wal]$ ls -trl
total 196612
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000020000000000000016
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000020000000000000017
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000020000000000000018
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000020000000000000019
-rw------- 1 postgres postgres 16777216 Jan 13 08:57 00000002000000000000001B
-rw------- 1 postgres postgres 41 Jan 13 08:59 00000002.history
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 00000002000000000000001A
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 00000002000000000000001C
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 00000002000000000000001D
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000012
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000013
-rw------- 1 postgres postgres 16777216 Jan 13 09:02 000000020000000000000014
drwx------ 2 postgres postgres 146 Jan 13 09:02 archive_status
-rw------- 1 postgres postgres 16777216 Jan 13 09:02 000000020000000000000015
[postgres@duqk02 pg_wal]$ ls -trl /data/pgarc/
total 262148
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000007
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000008
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000009
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000A
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000B
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000C
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000D
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000E
-rw------- 1 postgres postgres 41 Jan 13 08:59 00000002.history
-rw------- 1 postgres postgres 16777216 Jan 13 08:59 00000001000000000000000F.partial
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 00000002000000000000000F
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000010
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000011
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000012
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000013
-rw------- 1 postgres postgres 16777216 Jan 13 09:02 000000020000000000000014
[postgres@duqk02 pg_wal]$
在原主库上模拟数据变动
代码语言:javascript复制* /192.168.13.31/ *
终端1
postgres=# select count(*) from t;
count
-------
50001
(1 row)
postgres=# insert into t select generate_series(1,10000);
INSERT 0 10000
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/F106C98
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/100000F0
(1 row)
postgres=# insert into t select generate_series(1,10000);
INSERT 0 10000
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/11106D88
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/120000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/130000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/140000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/150000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/16000128
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/17000128
(1 row)
postgres=# insert into t select generate_series(1,10000);
INSERT 0 10000
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/180A4558
(1 row)
postgres=#
postgres=# select count(*) from t;
count
-------
80001
(1 row)
postgres=#
终端2
[postgres@duqk01 pg_wal]$ ls -trl
total 180232
-rw------- 1 postgres postgres 338 Sep 7 16:34 000000010000000000000004.00000028.backup
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 00000001000000000000000F
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000010
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000011
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000012
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000013
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000014
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000015
-rw------- 1 postgres postgres 16777216 Jan 13 09:04 000000010000000000000016
-rw------- 1 postgres postgres 16777216 Jan 13 09:04 000000010000000000000017
-rw------- 1 postgres postgres 16777216 Jan 13 09:04 000000010000000000000018
drwx------ 2 postgres postgres 4096 Jan 13 09:04 archive_status
-rw------- 1 postgres postgres 16777216 Jan 13 09:05 000000010000000000000019
[postgres@duqk01 pg_wal]$ ls -trl /data/pgarc/
total 393220
-rw------- 1 postgres postgres 16777216 Sep 7 16:13 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Sep 7 16:29 000000010000000000000002
-rw------- 1 postgres postgres 16777216 Sep 7 16:34 000000010000000000000003
-rw------- 1 postgres postgres 16777216 Sep 7 16:34 000000010000000000000004
-rw------- 1 postgres postgres 338 Sep 7 16:34 000000010000000000000004.00000028.backup
-rw------- 1 postgres postgres 16777216 Sep 7 16:44 000000010000000000000005
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000007
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000008
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000009
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000A
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000B
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000C
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000D
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000E
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 00000001000000000000000F
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000010
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000011
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000012
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000013
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000014
-rw------- 1 postgres postgres 16777216 Jan 13 09:03 000000010000000000000015
-rw------- 1 postgres postgres 16777216 Jan 13 09:04 000000010000000000000016
-rw------- 1 postgres postgres 16777216 Jan 13 09:04 000000010000000000000017
-rw------- 1 postgres postgres 16777216 Jan 13 09:04 000000010000000000000018
[postgres@duqk01 pg_wal]$
在原主库上修改数据库日志目录的文件和内容:
代码语言:javascript复制[postgres@duqk01 ~]$ ll /data/pgdata/pg_log
total 80K
-rw------- 1 postgres postgres 53K Sep 7 16:44 postgresql-2022-09-07.csv
-rw------- 1 postgres postgres 1.0K Sep 7 16:29 postgresql-2022-09-07.log
-rw------- 1 postgres postgres 15K Jan 13 09:04 postgresql-2023-01-13.csv
-rw------- 1 postgres postgres 253 Jan 13 08:54 postgresql-2023-01-13.log
[postgres@duqk01 ~]$ rm -fr /data/pgdata/pg_log/postgresql-2022-09-07*
[postgres@duqk01 ~]$ ll /data/pgdata/pg_log
total 20K
-rw------- 1 postgres postgres 15K Jan 13 09:04 postgresql-2023-01-13.csv
-rw------- 1 postgres postgres 253 Jan 13 08:54 postgresql-2023-01-13.log
[postgres@duqk01 ~]$ vi /data/pgdata/pg_log/postgresql-2023-01-13.log
[postgres@duqk01 ~]$ cat /data/pgdata/pg_log/postgresql-2023-01-13.log
mytest
2023-01-13 08:54:29.538 CST [9878] LOG: 00000: ending log output to stderr
2023-01-13 08:54:29.538 CST [9878] HINT: Future log output will go to log destination "csvlog".
2023-01-13 08:54:29.538 CST [9878] LOCATION: PostmasterMain, postmaster.c:1119
[postgres@duqk01 ~]$
确认新主库的postgresql.auto.conf 的内容:
代码语言:javascript复制 /*192.168.13.32*/
[postgres@duqk02 ~]$ cat /data/pgdata/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=replica password=replica channel_binding=prefer host=192.168.13.31 port=1921 sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
[postgres@duqk02 ~]$
对原主库进行pg_rewind. 需要指定–restore-target-wal -R
/*192.168.13.31*/
[postgres@duqk01 ~]$ pg_ctl stop
waiting for server to shut down.... done
server stopped
[postgres@duqk01 pg_wal]$ pg_rewind --target-pgdata=$PGDATA --source-server='host=192.168.13.32 port=1921 user=postgres password=postgres dbname=postgres' -P
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/F063640 on timeline 1
pg_rewind: error: could not open file "/data/pgdata/pg_wal/00000001000000000000000E": No such file or directory
pg_rewind: fatal: could not find previous WAL record at 0/E0000D8
[postgres@duqk01 ~]$ cat /data/pgdata/postgresql.conf|grep restore
restore_command = 'cp /data/pgarc/%f %p' # command to use to restore an archived logfile segment
# placeholders: %p = path of file to restore
#recovery_target_name = '' # the named restore point to which recovery will proceed
[postgres@duqk01 ~]$ pg_rewind --target-pgdata=$PGDATA --source-server='host=192.168.13.32 port=1921 user=postgres password=postgres dbname=postgres' -P --restore-target-wal -R
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/F063640 on timeline 1
pg_rewind: rewinding from last common checkpoint at 0/E000060 on timeline 1
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 197 MB (total source directory size is 225 MB)
201937/201937 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!
[postgres@duqk01 ~]$
pg_rewind之后的验证,/data/pgdata/postgresql.auto.conf
增加了primary_conninfo
条目,pg_wal和pg_log目录内容发生变换,从新的主库复制过来的:
* /192.168.13.31/ *
[postgres@duqk01 ~]$ cat /data/pgdata/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=replica password=replica channel_binding=prefer host=192.168.13.31 port=1921 sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
primary_conninfo = 'user=postgres password=postgres channel_binding=prefer host=192.168.13.32 port=1921 sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
[postgres@duqk01 ~]$ ll /data/pgdata/pg_wal/
total 193M
-rw------- 1 postgres postgres 16M Jan 13 09:10 000000020000000000000013
-rw------- 1 postgres postgres 16M Jan 13 09:10 000000020000000000000014
-rw------- 1 postgres postgres 16M Jan 13 09:10 000000020000000000000015
-rw------- 1 postgres postgres 16M Jan 13 09:10 000000020000000000000016
-rw------- 1 postgres postgres 16M Jan 13 09:10 000000020000000000000017
-rw------- 1 postgres postgres 16M Jan 13 09:10 000000020000000000000018
-rw------- 1 postgres postgres 16M Jan 13 09:10 000000020000000000000019
-rw------- 1 postgres postgres 16M Jan 13 09:10 00000002000000000000001A
-rw------- 1 postgres postgres 16M Jan 13 09:10 00000002000000000000001B
-rw------- 1 postgres postgres 16M Jan 13 09:10 00000002000000000000001C
-rw------- 1 postgres postgres 16M Jan 13 09:10 00000002000000000000001D
-rw------- 1 postgres postgres 16M Jan 13 09:10 00000002000000000000001E
-rw------- 1 postgres postgres 41 Jan 13 09:10 00000002.history
drwx------ 2 postgres postgres 109 Jan 13 09:10 archive_status
[postgres@duqk01 ~]$ ll /data/pgdata/pg_log/
total 60K
-rw------- 1 postgres postgres 34K Jan 13 09:10 postgresql-2022-09-07.csv
-rw------- 1 postgres postgres 5.0K Jan 13 09:10 postgresql-2022-09-07.log
-rw------- 1 postgres postgres 12K Jan 13 09:10 postgresql-2023-01-13.csv
-rw------- 1 postgres postgres 823 Jan 13 09:10 postgresql-2023-01-13.log
[postgres@duqk01 ~]$ cat /data/pgdata/pg_log/postgresql-2023-01-13.log
2023-01-13 08:54:34.649 CST [9916] LOG: 00000: ending log output to stderr
2023-01-13 08:54:34.649 CST [9916] HINT: Future log output will go to log destination "csvlog".
2023-01-13 08:54:34.649 CST [9916] LOCATION: PostmasterMain, postmaster.c:1119
cp: cannot stat ‘/data/pgarc/00000002.history’: No such file or directory
cp: cannot stat ‘/data/pgarc/000000010000000000000005’: No such file or directory
cp: cannot stat ‘/data/pgarc/00000002.history’: No such file or directory
cp: cannot stat ‘/data/pgarc/00000001000000000000000F’: No such file or directory
cp: cannot stat ‘/data/pgarc/00000001000000000000000F’: No such file or directory
cp: cannot stat ‘/data/pgarc/00000002.history’: No such file or directory
cp: cannot stat ‘/data/pgarc/00000001.history’: No such file or directory
[postgres@duqk01 ~]$
新的主库继续模拟数据变化
代码语言:javascript复制 /*192.168.13.32*/
终端1
postgres=# insert into t select generate_series(1,100);
INSERT 0 100
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/15007E18
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/160000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/170000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/18000128
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/190000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/1A0000F0
(1 row)
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/1B0000F0
(1 row)
postgres=# insert into t select generate_series(1,100);
INSERT 0 100
postgres=# checkpoint;select pg_switch_wal();
CHECKPOINT
pg_switch_wal
---------------
0/1C003318
(1 row)
postgres=#
终端2
[postgres@duqk02 pg_wal]$ ls -trl
total 147460
-rw------- 1 postgres postgres 41 Jan 13 08:59 00000002.history
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 00000002000000000000001E
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 00000002000000000000001F
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 000000020000000000000020
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 000000020000000000000021
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 000000020000000000000022
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 00000002000000000000001A
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 00000002000000000000001B
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 00000002000000000000001C
drwx------ 2 postgres postgres 146 Jan 13 09:14 archive_status
-rw------- 1 postgres postgres 16777216 Jan 13 09:15 00000002000000000000001D
[postgres@duqk02 pg_wal]$ ls -trl /data/pgarc/
total 393220
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000007
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000008
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 000000010000000000000009
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000A
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000B
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000C
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000D
-rw------- 1 postgres postgres 16777216 Jan 13 08:56 00000001000000000000000E
-rw------- 1 postgres postgres 41 Jan 13 08:59 00000002.history
-rw------- 1 postgres postgres 16777216 Jan 13 08:59 00000001000000000000000F.partial
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 00000002000000000000000F
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000010
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000011
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000012
-rw------- 1 postgres postgres 16777216 Jan 13 09:01 000000020000000000000013
-rw------- 1 postgres postgres 16777216 Jan 13 09:02 000000020000000000000014
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 000000020000000000000015
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 000000020000000000000016
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 000000020000000000000017
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 000000020000000000000018
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 000000020000000000000019
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 00000002000000000000001A
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 00000002000000000000001B
-rw------- 1 postgres postgres 16777216 Jan 13 09:14 00000002000000000000001C
[postgres@duqk02 pg_wal]$
新主库添加pg_hba条目:
代码语言:javascript复制* /192.168.13.32/ *
[postgres@duqk02 pg_wal]$ vi /data/pgdata/pg_hba.conf
增加
host replication all 192.168.13.31/32 md5
host replication all 192.168.13.32/32 md5
[postgres@duqk02 pg_wal]$ pg_ctl reload
server signaled
[postgres@duqk02 pg_wal]$
原主库启动数据库
代码语言:javascript复制 /*192.168.13.31*/
[postgres@duqk01 ~]$ touch /data/pgdata/standby.signal
[postgres@duqk01 ~]$ pg_ctl start
waiting for server to start....2023-01-13 09:22:14.949 CST [10085] LOG: 00000: redirecting log output to logging collector process
2023-01-13 09:22:14.949 CST [10085] HINT: Future log output will appear in directory "pg_log".
2023-01-13 09:22:14.949 CST [10085] LOCATION: SysLogger_Start, syslogger.c:674
............................ stopped waiting
pg_ctl: could not start server
Examine the log output.
数据库日志:
代码语言:javascript复制2023-01-13 09:22:14.949 CST,,,10085,,63c0b246.2765,1,,2023-01-13 09:22:14 CST,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,"PostmasterMain, postmaster.c:1119","","postmaster",,0
2023-01-13 09:22:14.949 CST,,,10085,,63c0b246.2765,2,,2023-01-13 09:22:14 CST,,0,LOG,00000,"starting PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit",,,,,,,,"PostmasterMain, postmaster.c:1129","","postmaster",,0
2023-01-13 09:22:14.949 CST,,,10085,,63c0b246.2765,3,,2023-01-13 09:22:14 CST,,0,LOG,00000,"listening on IPv4 address ""0.0.0.0"", port 1921",,,,,,,,"StreamServerPort, pqcomm.c:585","","postmaster",,0
2023-01-13 09:22:14.949 CST,,,10085,,63c0b246.2765,4,,2023-01-13 09:22:14 CST,,0,LOG,00000,"listening on IPv6 address ""::"", port 1921",,,,,,,,"StreamServerPort, pqcomm.c:585","","postmaster",,0
2023-01-13 09:22:14.950 CST,,,10085,,63c0b246.2765,5,,2023-01-13 09:22:14 CST,,0,LOG,00000,"listening on Unix socket ""./.s.PGSQL.1921""",,,,,,,,"StreamServerPort, pqcomm.c:579","","postmaster",,0
2023-01-13 09:22:14.954 CST,,,10087,,63c0b246.2767,1,,2023-01-13 09:22:14 CST,,0,LOG,00000,"database system was interrupted while in recovery at log time 2023-01-13 09:07:01 CST",,"If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.",,,,,,"StartupXLOG, xlog.c:6570","","startup",,0
2023-01-13 09:22:14.969 CST,,,10087,,63c0b246.2767,2,,2023-01-13 09:22:14 CST,,0,LOG,00000,"entering standby mode",,,,,,,,"StartupXLOG, xlog.c:6638","","startup",,0
2023-01-13 09:22:15.253 CST,,,10087,,63c0b246.2767,3,,2023-01-13 09:22:14 CST,,0,LOG,00000,"restored log file ""00000001000000000000000E"" from archive",,,,,,,,"RestoreArchivedFile, xlogarchive.c:215","","startup",,0
2023-01-13 09:22:15.273 CST,,,10087,,63c0b246.2767,4,,2023-01-13 09:22:14 CST,1/0,0,LOG,00000,"redo starts at 0/E000028",,,,,,,,"StartupXLOG, xlog.c:7378","","startup",,0
2023-01-13 09:22:15.733 CST,,,10087,,63c0b246.2767,5,,2023-01-13 09:22:14 CST,1/0,0,LOG,00000,"restored log file ""00000001000000000000000F"" from archive",,,,,,,,"RestoreArchivedFile, xlogarchive.c:215","","startup",,0
2023-01-13 09:22:16.178 CST,,,10087,,63c0b246.2767,6,,2023-01-13 09:22:14 CST,1/0,0,LOG,00000,"restored log file ""000000010000000000000010"" from archive",,,,,,,,"RestoreArchivedFile, xlogarchive.c:215","","startup",,0
2023-01-13 09:22:16.613 CST,,,10087,,63c0b246.2767,7,,2023-01-13 09:22:14 CST,1/0,0,LOG,00000,"restored log file ""000000010000000000000011"" from archive",,,,,,,,"RestoreArchivedFile, xlogarchive.c:215","","startup",,0
2023-01-13 09:22:17.221 CST,,,10087,,63c0b246.2767,8,,2023-01-13 09:22:14 CST,1/0,0,LOG,00000,"restored log file ""000000010000000000000012"" from archive",,,,,,,,"RestoreArchivedFile, xlogarchive.c:215","","startup",,0
2023-01-13 09:22:17.276 CST,,,10087,,63c0b246.2767,9,,2023-01-13 09:22:14 CST,1/0,0,PANIC,XX000,"unexpected timeline ID 2 (should be 1) in checkpoint record",,,,,"WAL redo at 0/13000060 for XLOG/CHECKPOINT_ONLINE: redo 0/13000028; tli 2; prev tli 2; fpw true; xid 0:757; oid 32774; multi 1; offset 0; oldest xid 727 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 757; online",,,"xlog_redo, xlog.c:10490","","startup",,0
2023-01-13 09:22:43.677 CST,,,10085,,63c0b246.2765,6,,2023-01-13 09:22:14 CST,,0,LOG,00000,"startup process (PID 10087) was terminated by signal 6: Aborted",,,,,,,,"LogChildExit, postmaster.c:3759","","postmaster",,0
2023-01-13 09:22:43.677 CST,,,10085,,63c0b246.2765,7,,2023-01-13 09:22:14 CST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,"HandleChildCrash, postmaster.c:3482","","postmaster",,0
2023-01-13 09:22:43.677 CST,,,10085,,63c0b246.2765,8,,2023-01-13 09:22:14 CST,,0,LOG,00000,"shutting down due to startup process failure",,,,,,,,"PostmasterStateMachine, postmaster.c:4020","","postmaster",,0
2023-01-13 09:22:43.779 CST,,,10085,,63c0b246.2765,9,,2023-01-13 09:22:14 CST,,0,LOG,00000,"database system is shut down",,,,,,,,"UnlinkLockFiles, miscinit.c:963","","postmaster",,0
由于此时原主库的restore_command是指向本地的归档,应用不到新主库的归档,所以得修改restore_command指向新的主库归档,前提是节点之间做好互信。
代码语言:javascript复制 /*192.168.13.31*/
[postgres@duqk01 ~]$ cat /data/pgdata/postgresql.conf |grep restore
restore_command = 'cp /data/pgarc/%f %p' # command to use to restore an archived logfile segment
# placeholders: %p = path of file to restore
#recovery_target_name = '' # the named restore point to which recovery will proceed
[postgres@duqk01 ~]$
[postgres@duqk01 ~]$ vi /data/pgdata/postgresql.conf
[postgres@duqk01 ~]$ cat /data/pgdata/postgresql.conf |grep restore
restore_command = 'scp 192.168.13.32:/data/pgarc/%f %p' # command to use to restore an archived logfile segment
# placeholders: %p = path of file to restore
#recovery_target_name = '' # the named restore point to which recovery will proceed
[postgres@duqk01 ~]$
----先设置好节点间的互信
[postgres@duqk01 ~]$ pg_ctl start
waiting for server to start....2023-01-13 09:24:05.990 CST [10118] LOG: 00000: redirecting log output to logging collector process
2023-01-13 09:24:05.990 CST [10118] HINT: Future log output will appear in directory "pg_log".
2023-01-13 09:24:05.990 CST [10118] LOCATION: SysLogger_Start, syslogger.c:674
.... done
server started
[postgres@duqk01 ~]$
[postgres@duqk01 ~]$ psql
psql (14.5)
Type "help" for help.
postgres=# select count(*) from t;
count
-------
500
(1 row)
postgres=#
在新的主库上查看流复制状态
代码语言:javascript复制 /*192.168.13.32*/
postgres=# select * from pg_replication_slotsgx
-[ RECORD 1 ]------- -----------
slot_name | rep
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 10518
xmin | 763
catalog_xmin |
restart_lsn | 0/1D003820
confirmed_flush_lsn |
wal_status | reserved
safe_wal_size |
two_phase | f
postgres=# select * from pg_replication_slotsgx
-[ RECORD 1 ]------- -----------
slot_name | rep
plugin |
slot_type | physical
datoid |
database |
temporary | f
active | t
active_pid | 10518
xmin | 763
catalog_xmin |
restart_lsn | 0/1D003820
confirmed_flush_lsn |
wal_status | reserved
safe_wal_size |
two_phase | f
postgres=#
结论
1.结合pg_rewind的原理,验证了pg_rewind后,原主库除了数据块和新的主库一致之外,原主库的wal日志,数据库日志,$PGDATA下的配置文件等,都是从新的主库拉取。
2.对于设置为archive_mode为always模式的流复制环境,在原主库做pg_rewind的时候,是需要本机的归档的。如果pg_rewind成功后,在启动原主库的时候,是需要新的主库的归档。
参考文章:https://blog.csdn.net/songyundong1993/article/ details/123380377 https://blog.csdn.net/dqk1982/article/ details/128670473