本文档详细记录如何一步一步在两台Linux服务器上,搭建PostgreSQL数据库主从模式的高可用数据库环境,以及如何手工切换主备库,及其注意事项。
为了实现数据库对于应用层的透明,配合浮动IP来实现,即使数据库执行主备切换,应用层依然通过浮动IP来访问数据库。即:后端数据库执行主备切换前后,应用系统无需做任何更改。
搭建环境选择在Rehat 7.9 PostgreSQL 13.2上。
基本思路和流程大致如下:
1 主库真实 IP为192.168.1.106,配置浮动IP为192.168.1.126,主机名为pgprimary,在其上安装并初始化PostgreSQL数据库;
2 备库真实IP为192.168.1.116,配置浮动IP为192.168.1.126,主机名为pgstandby,在其上只安装PostgreSQL数据库软件;
3 通常情况下,浮动IP 192.168.1.126运行在主库上,当主库出故障时,手工执行数据库的主备切换,备库成为新主库,然后新主库上启用浮动IP;
4 修复并重配原主库,使其成为新主库的备库;
主库
代码语言:javascript复制[root@Centos ~]# cp /etc/sysconfig/network-scripts/ifcfg-ens33 /etc/sysconfig/network-scripts/ifcfg-ens33:1
[root@Centos ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens33:1 TYPE="Ethernet" PROXY_METHOD="none" BROWSER_ONLY="no" BOOTPROTO="none" DEFROUTE="yes" IPV4_FAILURE_FATAL="no" IPV6INIT="yes" IPV6_AUTOCONF="yes" IPV6_DEFROUTE="yes" IPV6_FAILURE_FATAL="no" IPV6_ADDR_GEN_MODE="stable-privacy" NAME="ens33" UUID="02bc077a-e6b9-492b-a5a3-91bbd808b4e9" DEVICE="ens33:1" ONBOOT="yes" IPADDR="192.168.1.116" PREFIX="24" NM_CONTROLLED=no DNS1="114.114.114.114" IPV6_PRIVACY="no"
一、执行stream主备配置流程
1.1 主库创建流复制的用户
代码语言:javascript复制[postgres@pgprimary data]$ psql
postgres=# CREATE ROLE replica login replication encrypted password 'replica';
CREATE ROLE
postgres=#
1.2 主库修改pg_hba.conf文件,允许备库IP通过复制用户访问数据库
代码语言:javascript复制[postgres@pgprimary data]$ vi pg_hba.conf
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 0.0.0.0/0 md5
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host replication replica 192.168.1.116/32 md5
最后一行,添加了replica用户可以从备库IP 192.168.1.116访问主库。
代码语言:javascript复制[postgres@pgstandby 13.2]$ ll
total 16
drwxrwxr-x. 2 postgres postgres 4096 Jan 7 14:12 bin
drwxrwxr-x. 6 postgres postgres 4096 Jan 7 14:12 include
drwxrwxr-x. 4 postgres postgres 4096 Jan 7 14:12 lib
drwxrwxr-x. 8 postgres postgres 4096 Jan 7 14:12 share
1.3 备库上执行对于主库的基础备份
代码语言:javascript复制[postgres@pgstandby data]$ pwd
/data/postgres/13.2/data
[postgres@pgstandby 13.2]$ pg_basebackup -h 192.168.1.106 -p 5432 -U replica --password -X stream -Fp --progress -D $PGDATA -R
Password:
40128/40128 kB (100%), 2/2 tablespaces
注意,备份选项上带有-R选项。
1.4 备库就可以执行pg_ctl start启动了
这时,就可以看到备库服务器上自动生成了standby.signal文件。同时,也看到在$PGDATA路径下,数据库自动帮我们配置了关于流复制的主库的信息:
代码语言:javascript复制[postgres@pgstandby 13.2]$ pg_ctl start
[postgres@pgstandby 13.2]$ cat data/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=replica password=replica channel_binding=disable host=192.168.1.106 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
当然了,如果我们没有使用-R来备份主库的话。我们完全可以在备库上手工创建standby.signal文件,然后手工编辑postgresql.auto.conf,并在其内容中配置主库的信息。
代码语言:javascript复制
[postgres@pgstandby 13.2]$ ll
total 20
drwxrwxr-x. 2 postgres postgres 4096 Jan 7 14:12 bin
drwx------. 19 postgres postgres 4096 Jan 10 21:04 data
drwxrwxr-x. 6 postgres postgres 4096 Jan 7 14:12 include
drwxrwxr-x. 4 postgres postgres 4096 Jan 7 14:12 lib
drwxrwxr-x. 8 postgres postgres 4096 Jan 7 14:12 share
[postgres@pgstandby 13.2]$ cd data
[postgres@pgstandby data]$ ll
total 272
-rw-------. 1 postgres postgres 224 Jan 10 21:04 backup_label
-rw-------. 1 postgres postgres 220496 Jan 10 21:04 backup_manifest
drwx------. 7 postgres postgres 67 Jan 10 21:04 base
drwx------. 2 postgres postgres 4096 Jan 10 21:04 global
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_commit_ts
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_dynshmem
-rw-------. 1 postgres postgres 4896 Jan 10 21:04 pg_hba.conf
-rw-------. 1 postgres postgres 1636 Jan 10 21:04 pg_ident.conf
drwx------. 4 postgres postgres 68 Jan 10 21:04 pg_logical
drwx------. 4 postgres postgres 36 Jan 10 21:04 pg_multixact
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_notify
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_replslot
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_serial
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_snapshots
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_stat
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_stat_tmp
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_subtrans
drwx------. 2 postgres postgres 19 Jan 10 21:04 pg_tblspc
drwx------. 2 postgres postgres 6 Jan 10 21:04 pg_twophase
-rw-------. 1 postgres postgres 3 Jan 10 21:04 PG_VERSION
drwx------. 3 postgres postgres 60 Jan 10 21:04 pg_wal
drwx------. 2 postgres postgres 18 Jan 10 21:04 pg_xact
-rw-------. 1 postgres postgres 322 Jan 10 21:04 postgresql.auto.conf
-rw-------. 1 postgres postgres 27981 Jan 10 21:04 postgresql.conf
-rw-------. 1 postgres postgres 0 Jan 10 21:04 standby.signal
1.5 备库数据库进程信息
代码语言:javascript复制[postgres@pgstandby data]$ ps -ef|grep postgres
root 19760 19406 0 19:43 pts/1 00:00:00 su - postgres
postgres 19761 19760 0 19:43 pts/1 00:00:00 -bash
root 20509 20400 0 20:41 pts/2 00:00:00 su - postgres
postgres 20510 20509 0 20:41 pts/2 00:00:00 -bash
postgres 20924 1 0 21:18 ? 00:00:00 /data/postgres/13.2/bin/postgres
postgres 20925 20924 0 21:18 ? 00:00:00 postgres: startup recovering 000000010000000000000004
postgres 20926 20924 0 21:18 ? 00:00:00 postgres: checkpointer
postgres 20927 20924 0 21:18 ? 00:00:00 postgres: background writer
postgres 20928 20924 0 21:18 ? 00:00:00 postgres: stats collector
postgres 20929 20924 0 21:18 ? 00:00:00 postgres: walreceiver streaming 0/4000148
postgres 20944 20510 0 21:20 pts/2 00:00:00 ps -ef
postgres 20945 20510 0 21:20 pts/2 00:00:00 grep --color=auto postgres
[postgres@pgstandby data]$
备库上,可以看到walreceiver进程,正在读取日志streaming 0/4000148,执行恢复recovering 000000010000000000000004。
1.6 主库数据库进程信息
代码语言:javascript复制[postgres@pgprimary data]$ ps -ef|grep postgres
root 20334 19836 0 19:46 pts/2 00:00:00 su - postgres
postgres 20335 20334 0 19:46 pts/2 00:00:00 -bash
postgres 21221 1 0 20:57 ? 00:00:00 /data/postgres/13.2/bin/postgres
postgres 21223 21221 0 20:57 ? 00:00:00 postgres: checkpointer
postgres 21224 21221 0 20:57 ? 00:00:00 postgres: background writer
postgres 21225 21221 0 20:57 ? 00:00:00 postgres: walwriter
postgres 21226 21221 0 20:57 ? 00:00:00 postgres: autovacuum launcher
postgres 21227 21221 0 20:57 ? 00:00:00 postgres: stats collector
postgres 21228 21221 0 20:57 ? 00:00:00 postgres: logical replication launcher
postgres 21487 21221 0 21:18 ? 00:00:00 postgres: walsender replica 192.168.1.116(43648) streaming 0/4000148
postgres 21537 20335 0 21:22 pts/2 00:00:00 ps -ef
postgres 21538 20335 0 21:22 pts/2 00:00:00 grep --color=auto postgres
主库上看到,后台进程walsender,正在向replica 192.168.1.116(43648) streaming 0/4000148推送日志信息
1.7 主库查看数据库复制信息
代码语言:javascript复制[postgres@pgprimary data]$ psql -xc "select * from pg_stat_replication"
-[ RECORD 1 ]---- ------------------------------
pid | 21487
usesysid | 16404
usename | replica
application_name | walreceiver
client_addr | 192.168.1.116
client_hostname |
client_port | 43648
backend_start | 2022-01-10 21:18:57.112831 08
backend_xmin |
state | streaming
sent_lsn | 0/4000148
write_lsn | 0/4000148
flush_lsn | 0/4000148
replay_lsn | 0/4000148
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
reply_time | 2022-01-10 21:23:47.870841 08
二、主备切换及注意事项
如果因为意外或故障导致主库不可用的情况下,可以直接将备库提升为主库对外提供服务。然后视具体情况看原来的主库是否需要重建,或者是否待故障恢复之后,可以直接作为新的备库,然后从新的主库(原备库)同步数据。
下面是模拟切换步骤:
2.1 主库停止,模拟故障
代码语言:javascript复制[postgres@pgprimary ~]$ ps -ef|grep postgres
root 18132 18071 0 11:54 pts/0 00:00:00 su - postgres
postgres 18133 18132 0 11:54 pts/0 00:00:00 -bash
postgres 20582 1 0 15:14 ? 00:00:00 /data/postgres/13.2/bin/postgres
postgres 20584 20582 0 15:14 ? 00:00:00 postgres: checkpointer
postgres 20585 20582 0 15:14 ? 00:00:00 postgres: background writer
postgres 20586 20582 0 15:14 ? 00:00:00 postgres: walwriter
postgres 20587 20582 0 15:14 ? 00:00:00 postgres: autovacuum launcher
postgres 20588 20582 0 15:14 ? 00:00:00 postgres: stats collector
postgres 20589 20582 0 15:14 ? 00:00:00 postgres: logical replication launcher
root 20930 20816 0 15:32 pts/0 00:00:00 su - postgres
postgres 20931 20930 0 15:32 pts/0 00:00:00 -bash
postgres 21179 20582 0 15:50 ? 00:00:00 postgres: walsender replica 192.168.1.116(41508) streaming 0/8000148
postgres 21330 20931 0 16:03 pts/0 00:00:00 ps -ef
postgres 21331 20931 0 16:03 pts/0 00:00:00 grep --color=auto postgres
[postgres@pgprimary ~]$ pg_ctl status
pg_ctl: server is running (PID: 20582)
/data/postgres/13.2/bin/postgres
[postgres@pgprimary ~]$ pg_ctl stop -m fast
waiting for server to shut down....2022-05-27 16:03:33.186 CST [20582] LOG: received fast shutdown request
2022-05-27 16:03:33.198 CST [20582] LOG: aborting any active transactions
2022-05-27 16:03:33.199 CST [20582] LOG: background worker "logical replication launcher" (PID 20589) exited with exit code 1
2022-05-27 16:03:33.199 CST [20584] LOG: shutting down
2022-05-27 16:03:33.220 CST [20582] LOG: database system is shut down
done
server stopped
[postgres@pgprimary ~]$ ps -ef|grep postgres
root 18132 18071 0 11:54 pts/0 00:00:00 su - postgres
postgres 18133 18132 0 11:54 pts/0 00:00:00 -bash
root 20930 20816 0 15:32 pts/0 00:00:00 su - postgres
postgres 20931 20930 0 15:32 pts/0 00:00:00 -bash
postgres 21343 20931 0 16:03 pts/0 00:00:00 ps -ef
postgres 21344 20931 0 16:03 pts/0 00:00:00 grep --color=auto postgres
[postgres@pgprimary ~]$
通过pg_ctl stop -m fast停止原来的主库之后,数据库后台进程都没有了。
2.2 备库提升为新主库,对外提供服务
代码语言:javascript复制[postgres@pgstandby 13.2]$ ps -ef|grep postgres
root 18284 18060 0 14:44 pts/0 00:00:00 su - postgres
postgres 18285 18284 0 14:44 pts/0 00:00:00 -bash
postgres 19107 1 0 15:50 ? 00:00:00 /data/postgres/13.2/bin/postgres
postgres 19108 19107 0 15:50 ? 00:00:00 postgres: startup recovering 000000010000000000000008
postgres 19109 19107 0 15:50 ? 00:00:00 postgres: checkpointer
postgres 19110 19107 0 15:50 ? 00:00:00 postgres: background writer
postgres 19111 19107 0 15:50 ? 00:00:00 postgres: stats collector
postgres 19268 18285 0 16:05 pts/0 00:00:00 ps -ef
postgres 19269 18285 0 16:05 pts/0 00:00:00 grep --color=auto postgres
[postgres@pgstandby 13.2]$ pg_ctl status
pg_ctl: server is running (PID: 19107)
/data/postgres/13.2/bin/postgres
[postgres@pgstandby 13.2]$ pg_ctl promote
waiting for server to promote....2022-05-27 16:06:25.714 CST [19108] LOG: received promote request
2022-05-27 16:06:25.715 CST [19108] LOG: redo done at 0/8000148
2022-05-27 16:06:25.728 CST [19108] LOG: selected new timeline ID: 2
2022-05-27 16:06:25.979 CST [19108] LOG: archive recovery complete
2022-05-27 16:06:25.982 CST [19107] LOG: database system is ready to accept connections
done
server promoted
[postgres@pgstandby 13.2]$
[postgres@pgstandby 13.2]$ ps -ef|grep postgres
root 18284 18060 0 14:44 pts/0 00:00:00 su - postgres
postgres 18285 18284 0 14:44 pts/0 00:00:00 -bash
postgres 19107 1 0 15:50 ? 00:00:00 /data/postgres/13.2/bin/postgres
postgres 19109 19107 0 15:50 ? 00:00:00 postgres: checkpointer
postgres 19110 19107 0 15:50 ? 00:00:00 postgres: background writer
postgres 19111 19107 0 15:50 ? 00:00:00 postgres: stats collector
postgres 19347 19107 0 16:06 ? 00:00:00 postgres: walwriter
postgres 19348 19107 0 16:06 ? 00:00:00 postgres: autovacuum launcher
postgres 19349 19107 0 16:06 ? 00:00:00 postgres: logical replication launcher
postgres 19407 18285 0 16:07 pts/0 00:00:00 ps -ef
postgres 19408 18285 0 16:07 pts/0 00:00:00 grep --color=auto postgres
[postgres@pgstandby 13.2]$
重要1:启动备库为新主库的命令是pg_ctl promote。
提升备库为主库之后,可以看到,后台进程中不再有startup recovering,以及walreceiver streaming进程了。同时,多了postgres: walwriter 写进程。
重要2:$PGDATA/standby.signal文件自动消失了。这是告诉PostgreSQL,我现在不再是备库了,我的身份是主库了
2.3 新主库修改pg_hba.conf文件
修改新主库(原备库192.168.1.116)的$PGDATA/pg_hba.conf文件,在其中添加允许新备库(原主库192.168.1.106)可以通过replica用户访问数据库的条目信息。
代码语言:javascript复制host replication all 192.168.1.106/32 md5
注意:这里的192.168.1.126是原主库上配置的1个浮动IP地址,绑定在eth0:1设备上。如果主从环境的数据库没有配置浮动IP的话,则这里的IP地址,应该直接填原主库的实际IP地址。
2.4 原主库新建$PGDATA/standby.signal文件
代码语言:javascript复制[postgres@pgprimary ~]$ cd $PGDATA
[postgres@pgprimary data]$ touch standby.signal
[postgres@pgprimary data]$ pwd
/data/postgres/13.2/data
[postgres@pgprimary data]$ ll standby.signal
-rw-rw-r--. 1 postgres postgres 0 May 27 16:17 standby.signal
注意:这一步骤非常非常重要,如果不配置该文件的话,那么原来的主库一旦重新启动话,就将成为了1个新的独立主库,脱离了主从数据库环境。
2.5 原主库修改$PGDATA/postgresql.auto.conf文件
代码语言:javascript复制[postgres@pgprimary data]$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
[postgres@pgprimary data]$ vim postgresql.auto.conf
[postgres@pgprimary data]$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo='user=replica password=replica host=192.168.1.116 port=5432'
[postgres@pgprimary data]$
2.6 启动原主库,变为新备库
代码语言:javascript复制[postgres@pgprimary data]$ pg_ctl start -l /home/postgres/startup.log
waiting for server to start.... done
server started
[postgres@pgprimary data]$
[postgres@pgprimary data]$ ps -ef|grep postgres
root 18132 18071 0 11:54 pts/0 00:00:00 su - postgres
postgres 18133 18132 0 11:54 pts/0 00:00:00 -bash
root 20930 20816 0 15:32 pts/0 00:00:00 su - postgres
postgres 20931 20930 0 15:32 pts/0 00:00:00 -bash
root 22329 22133 0 16:56 pts/0 00:00:00 su - postgres
postgres 22330 22329 0 16:56 pts/0 00:00:00 -bash
postgres 22391 1 0 16:58 ? 00:00:00 /data/postgres/13.2/bin/postgres
postgres 22392 22391 0 16:58 ? 00:00:00 postgres: startup recovering 000000020000000000000008
postgres 22393 22391 0 16:58 ? 00:00:00 postgres: checkpointer
postgres 22394 22391 0 16:58 ? 00:00:00 postgres: background writer
postgres 22395 22391 0 16:58 ? 00:00:00 postgres: stats collector
root 22918 22717 0 17:17 pts/0 00:00:00 su - postgres
postgres 22919 22918 0 17:17 pts/0 00:00:00 -bash
postgres 23002 22391 0 17:18 ? 00:00:00 postgres: walreceiver streaming 0/8000798
root 23142 23100 0 17:28 pts/1 00:00:00 su - postgres
postgres 23143 23142 0 17:28 pts/1 00:00:00 -bash
postgres 23193 23143 0 17:28 pts/1 00:00:00 psql
postgres 23194 22391 0 17:28 ? 00:00:00 postgres: postgres postgres [local] idle
postgres 23228 22919 0 17:31 pts/0 00:00:00 ps -ef
postgres 23229 22919 0 17:31 pts/0 00:00:00 grep --color=auto postgres
三、 测试同步状态
代码语言:javascript复制[postgres@pgstandby ~]$ psql
psql (13.2)
Type "help" for help.
postgres=# select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 13.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
(1 row)
#查看所有数据库信息
postgres=# l
postgres=# du
List of roles
Role name | Attributes | Member of
----------- ------------------------------------------------------------ -----------
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
replica | Replication | {}
t_user | | {}
postgres=# c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# c testdb
You are now connected to database "testdb" as user "postgres".
testdb=# d
List of relations
Schema | Name | Type | Owner
-------- ------------ ------- --------
public | test_table | table | t_user
(1 row)
3.1 新主库(原备库192.168.1.116)插入数据
代码语言:javascript复制
postgres=# select * from pg_test0524;
id | name
---- ---------
1 | beijing
(1 row)
postgres=# insert into pg_test0524 values(2,'shanghai');
INSERT 0 1
postgres=# insert into pg_test0524 values(3,'tianjin');
INSERT 0 1
3.2 同步数据正常(新备库原主库)
代码语言:javascript复制Last login: Fri May 27 17:11:44 2022
[root@pgprimary ~]# su - postgres
Last login: Fri May 27 17:17:10 CST 2022 on pts/0
[postgres@pgprimary ~]$ psql
psql (13.2)
Type "help" for help.
postgres=# select * from pg_test0524;
id | name
---- ----------
1 | beijing
2 | shanghai
(2 rows)
postgres=# select * from pg_test0524;
id | name
---- ----------
1 | beijing
2 | shanghai
3 | tianjin
(3 rows)
postgres=#
四、手动切换回原主库
4.1 查看主库状态(192.168.1.116)
代码语言:javascript复制[postgres@pgstandby data]$ pg_ctl status
pg_ctl: server is running (PID: 18137)
/data/postgres/13.2/bin/postgres "-D" "/data/postgres/13.2/data"
[postgres@pgstandby data]$ echo $PGDATA
/data/postgres/13.2/data
[postgres@pgstandby data]$ ps -ef|grep postgres
root 18006 17942 0 11:54 pts/0 00:00:00 su - postgres
postgres 18007 18006 0 11:54 pts/0 00:00:00 -bash
postgres 18137 1 0 12:02 ? 00:00:00 /data/postgres/13.2/bin/postgres -D /data/postgres/13.2/data
postgres 18139 18137 0 12:02 ? 00:00:00 postgres: checkpointer
postgres 18140 18137 0 12:02 ? 00:00:00 postgres: background writer
postgres 18141 18137 0 12:02 ? 00:00:00 postgres: walwriter
postgres 18142 18137 0 12:02 ? 00:00:00 postgres: autovacuum launcher
postgres 18143 18137 0 12:02 ? 00:00:00 postgres: stats collector
postgres 18144 18137 0 12:02 ? 00:00:00 postgres: logical replication launcher
postgres 19604 18137 0 14:07 ? 00:00:00 postgres: walsender replica 192.168.1.106(41370) streaming 0/9000D80
root 19778 19623 0 14:18 pts/1 00:00:00 su - postgres
postgres 19779 19778 0 14:18 pts/1 00:00:00 -bash
postgres 20222 19779 0 14:53 pts/1 00:00:00 ps -ef
postgres 20223 19779 0 14:53 pts/1 00:00:00 grep --color=auto postgres
[postgres@pgstandby data]$ pg_ctl status
pg_ctl: server is running (PID: 18137)
/data/postgres/13.2/bin/postgres "-D" "/data/postgres/13.2/data"
4.2 停止主库,模拟故障
代码语言:javascript复制[postgres@pgstandby data]$ pg_ctl stop -m fast
waiting for server to shut down.... done
server stopped
[postgres@pgstandby data]$ ps -ef|grep postgres
root 18006 17942 0 11:54 pts/0 00:00:00 su - postgres
postgres 18007 18006 0 11:54 pts/0 00:00:00 -bash
root 19778 19623 0 14:18 pts/1 00:00:00 su - postgres
postgres 19779 19778 0 14:18 pts/1 00:00:00 -bash
postgres 20236 19779 0 14:54 pts/1 00:00:00 ps -ef
postgres 20237 19779 0 14:54 pts/1 00:00:00 grep --color=auto postgres
通过pg_ctl stop -m fast停止主库(192.168.30.116)之后,数据库后台进程都没有了。
代码语言:javascript复制查看备库的状态
[postgres@pgprimary data]$ ps -ef|grep postgres
root 18287 18230 0 11:54 pts/0 00:00:00 su - postgres
postgres 18288 18287 0 11:54 pts/0 00:00:00 -bash
postgres 18417 1 0 12:02 ? 00:00:00 /data/postgres/13.2/bin/postgres -D /data/postgres/13.2/data
postgres 18418 18417 0 12:02 ? 00:00:00 postgres: startup recovering 000000020000000000000009
postgres 18419 18417 0 12:02 ? 00:00:00 postgres: checkpointer
postgres 18420 18417 0 12:02 ? 00:00:00 postgres: background writer
postgres 18421 18417 0 12:02 ? 00:00:00 postgres: stats collector
postgres 19326 18288 0 13:34 pts/0 00:00:00 psql
postgres 19336 18417 0 13:34 ? 00:00:00 postgres: postgres postgres [local] idle
root 19779 19654 0 14:18 pts/1 00:00:00 su - postgres
postgres 19780 19779 0 14:18 pts/1 00:00:00 -bash
postgres 20177 19780 0 14:54 pts/1 00:00:00 ps -ef
postgres 20178 19780 0 14:54 pts/1 00:00:00 grep --color=auto postgres
[postgres@pgprimary data]$ pg_ctl status
pg_ctl: server is running (PID: 18417)
/data/postgres/13.2/bin/postgres "-D" "/data/postgres/13.2/data"
4.3 备注提升为主库pg_ctl promote
代码语言:javascript复制[postgres@pgprimary data]$ pg_ctl promote
waiting for server to promote.... done
server promoted
[postgres@pgprimary data]$ ps -ef|grep postgres
root 18287 18230 0 11:54 pts/0 00:00:00 su - postgres
postgres 18288 18287 0 11:54 pts/0 00:00:00 -bash
postgres 18417 1 0 12:02 ? 00:00:00 /data/postgres/13.2/bin/postgres -D /data/postgres/13.2/data
postgres 18419 18417 0 12:02 ? 00:00:00 postgres: checkpointer
postgres 18420 18417 0 12:02 ? 00:00:00 postgres: background writer
postgres 18421 18417 0 12:02 ? 00:00:00 postgres: stats collector
postgres 19326 18288 0 13:34 pts/0 00:00:00 psql
postgres 19336 18417 0 13:34 ? 00:00:00 postgres: postgres postgres [local] idle
root 19779 19654 0 14:18 pts/1 00:00:00 su - postgres
postgres 19780 19779 0 14:18 pts/1 00:00:00 -bash
postgres 20201 18417 0 14:55 ? 00:00:00 postgres: walwriter
postgres 20202 18417 0 14:55 ? 00:00:00 postgres: autovacuum launcher
postgres 20203 18417 0 14:55 ? 00:00:00 postgres: logical replication launcher
postgres 20204 19780 0 14:55 pts/1 00:00:00 ps -ef
postgres 20205 19780 0 14:55 pts/1 00:00:00 grep --color=auto postgres
4.4新主库修改pg_hba.conf文件,加入备库访问
代码语言:javascript复制[postgres@pgprimary data]$ cat $PGDATA/pg_hba.conf
# PostgreSQL Client Authentication Configuration File
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 0.0.0.0/0 md5
# IPv6 local connections:
host all all ::1/128 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host replication replica 192.168.1.116/32 md5
4.5 新备库创建文件 standby.signal
代码语言:javascript复制[postgres@pgstandby data]$ cd $PGDATA
[postgres@pgstandby data]$ pwd
/data/postgres/13.2/data
[postgres@pgstandby data]$ touch standby.signal
[postgres@pgstandby data]$ ll
查看连接主库的信息
[postgres@pgstandby data]$ vim postgresql.auto.conf
[postgres@pgstandby data]$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
primary_conninfo = 'user=replica password=replica channel_binding=disable host=192.168.1.106 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
4.6 启动原主库(第一次切换后的主库),变为新备库
代码语言:javascript复制[postgres@pgstandby data]$ pg_ctl start -l /home/postgres/startup.log
waiting for server to start.... done
server started
[postgres@pgstandby data]$ ps -ef|grep postgres
root 18006 17942 0 11:54 pts/0 00:00:00 su - postgres
postgres 18007 18006 0 11:54 pts/0 00:00:00 -bash
root 19778 19623 0 14:18 pts/1 00:00:00 su - postgres
postgres 19779 19778 0 14:18 pts/1 00:00:00 -bash
postgres 20330 1 0 15:02 ? 00:00:00 /data/postgres/13.2/bin/postgres
postgres 20331 20330 0 15:02 ? 00:00:00 postgres: startup recovering 000000030000000000000009
postgres 20332 20330 0 15:02 ? 00:00:00 postgres: checkpointer
postgres 20333 20330 0 15:02 ? 00:00:00 postgres: background writer
postgres 20334 20330 0 15:02 ? 00:00:00 postgres: stats collector
postgres 20335 20330 2 15:02 ? 00:00:00 postgres: walreceiver streaming 0/9000F10
postgres 20336 19779 0 15:02 pts/1 00:00:00 ps -ef
postgres 20337 19779 0 15:02 pts/1 00:00:00 grep --color=auto postgres
可以看到恢复进程开始工作,wal 进程 postgres: walreceiver streaming 0/9000F10
4.7 测试数据同步
代码语言:javascript复制主库插入数据
[postgres@pgprimary data]$ psql
psql (13.2)
Type "help" for help.
postgres-# d
List of relations
Schema | Name | Type | Owner
-------- ------------- ------- ----------
public | pg_test | table | postgres
public | pg_test0524 | table | postgres
public | test_tbs | table | postgres
(3 rows)
postgres-# c
You are now connected to database "postgres" as user "postgres".
postgres=# select * from pg_test0524;
id | name
---- -----------
1 | beijing
2 | shanghai
3 | tianjin
4 | xian
5 | guangzhou
6 | shenzheng
(6 rows)
postgres=# insert into pg_test0524 values(7,'langfang');
INSERT 0 1
postgres=#
备库查看同步情况
[postgres@pgstandby data]$ psql
psql (13.2)
Type "help" for help.
postgres=# d
List of relations
Schema | Name | Type | Owner
-------- ------------- ------- ----------
public | pg_test | table | postgres
public | pg_test0524 | table | postgres
public | test_tbs | table | postgres
(3 rows)
postgres=# c
You are now connected to database "postgres" as user "postgres".
postgres=# select * from pg_test0524;
id | name
---- -----------
1 | beijing
2 | shanghai
3 | tianjin
4 | xian
5 | guangzhou
6 | shenzheng
(6 rows)
postgres=# select * from pg_test0524;
id | name
---- -----------
1 | beijing
2 | shanghai
3 | tianjin
4 | xian
5 | guangzhou
6 | shenzheng
7 | langfang
(7 rows)
postgres=#
五 小结
- 随着新版本的发行,数据库的配置和使用也越来越简单顺手了。
- 备库提升为主库的命令:pg_ctl promote;
- 新主库(原备库)的pg_hba.conf文件,要开放允许流复制访问数据库的信息给原主库的IP地址;
- 原主库配置为新备库的时候,务必要创建$PGDATA/standby.signal文件;
- 原主库配置为新备库的时候,务必要修改$PGDATA/postgresql.auto.conf文件,添加主库primary_conninfo的信息;