作者:杨文
DBA,负责客户项目的需求与维护,会点数据库,不限于MySQL、Redis、Cassandra、GreenPlum、ClickHouse、Elastic、TDSQL等等。
本文来源:原创投稿
*爱可生开源社区出品,原创内容未经授权不得随意使用,转载请联系小编并注明来源。
一、前情提要:
我们知道Greenplum集群由Master Severs和Segment Severs组成。其中故障存在三种类别:Master故障、Segment故障、数据异常。之前我们已经聊过“Master故障”和“数据异常”的处理方式,今天将介绍Segment故障的处理方式。
二、本地模拟故障环境:
2.1、第一种情况:段故障。
代码语言:sql复制[gpadmin@master ~]$ gpstate
20221127:22:39:00:022659 gpstate:master:gpadmin-[INFO]:-Starting gpstate with args:
20221127:22:39:00:022659 gpstate:master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b'
20221127:22:39:00:022659 gpstate:master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Apr 16 2020 02:24:06'
20221127:22:39:00:022659 gpstate:master:gpadmin-[INFO]:-Obtaining Segment details from master...
20221127:22:39:00:022659 gpstate:master:gpadmin-[INFO]:-Gathering data from segments...
...
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:-Greenplum instance status summary
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Master instance = Active
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Master standby = standby
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Standby master state = Standby host passive
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total segment instance count from metadata = 40
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Primary Segment Status
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total primary segments = 20
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total primary segment valid (at master) = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total primary segment failures (at master) = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number of postmaster.pid files missing = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number of postmaster.pid files found = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number of postmaster.pid PIDs missing = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number of /tmp lock files missing = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number of /tmp lock files found = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number postmaster processes missing = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number postmaster processes found = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Mirror Segment Status
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total mirror segments = 20
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total mirror segment valid (at master) = 20
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total mirror segment failures (at master) = 0
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number of postmaster.pid files missing = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number of postmaster.pid files found = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number of postmaster.pid PIDs missing = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number of postmaster.pid PIDs found = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number of /tmp lock files missing = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number of /tmp lock files found = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number postmaster processes missing = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number postmaster processes found = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[WARNING]:-Total number mirror segments acting as primary segments = 4 <<<<<<<<
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:- Total number mirror segments acting as mirror segments = 16
20221127:22:39:03:022659 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
代码语言:sql复制[gpadmin@master ~]$ gpstate -m
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:-Starting gpstate with args: -m
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b'
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Apr 16 2020 02:24:06'
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:-Obtaining Segment details from master...
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:--------------------------------------------------------------
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:--Current GPDB mirror list and status
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:--Type = Group
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:--------------------------------------------------------------
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- Mirror Datadir Port Status Data Status
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data02 /greenplum/gpdata/mirror/gpseg0 56000 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data02 /greenplum/gpdata/mirror/gpseg1 56001 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data02 /greenplum/gpdata/mirror/gpseg2 56002 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data03 /greenplum/gpdata/mirror/gpseg3 56000 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data03 /greenplum/gpdata/mirror/gpseg4 56001 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data03 /greenplum/gpdata/mirror/gpseg5 56002 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data01 /greenplum/gpdata/mirror/gpseg6 56000 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data01 /greenplum/gpdata/mirror/gpseg7 56001 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data01 /greenplum/gpdata/mirror/gpseg8 56002 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[WARNING]:-data05 /greenplum/gpdata/mirror/gpseg9 56000 Failed <<<<<<<<
20221127:22:44:55:023196 gpstate:master:gpadmin-[WARNING]:-data05 /greenplum/gpdata/mirror/gpseg10 56001 Failed <<<<<<<<
20221127:22:44:55:023196 gpstate:master:gpadmin-[WARNING]:-data05 /greenplum/gpdata/mirror/gpseg11 56002 Failed <<<<<<<<
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data04 /greenplum/gpdata/mirror/gpseg12 56000 Acting as Primary Not In Sync
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data04 /greenplum/gpdata/mirror/gpseg13 56001 Acting as Primary Not In Sync
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data04 /greenplum/gpdata/mirror/gpseg14 56002 Acting as Primary Not In Sync
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data02 /greenplum/gpdata/mirror/gpseg15 56003 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data03 /greenplum/gpdata/mirror/gpseg16 56003 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data04 /greenplum/gpdata/mirror/gpseg17 56003 Passive Synchronized
20221127:22:44:55:023196 gpstate:master:gpadmin-[WARNING]:-data05 /greenplum/gpdata/mirror/gpseg18 56003 Failed <<<<<<<<
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:- data01 /greenplum/gpdata/mirror/gpseg19 56003 Acting as Primary Not In Sync
20221127:22:44:55:023196 gpstate:master:gpadmin-[INFO]:--------------------------------------------------------------
20221127:22:44:55:023196 gpstate:master:gpadmin-[WARNING]:-4 segment(s) configured as mirror(s) are acting as primaries
20221127:22:44:55:023196 gpstate:master:gpadmin-[WARNING]:-4 segment(s) configured as mirror(s) have failed
20221127:22:44:55:023196 gpstate:master:gpadmin-[WARNING]:-4 mirror segment(s) acting as primaries are not synchronized
2.2、第二种情况:表空间故障。
代码语言:sql复制[gpadmin@data05 ~]$ cd /greenplum/gpdata/mirror/gpseg10
[gpadmin@data05 gpseg10]$ ls
backup_label.old gpmetrics pg_clog pg_logical pg_stat PG_VERSION postmaster.pid
base gpperfmon pg_distributedlog pg_multixact pg_stat_tmp pg_xlog recovery.conf
fts_probe_file.bak gpsegconfig_dump pg_dynshmem pg_notify pg_subtrans postgresql.auto.conf recovery.done
global gpssh.conf pg_hba.conf pg_replslot pg_tblspc postgresql.conf
gpexpand.pid internal.auto.conf pg_ident.conf pg_serial pg_twophase postgresql.conf.bak
gpexpand.status internal.auto.conf.bak pg_log pg_snapshots pg_utilitymodedtmredo postmaster.opts
[gpadmin@data05 gpseg10]$ rm -rf pg_tblspc/
代码语言:sql复制[gpadmin@master ~]$ gpstate -e
20221127:23:13:29:026114 gpstate:master:gpadmin-[INFO]:-Starting gpstate with args: -e
20221127:23:13:29:026114 gpstate:master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b'
20221127:23:13:29:026114 gpstate:master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Apr 16 2020 02:24:06'
20221127:23:13:29:026114 gpstate:master:gpadmin-[INFO]:-Obtaining Segment details from master...
20221127:23:13:29:026114 gpstate:master:gpadmin-[INFO]:-Gathering data from segments...
20221127:23:13:30:026114 gpstate:master:gpadmin-[WARNING]:-pg_stat_replication shows no standby connections
20221127:23:13:30:026114 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:23:13:30:026114 gpstate:master:gpadmin-[INFO]:-Segment Mirroring Status Report
20221127:23:13:30:026114 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:23:13:30:026114 gpstate:master:gpadmin-[INFO]:-Downed Segments (may include segments where status could not be retrieved)
20221127:23:13:30:026114 gpstate:master:gpadmin-[INFO]:- Segment Port Config status Status
20221127:23:13:30:026114 gpstate:master:gpadmin-[INFO]:- data05 56001 Up Process error -- database process may be down
三、故障分析及解决:
3.1、针对“2.1”情况的处理:
在线生成一个配置文件:
代码语言:sql复制[gpadmin@master ~]$ gprecoverseg -o ./recover1
20221127:22:48:41:023405 gprecoverseg:master:gpadmin-[INFO]:-Starting gprecoverseg with args: -o ./recover1
20221127:22:48:41:023405 gprecoverseg:master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b'
20221127:22:48:41:023405 gprecoverseg:master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Apr 16 2020 02:24:06'
20221127:22:48:41:023405 gprecoverseg:master:gpadmin-[INFO]:-Obtaining Segment details from master...
20221127:22:48:41:023405 gprecoverseg:master:gpadmin-[INFO]:-Configuration file output to ./recover1 successfully.
[gpadmin@master ~]$ more recover1
data05|55000|/greenplum/gpdata/primary/gpseg12
data05|55001|/greenplum/gpdata/primary/gpseg13
data05|55002|/greenplum/gpdata/primary/gpseg14
data05|55003|/greenplum/gpdata/primary/gpseg19
通过生成的配置文件进行修复集群:
代码语言:sql复制[gpadmin@master ~]$ gprecoverseg -i ./recover1 -a
检查状态:
代码语言:sql复制[gpadmin@master ~]$ gpstate -e
20221127:22:56:57:024771 gpstate:master:gpadmin-[INFO]:-Starting gpstate with args: -e
20221127:22:56:57:024771 gpstate:master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b'
20221127:22:56:57:024771 gpstate:master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Apr 16 2020 02:24:06'
20221127:22:56:57:024771 gpstate:master:gpadmin-[INFO]:-Obtaining Segment details from master...
20221127:22:56:57:024771 gpstate:master:gpadmin-[INFO]:-Gathering data from segments...
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:-Segment Mirroring Status Report
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:-Segments with Primary and Mirror Roles Switched
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:- Current Primary Port Mirror Port
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:- data04 56000 data05 55000
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:- data04 56001 data05 55001
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:- data04 56002 data05 55002
20221127:22:56:58:024771 gpstate:master:gpadmin-[INFO]:- data01 56003 data05 55003
代码语言:sql复制[gpadmin@master ~]$ psql -c "select * from gp_segment_configuration order by content asc,dbid;"
dbid | content | role | preferred_role | mode | status | port | hostname | address | datadir
------ --------- ------ ---------------- ------ -------- ------- ---------- --------- -----------------------------------
44 | -1 | p | p | s | u | 5432 | master | master | /greenplum/gpdata/master/gpseg-1
45 | -1 | m | m | s | u | 5432 | standby | standby | /greenplum/gpdata/master/gpseg-1
2 | 0 | p | p | s | u | 55000 | data01 | data01 | /greenplum/gpdata/primary/gpseg0
11 | 0 | m | m | s | u | 56000 | data02 | data02 | /greenplum/gpdata/mirror/gpseg0
3 | 1 | p | p | s | u | 55001 | data01 | data01 | /greenplum/gpdata/primary/gpseg1
12 | 1 | m | m | s | u | 56001 | data02 | data02 | /greenplum/gpdata/mirror/gpseg1
4 | 2 | p | p | s | u | 55002 | data01 | data01 | /greenplum/gpdata/primary/gpseg2
13 | 2 | m | m | s | u | 56002 | data02 | data02 | /greenplum/gpdata/mirror/gpseg2
5 | 3 | p | p | s | u | 55000 | data02 | data02 | /greenplum/gpdata/primary/gpseg3
14 | 3 | m | m | s | u | 56000 | data03 | data03 | /greenplum/gpdata/mirror/gpseg3
6 | 4 | p | p | s | u | 55001 | data02 | data02 | /greenplum/gpdata/primary/gpseg4
15 | 4 | m | m | s | u | 56001 | data03 | data03 | /greenplum/gpdata/mirror/gpseg4
7 | 5 | p | p | s | u | 55002 | data02 | data02 | /greenplum/gpdata/primary/gpseg5
16 | 5 | m | m | s | u | 56002 | data03 | data03 | /greenplum/gpdata/mirror/gpseg5
8 | 6 | p | p | s | u | 55000 | data03 | data03 | /greenplum/gpdata/primary/gpseg6
17 | 6 | m | m | s | u | 56000 | data01 | data01 | /greenplum/gpdata/mirror/gpseg6
9 | 7 | p | p | s | u | 55001 | data03 | data03 | /greenplum/gpdata/primary/gpseg7
18 | 7 | m | m | s | u | 56001 | data01 | data01 | /greenplum/gpdata/mirror/gpseg7
10 | 8 | p | p | s | u | 55002 | data03 | data03 | /greenplum/gpdata/primary/gpseg8
19 | 8 | m | m | s | u | 56002 | data01 | data01 | /greenplum/gpdata/mirror/gpseg8
21 | 9 | p | p | s | u | 55000 | data04 | data04 | /greenplum/gpdata/primary/gpseg9
30 | 9 | m | m | s | u | 56000 | data05 | data05 | /greenplum/gpdata/mirror/gpseg9
22 | 10 | p | p | s | u | 55001 | data04 | data04 | /greenplum/gpdata/primary/gpseg10
31 | 10 | m | m | s | u | 56001 | data05 | data05 | /greenplum/gpdata/mirror/gpseg10
23 | 11 | p | p | s | u | 55002 | data04 | data04 | /greenplum/gpdata/primary/gpseg11
32 | 11 | m | m | s | u | 56002 | data05 | data05 | /greenplum/gpdata/mirror/gpseg11
24 | 12 | m | p | s | u | 55000 | data05 | data05 | /greenplum/gpdata/primary/gpseg12
27 | 12 | p | m | s | u | 56000 | data04 | data04 | /greenplum/gpdata/mirror/gpseg12
25 | 13 | m | p | s | u | 55001 | data05 | data05 | /greenplum/gpdata/primary/gpseg13
28 | 13 | p | m | s | u | 56001 | data04 | data04 | /greenplum/gpdata/mirror/gpseg13
26 | 14 | m | p | s | u | 55002 | data05 | data05 | /greenplum/gpdata/primary/gpseg14
29 | 14 | p | m | s | u | 56002 | data04 | data04 | /greenplum/gpdata/mirror/gpseg14
33 | 15 | p | p | s | u | 55003 | data01 | data01 | /greenplum/gpdata/primary/gpseg15
39 | 15 | m | m | s | u | 56003 | data02 | data02 | /greenplum/gpdata/mirror/gpseg15
34 | 16 | p | p | s | u | 55003 | data02 | data02 | /greenplum/gpdata/primary/gpseg16
40 | 16 | m | m | s | u | 56003 | data03 | data03 | /greenplum/gpdata/mirror/gpseg16
35 | 17 | p | p | s | u | 55003 | data03 | data03 | /greenplum/gpdata/primary/gpseg17
41 | 17 | m | m | s | u | 56003 | data04 | data04 | /greenplum/gpdata/mirror/gpseg17
36 | 18 | p | p | s | u | 55003 | data04 | data04 | /greenplum/gpdata/primary/gpseg18
42 | 18 | m | m | s | u | 56003 | data05 | data05 | /greenplum/gpdata/mirror/gpseg18
37 | 19 | m | p | s | u | 55003 | data05 | data05 | /greenplum/gpdata/primary/gpseg19
38 | 19 | p | m | s | u | 56003 | data01 | data01 | /greenplum/gpdata/mirror/gpseg19
(42 rows)
可以看到所有段都是up状态了,但存在部分段角色异常。
修复角色状态:
代码语言:sql复制[gpadmin@master ~]$ gprecoverseg -r
再次检查确认状态,此处省略。
3.2、针对“2.2”情况的处理:
如果可以自动生成配置文件,就使用自动生成的。如果无法自动生成,则手工创建:
代码语言:sql复制[gpadmin@master ~]$ vi recover2
data05|56001|/greenplum/gpdata/mirror/gpseg10
通过生成的配置文件进行修复集群:
代码语言:sql复制[gpadmin@master ~]$ gprecoverseg -i ./recover2 -a
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:-Starting gprecoverseg with args: -i ./recover2 -F
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b'
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Apr 16 2020 02:24:06'
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:-Obtaining Segment details from master...
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:-Heap checksum setting is consistent between master and the segments that are candidates for recoverseg
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:-Greenplum instance recovery parameters
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:----------------------------------------------------------
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:-Recovery from configuration -i option supplied
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:----------------------------------------------------------
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:-Recovery 1 of 1
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:----------------------------------------------------------
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Synchronization mode = Full
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Failed instance host = data05
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Failed instance address = data05
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Failed instance directory = /greenplum/gpdata/mirror/gpseg10
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Failed instance port = 56001
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Recovery Source instance host = data04
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Recovery Source instance address = data04
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Recovery Source instance directory = /greenplum/gpdata/primary/gpseg10
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Recovery Source instance port = 55001
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:- Recovery Target = in-place
20221127:23:15:43:026332 gprecoverseg:master:gpadmin-[INFO]:----------------------------------------------------------
20221127:23:15:47:026332 gprecoverseg:master:gpadmin-[INFO]:-1 segment(s) to recover
20221127:23:15:47:026332 gprecoverseg:master:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20221127:23:15:47:026332 gprecoverseg:master:gpadmin-[INFO]:-Ensuring that shared memory is cleaned up for stopped segments
20221127:23:15:47:026332 gprecoverseg:master:gpadmin-[INFO]:-Validating remote directories
20221127:23:15:48:026332 gprecoverseg:master:gpadmin-[INFO]:-Configuring new segments data05 (dbid 31): pg_basebackup: base backup completed
20221127:23:15:51:026332 gprecoverseg:master:gpadmin-[INFO]:-Updating configuration with new mirrors
20221127:23:15:51:026332 gprecoverseg:master:gpadmin-[INFO]:-Updating mirrors
20221127:23:15:51:026332 gprecoverseg:master:gpadmin-[INFO]:-Starting mirrors
20221127:23:15:51:026332 gprecoverseg:master:gpadmin-[INFO]:-era is c6f862530103c913_221127213422
20221127:23:15:51:026332 gprecoverseg:master:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
20221127:23:15:51:026332 gprecoverseg:master:gpadmin-[INFO]:-Process results...
20221127:23:15:51:026332 gprecoverseg:master:gpadmin-[INFO]:-Triggering FTS probe
20221127:23:15:52:026332 gprecoverseg:master:gpadmin-[INFO]:-******************************************************************
20221127:23:15:52:026332 gprecoverseg:master:gpadmin-[INFO]:-Updating segments for streaming is completed.
20221127:23:15:52:026332 gprecoverseg:master:gpadmin-[INFO]:-For segments updated successfully, streaming will continue in the background.
20221127:23:15:52:026332 gprecoverseg:master:gpadmin-[INFO]:-Use gpstate -s to check the streaming progress.
20221127:23:15:52:026332 gprecoverseg:master:gpadmin-[INFO]:-******************************************************************
进程检查:
代码语言:sql复制[gpadmin@data05 gpseg10]$ ps -ef |grep postgres
gpadmin 45364 1 0 22:53 ? 00:00:00 /usr/local/greenplum-db-6.7.0/bin/postgres -D /greenplum/gpdata/primary/gpseg13 -p 55001
gpadmin 45367 1 0 22:53 ? 00:00:00 /usr/local/greenplum-db-6.7.0/bin/postgres -D /greenplum/gpdata/primary/gpseg12 -p 55000
gpadmin 45369 1 0 22:53 ? 00:00:00 /usr/local/greenplum-db-6.7.0/bin/postgres -D /greenplum/gpdata/primary/gpseg14 -p 55002
gpadmin 45373 1 0 22:53 ? 00:00:00 /usr/local/greenplum-db-6.7.0/bin/postgres -D /greenplum/gpdata/primary/gpseg19 -p 55003
gpadmin 45378 1 0 22:53 ? 00:00:00 /usr/local/greenplum-db-6.7.0/bin/postgres -D /greenplum/gpdata/mirror/gpseg9 -p 56000
gpadmin 45380 1 0 22:53 ? 00:00:00 /usr/local/greenplum-db-6.7.0/bin/postgres -D /greenplum/gpdata/mirror/gpseg18 -p 56003
gpadmin 45382 1 0 22:53 ? 00:00:00 /usr/local/greenplum-db-6.7.0/bin/postgres -D /greenplum/gpdata/mirror/gpseg11 -p 56002
gpadmin 47899 1 0 23:15 ? 00:00:00 /usr/local/greenplum-db-6.7.0/bin/postgres -D /greenplum/gpdata/mirror/gpseg10 -p 56001
......
表空间检查:
代码语言:sql复制[gpadmin@data05 gpseg10]$ ls
backup_label.old gpmetrics pg_clog pg_logical pg_stat PG_VERSION postmaster.pid
base gpperfmon pg_distributedlog pg_multixact pg_stat_tmp pg_xlog recovery.conf
fts_probe_file.bak gpsegconfig_dump pg_dynshmem pg_notify pg_subtrans postgresql.auto.conf recovery.done
global gpssh.conf pg_hba.conf pg_replslot pg_tblspc postgresql.conf
gpexpand.pid internal.auto.conf pg_ident.conf pg_serial pg_twophase postgresql.conf.bak
gpexpand.status internal.auto.conf.bak pg_log pg_snapshots pg_utilitymodedtmredo postmaster.opts
状态检查:
代码语言:sql复制[gpadmin@master ~]$ gpstate -e
20221127:23:23:01:026934 gpstate:master:gpadmin-[INFO]:-Starting gpstate with args: -e
20221127:23:23:01:026934 gpstate:master:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b'
20221127:23:23:01:026934 gpstate:master:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.7.0 build commit:2fbc274bc15a19b5de3c6e44ad5073464cd4f47b) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Apr 16 2020 02:24:06'
20221127:23:23:01:026934 gpstate:master:gpadmin-[INFO]:-Obtaining Segment details from master...
20221127:23:23:01:026934 gpstate:master:gpadmin-[INFO]:-Gathering data from segments...
20221127:23:23:02:026934 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:23:23:02:026934 gpstate:master:gpadmin-[INFO]:-Segment Mirroring Status Report
20221127:23:23:02:026934 gpstate:master:gpadmin-[INFO]:-----------------------------------------------------
20221127:23:23:02:026934 gpstate:master:gpadmin-[INFO]:-All segments are running normally
对于这种情况,一般不会存在数据节点状态异常的情况:
代码语言:sql复制[gpadmin@master ~]$ psql -c "select * from gp_segment_configuration order by content asc,dbid;"
dbid | content | role | preferred_role | mode | status | port | hostname | address | datadir
------ --------- ------ ---------------- ------ -------- ------- ---------- --------- -----------------------------------
44 | -1 | p | p | s | u | 5432 | master | master | /greenplum/gpdata/master/gpseg-1
45 | -1 | m | m | s | u | 5432 | standby | standby | /greenplum/gpdata/master/gpseg-1
2 | 0 | p | p | s | u | 55000 | data01 | data01 | /greenplum/gpdata/primary/gpseg0
11 | 0 | m | m | s | u | 56000 | data02 | data02 | /greenplum/gpdata/mirror/gpseg0
3 | 1 | p | p | s | u | 55001 | data01 | data01 | /greenplum/gpdata/primary/gpseg1
12 | 1 | m | m | s | u | 56001 | data02 | data02 | /greenplum/gpdata/mirror/gpseg1
4 | 2 | p | p | s | u | 55002 | data01 | data01 | /greenplum/gpdata/primary/gpseg2
13 | 2 | m | m | s | u | 56002 | data02 | data02 | /greenplum/gpdata/mirror/gpseg2
5 | 3 | p | p | s | u | 55000 | data02 | data02 | /greenplum/gpdata/primary/gpseg3
14 | 3 | m | m | s | u | 56000 | data03 | data03 | /greenplum/gpdata/mirror/gpseg3
6 | 4 | p | p | s | u | 55001 | data02 | data02 | /greenplum/gpdata/primary/gpseg4
15 | 4 | m | m | s | u | 56001 | data03 | data03 | /greenplum/gpdata/mirror/gpseg4
7 | 5 | p | p | s | u | 55002 | data02 | data02 | /greenplum/gpdata/primary/gpseg5
16 | 5 | m | m | s | u | 56002 | data03 | data03 | /greenplum/gpdata/mirror/gpseg5
8 | 6 | p | p | s | u | 55000 | data03 | data03 | /greenplum/gpdata/primary/gpseg6
17 | 6 | m | m | s | u | 56000 | data01 | data01 | /greenplum/gpdata/mirror/gpseg6
9 | 7 | p | p | s | u | 55001 | data03 | data03 | /greenplum/gpdata/primary/gpseg7
18 | 7 | m | m | s | u | 56001 | data01 | data01 | /greenplum/gpdata/mirror/gpseg7
10 | 8 | p | p | s | u | 55002 | data03 | data03 | /greenplum/gpdata/primary/gpseg8
19 | 8 | m | m | s | u | 56002 | data01 | data01 | /greenplum/gpdata/mirror/gpseg8
21 | 9 | p | p | s | u | 55000 | data04 | data04 | /greenplum/gpdata/primary/gpseg9
30 | 9 | m | m | s | u | 56000 | data05 | data05 | /greenplum/gpdata/mirror/gpseg9
22 | 10 | p | p | s | u | 55001 | data04 | data04 | /greenplum/gpdata/primary/gpseg10
31 | 10 | m | m | s | u | 56001 | data05 | data05 | /greenplum/gpdata/mirror/gpseg10
23 | 11 | p | p | s | u | 55002 | data04 | data04 | /greenplum/gpdata/primary/gpseg11
32 | 11 | m | m | s | u | 56002 | data05 | data05 | /greenplum/gpdata/mirror/gpseg11
24 | 12 | p | p | s | u | 55000 | data05 | data05 | /greenplum/gpdata/primary/gpseg12
27 | 12 | m | m | s | u | 56000 | data04 | data04 | /greenplum/gpdata/mirror/gpseg12
25 | 13 | p | p | s | u | 55001 | data05 | data05 | /greenplum/gpdata/primary/gpseg13
28 | 13 | m | m | s | u | 56001 | data04 | data04 | /greenplum/gpdata/mirror/gpseg13
26 | 14 | p | p | s | u | 55002 | data05 | data05 | /greenplum/gpdata/primary/gpseg14
29 | 14 | m | m | s | u | 56002 | data04 | data04 | /greenplum/gpdata/mirror/gpseg14
33 | 15 | p | p | s | u | 55003 | data01 | data01 | /greenplum/gpdata/primary/gpseg15
39 | 15 | m | m | s | u | 56003 | data02 | data02 | /greenplum/gpdata/mirror/gpseg15
34 | 16 | p | p | s | u | 55003 | data02 | data02 | /greenplum/gpdata/primary/gpseg16
40 | 16 | m | m | s | u | 56003 | data03 | data03 | /greenplum/gpdata/mirror/gpseg16
35 | 17 | p | p | s | u | 55003 | data03 | data03 | /greenplum/gpdata/primary/gpseg17
41 | 17 | m | m | s | u | 56003 | data04 | data04 | /greenplum/gpdata/mirror/gpseg17
36 | 18 | p | p | s | u | 55003 | data04 | data04 | /greenplum/gpdata/primary/gpseg18
42 | 18 | m | m | s | u | 56003 | data05 | data05 | /greenplum/gpdata/mirror/gpseg18
37 | 19 | p | p | s | u | 55003 | data05 | data05 | /greenplum/gpdata/primary/gpseg19
38 | 19 | m | m | s | u | 56003 | data01 | data01 | /greenplum/gpdata/mirror/gpseg19
查看数据:
代码语言:sql复制[gpadmin@master ~]$ psql -c "select gp_segment_id,count(*) from test_yw;"
同样可以看到所有数据节点上的数据都是正常的。