OGG 21c 远程捕获进程延迟问题处理

2022-12-07 14:12:37 浏览数 (1)

作者 | JiekeXu

来源 |公众号 JiekeXu DBA之路(ID: JiekeXu_IT)

如需转载请联系授权

大家好,我是 JiekeXu,很高兴又和大家见面了,今天和大家一起来看看 OGG 21c 远程捕获进程延迟问题处理

前面几天发布了一篇使用 OGG 21c 遇到的几个问题之后,居然在新的周一后又发现了一个与 OGG 相关的问题,那是什么问题呢?且听小哥我慢慢道来。说来也简单就是使用 OGG21c 在中间机配置好了远程捕获 EXTRACT 进程,将数据导入到目标端 19c 后,源端捕获进程有很大的延迟,按照这个延迟来看,本地的归档日志也都删除了,可这个捕获进程呢没有任何报错,延迟一直在增大。

代码语言:javascript复制
info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXT3        96:40:04      00:00:06

配置过程简单如下

代码语言:javascript复制
--重建前的步骤
dblogin USERIDALIAS source_11g
--先强行 kill 掉进程
kill EXT3
delete ext3
--从数据库注册里去掉 ext3
unregister extract ext3 database

--登录数据库
dblogin USERIDALIAS source_11g
--注册 ext3 到数据库中
register extract ext3 database

2022-11-07 15:25:33  INFO    OGG-02003  Extract group EXT3 successfully registered with database at SCN 13714764296.

--添加 ext3
ADD EXTRACT ext3 INTEGRATED TRANLOG, BEGIN NOW

--注册 ext3 trail 文件
add exttrail /home/oracle/ogg21c/dirdat/T4, extract ext3, megabytes 1024

--编辑 ext3 参数
edit params ext3 

EXTRACT ext3
USERIDALIAS source_11g
DDL INCLUDE MAPPED
DDLOPTIONS REPORT
TRANLOGOPTIONS EXCLUDETAG 99
TRANLOGOPTIONS NOUSENATIVEOBJSUPPORT
EXTTRAIL /home/oracle/ogg21c/dirdat/T4
TABLE OPS.*;
SEQUENCE OPS.*;

然后查看 ggserr 日志无明显报错。view report ext3 查看日志也没有明显的错误。

代码语言:javascript复制
ggserr.log 日志

2022-11-07T15:31:04.513 0800  INFO    OGG-06604  Oracle GoldenGate Capture for Oracle, ext3.prm:  Connected to database jieke, CPU info: CPU Count 4, CPU Core Count 4, CPU Socket Count 4.
2022-11-07T15:31:04.513 0800  INFO    OGG-06618  Oracle GoldenGate Capture for Oracle, ext3.prm:  Database jieke Platform: Linux x86 64-bit.
2022-11-07T15:31:04.723 0800  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): info all.
2022-11-07T15:31:05.546 0800  WARNING OGG-02045  Oracle GoldenGate Capture for Oracle, ext3.prm:  Database does not have streams_pool_size initialization parameter configured.
2022-11-07T15:31:05.708 0800  INFO    OGG-02248  Oracle GoldenGate Capture for Oracle, ext3.prm:  Logmining server DDL filtering enabled.
2022-11-07T15:31:08.306 0800  INFO    OGG-02339  Oracle GoldenGate Capture for Oracle, ext3.prm:  Integrated capture successfully attached to logmining server OGG$CAP_EXT3 using 60 second streaming duration.
2022-11-07T15:31:08.306 0800  INFO    OGG-02089  Oracle GoldenGate Capture for Oracle, ext3.prm:  Source redo compatibility version is: 11.2.0.4.0.
2022-11-07T15:31:08.306 0800  INFO    OGG-15446  Oracle GoldenGate Capture for Oracle, ext3.prm:  Extract configured as  resource group.
2022-11-07T15:31:08.306 0800  INFO    OGG-02086  Oracle GoldenGate Capture for Oracle, ext3.prm:  Integrated Dictionary will be used.
2022-11-07T15:31:08.449 0800  INFO    OGG-02710  Oracle GoldenGate Capture for Oracle, ext3.prm:  Database metadata information is obtained from source database.
2022-11-07T15:31:08.517 0800  INFO    OGG-02776  Oracle GoldenGate Capture for Oracle, ext3.prm:  Native data capture is enabled for Oracle NUMBER data type.
2022-11-07T15:31:08.523 0800  INFO    OGG-01971  Oracle GoldenGate Capture for Oracle, ext3.prm:  The previous message, 'INFO OGG-02776', repeated 1 times.
2022-11-07T15:31:08.523 0800  INFO    OGG-00993  Oracle GoldenGate Capture for Oracle, ext3.prm:  Extract group EXTRACT EXT3 started.
2022-11-07T15:31:08.524 0800  INFO    OGG-01052  Oracle GoldenGate Capture for Oracle, ext3.prm:  No recovery is required for target file /home/oracle/ogg21c/dirdat/T4000000000, at RBA 0 (file not opened).
2022-11-07T15:31:08.524 0800  INFO    OGG-01478  Oracle GoldenGate Capture for Oracle, ext3.prm:  Output file /home/oracle/ogg21c/dirdat/T4 is using format RELEASE 19.1/21.1.
2022-11-07T15:31:51.789 0800  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): info all.

view report ext3 的日志

2022-11-07 15:31:05  INFO    OGG-02248  Logmining server DDL filtering enabled.
2022-11-07 15:31:08  INFO    OGG-02339  Integrated capture successfully attached to logmining server OGG$CAP_EXT3 using 60 second streaming duration.
2022-11-07 15:31:08  INFO    OGG-02089  Source redo compatibility version is: 11.2.0.4.0.
2022-11-07 15:31:08  INFO    OGG-15446  Extract configured as  resource group.
2022-11-07 15:31:08  INFO    OGG-02086  Integrated Dictionary will be used.
2022-11-07 15:31:08  INFO    OGG-02710  Database metadata information is obtained from source database.
2022-11-07 15:31:08  INFO    OGG-02776  Native data capture is enabled for Oracle NUMBER data type.
2022-11-07 15:31:08  INFO    OGG-01971  The previous message, 'INFO OGG-02776', repeated 1 times.
2022-11-07 15:31:08  INFO    OGG-01052  No recovery is required for target file /home/oracle/ogg21c/dirdat/T4000000000, at RBA 0 (file not  opened).
2022-11-07 15:31:08  INFO    OGG-01478  Output file /home/oracle/ogg21c/dirdat/T4 is using format RELEASE 19.1/21.1.
***********************************************************************
**                     Run Time Messages                             **
***********************************************************************

--查看统计信息
stats ext3
Sending STATS request to Extract group EXT3 ...
Start of statistics at 2022-11-07 15:52:52.
DDL replication statistics (for all trails):

*** Total statistics since extract started     ***
        Operations                                         0.00
        Mapped operations                                  0.00
        Unmapped operations                                0.00
        Other operations                                   0.00
        Excluded operations                                0.00

Output to /home/oracle/ogg21c/dirdat/T4:

info ext3,showch

Extract    EXT3      Last Started 2022-11-07 15:31   Status RUNNING
Checkpoint Lag       00:28:35 (updated 00:00:06 ago)
Process ID           18534
Log Read Checkpoint  Oracle Integrated Redo Logs
                     2022-11-07 15:25:40
                     SCN 3.829909378 (13714811266)
Current Checkpoint Detail:
Read Checkpoint #1
  Oracle Integrated Redo Log

  Startup Checkpoint (starting position in the data source):
    Timestamp: 2022-11-07 15:25:40.000000
    SCN: 0.0 (0)

  Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
    Timestamp: 2022-11-07 15:25:40.000000
    SCN: 3.829909372 (13714811260)

  Current Checkpoint (position of last record read in the data source):
    Timestamp: 2022-11-07 15:25:40.000000
    SCN: 3.829909378 (13714811266)

send ext3 showtrans

Sending SHOWTRANS request to Extract group EXT3 ...
------------------------------------------------------------
XID:                  0.60.26.988124        
Items:                0        
Extract:              EXT3      
Redo Thread:          3      
Start Time:           2022-11-07:15:25:40  
SCN:                  3.829909394 (13714811282)         
Redo Seq:             35499
Redo RBA:             3025424             
Status:               Running             

--其他常用命令
info ext3,showch
info ext3,detail
send ext3 showtrans
send extract ext3 status
send ext3 showtrans  

没有任何报错,这便无从查起,便是很苦恼,删除进程重新添加观察一段时间之后也是这样的延迟,难道是因为数据库是 11g 的原因吗?但记得之前测试环境单机 11g 用的 OGG21c 也是可以的,那么是因为我这环境是 RAC 的问题吗?是因为远程复制吗?苦思不得其解,久久不能找到答案。

终于在昨天看到李老师公众号发布的文章因视图引起集成模式不同步的问题排查过程,于是跟着排查步骤一步一步检查自己的环境。

终于发现我们遇到了同样的问题,查询视图 DBA_GOLDENGATE_SUPPORT_MODE 很慢很慢,这个视图呢是记录的捕获进程对数据库中表的支持级别的信息。

代码语言:javascript复制
select * from DBA_GOLDENGATE_SUPPORT_MODE where support_mode in ('ID KEY', 'NONE') order by owner,object_name;

查看这个视图的定义

代码语言:javascript复制
set long 9999
SELECT DBMS_METADATA.GET_DDL('VIEW','DBA_GOLDENGATE_SUPPORT_MODE','SYS') DDL_SQL FROM DUAL;

CREATE OR REPLACE FORCE VIEW "SYS"."DBA_GOLDENGATE_SUPPORT_MODE" ("OWNER", "OBJECT_NAME", "SUPPORT_MODE") AS
select "OWNER","OBJECT_NAME","SUPPORT_MODE" from DBA_XSTREAM_OUT_SUPPORT_MODE;

SELECT * FROM dba_dependencies  where name='DBA_GOLDENGATE_SUPPORT_MODE';
代码语言:javascript复制
发现视图 dba_goldengate_support_mode 基于视图 DBA_XSTREAM_OUT_SUPPORT_MODE:

查看视图 DBA_XSTREAM_OUT_SUPPORT_MODE 的定义,比较复杂,由三个 union all 组成,把视图 dba_xstream_out_support_mode 的组成部分分开来查询,发现涉及到 dba_logstdby_unsupported_table 视图的几个部分查询会夯住,count(*) 73 条内容竟然需要 22 分钟多:

代码语言:javascript复制
SELECT DBMS_METADATA.GET_DDL('VIEW','DBA_XSTREAM_OUT_SUPPORT_MODE','SYS') DDL_SQL FROM DUAL;

select owner, table_name, 'FULL' from dba_logstdby_unsupported_table;

select owner, table_name, 'ID KEY' from dba_logstdby_unsupported_table
     where table_name not like 'AQ$_%' and (owner, table_name) not in
      (select owner, queue_table from dba_queue_tables) and (owner, table_name) not in
      (select owner, table_name from "_DBA_XSTREAM_OUT_ADT_PK_TABLES");
代码语言:javascript复制
DBA_LOGSTDBY_UNSUPPORTED_TABLE displays the data tables that are 
not supported by Logical Standby. 
该视图显示的是逻辑 DG 备库不支持的表。
https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/DBA_LOGSTDBY_UNSUPPORTED_TABLE.html#GUID-2CF061B2-E31B-48FB-8BE2-96FD64851B33

目前我的数据库是 四节点 RAC 11.2.0.4 补丁为 190416,那么到底是不是 bug 21281961 呢,死马当活马医咯,先试试吧,首先去 MOS 上下载所需要的对应平台和版本号的补丁。

代码语言:javascript复制
opatch lspatches
29141201;OCW Patch Set Update : 11.2.0.4.190416 (29141201)
29141056;Database Patch Set Update : 11.2.0.4.190416 (29141056)

下载补丁

Patch 21281961: DBA_LOGSTDBY_UNSUPPORTED VIEW IS UNACCEPTABLY SLOW

因我的数据库打的补丁是 190416 故下载这个补丁时需要找对应的补丁,没有对应的补丁要找小于这个版本的补丁,不能找高于这个版本的补丁。故这里下载 DATABASE PATCH SET UPDATE 11.2.0.4.180717 补丁。

安装补丁

代码语言:javascript复制
export PATH=${ORACLE_HOME}/bin:${ORACLE_HOME}/OPatch:${PATH}
unzip p21281961_11204180717_Generic.zip
1、首先确保 27734982 补丁已经安装
代码语言:javascript复制
opatch lsinv | grep 27734982
Sub-patch  27734982; "Database Patch Set Update : 11.2.0.4.180717 (27734982)"
代码语言:javascript复制
2、确保 Opatch 版本大于 11.2.0.3.5
代码语言:javascript复制
opatch version 
OPatch Version: 11.2.0.3.21
3、确保 $PATH 定义有以下可执行文件:make, ar, ld 和 nm。

这些可执行文件的位置取决于您的操作系统。在许多操作系统中,它们位于 /usr/ccs/bin 目录中。

4、补丁冲突检测

当 OPatch 启动时,它会验证补丁,并确保与 ORACLE_HOME 中已经安装的软件没有冲突。OPatch 将冲突分为以下类型: 与已应用到 ORACLE_HOME 的补丁冲突,该补丁是您试图应用的补丁的子集—在这种情况下,请继续安装补丁,因为新补丁包含 ORACLE_HOME 中现有补丁的所有补丁。在安装新补丁之前,将自动回滚子集补丁。

代码语言:javascript复制
cd 21281961
opatch prereq CheckConflictAgainstOHWithDetail -ph ./

Oracle Interim Patch Installer version 11.2.0.3.21
Copyright (c) 2022, Oracle Corporation.  All rights reserved.

PREREQ session

Oracle Home       : /u01/app/oracle/product/11.2.0/db_1
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.2.0/db_1/oraInst.loc
OPatch version    : 11.2.0.3.21
OUI version       : 11.2.0.4.0
Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2022-11-07_16-23-14PM_1.log

Invoking prereq "checkconflictagainstohwithdetail"

Prereq "checkConflictAgainstOHWithDetail" passed.

OPatch succeeded.
5、各个节点应用补丁

cd 21281961 opatch apply

代码语言:javascript复制
opatch apply
Oracle Interim Patch Installer version 11.2.0.3.21
Copyright (c) 2022, Oracle Corporation.  All rights reserved.


Oracle Home       : /u01/app/oracle/product/11.2.0/db_1
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/oracle/product/11.2.0/db_1/oraInst.loc
OPatch version    : 11.2.0.3.21
OUI version       : 11.2.0.4.0
Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2022-11-07_16-47-38PM_1.log

Verifying environment and performing prerequisite checks...
OPatch continues with these patches:   21281961  

Do you want to proceed? [y|n]
y
User Responded with: Y
All checks passed.
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: 

You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]:  Y


Backing up files...
Applying interim patch '21281961' to OH '/u01/app/oracle/product/11.2.0/db_1'

Patching component oracle.rdbms.dbscripts, 11.2.0.4.0...
Patch 21281961 successfully applied.
Log file location: /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2022-11-07_16-47-38PM_1.log

OPatch succeeded.

opatch lsinventory
6、执行 SQL 脚本
代码语言:javascript复制
$ sqlplus / AS SYSDBA
SQL> @?/sqlpatch/21281961/postinstall.sql
7、补丁回退(出现问题时需要回退)
代码语言:javascript复制
opatch rollback -id 21281961
$ sqlplus / AS SYSDBA
SQL> @?/sqlpatch/21281961/postdeinstall.sql

打完补丁,重启捕获进程 ext3

打完补丁,重启进程后,捕获进程一个多小时的延迟,几秒内瞬间就没有了。

代码语言:javascript复制
GGSCI (jiekedb as ogg@jieke2) 43> stop ext3

Sending STOP request to Extract group EXT3 ...
Request processed.

GGSCI (jiekedb as ogg@jieke2) 44> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     EXT3        01:25:21      00:00:03    
  

GGSCI (jiekedb as ogg@jieke2) 45> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     EXT3        01:25:21      00:00:28    


GGSCI (jiekedb as ogg@jieke2) 47> kill ext3

Sending KILL request to Manager ...
Terminated process (18534) for EXTRACT EXT3


GGSCI (jiekedb as ogg@jieke2) 48> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     STOPPED     EXT3        01:25:21      00:01:06    

GGSCI (jiekedb as ogg@jieke2) 49> exit

GGSCI (jiekedb) 2> dblogin USERIDALIAS source_11g
Successfully logged into database.

GGSCI (jiekedb as ogg@jieke1) 3> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     STOPPED     EXT3        01:25:21      00:01:40    

GGSCI (jiekedb as ogg@jieke1) 4> start ext3

Sending START request to Manager ...
Extract group EXT3 starting.


GGSCI (jiekedb as ogg@jieke1) 5> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     EXT3        01:25:21      00:01:49    

GGSCI (jiekedb as ogg@jieke1) 6> info ext3

Extract    EXT3      Last Started 2022-11-07 15:31   Status RUNNING
Checkpoint Lag       01:25:21 (updated 00:01:55 ago)
Process ID           30433
Log Read Checkpoint  Oracle Integrated Redo Logs
                     2022-11-07 15:25:40
                     SCN 3.829909402 (13714811290)


GGSCI (jiekedb as ogg@jieke1) 7> !
info ext3

Extract    EXT3      Last Started 2022-11-07 16:53   Status RUNNING
Checkpoint Lag       01:27:21 (updated 00:00:05 ago)
Process ID           30433
Log Read Checkpoint  Oracle Integrated Redo Logs
                     2022-11-07 15:25:40
                     SCN 3.829909402 (13714811290)


GGSCI (jiekedb as ogg@jieke1) 8> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     EXT3        00:18:43      00:00:00    
GGSCI (jiekedb as ogg@jieke1) 9> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     EXT3        00:00:04      00:00:05

查看这个补丁发现 cat ./21281961/files/sqlpatch/21281961/postinstall.sql 主要是执行了 catlsby.sql 脚本,那么在这个脚本中我们可以发现是有重建 dba_logstdby_unsupported_table 视图的,这个和小麦苗老师说的另一种方案一样可以直接重建视图快速解决问题,但是生产环境如果出现问题还是保守一些以打补丁为好。 exec :scriptFile := ‘?/rdbms/admin/catlsby.sql’;

代码语言:javascript复制
cat ./21281961/files/rdbms/admin/catlsby.sql | grep dba_logstdby_unsupported_table -A10
create or replace view dba_logstdby_unsupported_table
as
  select owner, name table_name 
  from (
    select u.owner, u.name, u.type#, u.obj#, u.current_sby, u.gensby 
    from logstdby_support_tab_10_1 u, 
         (select (CASE d.database_role
                  WHEN 'PRIMARY' THEN p.value
                  ELSE  nvl(
                   (select s.redo_compat 
                    from system.logstdby$parameters p, system.logmnr_session$ s ……
代码语言:javascript复制
打完补丁观察了两天也没有出现延迟问题,但是对于视图 DBA_GOLDENGATE_SUPPORT_MODE 

的查询是一点效果也没有,还是很慢很慢。在其他环境和 19c 环境查询很快就可以出现结果,为何还会这样呢真的是这个环境的问题?有感兴趣的小伙伴可以继续研究研究私信我,今天就先到这里咯。

代码语言:javascript复制
--其他正常 11g 环境
select count(*) from DBA_GOLDENGATE_SUPPORT_MODE where support_mode in ('ID KEY', 'NONE') order by owner,object_name;

  COUNT(*)
----------
46

Elapsed: 00:00:05.90

参考链接

代码语言:javascript复制
https://mp.weixin.qq.com/s/sch4-eEvWAcd_aIts2Qttg
https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/DBA_LOGSTDBY_UNSUPPORTED_TABLE.html#GUID-2CF061B2-E31B-48FB-8BE2-96FD64851B33
https://docs.oracle.com/en/database/oracle/oracle-database/19/refrn/DBA_GOLDENGATE_SUPPORT_

0 人点赞