goldengate classic extract在什么情况需要重建以及如何重建

2020-08-05 15:16:33 浏览数 (1)

【什么情况需要重建抽取进程】

1、对于RAC增加与删除节点后出现RAC线程与OGG线程不一致.

2、对于抽取进程所在的文件系统损坏或者被删除的情况,这个重建需要依赖应用进程或者传输进程,需要具体问题具体分析(如果文件损坏或误删除,大概率传输也会遭殃(具体情况具体分析),所以依赖应用进程比较靠谱)

【重建抽取进程--增加与删除节点大概步骤】

1、对于删除节点后,此线程对应日志必须extract完成,才能停止extract否则会丢失数据.

对于新加节点后,RAC线程与OGG线程不一致,可以停止extract进程.对于OGG停止之前,先判断数据库是否有长事务运行,如果有等待长事务结束.

对于单纯重建extract的,以下步骤也是适用的.与上面新加与删除节点原理都是一样的.对于非静态系统的执行2-9步骤,如果是静态系统的话,等待extract无延迟后,直接执行6,9的2个步骤,不存在更新checkpoint信息,但是exttrail或者rmttrail信息需要关注下(参考步骤7中)

2、停止EXTRACT--stop extract extractname,status extract extractname(前者是停止,后者是检查是否状态)

3、备份EXTRACT的dirchk目录下extractname.cpe文件,避免误操作可以恢复(extractname默认都是大写)

4、获取EXTRACT的checkpoint信息,用于后续重建--info extractname showch,info extractname detail

5、删除EXTRACT(可以先dblogin数据库,这样数据库checkpoint table中信息以及register进程被正常删除)--delete extract extractname

6、重新增加EXTRACT--add extract extractname,begin now,threads n(这个n表示总的线程数,跟thread n有区别,后者表示第几个线程)

7、更新EXTRACT的checkpoint信息,利用info extractname showch输出信息的,更新包括recovery checkpoint,current checkpoint以及增加exttrail

增加exttrail或者rmttrail信息(根据实际情况,这些是否存在)--add exttrail exttrailname,megabytes xxm,extract extractname(exttrailname可以是相对路径

或者绝对路径,各种好处,相对迁移之类简单,绝对路径管理容易些)

8、再次检查extract checkpoint信息--info extractname showch,info extractname detail

9、启动extract --start extract extractname

【重建抽取进程--增加与删除节点:以oracle为例子】

1、先停止extract进程

GGSCI> stop EXIAOXU

Sending STOP request to EXTRACT EXIAOXU ...

Request processed.

2、检查extract进程是否状态

GGSCI> info EXIAOXU

EXTRACT EXIAOXU Last Started 2018-02-05 23:48 Status STOPPED

Checkpoint Lag 00:00:00 (updated 00:03:28 ago)

3、获取extract EXIAOXU的checkpoint信息

备注:checkpoint信息包括:Recovery Checkpoint、Current Checkpoint以及Write Checkpoint三个信息,用于后续重建时使用

GGSCI> info EXIAOXU showch

EXTRACT EXIAOXU Last Started 2018-02-05 23:48 Status STOPPED

Checkpoint Lag 00:00:00 (updated 00:03:28 ago)

Log Read Checkpoint Oracle Redo Logs

2018-03-21 23:29:30 Thread 1, Seqno 19290, RBA 3958848

SCN 312.1419249828 (131449046180)

Log Read Checkpoint Oracle Redo Logs

2018-03-21 23:29:28 Thread 2, Seqno 21339, RBA 58046720

SCN 312.1419248878 (131449045230)

Current Checkpoint Detail:

Read Checkpoint #1

Oracle Threaded Redo Log

Startup Checkpoint (starting position in the data source):

Thread #: 1

Sequence #: 196024

RBA: 76904464

Timestamp: 2018-02-05 18:24:29.000000

SCN: 307.4007933433 (132562893305)

Redo File: FRA/xiaoxu/onlinelog/group5

Recovery Checkpoint (position of oldest unprocessed transaction in the data source):

Thread #: 1

Sequence #: 19290

RBA: 3958352

Timestamp: 2018-03-21 23:29:30.000000

SCN: 312.1419249828 (131449046180)

Redo File: FRA/xiaoxu/onlinelog/group6

Current Checkpoint (position of last record read in the data source):

Thread #: 1

Sequence #: 19290

RBA: 3958848

Timestamp: 2018-03-21 23:29:30.000000

SCN: 312.1419249828 (131449046180)

Redo File: FRA/xiaoxu/onlinelog/group6

BR Previous Recovery Checkpoint:

Thread #: 1

Sequence #: 0

RBA: 0

Timestamp: 2018-02-05 23:48:51.600282

SCN: Not available

Redo File:

BR Begin Recovery Checkpoint:

Thread #: 1

Sequence #: 19287

RBA: 18735824

Timestamp: 2018-03-21 20:13:00.000000

SCN: 312.1372881091 (131402677443)

Redo File:

BR End Recovery Checkpoint:

Thread #: 1

Sequence #: 19287

RBA: 18823360

Timestamp: 2018-03-21 20:13:01.000000

SCN: 312.1372881363 (131402677715)

Redo File:

Read Checkpoint #2

Oracle Threaded Redo Log

Startup Checkpoint (starting position in the data source):

Thread #: 2

Sequence #: 21542

RBA: 6912912

Timestamp: 2018-02-05 18:24:30.000000

SCN: 307.4007933465 (132562893337)

Redo File: FRA/xiaoxu/onlinelog/group8

Recovery Checkpoint (position of oldest unprocessed transaction in the data source):

Thread #: 2

Sequence #: 21339

RBA: 58008848

Timestamp: 2018-03-21 23:29:28.000000

SCN: 312.1419248878 (131449045230)

Redo File: FRA/xiaoxu/onlinelog/group9

Current Checkpoint (position of last record read in the data source):

Thread #: 2

Sequence #: 21339

RBA: 58046720

Timestamp: 2018-03-21 23:29:28.000000

SCN: 312.1419248878 (131449045230)

Redo File: FRA/xiaoxu/onlinelog/group9

BR Previous Recovery Checkpoint:

Thread #: 2

Sequence #: 0

RBA: 0

Timestamp: 2018-02-05 23:48:51.600282

SCN: Not available

Redo File:

BR Begin Recovery Checkpoint:

Thread #: 2

Sequence #: 21336

RBA: 23109392

Timestamp: 2018-03-21 20:13:01.000000

SCN: 312.1372881231 (131402677583)

Redo File:

BR End Recovery Checkpoint:

Thread #: 2

Sequence #: 21336

RBA: 23337728

Timestamp: 2018-03-21 20:13:02.000000

SCN: 312.1372881525 (131402677877)

Redo File:

Write Checkpoint #1

GGS Log Trail

Current Checkpoint (current write position):

Sequence #: 6715

RBA: 6717

Timestamp: 2018-03-21 23:29:30.937451

Extract Trail: /u01goldengate/112105/dirdat/tt

4、创建extract EXIAOXU

备注:创建抽取进程EXIAOXU,线程总数是2,从当前时间开始读取日志

add extract EXIAOXU,tranlog,begin now,threads 2

alter extract EXIAOXU,extseqno 19290,extrba 3958848 thread 1--update current checkpoint

alter extract EXIAOXU,ioextseqno 19290,ioextrba 3958352 thread 1--update recovery checkpoint,这一步不是必须的,如果停止没长有事务运行

alter extract EXIAOXU,extseqno 2139,extrba 5846720 thread 2--update current checkpoint

alter extract EXIAOXU,ioextseqno 2139,ioextrba 5808848 thread 2--update recovery checkpoint,这一步不是必须的,如果停止没长有事务运行

--以下是具体执行步骤

GGSCI > add extract EXIAOXU,tranlog,begin now,threads 2

EXTRACT added.

GGSCI > alter extract EXIAOXU,extseqno 19290,extrba 3958848 thread 1

EXTRACT altered.

GGSCI > alter extract EXIAOXU,ioextseqno 19290,ioextrba 3958352 thread 1

2018-03-21 23:40:19 INFO OGG-00988 WARNING: Unsupported operation. This might cause transactional inconsistency. Modifying input checkpoint #1, Oracle thread #1 of EXTRACT: ioseq = 19290 iorba = 3958352.

Are you sure you want to continue? y

EXTRACT altered.

GGSCI> alter extract EXIAOXU,extseqno 21339,extrba 58046720 thread 2

EXTRACT altered.

GGSCI> alter extract EXIAOXU,ioextseqno 21339,ioextrba 58008848 thread 2

2018-03-21 23:40:37 INFO OGG-00988 WARNING: Unsupported operation. This might cause transactional inconsistency. Modifying input checkpoint #2, Oracle thread #2 of EXTRACT: ioseq = 21339 iorba = 58008848.

Are you sure you want to continue? y

EXTRACT altered.

5、增加extract EXIAOXU的exttrail信息

备注--有bug,不能区分rmttrail和exttrail全部被删除了.

当没有pump时,手动指定exttrail的seqno是下一个时,replicat无法察觉到extract已经切换,此时replicat的seqno是6715,rba是1727,那么extract指定为相同即可,因为extract重新启动后触发切换到一个.这个也需要注意的.类似存在pump也是一样,需要由extract或者pump去触发下游去自动轮询到下一个trailfile,否则会出现饿死的情况.

add exttrail /u01/goldengate/112105/dirdat/tt,seqno 6715,rba 1727 extract EXIAOXU

GGSCI > add exttrail /u01/goldengate/112105/dirdat/tt,seqno 6715,rba 1727 extract EXIAOXU

EXTTRAIL added.

6、启动extract EXIAOXU

GGSCI > start EXIAOXU

7、检查extract EXIAOXU状态

GGSCI> info EXIAOXU

EXTRACT EXIAOXU Last Started 2018-03-05 11:48 Status RUNNING

Checkpoint Lag 00:00:00 (updated 00:03:28 ago)

【重建抽取进程--对于抽取进程所在的文件系统损坏或者被删除的情况】

--备注:对于OGG 11.2.1.0.5开始,replicat可以跳过已经应用的事务,所以源端extract将时间点往前推N个小时,不过还是会存在丢失事务的情况,例如N 1之前启动的事务还没有提交,第N-2小时又应用需要同步的表,则此时会出现丢失数据的情况.根据自己的系统情况来判断最长事务避免丢失事务的情况.

1、如果源端是相对静态,可以直接重新创建即可

2、如果源端归档日志存在(包括恢复),可以直接重建即可

3、如果源端无法恢复或者已删除,则需要重建抽取进程并重新初始化.

4、如果不考虑数据可以丢失的情况,如果可以丢失的话,重建抽取进程使用begin now.

【通过获取replicat端信息来重建抽取进程:以oracle为例】

1、正常停止ogg replicat进程,获取replicat的checkpoint信息

GGSCI> info RXIAOXU showch

REPLICAT RXIAOXU Last Started 2019-03-24 12:44 Status STOPPED

Checkpoint Lag 00:00:00 (updated 45:04:11 ago)

Log Read Checkpoint File /u01/goldengate/112106/dirdat/tt000013

2018-03-25 17:11:11.000000 RBA 3824279

Current Checkpoint Detail:

Current Checkpoint (position of last record read in the data source):

Sequence #: 13

RBA: 3824279

Timestamp: 2018-03-25 17:11:11.000000

Extract Trail: /u01/goldengate/112106/dirdat/tt

2、通过logdump获取详细信息

--获取事务提交时间,事务scn,记录rowid、DML的类型以及记录属于哪个归档日志等信息

Logdump 6567 >open /u01/goldengate/112106/dirdat/tt000013

Current LogTrail is /u01/goldengate/112106/dirdat/tt000013

Logdump 6568 >ghdr on

Logdump 6569 >detail data on

Logdump 6570 >ggstoken detail

Logdump 6571 >pos 45953765

Reading forward from RBA 45953765

Logdump 6572 >n

___________________________________________________________________

Hdr-Ind : E (x45) Partition : . (x04)

UndoFlag : . (x00) BeforeAfter: A (x41)

RecLength : 279 (x0117) IO Time : 2018/03/20 10:33:18.000.000 --这个事务提交时间

IOType : 5 (x05) OrigNode : 255 (xff)

TransInd : . (x03) FormatType : R (x52)

SyskeyLen : 0 (x00) Incomplete : . (x00)

AuditRBA : 140545 AuditPos : 3628567568

Continued : N (x00) RecCount : 1 (x01)

--TransInd来表事务的开始,中间位置,结尾还是单条记录.

TransInd : . (x00) 表示记录开始位

TransInd : . (x01) 表示记录中间位置

TransInd : . (x02) 表示记录结尾位置

TransInd : . (x03) 表示这个事务只有1条记录

2018/03/20 10:33:18.000.000 Insert Len 279 RBA 45953765

Name: XIAOXU.XIAOXU

--对XIAOXU用户下的XIAOXU表进行插入数据且当前事务只有一条记录.

Column 0 (x0000), Len 40 (x0028)

0000 0024 3764 6630 3232 3365 2d33 3932 632d 3438 | ...$7df0223e-392c-48

3838 2d38 3735 362d 6331 3439 3935 3539 6264 6265 | 88-8756-c1499559bdbe

Column 1 (x0001), Len 40 (x0028)

0000 0024 6665 3138 6437 6437 2d36 3535 372d 3462 | ...$fe18d7d7-6557-4b

3565 2d38 3964 612d 6639 6266 3165 6533 6266 3031 | 5e-89da-f9bf1ee3bf01

Column 2 (x0002), Len 3 (x0003)

0000 59 | ..Y

GGS tokens:

TokenID x52 'R' ORAROWID Info x00 Length 20

4141 4f4b 3651 4143 3741 4149 495a 7141 4142 0001 | AAOK6QAC7AAIIZqCCB..--这个是记录的rowid,之前遇到遇到有的软件复制时候通过rowid来更新数据,这个最快,但是如果源或者目标表的记录rowid改变就异常了.

TokenID x4c 'L' LOGCSN Info x00 Length 13

3133 3430 3739 3237 3834 3236 39 | 1340792784269 --这个是源端数据库事务的scn

TokenID x36 '6' TRANID Info x00 Length 14

3334 3332 2e32 2e31 3330 3632 3939 | 3432.2.1306299

3、通过上面信息来重建extract

--本次以IO Time:2018/03/20 10:33:18.000.000来重建源端的extract,往前推2小时,目标replicat通过aftercsn来跳过重复事务.

--如果goldengate软件丢失,需要预先安装满足数据库版本的软件

本次跳过安装.

--重建extract

add extract exiaoxu,tranlog,begin 2018/03/20 08:33:18.000.000,threads 2

add exttrail /u01/goldengate/112106/dirdat/tt,extract exiaoxu

--重建pump

add extract pxiaoxu,exttrailsource /u01/goldengate/112106/dirdat/tt

add rmttrail /u01/goldengate/112106/dirdat/tt,seqno 14,rba 0

--replicat

alter replicat rxiaoxu,etrollover

alter replicat rxiaoxu,aftercsn 1340792784269

至此完成2种下重建extract,欢迎大家多交流和评论,如果觉得不错,可以转发,顺手关注udapp.谢谢。

0 人点赞