Overview
The previous chapter covered the basic structure of the undo file. We learned that an undo file is laid out just like an ibd file, except that the index pages are replaced by page types 21 (FIL_PAGE_TYPE_RSEG_ARRAY), 6 (FIL_PAGE_TYPE_SYS) and 2 (FIL_PAGE_UNDO_LOG), so this chapter focuses mainly on these three page types.
undo_001
These three page types correspond to the rollback segment, the undo log segment and the undo log respectively. The names are rather confusing, and undo_001 itself is an undo tablespace, which makes it even more confusing.
An undo log segment is a collection of undo logs. Undo log segments exist within rollback segments. An undo log segment might contain undo logs from multiple transactions. An undo log segment can only be used by one transaction at a time but can be reused after it is released at transaction commit or rollback.
An undo tablespace contains undo logs. Undo logs exist within undo log segments, which are contained within rollback segments
Each undo file has 128 rollback segments (tracked by the FIL_PAGE_TYPE_RSEG_ARRAY page). Each rollback segment has 1024 undo segment slots (tracked by a FIL_PAGE_TYPE_SYS page), and each undo segment consists of a number of undo logs (stored in FIL_PAGE_UNDO_LOG pages). Since an undo segment can be owned by only one transaction at a time, the maximum number of concurrent transactions per undo tablespace is 1*128*1024 = 131072.
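To see these page types for yourself, here is a minimal sketch (my addition; it assumes the default 16KB page size and reuses the file path from the snippets below) that counts how many pages of each FIL_PAGE_TYPE an undo file contains:

import struct
from collections import Counter
filename = '/data/mysql_3314/mysqldata/undo_001'
page_types = Counter()
with open(filename, 'rb') as f:
    while True:
        page = f.read(16384)
        if len(page) < 16384:
            break
        # FIL_PAGE_TYPE is the 2-byte big-endian field at offset 24 of the FIL header
        page_types[struct.unpack('>H', page[24:26])[0]] += 1
print(page_types)  # 21 = RSEG_ARRAY, 6 = SYS, 2 = UNDO_LOG (plus FSP_HDR, INODE, etc.)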
FIL_PAGE_TYPE_RSEG_ARRAY
Let's first look at the page that stores the rollback segment array.
/** The RSEG ARRAY base version is a number derived from the string
'RSEG' [0x 52 53 45 47] for extra validation. Each new version
increments the base version by 1. */
constexpr uint32_t RSEG_ARRAY_VERSION = 0x52534547 + 1;
/** The RSEG ARRAY version offset in the header. */
constexpr uint32_t RSEG_ARRAY_VERSION_OFFSET = 0;
/** The current number of rollback segments being tracked in this array */
constexpr uint32_t RSEG_ARRAY_SIZE_OFFSET = 4;
/** This is the pointer to the file segment inode that tracks this
rseg array page. */
constexpr uint32_t RSEG_ARRAY_FSEG_HEADER_OFFSET = 8;
/* Slot size of the array of rollback segment header page numbers */
constexpr uint32_t RSEG_ARRAY_SLOT_SIZE = 4;
Each undo tablespace and the global temporary tablespace individually support a maximum of 128 rollback segments. The innodb_rollback_segments variable defines the number of rollback segments.
The system variable innodb_rollback_segments defines the number of rollback segments.
Here is a diagram to make things easier to follow.
What we mainly care about is RSEG_ARRAY_SLOT; each slot holds the PAGE_ID of a rollback segment header page. The following code quickly extracts the rollback segments' page IDs:
import struct
filename = '/data/mysql_3314/mysqldata/undo_001'
with open(filename,'rb') as f:
status = f.seek(3*16384,0)
data = f.read(16384)
slot_list = struct.unpack('>128L',data[56:56 + 4*128])
for x in range(128):
print(slot_list[x], end='\n' if (x + 1) % 16 == 0 else '\t' )
Combined with the previous chapter, we know that these page IDs all point to FIL_PAGE_TYPE_SYS pages (rollback segments).
page_id=88 is the inode page.
FIL_PAGE_TYPE_SYS
This page type is the rollback segment header; it records information about the undo segments.
/* Transaction rollback segment header */
/*-------------------------------------------------------------*/
/** Maximum allowed size for rollback segment in pages */
constexpr uint32_t TRX_RSEG_MAX_SIZE = 0;
/** Number of file pages occupied by the logs in the history list */
constexpr uint32_t TRX_RSEG_HISTORY_SIZE = 4;
/* The update undo logs for committed transactions */
constexpr uint32_t TRX_RSEG_HISTORY = 8;
/* Header for the file segment where this page is placed */
constexpr uint32_t TRX_RSEG_FSEG_HEADER = 8 + FLST_BASE_NODE_SIZE;
The number of slots is defined as follows:
/** Number of undo log slots in a rollback segment file copy */
#define TRX_RSEG_N_SLOTS (UNIV_PAGE_SIZE / 16)
/** Maximum number of transactions supported by a single rollback segment */
#define TRX_RSEG_MAX_N_TRXS (TRX_RSEG_N_SLOTS / 2)
We can use the following Python code to parse a given FIL_PAGE_TYPE_SYS page:
import struct
filename = '/data/mysql_3314/mysqldata/undo_001'
with open(filename,'rb') as f:
status = f.seek(4*16384,0) # only parse page number 4
data = f.read(16384)
slot_list = struct.unpack('>1024L',data[72:72 + 4*1024])
for x in range(1024):
print(slot_list[x], end='\n' if (x + 1) % 16 == 0 else '\t' )
We can see that many of the undo segment slots are empty; after all, my local environment has hardly any traffic, and this is just one of the rollback segments.
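If you only care about the slots that are actually in use, just filter out the empty marker FIL_NULL (0xFFFFFFFF); a small sketch of mine, continuing from the snippet above:

FIL_NULL = 0xFFFFFFFF  # value of an unused undo slot
used_slots = [s for s in slot_list if s != FIL_NULL]
print(len(used_slots), 'of 1024 undo slots in use:', used_slots)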
FIL_PAGE_UNDO_LOG
Undo log pages store the actual rollback data. The first page (the one recorded in the rollback segment slot) is special: it also stores some extra information (the undo segment header).
Wow, not counting the fil header, there are already three kinds of headers (the figure shows them starting with TRX_UNDO_PAGE_HDR).
Let's take a quick look at each of them.
undo log page header
Every undo log page has this header; it records basic information such as the type of the undo page.
/** The offset of the undo log page header on pages of the undo log */
constexpr uint32_t TRX_UNDO_PAGE_HDR = FSEG_PAGE_DATA;
/*-------------------------------------------------------------*/
/** Transaction undo log page header offsets */
/** @{ */
/** TRX_UNDO_INSERT or TRX_UNDO_UPDATE */
constexpr uint32_t TRX_UNDO_PAGE_TYPE = 0;
/** Byte offset where the undo log records for the LATEST transaction start on
this page (remember that in an update undo log, the first page can contain
several undo logs) */
constexpr uint32_t TRX_UNDO_PAGE_START = 2;
/** On each page of the undo log this field contains the byte offset of the
first free byte on the page */
constexpr uint32_t TRX_UNDO_PAGE_FREE = 4;
/** The file list node in the chain of undo log pages */
constexpr uint32_t TRX_UNDO_PAGE_NODE = 6;
/*-------------------------------------------------------------*/
/** Size of the transaction undo log page header, in bytes */
constexpr uint32_t TRX_UNDO_PAGE_HDR_SIZE = 6 + FLST_NODE_SIZE;
Here page start is the byte offset where the undo log records of the latest transaction begin on this page (which again touches on undo reuse..). page_free is the offset of the first free byte on the page, and can usually be treated as the end of the undo log records.
The FLST_NODE structure mentioned above consists of PRE_PAGE_ID(4), PRE_OFFSET(2), NEXT_PAGE_ID(4), NEXT_OFFSET(2); it came up before when discussing ibd2sql. FLST_BASE_NODE_SIZE is 4 bytes larger: a 4-byte list LENGTH, followed by the FIRST and LAST node addresses (page, offset).
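For reference, both list structures can be unpacked with struct like this (a small sketch of mine; page is a 16KB page and pos is wherever the node starts):

import struct
def read_flst_node(page, pos):
    # FLST_NODE: PREV_PAGE(4), PREV_OFFSET(2), NEXT_PAGE(4), NEXT_OFFSET(2) -> 12 bytes
    return struct.unpack('>LHLH', page[pos:pos + 12])
def read_flst_base_node(page, pos):
    # FLST_BASE_NODE: LENGTH(4), FIRST_PAGE(4), FIRST_OFFSET(2), LAST_PAGE(4), LAST_OFFSET(2) -> 16 bytes
    return struct.unpack('>LLHLH', page[pos:pos + 16])
# e.g. the TRX_UNDO_PAGE_NODE of an undo log page sits at offset 38 + 6 = 44: read_flst_node(page, 44)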
undo log segment header
Only the first page of an undo log segment has this header.
/** The offset of the undo log segment header on the first page of the undo
log segment */
/** TRX_UNDO_ACTIVE, ... */
constexpr uint32_t TRX_UNDO_STATE = 0;
/** Offset of the last undo log header on the segment header page, 0 if none */
constexpr uint32_t TRX_UNDO_LAST_LOG = 2;
/** Header for the file segment which the undo log segment occupies */
constexpr uint32_t TRX_UNDO_FSEG_HEADER = 4;
/** Base node for the list of pages in the undo log segment; defined only on the
undo log segment's first page */
constexpr uint32_t TRX_UNDO_PAGE_LIST = 4 + FSEG_HEADER_SIZE;
These fields are transaction related; for example, TRX_UNDO_STATE can be TRX_UNDO_ACTIVE, TRX_UNDO_CACHED, TRX_UNDO_TO_FREE, TRX_UNDO_TO_PURGE or TRX_UNDO_PREPARED.
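A minimal sketch of reading these fields from the first page of an undo log segment (the page number 366 is purely a hypothetical example; substitute one of the non-empty slot values printed earlier):

import struct
filename = '/data/mysql_3314/mysqldata/undo_001'
undo_page_no = 366  # hypothetical: pick a non-FIL_NULL slot from a FIL_PAGE_TYPE_SYS page
with open(filename, 'rb') as f:
    f.seek(undo_page_no * 16384, 0)
    page = f.read(16384)
# 38-byte FIL header + 18-byte undo page header => the segment header starts at offset 56
trx_undo_state, trx_undo_last_log = struct.unpack('>HH', page[56:60])
trx_undo_fseg_header = struct.unpack('>LLH', page[60:70])   # space_id, page_no, byte offset
trx_undo_page_list = struct.unpack('>LLHLH', page[70:86])   # base node of the segment's page list
print(trx_undo_state, trx_undo_last_log, trx_undo_fseg_header, trx_undo_page_list)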
undo log header
This is the actual undo log header, again holding transaction information. I won't go into the details here (see storage/innobase/include/trx0undo.h).
This time we are parsing the undo file by structure, not by transaction.
undo log
After all this, we still haven't reached the undo log records themselves -_-
Undo log records fall into two big categories: insert (TRX_UNDO_INSERT_REC) and update.
TRX_UNDO_INSERT_REC
The insert case is simple: only the primary key needs to be recorded (if there is no primary key, the row_id is used). The structure is as follows:
(hands on hips.jpg)
But computing the individual values is rather involved...
In undo_log_type_flag, bits 0-3 hold the record type, bits 4-5 are flags, bit 6 indicates whether a LOB is involved, and bit 7 indicates extra external storage.
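A small sketch of splitting that first byte (the sample value 0x0c is hypothetical; 12 = TRX_UNDO_UPD_EXIST_REC with no extra flags set):

undo_log_type_flag = 0x0c  # hypothetical sample of the record's first byte
undo_log_type = undo_log_type_flag & 0x0F         # bits 0-3: 11=INSERT_REC, 12=UPD_EXIST_REC, 13=UPD_DEL_REC, 14=DEL_MARK_REC
undo_log_flag = (undo_log_type_flag >> 4) & 0x03  # bits 4-5: flags such as TRX_UNDO_CMPL_INFO_MULT
has_lob = bool(undo_log_type_flag & 0x40)         # bit 6: TRX_UNDO_MODIFY_BLOB
has_extern = bool(undo_log_type_flag & 0x80)      # bit 7: TRX_UNDO_UPD_EXTERN
print(undo_log_type, undo_log_flag, has_lob, has_extern)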
The storage of undo no and table id is even more involved.
static inline uint32_t mach_read_next_compressed(const byte **b) {
ulint val = mach_read_from_1(*b);
if (val < 0x80) {
/* 0nnnnnnn (7 bits) */
++*b;
} else if (val < 0xC0) {
/* 10nnnnnn nnnnnnnn (14 bits) */
val = mach_read_from_2(*b) & 0x3FFF;
ut_ad(val > 0x7F);
*b += 2;
} else if (val < 0xE0) {
/* 110nnnnn nnnnnnnn nnnnnnnn (21 bits) */
val = mach_read_from_3(*b) & 0x1FFFFF;
ut_ad(val > 0x3FFF);
*b += 3;
} else if (val < 0xF0) {
/* 1110nnnn nnnnnnnn nnnnnnnn nnnnnnnn (28 bits) */
val = mach_read_from_4(*b) & 0xFFFFFFF;
ut_ad(val > 0x1FFFFF);
*b += 4;
} else if (val < 0xF8) {
/* 11110000 nnnnnnnn nnnnnnnn nnnnnnnn nnnnnnnn (32 bits) */
ut_ad(val == 0xF0);
val = mach_read_from_4(*b + 1);
/* this can treat not-extended format also. */
ut_ad(val > 0xFFFFFFF);
*b += 5;
} else if (val < 0xFC) {
/* 111110nn nnnnnnnn (10 bits) (extended) */
val = (mach_read_from_2(*b) & 0x3FF) | 0xFFFFFC00;
*b += 2;
} else if (val < 0xFE) {
/* 1111110n nnnnnnnn nnnnnnnn (17 bits) (extended) */
val = (mach_read_from_3(*b) & 0x1FFFF) | 0xFFFE0000;
ut_ad(val < 0xFFFFFC00);
*b += 3;
} else {
/* 11111110 nnnnnnnn nnnnnnnn nnnnnnnn (24 bits) (extended) */
ut_ad(val == 0xFE);
val = mach_read_from_3(*b + 1) | 0xFF000000;
ut_ad(val < 0xFFFE0000);
*b += 4;
}
return (static_cast<uint32_t>(val));
}
static inline uint64_t mach_read_next_much_compressed(const byte **b) {
uint64_t val = mach_read_from_1(*b);
if (val < 0x80) {
/* 0nnnnnnn (7 bits) */
++*b;
} else if (val < 0xC0) {
/* 10nnnnnn nnnnnnnn (14 bits) */
val = mach_read_from_2(*b) & 0x3FFF;
ut_ad(val > 0x7F);
*b += 2;
} else if (val < 0xE0) {
/* 110nnnnn nnnnnnnn nnnnnnnn (21 bits) */
val = mach_read_from_3(*b) & 0x1FFFFF;
ut_ad(val > 0x3FFF);
*b += 3;
} else if (val < 0xF0) {
/* 1110nnnn nnnnnnnn nnnnnnnn nnnnnnnn (28 bits) */
val = mach_read_from_4(*b) & 0xFFFFFFF;
ut_ad(val > 0x1FFFFF);
*b += 4;
} else if (val < 0xF8) {
/* 11110000 nnnnnnnn nnnnnnnn nnnnnnnn nnnnnnnn (32 bits) */
ut_ad(val == 0xF0);
val = mach_read_from_4(*b + 1);
/* this can treat not-extended format also. */
ut_ad(val > 0xFFFFFFF);
*b += 5;
} else if (val < 0xFC) {
/* 111110nn nnnnnnnn (10 bits) (extended) */
val = (mach_read_from_2(*b) & 0x3FF) | 0xFFFFFC00;
*b += 2;
} else if (val < 0xFE) {
/* 1111110n nnnnnnnn nnnnnnnn (17 bits) (extended) */
val = (mach_read_from_3(*b) & 0x1FFFF) | 0xFFFE0000;
ut_ad(val < 0xFFFFFC00);
*b += 3;
} else if (val == 0xFE) {
/* 11111110 nnnnnnnn nnnnnnnn nnnnnnnn (24 bits) (extended) */
ut_ad(val == 0xFE);
val = mach_read_from_3(*b + 1) | 0xFF000000;
ut_ad(val < 0xFFFE0000);
*b += 4;
} else {
/* 11111111 followed by up to 80 bits */
ut_ad(val == 0xFF);
++*b;
val = mach_read_next_compressed(b);
ut_ad(val > 0);
val <<= 32;
val |= mach_read_next_compressed(b);
}
return (val);
}
That's fine; we don't need to understand why it is stored this way (it is basically all about saving space), we can simply rewrite it in Python.
def mach_read_next_compressed(tdata):
val = struct.unpack('>B',tdata.readn(1))[0]
if val < 0x80:
val = struct.unpack('>B',tdata.read(1))[0]
elif val < 0xC0:
val = struct.unpack('>H',tdata.read(2))[0] & 0x3FFF
elif val < 0xE0:
val2,val1 = struct.unpack('>HB',tdata.read(3))
val = ((val2<<8) + val1) & 0x1FFFFF
elif val < 0xF0:
val = struct.unpack('>L',tdata.read(4))[0] & 0xFFFFFFF
elif val < 0xF8:
tdata.read(1)
val = struct.unpack('>L',tdata.read(4))[0]
elif val < 0xFC:
val = (struct.unpack('>H',tdata.read(2))[0] & 0x3FF) | 0xFFFFFC00
elif val < 0xFE:
val2,val1 = struct.unpack('>HB',tdata.read(3))
valt = (val2<<8) + val1
val = (valt & 0x1FFFF) | 0xFFFE0000;
else:
val = (struct.unpack('>L',tdata.read(4))[0] & 0xFFFFFF) | 0xFF000000 # keep the 3 bytes after the 0xFE marker
return val
# fancy variable-length encoding
def mach_read_next_much_compressed(tdata):
val = struct.unpack('>B',tdata.readn(1))[0]
if val < 0x80:
#pass
val = struct.unpack('>B',tdata.read(1))[0]
elif val < 0xC0:
val = struct.unpack('>H',tdata.read(2))[0] & 0x3FFF
elif val < 0xE0:
val2,val1 = struct.unpack('>HB',tdata.read(3))
val = ((val2<<8) + val1) & 0x1FFFFF
elif val < 0xF0:
val = struct.unpack('>L',tdata.read(4))[0] & 0xFFFFFFF
elif val < 0xF8:
tdata.read(1)
val = struct.unpack('>L',tdata.read(4))[0]
elif val < 0xFC:
val = (struct.unpack('>H',tdata.read(2))[0] & 0x3FF) | 0xFFFFFC00
elif val < 0xFE:
val2,val1 = struct.unpack('>HB',tdata.read(3))
valt = (val2<<8) + val1
val = (valt & 0x1FFFF) | 0xFFFE0000;
elif val == 0xFE:
val = (struct.unpack('>L',tdata.read(4))[0] & 0xFFFFFF) | 0xFF000000 # keep the 3 bytes after the 0xFE marker
else:
tdata.read(1)
val = mach_read_next_much_compressed(tdata)
val <<= 32
val |= mach_read_next_compressed(tdata)
return val
Perhaps that is the charm of Python.
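As a quick sanity check of the Python port (my addition, using a minimal stand-in for the bdata_buffer class from the appendix):

import struct
class _buf(object):
    def __init__(self, b):
        self.bdata, self.offset = b, 0
    def read(self, n):
        d = self.bdata[self.offset:self.offset + n]
        self.offset += n
        return d
    def readn(self, n):  # peek without consuming
        return self.bdata[self.offset:self.offset + n]
print(mach_read_next_compressed(_buf(b'\x03')))      # 0nnnnnnn          -> 3
print(mach_read_next_compressed(_buf(b'\x81\x23')))  # 10nnnnnn nnnnnnnn -> 0x123 = 291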
TRX_UNDO_UPD_XXX_REC
Update records carry a bit more transaction information than insert records.
We won't dig into the details here; it is too complex. If you are interested, look at the MySQL source: storage/innobase/trx/trx0rec.cc
Verification
Let's verify all of this with some code.
We explicitly open a transaction and do not commit it yet.
Then we use ibd2sql's debug feature to get the rollback segment information (roll pointer) of the corresponding data row.
Then we use this rollptr to parse the actual undo file.
Then we parse the whole undo file and filter for the specified page.
The parsed result shows an update in TRX_UNDO_ACTIVE state, with table_id 19151. Let's verify that in the database.
The query confirms it is indeed the table we updated. Now let's look at the data.
The parsed result is shown below. (We don't have the table metadata, so we can't actually decode the concrete column values; that's why this tool probably won't be updated any further. It isn't all that useful.)
\x80\x00\x00\x02    after parsing enough InnoDB you can tell at a glance that this is 2
\x02bb              bb
\x02bb              bb
\x02bb              bb
\x05newdd           newdd
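For the integer primary key, the leading \x80 is just InnoDB flipping the sign bit of a signed INT so that values sort correctly as bytes; a quick check (my addition):

import struct
stored = b'\x80\x00\x00\x02'  # key bytes from the undo record above
value = struct.unpack('>i', bytes([stored[0] ^ 0x80]) + stored[1:])[0]  # flip the sign bit back
print(value)  # 2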
Then we open another session to verify the data:
It matches. Getting this far does bring a small sense of accomplishment, though not much.
Summary
MySQL uses undo on its own, and even if an undo tablespace has problems you can usually just re-initialize it, so undo generally doesn't get much attention. I won't parse undo any further after this. Note that a long-running uncommitted transaction can make the undo files grow abnormally large (I once ran into one that had reached several hundred GB). Below is the complete undo file structure diagram.
References: https://github.com/mysql/mysql-server
https://dev.mysql.com/doc/refman/8.0
Appendix: source code
GitHub: https://github.com/ddcw/ddcw/blob/master/python/undo_reader
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# write by ddcw @https://github.com/ddcw
# parse undo logs
import argparse
import struct
import os,sys
class bdata_buffer(object):
def __init__(self,bdata):
self.bdata = bdata
self.offset = 0
def read(self,n):
data = self.bdata[self.offset:self.offset + n]
self.offset += n
return data
def readn(self,n):
return self.bdata[self.offset:self.offset + n]
def mach_read_next_compressed(tdata):
val = struct.unpack('>B',tdata.readn(1))[0]
if val < 0x80:
val = struct.unpack('>B',tdata.read(1))[0]
elif val < 0xC0:
val = struct.unpack('>H',tdata.read(2))[0] & 0x3FFF
elif val < 0xE0:
val2,val1 = struct.unpack('>HB',tdata.read(3))
val = ((val2<<8) + val1) & 0x1FFFFF
elif val < 0xF0:
val = struct.unpack('>L',tdata.read(4))[0] & 0xFFFFFFF
elif val < 0xF8:
tdata.read(1)
val = struct.unpack('>L',tdata.read(4))[0]
elif val < 0xFC:
val = (struct.unpack('>H',tdata.read(2))[0] & 0x3FF) | 0xFFFFFC00
elif val < 0xFE:
val2,val1 = struct.unpack('>HB',tdata.read(3))
valt = (val2<<8) + val1
val = (valt & 0x1FFFF) | 0xFFFE0000;
else:
val = (struct.unpack('>L',tdata.read(4))[0] & 0xFFFFFF) | 0xFF000000 # keep the 3 bytes after the 0xFE marker
return val
# fancy variable-length encoding
def mach_read_next_much_compressed(tdata):
val = struct.unpack('>B',tdata.readn(1))[0]
if val < 0x80:
#pass
val = struct.unpack('>B',tdata.read(1))[0]
elif val < 0xC0:
val = struct.unpack('>H',tdata.read(2))[0] & 0x3FFF
elif val < 0xE0:
val2,val1 = struct.unpack('>HB',tdata.read(3))
val = ((val2<<8) + val1) & 0x1FFFFF
elif val < 0xF0:
val = struct.unpack('>L',tdata.read(4))[0] & 0xFFFFFFF
elif val < 0xF8:
tdata.read(1)
val = struct.unpack('>L',tdata.read(4))[0]
elif val < 0xFC:
val = (struct.unpack('>H',tdata.read(2))[0] & 0x3FF) | 0xFFFFFC00
elif val < 0xFE:
val2,val1 = struct.unpack('>HB',tdata.read(3))
valt = (val2<<8) + val1
val = (valt & 0x1FFFF) | 0xFFFE0000;
elif val == 0xFE:
val = (struct.unpack('>L',tdata.read(4))[0] & 0xFFFFFF) | 0xFF000000 # keep the 3 bytes after the 0xFE marker
else:
tdata.read(1)
val = mach_read_next_much_compressed(tdata)
val <<= 32
val |= mach_read_next_compressed(tdata)
return val
UNDO_LOG_SEGMENT_STAT = [
None,
'TRX_UNDO_ACTIVE',
'TRX_UNDO_CACHED',
'TRX_UNDO_TO_FREE',
'TRX_UNDO_TO_PURGE',
'TRX_UNDO_PREPARED_80028',
'TRX_UNDO_PREPARED',
'TRX_UNDO_PREPARED_IN_TC'
]
UNDO_LOG_SEGEMT_TYPE = [
None,
'TRX_UNDO_INSERT',
'TRX_UNDO_UPDATE' # covers update and delete
]
UNDO_LOG_TYPE = {11:'TRX_UNDO_INSERT_REC',12:'TRX_UNDO_UPD_EXIST_REC',13:'TRX_UNDO_UPD_DEL_REC',14:'TRX_UNDO_DEL_MARK_REC',16:'TRX_UNDO_CMPL_INFO_MULT',64:'TRX_UNDO_MODIFY_BLOB',128:'TRX_UNDO_UPD_EXTERN'}
UNDO_LOG_FLAG = {1:'TRX_UNDO_INSERT_OP',2:'TRX_UNDO_MODIFY_OP'}
def undo_log_parse(bdata,f):
print(struct.unpack('>B',bdata[2:3]))
def inode_page(bdata):
return # skip printing, there is too much output
data = bdata_buffer(bdata)
FIL_PAGE_SPACE_OR_CHKSUM, FIL_PAGE_OFFSET, FIL_PAGE_PREV, FIL_PAGE_NEXT, FIL_PAGE_LSN, FIL_PAGE_TYPE, FIL_PAGE_FILE_FLUSH_LSN, FIL_PAGE_SPACE_ID = struct.unpack('>4LQHQL',data.read(38))
print(f"[INODE]PAGE TYPE:{FIL_PAGE_TYPE}")
INODE_PRE_PAGE,INODE_PRE_OFFSET,INODE_NEXT_PAGE,INODE_NEXT_OFFSET = struct.unpack('>LHLH',data.read(12))
print(f"[INODE] PRE:{INODE_PRE_PAGE}:{INODE_PRE_OFFSET} NEXT:{INODE_NEXT_PAGE}:{INODE_NEXT_OFFSET}")
FLST_BASE_NODE = "LLHLH" # LENGTH,FIRST_PAGE,FIRST_OFFSET,LAST_PAGE,LAST_OFFSET
for _ in range(85):
FSEG_ID = struct.unpack('>Q',data.read(8))
FSEG_NOT_FULL_N_USED = struct.unpack('>L',data.read(4))
FSEG_FREE = struct.unpack(f'>{FLST_BASE_NODE}',data.read(16))
FSEG_NOT_FULL = struct.unpack(f'>{FLST_BASE_NODE}',data.read(16))
FSEG_FULL = struct.unpack(f'>{FLST_BASE_NODE}',data.read(16))
FSEG_MAGIC = struct.unpack('>L',data.read(4))
FSEG_FRAG_ARR = struct.unpack('>32L',data.read(32*4))
print(f"[INODE] FSEG_ID:{FSEG_ID}")
print(f"[INODE] FSEG_NOT_FULL_N_USED:{FSEG_NOT_FULL_N_USED}")
print(f"[INODE] FSEG_FREE:{FSEG_FREE}")
print(f"[INODE] FSEG_NOT_FULL:{FSEG_NOT_FULL}")
print(f"[INODE] FSEG_FULL:{FSEG_FULL}")
print(f"[INODE] FSEG_MAGIC:{FSEG_MAGIC} (? 97937874)")
print(f"[INODE] FSEG_FRAG_ARR:{FSEG_FRAG_ARR}")
def rseg_array_page(bdata):
#data = bdata_buffer(bdata)
array_version,array_size = struct.unpack('>2L',bdata[38:46])
array_fseg_header = struct.unpack('>LLH',bdata[46:56]) # space_id,page_id,page_offset
array_slot = struct.unpack('>128L',bdata[56:56 + 128*4])
#print(array_slot)
print("[RSEG_ARRAY] array_fseg_header:",array_fseg_header)
print("[RSEG_ARRAY] array_slot:",array_slot)
return array_slot
def sys_page(bdata):
# TRX_RSEG_N_SLOTS = (UNIV_PAGE_SIZE / 16) = 1024
# TRX_RSEG_SLOT_SIZE = 4
rseg_max_size,rseg_history_size = struct.unpack('>2L',bdata[38:46])
rseg_history = struct.unpack('>LLHLH',bdata[46:62])
space_id,page_id,offset = struct.unpack('>2LH',bdata[62:72])
undo_slot = struct.unpack('>1024L',bdata[72:72 + 4*1024])
print(f"[SYS] rseg_max_size:{rseg_max_size}")
print(f"[SYS] rseg_history_size:{rseg_history_size}")
print(f"[SYS] rseg_history:{rseg_history}")
print(f"[SYS] space_id:page_id {space_id}:{page_id}")
print(f"[SYS] offset: {offset}")
print(f"[SYS] undo_slot:{undo_slot}")
def _argparse():
parser = argparse.ArgumentParser(add_help=True, description='a script to parse MySQL undo files https://github.com/ddcw')
parser.add_argument('--rollptr', '-r', dest="ROLLPTR", default=-1, type=int, help='the rollptr to parse')
parser.add_argument(dest='FILENAME', help='undo filename', nargs='?')
return parser.parse_args()
if __name__ == '__main__':
parser = _argparse()
filename = parser.FILENAME
if filename is None or not os.path.exists(filename): # only a single undo file is handled
sys.stderr.write(f"nno file {filename}nn")
sys.exit(1)
if parser.ROLLPTR > -1:
rolll_ptr = parser.ROLLPTR
offset = rolll_ptr & 0xFFFF
page_no = (rolll_ptr>>16) & 0xFFFFFFFF
rseg_id = (rolll_ptr>>48) & 0x7F
is_insert = True if rolll_ptr>>55 == 1 else False
if rseg_id != int(filename.split('_')[-1]):
sys.stderr.write(f"nno 这个rollptr({rolll_ptr})不在这个undo里面nn")
sys.exit(1)
with open(filename,'rb') as f:
f.seek(page_no*16384,0)
data = f.read(16384)
end_offset = struct.unpack('>H',data[offset:offset + 2])[0]
print(f"PAGENO:{page_no} OFFSET:{offset} --> {end_offset} rseg_id:{rseg_id} is_insert:{is_insert}")
print("DATA:",data[offset 2:end_offset-2])
sys.exit(0)
# parse the whole undo file
f = open(filename,'rb')
# FIL_PAGE_TYPE_FSP_HDR probably not worth parsing
# FIL_PAGE_IBUF_BITMAP never parsed this one...
# FIL_PAGE_INODE pages are not needed either -_-
f.seek(16384*3,0) # 21:FIL_PAGE_TYPE_RSEG_ARRAY
bdata = f.read(16384)
array_version,array_size = struct.unpack('>2L',bdata[38:46])
array_fseg_header = struct.unpack('>LLH',bdata[46:56]) # space_id,page_id,page_offset
array_slot = struct.unpack('>128L',bdata[56:56 + 128*4])
print(f"[ROLLBACK SEGMENT] VERSION : {hex(array_version)}")
print(f"[ROLLBACK SEGMENT] SIZE : {array_size}")
print(f"[ROLLBACK SEGMENT] SPACE_ID : {array_fseg_header[0]}")
print(f"[ROLLBACK SEGMENT] PAGE_ID : {array_fseg_header[1]}")
print(f"[ROLLBACK SEGMENT] PAGE_OFFSET : {array_fseg_header[2]}")
#print(f"[ROLLBACK SEGMENT] SLOT PAGE : {array_slot}")
for slot in array_slot:
f.seek(slot*16384,0)
bdata = f.read(16384) # 6:FIL_PAGE_TYPE_SYS
rseg_max_size,rseg_history_size = struct.unpack('>2L',bdata[38:46])
rseg_history = struct.unpack('>LLHLH',bdata[46:62])
space_id,page_id,offset = struct.unpack('>2LH',bdata[62:72])
undo_slot = struct.unpack('>1024L',bdata[72:72 + 4*1024])
undo_slot_var = [ ]
for x in undo_slot:
if x != 4294967295:
undo_slot_var.append(x)
print(f"t[UNDO SEGMENT] CURRENT PAGE:{slot}")
print(f"t[UNDO SEGMENT] MAX_SIZE:{rseg_max_size}")
print(f"t[UNDO SEGMENT] HISTORY_SIZE:{rseg_history_size}")
print(f"t[UNDO SEGMENT] HISTORY:{rseg_history}")
print(f"t[UNDO SEGMENT] SPACE_ID:{space_id}")
print(f"t[UNDO SEGMENT] PAGE_ID:{page_id} (inode)")
print(f"t[UNDO SEGMENT] PAGE_OFFSET:{offset}")
#print(f"[UNDO SEGMENT] SLOT PAGE:{undo_slot_var} (去除无效页(4294967295))")
for undo_page in undo_slot_var:
f.seek(undo_page*16384,0)
bdata = f.read(16384) # 2:FIL_PAGE_UNDO_LOG
# fil header
FIL_PAGE_SPACE_OR_CHKSUM, FIL_PAGE_OFFSET, FIL_PAGE_PREV, FIL_PAGE_NEXT, FIL_PAGE_LSN, FIL_PAGE_TYPE, FIL_PAGE_FILE_FLUSH_LSN, FIL_PAGE_SPACE_ID = struct.unpack('>4LQHQL',bdata[:38])
# undo page header
undo_page_type,undo_page_start,undo_page_free = struct.unpack('>3H',bdata[38:44])
undo_page_node = struct.unpack('>LHLH',bdata[44:56])
# undo segment header: only the segment's first page has it; owned by one transaction at a time, but may record info for multiple transactions
trx_undo_state,trx_undo_last_log = struct.unpack('>HH',bdata[56:60])
trx_undo_fseg_header = struct.unpack('>LLH',bdata[60:70])
trx_undo_page_list = struct.unpack('>LLHLH',bdata[70:86])
print(f"tt[UNDO LOG] CURRENT PAGE: {undo_page}")
print(f"tt[UNDO LOG] PAGE_TYPE : {undo_page_type} ({UNDO_LOG_SEGEMT_TYPE[undo_page_type]})")
print(f"tt[UNDO LOG] PAGE_START : {undo_page_start}") # 开始位置
print(f"tt[UNDO LOG] PAGE_FREE : {undo_page_free}") # 结束位置, 遇到这个offset表示结束了
print(f"tt[UNDO LOG] PAGE_NODE : {undo_page_node}")
print(f"tt[UNDO LOG] TRX_UNDO_STATE : {trx_undo_state} ({UNDO_LOG_SEGMENT_STAT[trx_undo_state]})")
print(f"tt[UNDO LOG] TRX_UNDO_LAST_LOG : {trx_undo_last_log}")
print(f"tt[UNDO LOG] TRX_UNDO_FSEG_HEADER : {trx_undo_fseg_header}")
print(f"tt[UNDO LOG] TRX_UNDO_PAGE_LIST : {trx_undo_page_list}")
# undo log header (86-->276)
undo_log_header = bdata[86:276]
if undo_page_start == undo_page_free:
continue
# only parse part of it
not_end = True
undo_log_start_offset = undo_page_start
while not_end:
undo_log_end_offset = struct.unpack('>H',bdata[undo_log_start_offset:undo_log_start_offset + 2])[0] # first 2 bytes of a record point to the next record
not_end = False if undo_log_end_offset == undo_page_free else True
tdata = bdata_buffer(bdata[undo_log_start_offset:undo_log_end_offset][2:-2])
#tdata.offset = 2
undo_log_type_flag = struct.unpack('>B',tdata.read(1))[0]
undo_log_type = undo_log_type_flag&0x0F # bit 0-3
undo_log_flag = (undo_log_type_flag>>4)& 0x03 # bit 4-5
if undo_log_type_flag & 64: #TRX_UNDO_MODIFY_BLOB
undo_rec_flags = struct.unpack('>B',tdata.read(1))[0]
undo_no = mach_read_next_much_compressed(tdata)
table_id = mach_read_next_much_compressed(tdata)
print(f'\t\t\t[UNDO DATA]',bdata[undo_log_start_offset:undo_log_end_offset])
print(f'\t\t\t[UNDO DATA] TYPE: {undo_log_type} ({UNDO_LOG_TYPE[undo_log_type]})')
print(f'\t\t\t[UNDO DATA] FLAG: {undo_log_flag}')# ({UNDO_LOG_FLAG[undo_log_flag]})')
print(f'\t\t\t[UNDO DATA] UNDO NO : {undo_no}')
print(f'\t\t\t[UNDO DATA] TABLE ID : {table_id}')
print(f'\t\t\t[UNDO DATA] REST_DATA: {tdata.bdata[tdata.offset:]}')
#undo_log_parse(tdata,f) # pass the file handle along too
#trx_undo_trx_id,trx_undo_trx_no,trx_undo_del_marks,trx_undo_log_start,trx_undo_flags,trx_undo_dict_trans,trx_undo_table_id,trx_undo_next_log,trx_undo_prev_log = struct.unpack('>QQHHBBQHH',tdata[0:34])
#trx_undo_history_node = struct.unpack('>LHLH',tdata[34:46])
if undo_page_node[2] == FIL_PAGE_SPACE_ID:
undo_log_start_offset = undo_page_node[2]
else:
not_end = False # crossing into another file: not going there
f.close()