Postgresql的pageinspect插件框架与数据恢复探索

2022-05-12 11:09:40 浏览数 (1)

1 插件代码分析

pageinspect是一个非常典型、简单的插件,大部分流程可以学习、复用。

如果需要写新的插件,可以直接copy修改下面两个函数,大部分框架代码可以直接使用。

get_raw_page读取页面的经典流程

get_raw_page只返回一行数据,所以无需走插件循环产生元组的流程。

读取页面的经典流程:

1、schema table名包装:makeRangeVarFromNameList 2、打开表:relation_openrv 3、读页面(返回pin住的页面):ReadBufferExtended 4、页面加锁:LockBuffer 5、读取数据:memcpy 6、页面解锁:LockBuffer 7、释放页面(unpin):ReleaseBuffer 8、关闭表:relation_close

代码语言:javascript复制
get_raw_page
  get_raw_page_internal(relname, MAIN_FORKNUM, blkno)
    (1) relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname))
      textToQualifiedNameList:表名前加 public. 或 直接使用 schema.tablename
      makeRangeVarFromNameList:{type = T_RangeVar, catalogname = 0x0, schemaname = 0x2ae57c8 "public", relname = 0x2ae58f8 "t81", inh = 1 '01', relpersistence = 112 'p', alias = 0x0, location = -1}
    (2) rel = relation_openrv(relrv, AccessShareLock)
      relation_openrv
        RangeVarGetRelid:RangeVar转OID
          RangeVarGetRelidExtended
            LookupExplicitNamespace:查询到public的id2200
            get_relname_relid(relation->relname, namespaceId)
              // 系统表缓存的典型读取流程
              GetSysCacheOid2(RELNAMENSP, PointerGetDatum(relname), ObjectIdGetDatum(relnamespace))
                GetSysCacheOid (cacheId=49, key1=44980472, key2=2200, key3=0, key4=0)
                  SearchSysCache (cacheId=49, key1=44980472, key2=2200, key3=0, key4=0)
                    SearchCatCache (cache=0x2a97f80, v1=44980472, v2=2200, v3=0, v4=0)
                  HeapTupleGetOid(tuple)
    (3) buf = ReadBufferExtended(rel, forknum, blkno, RBM_NORMAL, NULL)
    (4) LockBuffer(buf, BUFFER_LOCK_SHARE)
    (5) memcpy(raw_page_data, BufferGetPage(buf), BLCKSZ)
    (6) LockBuffer(buf, BUFFER_LOCK_UNLOCK)
    (7) ReleaseBuffer(buf)
    (8) relation_close(rel, AccessShareLock)

heap_page_items

如果插件需要返回多行数据,这里是一个典型的PG插件框架:

代码语言:javascript复制
Datum
heap_page_items(PG_FUNCTION_ARGS)
{
// 进一次,初始化放在这里
if (SRF_IS_FIRSTCALL())
{
    ...
    fctx = SRF_FIRSTCALL_INIT();
		mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);
		...
		fctx->max_calls = PageGetMaxOffsetNumber(inter_call_data->page);
		fctx->user_fctx = 后面需要使用的自定义数据;
		MemoryContextSwitchTo(mctx);
}

fctx = SRF_PERCALL_SETUP();
后面需要使用的自定义数据 = fctx->user_fctx;

// 外层会调用max_calls次,每次拼一行数据返回给SRF_RETURN_NEXT
if (fctx->call_cntr < fctx->max_calls)
{
  拼装数据values
  
  resultTuple = heap_form_tuple(tupdesc, values, nulls);
  result = HeapTupleGetDatum(resultTuple);
  SRF_RETURN_NEXT(fctx, result);
}
else
  SRF_RETURN_DONE(fctx);
}

数据组装比较简单,基本都是用宏在page里面取数据。

2 恢复数据探索

遇到一个客户刚刚删除一条数据,问有没有快捷的方法可以查到被删数据的某个字段? 这里尝试使用pageinspect做恢复实验。

结论先行:

  1. 使用pageinspect的data字段是可以反转为原数据的,但是前提是data字段还在。
  2. 即使没有vacuum,下一次对删除数据所在页面的读写,都可能对页面进行purne,导致删除数据不再能被pageinspect发现。
  3. 使用pageinspect恢复数据可行性不大。

实验数据

代码语言:javascript复制
create table t21(id int, d1 int, d2 varchar(8), d3 text);
alter table t21 set ( autovacuum_enabled = false, toast.autovacuum_enabled = false);
insert into t21 select a, a 100, 'aA012345',md5(random()::text) from generate_series(1,10000) a;
select * from t21;
  id   |  d1   |    d2    |                d3                
------- ------- ---------- ----------------------------------
     1 |   101 | aA012345 | e10022e0fa6ecece950a1ab20caac824
     2 |   102 | aA012345 | 7c963b472d452c4dde9d36760b41a8de
     3 |   103 | aA012345 | 8142f1299d5cf7e384e7b3c43076710d
     4 |   104 | aA012345 | c3c76a8c0cf98e6004051b3bfe415310
     5 |   105 | aA012345 | 163214ae0c27f387d0207e89210e5a57
     6 |   106 | aA012345 | 302cb39b8ebc649d045f2a1123d379bb
     7 |   107 | aA012345 | be22d7234058e2e9247911c011eef47f
     8 |   108 | aA012345 | 7fe88087d8d6fea4470eece997f49360
     9 |   109 | aA012345 | 279eb01c225b71b083b5df2d9bd87d7b
    10 |   110 | aA012345 | 0463ef122689cb78f795c6d4309f3565
    11 |   111 | aA012345 | 1587dd3d027b6fda21d97085369434d7

实验过程

删除5页面数据

代码语言:javascript复制
delete from t21 where ctid='(5,3)';
DELETE 1

观察当前页面状态,data还在

代码语言:javascript复制
postgres=# SELECT lp, t_xmin, t_xmax, t_infomask,substring(t_data from 0 for 8) from heap_page_items(get_raw_page('t21', 5));
 lp |   t_xmin   |   t_xmax   | t_infomask |    substring     
---- ------------ ------------ ------------ ------------------
  1 | 1477542367 |          0 |       2306 | xe60100004a0200
  2 | 1477542367 |          0 |       2306 | xe70100004b0200
  3 | 1477542367 | 1477542374 |        258 | xe80100004c0200
  4 | 1477542367 |          0 |       2306 | xe90100004d0200
  5 | 1477542367 |          0 |       2306 | xea0100004e0200
  6 | 1477542367 |          0 |       2306 | xeb0100004f0200

解析第三条,可根据数据类型反向解析回原始数据。

代码语言:javascript复制
postgres=# SELECT t_attrs FROM heap_page_item_attrs(get_raw_page('t21', 5), 't21'::regclass);
                                                            t_attrs                                                            
-------------------------------------------------------------------------------------------------------------------------------
 {"\xe6010000","\x4a020000","\x136141303132333435","\x433333623539396461646233336235336137356138313065653162353035386262"}
 {"\xe7010000","\x4b020000","\x136141303132333435","\x436330383466376163646361386438356432326566643537633137623161396561"}
 {"\xe8010000","\x4c020000","\x136141303132333435","\x436566386436353365306437333439613639623161613835383236386531376430"}
 {"\xe9010000","\x4d020000","\x136141303132333435","\x433134356539636636336536393231653137353661616130303438633865363364"}
 {"\xea010000","\x4e020000","\x136141303132333435","\x433262663333653063663735643664326335303538323137306136306635303133"}

下面做一次查询,在查询第五页发现pageinspect已经看不到被删除数据了。看不到也就无法解析了。

代码语言:javascript复制
select * from t21;

SELECT lp, t_xmin, t_xmax, t_infomask,substring(t_data from 0 for 8) from heap_page_items(get_raw_page('t21', 5));
 lp |   t_xmin   | t_xmax | t_infomask |    substring     
---- ------------ -------- ------------ ------------------
  1 | 1477542367 |      0 |       2306 | xe60100004a0200
  2 | 1477542367 |      0 |       2306 | xe70100004b0200
  3 |            |        |            | 
  4 | 1477542367 |      0 |       2306 | xe90100004d0200
  5 | 1477542367 |      0 |       2306 | xea0100004e0200
  6 | 1477542367 |      0 |       2306 | xeb0100004f0200
  7 | 1477542367 |      0 |       2306 | xec010000500200

DELETE XLOG哪些信息有用?

如果只关注DELETE的XLOG,可以执行下面命令:

pg_waldump 000000010000002F000000E7 -r heap | grep DELETE

代码语言:javascript复制
rmgr: Heap        len (rec/tot):     59/  8231, tx: 1477542368, lsn: 2F/E7CE5D70, prev 2F/E7CE5D38, desc: DELETE off 1 KEYS_UPDATED , blkref #0: rel 1663/13212/143325 blk 1 FPW
rmgr: Heap        len (rec/tot):     59/  8231, tx: 1477542369, lsn: 2F/E7CE7E50, prev 2F/E7CE7E18, desc: DELETE off 1 KEYS_UPDATED , blkref #0: rel 1663/13212/143325 blk 4 FPW
rmgr: Heap        len (rec/tot):     59/  8151, tx: 1477542370, lsn: 2F/E7CE9FD8, prev 2F/E7CE9FA0, desc: DELETE off 2 KEYS_UPDATED , blkref #0: rel 1663/13212/143325 blk 1 FPW
rmgr: Heap        len (rec/tot):     54/    54, tx: 1477542371, lsn: 2F/E7CF0080, prev 2F/E7CEE0E0, desc: DELETE off 3 KEYS_UPDATED , blkref #0: rel 1663/13212/143325 blk 1
rmgr: Heap        len (rec/tot):     54/    54, tx: 1477542372, lsn: 2F/E7CF0158, prev 2F/E7CF0120, desc: DELETE off 5 KEYS_UPDATED , blkref #0: rel 1663/13212/143325 blk 1
rmgr: Heap        len (rec/tot):     59/  8151, tx: 1477542373, lsn: 2F/E7CF22F0, prev 2F/E7CF22B8, desc: DELETE off 2 KEYS_UPDATED , blkref #0: rel 1663/13212/143325 blk 4 FPW
rmgr: Heap        len (rec/tot):     59/  8231, tx: 1477542374, lsn: 2F/E7CF4428, prev 2F/E7CF43F0, desc: DELETE off 3 KEYS_UPDATED , blkref #0: rel 1663/13212/143325 blk 5 FPW

XLOG中记录了tuple在页面中的偏移量,和页面ID,对确认删除行为、恢复数据应该都有一些作用。

代码语言:javascript复制
rmgr: Heap       

 len (rec/tot):     59/  8231, 

tx: 1477542374, 

lsn: 2F/E7CF4428, 

prev 2F/E7CF43F0, 

desc: DELETE 

off 3 KEYS_UPDATED , (页面内的偏移)

blkref #0: rel 1663/13212/143325 

blk 5 (页面ID)

FPW

0 人点赞