PostgreSQL Timelines and History File:Master Guide

2021-02-26 16:19:01 浏览数 (1)

“TIMELINESARE THE DIVERGENT POINTS”

Let’s assume you did a point in timerecovery after a wrong transaction, PostgreSQL branches to a new timeline andcontinue with the operations.

But what happens after you perform a pointin time recovery, realized you made a mistake again?

That’s where recovery_target_timeline comesinto picture.

In this post we are going to understandeverything about recovery_target_timeline and timelines in PostgreSQL ingeneral.

PostgreSQLTimeline

Everytime you do a transaction inPostgreSQL the information is recorded in a wal file under $DATADIR/wallocation.

The first file that is created is 000000010000000000000001 and when it is filled the next wal will becreated with the name 000000010000000000000002 and so on.(Itis a HEX notation and more information is beyond the scope of this post)

Here, the first eight digits representPostgreSQL timeline.

In our example, the database cluster is intimeline 1.

After every point in time recovery, thetimeline id will be increased by 1 and a new file called NewTimelineID.historyis created.

recovery_target_timeline is a parameterwhich helps us to take our cluster to any timeline in the history provided avalid base backup and all the archivelogs in place.

Lets consider below example.

I have initialized and started a newcluster with the below command

代码语言:javascript复制
-bash-4.1$ initdb -D basebackup1
-bash-4.1$ pg_ctl start -D /u02/basebackup1

Then I created a table and inserted arecord into it

代码语言:javascript复制
postgres=# create table timeline(tid int,remarks varchar(1000));
CREATE TABLE
postgres=# insert into timelinevalues('1','This is timeline id 1');
INSERT 0 1
postgres=# checkpoint;
CHECKPOINT
postgres=# select pg_switch_wal();
pg_switch_wal
---------------
0/15D4B70
(1 row)

My record is somewhere in my wal000000010000000000000001

After few switches, I have taken a fullbackup when my wal at 000000010000000000000005

代码语言:javascript复制
-bash-4.1$ ls -rlt
total 147460
-rw------- 1 postgres postgres 16777216 Nov22 13:03 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Nov22 13:03 000000010000000000000002
-rw------- 1 postgres postgres 16777216 Nov22 13:03 000000010000000000000003
-rw------- 1 postgres postgres 16777216 Nov22 13:05 000000010000000000000004
-rw------- 1 postgres postgres 16777216 Nov22 13:05 000000010000000000000005
-rw------- 1 postgrespostgres      337 Nov 22 13:05000000010000000000000005.00000028.backup
-rw------- 1 postgres postgres 16777216 Nov22 13:06 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Nov22 13:06 000000010000000000000007

Then I made a few switches and when my walis at 000000010000000000000008

I inserted a new record.

代码语言:javascript复制
postgres=# insert into timelinevalues('1','This is timeline id 1 after basebackup');
INSERT 0 1
postgres=# checkpoint;
CHECKPOINT
 
-bash-4.1$ pg_waldump000000010000000000000008 | grep INSERT
rmgr:Heap        len(rec/tot):     54/   214,tx:        
487, lsn: 0/08000110, prev 0/080000D8,desc: off 2 flags 0x00, blkref #0: rel 1663/13530/16384 blk 0 FPW
-bash-4.1$

Then I made few switches and my currentarchived wal location is something like this

代码语言:javascript复制
-bash-4.1$ ls -rlt
total 311308
-rw------- 1  16777216 Nov 2213:03 000000010000000000000001
-rw------- 1  16777216 Nov 2213:03 000000010000000000000002
-rw------- 1  16777216 Nov 2213:03 000000010000000000000003
-rw------- 1  16777216 Nov 2213:05 000000010000000000000004
-rw------- 1  16777216 Nov 2213:05 000000010000000000000005
-rw-------1       337 Nov 22 13:05000000010000000000000005.00000028.backup
-rw------- 1  16777216 Nov 2213:06 000000010000000000000006
-rw------- 1  16777216 Nov 2213:06 000000010000000000000007
-rw------- 1  16777216 Nov 2213:07 000000010000000000000008
-rw------- 1  16777216 Nov 2213:07 000000010000000000000009
-rw------- 1  16777216 Nov 2213:09 00000001000000000000000A

At this point of time, I have done a pointin time recovery to some point in 000000010000000000000007 wal, before secondinsert.

so I gave recovery target lsn as‘0/07000060’ in postgresql.conf file.

代码语言:javascript复制
-bash-4.1$ cat postgresql.conf | grep lsn
recovery_target_lsn ='0/07000060'      # the WAL LSN up to whichrecovery will proceed
-bash-4.1$

After successful point in time recovery,the PostgreSQL branched to a new timeline.

At the end of recovery the following things will happen

End of recovery means the point where thethe database opens up for writing

New timeline is chosen

A timeline history file is written

The partial last WAL file on the previoustimeline is copied with the new timeline’s ID

A checkpoint record is written on the newtimeline

alert log says,

代码语言:javascript复制
LOG:  starting point-in-timerecovery to WAL location (LSN) "0/7000060"
LOG:  restored log file"000000010000000000000005" from archive
LOG:  redo starts at 0/5000028
LOG:  consistent recovery statereached at 0/5000138
LOG:  database system is ready toaccept read only connections
LOG:  restored log file"000000010000000000000006" from archive
LOG:  restored log file"000000010000000000000007" from archive
LOG:  recovery stopping after WALlocation (LSN) "0/7000060"
LOG:  pausing at the end ofrecovery
HINT:  Executepg_wal_replay_resume() to promote.

after executing pg_wal_replay_resume(), thepostgresql changed to timeline 2 and the same information is recorded inarchivelog location.

代码语言:javascript复制
-rw------- 1 postgres postgres 16777216 Nov22 13:05 000000010000000000000005
-rw------- 1 postgrespostgres      337 Nov 22 13:05000000010000000000000005.00000028.backup
-rw------- 1 postgres postgres 16777216 Nov22 13:06 000000010000000000000006
-rw------- 1 postgres postgres 16777216 Nov22 13:06 000000010000000000000007
-rw------- 1 postgres postgres 16777216 Nov22 13:07 000000010000000000000008
-rw------- 1 postgres postgres 16777216 Nov22 13:07 000000010000000000000009
-rw------- 1 postgres postgres 16777216 Nov22 13:09 00000001000000000000000A
-rw-r----- 1 postgrespostgres       33 Nov 22 13:12 00000002.history
-rw-r----- 1 postgres postgres 16777216 Nov22 13:13 000000020000000000000007
-rw------- 1 postgres postgres 16777216 Nov22 13:13 000000020000000000000008
-rw------- 1 postgres postgres 16777216 Nov22 13:14 000000020000000000000009
-rw-r----- 1 postgres postgres 16777216 Nov22 13:15 00000002000000000000000A
 
-bash-4.1$

Here, the PostgreSQL has branched to a newtimeline at walfile 7 and started creating new walfiles with timeline id 2.

The 00000002.history fileconfirms that the PostgreSQL has branched to new timeline.

The history file is a small text file thatread

代码语言:javascript复制
-bash-4.1$ cat 00000002.history
1 0/70000D8 after LSN 0/7000060

Here

代码语言:javascript复制
1<parentTLI>    0/70000D8<switchpoint>     after LSN 0/7000060<reason>
parentTLI                 ID of the parent timeline
switchpoint                XLogRecPtrof the WAL location where the switch happened
reason                      human-readable explanation of why the timeline was changed

Now, I inserted one recordat 00000002000000000000000A (0/A000060)

代码语言:javascript复制
postgres=# insert into timelinevalues('2','This is timeline id 2 correct');
INSERT 0 1

and another record at00000002000000000000000D (0/D000000)

代码语言:javascript复制
postgres=# insert into timelinevalues('2','This is timeline id 2 wrong at 0/D000000');
INSERT 0 1

At this point of time I realized that I dida mistake at 00000002000000000000000D and has to rollback to 00000002000000000000000C of timeline 2.

This can be achieved by setting belowparameters in postgresql.conf file

recovery_target_timeline = '2'

recovery_target_lsn = '0/0C000060'

After setting up above parameters, Istarted the cluster and the alert log says

代码语言:javascript复制
LOG:  database system wasinterrupted; last known up at 2020-11-22 13:05:01 IST
LOG:  restored log file "00000002.history"from archive
cp: cannot stat`/u02/archivelogs/00000003.history': No such file or directory
LOG:  starting point-in-timerecovery to WAL location (LSN) "0/C000060"
LOG:  restored log file"00000002.history" from archive
LOG:  restored log file "000000010000000000000005"from archive
LOG:  redo starts at 0/5000028
LOG:  consistent recovery statereached at 0/5000138
LOG:  database system is ready toaccept read only connections
LOG:  restored log file"000000010000000000000006" from archive
LOG:  restored log file"000000020000000000000007" from archive
LOG:  restored log file"000000020000000000000008" from archive
LOG:  restored log file"000000020000000000000009" from archive
LOG:  restored log file"00000002000000000000000A" from archive
LOG:  restored log file"00000002000000000000000B" from archive
LOG:  restored log file "00000002000000000000000C"from archive
LOG:  recovery stopping after WALlocation (LSN) "0/C000060"
LOG:  pausing at the end ofrecovery
HINT:  Executepg_wal_replay_resume() to promote.
..
LOG:  redo done at 0/C000060
LOG:  last completed transactionwas at log time 2020-11-22 13:15:29.696929 05:30
LOG:  selected new timeline ID: 3

When I selected the table,

代码语言:javascript复制
postgres=# select * from timeline;
 tid|            remarks
----- -------------------------------
   1 | This is timeline id 1
   2 | This is timeline id 2correct
(2 rows)

The history file will is recorded withbelow details

代码语言:javascript复制
-bash-4.1$ cat 00000003.history
1      0/70000D8       after LSN 0/7000060
  
2      0/C0000D8       after LSN 0/C000060

Here

timeline 1 branched at 0/70000D8

timeline 2 branched at 0/C0000D8

and current timeline is going to be 3.

代码语言:javascript复制
-rw-r----- 1 postgres postgres 16777216 Nov22 13:16 00000002000000000000000F
-rw-r----- 1 postgrespostgres       67 Nov 22 15:59 00000003.history
-rw-r----- 1 postgres postgres 16777216 Nov22 16:05 00000003000000000000000C

0 人点赞