一次把RMAN备份速度提高6倍的工作笔记

2023-04-06 17:58:54 浏览数 (1)

这是我给一个使用Oracle数据库的客户做一个RMAN备份优化的工作笔记。

优化前后的对比如下表:

对比项

优化前

优化后

提高

1.4T全量备份时间

131分钟27秒

5分钟15秒

提高5.99倍

读nvme磁盘的速度

672MB/s

4607MB/s

提高6.8倍

iostate中磁盘利用率

80%

100%

提高20%

进行了一下3项变更:

  1. 将同步IO改成异步IO,大大提高IO性能;
  2. 修改linux内核参数shmmax和shmall,将共享内存段从35个减少到3个,碎片大大减少;
  3. 配置hugepage,Oracle的共享内存管理单位从4K增加到2M,性能大幅提高;

将同步IO改成异步IO

查询当前数据文件的I/O方式:

代码语言:javascript复制
SQL> SELECT NAME,ASYNCH_IO FROM V$DATAFILE F,V$IOSTAT_FILE I WHERE F.FILE#=I.FILE_NO AND FILETYPE_NAME='Data File';
/test-nvme/oradata/orcl/system01.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/sysaux01.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/undotbs01.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/users01.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc1.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc2.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc3.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc4.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc5.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc6.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc7.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc8.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc9.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc10.dbf       ASYNC_OFF
/test-nvme/oradata/orcl/cccc11.dbf       ASYNC_OFF

启动异步IO:

代码语言:javascript复制
SQL>   alter system set filesystemio_options=setall  scope=spfile;
System altered.
SQL> STARTUP FORCE;

再次查询数据文件的I/O方式:

代码语言:javascript复制
SQL>  SELECT NAME,ASYNCH_IO FROM V$DATAFILE F,V$IOSTAT_FILE I WHERE F.FILE#=I.FILE_NO AND FILETYPE_NAME='Data File';
/test-nvme/oradata/orcl/system01.dbf       ASYNC_ON
/test-nvme/oradata/orcl/sysaux01.dbf       ASYNC_ON
/test-nvme/oradata/orcl/undotbs01.dbf       ASYNC_ON
/test-nvme/oradata/orcl/users01.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc1.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc2.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc3.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc4.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc5.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc6.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc7.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc8.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc9.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc10.dbf       ASYNC_ON
/test-nvme/oradata/orcl/cccc11.dbf       ASYNC_ON
15 rows selected.

发现已经改成了异步I/O。

检查HugePages内存

目前内存里的Hugepage为零:

# cat /proc/meminfo |grep -i HugePage

AnonHugePages: 2156544 kB

HugePages_Total: 0

HugePages_Free: 0

HugePages_Rsvd: 0

HugePages_Surp: 0

Hugepagesize: 2048 kB

参考MOS的文档Doc ID 361468.1进行配置后,再次查询内存里的Hugepage如下:

# cat /proc/meminfo |grep -i HugePage

AnonHugePages: 0 kB

HugePages_Total: 97284

HugePages_Free: 97284

HugePages_Rsvd: 0

HugePages_Surp: 0

Hugepagesize: 2048 kB

修改linux内核参数shmmax和shmall

修改之前共享内存

代码语言:javascript复制
[oracle@base-test-01 ~]$ ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
0x00000000 114458624  oracle     640        4096       0                       
0x00000000 114491393  oracle     640        4096       0                       
0x00000000 114524163  oracle     640        4096       0                       
0x00000000 1212420    root       644        80         2                       
0x00000000 1245189    root       644        16384      2                       
0x00000000 1277958    root       644        280        2                       
0x00000000 297435143  root       600        524288     2          dest         
0x00000000 114556936  oracle     640        4096       0                       
0x00000000 114589705  oracle     640        4096       0                       
0x00000000 114622474  oracle     640        4096       0                       
0x00000000 114655243  oracle     640        4096       0                       
0x00000000 114688012  oracle     640        4096       0                       
0x00000000 114720781  oracle     640        4096       0                       
0x00000000 114753550  oracle     640        4096       0                       
0x00000000 114786319  oracle     640        4096       0                       
0x00000000 114819088  oracle     640        4096       0                       
0x00000000 114851857  oracle     640        4096       0                       
0x00000000 114884626  oracle     640        4096       0                       
0x00000000 114917395  oracle     640        4096       0                       
0x00000000 114950164  oracle     640        4096       0                       
0x00000000 114982933  oracle     640        4096       0                       
0x00000000 115015702  oracle     640        4096       0                       
0x00000000 115048471  oracle     640        4096       0                       
0x00000000 115081240  oracle     640        4096       0                       
0x00000000 115114009  oracle     640        4096       0                       
0x00000000 115146778  oracle     640        4096       0                       
0x00000000 115179547  oracle     640        4096       0                       
0x00000000 115212316  oracle     640        4096       0                       
0x00000000 115245085  oracle     640        4096       0                       
0x00000000 115277854  oracle     640        4096       0                       
0x00000000 115310623  oracle     640        4096       0                       
0x00000000 115343392  oracle     640        4096       0                       
0x00000000 115376161  oracle     640        4096       0                       
0x00000000 115408930  oracle     640        4096       0                       
0x00000000 297762851  root       600        4194304    2          dest         
0x00000000 297861156  root       600        524288     2          dest         
0x00000000 115441701  oracle     640        4096       0                       
0x00000000 115474470  oracle     640        4096       0                       
0x00000000 115507239  oracle     640        4096       0                       
0x00000000 115540008  oracle     640        4096       0                       
0x42e38fd0 115572777  oracle     640        4096       0                       
0x00000000 127959083  root       600        832920     2          dest    

修改之前linux内核参数

代码语言:javascript复制
$ cat /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 2097152
kernel.shmmax = 4294967295 
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
vm.swappiness=1

修改之后共享内存

代码语言:javascript复制
[root@base-test-01 oracle]# ipcs -m
------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
0x00000000 504037376  oracle     640        536870912  72                      
0x00000000 504070145  oracle     640        107911053312 72                      
0x42e38fd0 504102915  oracle     640        2097152    72                      
0x00000000 1212420    root       644        80         2                       
0x00000000 1245189    root       644        16384      2                       
0x00000000 1277958    root       644        280        2                       
0x00000000 297435143  root       600        524288     2          dest         
0x00000000 508821513  root       600        832920     2          dest         
0x00000000 297762851  root       600        4194304    2          dest         
0x00000000 297861156  root       600        524288     2          dest 
#  sysresv -l orcl -d on
IPC Resources for ORACLE_SID "orcl" :
Shared Memory:
ID    KEY
504037376  0x00000000
504070145  0x00000000
504102915  0x42e38fd0

修改之后linux内核参数

代码语言:javascript复制
[root@base-test-01 oracle]# sysctl -p
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 67108864
kernel.shmmax = 256000000000
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
vm.swappiness = 1

前后对比,oracle的共享内存从35个减少到3个,碎片大大减少。

使用validate命令进行只读验证

没有优化前备份性能

修改环境变量,将rman中的时间提示最小单位从天改到秒:

export NLS_DATE_FORMAT=‘YYYY-MM-DD HH24:MI:SS’

export NLS_LANG=AMERICAN_AMERICA.ZHS16GBK # 或:export NLS_LANG=AMERICAN

rman备份

代码语言:javascript复制
RMAN>  backup validate database;
Starting backup at 2020-02-26 14:42:44
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=/test-nvme/oradata/orcl/cccc1.dbf
input datafile file number=00006 name=/test-nvme/oradata/orcl/cccc2.dbf
input datafile file number=00007 name=/test-nvme/oradata/orcl/cccc3.dbf
input datafile file number=00008 name=/test-nvme/oradata/orcl/cccc4.dbf
input datafile file number=00009 name=/test-nvme/oradata/orcl/cccc5.dbf
input datafile file number=00010 name=/test-nvme/oradata/orcl/cccc6.dbf
input datafile file number=00011 name=/test-nvme/oradata/orcl/cccc7.dbf
input datafile file number=00012 name=/test-nvme/oradata/orcl/cccc8.dbf
input datafile file number=00013 name=/test-nvme/oradata/orcl/cccc9.dbf
input datafile file number=00014 name=/test-nvme/oradata/orcl/cccc10.dbf
input datafile file number=00015 name=/test-nvme/oradata/orcl/cccc11.dbf
input datafile file number=00003 name=/test-nvme/oradata/orcl/undotbs01.dbf
input datafile file number=00001 name=/test-nvme/oradata/orcl/system01.dbf
input datafile file number=00002 name=/test-nvme/oradata/orcl/sysaux01.dbf
input datafile file number=00004 name=/test-nvme/oradata/orcl/users01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:31:25
.........
RMAN> 

从14:42:44开始到15:14:11结束,31分钟27秒,一共1887秒。

从os层查看io情况

代码语言:javascript复制
$ iostat 60 nvme0n1
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
nvme0n1        5091.10    650829.00        93.44   39049740       5606
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.84    0.00    2.11    0.75    0.00   96.30
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
nvme0n1        5483.45    701698.13        11.38   42101888        683
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.93    0.00    2.20    0.72    0.00   96.15
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
nvme0n1        5589.93    715302.93        12.60   42918176        756
[oracle@base-test-01 ~]$ iostat -x 60 nvme0n1
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 5255.30    0.93 672465.47     9.53   255.88     1.24    0.24    0.24    0.00   0.16  83.82
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.77    0.00    2.11    0.73    0.00   96.39
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 5425.92    1.08 694424.67    11.71   255.92     1.25    0.23    0.23    0.05   0.16  85.25

优化后的备份性能

rman备份

代码语言:javascript复制
[oracle@base-test-01 ~]$ rman target /
Recovery Manager: Release 11.2.0.4.0 - Production on Wed Feb 26 18:28:40 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
connected to target database: ORCL (DBID=1560763823)
RMAN>  backup validate database;
Starting backup at 2020-02-26 18:28:47
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=9391 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=/test-nvme/oradata/orcl/cccc1.dbf
input datafile file number=00006 name=/test-nvme/oradata/orcl/cccc2.dbf
input datafile file number=00007 name=/test-nvme/oradata/orcl/cccc3.dbf
input datafile file number=00008 name=/test-nvme/oradata/orcl/cccc4.dbf
input datafile file number=00009 name=/test-nvme/oradata/orcl/cccc5.dbf
input datafile file number=00010 name=/test-nvme/oradata/orcl/cccc6.dbf
input datafile file number=00011 name=/test-nvme/oradata/orcl/cccc7.dbf
input datafile file number=00012 name=/test-nvme/oradata/orcl/cccc8.dbf
input datafile file number=00013 name=/test-nvme/oradata/orcl/cccc9.dbf
input datafile file number=00014 name=/test-nvme/oradata/orcl/cccc10.dbf
input datafile file number=00015 name=/test-nvme/oradata/orcl/cccc11.dbf
input datafile file number=00003 name=/test-nvme/oradata/orcl/undotbs01.dbf
input datafile file number=00001 name=/test-nvme/oradata/orcl/system01.dbf
input datafile file number=00002 name=/test-nvme/oradata/orcl/sysaux01.dbf
input datafile file number=00004 name=/test-nvme/oradata/orcl/users01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:05:15
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1    OK     0              15274        97280           10829000  
  File Name: /test-nvme/oradata/orcl/system01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              65126           
  Index      0              13578           
  Other      0              3302            
......
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Control File and SPFILE
===============================
File Type    Status Blocks Failing Blocks Examined
------------ ------ -------------- ---------------
SPFILE       OK     0              2               
Control File OK     0              3146            
Finished backup at 2020-02-26 18:34:04

5分钟15秒完成。

从OS层看磁盘的IO性能

代码语言:javascript复制
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.02 35998.80    0.45 4607693.07     5.99   255.99    90.92    2.53    2.53    0.41   0.03 100.00
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.74    0.00    1.23    0.00    0.00   97.03
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 35901.82    0.45 4595201.33     5.73   255.98    90.03    2.51    2.51    0.44   0.03 100.00
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.78    0.00    1.24    0.01    0.00   96.98
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 36015.75    0.52 4609726.14     5.54   255.98    88.58    2.46    2.46    0.26   0.03 100.00
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.66    0.00    1.19    0.00    0.00   97.14
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
nvme0n1           0.00     0.00 36069.02    0.58 4616505.08     5.75   255.98    91.41    2.53    2.53    0.23   0.03 100.00

0 人点赞