这是我给一个使用Oracle数据库的客户做一个RMAN备份优化的工作笔记。
优化前后的对比如下表:
对比项 | 优化前 | 优化后 | 提高 |
---|---|---|---|
1.4T全量备份时间 | 131分钟27秒 | 5分钟15秒 | 提高5.99倍 |
读nvme磁盘的速度 | 672MB/s | 4607MB/s | 提高6.8倍 |
iostate中磁盘利用率 | 80% | 100% | 提高20% |
进行了一下3项变更:
- 将同步IO改成异步IO,大大提高IO性能;
- 修改linux内核参数shmmax和shmall,将共享内存段从35个减少到3个,碎片大大减少;
- 配置hugepage,Oracle的共享内存管理单位从4K增加到2M,性能大幅提高;
将同步IO改成异步IO
查询当前数据文件的I/O方式:
代码语言:javascript复制SQL> SELECT NAME,ASYNCH_IO FROM V$DATAFILE F,V$IOSTAT_FILE I WHERE F.FILE#=I.FILE_NO AND FILETYPE_NAME='Data File';
/test-nvme/oradata/orcl/system01.dbf ASYNC_OFF
/test-nvme/oradata/orcl/sysaux01.dbf ASYNC_OFF
/test-nvme/oradata/orcl/undotbs01.dbf ASYNC_OFF
/test-nvme/oradata/orcl/users01.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc1.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc2.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc3.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc4.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc5.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc6.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc7.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc8.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc9.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc10.dbf ASYNC_OFF
/test-nvme/oradata/orcl/cccc11.dbf ASYNC_OFF
启动异步IO:
代码语言:javascript复制SQL> alter system set filesystemio_options=setall scope=spfile;
System altered.
SQL> STARTUP FORCE;
再次查询数据文件的I/O方式:
代码语言:javascript复制SQL> SELECT NAME,ASYNCH_IO FROM V$DATAFILE F,V$IOSTAT_FILE I WHERE F.FILE#=I.FILE_NO AND FILETYPE_NAME='Data File';
/test-nvme/oradata/orcl/system01.dbf ASYNC_ON
/test-nvme/oradata/orcl/sysaux01.dbf ASYNC_ON
/test-nvme/oradata/orcl/undotbs01.dbf ASYNC_ON
/test-nvme/oradata/orcl/users01.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc1.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc2.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc3.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc4.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc5.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc6.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc7.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc8.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc9.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc10.dbf ASYNC_ON
/test-nvme/oradata/orcl/cccc11.dbf ASYNC_ON
15 rows selected.
发现已经改成了异步I/O。
检查HugePages内存
目前内存里的Hugepage为零:
# cat /proc/meminfo |grep -i HugePage
AnonHugePages: 2156544 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
参考MOS的文档Doc ID 361468.1进行配置后,再次查询内存里的Hugepage如下:
# cat /proc/meminfo |grep -i HugePage
AnonHugePages: 0 kB
HugePages_Total: 97284
HugePages_Free: 97284
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
修改linux内核参数shmmax和shmall
修改之前共享内存
代码语言:javascript复制[oracle@base-test-01 ~]$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 114458624 oracle 640 4096 0
0x00000000 114491393 oracle 640 4096 0
0x00000000 114524163 oracle 640 4096 0
0x00000000 1212420 root 644 80 2
0x00000000 1245189 root 644 16384 2
0x00000000 1277958 root 644 280 2
0x00000000 297435143 root 600 524288 2 dest
0x00000000 114556936 oracle 640 4096 0
0x00000000 114589705 oracle 640 4096 0
0x00000000 114622474 oracle 640 4096 0
0x00000000 114655243 oracle 640 4096 0
0x00000000 114688012 oracle 640 4096 0
0x00000000 114720781 oracle 640 4096 0
0x00000000 114753550 oracle 640 4096 0
0x00000000 114786319 oracle 640 4096 0
0x00000000 114819088 oracle 640 4096 0
0x00000000 114851857 oracle 640 4096 0
0x00000000 114884626 oracle 640 4096 0
0x00000000 114917395 oracle 640 4096 0
0x00000000 114950164 oracle 640 4096 0
0x00000000 114982933 oracle 640 4096 0
0x00000000 115015702 oracle 640 4096 0
0x00000000 115048471 oracle 640 4096 0
0x00000000 115081240 oracle 640 4096 0
0x00000000 115114009 oracle 640 4096 0
0x00000000 115146778 oracle 640 4096 0
0x00000000 115179547 oracle 640 4096 0
0x00000000 115212316 oracle 640 4096 0
0x00000000 115245085 oracle 640 4096 0
0x00000000 115277854 oracle 640 4096 0
0x00000000 115310623 oracle 640 4096 0
0x00000000 115343392 oracle 640 4096 0
0x00000000 115376161 oracle 640 4096 0
0x00000000 115408930 oracle 640 4096 0
0x00000000 297762851 root 600 4194304 2 dest
0x00000000 297861156 root 600 524288 2 dest
0x00000000 115441701 oracle 640 4096 0
0x00000000 115474470 oracle 640 4096 0
0x00000000 115507239 oracle 640 4096 0
0x00000000 115540008 oracle 640 4096 0
0x42e38fd0 115572777 oracle 640 4096 0
0x00000000 127959083 root 600 832920 2 dest
修改之前linux内核参数
代码语言:javascript复制$ cat /etc/sysctl.conf
# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 2097152
kernel.shmmax = 4294967295
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
vm.swappiness=1
修改之后共享内存
代码语言:javascript复制[root@base-test-01 oracle]# ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 504037376 oracle 640 536870912 72
0x00000000 504070145 oracle 640 107911053312 72
0x42e38fd0 504102915 oracle 640 2097152 72
0x00000000 1212420 root 644 80 2
0x00000000 1245189 root 644 16384 2
0x00000000 1277958 root 644 280 2
0x00000000 297435143 root 600 524288 2 dest
0x00000000 508821513 root 600 832920 2 dest
0x00000000 297762851 root 600 4194304 2 dest
0x00000000 297861156 root 600 524288 2 dest
# sysresv -l orcl -d on
IPC Resources for ORACLE_SID "orcl" :
Shared Memory:
ID KEY
504037376 0x00000000
504070145 0x00000000
504102915 0x42e38fd0
修改之后linux内核参数
代码语言:javascript复制[root@base-test-01 oracle]# sysctl -p
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 67108864
kernel.shmmax = 256000000000
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
vm.swappiness = 1
前后对比,oracle的共享内存从35个减少到3个,碎片大大减少。
使用validate命令进行只读验证
没有优化前备份性能
修改环境变量,将rman中的时间提示最小单位从天改到秒:
export NLS_DATE_FORMAT=‘YYYY-MM-DD HH24:MI:SS’
export NLS_LANG=AMERICAN_AMERICA.ZHS16GBK # 或:export NLS_LANG=AMERICAN
rman备份
代码语言:javascript复制RMAN> backup validate database;
Starting backup at 2020-02-26 14:42:44
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=/test-nvme/oradata/orcl/cccc1.dbf
input datafile file number=00006 name=/test-nvme/oradata/orcl/cccc2.dbf
input datafile file number=00007 name=/test-nvme/oradata/orcl/cccc3.dbf
input datafile file number=00008 name=/test-nvme/oradata/orcl/cccc4.dbf
input datafile file number=00009 name=/test-nvme/oradata/orcl/cccc5.dbf
input datafile file number=00010 name=/test-nvme/oradata/orcl/cccc6.dbf
input datafile file number=00011 name=/test-nvme/oradata/orcl/cccc7.dbf
input datafile file number=00012 name=/test-nvme/oradata/orcl/cccc8.dbf
input datafile file number=00013 name=/test-nvme/oradata/orcl/cccc9.dbf
input datafile file number=00014 name=/test-nvme/oradata/orcl/cccc10.dbf
input datafile file number=00015 name=/test-nvme/oradata/orcl/cccc11.dbf
input datafile file number=00003 name=/test-nvme/oradata/orcl/undotbs01.dbf
input datafile file number=00001 name=/test-nvme/oradata/orcl/system01.dbf
input datafile file number=00002 name=/test-nvme/oradata/orcl/sysaux01.dbf
input datafile file number=00004 name=/test-nvme/oradata/orcl/users01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:31:25
.........
RMAN>
从14:42:44开始到15:14:11结束,31分钟27秒,一共1887秒。
从os层查看io情况
代码语言:javascript复制$ iostat 60 nvme0n1
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme0n1 5091.10 650829.00 93.44 39049740 5606
avg-cpu: %user %nice %system %iowait %steal %idle
0.84 0.00 2.11 0.75 0.00 96.30
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme0n1 5483.45 701698.13 11.38 42101888 683
avg-cpu: %user %nice %system %iowait %steal %idle
0.93 0.00 2.20 0.72 0.00 96.15
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme0n1 5589.93 715302.93 12.60 42918176 756
[oracle@base-test-01 ~]$ iostat -x 60 nvme0n1
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
nvme0n1 0.00 0.00 5255.30 0.93 672465.47 9.53 255.88 1.24 0.24 0.24 0.00 0.16 83.82
avg-cpu: %user %nice %system %iowait %steal %idle
0.77 0.00 2.11 0.73 0.00 96.39
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
nvme0n1 0.00 0.00 5425.92 1.08 694424.67 11.71 255.92 1.25 0.23 0.23 0.05 0.16 85.25
优化后的备份性能
rman备份
代码语言:javascript复制[oracle@base-test-01 ~]$ rman target /
Recovery Manager: Release 11.2.0.4.0 - Production on Wed Feb 26 18:28:40 2020
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: ORCL (DBID=1560763823)
RMAN> backup validate database;
Starting backup at 2020-02-26 18:28:47
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=9391 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00005 name=/test-nvme/oradata/orcl/cccc1.dbf
input datafile file number=00006 name=/test-nvme/oradata/orcl/cccc2.dbf
input datafile file number=00007 name=/test-nvme/oradata/orcl/cccc3.dbf
input datafile file number=00008 name=/test-nvme/oradata/orcl/cccc4.dbf
input datafile file number=00009 name=/test-nvme/oradata/orcl/cccc5.dbf
input datafile file number=00010 name=/test-nvme/oradata/orcl/cccc6.dbf
input datafile file number=00011 name=/test-nvme/oradata/orcl/cccc7.dbf
input datafile file number=00012 name=/test-nvme/oradata/orcl/cccc8.dbf
input datafile file number=00013 name=/test-nvme/oradata/orcl/cccc9.dbf
input datafile file number=00014 name=/test-nvme/oradata/orcl/cccc10.dbf
input datafile file number=00015 name=/test-nvme/oradata/orcl/cccc11.dbf
input datafile file number=00003 name=/test-nvme/oradata/orcl/undotbs01.dbf
input datafile file number=00001 name=/test-nvme/oradata/orcl/system01.dbf
input datafile file number=00002 name=/test-nvme/oradata/orcl/sysaux01.dbf
input datafile file number=00004 name=/test-nvme/oradata/orcl/users01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:05:15
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1 OK 0 15274 97280 10829000
File Name: /test-nvme/oradata/orcl/system01.dbf
Block Type Blocks Failing Blocks Processed
---------- -------------- ----------------
Data 0 65126
Index 0 13578
Other 0 3302
......
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including current control file in backup set
including current SPFILE in backup set
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Control File and SPFILE
===============================
File Type Status Blocks Failing Blocks Examined
------------ ------ -------------- ---------------
SPFILE OK 0 2
Control File OK 0 3146
Finished backup at 2020-02-26 18:34:04
5分钟15秒完成。
从OS层看磁盘的IO性能
代码语言:javascript复制Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
nvme0n1 0.00 0.02 35998.80 0.45 4607693.07 5.99 255.99 90.92 2.53 2.53 0.41 0.03 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.74 0.00 1.23 0.00 0.00 97.03
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
nvme0n1 0.00 0.00 35901.82 0.45 4595201.33 5.73 255.98 90.03 2.51 2.51 0.44 0.03 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.78 0.00 1.24 0.01 0.00 96.98
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
nvme0n1 0.00 0.00 36015.75 0.52 4609726.14 5.54 255.98 88.58 2.46 2.46 0.26 0.03 100.00
avg-cpu: %user %nice %system %iowait %steal %idle
1.66 0.00 1.19 0.00 0.00 97.14
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
nvme0n1 0.00 0.00 36069.02 0.58 4616505.08 5.75 255.98 91.41 2.53 2.53 0.23 0.03 100.00