[pbs] Notes on Backing Up with Proxmox Backup Server

2022-05-20 14:23:40

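I have been running and testing PBS for a while; here is a summary of things to watch out for.

Depending on the virtualization type (VM or LXC) and the storage type (LVM or Ceph RBD), PBS backs up in one of two ways: block-level backup or file-level backup. The difference in duration between the two can be very large (possibly more than 24 hours).

For block-level backups, the first run is the most time-consuming; later runs are incremental and take far less time. For file-level backups, an incremental run takes about as long as the initial one.

To avoid disturbing running systems, all backup jobs use the default snapshot mode (in practice, LVM on FC SAN does not support snapshots, so for containers on that storage the job automatically falls back to suspend mode).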

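VM backups

VM backups go through KVM/QEMU: regardless of which device holds the VM's disks, the backup uses QEMU's own image-reading layer (snapshot mode), which amounts to block reads on the device and is fast. On SAN storage the average speed is close to 100 MiB/s. VM 126 below has four disks: the first three (32G, 400G, 1000G) are on FC SAN, and the last one (500G) is on Ceph RBD. This is the VM's first backup, and it took 5:39.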

INFO: Starting Backup of VM 126 (qemu)
INFO: Backup started at 2021-07-19 23:45:02
INFO: status = running
INFO: VM Name: 
INFO: include disk 'virtio0' 'vm:vm-126-disk-1' 32G
INFO: include disk 'virtio1' 'vms:vm-126-disk-2' 400G
INFO: include disk 'virtio2' 'vms:vm-126-disk-0' 1000G
INFO: include disk 'virtio3' 'cephr:vm-126-disk-0' 500G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/126/2021-07-19T15:45:02Z'
INFO: skipping guest-agent 'fs-freeze', agent configured but not running?
INFO: started backup task '33ce99a8-445s-d3ds-4r45-6358f585a015'
INFO: resuming VM again
INFO: virtio0: dirty-bitmap status: created new
...
INFO:   0% (116.0 MiB of 1.9 TiB) in  3s, read: 38.7 MiB/s, write: 38.7 MiB/s
INFO:   1% (19.3 GiB of 1.9 TiB) in  8m 48s, read: 37.5 MiB/s, write: 37.5 MiB/s
INFO:   2% (38.6 GiB of 1.9 TiB) in 17m 51s, read: 36.4 MiB/s, write: 36.4 MiB/s
...
INFO: 100% (1.9 TiB of 1.9 TiB) in  5h 39m 13s, read: 93.3 MiB/s, write: 88.6 MiB/s
INFO: backup is sparse: 396.57 GiB (20%) total zero data
INFO: backup was done incrementally, reused 397.00 GiB (20%)
INFO: transferred 1.89 TiB in 20357 seconds (97.2 MiB/s)
INFO: Finished Backup of VM 126 (05:39:23)
INFO: Backup finished at 2021-07-20 05:24:25
INFO: Backup job finished successfully
TASK OK

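Notably, even this first backup reports 20% reused data. I suspect this comes from a large number of copied files inside the guest, although the reused amount (397.00 GiB) is almost identical to the zero data reported on the line above it (396.57 GiB), so much of it may simply be empty blocks. Either way, it suggests that PBS deduplication works reliably.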
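
The average rate in the log's summary line can be cross-checked from the transferred size and the elapsed seconds. The following is a minimal sketch, assuming the summary line keeps the format shown in the logs in this post; the helper name `average_mib_per_s` is made up for illustration and is not part of any Proxmox tool.

```python
import re

# Binary unit multipliers, expressed in MiB.
UNIT_TO_MIB = {"MiB": 1, "GiB": 1024, "TiB": 1024 * 1024}

# Matches summary lines such as:
#   INFO: transferred 1.89 TiB in 20357 seconds (97.2 MiB/s)
TRANSFER_RE = re.compile(
    r"transferred (?P<size>[\d.]+) (?P<unit>MiB|GiB|TiB) in (?P<secs>\d+) seconds"
)


def average_mib_per_s(line):
    """Recompute the average rate in MiB/s from a 'transferred' summary line."""
    m = TRANSFER_RE.search(line)
    if m is None:
        raise ValueError(f"not a transfer summary line: {line!r}")
    size_mib = float(m["size"]) * UNIT_TO_MIB[m["unit"]]
    return size_mib / int(m["secs"])


# Summary line copied from the VM 126 log above.
vm126 = "INFO: transferred 1.89 TiB in 20357 seconds (97.2 MiB/s)"
print(f"VM 126: {average_mib_per_s(vm126):.1f} MiB/s")
```

The recomputed value differs slightly from the logged 97.2 MiB/s because the log rounds the size to 1.89 TiB before printing it.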

For comparison, VM 187, which lives entirely on Ceph RBD, backs up at about 60 MiB/s and took 1:04.

INFO: Starting Backup of VM 187 (qemu)
INFO: Backup started at 2021-07-16 23:45:01
INFO: status = running
INFO: VM Name: 
INFO: include disk 'virtio0' 'cephr:vm-187-disk-0' 32G
INFO: include disk 'virtio1' 'cephr:vm-187-disk-1' 200G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/187/2021-07-16T15:45:01Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task '89d824a6-be40-4rw3-8d21-a9d0f555a178'
INFO: resuming VM again
INFO: virtio0: dirty-bitmap status: created new
INFO: virtio1: dirty-bitmap status: created new
INFO:   0% (252.0 MiB of 232.0 GiB) in  3s, read: 84.0 MiB/s, write: 74.7 MiB/s
INFO:   1% (2.3 GiB of 232.0 GiB) in 33s, read: 71.2 MiB/s, write: 51.3 MiB/s
INFO:   2% (4.7 GiB of 232.0 GiB) in  1m 15s, read: 57.4 MiB/s, write: 41.1 MiB/s
...
INFO:  98% (227.4 GiB of 232.0 GiB) in  1h  3m 45s, read: 48.8 MiB/s, write: 40.8 MiB/s
INFO:  99% (229.8 GiB of 232.0 GiB) in  1h  4m 17s, read: 77.0 MiB/s, write: 40.5 MiB/s
INFO: 100% (232.0 GiB of 232.0 GiB) in  1h  4m 20s, read: 741.3 MiB/s, write: 1.3 MiB/s
INFO: backup is sparse: 7.00 GiB (3%) total zero data
INFO: backup was done incrementally, reused 11.76 GiB (5%)
INFO: transferred 232.00 GiB in 3861 seconds (61.5 MiB/s)
INFO: Finished Backup of VM 187 (01:04:22)
INFO: Backup finished at 2021-07-17 00:49:23
INFO: Backup job finished successfully
TASK OK

LXC container backups

LXC backups fall into two cases, depending on the storage medium.

Storage that does not support block-level reads

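If the LXC's disk sits on LVM built on top of SAN, the PBS backup is file-based, and the job runs in two stages:

Stage 1: sync the LXC's files, in two passes, to the directory /var/tmp/vzdumptmpxxxxxxx on the node;

Stage 2: upload that directory to the datastore on the PBS server.

This approach has two obvious drawbacks:

with a large number of files, it is bound to be slow;

and the PVE node itself needs a large amount of temporary space.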

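The LXC below (VM 100) has a 500G disk, and its first backup took nearly five and a half hours. Most of that went into syncing to the local directory: 6813s for the first pass and 9750s for the final pass, during which the guest was suspended.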

INFO: Starting Backup of VM 100 (lxc)
INFO: Backup started at 2021-07-17 22:57:40
INFO: status = running
INFO: CT Name: 
INFO: including mount point rootfs ('/') in backup
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name: 
INFO: including mount point rootfs ('/') in backup
INFO: starting first sync /proc/313180/root/ to /var/tmp/vzdumptmp1153328_100
INFO: first sync finished - transferred 360.71G bytes in 6813s
INFO: suspending guest
INFO: starting final sync /proc/313180/root/ to /var/tmp/vzdumptmp1153328_100
INFO: final sync finished - transferred 331.26G bytes in 9750s
INFO: resuming guest
INFO: guest is online again after 9750 seconds
INFO: creating Proxmox Backup Server archive 'ct/100/2021-07-17T14:57:40Z'
INFO: run: /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp1153328_100/etc/vzdump/pct.conf root.pxar:/var/tmp/vzdumptmp1153328_100 --include-dev /var/tmp/vzdumptmp1153328_100/. --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 100 --backup-time 1626533860 --repository bak@pbs@ip:bak
INFO: Starting backup: ct/100/2021-07-17T14:57:40Z
INFO: Client name: node007
INFO: Starting backup protocol: Sun Jul 18 03:33:43 2021
INFO: No previous manifest available.
INFO: Upload config file '/var/tmp/vzdumptmp1153328_100/etc/vzdump/pct.conf' to 'bak@pbs@ip:8007:bak' as pct.conf.blob
INFO: Upload directory '/var/tmp/vzdumptmp1153328_100' to 'bak@pbs@ip:8007:bakf' as root.pxar.didx
INFO: root.pxar: had to upload 330.91 GiB of 336.04 GiB in 2796.37s, average speed 121.18 MiB/s).
INFO: root.pxar: backup was done incrementally, reused 5.13 GiB (1.5%)
INFO: Uploaded backup catalog (1.04 MiB)
INFO: Duration: 2796.42s
INFO: End Time: Sun Jul 18 04:20:20 2021
INFO: Finished Backup of VM 100 (05:23:08)
INFO: Backup finished at 2021-07-18 04:20:48
INFO: Backup job finished successfully
TASK OK
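
To see where the time goes in this job, the phase durations reported in the log can be turned into percentages. This is a small sketch using only the numbers printed above (6813s, 9750s, 2796.42s and the 05:23:08 total); the function names and phase labels are made up for illustration.

```python
def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' duration, as printed by vzdump, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def phase_shares(phases):
    """Return each phase's share of the summed phase time, in percent."""
    total = sum(phases.values())
    return {name: round(100 * secs / total, 1) for name, secs in phases.items()}


# Durations copied from the CT 100 first-backup log above.
FIRST_BACKUP_CT100 = {
    "first sync": 6813,
    "final sync (guest suspended)": 9750,
    "upload to PBS": 2796.42,
}

shares = phase_shares(FIRST_BACKUP_CT100)
for name, pct in shares.items():
    print(f"{name}: {pct}%")

# The three phases cover almost the whole job; the rest is setup and cleanup.
print(hms_to_seconds("05:23:08") - sum(FIRST_BACKUP_CT100.values()))
```

The two sync passes together account for about 86% of the measured phase time, and the guest is unavailable for the whole of the second pass.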

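Storage that supports block-level reads

If the LXC's disk is on Ceph RBD, PBS reads from the block device (/dev/rbdN) directly: as the log shows, the job creates a storage snapshot, mounts it at /mnt/vzsnap0, and archives from there without first copying the files to a temporary directory. The LXC below (VM 136), with a 1T disk, took two hours.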

INFO: Starting Backup of VM 136 (lxc)
INFO: Backup started at 2021-07-19 21:55:19
INFO: status = running
INFO: CT Name: CCS-OA-Web-centos7-vlan101.52
INFO: including mount point rootfs ('/') in backup
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: create storage snapshot 'vzdump'
/dev/rbd14
INFO: creating Proxmox Backup Server archive 'ct/136/2021-07-19T13:55:19Z'
INFO: run: /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp1360367_136/etc/vzdump/pct.conf root.pxar:/mnt/vzsnap0 --include-dev /mnt/vzsnap0/./ --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 136 --backup-time 1626702919 --repository bak@pbs@ip:bak
INFO: Starting backup: ct/136/2021-07-19T13:55:19Z
INFO: Client name: ynode006
INFO: Starting backup protocol: Mon Jul 19 21:55:20 2021
INFO: No previous manifest available.
INFO: Upload config file '/var/tmp/vzdumptmp1360367_136/etc/vzdump/pct.conf' to 'bak@pbs@ip:8007:bak' as pct.conf.blob
INFO: Upload directory '/mnt/vzsnap0' to 'bak@pbs@ip:8007:bak' as root.pxar.didx
INFO: root.pxar: had to upload 358.49 GiB of 481.80 GiB in 7445.50s, average speed 49.30 MiB/s).
INFO: root.pxar: backup was done incrementally, reused 123.30 GiB (25.6%)
INFO: Uploaded backup catalog (42.60 MiB)
INFO: Duration: 7445.78s
INFO: End Time: Mon Jul 19 23:59:26 2021
INFO: cleanup temporary 'vzdump' snapshot
2021-07-19 23:59:40.566 7f440affd700 -1 librbd::object_map::InvalidateRequest: 0x7f441400f490 should_complete: r=0
Removing snap: 100% complete...done.
INFO: Finished Backup of VM 136 (02:04:21)
INFO: Backup finished at 2021-07-19 23:59:40
INFO: Backup job finished successfully
TASK OK

Incremental backups

For VM 187 above, the incremental backup three days later took only a few minutes, one hour less than the first backup.

INFO: Starting Backup of VM 187 (qemu)
INFO: Backup started at 2021-07-19 23:45:02
INFO: status = running
INFO: VM Name: 
INFO: include disk 'virtio0' 'rdb001:vm-187-disk-0' 32G
INFO: include disk 'virtio1' 'rdb001:vm-187-disk-1' 200G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating Proxmox Backup Server archive 'vm/187/2021-07-19T15:45:02Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
INFO: started backup task 'c3ccaa90-5td4-56t7-a099-714279794082'
INFO: resuming VM again
INFO: virtio0: dirty-bitmap status: OK (508.0 MiB of 32.0 GiB dirty)
INFO: virtio1: dirty-bitmap status: OK (5.5 GiB of 200.0 GiB dirty)
INFO: using fast incremental mode (dirty-bitmap), 6.0 GiB dirty of 232.0 GiB total
INFO:   1% (112.0 MiB of 6.0 GiB) in  3s, read: 37.3 MiB/s, write: 37.3 MiB/s
INFO:   3% (224.0 MiB of 6.0 GiB) in  6s, read: 37.3 MiB/s, write: 37.3 MiB/s
...
INFO:  97% (5.9 GiB of 6.0 GiB) in  4m  8s, read: 16.0 MiB/s, write: 16.0 MiB/s
INFO:  99% (6.0 GiB of 6.0 GiB) in  4m 11s, read: 38.7 MiB/s, write: 33.3 MiB/s
INFO: 100% (6.0 GiB of 6.0 GiB) in  4m 13s, read: 30.0 MiB/s, write: 30.0 MiB/s
INFO: backup was done incrementally, reused 226.36 GiB (97%)
INFO: transferred 6.04 GiB in 253 seconds (24.4 MiB/s)
INFO: Finished Backup of VM 187 (00:04:14)
INFO: Backup finished at 2021-07-19 23:49:16
INFO: Backup job finished successfully
TASK OK
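
The reason this run is so short is visible in the dirty-bitmap lines: only the blocks marked dirty since the previous backup are read. The sketch below sums those lines to get the dirty fraction. It assumes the line format shown in the log above; `dirty_summary` is a hypothetical helper, not a Proxmox API.

```python
import re

# Binary unit multipliers, expressed in GiB.
UNIT_TO_GIB = {"MiB": 1 / 1024, "GiB": 1.0, "TiB": 1024.0}

# Matches lines such as:
#   INFO: virtio0: dirty-bitmap status: OK (508.0 MiB of 32.0 GiB dirty)
DIRTY_RE = re.compile(
    r"(?P<disk>\w+): dirty-bitmap status: OK "
    r"\((?P<dirty>[\d.]+) (?P<dirty_unit>MiB|GiB|TiB) of "
    r"(?P<total>[\d.]+) (?P<total_unit>MiB|GiB|TiB) dirty\)"
)


def dirty_summary(log_lines):
    """Return (dirty_gib, total_gib) summed over all disks with a valid bitmap."""
    dirty = total = 0.0
    for line in log_lines:
        m = DIRTY_RE.search(line)
        if m:
            dirty += float(m["dirty"]) * UNIT_TO_GIB[m["dirty_unit"]]
            total += float(m["total"]) * UNIT_TO_GIB[m["total_unit"]]
    return dirty, total


# Lines copied from the VM 187 incremental log above.
log = [
    "INFO: virtio0: dirty-bitmap status: OK (508.0 MiB of 32.0 GiB dirty)",
    "INFO: virtio1: dirty-bitmap status: OK (5.5 GiB of 200.0 GiB dirty)",
]
dirty_gib, total_gib = dirty_summary(log)
print(f"{dirty_gib:.2f} GiB dirty of {total_gib:.0f} GiB "
      f"({100 * dirty_gib / total_gib:.1f}%)")
```

For this job the result is about 6.0 GiB of 232 GiB, roughly 2.6% of the disks, which matches the "6.0 GiB dirty of 232.0 GiB total" line in the log.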

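CT 100 is a different story. Its incremental backup, just like the first one, syncs all of the LXC's files to the node directory (the unavoidable main time sink) and only then uploads the increment (3.23 GiB) to PBS, which itself took 1747s. The total ends up almost the same as for the first backup. So when an LXC runs on storage that cannot be read in block mode, the upload stage does recognize the increment, but in terms of elapsed time every backup is close to a full backup.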

INFO: Starting Backup of VM 100 (lxc)
INFO: Backup started at 2021-07-19 23:45:02
INFO: status = running
INFO: including mount point rootfs ('/') in backup
INFO: mode failure - some volumes do not support snapshots
INFO: trying 'suspend' mode instead
INFO: backup mode: suspend
INFO: ionice priority: 7
INFO: CT Name:
INFO: including mount point rootfs ('/') in backup
INFO: starting first sync /proc/313180/root/ to /var/tmp/vzdumptmp67232_100
INFO: first sync finished - transferred 361.84G bytes in 8395s
INFO: suspending guest
INFO: starting final sync /proc/313180/root/ to /var/tmp/vzdumptmp67232_100
INFO: final sync finished - transferred 333.67G bytes in 9568s
INFO: resuming guest
INFO: guest is online again after 9568 seconds
INFO: creating Proxmox Backup Server archive 'ct/100/2021-07-19T15:45:02Z'
INFO: run: /usr/bin/proxmox-backup-client backup --crypt-mode=none pct.conf:/var/tmp/vzdumptmp67232_100/etc/vzdump/pct.conf root.pxar:/var/tmp/vzdumptmp67232_100 --include-dev /var/tmp/vzdumptmp67232_100/. --skip-lost-and-found --exclude=/tmp/?* --exclude=/var/tmp/?* --exclude=/var/run/?*.pid --backup-type ct --backup-id 100 --backup-time 1626709502 --repository bak@pbs@ip:bak
INFO: Starting backup: ct/100/2021-07-19T15:45:02Z
INFO: Client name: ynode007
INFO: Starting backup protocol: Tue Jul 20 04:44:25 2021
INFO: Downloading previous manifest (Sat Jul 17 22:57:40 2021)
INFO: Upload config file '/var/tmp/vzdumptmp67232_100/etc/vzdump/pct.conf' to 'bak@pbs@ip:8007:bak' as pct.conf.blob
INFO: Upload directory '/var/tmp/vzdumptmp67232_100' to 'bak@pbs@ip:8007:bak' as root.pxar.didx
INFO: root.pxar: had to upload 3.23 GiB of 337.11 GiB in 1747.13s, average speed 1.89 MiB/s).
INFO: root.pxar: backup was done incrementally, reused 333.88 GiB (99.0%)
INFO: Uploaded backup catalog (1.04 MiB)
INFO: Duration: 1752.75s
INFO: End Time: Tue Jul 20 05:13:38 2021
INFO: Finished Backup of VM 100 (05:28:48)
INFO: Backup finished at 2021-07-20 05:13:50

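Optimization

The LXC above (VM 100) was migrated to Ceph RBD by restoring its PBS backup as a new container (LXC VM 132). The first backup then took 1:04, far less than the earlier 5:23, but later incremental backups were not noticeably faster. The first and second backups are shown below; the second is only a few minutes shorter.

So whichever storage an LXC runs on, every backup takes about as long as a full backup.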

 INFO: No previous manifest available.
 ...
 INFO: root.pxar: had to upload 334.09 GiB of 339.47 GiB in 3870.63s, average speed 88.38 MiB/s).
 INFO: root.pxar: backup was done incrementally, reused 5.38 GiB (1.6%)
 INFO: Uploaded backup catalog (1.04 MiB) 
 INFO: Duration: 3870.68s
 INFO: End Time: Wed Jul 21 22:04:34 2021
 INFO: cleanup temporary 'vzdump' snapshot
 Removing snap: 100% complete...done.
 INFO: Finished Backup of VM 132 (01:04:47)
 INFO: Backup finished at 2021-07-21 22:04:49


 INFO: Downloading previous manifest (Wed Jul 21 21:00:02 2021)
 ...
 INFO: root.pxar: had to upload 2.00 GiB of 340.49 GiB in 3499.52s, average speed 597.80 KiB/s).
 INFO: root.pxar: backup was done incrementally, reused 338.49 GiB (99.4%)
 INFO: Uploaded backup catalog (1.04 MiB)
 INFO: Duration: 3499.90s
 INFO: End Time: Thu Jul 22 21:58:23 2021
 INFO: cleanup temporary 'vzdump' snapshot
 Removing snap: 100% complete...done.
 INFO: Finished Backup of VM 132 (00:58:36)
 INFO: Backup finished at 2021-07-22 21:58:38
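
One way to read these two runs is to separate how fast new data was uploaded from how fast the client worked through the whole archive. The sketch below derives both from the "had to upload" line. It assumes the line format shown in the logs in this post; "scan rate" is a label chosen here, not a figure PBS reports, and `pxar_rates` is a hypothetical helper.

```python
import re

# Binary unit multipliers, expressed in MiB.
UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0, "TiB": 1024.0 * 1024}

# Matches lines such as:
#   INFO: root.pxar: had to upload 2.00 GiB of 340.49 GiB in 3499.52s, ...
PXAR_RE = re.compile(
    r"had to upload (?P<up>[\d.]+) (?P<up_unit>[KMGT]iB) "
    r"of (?P<total>[\d.]+) (?P<total_unit>[KMGT]iB) "
    r"in (?P<secs>[\d.]+)s"
)


def pxar_rates(line):
    """Return (upload_rate, scan_rate) in MiB/s for a 'had to upload' line.

    upload_rate: new data sent to PBS per second (what the log reports).
    scan_rate:   whole archive size per second, a derived figure showing how
                 fast the client worked through all of the container's data.
    """
    m = PXAR_RE.search(line)
    if m is None:
        raise ValueError(f"not a pxar upload line: {line!r}")
    secs = float(m["secs"])
    uploaded = float(m["up"]) * UNIT_TO_MIB[m["up_unit"]]
    total = float(m["total"]) * UNIT_TO_MIB[m["total_unit"]]
    return uploaded / secs, total / secs


# Lines copied from the CT 132 logs above (first and second backup).
first = "INFO: root.pxar: had to upload 334.09 GiB of 339.47 GiB in 3870.63s, average speed 88.38 MiB/s)."
second = "INFO: root.pxar: had to upload 2.00 GiB of 340.49 GiB in 3499.52s, average speed 597.80 KiB/s)."

for label, line in (("first", first), ("second", second)):
    upload, scan = pxar_rates(line)
    print(f"{label}: upload {upload:.2f} MiB/s, scan {scan:.2f} MiB/s")
```

In the second run only 2.00 GiB is uploaded, yet the whole archive is still processed at roughly 100 MiB/s, about the same pace as the first run. That is consistent with the observation that the duration tracks the container's total data size rather than the amount that changed.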

Summary

Ranked by PBS backup efficiency, from highest to lowest:

VM(LVM on Fc SAN) > VM(ceph rbd) > LXC(ceph rbd) > LXC(LVM on Fc SAN)

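For efficient backups, LXC containers will from now on run only on Ceph storage, and SAN-backed LVM storage will host only VMs.

Workloads with more than 100G of data will not use LXC.

Note: with a dedicated driver from the hardware vendor, it might also be possible for PBS to do block-level backups of LVM on FC SAN. Anyone interested is welcome to try.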
