OpenStack SR-IOV Passthrough Configuration Guide

2021-02-24 11:19:00

Configure the BIOS

Enable VT-d and SR-IOV in the BIOS.

Configure kernel parameters

grep IOMMU /boot/config-3.10.0-957.27.2.el7.x86_64

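If the kernel was not built with CONFIG_INTEL_IOMMU_DEFAULT_ON, the boot parameter intel_iommu=on has to be added. To preserve the performance of devices that are not passed through, also add the kernel parameter iommu=pt.

On this machine intel_iommu is not enabled by default, so the kernel parameters have to be added.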

sudo vim /boot/grub2/grub.cfg

Add the two parameters intel_iommu=on iommu=pt to the corresponding kernel line.
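
Editing grub.cfg by hand works, but the change is lost whenever that file is regenerated. Below is a minimal sketch of a more durable alternative. It assumes the stock CentOS 7 layout, where /etc/default/grub holds a GRUB_CMDLINE_LINUX="..." line; the helper name add_iommu_args is made up for illustration.

```shell
# Hypothetical helper: append the two IOMMU parameters to the
# GRUB_CMDLINE_LINUX="..." line of the given file, unless intel_iommu=on
# is already present somewhere in that file.
add_iommu_args() {
    file=$1
    grep -q 'intel_iommu=on' "$file" && return 0
    sed -i 's/^\(GRUB_CMDLINE_LINUX="[^"]*\)"/\1 intel_iommu=on iommu=pt"/' "$file"
}

# Example (as root), then regenerate grub.cfg before rebooting:
#   add_iommu_args /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

The helper is idempotent, so running it a second time leaves the file unchanged.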

Then sudo reboot the machine. Once it has fully restarted, run sudo dmesg | grep IOMMU

and sudo dmesg | grep DMAR

If the output contains "DMAR: IOMMU enabled" and "DMAR: Intel(R) Virtualization Technology for Directed I/O", the machine meets the basic requirements for passthrough.
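
The check above can be scripted. The sketch below reads a kernel log on standard input and succeeds only when both messages are present; the function name iommu_ready is hypothetical, and the exact message wording is taken from this machine's kernel and may differ on others.

```shell
# Hypothetical helper: succeed when the kernel log on stdin contains both
# DMAR messages quoted above (matched as fixed strings, not regexes).
iommu_ready() {
    log=$(cat)
    printf '%s\n' "$log" | grep -qF 'DMAR: IOMMU enabled' &&
    printf '%s\n' "$log" | grep -qF 'DMAR: Intel(R) Virtualization Technology for Directed I/O'
}

# Example:
#   sudo dmesg | iommu_ready && echo "IOMMU is ready for passthrough"
```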

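Identify the NIC to pass through

This machine has four NICs in total.

Check the name of each of these NICs.

eth0 and eth1 are already in use, while eth2 and eth3 are free, so we use eth3 for SR-IOV passthrough. Check how many VFs eth3 supports, then create the VFs.

To keep the VFs from being lost when the machine reboots, run echo "echo '8' > /sys/class/net/eth3/device/sriov_numvfs" >> /etc/rc.local.

Look at the iommu_group of the VFs on eth3 and at the other devices in that group. All devices in a group must be passed through together; passing through only one of them fails. Here the group contains a single device, so there is no problem.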
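
As a rough sketch of the "check the supported count, then create" step: the kernel exposes the maximum in sriov_totalvfs, next to sriov_numvfs. The helper name create_vfs is made up, and the device directory is passed as an argument rather than hard-coded.

```shell
# Hypothetical helper: create <count> VFs under a PCI device directory
# such as /sys/class/net/eth3/device, refusing counts above the maximum.
create_vfs() {
    dev=$1
    want=$2
    total=$(cat "$dev/sriov_totalvfs") || return 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs but the device supports only $total" >&2
        return 1
    fi
    # A non-zero VF count cannot be changed directly to another non-zero
    # value, so reset to 0 first.
    echo 0 > "$dev/sriov_numvfs"
    echo "$want" > "$dev/sriov_numvfs"
}

# Example (as root):
#   create_vfs /sys/class/net/eth3/device 8
```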
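
The group membership check can be sketched as follows. It relies on each PCI device directory in sysfs having an iommu_group link whose devices/ subdirectory lists every member; the function names are hypothetical.

```shell
# Hypothetical helpers: list the members of a device's IOMMU group, and
# test whether the device is alone in its group.
iommu_group_devices() {
    ls "$1/iommu_group/devices"
}

group_is_isolated() {
    [ "$(iommu_group_devices "$1" | wc -l)" -eq 1 ]
}

# Example for the first VF of eth3:
#   iommu_group_devices /sys/class/net/eth3/device/virtfn0
#   group_is_isolated /sys/class/net/eth3/device/virtfn0 && echo "safe to pass through alone"
```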

Configure OpenStack

  • Configure nova-scheduler on the controller

Add PciPassthroughFilter to the nova-scheduler configuration file on the controller:

[filter_scheduler]

enabled_filters = AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter

available_filters = nova.scheduler.filters.all_filters

  • Configure nova-api on the controller

alias={"vendor_id":"15b3","product_id":"1016","device_type":"type-VF","numa_policy":"preferred","name":"mellanox-0"}

  • Configure neutron-server on the controller

In ml2_conf.ini, add the SR-IOV driver to the mechanism drivers:

mechanism_drivers = openvswitch,sriovnicswitch

On the compute node, look up the vendor_id and device_id of the VF and add them to sriov_agent.ini.

In sriov_agent.ini, add supported_pci_vendor_devs = 15b3:1016

In /usr/lib/systemd/system/neutron-server.service, add the argument --config-file /etc/neutron/plugins/ml2/sriov_agent.ini to the neutron-server command line, then run systemctl daemon-reload
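
One way to look up the IDs without extra tools is to read them from sysfs, where each PCI device directory has vendor and device files holding values such as 0x15b3. The sketch below prints them in the vendor:device form used above; the function name pci_vendor_device is made up.

```shell
# Hypothetical helper: print "vendor:device" for a PCI device directory,
# stripping the 0x prefix that sysfs puts in front of each ID.
pci_vendor_device() {
    vendor=$(cat "$1/vendor") || return 1
    device=$(cat "$1/device") || return 1
    echo "${vendor#0x}:${device#0x}"
}

# Example for the first VF of eth3:
#   pci_vendor_device /sys/class/net/eth3/device/virtfn0
```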

systemctl restart neutron-server

  • Configure nova-compute on the compute node

[pci]

passthrough_whitelist={"vendor_id":"15b3", "product_id":"1016","address":"0000:04:01.2","physical_network": "provider"}

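The physical_network here is used when creating the tenant network: it is the value passed to the --provider-physical-network option. The VM must be created on this tenant network, and every passthrough NIC must map to a physical_network. In neutron, physical_network names are written in ml2_conf.ini, for example flat_networks = physnet1 and network_vlan_ranges = physnet2:1:1000.

  • Install neutron-sriov-agent on the compute node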

yum install openstack-neutron-sriov-nic-agent

Add the following to sriov_agent.ini:

[securitygroup]

firewall_driver = neutron.agent.firewall.NoopFirewallDriver

[sriov_nic]

physical_device_mappings = provider:eth3

exclude_devices =

systemctl start neutron-sriov-nic-agent

Create the VM

openstack flavor create --ram 1024 --disk 32 --vcpus 2 --property "pci_passthrough:alias"="mellanox-0:1" passthrough-flavor

The "pci_passthrough:alias"="mellanox-0:1" property is required: mellanox-0 is the alias configured in nova-api, and 1 means that one NIC is passed through.

openstack --debug server create --flavor passthrough-flavor --image CentOS-7-x86_64-GenericCloud.raw --network provider --wait passthrough-server

Alternatively, create a port on the provider network and specify that port when creating the VM:

net_id=`openstack network show provider | grep " id " | awk '{ print $4 }'`

port_id=`openstack port create --network $net_id --vnic-type direct sriov_port | grep " id " | awk '{ print $4 }'`

openstack flavor create --ram 4096 --disk 60 --vcpus 2 sriov-flavor

openstack server create --flavor sriov-flavor --image centos75-raw --nic port-id=$port_id --wait sriov-server

Issues encountered

  • Passing through the PF after SR-IOV has been enabled on it fails with an error
b - default default] [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] Instance failed to spawn: libvirtError: internal error: Process exited prior to exec: libvirt: QEMU Driver error : Unable to stat /dev/vfio/48: No such file or directory
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] Traceback (most recent call last):
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2495, in _build_resources
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     yield resources
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2256, in _build_and_run_instance
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     block_device_info=block_device_info)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3204, in spawn
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     destroy_disks_on_failure=True)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5727, in _create_domain_and_network
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     destroy_disks_on_failure)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     self.force_reraise()
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     six.reraise(self.type_, self.value, self.tb)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5696, in _create_domain_and_network
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     post_xml_callback=post_xml_callback)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5630, in _create_domain
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     guest.launch(pause=pause)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     self._encoded_xml, errors='ignore')
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     self.force_reraise()
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     six.reraise(self.type_, self.value, self.tb)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     return self._domain.createWithFlags(flags)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 190, in doit
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 148, in proxy_call
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     rv = execute(f, *args, **kwargs)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] libvirtError: internal error: Process exited prior to exec: libvirt: QEMU Driver error : Unable to stat /dev/vfio/48: No such file or directory

The kernel reports the following errors:

[777512.451080] vfio-pci 0000:04:00.1: Cannot bind to PF with SR-IOV enabled

[777512.452920] vfio-pci: probe of 0000:04:00.1 failed with error -16

See the kernel patch "vfio-pci: Disable binding to PFs with SR-IOV enabled" on patchwork.kernel.org.
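
The kernel messages say that vfio-pci refuses a PF that still has SR-IOV enabled. A small pre-check can be sketched as below, assuming the PF is to be passed through whole and its VFs are no longer needed; the function name pf_sriov_enabled is hypothetical.

```shell
# Hypothetical helper: succeed when the PF at the given PCI device
# directory currently has one or more VFs configured.
pf_sriov_enabled() {
    n=$(cat "$1/sriov_numvfs") || return 1
    [ "$n" -gt 0 ]
}

# Example (as root): remove the VFs before passing through the PF itself.
#   pf=/sys/bus/pci/devices/0000:04:00.1
#   pf_sriov_enabled "$pf" && echo 0 > "$pf/sriov_numvfs"
```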

  • The PF must be UP, otherwise the VF cannot be passed through. This fails when a port is specified, although using pci.alias appears to work
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [req-caa200cd-2dd4-440a-98ca-4c0320eb23be a7ad28c934a84cdca96ba47623110949 02c0f9589cca400abd623868516c209b - default default] [instance: ea885f64-223a-4602-8921-eeb9fb43061c] Instance failed to spawn: libvirtError: internal error: Unable to configure VF 0 of PF 'eth3' because the PF is not online. Please change host network config to put the PF online.
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] Traceback (most recent call last):
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2495, in _build_resources
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     yield resources
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2256, in _build_and_run_instance
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     block_device_info=block_device_info)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3204, in spawn
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     destroy_disks_on_failure=True)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5727, in _create_domain_and_network
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     destroy_disks_on_failure)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     self.force_reraise()
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     six.reraise(self.type_, self.value, self.tb)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5696, in _create_domain_and_network
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     post_xml_callback=post_xml_callback)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5630, in _create_domain
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     guest.launch(pause=pause)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     self._encoded_xml, errors='ignore')
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     self.force_reraise()
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     six.reraise(self.type_, self.value, self.tb)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     return self._domain.createWithFlags(flags)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 190, in doit
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 148, in proxy_call
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     rv = execute(f, *args, **kwargs)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 129, in execute
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     six.reraise(c, e, tb)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     rv = meth(*args, **kwargs)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] libvirtError: internal error: Unable to configure VF 0 of PF 'eth3' because the PF is not online. Please change host network config to put the PF online.
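
A quick way to test for this condition before launching a VM is sketched below. It assumes that "online" in the libvirt message corresponds to the administrative IFF_UP flag, which is bit 0x1 of the hex value in the interface's sysfs flags file; the function name pf_is_online is made up.

```shell
# Hypothetical helper: succeed when the interface directory (for example
# /sys/class/net/eth3) has the IFF_UP bit set in its flags file.
pf_is_online() {
    flags=$(cat "$1/flags") || return 1
    [ $(( $flags & 1 )) -eq 1 ]
}

# Example (as root): bring the PF up if it is not online yet.
#   pf_is_online /sys/class/net/eth3 || ip link set eth3 up
```
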
  • Manually binding a VF to vfio-pci made the nova-compute resource_tracker stop reporting host resources to nova-conductor, after which nova-scheduler stopped working properly in some cases
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager [req-83844fa9-d126-4d70-a95d-167f64a35f3a - - - - -] Error updating resources for node test25g05.ops.lycc.qihoo.net.: libvirtError: internal error: The PF device for VF eth8 has no network device name
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager Traceback (most recent call last):
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 8148, in _update_available_resource_for_node
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager     startup=startup)
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 728, in update_available_resource
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7128, in get_available_resource
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager     self._get_pci_passthrough_devices()
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6131, in _get_pci_passthrough_devices
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager     pci_info.append(self._get_pcidev_info(name))
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6091, in _get_pcidev_info
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager     device.update(_get_device_capabilities(device, address))
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6062, in _get_device_capabilities
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager     pcinet_info = self._get_pcinet_info(address)
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6001, in _get_pcinet_info
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager     xmlstr = virtdev.XMLDesc(0)
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 5540, in XMLDesc
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager     if ret is None: raise libvirtError ('virNodeDeviceGetXMLDesc() failed')
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager libvirtError: internal error: The PF device for VF eth8 has no network device name
