Configure the BIOS
Enable VT-d and SR-IOV in the BIOS.
Configure kernel parameters
grep IOMMU /boot/config-3.10.0-957.27.2.el7.x86_64
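If the kernel was not built with CONFIG_INTEL_IOMMU_DEFAULT_ON, add intel_iommu=on to the kernel boot parameters. To avoid a performance penalty on devices that are not passed through, also add the kernel parameter iommu=pt.
On this machine intel_iommu is not enabled by default, so the kernel parameters have to be added: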
sudo vim /boot/grub2/grub.cfg
Append the two parameters intel_iommu=on iommu=pt to the kernel command line of the relevant boot entry.
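Hand edits to grub.cfg are lost if the file is later regenerated with grub2-mkconfig. As a hedged alternative, the sketch below uses grubby, which ships with CentOS 7, to add the same two parameters, plus a small helper for verifying the running command line after the reboot. The function names has_kernel_arg and enable_iommu_args are illustrative and not part of the original procedure.

```shell
# Sketch: add the IOMMU parameters with grubby instead of editing grub.cfg by hand.
# Function names are illustrative assumptions.

# Return 0 if the given kernel command line contains the given parameter.
has_kernel_arg() {
    # $1: command line string, $2: parameter such as intel_iommu=on
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Append the parameters to every installed kernel entry (run as root).
enable_iommu_args() {
    grubby --update-kernel=ALL --args="intel_iommu=on iommu=pt"
}

# After the reboot, confirm that a parameter is active:
#   has_kernel_arg "$(cat /proc/cmdline)" intel_iommu=on && echo "intel_iommu=on is set"
```

Checking /proc/cmdline verifies what the running kernel actually booted with, which catches the case where the wrong boot entry was edited.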
Then run sudo reboot. Once the machine is fully back up, run sudo dmesg | grep IOMMU
and sudo dmesg | grep DMAR
If the output contains "DMAR: IOMMU enabled" and "DMAR: Intel(R) Virtualization Technology for Directed I/O", the machine meets the basic requirements for passthrough.
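The two dmesg checks can be combined into one scripted test. This is a minimal sketch that only looks for the two message strings quoted above; the function name check_iommu_log is an assumption made for illustration.

```shell
# Sketch: read kernel log text on stdin and succeed only if both DMAR
# messages quoted above are present. check_iommu_log is an illustrative name.
check_iommu_log() {
    log=$(cat)
    # -F matches the strings literally, so "(R)" and "/" need no escaping.
    printf '%s\n' "$log" | grep -qF 'DMAR: IOMMU enabled' || return 1
    printf '%s\n' "$log" | grep -qF 'DMAR: Intel(R) Virtualization Technology for Directed I/O' || return 1
    return 0
}

# Usage:
#   sudo dmesg | check_iommu_log && echo "IOMMU is ready for passthrough"
```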
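Choose the NIC to pass through
This machine has four NICs in total.
Check the interface name of each NIC.
eth0 and eth1 are already in use, while eth2 and eth3 are free, so eth3 is used for SR-IOV passthrough. Check how many VFs eth3 supports, then create the VFs.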
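Add the command to /etc/rc.local so that the VFs are recreated after a reboot:
echo "echo '8' > /sys/class/net/eth3/device/sriov_numvfs" >> /etc/rc.local
On CentOS 7, /etc/rc.d/rc.local must also be executable (chmod +x /etc/rc.d/rc.local), otherwise it is not run at boot.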
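Checking the VF limit and creating the VFs both go through the standard sysfs files sriov_totalvfs and sriov_numvfs. The following is a minimal sketch; the function name create_vfs and the SYSFS_NET override are assumptions added so the logic can be tried against a scratch directory.

```shell
# Sketch: create VFs on a PF after checking the limit reported by the driver.
# SYSFS_NET defaults to /sys/class/net; the override exists only for dry runs.
create_vfs() {
    # $1: PF interface name (for example eth3), $2: number of VFs wanted
    dev_dir="${SYSFS_NET:-/sys/class/net}/$1/device"
    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    if [ "$2" -gt "$total" ]; then
        echo "$1 supports at most $total VFs" >&2
        return 1
    fi
    # The kernel refuses to change a non-zero VF count directly, so reset to 0
    # first. Note that this destroys any VFs that already exist.
    echo 0 > "$dev_dir/sriov_numvfs"
    echo "$2" > "$dev_dir/sriov_numvfs"
}

# Usage (as root):
#   create_vfs eth3 8
```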
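Check the iommu_group of the VFs on eth3 and the other devices in that group. All devices in a group must be passed through together; passing through only one of them fails. Here the group contains a single device, so there is no problem.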
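The IOMMU group membership can be read from sysfs. The sketch below first resolves the PCI address of a VF from the PF's virtfnN link and then lists the devices in that VF's group. The function names and the SYSFS_NET / SYSFS_PCI overrides are illustrative assumptions.

```shell
# Sketch: find a VF's PCI address and list the devices sharing its IOMMU group.

# Print the PCI address of VF number $2 on PF interface $1.
vf_pci_address() {
    link=$(readlink "${SYSFS_NET:-/sys/class/net}/$1/device/virtfn$2") || return 1
    basename "$link"
}

# Print every PCI device in the same IOMMU group as the device at address $1.
iommu_group_devices() {
    ls "${SYSFS_PCI:-/sys/bus/pci/devices}/$1/iommu_group/devices"
}

# Usage:
#   addr=$(vf_pci_address eth3 0)
#   iommu_group_devices "$addr"
# A single line of output means the VF is alone in its group.
```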
Configure OpenStack
- Configure nova-scheduler on the controller
Add PciPassthroughFilter to the nova-scheduler configuration file on the controller:
[filter_scheduler]
enabled_filters = AvailabilityZoneFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter, ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter, PciPassthroughFilter
available_filters = nova.scheduler.filters.all_filters
- Configure nova-api on the controller
Add the alias to the [pci] section of nova.conf:
alias={"vendor_id":"15b3","product_id":"1016","device_type":"type-VF","numa_policy":"preferred","name":"mellanox-0"}
- Configure neutron-server on the controller
Add sriovnicswitch to mechanism_drivers in ml2_conf.ini:
mechanism_drivers = openvswitch,sriovnicswitch
On the compute node, look up the vendor_id and device_id of the VF and add them to sriov_agent.ini:
Add supported_pci_vendor_devs = 15b3:1016 to sriov_agent.ini.
In /usr/lib/systemd/system/neutron-server.service, add the argument --config-file /etc/neutron/plugins/ml2/sriov_agent.ini to the neutron-server command line, then run systemctl daemon-reload
systemctl restart neutron-server
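The vendor and device IDs used in supported_pci_vendor_devs can be read from sysfs on the compute node. This is a minimal sketch; the function name pci_vendor_device and the SYSFS_PCI override are assumptions for illustration.

```shell
# Sketch: print "vendor:device" for a PCI address, in the form expected by
# supported_pci_vendor_devs. pci_vendor_device is an illustrative name.
pci_vendor_device() {
    # $1: PCI address, for example 0000:04:01.2
    dev="${SYSFS_PCI:-/sys/bus/pci/devices}/$1"
    vendor=$(cat "$dev/vendor") || return 1
    device=$(cat "$dev/device") || return 1
    # sysfs stores the IDs with a 0x prefix (0x15b3, 0x1016); strip it.
    echo "${vendor#0x}:${device#0x}"
}

# Usage:
#   pci_vendor_device 0000:04:01.2
```

The same IDs also appear in square brackets in the output of lspci -nn -s 04:01.2.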
- Configure nova-compute on the compute node
[pci]
passthrough_whitelist={"vendor_id":"15b3", "product_id":"1016","address":"0000:04:01.2","physical_network": "provider"}
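The physical_network here is used when creating the tenant network: it is the value passed to the --provider-physical-network option. The VM must be created on that tenant network, and each passed-through NIC needs a corresponding physical_network. In neutron, physical_network names are defined in ml2_conf.ini, for example flat_networks = physnet1 and network_vlan_ranges = physnet2:1:1000.
- Install neutron-sriov-agent on the compute node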
yum install openstack-neutron-sriov-nic-agent
Add the following to sriov_agent.ini:
[securitygroup]
firewall_driver = neutron.agent.firewall.NoopFirewallDriver
[sriov_nic]
physical_device_mappings = provider:eth3
exclude_devices =
systemctl start neutron-sriov-nic-agent
Create the VM
openstack flavor create --ram 1024 --disk 32 --vcpus 2 --property "pci_passthrough:alias"="mellanox-0:1" passthrough-flavor
The "pci_passthrough:alias"="mellanox-0:1" property is required: mellanox-0 is the alias name configured for nova-api, and 1 means that one device is passed through.
openstack --debug server create --flavor passthrough-flavor --image CentOS-7-x86_64-GenericCloud.raw --network provider --wait passthrough-server
Alternatively, create a port on the provider network and specify that port when creating the VM:
net_id=`openstack network show provider | grep " id " | awk '{ print $4 }'`
port_id=`openstack port create --network $net_id --vnic-type direct sriov_port | grep " id " | awk '{ print $4 }'`
openstack flavor create --ram 4096 --disk 60 --vcpus 2 sriov-flavor
openstack server create --flavor sriov-flavor --image centos75-raw --nic port-id=$port_id --wait sriov-server
Issues encountered
- Passing through the PF while SR-IOV is enabled on it fails
b - default default] [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] Instance failed to spawn: libvirtError: internal error: Process exited prior to exec: libvirt: QEMU Driver error : Unable to stat /dev/vfio/48: No such file or directory
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] Traceback (most recent call last):
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2495, in _build_resources
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] yield resources
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2256, in _build_and_run_instance
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] block_device_info=block_device_info)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3204, in spawn
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] destroy_disks_on_failure=True)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5727, in _create_domain_and_network
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] destroy_disks_on_failure)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] self.force_reraise()
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] six.reraise(self.type_, self.value, self.tb)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5696, in _create_domain_and_network
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] post_xml_callback=post_xml_callback)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5630, in _create_domain
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] guest.launch(pause=pause)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] self._encoded_xml, errors='ignore')
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] self.force_reraise()
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] six.reraise(self.type_, self.value, self.tb)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] return self._domain.createWithFlags(flags)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 190, in doit
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] result = proxy_call(self._autowrap, f, *args, **kwargs)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 148, in proxy_call
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] rv = execute(f, *args, **kwargs)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2019-11-26 09:50:55.240 1091525 ERROR nova.compute.manager [instance: ad0d2589-ba8e-4bcb-a56a-26a5306503a2] libvirtError: internal error: Process exited prior to exec: libvirt: QEMU Driver error : Unable to stat /dev/vfio/48: No such file or directory
The kernel reports the following errors:
[777512.451080] vfio-pci 0000:04:00.1: Cannot bind to PF with SR-IOV enabled
[777512.452920] vfio-pci: probe of 0000:04:00.1 failed with error -16
Related kernel patch: "vfio-pci: Disable binding to PFs with SR-IOV enabled" (patchwork.kernel.org)
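The kernel messages indicate that vfio-pci refuses a PF while it still has VFs enabled, so the VF count has to be zero before the whole PF is passed through. A minimal sketch of that check follows; the function name pf_has_vfs and the SYSFS_NET override are illustrative assumptions.

```shell
# Sketch: succeed if the PF currently has one or more VFs enabled.
pf_has_vfs() {
    # $1: PF interface name, for example eth3
    numvfs=$(cat "${SYSFS_NET:-/sys/class/net}/$1/device/sriov_numvfs") || return 1
    [ "$numvfs" -gt 0 ]
}

# Usage (as root), before handing the whole PF to a guest:
#   pf_has_vfs eth3 && echo 0 > /sys/class/net/eth3/device/sriov_numvfs
```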
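- The PF must be UP, otherwise its VFs cannot be passed through. Booting with a pre-created port fails in this case, although using pci.alias seems to work.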
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [req-caa200cd-2dd4-440a-98ca-4c0320eb23be a7ad28c934a84cdca96ba47623110949 02c0f9589cca400abd623868516c209b - default default] [instance: ea885f64-223a-4602-8921-eeb9fb43061c] Instance failed to spawn: libvirtError: internal error: Unable to configure VF 0 of PF 'eth3' because the PF is not online. Please change host network config to put the PF online.
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] Traceback (most recent call last):
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2495, in _build_resources
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] yield resources
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2256, in _build_and_run_instance
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] block_device_info=block_device_info)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3204, in spawn
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] destroy_disks_on_failure=True)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5727, in _create_domain_and_network
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] destroy_disks_on_failure)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] self.force_reraise()
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] six.reraise(self.type_, self.value, self.tb)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5696, in _create_domain_and_network
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] post_xml_callback=post_xml_callback)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5630, in _create_domain
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] guest.launch(pause=pause)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 144, in launch
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] self._encoded_xml, errors='ignore')
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] self.force_reraise()
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] six.reraise(self.type_, self.value, self.tb)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 139, in launch
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] return self._domain.createWithFlags(flags)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 190, in doit
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] result = proxy_call(self._autowrap, f, *args, **kwargs)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 148, in proxy_call
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] rv = execute(f, *args, **kwargs)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 129, in execute
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] six.reraise(c, e, tb)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] rv = meth(*args, **kwargs)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2019-11-26 16:48:37.463 3045060 ERROR nova.compute.manager [instance: ea885f64-223a-4602-8921-eeb9fb43061c] libvirtError: internal error: Unable to configure VF 0 of PF 'eth3' because the PF is not online. Please change host network config to put the PF online.
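A pre-flight check on the compute node can catch this before a VM is scheduled. The sketch below assumes that "online" in the libvirt message corresponds to the interface's administrative UP flag (IFF_UP), which is exposed in the sysfs flags file; the function name pf_is_up and the SYSFS_NET override are illustrative.

```shell
# Sketch: succeed if the interface has the administrative UP flag set.
pf_is_up() {
    # $1: PF interface name, for example eth3
    flags=$(cat "${SYSFS_NET:-/sys/class/net}/$1/flags") || return 1
    # The flags file holds a hex value such as 0x1003; IFF_UP is bit 0.
    [ $(( $flags & 1 )) -eq 1 ]
}

# Usage (as root):
#   pf_is_up eth3 || ip link set eth3 up
```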
- Manually binding a VF to vfio-pci stopped the nova-compute resource_tracker from reporting host resources to nova-conductor, after which nova-scheduler failed in some cases
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager [req-83844fa9-d126-4d70-a95d-167f64a35f3a - - - - -] Error updating resources for node test25g05.ops.lycc.qihoo.net.: libvirtError: internal error: The PF device for VF eth8 has no network device name
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager Traceback (most recent call last):
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 8148, in _update_available_resource_for_node
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager startup=startup)
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 728, in update_available_resource
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager resources = self.driver.get_available_resource(nodename)
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7128, in get_available_resource
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager self._get_pci_passthrough_devices()
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6131, in _get_pci_passthrough_devices
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager pci_info.append(self._get_pcidev_info(name))
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6091, in _get_pcidev_info
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager device.update(_get_device_capabilities(device, address))
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6062, in _get_device_capabilities
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager pcinet_info = self._get_pcinet_info(address)
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6001, in _get_pcinet_info
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager xmlstr = virtdev.XMLDesc(0)
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager File "/usr/lib64/python2.7/site-packages/libvirt.py", line 5540, in XMLDesc
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager if ret is None: raise libvirtError ('virNodeDeviceGetXMLDesc() failed')
2019-11-26 15:10:13.704 2976215 ERROR nova.compute.manager libvirtError: internal error: The PF device for VF eth8 has no network device name
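To spot devices that were bound by hand, it can help to check which kernel driver each VF is currently attached to. This is a minimal sketch; the function name pci_driver and the SYSFS_PCI override are assumptions, and whether unbinding alone clears the nova-compute error was not verified here.

```shell
# Sketch: print the name of the kernel driver bound to a PCI device.
# Fails if the device has no driver bound.
pci_driver() {
    # $1: PCI address, for example 0000:04:01.2
    link=$(readlink "${SYSFS_PCI:-/sys/bus/pci/devices}/$1/driver") || return 1
    basename "$link"
}

# Usage:
#   pci_driver 0000:04:01.2
# A VF left on vfio-pci by hand can be released (as root) with:
#   echo 0000:04:01.2 > /sys/bus/pci/drivers/vfio-pci/unbind
```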