VMDirectPath – Fix Configuration Issues

VMDirectPath is a great new feature with vSphere 4.0, but with new features come new challenges. This page will deal with some common configuration problems that you might encounter.

1) Disable VMDirectPath with a reboot
2) Disable VMDirectPath without a reboot
3) Remove devices from the VMDirectPath Configuration when you’re unable to do so with the vSphere client
4) Dealing with the ESXi boot device that has been enabled for VMDirectPath

Disabling VMDirectPath with a reboot

You may encounter a situation where you simply need to disable VMDirectPath. If you are able to reboot your host, you can set the VMkernel option Boot.noIOMMU to enabled and restart the server. This won’t help if you’ve allocated your boot device to VMDirectPath. After you do this and go to Configuration \ Hardware \ Advanced Settings, the host will show the message that it does not support VMDirectPath, as shown in the second image. If you want to re-enable VMDirectPath, you can uncheck the option and reboot.

Disable VMDirectPath without a reboot

Should you want to disable VMDirectPath without restarting the host, you can unload (or reload) the vtd driver module with the commands below.

/usr/lib/vmware/vmkmod # vmkload_mod -u vtd
Module vtd successfully unloaded
/usr/lib/vmware/vmkmod # vmkload_mod vtd
Module vtd loaded successfully
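
Before and after unloading, it can help to confirm whether the module is actually present. The following is a minimal sketch, not something from the original procedure: vtd_state is a hypothetical helper name, and it assumes that vmkload_mod -l lists one loaded module per line with the module name in the first column.

```shell
#!/bin/sh
# Sketch only: vtd_state is a hypothetical helper, not a VMware command.
# Assumption: "vmkload_mod -l" prints one loaded module per line,
# with the module name in the first column.

vtd_state() {
    # Bail out politely when run somewhere other than an ESXi console.
    if ! command -v vmkload_mod >/dev/null 2>&1; then
        echo "vmkload_mod not found"
        return 0
    fi
    # awk exits 0 only if some line has "vtd" as its first field.
    if vmkload_mod -l | awk '$1 == "vtd" { found = 1 } END { exit !found }'; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}

vtd_state
```

Matching on the first field rather than grepping the whole line avoids false hits on other modules whose names merely contain "vtd".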

When you try to start a VM after you have unloaded the module vtd, you will get an error such as this. You will still be able to configure VMDirectPath devices in Hardware/Advanced Settings.

Remove devices from VMDirectPath Configuration when you’re unable to do so with the vSphere client

A common issue that will hopefully be resolved in an update is the inability to remove a device from passthrough once it has been enabled. You might enable devices as shown below for the graphics card and NIC, reboot the host and then find that later you can’t remove the devices from the pass-thru configuration. When you uncheck the devices and click OK, the devices still appear on the pass-thru configuration screen, and after a reboot the devices are still shown as enabled. While it is possible to reset the configuration of the host, that may not be the most desirable option.

The configuration for passthru is stored in /etc/vmware/esx.conf. It is possible to edit this file with vifs.pl (supported option) or at the console (unsupported). You’ll note that the Pro/1000 MT is listed as vmnic2 and vmnic3 and the video card is not listed. The item listed for passthru is 00:28.05, which lspci shows to be the device “Bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 6”. In this case, you can change the owner to vmkernel and reboot, and the devices should no longer be enabled for VMDirectPath. If you’re accessing esx.conf at the console, you can also run backup.sh 0 /bootbank/ to ensure the change is backed up to the system backup.

When trying to reproduce this problem on the host the PCI bridge would not always be shown in esx.conf, but manually adding the entry and setting the owner to vmkernel was sufficient to clear the devices from being assigned to pass-thru mode.

/device/000:26.0/owner = "vmkernel"
/device/000:26.1/owner = "vmkernel"
/device/000:26.2/owner = "vmkernel"
/device/000:26.7/owner = "vmkernel"
/device/000:28.5/owner = "passthru"
/device/000:29.0/owner = "vmkernel"
/device/000:29.1/owner = "vmkernel"
/device/000:29.2/owner = "vmkernel"
/device/000:29.7/owner = "vmkernel"
/device/000:31.2/owner = "vmkernel"
/device/000:31.2/vmkname = "vmhba0"
/device/000:31.5/owner = "vmkernel"
/device/000:31.5/vmkname = "vmhba1"
/device/006:00.0/vmkname = "vmnic0"
/device/007:00.0/vmkname = "vmnic1"
/device/008:03.0/vmkname = "vmnic2"
/device/008:03.1/vmkname = "vmnic3"
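
If you would rather not hand-edit the file, the owner change can be scripted. This is only a sketch under a few assumptions: list_passthru and reset_passthru are hypothetical helper names, the owner lines look exactly like the ones in the listing above, and you run it against a working copy of esx.conf rather than the live file. The demo at the end uses a small sample file so nothing on the host is touched.

```shell
#!/bin/sh
# Sketch only: list_passthru and reset_passthru are hypothetical helpers.
# Point them at a working copy, never directly at /etc/vmware/esx.conf.

# Print every device entry whose owner is currently passthru.
list_passthru() {
    grep '^[[:space:]]*/device/.*/owner = "passthru"' "$1"
}

# Rewrite passthru owners to vmkernel, keeping the original as FILE.bak.
reset_passthru() {
    cp "$1" "$1.bak" &&
    sed 's|^\([[:space:]]*/device/[^/]*/owner = \)"passthru"|\1"vmkernel"|' \
        "$1.bak" > "$1"
}

# Demo on a throwaway sample file (left in place for inspection).
conf=$(mktemp)
cat > "$conf" <<'EOF'
/device/000:26.7/owner = "vmkernel"
/device/000:28.5/owner = "passthru"
/device/000:29.0/owner = "vmkernel"
/device/000:31.2/vmkname = "vmhba0"
EOF

list_passthru "$conf"      # shows the 000:28.5 entry
reset_passthru "$conf"
cat "$conf"                # every owner line now reads "vmkernel"
```

The sed pattern only touches /device/.../owner lines, so vmkname entries and anything else in the file pass through unchanged. As noted above, a reboot is still needed for the change to take effect.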

Dealing with the ESXi boot device that has been enabled for VMDirectPath

It is possible to inadvertently enable pass-thru on the USB device that ESXi boots from. As shown in the image below, all USB devices in the system have been enabled for pass-thru. After a reboot, the host will boot very slowly and ESXi will not be able to write to the partitions on the USB boot device, making the host largely unusable.

If you attempt operations at the console, you’ll find that no write operations are enabled and the partitions are not mounted properly. If you experience this issue you may find that the console stays at the “dvsfilter loaded successfully” step for several minutes or longer during bootup.

# backup.sh 0 /bootbank/
config explicitly loaded
Boot partition /bootbank/ cannot be found
~ # ls -l
l---------    0 root     root               1984 Jan  1  1970 altbootbank -> /vmfs/volumes/7cfc25ed-9e7abe3b-92b0-4b92d4cdeb1e
drwxr-xr-x    1 root     root                512 Nov 14 18:26 bin
l---------    0 root     root               1984 Jan  1  1970 bootbank -> /vmfs/volumes/53cc92eb-5326110e-6194-a0c34fc4fe34
drwxr-xr-x    1 root     root                512 Nov 14 18:48 dev
drwxr-xr-x    1 root     root                512 Nov 14 18:42 etc
drwxr-xr-x    1 root     root                512 Nov 14 18:26 lib
l---------    0 root     root               1984 Jan  1  1970 locker -> /tmp/scratch
drwxr-xr-x    1 root     root                512 Sep 17 21:47 opt
drwxr-xr-x    1 root     root             131072 Nov 14 18:48 proc
l---------    0 root     root               1984 Jan  1  1970 productLocker -> /locker/packages/4.0.0/
drwxr-xr-x    1 root     root                512 Nov 14 18:26 sbin
l---------    0 root     root               1984 Jan  1  1970 scratch -> /tmp/scratch
l---------    0 root     root               1984 Jan  1  1970 store -> /vmfs/volumes/efd8efe3-03bc1cbf-15e0-080efd9e7379
drwxrwxrwt    1 root     root                512 Nov 14 18:42 tmp
drwxr-xr-x    1 root     root                512 Nov 14 18:26 usr
drwxr-xr-x    1 root     root                512 Nov 14 18:26 var
drwxr-xr-x    1 root     root                512 Nov 14 18:26 vmfs
l---------    0 root     root               1984 Jan  1  1970 vmupgrade -> /locker/vmupgrade/

To troubleshoot this problem, I tried both disabling IOMMU with a VMkernel boot option and disabling Intel VT-d in the BIOS, but neither resolved the problem. While it may be possible to access the console or SSH, manually editing esx.conf will not help: the file can’t be written to the /bootbank/ partition, manually or automatically, because that partition is no longer mounted properly once the USB boot device is handed off to VMDirectPath.

I figured the best approach would be to use this method to edit the esx.conf stored in state.tgz and set the owner of the affected components to vmkernel instead of passthru.

This host did have SSH enabled, so I logged in and took a look at the esx.conf file. I found that two devices were enabled for passthru, as shown below in the output of esx.conf and lspci. I then booted with the Linux live CD and, when I first opened esx.conf, found that no devices were set to passthru, as shown in the image below. The problem was that I was looking at the original firmware partition and not the current one.

/advUserOptions/options[0003]/name = "CIMWatchdogInterval"
/advUserOptions/options[0003]/type = "int"
/device/000:26.0/owner = "vmkernel"
/device/000:26.1/owner = "vmkernel"
/device/000:26.2/owner = "vmkernel"
/device/000:26.7/owner = "vmkernel"
/device/000:27.0/owner = "passthru"
/device/000:29.0/owner = "vmkernel"
/device/000:29.1/owner = "vmkernel"
/device/000:29.2/owner = "vmkernel"
/device/000:29.7/owner = "vmkernel"
/device/000:30.0/owner = "passthru"
/device/000:31.2/owner = "vmkernel"
/device/000:31.2/vmkname = "vmhba0"
/device/000:31.5/owner = "vmkernel"
/device/000:31.5/vmkname = "vmhba1"
/device/006:00.0/vmkname = "vmnic0"
/device/007:00.0/vmkname = "vmnic1"
/device/008:03.0/vmkname = "vmnic2"
/device/008:03.1/vmkname = "vmnic3"
/net/pnic/child[0000]/mac = "00:30:48:db:68:88"
/net/pnic/child[0000]/name = "vmnic0"

LSPCI OUTPUT
00:27.00 Multimedia controller: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:30.00 Bridge: Intel Corporation 82801BA/CA/DB/EB PCI Bridge
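
For the live CD step, the edit to state.tgz can be sketched roughly as follows. This is my own illustration rather than the exact procedure referenced above. It assumes that state.tgz wraps a local.tgz archive which in turn holds etc/vmware/esx.conf, that you are working on the state.tgz from the current boot bank (not the original firmware partition), and that fix_state_tgz is just a hypothetical helper name. It keeps the untouched archive as state.tgz.orig.

```shell
#!/bin/sh
# Sketch only: fix_state_tgz is a hypothetical helper, not a VMware tool.
# Assumption: state.tgz contains local.tgz, which contains etc/vmware/esx.conf.
# Run it from the live CD against the state.tgz of the *current* boot bank.

fix_state_tgz() {
    state=$1                                   # path to state.tgz
    work=$(mktemp -d) || return 1

    cp "$state" "$state.orig" || return 1      # keep the untouched archive
    tar -xzf "$state" -C "$work" || return 1   # yields $work/local.tgz
    mkdir "$work/local" || return 1
    tar -xzf "$work/local.tgz" -C "$work/local" || return 1

    conf="$work/local/etc/vmware/esx.conf"
    # Hand every passthru device back to the VMkernel. Writing through
    # cat keeps the mode of the extracted file.
    sed 's|^\([[:space:]]*/device/[^/]*/owner = \)"passthru"|\1"vmkernel"|' \
        "$conf" > "$conf.new" &&
    cat "$conf.new" > "$conf" &&
    rm "$conf.new" || return 1

    # Repack the inner archive, then the outer one.
    # Note: "*" skips any top-level dotfiles inside local.tgz.
    (cd "$work/local" && tar -czf ../local.tgz *) || return 1
    (cd "$work" && tar -czf state.new.tgz local.tgz) || return 1

    cp "$work/state.new.tgz" "$state" && rm -rf "$work"
}

# Example (the mount point is hypothetical):
#   fix_state_tgz /mnt/bootbank/state.tgz
```

Run it as root so that file ownership inside the archives survives the round trip, and check the extracted esx.conf by hand before rebooting the host from the USB device.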
