Another day in the field, another issue. I’ve been working on VMware Cloud Director more recently and I’ve come across a fascinating new User Interface bug in the HTML5 Tenant UI. This particular environment is running VMware Cloud Director v10.0.0.2 build 16081830. Fun fact: in this release it’s actually still called vCloud Director instead of VMware Cloud Director; this changed in version 10.1.
But back to the issue at hand. From the tenant perspective in VMware Cloud Director there are two major “compute” objects inside the Organization vDC that can be managed: the vAPPs and the Virtual Machines. A vAPP in the sense of VCD is a logical construct that consists of one or more virtual machines that share common settings that originate from the vAPP construct. You can also execute common actions on the vAPP, which in turn executes the action on all virtual machines inside it. There is also the possibility to edit the Start and Stop Order, so you can decide and configure the startup order of your virtual machines. All in all this makes management a lot easier.
Another great option you have is that you can “Move” virtual machines into a different vAPP. You might think: why would I want to do that? Well, this is useful when you are changing startup orders or application groups, for example. While “moving” a powered on VM to a different vAPP you get the possibility to edit some settings while you’re at it. You can change the following settings:
- Change the VM Name
- Change the Storage Policy
- Change the Network
- Change the Network Adapter Type (only if the VM is powered off)
- Change the NIC Connection State
- Change the Network (only if the VM is powered off)
- Change the IP Mode
- Change the MAC Address
Under normal conditions this should all work out fine and you should have no issues at all. However, in our environment I am having issues like the one below:
VM XXX failed to update and has been rolled back. Cannot change network adapter type of existing virtual machine.
Wait what? I didn’t even change any settings. I just opened the “Move” workflow window, pressed Next a couple of times and started the move like below:
Something that did catch my eye (the second time I ran the workflow) is that the Network Adapter Type was set to “E1000E” and wasn’t changeable. Hmm, strange, because this VM has a “VMXNET3” adapter type. Well, now the error message makes sense though: changing the Network Adapter Type is not possible while the VM is running.
Troubleshooting the issue
Now that the issue is clear to us all, we can start troubleshooting it. Of course the basic steps include checking if the VM has another Network Adapter Type, which we already did. Next up I tried this on a couple of other VMs that I had in this tenant. Strangely, this worked just fine. At some point I started thinking that this might be due to a strange condition that the VM was in inside the VCD inventory. After checking VCD and its log files it seemed that this was not the case here. The VM was also not in a vAPP yet, so no “global” vAPP settings were interfering with the move we tried to do.
While I was starting to get annoyed because I couldn’t find anything that would make sense of the issue, I tried to move the same VM again. However, this time it worked flawlessly and the VCD Move VM UI workflow displayed the VMXNET3. Once I tried this a third time it stopped working again. Because of this I decided to create a case with VMware support while continuing the troubleshooting.
Something I recalled is that it was still possible to use the old “Flex” Flash UI by changing some properties on the VCD Cell(s). Since version 10 the Flash UI has been deprecated and is disabled by default, but it’s still possible to enable it and use it. This can be done by following the next couple of easy steps:
- Log in to each of your VCD Cells through SSH.
- Check if the Flex UI is enabled by executing the following line:
root@vcd-cell01 [ /opt/vmware/vcloud-director/bin ]# ./cell-management-tool manage-config -n flex.ui.enabled -l
Property "flex.ui.enabled" has value "true"
- If this returns “true” such as above, you are good to go. If not, please continue with step 4.
- Execute the following line if you want to enable the Flex UI for all users, or the second line for only system administrators:
root@vcd-cell01 [ /opt/vmware/vcloud-director/bin ]# ./cell-management-tool manage-config -n flex.ui.enabled -v true
Updating property: Property "flex.ui.enabled" has value "true"
root@vcd-cell01 [ /opt/vmware/vcloud-director/bin ]# ./cell-management-tool manage-config -n flex.ui.enabled -v sys-admin-only
Updating property: Property "flex.ui.enabled" has value "sys-admin-only"
- Once you have done the above you need to restart the VCD services on all VCD Cells. You can do this by executing the below line:
root@vcd-cell01 [ /opt/vmware/vcloud-director/bin ]# service vmware-vcd restart
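If you have more than a couple of cells, the check from the steps above can be scripted. Below is a minimal sketch, not an official tool: it only parses the output line shown earlier and decides whether the Flex UI is already enabled. The cell host name, the root SSH login and the helper names are my own assumptions, and I am also assuming that anything other than “true” or “sys-admin-only” means the Flex UI is not enabled.

```shell
#!/usr/bin/env bash

# Pull the value out of a line such as:
#   Property "flex.ui.enabled" has value "true"
flex_ui_value() {
  sed -n 's/.*Property "flex\.ui\.enabled" has value "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeeds when the Flex UI is already enabled, either for all users
# or for system administrators only. Any other value (or no value at
# all) is treated as "not enabled"; that is an assumption.
flex_ui_is_enabled() {
  case "$1" in
    true|sys-admin-only) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: query one cell over SSH and report its state.
# The host name you pass in and the root login are assumptions.
check_cell() {
  local cell="$1" value
  value="$(ssh "root@${cell}" \
    '/opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n flex.ui.enabled -l' \
    | flex_ui_value)"
  if flex_ui_is_enabled "$value"; then
    echo "${cell}: Flex UI enabled (${value})"
  else
    echo "${cell}: Flex UI not enabled, set the property and restart vmware-vcd"
  fi
}
```

You would call it once per cell, for example `check_cell vcd-cell01`. Nothing is changed on the cell; enabling the property and restarting the service is still a manual step as described above.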
Once this is done on all VCD Cells you can log in to the Flex UI by browsing to your VCD instance with the following URL:
So once I did this I tried to move the VM to the same vAPP through the Flex UI, and everything worked out flawlessly without any issues at all. I tried this a couple dozen times and I never encountered the issue again!
So while I figured out that the Flex UI did work for me, VMware got back to me on my GSS case. They instantly recognized the issue and told me that this is a known issue in our VCD version 10.0.0.2. Well great, and of course they also told me there currently is no solution for the HTML5 interface. Unfortunately I cannot link you to the KB, because there’s only an internal KB. But I can explain the issue with some more information, and provide you with a solution, which you unfortunately cannot implement right away.
What is happening is that the VMware Cloud Director HTML5 interface assumes the default vAPP network adapter type for the VM, which would be the E1000E network adapter. Because of this we are not able to “change” the network adapter type in the dropdown list, which by the way shouldn’t be needed to begin with, because on working VMs VMXNET3 is selected by default (at least if the VM has this type of network adapter). When we continue the Move VM workflow it will fail like we saw before. This is because the actual “recomposeVApp” payload will try to change the Network Adapter Type to E1000E, like I will show you in the below figure:
This issue is most common on so-called standalone VMs, which are VMs that are not yet inside a vAPP construct. I’ve been told that this issue is present on all current VMware Cloud Director versions up to 10.1.2.
The workaround here is something you will probably have figured out by yourself by now: shut down the VM and then use the Move VM workflow. This by itself is really dissatisfying because it requires actual downtime for a logical move of a resource… But like I showed you before, you can also still use the Flex UI to move the VMs to vAPPs. Unfortunately these are the only workarounds I can offer at this point.
I want to end this post with a silver lining though. VMware told me that Engineering has prepared a patch for this issue which will be released in VMware Cloud Director version 10.2! This version will be released later this year! Once this has been released and we have updated our VCD environment I will of course update this post!
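To make the failure condition a bit more tangible, here is a small sketch of the logic as I understand it: the move is rejected when the VM is powered on and the payload asks for a different adapter type than the VM actually has. This is not how VCD implements it. The XML fragment in the usage example is a simplified, made-up stand-in for the real payload, the `NetworkAdapterType` element name is an assumption on my side, and the power state strings and function names are hypothetical.

```shell
#!/usr/bin/env bash

# Extract the first NetworkAdapterType value from an XML fragment on stdin.
# The element name is an assumption about what the payload looks like.
adapter_type_in_payload() {
  sed -n 's/.*<NetworkAdapterType>\([^<]*\)<\/NetworkAdapterType>.*/\1/p' | head -n 1
}

# Succeeds (returns 0) when the move would be rejected: the VM is powered
# on and the requested adapter type differs from the one the VM has.
# POWERED_ON / POWERED_OFF are hypothetical labels used by this sketch only.
move_would_fail() {
  local power_state="$1" current_type="$2" requested_type="$3"
  [ "$power_state" = "POWERED_ON" ] && [ "$current_type" != "$requested_type" ]
}
```

A quick usage example with a made-up fragment:

```shell
payload='<NetworkConnection><NetworkAdapterType>E1000E</NetworkAdapterType></NetworkConnection>'
requested="$(printf '%s\n' "$payload" | adapter_type_in_payload)"

if move_would_fail POWERED_ON VMXNET3 "$requested"; then
  echo "Move would be rolled back: adapter type would change to ${requested}"
fi
```

With the VM powered off the same check no longer triggers, which matches the shutdown workaround described above.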
I hope everybody had fun reading this post. Keep coming back to the blog to find more posts in the future!