In my homelab I’m using a server that contains both Gigabit and 10GbE NICs. When I first deployed ESXi 7.0U1, the list of network devices looked like this:
Name    PCI Device    Driver    Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
------  ------------  --------  ------------  -----------  -----  ------  -----------------  ----  -----------
vmnic0  0000:02:00.0  nmlx4_en  Up            Down         0      Half    xx:xx:xx:xx:xx:xx  1500  Mellanox Technologies MT27500 Family [ConnectX-3]
vmnic1  0000:03:00.0  igbn      Up            Up           1000   Full    xx:xx:xx:xx:xx:xx  2018  Intel Corporation I211 Gigabit Network Connection
vmnic2  0000:04:00.0  igbn      Up            Down         0      Half    xx:xx:xx:xx:xx:xx  2018  Intel Corporation I211 Gigabit Network Connection
In this system the Mellanox card isn’t connected to a switch (yet), so I didn’t notice the problem straight away: after I installed some patches, the vmnic0 interface no longer showed up. VMware has a great KB article that describes how to determine driver and firmware versions, but in this case the driver wasn’t loaded, so most of the commands mentioned in the KB were useless.
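A missing interface like this is easy to overlook, so here is a minimal sketch of how it could be spotted from the ESXi shell. The helper name missing_nics and the list of expected vmnic names are my own assumptions for illustration. The commented commands are standard ESXi tools that still return something useful when the NIC driver is not loaded.

```shell
# Report which expected vmnic names are absent from `esxcli network nic list`
# output supplied on stdin. Arguments: the names you expect to see.
# (missing_nics is a hypothetical helper, not part of ESXi.)
missing_nics() {
    # Collect the names of all NICs that are currently listed.
    present=$(awk '$1 ~ /^vmnic[0-9]+$/ { print $1 }')
    for nic in "$@"; do
        # Print the expected name only if no listed NIC matches it exactly.
        echo "$present" | grep -qx "$nic" || echo "$nic"
    done
}

# On the ESXi host (not run here):
#   esxcli network nic list | missing_nics vmnic0 vmnic1 vmnic2
#   lspci | grep -i mellanox                # is the PCI device still enumerated?
#   esxcli software vib list | grep nmlx4   # which driver VIB versions are installed?
```

If the card still appears in lspci but not in the NIC list, that points towards a driver problem rather than a hardware fault.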
This seemed like a good excuse to evaluate Runecast, as one of its features is to identify installed hardware and the supported driver and firmware versions.
Runecast is one of many companies that provide NFR licenses for individuals who are in the VMware vExpert program. Requesting an NFR license was a breeze via this link, and after creating a profile I was able to download a virtual appliance in OVA format. I deployed this appliance and configured it to connect to my vSphere environment.
Identifying NIC details
It didn’t take long before the inventory was collected from vSphere. When I navigated to the ‘HW Compatibility’ feature I could easily check which systems and I/O components are compliant with VMware’s HCL. And sure enough my Mellanox interface did not have a compatible driver installed. A link to the VMware HCL for the specific device is provided which allows you to quickly download the compatible driver for installation.
And it doesn’t stop there. The HW compatibility feature also offers a ‘what if’ scenario where you can simulate the impact of a vSphere upgrade on your environment.
Updating the driver
According to the report from Runecast, I simply needed to upgrade my nmlx4 driver to version 184.108.40.206. Ideally, in a cluster you would distribute this driver using Lifecycle Manager, but I only have a single physical host, so I used the esxcli command instead, which worked just fine.
[root@esx01:/vmfs/volumes/5fbd8a02-a1c0e492-c0f8-80ee73f0abb5] esxcli software vib install -d /vmfs/volumes/datastore1/Mellanox-nmlx4_220.127.116.11-1OEM.618.104.22.16869922-offline_bundle-17262032.zip
Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
Reboot Required: true
VIBs Installed: MEL_bootbank_nmlx4-core_22.214.171.124-1OEM.6126.96.36.19969922, MEL_bootbank_nmlx4-en_188.8.131.52-1OEM.6184.108.40.20669922, MEL_bootbank_nmlx4-rdma_220.127.116.11-1OEM.618.104.22.16869922
VIBs Removed: VMW_bootbank_nmlx4-core_22.214.171.124-2vmw.701.0.0.16850804, VMW_bootbank_nmlx4-en_126.96.36.199-2vmw.701.0.0.16850804, VMW_bootbank_nmlx4-rdma_188.8.131.52-2vmw.701.0.0.16850804
And sure enough vmnic0 came back from the dead.
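The reboot and the check afterwards could be sketched roughly as follows. The helper names reboot_required and nic_driver, the log path and the bundle path are assumptions of mine for illustration. The esxcli commands in the comments are standard, but the exact order you want on your own host may differ.

```shell
# Succeeds (exit 0) when esxcli install output on stdin reports that a
# reboot is needed. (Hypothetical helper.)
reboot_required() {
    grep -q 'Reboot Required: true'
}

# Print the driver a given vmnic is bound to, based on
# `esxcli network nic list` output on stdin. Prints nothing if the NIC
# is not listed. (Hypothetical helper.)
nic_driver() {
    awk -v nic="$1" '$1 == nic { print $3 }'
}

# On the ESXi host (not run here; both paths are placeholders):
#   esxcli software vib install -d /path/to/offline_bundle.zip > /tmp/install.log
#   reboot_required < /tmp/install.log \
#       && esxcli system maintenanceMode set --enable true \
#       && reboot
#   # after the reboot:
#   esxcli network nic list | nic_driver vmnic0
#   esxcli software vib list | grep nmlx4
```

If nic_driver prints a driver name for vmnic0 after the reboot, the interface is back and bound to a driver again.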
Of course I’m only scratching the surface when it comes to all the features that are on offer with Runecast. But even for a simple issue like identifying an incorrect driver version on a single-host homelab, Runecast worked amazingly well.