Introduction

The other day we upgraded our VMware NSX environment from 3.x to 4.x. At the first login after the upgrade, the alarm screen showed many NSX Corfu certificates as expired, as can be seen below:

Many expired NSX certificates after upgrading to 4.1.2.3

You can see these alarms on the dashboard right after logging in, or by going to System -> Settings -> Certificates and clicking the View expired certificates line to see all the expired certificates.

Troubleshooting

I knew from an earlier issue that NSX uses certificates called Corfu, but I did not expect this after the upgrade, and it was not listed as a known issue in the release notes. A quick search led me to KB94898. It seems you can encounter this issue when you upgrade an NSX environment that was originally deployed on a version below 3.2.3. In those versions the Cluster Boot Manager (CBM) service certificates were given a validity period of 825 days instead of 100 years. This was corrected in 3.2.3 and NSX 4.1.0, but an upgrade does not fix the existing certificates by itself: their 825-day expiration stays in place until you replace them with the script from the KB, which I will walk through in this post.
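To get a feel for the difference between the two validity periods, here is a minimal sketch of the date arithmetic. The deployment date is a made-up example and not taken from our environment; only the 825-day and 100-year (36500-day) figures come from the KB and the script output.

```python
from datetime import date, timedelta

OLD_VALIDITY_DAYS = 825    # validity of CBM certificates on deployments below 3.2.3
NEW_VALIDITY_DAYS = 36500  # roughly 100 years, as generated by the replacement script


def expiry_date(issued_on: date, validity_days: int) -> date:
    """Return the date on which a certificate issued on `issued_on` expires."""
    return issued_on + timedelta(days=validity_days)


# Hypothetical deployment date, purely for illustration.
deployed = date(2022, 1, 10)
print("825-day certificate expires on:  ", expiry_date(deployed, OLD_VALIDITY_DAYS))
print("36500-day certificate expires on:", expiry_date(deployed, NEW_VALIDITY_DAYS))
```

With 825 days being a little over two years and three months, an environment deployed on an affected version can easily hit the expiry date right around the time you get to a major upgrade.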

As with NSX 3.2.x, there is no functional impact on NSX 4.1.x when these internal certificates expire. However, NSX 4.1.x raises alarms to show you that something has expired, which NSX 3.2.x did not yet do.

The article explains the usage of the script quite well. However, I had many issues getting the script to work to begin with. I see that since I ran it, the KB has been updated to include the required Python packages. Good thing, because that took me some time! So first things first, the requirements to run the script:

  1. Download the script from the KB94898.
  2. Make sure you have an NSX 4.1.x environment.
  3. Create an NSX backup under System -> Lifecycle Management -> Backup & Restore -> Start Backup.
  4. Note down the NSX Manager VIP. This can be found under System -> Configuration -> Appliances -> Virtual IP.
  5. Make sure you have the required packages. The KB mentioned the following packages before the edit:
    • paramiko
    • cryptography
    • pyOpenSSL (listed only after the KB was edited)
  6. Execute the script by running python3 replace_certs_v1.1.py. Follow the procedure.

This seems easy enough. However, before the KB edit I had some trouble getting the script to work because of missing Python packages on my system.

First let’s start off by updating ‘pip’:

Windows:
py -m pip install --upgrade pip

Mac:
python3 -m pip install --upgrade pip

Then we install the packages as suggested in the KB:

Windows:
PS C:\Users\b.vaneeden\AppData\Local\Programs\Python\Python312\Scripts> .\pip3 install cryptography
PS C:\Users\b.vaneeden\AppData\Local\Programs\Python\Python312\Scripts> .\pip3 install paramiko

Mac:
python3 -m pip install paramiko
python3 -m pip install cryptography

Now everything should have been in place to run the script. However, while running it I ran into the following error message (on both Windows and Mac):

PS C:\Users\b.vaneeden\Downloads> py .\replace_certs_v1.1.py
Traceback (most recent call last):
  File "C:\Users\b.vaneeden\Downloads\replace_certs_v1.1.py", line 18, in <module>
    import requests
ModuleNotFoundError: No module named 'requests'

Alright, it seems we also need the ‘requests’ Python package. Let’s install that one too:

Windows:
py -m pip install --upgrade requests

Mac:
python3 -m pip install requests

Let’s try the script again:

Windows:
PS C:\Users\b.vaneeden\Downloads> py .\replace_certs_v1.1.py
YOU NEED TO INSTALL PYTHON MODULE "OpenSSL"

Mac:
bryanvaneeden@Bryans-MacBook-Pro Downloads % python3 replace_certs_v1.1.py 
YOU NEED TO INSTALL PYTHON MODULE "OpenSSL"

OK. As mentioned, before the KB was updated it did not say that you needed pyOpenSSL. I tried several things to fix the issue:

  • I installed LibreSSL and OpenSSL1 on my Mac -> no resolution.
  • I replaced all OpenSSL lines in the code with LibreSSL -> no resolution.

Only after a while did I read online that Python’s built-in SSL module does not cover all of the functionality of the OpenSSL libraries, and that you should install pyOpenSSL to complement it. That fixed the issue so I could run the script:

Windows:
py -m pip install --upgrade pyOpenSSL

Mac:
python3 -m pip install pyOpenSSL
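To avoid finding the missing packages one error at a time like I did, a small preflight check can help. The sketch below is my own and not part of the KB script. It only covers the four packages I ran into in this post, so the list may be incomplete for other versions of the script. Note that the module you import is called OpenSSL while the pip package is called pyOpenSSL, which is exactly what made the error message confusing.

```python
import importlib.util

# Import name -> pip package name, based only on the packages from this post.
# "OpenSSL" is the import name provided by the pyOpenSSL package.
REQUIRED = {
    "paramiko": "paramiko",
    "cryptography": "cryptography",
    "requests": "requests",
    "OpenSSL": "pyOpenSSL",
}


def missing_packages(required=REQUIRED):
    """Return the pip package names whose import name cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages, install them with:")
        print("  python3 -m pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it with the same interpreter you will use for the replacement script (py on Windows, python3 on Mac), otherwise you might be checking a different Python installation.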

Now let’s get back to running the script with python3 replace_certs_v1.1.py:

bryanvaneeden@Bryans-MacBook-Pro Downloads % python3 replace_certs_v1.1_1719576737534.py
*************************************************************
* replace_certs.py 1.1                                      *
*                                                           *
* This script will replace self-signed certificates on      *
* an NSX Manager cluster. The newly generated certificates  *
* will retain the properties of the replaced certificates.  *
*                                                           *
* The estimated execution time is 50 minutes if all         *
* certificates are replaced. During this time period do not *
* make any other configuration changes to the NSX Manager.  *
*                                                           *
* It is highly recommended to backup the NSX Manager before *
* running this script.                                      *
*************************************************************

Do you want to continue (Yes/No)? Yes
Have you backed up your system (Yes/No)? Yes
Enter the Manager VIP address or one of the Manager's IP addresses if no VIP: XXX.XXX.XXX.XXX
Enter the cluster's admin password (will not be displayed): 
To confirm,  re-enter the password (will not be displayed): 

Node Type is NSX Manager
Local site id is: XXXX-XXXX-XXXX-XXXX-XXXXXX

You are about to replace the certificates in cluster XXX.XXX.XXX.XXX
Do you want to continue (Yes/No): Yes
Skipping because APH certificate expires after 31 days
Skipping because API certificate expires after 31 days
Skipping because CBM_MESSAGING_MANAGER certificate expires after 31 days
Skipping because CBM_UPGRADE_COORDINATOR certificate expires after 31 days
Skipping because CBM_CORFU certificate expires after 31 days
Waiting for cluster to stabilize
Cluster is stable

About to replace CBM_MP certificate on node 034b0e1c-... as it has already expired or is expiring soon
Generating a self-signed certificate with expiry of 36500 days
Replacing CBM_MP certificate on node 034b0e1c-...
Sleeping for 150 seconds ...
Skipping because CCP certificate expires after 31 days
Skipping because APH_TN certificate expires after 31 days
Skipping because CBM_SITE_MANAGER certificate expires after 31 days
Waiting for cluster to stabilize
Cluster is stable

This went on until all certificates were replaced.
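Judging by the log lines above, the script skips every certificate that is still valid for more than 31 days and replaces the ones that have expired or are about to. I have not verified this against the script’s source, so treat the sketch below as my assumption of that decision, inferred from the output only. The function name, certificate names and dates are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

THRESHOLD_DAYS = 31  # taken from the "expires after 31 days" log lines


def needs_replacement(not_after, now=None, threshold_days=THRESHOLD_DAYS):
    """Return True if the certificate has expired or expires within the threshold."""
    if now is None:
        now = datetime.now(timezone.utc)
    return not_after <= now + timedelta(days=threshold_days)


# Hypothetical expiry dates, evaluated against a fixed "now" for repeatability.
now = datetime(2024, 7, 1, tzinfo=timezone.utc)
certs = {
    "CBM_MP": datetime(2024, 4, 14, tzinfo=timezone.utc),  # already expired
    "API": datetime(2025, 3, 1, tzinfo=timezone.utc),      # still valid for months
}
for name, not_after in certs.items():
    action = "replace" if needs_replacement(not_after, now) else "skip"
    print(f"{name}: {action}")
```

This also explains why the run above skipped most certificates on the first node: only the ones inside that 31-day window are touched.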

There is also a manual process to replace all of the Corfu certificates in the NSX Managers. However, you do not want to do this: there are so many Corfu certificates that it is a very lengthy process. It used to be in the KB, but it has since been edited out. If you still want to go this route, please contact VMware GSS.

Once the script is done, the old certificates will either be categorized as missing or no longer show up as expired at all. If they are no longer in use you will see a ‘0’ in the ‘Where Used’ column. You can safely remove certificates that are categorized as missing and no longer in use by clicking them and choosing Remove.

Conclusion

So in the end we can conclude that this issue is a leftover from code used in NSX versions before 3.2.3. However, simply upgrading the environment is not enough; manual intervention is required to fully resolve it.

Once the script has run you will see that the certificates are no longer valid for only 825 days, but for 100 years.

NSX CBM Corfu certificate validity for 100 years

I hope this blog will help you while facing this issue!


Bryan van Eeden

Bryan is an ambitious and seasoned IT professional with almost a decade of experience in designing, building and operating complex (virtual) IT environments. In his current role he tackles customers, complex issues and design questions on a daily basis. Bryan holds several certifications such as VCIX-DCV, VCAP-DCA, VCAP-DCD, V(T)SP and vSAN and vCloud Specialist badges.
