Introduction
Just a couple of months ago, during the summer, VMware Cloud Director (VCD) 10.4 was released, and now in December we have 10.4.1! Both feel like major releases: VCD 10.4 brought the big change in the Console Proxy load-balancing technique and Photon 3.0, while VCD 10.4.1 adds Solution Add-On Management, PostgreSQL 14, IP Spaces and the cool (beta) branding feature!
Now it was time for us to upgrade to 10.4.1 to take advantage of all the new features in the 10.4 release branch so far! The biggest change, of course, is the way the console proxy service works in 10.4. It is no longer required to use a separate port (TCP 8443), a separate certificate and possibly (depending on configuration) a dedicated URL to load balance this traffic to the cells. Everything can now be done over a unified port, TCP 443.
Tomas Fojta explains this in his blog post (HERE) if you want a detailed look. The diagram below is from his page and gives a quick overview of how it works (credits to the creator, Francois Misiak).
Issue
In VCD 10.4 a couple of additional items have also changed regarding security and the management of certificates. The release notes mention the following on this:
Enhanced trust management integration with vSphere
VMware Cloud Director 10.4 enhances SSL connectivity to all vSphere infrastructure components, including ESXi, by incorporating the vSphere Certificate Authority (CA) into the VMware Cloud Director trust mechanisms which also affects previously added vCenter Server instances. IMPORTANT: Because of this enhancement to SSL connectivity, you must perform additional steps post-upgrade to ensure that VMware Cloud Director trusts all necessary vSphere certificates. Failure to perform these steps post-upgrade can disrupt the connection between VMware Cloud Director and vCenter Server instances. Follow the procedure outlined in the advisory that appears upon upgrade. See also KB 78885 and The VMware Cloud Director console proxy, uploading OVFs and media, and powering on a VM fail.
All in all, this is similar to the requirement that came with the VCD 10.3 upgrade, where you also had to re-import the vSphere infrastructure certificates to end up with a fully working environment. VCD 10.4 extends this to the ESXi certificates.
During the upgrade to VCD 10.4.1 (which went fine) we used the automated option (option 2 in the mentioned KB) to automatically import all the certificates for the configured vCenter Servers and NSX servers, plus the VMCA certificates, with the following command:
/opt/vmware/vcloud-director/bin/cell-management-tool trust-infra-certs --vsphere --unattended
After this we finished the upgrade and started testing the environment. Everything turned out to work correctly, except for the console connections. These had worked without any issues before the upgrade to 10.4, so we figured the problem was probably related to the upgrade itself. After checking the (changed) load balancer, the VCD cells and the logging, we found the following snippet in the console-proxy.log file:
2022-12-14 12:53:58,983 | DEBUG | nioEventLoopGroup-2-6 | NettyWebSocketClientHandler | channelInactive, channel=[id: 0xaa2a6c4c, L:/xx.x.xxx.x:48114 ! R:esx01.local/xx.x.xxx.x:443] [server: [L=/xx.x.xxx.x:443 R=/xx.x.xxx.x:55820]] |
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:131)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:353)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:296)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:291)
    at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:654)
    at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.onCertificate(CertificateMessage.java:473)
    at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.consume(CertificateMessage.java:369)
    at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392)
    at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1074)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1061)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1008)
    at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1548)
    at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1394)
    at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1235)
    at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449)
    ... 17 more
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:439)
    at java.base/sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:306)
    at java.base/sun.security.validator.Validator.validate(Validator.java:264)
    at java.base/sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:313)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:233)
    at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:110)
    at com.vmware.vcloud.common.crypto.ssl.TenantAwareTrustManager.checkServerTrusted(TenantAwareTrustManager.java:70)
    at com.vmware.vcloud.common.crypto.ssl.DelegatingTrustManager.checkTrust(DelegatingTrustManager.java:102)
    at com.vmware.vcloud.common.crypto.ssl.DelegatingTrustManager.checkServerTrusted(DelegatingTrustManager.java:74)
    at java.base/sun.security.ssl.AbstractTrustManagerWrapper.checkServerTrusted(SSLContextImpl.java:1524)
    at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:632)
    ... 31 more
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
    at java.base/sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
    at java.base/java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297)
    at java.base/sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:434)
    ... 41 more
2022-12-14 12:53:58,982 | ERROR | pool-jetty-70 | ServerWebSocket | Connecting to ESX esx01.local [server: [L=/xx.x.xxx.x:443 R=/xx.x.xxx.x:55820]] [client: [id: 0xaa2a6c4c, L:/xx.x.xxx.x:48114 ! R:esx01.local/xx.x.xxx.x:443]] | io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
This is clearly not regular console-proxy output, so we dug a bit further and noticed that VCD did not seem to have, or accept, the certificates of the ESXi hosts on which the VMs reside. Investigating this with GSS revealed that the command described in the KB, which we executed during the change, does not automatically accept self-signed ESXi host certificates. In other words, you either have to manually upload all of the ESXi host certificates, or manually upload the VMCA certificate.
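To see which hosts are affected, the log can be filtered for the trust failure. The following is a minimal sketch, not an official tool. The function name failing_esx_hosts is made up for illustration, the log path in the usage comment is an assumption about a default appliance install, and it assumes the "Connecting to ESX" message and the PKIX error sit on the same log line, as in the ERROR entry above.

```shell
#!/bin/sh
# Hypothetical helper: list the ESXi hosts for which the console proxy
# logged a PKIX trust failure. Expects the path to console-proxy.log.
failing_esx_hosts() {
    # Keep only lines carrying the trust error, pull out the host name that
    # follows "Connecting to ESX", and de-duplicate the result.
    grep 'PKIX path building failed' "$1" \
        | sed -n 's/.*Connecting to ESX \([^ ]*\) .*/\1/p' \
        | sort -u
}

# Example (assumed log location on a VCD cell, adjust if yours differs):
# failing_esx_hosts /opt/vmware/vcloud-director/logs/console-proxy.log
```

If the list contains every host behind a given vCenter Server, that points at a missing CA certificate rather than at a single broken host.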
Resolution
Fortunately, uploading the VMCA certificate is pretty easy to do:
- For each vCenter Server connected to the VCD environment, download the certificate bundle from the following URL: https://vcenter.domain.com/certs/download.zip.
- Extract the ZIP file.
- Search under the /certs/win folder for the following files:
  - Files ending in .0, .1 and so on are root certificates.
  - Files ending in .r0, .r1 and so on are Certificate Revocation Lists (CRLs); we don't need these.
- Go to the VMware Cloud Director provider page -> Administration -> Trusted Certificates.
- Click Import and import the root certificate from the ZIP file downloaded earlier.
- Retry the console connections.
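The download and extraction steps can also be done from a shell. This is a rough sketch under a few assumptions: list_root_certs is a hypothetical helper name, the target directory vcenter-certs is arbitrary, and the extra match on a .crt suffix is an assumption for bundles that name the root certificates that way. The import into VCD itself is still done through the UI as described above.

```shell
#!/bin/sh
# Hypothetical helper: print the root certificate files found under an
# extracted download.zip, skipping the CRLs (.r0, .r1, ...).
list_root_certs() {
    # '*.[0-9]' matches names ending in .0, .1, ... but not .r0, .r1, ...
    find "$1" -type f \( -name '*.[0-9]' -o -name '*.crt' \) | sort
}

# Example (not run here; -k skips TLS verification because the vCenter
# certificate may not be trusted on this machine yet):
# curl -k -o download.zip https://vcenter.domain.com/certs/download.zip
# unzip download.zip -d vcenter-certs
# list_root_certs vcenter-certs
# openssl x509 -in <one of the listed files> -noout -subject -enddate
```

Printing the subject and expiry date before importing is a cheap way to confirm you picked the CA certificate and not something else from the bundle.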
Once the certificate is imported, the console connections should work again. You can also have another look at the console-proxy.log file, which should now show healthy logging like this:
2022-12-16 12:37:20,167 | DEBUG | pool-jetty-53379 | ServerWebSocket | Decoded ticket host=esx02.local, payload=com.vmware.consoleproxy.ticket.TicketPayload@dcba0ed1 {userName: system; vmName: BvE-Test-2bf798bc-1e54-4d19-b785-08b4516f162e - BvE-Test-NSXT; vmId: vm-122556; orgName: org-xxx; destHostThumbprint: XX:2A:70:36:22:6C:XX:2B:B8:XX:9F:42:XX:5C:B1:XX:23:54;vcId: xxxxx-e892-4048-bb76-xxxxxxxx;ticketType: webmks}, ESX ticket=9b5eb0480c693772 [server: [L=/xx.x.xxx.x:443 R=/xx.x.xxx.x:58518]] [client: not-connected] |
2022-12-16 12:37:20,173 | DEBUG | nioEventLoopGroup-2-9 | NettyWebSocketClientHandler | channelActive, channel=[id: 0x26939e93, L:/xx.x.xxx.x:37436 - R:esx02.local/xx.x.xxx.x:443] [server: [L=/xx.x.xxx.x:443 R=/xx.x.xxx.x:58518]] |
2022-12-16 12:37:20,193 | DEBUG | nioEventLoopGroup-2-9 | NettyWebSocketClientHandler | WebSocket Client connected, channel=[id: 0x26939e93, L:/xx.x.xxx.x:37436 - R:esx02.local/xx.x.xxx.x:443] [server: [L=/xx.x.xxx.x:443 R=/xx.x.xxx.x:58518]] |
2022-12-16 12:48:26,896 | DEBUG | pool-jetty-53493 | ServerWebSocket | onClose: status=1,006, reason=Session Closed [server: [L=/xx.x.xxx.x:443 R=/xx.x.xxx.x:58518]] [client: [id: 0x26939e93, L:/xx.x.xxx.x:37436 - R:esx02.local/xx.x.xxx.x:443]] |
2022-12-16 12:48:26,897 | DEBUG | nioEventLoopGroup-2-9 | NettyWebSocketClientHandler | channelInactive, channel=[id: 0x26939e93, L:/xx.x.xxx.x:37436 ! R:esx02.local/xx.x.xxx.x:443] [server: [L=/xx.x.xxx.x:443 R=/xx.x.xxx.x:58518]] |
There you have it. If you are having issues with your console connections in VCD 10.4.x, this might be your issue. As mentioned earlier, the first two options in KB 78885 did not work out for us; only manually uploading the VMCA certificate worked. This is because the ESXi hosts in this environment were using self-signed VMCA certificates.
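To check whether your own hosts are in the same situation, you can look at who issued the certificate an ESXi host presents. The sketch below uses standard openssl commands; the function names are made up for illustration and esx01.local is just the host name from the log above. It assumes the machine you run it on can reach the host on TCP 443.

```shell
#!/bin/sh
# Hypothetical helper: print subject and issuer of a PEM certificate on stdin.
cert_subject_issuer() {
    openssl x509 -noout -subject -issuer
}

# Hypothetical helper: fetch the certificate an ESXi host presents on
# port 443 and print who issued it.
esx_cert_issuer() {
    # s_client prints the server certificate in PEM form; x509 picks it up
    # from the surrounding handshake output.
    openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
        | cert_subject_issuer
}

# Example (needs network access to the host):
# esx_cert_issuer esx01.local
```

If the issuer turns out to be the VMCA of your vCenter Server rather than a CA that VCD already trusts, importing the VMCA root certificate as described above is the likely fix.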