Today is the second day of VMworld Europe 2019! Last night we closed out day one with some nice drinks at a local restaurant called “El Nacional”. After some rest during the night we are ready to take on the second day! The day started with the general session.

General Session Keynote

Pat started off by giving us a history lesson on how digital innovation and daily life have evolved over the last four decades. Back in the day, digital innovation was completely separate from your daily routine. Nowadays everything is interconnected and the two have become one and the same. We wake up in the morning and listen to an audiobook, upload our sleep patterns to the cloud and step on our high-tech bike to work out.

The “problem” with this from a tech perspective is how do you manage it all? How do we maintain all of these workloads and how do we operate them with simplicity? Well, this is something VMware can help with through their services. Operating workloads across clouds is extremely difficult and something the market has been missing out on. But not anymore. With the VMware vCloud Director Service that was announced today you can manage on-prem, hybrid and hyperscale cloud resources. VMware vCloud Director Service offers the same functionality as VMware vCloud Director, the product that was released years ago, but as a service that you buy directly from VMware. They deploy it, they operate it and they upgrade it for you. If your organization is not able to deploy and maintain a vCloud Director instance, this is really something you should have a look at. Taking advantage of the functionality of vCloud Director will change the way your organization manages its virtual workloads. It will provide you with a Central-Pane-of-Management (CPOM) for all virtual workloads in the entire IT estate.

Next Joe Beda, one of the founders of Kubernetes, went on stage to explain the VMware Tanzu mission: delivering, running and maintaining modern enterprise Kubernetes apps with ease across clouds in a consistent manner. VMware builds on this with a recently launched project called “Project Pacific”. Project Pacific is basically the fusion of Kubernetes and vSphere. They are going to integrate these two products into each other, so that we can natively operate Kubernetes workloads from within vSphere with a deep integration with VMware NSX, VMware vSAN and all of the other vSphere features.

Next the new close relationship with Microsoft was featured. VMware and Microsoft are trying to integrate all sorts of services with each other. VMware is doing this to stay true to its word and let customers manage all of their virtual workloads across clouds in a consistent way. Services that integrate with Azure are HCX, VeloCloud, Azure SQL 2019 on vSphere and full Workspace ONE integration with Microsoft Endpoint Manager.

Also announced today: Project Maestro, a new solution for Telco cloud automation and orchestration. VMware sees a giant opportunity in the Telco market to automate these environments so that providers can deliver more efficient services to their customers.

Another announcement that was made today is: IDS/IPS for NSX!! VMware announced that they are going to release an NSX software-based solution for Intrusion Detection System (IDS) and Intrusion Prevention System (IPS) to increase security on the network, but also to reduce the physical footprint inside the datacenter. Since most IDS/IPS solutions require physical hardware, or a large amount of resources, VMware figured it should release this to further enhance the SDDC vision. From a personal perspective I do wonder what the price of this feature will be, both in compute resources and in money. But we will find out soon enough I guess!

The acquisition of Carbon Black has also been finalized. They explained that they are going to integrate this inside vSphere (agentless), in Workspace ONE, in NSX and on physical devices such as laptops from Dell. VMware said that this has never been done before. I’m eager to see these developments in the future.

Another awesome thing that happened during the general keynote is that the company I work for got featured in the VMware Cloud Verified list on stage! The team and I are extremely proud that we got this far and that we got featured at this global event. We are one of fewer than a hundred VMware Cloud Verified providers on the globe and we do it all with a team of just 8!

Application Migration in Multi-Cloud using Network Insight and Cloudhealth

Large enterprises nowadays have embraced Multi-Cloud strategies and have about 33% of their virtual estate on public clouds. These workloads are often distributed across an average of 5 clouds. It’s good to hear that most enterprises have adopted this strategy, but it also creates a couple of challenges, such as cost, migration, security and governance.

Cloudhealth and Network Insight can help with this. With Network Insight you can get an advanced overview of your network flows. You get a nice graphical overview that lets you see all the connections that are being made. This overview can help you create an inventory of the virtual machines that you would like to group together in so-called move groups.

Move groups are logical groups in which you can map virtual machines that you want to move to another cloud. These virtual machines can be mapped together based on all sorts of properties that you can decide on yourself.
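
To make the idea of flow-based move groups a bit more concrete, here is a minimal sketch in Python. It is not how Network Insight works internally (the session did not cover that); it simply assumes that virtual machines which talk to each other belong in the same move group, and the VM names and the `build_move_groups` function are made up for illustration.

```python
from collections import defaultdict


def build_move_groups(flows):
    """Group VMs that talk to each other into the same move group.

    `flows` is an iterable of (source_vm, destination_vm) pairs.
    Returns a list of sorted VM-name lists, one list per move group.
    Assumption: "has a network flow with" is the only grouping property.
    """
    # Build an undirected graph of who talks to whom.
    neighbours = defaultdict(set)
    for src, dst in flows:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen = set()
    groups = []
    for vm in sorted(neighbours):
        if vm in seen:
            continue
        # Depth-first walk over everything reachable from this VM.
        stack, group = [vm], []
        seen.add(vm)
        while stack:
            current = stack.pop()
            group.append(current)
            for peer in neighbours[current]:
                if peer not in seen:
                    seen.add(peer)
                    stack.append(peer)
        groups.append(sorted(group))
    return groups


# Hypothetical flows: two separate application stacks.
example_flows = [
    ("web-01", "app-01"),
    ("app-01", "db-01"),
    ("web-02", "app-02"),
]
move_groups = build_move_groups(example_flows)
```

With these example flows the first stack (web-01, app-01, db-01) ends up in one group and the second stack (web-02, app-02) in another, so each application can be migrated as a unit.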

Because of the advanced insight into the network flows, you can even decide to move the virtual machines inside the move group to a specific geolocation. This ensures that the users of the apps on those virtual machines get the lowest possible distance-related latency once the machines are migrated.

But before the migration starts you should do an assessment from within VMware Cloudhealth. The tool can show us the costs on the public clouds and the suggested instance types we should migrate to. This feature is very useful to determine if and where you should migrate the workload.

This assessment can also provide you with some rightsizing recommendations to reduce the size of the workload and costs before you start the migration. The migration itself will be done with some help from VMware HCX. The migration to AWS will be done with AWS Migration Hub and to the Google Cloud with Velostrata (now renamed to Migrate for Compute Engine).

The neat thing about this process is that you can also do a post-migration assessment. This means that any issues in the network or applications can be shown so that you can troubleshoot and fix them.

The last thing that Cloudhealth can do post migration is give you another list of rightsizing recommendations. It constantly analyzes your environment and sees when instances are underutilized and wasting money. You can use this in turn to reduce your costs.
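
As a rough illustration of what a rightsizing check boils down to, here is a small Python sketch. It is not Cloudhealth’s algorithm; the field names, the sample instances and the 20%/30% thresholds are all assumptions picked for the example.

```python
def rightsizing_candidates(instances, cpu_threshold=20.0, mem_threshold=30.0):
    """Return the names of instances that look underutilized.

    An instance is flagged when both its average CPU and average memory
    usage (in percent) are below the given thresholds. The thresholds
    are hypothetical defaults, not values from any VMware product.
    """
    return [
        inst["name"]
        for inst in instances
        if inst["avg_cpu_pct"] < cpu_threshold
        and inst["avg_mem_pct"] < mem_threshold
    ]


# Hypothetical utilization data for three instances.
sample_instances = [
    {"name": "web-01", "avg_cpu_pct": 8.0, "avg_mem_pct": 22.0},
    {"name": "db-01", "avg_cpu_pct": 65.0, "avg_mem_pct": 80.0},
    {"name": "batch-01", "avg_cpu_pct": 12.0, "avg_mem_pct": 55.0},
]
candidates = rightsizing_candidates(sample_instances)
```

Only web-01 is flagged here, because batch-01 still uses a fair amount of memory. A real tool would look at usage over time and at peaks, not just at averages.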

Extreme Performance Series: vSphere Compute and Memory Schedulers (HBI2494BE)

This session was all about scheduling and performance. We got informed about how the CPU scheduler works in ESXi, how the VMware ESXi Side-Channel Aware (SCA) v1 and v2 scheduler software works, how memory management works and how NUMA schedulers work.

I went to this session to find out more about the ESXi Side-Channel Aware (SCA) scheduler versions and the differences between version 1 and version 2, which was released in vSphere ESXi 6.7 U2. Since we are a cloud provider and have shared cloud environments, we really have to think about the security issues in Intel processors that have appeared over the last couple of years, and about how to mitigate them for our customers with a combination of maximum security and maximum performance. The list below is a quick overview of the features that version 1 and version 2 have:

  • ESXi Side-Channel Aware (SCA) scheduler v1
    • Available in vSphere ESXi 5.x to 6.5
    • Does not use hyperthreads, for the maximum amount of security.
    • Intra-VM security boundary.
    • Because of this it prevents security leaks between processes within the virtual machine.
    • Because each process within a VM has to be boxed in, you cannot schedule more than one thread, which costs you performance. Generally speaking you lose 0-30% of performance on your platform.
    • Most secure scheduler that is available.
  • ESXi Side-Channel Aware (SCA) scheduler v2
    • Available in vSphere ESXi 6.7 U2.
    • Does use hyperthreads, but only within a virtual machine.
    • Inter-VM security boundary. Which means that it’s not possible to attack another VM through the processor.
    • Version 2 uses hardened address-space isolation and boundary separation, so that no vCPUs from other virtual machines are on the same HT/core.
    • You can however get a security leak within a VM between threads.
    • Because you still draw a box around your virtual machine, you still have some performance impact, but the loss is between 0-10%.
    • Less secure than v1, but more performance.
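
The difference between the two versions can be summed up as “which vCPUs may sit next to each other on the hyperthreads of one core”. The Python sketch below models just that rule as I understood it from the session. It is a simplification for illustration, not ESXi code, and the scheduler labels and function name are made up.

```python
def may_share_core(scheduler, vm_a, vm_b):
    """Can a vCPU of vm_a and a vCPU of vm_b run on the same physical core?

    scheduler: "default", "sca_v1" or "sca_v2" (labels chosen for this sketch).
    vm_a, vm_b: names of the VMs that own the two vCPUs.
    """
    if scheduler == "default":
        # No side-channel mitigation: any two vCPUs may share a core.
        return True
    if scheduler == "sca_v1":
        # Hyperthreads are not used at all: nothing shares a core.
        return False
    if scheduler == "sca_v2":
        # Hyperthreads are only shared by vCPUs of the same VM.
        return vm_a == vm_b
    raise ValueError(f"unknown scheduler: {scheduler}")


# Quick overview for two vCPUs of one VM versus vCPUs of two different VMs.
overview = {
    name: (may_share_core(name, "vm-1", "vm-1"), may_share_core(name, "vm-1", "vm-2"))
    for name in ("default", "sca_v1", "sca_v2")
}
```

This also shows where the remaining risk with v2 sits: the only sharing it still allows is between threads of the same virtual machine.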

Make sure that you read up on this topic! If you don’t activate scheduler v1 or v2 you should assume that a side-channel attack is possible. Working exploits have been demonstrated on YouTube. Leaving the scheduler on the default makes your environment vulnerable. If you want to read up on this topic, I suggest the following blog from VMware.

https://blogs.vmware.com/vsphere/2019/05/which-vsphere-cpu-scheduler-to-choose.html

If you are not sure which scheduler you need to use for your environment, please do check out the advisor VMware made:

https://vspherecentral.vmware.com/path-finder

The last note to take home from this topic is that the scheduler, whether v1 or v2, won’t be activated on environments (processors) that don’t have any vulnerabilities.

Innovations in vMotion: Features, Performance and best practices

VMware has recently been developing a major overhaul of vSphere vMotion. I think they have dubbed it “Future vMotion”, at least for now. The target release has not been named yet, but it will arrive in a future version of ESXi. During this session we got to know the three parts that will be overhauled. Below is a quick summary of those three:

  • vMotion performance improvements;
    • In the current vMotion process, memory pages that have changed need to be tracked. This is done by installing “traces” on vCPUs. This in turn gives a short hiccup/performance drop (during the start of the vMotion). The reason for the drop in performance is that all vCPUs are “halted” (for microseconds) when the trace is installed. The new enhanced vMotion doesn’t install a trace on all vCPUs at the same time, so there is no direct “long” halt on the virtual machine.
    • One vCPU is in charge of the vMotion process instead of all vCPUs.
    • The time a virtual machine spends in the trace install time is significantly reduced.
    • This also reduces the performance hit that virtual machines otherwise would experience during the time the memory spends in trace.
  • Memory bitmap enhancements;
    • A memory bitmap is a map of pages that have changed (dirty) during the vMotion process. This bitmap can get quite large when you are talking about “monster” virtual machines. A memory bitmap for a virtual machine with 1GB RAM is 32KB. So back in the day this wasn’t hard to transfer over to another host. But when you are talking about virtual machines that have around 24TB of RAM, the memory bitmap is already 768MB large, which takes about 2 seconds to transfer. This also gives the virtual machine a long stun time, in which no operations can be executed.
    • This has been enhanced by compacting the memory bitmap. Because a lot of memory has already been copied over during the pre-copy phase of the vMotion process, the bitmap is apparently mostly blank, and those blanks don’t need to be transferred. When you compact the blanks out of the memory bitmap, its size is reduced significantly.
  • Fast Suspend Resume
    • This is a technique that VMware uses for hot adding devices and storage vMotion. It’s different from vMotion because it doesn’t have to move the active memory to another host.
    • This technique actually creates a shadow virtual machine, adds the new resources or does the storage vMotion, quiesces the virtual machine, copies over the memory metadata and resumes the virtual machine.
    • Transferring over this memory metadata file is currently done by using one vCPU.
    • Because of this the stun time during one of the operations mentioned above is rather long when you are editing virtual machines with a large amount of RAM.
    • They enhanced this by using all the vCPUs the virtual machine has. This significantly reduces the stun time.
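
The memory bitmap numbers from the session are easy to check yourself. The Python sketch below assumes one bit per 4KB memory page, which is my assumption but does reproduce the 32KB and 768MB figures. The compaction part is only an illustration of “leaving out the blanks” by keeping the non-zero bytes; the session did not explain the format VMware actually uses.

```python
PAGE_SIZE = 4 * 1024  # assumption: 4KB pages, one bit per page
GB = 1024 ** 3
TB = 1024 ** 4


def bitmap_size_bytes(ram_bytes, page_size=PAGE_SIZE):
    """Size of a dirty-page bitmap with one bit per memory page."""
    pages = ram_bytes // page_size
    return (pages + 7) // 8  # round up to whole bytes


def compact_bitmap(bitmap):
    """Keep only the non-zero bytes as (offset, value) pairs."""
    return [(offset, value) for offset, value in enumerate(bitmap) if value]


def expand_bitmap(entries, length):
    """Rebuild the full bitmap from its compacted form."""
    bitmap = bytearray(length)
    for offset, value in entries:
        bitmap[offset] = value
    return bytes(bitmap)


small_vm = bitmap_size_bytes(1 * GB)     # 32 * 1024 bytes, so 32KB
monster_vm = bitmap_size_bytes(24 * TB)  # 768 * 1024 * 1024 bytes, so 768MB

# A mostly blank bitmap: only two bytes contain dirty pages.
mostly_blank = bytearray(1024)
mostly_blank[10] = 0b00000101
mostly_blank[900] = 0b10000000
compacted = compact_bitmap(mostly_blank)
```

In this toy example a 1024-byte bitmap shrinks to two entries, and `expand_bitmap(compacted, 1024)` gives back the original. The fewer dirty pages are left after pre-copy, the more there is to gain.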

This session got very technical and moved rather quickly through all the topics, so I couldn’t note down everything. I will do a follow-up on this later on when I can rewatch the session at home! I do find this very interesting! I sometimes notice that using vMotion or editing virtual machines gives a rather long stun time, which sometimes causes downtime. So enhancements here will be well received on my end!

After this session we had drinks at the Solution Exchange, went back to the hotel to drop off our stuff and had dinner. After this we went to the “legendary” Veeam Party and had some drinks with friends and colleagues.
