VMware vCenter 6.7 U1: Windows to VCSA Upgrade and Convergence

Today we will be talking about the VMware vCenter 6.7-U1 (Update 1) upgrade process. I recently had an opportunity to work with an enterprise customer to upgrade their VMware environment. In this post we will go through the upgrade process and my thoughts on it. VMware 6.7 U1 is a major upgrade that includes the fully featured HTML5 client. For full details on what’s new please see: https://blogs.vmware.com/vsphere/2018/10/whats-new-in-vcenter-server-6-7-update-1.html

I will start by saying bravo to the VMware team for this release. For the first time I actually felt comfortable abandoning the good ol’ “fat client” (the legacy C# client). Many of VMware’s customers, in my experience, were intentionally lagging behind on older versions of vCenter to keep a cold death-grip on the fat client because they refused to be force-fed the Flash client that we all know and despise. The HTML5 client is a worthy successor. It’s fast, it looks good, it’s better organized, and it even has a dark mode. It’s obvious they took feedback from the community, hired the right developers who understood their target audience, and put out a great product. The upgrade and migration process is also done very well.

After a few weeks with the VCSA and HTML5 client baked into the customer’s environment, it’s obvious that some things are still missing from the HTML5 client, like exporting events, but I would expect these to be added eventually. There also appears to be some lag in the recent tasks list in larger linked environments. I’ve also seen a few UI bugs when adding permissions and modifying sDRS configuration.

One issue I’ve seen on multiple VCSAs so far is that the database “archive” disk (disk 13) constantly fills up, causing the VCSA to show up as degraded within the dashboard. You will be greeted with the error message “File system /storage/archive is low on storage space. Increase the size of disk /storage/archive.” There is very little documentation on this, but apparently it is expected behavior despite the warnings, for a rationale I don’t quite understand yet. This didn’t stop me from increasing the disk size slightly (KB2126276).
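If you want to keep an eye on that mount point yourself, a minimal Python sketch along these lines can report how full a filesystem is. The function names and the 80% threshold are my own assumptions for illustration; this is not how the VCSA health check is implemented, and I don’t know the threshold it actually uses.

```python
import shutil


def usage_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def is_low_on_space(path, threshold_percent=80.0):
    """True when usage is at or above the (hypothetical) warning threshold."""
    return usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    # /storage/archive is the mount named in the VCSA warning; it only
    # exists on the appliance itself.
    mount = "/storage/archive"
    try:
        print(f"{mount}: {usage_percent(mount):.1f}% used")
    except FileNotFoundError:
        print(f"{mount} not found (not running on a VCSA?)")
```

The percentage may differ slightly from what df reports, since filesystems can reserve blocks that are counted differently.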

The 6.7-U1 Upgrade Experience

Like most customers running VMware, your vCenters are probably running on Windows with an external Platform Services Controller (PSC), and probably even separate database servers. As of 6.7-U1 you will be making some changes to the topology of your vCenter deployment. VMware now recommends the embedded Platform Services Controller topology for all new and existing deployments. But what does this mean? In one word: simplified. The environment will have less complexity and fewer dependencies on other systems. The integrated database means there is no need for a local or external database, which introduced performance and latency hits and complicated recovery during an outage. The embedded PSC for each VCSA means there is no need to point to an external PSC, which for some customers just ended up being a single point of failure.

Below are some notes I made for administrators considering an upgrade. Your starting point should be the VMware 6.7 Upgrade Guide.

My Recommendations:

Migrate Windows vCenter to VCSA: VMware has openly said support for Windows will be going away and that VCSA is now the primary platform.

Converge to Embedded PSC: This is the new VMware recommended deployment topology. There are situations where this is not appropriate and external may still be best. Be aware that you cannot converge a VCSA to embedded with a historical data import job queued, running, or in a hung state. The historical data import must complete or be cancelled in order to converge.

Upgrade datastores to VMFS6: Your environment may benefit from VMFS6, particularly if you use SSD or NVMe based storage. VMFS6 is 4K aligned in order to support newer Advanced Format (AF) large-capacity drives, among many other improvements. WARNING: You cannot upgrade a VMFS5 datastore to VMFS6 in place. You must migrate everything off the datastore, remove it, and re-add it formatted as VMFS6. Your ESXi hosts must be running 6.5 or higher.

Pre-run the Migration Assistant prior to migration/upgrade day: This will expose any issues before you upgrade. Note that the assistant terminates when it encounters an issue and does not proceed with the rest of its checks. This means you need to keep re-running the Migration Assistant, fixing one issue at a time, until it reports that it is ready. The log files will be your primary resource for troubleshooting these issues.

Pre-build the converge scripts: You should plan out and pre-build the converge.json and decommission_psc.json templates.
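One way to sanity-check pre-built templates before upgrade day is to scan them for values that were left blank. The sketch below is a generic JSON walker of my own; it assumes nothing about the real converge.json or decommission_psc.json schema, and the keys in the sample are made up purely for illustration.

```python
import json


def find_blank_fields(node, path=""):
    """Recursively collect the paths of every empty-string value."""
    blanks = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            blanks += find_blank_fields(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            blanks += find_blank_fields(value, f"{path}[{index}]")
    elif isinstance(node, str) and node.strip() == "":
        blanks.append(path)
    return blanks


def check_template(text):
    """Parse a JSON template and list the fields that are still blank."""
    return find_blank_fields(json.loads(text))


if __name__ == "__main__":
    # Hypothetical keys, NOT the actual VMware template layout.
    sample = '{"vcenter": {"hostname": "vc01.example.com", "password": ""}}'
    print(check_template(sample))  # ['vcenter.password']
```

Some blanks may be intentional, so treat the output as a review list rather than a list of errors. As a side benefit, json.loads will also fail loudly on a template with a syntax error.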

Historical Data Import: If you’re migrating a large vCenter then use the background historical data import. This will allow the vCenter services to start without having to wait a potentially lengthy period.

Your skills may be tested: A fresh install will be very straightforward, but the migration from Windows to VCSA can become very complicated very quickly. Be prepared for the unexpected. I personally ran into many undocumented issues during an upgrade from a Windows Server 2008 R2 vCenter. This was the customer’s development vCenter and we expected it to be smooth sailing, but it turned into a nightmare that required quick thinking, and our skills were certainly put to the test during its upgrade.

In conclusion I highly recommend an upgrade to 6.7 Update 1, but don’t rush into it. Ensure you take the time to plan your upgrade.

vSphere 6.7 U1 now released

On October 17, 2018 VMware announced that vSphere 6.7 Update 1 is now available. The new HTML5 client is now ‘Fully Featured’, which means that you can use the HTML5 client for all administration and configuration of vSphere, including Auto Deploy, Host Profiles, VMware vSphere Update Manager (VUM), vCenter High Availability (VCHA), network topology diagrams, overview performance charts, and more.

I am personally excited to see the HTML5 client become the primary client, as I much prefer using it over the Flash client. One of the more interesting features included in this release is the vCenter External to Embedded Convergence tool. Since embedded PSC is the recommended deployment model for vCenter Server, this tool allows you to migrate to an embedded PSC without having to nuke-and-pave your entire vCenter installation.

The Content Library also got some much needed love from the VMware development team, as it now supports two more file formats: VM templates and OVA files. This makes the Content Library much more functional. The lack of VM templates was a major shortcoming of the Content Library, to the point of making it practically useless for some VMware customers. So this change is a welcome one, to say the least.

New Features

  • vCenter High Availability (VCHA)
    • We redesigned VCHA workflows to combine the Basic and Advanced configuration workflows. This streamlines the user experience and eliminates the need for manual intervention of some deployments.
  • Search Experience
    • We revamped the search experience. In this version of the vSphere Client, you can now search for objects with a string and filter the search results based on Tags/Custom attributes. You can also filter the object lists in the search even further. For instance, you can filter on the power state of the VMs. You can also save your searches and revisit them later.
  • Performance Charts
    • You can pop the performance charts into a separate tab and zoom in on a specific time in the chart. We also added overview performance charts for datacenters and clusters.
  • Dark Theme
    • Dark theme has been one of the most requested features for the vSphere Client so we’re introducing a Dark mode setting. Support for the Dark theme is available for all core vSphere Client functionality and implementation for vSphere Client plugins is in progress.
  • Alarm Definitions
    • We greatly simplified the way you define new alarms, particularly in how you create rules for trigger conditions.

Disaster strikes as NAS3 crashes

This past weekend we had a power brownout for about 4 hours. This caused my servers to fail over to battery power. The batteries don’t last long with servers running. I guess something went sour with the automatic shutdown of my NAS3, which is used only for my VMware virtual machines, and it shut down improperly. The RAID has crashed.

I don’t have anyone to blame other than myself, and I knew this day would eventually come. NAS3 was in RAID-0. That means striping with no redundancy. A failed array on RAID-0 typically means total data loss. I take nightly backups of this entire NAS, so I am aware of and prepared for the risk of using striping. That does not mean that it’s a fun time recovering from it.

Adding additional redundancy for blackouts

One of the hardest things to recover from in my current home-lab environment is a total power blackout. Everything right now is planned and designed around losing individual components like 1 disk, 1 switch/network cable, etc. However, when everything is off and I need to bring things back online, it’s a painstaking and very manual process. Over time my environment has also become more and more complex. This latest outage has me scratching my head at how to make recovery from a power blackout faster and simpler.
