
VxRail in Full Meltdown: Recovering VMs When vCenter and vSAN Are Both Gone

No vCenter. No VxRail Manager. vSAN partially recovered. Here's how to get VMs back on the network when the entire management plane is gone and you're working directly on the host.

This one was a bad day. VxRail cluster, vSAN-backed storage, and both vCenter and VxRail Manager were offline. The vSAN had already been through a partial recovery — some VMs were accessible on the datastores, but they weren't responding on the network. The reason turned out to be something that seems simple once you understand it, but burns a lot of time if you don't: the VMX files were referencing dvSwitch portgroup IDs that no longer resolved to anything, because the entire management plane that knew about those portgroups was gone.

Why the NICs Were Dead

In a normal vSphere environment with a distributed vSwitch (dvSwitch), VM network adapters are defined in the VMX file using references like this:

ethernet1.dvs.switchId = "50 27 c7 2e ..."
ethernet1.dvs.portgroupId = "dvportgroup-1011"
ethernet1.dvs.portId = "..."
ethernet1.dvs.connectionId = "..."

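Those IDs only mean something to the distributed switch configuration that vCenter owns. When vCenter is alive, it maps dvportgroup-1011 to an actual network and binds the VM to a port on it. When vCenter is dead and the VM has to be registered and powered on from scratch, nothing is left to do that binding, so the references are essentially null pointers. The ESXi host has no idea what network to put that VM on, so the adapter initializes with no connectivity.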

The fix is to abandon the dvSwitch references entirely and rewrite the ethernet section to use networkName style — which is what standard vSwitch VMs use and what ESXi can resolve locally without needing vCenter at all.
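Before changing anything, it can help to confirm which adapters in a VMX actually carry dvSwitch references. The following is a minimal sketch, not part of the original recovery; the function name list_dvs_nics is made up for illustration, and it assumes the dvs keys follow the ethernetN.dvs.* pattern shown above.

```shell
# List each ethernetN adapter in a VMX that still carries dvSwitch references.
# Hypothetical helper; usage: list_dvs_nics /path/to/YourVM.vmx
list_dvs_nics() {
    vmx="$1"
    # Reduce lines such as
    #   ethernet1.dvs.portgroupId = "dvportgroup-1011"
    # to their "ethernet1" prefix, then print each adapter once.
    sed -n 's/^\(ethernet[0-9][0-9]*\)\.dvs\..*/\1/p' "$vmx" | sort -u
}
```

An adapter that shows up here is one that needs the networkName rewrite; an adapter that does not is already in a form the host can resolve on its own.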

Finding the VMX File

With vCenter down, everything happens from the ESXi host CLI over SSH. First, find the VMX:

find /vmfs/volumes -name "*.vmx" 2>/dev/null

On a vSAN datastore the path will look something like:

/vmfs/volumes/vsan:52429c81f18c2743-cab55941a2c30dd1/f49c0860-5ce0-f710-4e2a-e4434b35be40/YourVM.vmx
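Because the folder names on vSAN can be UUIDs, the find output alone may not tell you which VMX belongs to which VM. One way to narrow it down is to match on the displayName key inside each file. This is a sketch under that assumption; find_vmx_by_name is a hypothetical helper name, and it expects the line to read exactly displayName = "YourVM".

```shell
# Print the path of every .vmx under a root directory whose displayName matches.
# Hypothetical helper; usage: find_vmx_by_name /vmfs/volumes "YourVM"
find_vmx_by_name() {
    root="$1"
    name="$2"
    find "$root" -name "*.vmx" 2>/dev/null | while IFS= read -r vmx; do
        # -F treats the pattern as a fixed string, -q suppresses output.
        if grep -qF "displayName = \"$name\"" "$vmx" 2>/dev/null; then
            printf '%s\n' "$vmx"
        fi
    done
}
```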

Finding the Correct Network Name

Before editing the broken VM, grep a working VM that you know is on the right network. This gives you the exact networkName value to use:

grep -i ethernet /vmfs/volumes/vsan:.../WorkingVM/WorkingVM.vmx

You're looking for a line like:

ethernet0.networkName = "VM Network"

That string — "VM Network" or whatever your environment calls it — is what you'll use in the broken VM's VMX. Don't guess it; pull it from something that's working.
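If the grep output is noisy, the networkName values can be pulled out on their own. A small sketch, assuming the key is written as ethernetN.networkName = "..." as in the example above; vmx_network_names is a name invented here for illustration.

```shell
# Print "ethernetN: <network name>" for each adapter that has a networkName.
# Hypothetical helper; usage: vmx_network_names /path/to/WorkingVM.vmx
vmx_network_names() {
    # Capture the adapter prefix and the quoted value, drop everything else.
    sed -n 's/^\(ethernet[0-9][0-9]*\)\.networkName *= *"\(.*\)".*$/\1: \2/p' "$1"
}
```

Copy the value exactly as printed, including case and spaces, into the broken VM's VMX.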

Rewriting the Broken Ethernet Section

Open the VMX in vi:

vi /vmfs/volumes/vsan:.../.../YourVM.vmx

Delete every line that starts with ethernet1.dvs., plus the ethernet1.uptCompatibility line if present. Then replace what remains of the ethernet1 block with clean networkName-style entries:

ethernet1.virtualDev = "vmxnet3"
ethernet1.networkName = "VM Network"
ethernet1.shares = "normal"
ethernet1.addressType = "static"
ethernet1.address = "00:50:56:ab:ac:7a"
ethernet1.present = "TRUE"
ethernet1.pciSlotNumber = "224"

⚠️ Keep the original MAC address. Use addressType = "static" and copy the MAC from the original VMX, where it sits on the ethernet1.address or ethernet1.generatedAddress line. If this VM has a DHCP reservation or anything else tied to its MAC, changing it will cause more problems than it solves.
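If several VMs need the same treatment, the manual edit can be scripted. The sketch below is one possible approach, not what was run during the incident: rewrite_nic is a hypothetical function, it assumes the MAC is stored under address or generatedAddress, and it leaves keys such as virtualDev, present, and pciSlotNumber as it found them. It writes a .bak copy next to the VMX first.

```shell
# Rewrite one adapter in a VMX from dvSwitch style to networkName style.
# Hypothetical helper; usage: rewrite_nic /path/to/YourVM.vmx ethernet1 "VM Network"
rewrite_nic() {
    vmx="$1"; nic="$2"; net="$3"

    # Assumption: the current MAC is on the .address or .generatedAddress line.
    mac=$(sed -n \
        -e "s/^$nic\.address *= *\"\(.*\)\".*/\1/p" \
        -e "s/^$nic\.generatedAddress *= *\"\(.*\)\".*/\1/p" \
        "$vmx" | head -n 1)
    if [ -z "$mac" ]; then
        echo "no MAC found for $nic in $vmx" >&2
        return 1
    fi

    # Keep a backup before touching the file.
    cp "$vmx" "$vmx.bak" || return 1

    # Drop the dvs references, uptCompatibility, and the old address and
    # network keys for this adapter; every other line is kept unchanged.
    grep -v -E "^$nic\.(dvs\.|uptCompatibility|address|generatedAddress|networkName)" \
        "$vmx.bak" > "$vmx"

    # Append the networkName-style entries, reusing the original MAC.
    printf '%s.networkName = "%s"\n' "$nic" "$net" >> "$vmx"
    printf '%s.addressType = "static"\n' "$nic" >> "$vmx"
    printf '%s.address = "%s"\n' "$nic" "$mac" >> "$vmx"
}
```

Run it only while the VM is powered off, and diff the result against the .bak copy before reloading the VM.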

Registering and Starting the VM Without vCenter

With vCenter gone, you manage VMs directly through the host using vim-cmd. If the VM isn't registered on the host yet:

vim-cmd solo/registervm /vmfs/volumes/vsan:.../.../YourVM.vmx

Get the VM ID, reload the config, and power the VM on:

vim-cmd vmsvc/getallvms
vim-cmd vmsvc/reload <vmid>
vim-cmd vmsvc/power.on <vmid>

ℹ️ Power state check: getallvms doesn't show power state, so check it with vim-cmd vmsvc/power.getstate <vmid>. If the VM is already powered on but isn't responding, do a hard power off first (vim-cmd vmsvc/power.off <vmid>) before the reload. Otherwise the reload may not pick up the VMX changes.
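Picking the VM ID out of the getallvms listing by eye gets tedious on a host with many VMs. A minimal sketch of a lookup, assuming the usual column layout with Vmid first and Name second, and a VM name without spaces; vmid_for_name is a hypothetical helper, not a vim-cmd feature.

```shell
# Read `vim-cmd vmsvc/getallvms` output on stdin and print the Vmid for a VM name.
# Hypothetical helper; usage: vim-cmd vmsvc/getallvms | vmid_for_name YourVM
vmid_for_name() {
    # Skip the header row, then compare the second column to the requested name.
    awk -v name="$1" 'NR > 1 && $2 == name { print $1 }'
}
```

The result can then be passed to the reload and power.on commands above, for example with vmid=$(vim-cmd vmsvc/getallvms | vmid_for_name YourVM).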

The vSAN Disk Extension Side Quest

While we were in here, there was also an attempt to extend a VM disk through the vCenter UI. It failed with "Invalid operation for device", and vmkfstools -X from the CLI failed the same way. This turned out to be vSAN holding locks on the object differently than a standard VMFS datastore. The UI was also holding the entire Edit Settings dialog hostage over the disk validation error, which blocked an unrelated NIC change from being saved.

Lesson: when the UI is broken and blocking you from making an unrelated change, go directly to the VMX. Don't fight the GUI for an hour when vi and vim-cmd will get you there in five minutes.

Wrapping Up

The core takeaway from this incident: dvSwitch portgroup references in VMX files are vCenter-dependent. No vCenter means no portgroup resolution, which means the NIC initializes with no connectivity even though the VM boots and appears healthy. Switching from dvSwitch style to networkName style lets ESXi resolve the network locally and completely bypasses the dead management plane.

It's the kind of thing you'd never know to look for until you're standing in the wreckage of a cluster at 11pm wondering why a VM that's clearly running has no network. Now you know.