Decommissioned the external PSCs after convergence without re-registering SRM? No problem!

[Impact]

Re-registering SRM/VR would fail with an SSL error.

[Cause]

- Both SRM and VR store the Lookup Service URL in several tables of their respective databases.
- After convergence, these entries must be manually updated with the URL and thumbprint of the now-embedded VC node.

[DB Tables to review]

While re-registering SRM:

  1. In SRM, look for certificates in the following DB tables:

a. SELECT * FROM pd_sslthumbprintstore;

b. SELECT * FROM pd_localsite;

c. SELECT * FROM pd_remotesite;

d. SELECT * FROM pds_remotesite;

e. SELECT * FROM pds_solutionuser;

  2. For vSphere Replication:

a. SELECT * FROM vmomiserverentity;

[Fix]

Manually update the thumbprint and URL entries as shown below (the approach is the same for both SRM and VR):

UPDATE vmomiserverentity SET thumbprint = 'C5:A1:31:FA:1F:A5:90:32:90:DX:3E:5F:49:A3:ED:51:79:4C:F4:A2' WHERE dbid = '330';

where:
C5:A1:31:FA:1F:A5:90:32:90:DX:3E:5F:49:A3:ED:51:79:4C:F4:A2 is the thumbprint of the embedded VC’s machine SSL certificate.
330 is the database ID (dbid) of that particular entry.
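
If the thumbprint is not at hand, it can be computed from the certificate itself. The sketch below is a minimal Python example, assuming the stored value is the SHA-1 digest of the DER-encoded machine SSL certificate (consistent with the 20-byte value above). The function names and the host name vc.example.local are placeholders, not part of SRM or VR.

```python
import hashlib
import ssl


def format_thumbprint(der_bytes: bytes) -> str:
    """Return the SHA-1 digest of DER bytes as colon-separated uppercase hex."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fetch_thumbprint(host: str, port: int = 443) -> str:
    """Fetch the certificate presented on host:port and return its thumbprint."""
    # get_server_certificate returns the PEM text of the server certificate.
    pem = ssl.get_server_certificate((host, port))
    return format_thumbprint(ssl.PEM_cert_to_DER_cert(pem))


# Example call (placeholder host name; use the FQDN of the embedded VC node):
#   print(fetch_thumbprint("vc.example.local"))
```

Compare the printed value with the thumbprint column before running the UPDATE, and take a database backup first.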

I hope this helps!

Unable to SSH into the vCenter Server, fails with an error.

[Error]

/etc/profile.d/proxy.sh: line 11: Enable: command not found

Traceback (most recent call last):
  File "/usr/lib/applmgmt/base/bin/vherdrunner", line 8, in <module>
    vherdrunner.start(vherdrunner.directories)
  File "/usr/lib/applmgmt/base/bin/vherdrunner.py", line 130, in start
    exec(code, childGlobals)

[Cause]

There was a stray character (an extra p before the # sign) in the /etc/sysconfig/proxy file.

less /etc/sysconfig/proxy

p# Enable a generation of the proxy settings to the profile.

[Resolution]

Removed the extra p before the # sign, after which SSH to the vCenter Server worked.

less /etc/sysconfig/proxy

# Enable a generation of the proxy settings to the profile.

Note: Review the /etc/sysconfig/proxy file for any extra or special characters and remove them (take a backup of the original file first).
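
The review in the note above can be partly automated. The following is a rough Python sketch, assuming the file should contain only blank lines, # comments and KEY=value assignments; the function name and that rule are illustrative assumptions, not a VMware tool.

```python
import re
from typing import List, Tuple

# Expected line shapes in a sysconfig-style file: blank, a # comment,
# or a KEY=value assignment. Anything else is reported as suspect.
_ALLOWED = re.compile(r'^\s*(#.*|[A-Za-z_][A-Za-z0-9_]*=.*)?$')


def find_suspect_lines(text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that do not look like valid entries."""
    suspects = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Also flag non-ASCII characters, which often come from copy/paste.
        if not _ALLOWED.match(line) or not line.isascii():
            suspects.append((number, line))
    return suspects


# Example usage on the appliance (read-only check):
#   with open("/etc/sysconfig/proxy") as handle:
#       for number, line in find_suspect_lines(handle.read()):
#           print(number, repr(line))
```

A line such as "p# Enable a generation of the proxy settings to the profile." is reported because it is neither a comment nor an assignment.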

Migrating a virtual machine between two different vDS versions fails with an error.

What caused this error?

When attempting to migrate a virtual machine from one vSphere Distributed Switch (vDS) to another, you experience these symptoms:
The migration fails.

You see an error in the vSphere Web Client similar to:

The target host doesn’t support the virtual machines current hardware requirements. The destination virtual switch version or type (VDS 7.0.0) is different than the minimum required version or type (VDS 6.6.0) necessary to migrate VM from source virtual switch.

Why do you see this?

This issue occurs because the vDS versions on the source and destination are compared during the vMotion operation. The versions must match; otherwise the destination vDS is treated as incompatible.

How to fix this?

This is expected behavior when migrating between vSphere Distributed Switches of mixed versions.

To resolve this issue, upgrade the vDS with the lower version to match the higher version in your infrastructure.
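
To decide which switch needs the upgrade, compare the version strings numerically rather than as text. A small Python sketch follows; the function names are illustrative, and the version strings are assumed to be dotted numbers as in the error message above.

```python
from typing import Optional, Tuple


def parse_vds_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '6.6.0' into a comparable tuple."""
    return tuple(int(part) for part in version.strip().split("."))


def side_to_upgrade(source: str, destination: str) -> Optional[str]:
    """Return 'source' or 'destination' for the lower version, None if equal."""
    src, dst = parse_vds_version(source), parse_vds_version(destination)
    if src == dst:
        return None
    return "source" if src < dst else "destination"
```

With the versions from the error above, side_to_upgrade("6.6.0", "7.0.0") points at the source switch.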

How do we work around this without a vDS upgrade?

For vCenter Server 6.5.x and 6.7.x

  1. Log in to the vCenter Server using the HTML5 or vSphere Web Client.
  2. Highlight your vCenter Server name in the left-hand column and then click on the Configure tab on the right.
  3. Go to Advanced Settings and click Edit Settings.
  4. At the bottom of the pop-up window, add the following property in the Name section:

config.migrate.test.NetworksCompatibleOption.AllowMismatchedDVSwitchConfig

  5. Set the value to true.
  6. Click Add.
  7. Click Save.
  8. Re-try the migration.

For vCenter Server 7.x and later

  1. Log in to the vCenter Server using the HTML5 or vSphere Web Client.
  2. Highlight your vCenter Server name in the left-hand column and then click on the Configure tab on the right.
  3. Go to Advanced Settings and click Edit Settings.
  4. At the bottom of the pop-up window, add the following property in the Name section:

config.vmprov.enableHybridMode

  5. Set the value to true.
  6. Click Add.
  7. Click Save.
  8. Re-try the migration.

Note: After enabling hybrid mode in vCenter, the target DVS version must be at least 6.0.0.

Replacing a VASA Provider’s certificate after vCenter Server convergence causes virtual volumes (vVols) to become inaccessible from every ESXi host’s inventory.

What causes this?

1. The convergence workflow installs RPMs related to the PSC services, which also means a new VMware Certificate Authority (VMCA) instance is created on the embedded VC node.

2. VMCA creates a new VMCA root certificate which in turn is used for future certificate requests that the embedded node handles.

3. The old certificates are retained, so VC <-> host communication keeps working. Other solutions such as vVols stop working, however: the new certificates issued to VASA providers carry the new root certificate details, while the hosts still hold the old ones, which breaks the vVol workflow.

How do you resolve this?

Renew or refresh the certificates of the ESXi hosts connected to the vCenter Server.

https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.security.doc/GUID-ECFD1A29-0534-4118-B762-967A113D5CAA.html

The certificate refresh has to be done manually per host.

Note: Bulk certificate management is currently not possible from the vCenter Server UI.

SRM/vSphere Replication site pairing fails with the error “Cannot complete login due to an incorrect user name or password.”

When will you see this?

While attempting a site pairing after a re-installation or upgrade of VC, VR or SRM.

[Log Excerpt]

dr.log:

2020-05-05T21:32:13.527+05:30 warning vmware-dr[04864] [SRM@6876 sub=LocalHms] Failed to connect:
--> (vim.fault.InvalidLogin) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage =
--> msg = "Received SOAP response fault from []: login
--> Cannot complete login due to an incorrect user name or password."
--> }
--> [context]zKq7AVMEAAgAAFaTwQAMdm13YXJlLWRyAAAqPwJ2bWFjb3JlLmRsbAABtM4CdmltLXR5cGVzLmRsbAAB/X8yAqXCBXZtb21pLmRsbAACz+AFAOt+GwBLjhsAyYghA39PAk1TVkNSMTIwLmRsbAADJlECBNITAEtFUk5FTDMyLkRMTAAF9FQBbnRkbGwuZGxsAA==[/context]
--> [backtrace begin] product: VMware vCenter Site Recovery Manager, version: 8.1.2, build: build-12686166, tag: vmware-dr, cpu: x86_64, os: windows, buildType: release
--> backtrace[03] vmacore.dll[0x00023F2A]
--> backtrace[04] vim-types.dll[0x0002CEB4]
--> backtrace[05] vim-types.dll[0x00327FFD]
--> backtrace[06] vmomi.dll[0x0005C2A5]
--> backtrace[07] vmomi.dll[0x0005E0CF]
--> backtrace[08] vmacore.dll[0x001B7EEB]
--> backtrace[09] vmacore.dll[0x001B8E4B]
--> backtrace[10] vmacore.dll[0x002188C9]
--> backtrace[11] MSVCR120.dll[0x00024F7F]
--> backtrace[12] MSVCR120.dll[0x00025126]
--> backtrace[13] KERNEL32.DLL[0x000013D2]
--> backtrace[14] ntdll.dll[0x000154F4]
--> [backtrace end]

/opt/vmware/hms/logs/hms.log:

2020-05-05 09:44:28.246 ERROR com.vmware.vim.sso.client.impl.SoapBindingImpl tcweb-11 operationID=lro-2-71e1a81-37ab-HMS-201468 | SOAP fault
com.sun.xml.internal.ws.fault.ServerSOAPFaultException: Client received SOAP Fault from server: Access not authorized! Please see the server log to find more detail regarding exact cause of the failure.

2020-05-05 09:44:28.247 ERROR jvsl.security.authentication.sm tcweb-11 operationID=lro-2-71e1a81-37ab-HMS-201468 | Invalid token
com.vmware.vim.sso.client.exception.InvalidTokenRequestException: Request is invalid: ns0:InvalidRequest: Access not authorized!

2020-05-05 09:44:28.248 INFO hms.i18n.class com.vmware.hms.response.filter.I18nActivationResponseFilter tcweb-11 operationID=lro-2-71e1a81-37ab-HMS-201468 | The localized message is: Cannot complete login due to an incorrect user name or password.

Why would we see this?

One or more SolutionUsers have been removed from the groups they should belong to, which causes the login to fail.

Steps to resolve:

The following are the four SRM and VR SolutionUsers you would have in your environment:

SRM-
SRM-remote-
h5-dr-
com.vmware.vr-

The following are the groups these SolutionUsers should be a part of:

  1. SolutionUsers
    SRM-
    SRM-remote-
    h5-dr-
    com.vmware.vr-
  2. ActAsUsers
    CN=h5-dr-
    com.vmware.vr-
  3. Administrators
    SRM-
  4. LicenseService.Administrators
    SRM-
  5. SRM Remote Users
    SRM-remote-
  6. HmsRemoteUsers
    SRM-remote-

To restore the group memberships:

  1. Log in to the vCenter Server using the vSphere Flex client.
  2. Navigate to Administration -> Single Sign-On -> Users and Groups -> Groups -> Add Group members.
  3. Manually add the SolutionUsers to these groups.
  4. Re-register SRM/VR.
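
Before re-registering, the group list above can be checked in one pass once the current members of each group have been noted down from the UI. The Python sketch below is only an illustration: the EXPECTED table mirrors the list above, the input dictionary is written by hand, and the member names in the usage note are made-up placeholders.

```python
from typing import Dict, Iterable, List, Optional

# Solution user name prefixes; "SRM-remote-" is listed before "SRM-" so the
# longer prefix wins when classifying a member.
PREFIXES = ("SRM-remote-", "SRM-", "h5-dr-", "com.vmware.vr-")

# Group -> solution user prefixes that should be members (from the list above).
EXPECTED: Dict[str, List[str]] = {
    "SolutionUsers": ["SRM-", "SRM-remote-", "h5-dr-", "com.vmware.vr-"],
    "ActAsUsers": ["h5-dr-", "com.vmware.vr-"],
    "Administrators": ["SRM-"],
    "LicenseService.Administrators": ["SRM-"],
    "SRM Remote Users": ["SRM-remote-"],
    "HmsRemoteUsers": ["SRM-remote-"],
}


def classify(member: str) -> Optional[str]:
    """Map a member name (optionally prefixed with CN=) to its prefix."""
    name = member[3:] if member.upper().startswith("CN=") else member
    for prefix in PREFIXES:
        if name.startswith(prefix):
            return prefix
    return None


def missing_memberships(actual: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Return, per group, the expected solution user prefixes not found."""
    missing: Dict[str, List[str]] = {}
    for group, prefixes in EXPECTED.items():
        present = {classify(member) for member in actual.get(group, [])}
        absent = [prefix for prefix in prefixes if prefix not in present]
        if absent:
            missing[group] = absent
    return missing
```

For example, if Administrators contains only a hypothetical "SRM-remote-1111", the result lists "SRM-" as missing for that group, which is the membership to add in step 3.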