NSX Manager backup to SDDC Manager fails

Submitted by Robert Cranendonk on Tue, 07/28/2020 - 21:12
 
 
While I was running the pre-check for a VCF upgrade, it failed on the ‘Backup Availability Check’ for the NSX-V Manager:
[Image: check failed]

Logging in to the NSX-V manager I was greeted with this error:

"Unable to connect to server <ip> at 22. Either server details are invalid or invalid credentials are presented."

[Image: error message]

Luckily, I had encountered this error in my test lab when fiddling with some backup settings, so I was able to resolve it quite quickly. Should you run into this issue (or a similar one), here is how to solve it.

Background

The SDDC Manager has a number of users configured; these are the most relevant:

  • vcf (or ‘super-user’), the one you can log in with
  • root, can only be accessed via su
  • backup, the backup user

Now, the backup user is a bit of a strange beast. Its password isn’t something that you as a user can configure; in fact, it’s configured in the PostgreSQL database. This user isn’t really well documented, either. Fun fact: this password expires…

After some Google-fu I found this VMware KB article that no longer exists (for some strange reason):

NSX Manager backups in VMware Cloud Foundation fail with the error “Invalid credentials are presented” (67638)

I have no clue why this article isn’t live anymore! Nothing in the release notes since 3.7.1 states that this issue has been addressed. Thankfully, I found a cached version.

I’m fully expecting that future versions will have a better way of handling the backup user, but for the time being this is the best way to handle it.

Symptoms

If you see the errors above, check on the SDDC Manager whether the backup user’s password is set to expire:

chage -l backup

[Image: symptoms]
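If you'd rather not read the whole chage listing, a small helper can pull out just the expiry line. This is a minimal sketch that assumes the English-locale output format of chage -l; the function name password_expires is made up for illustration.

```shell
# Print the value of the "Password expires" field from `chage -l` output
# read on stdin, e.g. "never" or a date.
# Assumption: English-locale output, where each line is "<label><tabs>: <value>".
password_expires() {
    awk -F': ' '/^Password expires/ { print $2; exit }'
}

# Usage (as root on the SDDC Manager):
#   chage -l backup | password_expires
```

Anything other than "never" means the password will expire at some point, or already has.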

Use the following command to check the logs for the cause:

journalctl -u sshd.service -r | grep backup -m 10

The password could be expired:

Apr 21 14:15:07 sddc-manager-controller.vcf.vxrack.local sshd[11753]: Accepted password for backup from 172.30.0.24 port 41460 ssh2
Apr 21 14:15:07 sddc-manager-controller.vcf.vxrack.local sshd[11753]: pam_unix(sshd:account): expired password for user backup (password aged)

The password could also have been changed, which will result in these messages:

Apr 09 11:37:36 sddc-manager-controller.vcf.vxrack.local sshd[76309]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=127.0.0.1 user=backup
Apr 09 11:37:38 sddc-manager-controller.vcf.vxrack.local sshd[76309]: Failed password for backup from 127.0.0.1 port 42396 ssh2
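To tell the two cases apart without eyeballing the journal, the grep output can be fed through a tiny classifier. This is only a sketch: it matches the exact message strings shown in the log excerpts above, and the function name classify_backup_failure is hypothetical.

```shell
# Read sshd journal lines on stdin and print which failure mode they show:
#   "expired"       the password aged out
#   "auth-failure"  the password no longer matches
#   "unknown"       neither message was found
classify_backup_failure() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'expired password for user backup'; then
        echo "expired"
    elif printf '%s\n' "$log" | grep -q 'Failed password for backup'; then
        echo "auth-failure"
    else
        echo "unknown"
    fi
}

# Usage (as root on the SDDC Manager):
#   journalctl -u sshd.service -r | grep backup -m 10 | classify_backup_failure
```

The expiry check comes first because an expired password is still "accepted" by sshd before PAM rejects the account, as the first excerpt shows.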

VMware makes the following statements in the article:

  • This is a known issue affecting VMware Cloud Foundation. There is currently no resolution.
  • Never manually change the password for the “backup” account or the “backupuser” account. Both accounts need to match what is in the SDDC Manager PostgreSQL database.
  • For VMware Cloud Foundation 3.0 to 3.7 you won’t be able to rotate the “backup” account password. The account is not part of the Password Rotation workflow yet.

Resolution/workaround

1. Retrieve the password (hint: it’s VMware123!)

  1. Log in to the SDDC Manager as the vcf user and switch to the root user by using the su command.
  2. Run the following command to retrieve the backup-user password:

curl http://localhost/css/credentials | json_pp | grep backup -C 5

Which will give you the following output:

"credentialType" : "FTP",
"modificationTime" : 1555480009830,
"id" : "84f06ff5-976b-49eb-aa65-88baea7f3d45",
"username" : "backup",
"creationTime" : 1555480009830,
"entityId" : "38a3f4d1-5bb8-11e9-8efc-a7ef5fefc939",
"secret" : "VMware123!",
"entityType" : "BACKUP"
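If you only want the secret itself rather than the surrounding grep context, something like the following could work. It is a rough sketch, not a proper JSON parser: it assumes the pretty-printed output puts every key on its own line, that each credential is a flat object like the fragment above, and that the secret contains no escaped quotes. The function name backup_secret is made up.

```shell
# Read json_pp output on stdin and print the "secret" of the first object
# whose "username" is "backup". Key order inside the object does not matter.
backup_secret() {
    awk '
        /[{]/ { user = 0; secret = "" }              # a new object starts: reset state
        /"username"[ ]*:[ ]*"backup"/ { user = 1 }   # this object belongs to backup
        /"secret"[ ]*:/ {
            secret = $0
            sub(/^.*"secret"[ ]*:[ ]*"/, "", secret) # strip everything up to the value
            sub(/",?[ ]*$/, "", secret)              # strip closing quote and comma
        }
        /[}]/ { if (user && secret != "") { print secret; exit } }
    '
}

# Usage (as root on the SDDC Manager):
#   curl -s http://localhost/css/credentials | json_pp | backup_secret
```

Keep in mind that this prints the password in clear text on your terminal, just like the original command does.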

2. Change the password

With your super-duper-secret password in hand, change the backup user’s password to a temporary one and then change it back to the original, using passwd backup both times. So that means you have to make two changes!

3. Clear the login failures

To make sure the user can log in again after possibly being locked out by the failed attempts, use pam_tally2 -u backup -r to clear the failure counter.

4. Issue a new backup from NSX-V manager

Log in to the NSX-V manager and run a new backup:

[Image: run new backup]

After a few minutes a new entry should appear in the table below. If it doesn’t, try using the ‘change’ buttons and re-enter the correct details. Enter the password in the ‘passphrase’ field.

5. Fix permanently

The KB article stops at step 4. This means that in 90 days you’ll be facing the same issue again. In order to prevent the backup-user password from expiring, use the following command as the root user on the SDDC manager:

chage -I -1 -m 0 -M 99999 -E -1 backup

This sets the following:

  • Minimum Password Age to 0
  • Maximum Password Age to 99999
  • Password Inactive to -1
  • Account Expiration Date to -1
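For repeat visits, steps 2, 3 and 5 can be collected in one small script. The sketch below uses only the commands already shown in this post; the run wrapper, the DRY_RUN variable and the function name fix_backup_user are assumptions added for illustration. It defaults to a dry run, so it prints what it would do unless you explicitly set DRY_RUN=0.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

fix_backup_user() {
    run passwd backup                           # step 2: set a temporary password
    run passwd backup                           # step 2: set the original password again
    run pam_tally2 -u backup -r                 # step 3: clear the failure counter
    run chage -I -1 -m 0 -M 99999 -E -1 backup  # step 5: stop the password from expiring
    run chage -l backup                         # verify the new ageing settings
}

# Usage (as root on the SDDC Manager):
#   fix_backup_user             # dry run, prints the commands
#   DRY_RUN=0 fix_backup_user   # runs them; passwd prompts interactively
```

The passwd calls are left interactive on purpose, so the password never ends up in a script or in your shell history.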

Verify again with chage -l backup:

[Image: fixed]

The backup-user should be good to go from now on. Hope this helps!
