Challenge
Possible data corruption in backups after extending a virtual disk past a 128GB boundary on VMware.
Cause
According to VMware KB kb.vmware.com/kb/2090639, CBT may return an incorrect list of VM disk sectors when software calls QueryChangedDiskAreas("*"). This issue occurs when expanding a virtual disk (vmdk) file with Changed Block Tracking (CBT) enabled. This may cause CBT to incorrectly calculate the in-use blocks in the vmdk file.
Note: The increment of space by which a virtual disk is extended is not relevant. The virtual machine has this issue when its disk is grown past any 128GB boundary in absolute size. The issue is triggered at other sizes which are a power of 2 from 128GB up. For example: 256GB, 512GB, and 1024GB.
Solution
VMware has released patches for ESXi 5.0, 5.1, and 5.5 to fix this issue; see VMware KB article kb.vmware.com/kb/2090639. The patch is NOT retroactive: it only prevents the issue from occurring after the patch or update has been applied, so CBT must still be reset on VMs that may already be affected.
The solution implemented in Veeam Backup & Replication v8 and the hotfix for Veeam Backup & Replication v7 are only workarounds. They function by detecting whether a VM’s disk size has changed between job runs. This means that if a VM is already affected by this CBT corruption issue, it will not be corrected automatically.
Note: Regardless of your Veeam Backup & Replication version, Veeam can only address CBT corruption for virtual disk extensions that occur after a VM has been backed up or replicated by Veeam at least once. You should therefore reset CBT manually for all VMs that may have been impacted. Below is a script to do this with the VMs powered on.
──────────────────────────────────────────────────────────
Veeam Backup & Replication v7
──────────────────────────────────────────────────────────
Apply the following hotfix to protect your backups.
1. Make sure you are running Veeam Backup & Replication 7.0.0.871 (Patch 4); otherwise, obtain the patch from vee.am/kb1891
2. Stop all Veeam services
3. Replace the DLLs in ‘C:\Program Files\Veeam\Backup and Replication’ with the DLLs from the hotfix package
4. Start Veeam services
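Steps 2 and 4 can also be done from an elevated PowerShell prompt on the backup server. The following is a minimal sketch, not part of the hotfix instructions: it assumes that every relevant service has a display name beginning with "Veeam", and it restarts only the services that were running beforehand. Make sure no jobs or restores are running first.

```powershell
# Sketch only: assumes all relevant services have a display name starting with "Veeam".
# Remember which services are running so that only those are started again afterwards.
$veeamServices = Get-Service -DisplayName 'Veeam*' | Where-Object { $_.Status -eq 'Running' }

# Step 2: stop the Veeam services (-Force also stops dependent services).
$veeamServices | Stop-Service -Force

# Step 3: replace the DLLs with the ones from the hotfix package (manual step).

# Step 4: start the same services again.
$veeamServices | Start-Service
```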
──────────────────────────────────────────────────────────
Veeam Backup & Replication v8 and Later
──────────────────────────────────────────────────────────
Veeam Backup and Replication 8 has a built-in solution for this issue. Veeam Backup & Replication resets CBT for any resized VMware disk to prevent corruption. If ESXi hosts are patched against this issue, this behavior may no longer be desirable. To disable automatic CBT reset upon virtual disk size change, create the following registry value on the Veeam Backup Server (requires v8 Update 2 or later):
Key: HKLM\SOFTWARE\Veeam\Veeam Backup and Replication
Type: REG_DWORD
Value name: ResetCBTOnDiskResize
Set this DWORD value to 0. To apply the registry change, make sure no jobs or restores are running, then restart the Veeam Backup Service.
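The same registry value can be created from an elevated PowerShell prompt on the Veeam Backup Server. This is a sketch under two assumptions: the registry key already exists at the path given above, and the service's display name is exactly "Veeam Backup Service".

```powershell
# Registry key named in this article (assumed to exist on the backup server).
$key = 'HKLM:\SOFTWARE\Veeam\Veeam Backup and Replication'

# Create the DWORD value and set it to 0; -Force overwrites it if it already exists.
New-ItemProperty -Path $key -Name 'ResetCBTOnDiskResize' -PropertyType DWord -Value 0 -Force

# Apply the change. Run this only when no jobs or restores are running.
# The display name below is an assumption; adjust it if your installation differs.
Restart-Service -DisplayName 'Veeam Backup Service'
```

Deleting the value, or setting it back to a non-zero value, should restore the default behavior; this is inferred from the description above rather than stated by it.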
──────────────────────────────────────────────────────────
Reset all VMs with CBT Enabled
──────────────────────────────────────────────────────────
The following script has been created to expedite the process of resetting CBT on all VMs that presently have CBT enabled.
This script will only run against VMs that are powered on and have no snapshots on them.
This script is provided as is, and may cause disruption within the virtual environment as it performs a snapshot creation and deletion.
This script may require you to alter the current PowerShell execution policy.
www.veeam.com/download_add_packs/vmware-esx-backup/kb1940/
The following commands can be used to reset CBT for all VMs where it is presently enabled.
(Run these one line at a time after connecting to the vCenter Server via VMware PowerCLI.)
Get the VMs with CBT enabled:
$vms=get-vm | ?{$_.ExtensionData.Config.ChangeTrackingEnabled -eq $true}
Create a VM Specification to apply with the desired setting:
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.ChangeTrackingEnabled = $false
Apply the specification to each VM, then create and remove a snapshot:
foreach ($vm in $vms) {
    # Disable CBT on the VM
    $vm.ExtensionData.ReconfigVM($spec)
    # Create and remove a snapshot so that the change takes effect
    $snap = $vm | New-Snapshot -Name 'Disable CBT'
    $snap | Remove-Snapshot -Confirm:$false
}
Check for success (the command should return none of the VMs that were processed):
get-vm | ?{$_.ExtensionData.Config.ChangeTrackingEnabled -eq $true}
Note: After a CBT reset, the next job run will take longer to complete.
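The manual commands above select every VM with CBT enabled, whereas the downloadable script is described as running only against VMs that are powered on and have no snapshots. A minimal sketch of a narrower selection follows. It is an illustration written for this article, not an excerpt from that script, and it assumes an existing PowerCLI connection to the vCenter Server.

```powershell
# Sketch only: select VMs that have CBT enabled, are powered on, and have no snapshots.
$vms = Get-VM | Where-Object {
    $_.PowerState -eq 'PoweredOn' -and
    $_.ExtensionData.Config.ChangeTrackingEnabled -and
    $null -eq $_.ExtensionData.Snapshot   # Snapshot info is empty when the VM has no snapshots
}

# Review the selection before applying the specification to each VM.
$vms | Select-Object Name, PowerState
```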
More Information
The following is from VMware’s KB article:
Frequently Asked Questions:
Is there a way to determine if a virtual disk has been expanded?
Customers should rely on their own change control records to determine if a virtual disk has been expanded. This information is not tracked in the virtual machine.
Are virtual machines grown in smaller increments affected?
The amount of space the virtual disk is extended is not relevant, the increment of space by which a virtual disk is extended is not relevant. The virtual machine has this issue when its disk is grown past any 128GB boundary in absolute size. The issue is triggered at other sizes which are a power of 2 from 128GB up. For example: 256GB, 512GB, and 1024GB.
Checking if CBT is enabled:
• In the home directory of the virtual machine, verify if there is a vmname-ctk.vmdk file for one or more virtual hard disks.
• To query the advanced configuration parameters for the virtual machine:
- Right-click the virtual machine and click Edit Settings.
- Click the Options tab.
- Click General under the Advanced section and then click Configuration Parameters. The Configuration Parameters dialog opens.
- Search for the ctkEnabled parameter entry for each disk and note if it is enabled or not.
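The same configuration parameters can be read with VMware PowerCLI instead of the vSphere Client. The following is a minimal sketch, assuming an existing PowerCLI connection; the VM name 'MyVM' is a placeholder.

```powershell
# 'MyVM' is a hypothetical name; replace it with the VM you want to inspect.
$vm = Get-VM -Name 'MyVM'

# List the VM-level ctkEnabled entry and the per-disk entries (for example scsi0:0.ctkEnabled).
Get-AdvancedSetting -Entity $vm -Name '*ctkEnabled' | Select-Object Name, Value
```

If the command returns nothing, no ctkEnabled parameter is set for that VM.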