Hello Frank,
So to start off, the output in the Web Client looks more than a bit suspicious:
(Are the vCenter/vCSA and hosts both on 6.0 U2, and are all hosts on an identical build?)
Used - Total 10.52TB
Dedup & compression overhead 9.24TB
Free 154.9TB
This is essentially a sub-1.00x ratio:
10.52 / (10.52 + 9.24) = 0.53x ratio
Essentially, this means dedupe/compression here is using more space than it is saving.
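The arithmetic above can be sanity-checked in a few lines. This is just a sketch of the calculation using the Web Client figures quoted earlier; how exactly the overhead relates to the data footprint is my reading of those counters, not an official vSAN formula.

```python
# Figures as reported by the Web Client (see above)
data_used_tb = 10.52       # "Used - Total"
dedupe_overhead_tb = 9.24  # "Dedup & compression overhead"

# Effective ratio: space the data would occupy on its own vs. space
# actually consumed once the dedupe/compression overhead is included.
ratio = data_used_tb / (data_used_tb + dedupe_overhead_tb)

print(f"{ratio:.2f}x")  # anything below 1.00x means net space LOST
```

Any result under 1.00x means the feature is currently costing more capacity than it reclaims.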
But there is far more to this than is obvious here, which I will attempt to address...
As per Graham's response, dedupe only works intra-disk-group (within a single disk group), not across the whole vsandatastore:
"dedupe domains will be the same as a disk group; therefore, all redundant copies of data in the disk group will be reduced to a single copy, however redundant copies across disk groups will not be deduped."
https://blogs.vmware.com/virtualblocks/2016/03/11/virtual-san-6-2-cool-new-features/
What this means for the dedupe aspect is that only data blocks that are identical and ALSO residing on the same disk-group will be deduped (e.g. it will use one set of data for two or more blocks that have identical data).
A key factor in how much data will be identical is the mix of applications and Operating Systems residing on the vsandatastore: 100 near-identical VMs will likely see huge dedupe benefits, while 100 application VMs with completely different data structures and OSs will see far less benefit, if any.
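A toy illustration of the point above (a hypothetical block layout, not real vSAN internals): duplicates are only collapsed within each disk group, so the same block stored in two different disk groups is kept twice.

```python
# Each disk group holds a list of block fingerprints (hashes).
# Hypothetical example data.
disk_groups = {
    "dg1": ["A", "A", "B", "C"],   # "A" written twice within dg1
    "dg2": ["A", "D", "D", "E"],   # "A" again here, "D" twice
}

# Per-disk-group dedupe: one copy per unique fingerprint per group.
stored_per_group = sum(len(set(blocks)) for blocks in disk_groups.values())

# A hypothetical global dedupe would keep one copy cluster-wide.
stored_global = len(set(b for blocks in disk_groups.values() for b in blocks))

print(stored_per_group)  # 6: "A" is kept once in dg1 AND once in dg2
print(stored_global)     # 5: global dedupe would keep "A" only once
```

The gap between the two numbers grows as identical data gets spread across more disk groups.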
Now, here is where it gets personal (i.e. specific to your current situation):
You are currently only using a small fraction of the vsandatastore (let's disregard the dedupe overhead and say ~6.3% actual data usage).
If you have many hosts and multiple disk-groups per host, the data that has been migrated to this cluster will be very sparsely distributed. As such it cannot really benefit from deduplication, since any blocks that *might* be identical have lower odds of also residing on the same disk-group.
Thus, as you increase the amount of utilised storage (assuming there is some data similarity between blocks), the dedupe ratio and space savings will likely increase.
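A rough back-of-envelope for the sparseness argument: if we assume (purely for illustration) that two identical blocks land on disk groups uniformly at random, the chance they end up in the same disk group, i.e. the only case per-disk-group dedupe can collapse, is about 1/N for N disk groups.

```python
import random

random.seed(0)
num_disk_groups = 8   # hypothetical cluster, e.g. 4 hosts x 2 disk groups
trials = 100_000

# Place two identical blocks on random disk groups and count how often
# they land together (the only case where they can be deduped).
same = sum(
    random.randrange(num_disk_groups) == random.randrange(num_disk_groups)
    for _ in range(trials)
)

print(f"{same / trials:.3f}")  # ~ 1/8 = 0.125
```

So with 8 disk groups, only about one in eight duplicate pairs is even eligible for dedupe; denser utilisation and more duplicates per disk group improve those odds.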
This will also take some time to be fully realised, as my understanding is that blocks are only deduped as they are processed (in vSAN 6.2, dedupe/compression is applied when data is destaged from the cache tier to the capacity tier).
Other possible factors are potentially at play here:
- Are any/many vmdks thick-provisioned? (Objects with a Storage Policy rule of 100% Object Space Reservation)
This can also occur when VMs are cloned/deployed from a template that has thick-provisioned disks (even if the default Storage Policy is shown, it may not be what is actually applied). Reservation space per disk can be seen via RVC at the cluster level with: vsan.disks_stats .
- Is this by any chance a VxRail implementation? (I have seen some objects provisioned thick on older builds even when a thin-provisioned SP was applied.)
Bob