Wednesday, 7 November 2012

vSphere 5.1 What's New - Storage


Storage


The number of hosts able to share a read-only file has increased from 8 to 32

A read-only file in VMware is often a source VMDK used for linked clones in either VMware View or VMware vCloud Director.  Linked clones are used for quick deployment in both View and VCD.

The previous limit was 8 hosts.  This effectively limited View and VCD clusters to 8 hosts, because if a linked clone was placed on a 9th host in the cluster, that host would be denied access to the source VMDK file.

In vSphere 5.1 this has been increased to allow up to 32 hosts to access the read-only file.  This removes the 8-host restriction for View and VCD; the effective limit is now the cluster maximum of 32 hosts.

Introduction of a new VMDK type, the SE virtual disk (space-efficient virtual disk)

A well-known problem with thin-provisioned VMDK files was that if space was freed inside the guest OS, for example by deleting files, the VMDK file did not shrink.

With SE virtual disks the space can be reclaimed.
  1. VMware Tools scans the guest OS file systems for allocated but unused blocks and marks them as free.
  2. A SCSI UNMAP command is issued in the guest to instruct the virtual SCSI layer in the VMkernel to mark those blocks as free in the SE vmdk.
  3. Once the VMkernel knows which blocks are free, it reorganizes the SE vmdk so that the data sits in one contiguous lump with all the free blocks at the end of the vmdk.
  4. The VMkernel then sends either a SCSI UNMAP command to the SCSI array or an RPC truncate command to NFS-based storage.
This frees the unused blocks in the SE vmdk and releases space on the datastore for other VMs.
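To make the reorganisation in steps 3 and 4 easier to picture, here is a toy Python sketch (my own illustration, not VMware code - the block contents and the reclaim() helper are made up): the guest reports which blocks are free, the used blocks are packed to the front of the disk, and the free tail is released back to the datastore.

    # Toy model of the SE sparse wipe-and-shrink idea.
    def reclaim(blocks, free_indexes):
        """blocks: list of block contents; free_indexes: blocks the guest marked free."""
        # Steps 2-3: drop the blocks the guest reported as free and pack the
        # remaining used blocks together at the front of the disk.
        used = [b for i, b in enumerate(blocks) if i not in free_indexes]
        reorganised = used + [None] * (len(blocks) - len(used))
        # Step 4: "truncate" the free tail so the space goes back to the datastore;
        # only the contiguous used region stays allocated.
        return reorganised[:len(used)]

    disk = ['os', 'app', 'tmp1', 'data', 'tmp2', 'logs']
    print(reclaim(disk, free_indexes={2, 4}))   # ['os', 'app', 'data', 'logs']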

Please see the image below, taken from the VMware white paper on storage in vSphere 5.1.


Improvements in detecting APD (all paths down) and PDL (permanent device loss)

It is common that when an APD condition is seen by an ESXi host, the host becomes unresponsive and eventually disconnects from vCenter.  This happens because the hostd process does not know whether the removal of the storage device is permanent or transient, so it does not time out the rescan operation for rediscovering the paths, or any other threads it is processing.  Everything simply waits for I/O to be returned by the storage array, and because hostd only has a finite pool of worker threads, the hostd process often becomes unresponsive and can crash.

Over the last few releases of vSphere, APD detection has been improved by the introduction of PDL (permanent device loss) detection, based on specific SCSI sense codes returned by the target array.  vSphere 5.1 improves this function further and also changes how APD conditions are handled.
The following information is an extract from the VMware documentation listed in the appendix of this document.

In vSphere 5.1, a new time-out value for APD is being introduced. There is a new global setting for this
feature called Misc.APDHandlingEnable. If this value is set to 0, the current (vSphere 5.0) condition is used,
i.e., permanently retrying failing I/Os. If Misc.APDHandlingEnable is set to 1, APD handling is enabled to follow the new model, using the time-out value Misc.APDTimeout. This is set to a 140-second time-out by default, but it is tunable. These settings are exposed in the UI. When APD is detected, the timer starts. After 140 seconds, the device is marked as APD Timeout. Any further I/Os are fast-failed with a status of No_Connect, preventing hostd and others from getting hung. If any of the paths to the device recover, subsequent I/Os to the device are issued normally, and special APD treatment concludes.

The above text indicates that the advanced setting Misc.APDHandlingEnable should be set to 1 to allow APD timeouts and to prevent the hostd process from hanging or crashing when APD occurs.
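For reference, the following is a minimal sketch of how these host advanced settings could be set programmatically using the pyVmomi Python bindings.  The vCenter address, credentials and host name are placeholders, the same change can be made through the UI or esxcli, and value handling may vary by pyVmomi version - treat it as an illustration rather than a definitive procedure.

    # Sketch: set the APD advanced options on one ESXi host via pyVmomi.
    # vCenter address, credentials and host name below are placeholders.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()      # lab use only - skips certificate checks
    si = SmartConnect(host='vcenter.example.local',
                      user='administrator@vsphere.local',
                      pwd='password', sslContext=ctx)
    content = si.RetrieveContent()

    # Locate the ESXi host by name
    view = content.viewManager.CreateContainerView(content.rootFolder,
                                                   [vim.HostSystem], True)
    host = next(h for h in view.view if h.name == 'esxi01.example.local')

    # Enable the new APD handling model and keep the default 140-second timeout.
    # Depending on the pyVmomi version, integer options may need an explicit cast.
    opt_mgr = host.configManager.advancedOption
    opt_mgr.UpdateOptions(changedValue=[
        vim.option.OptionValue(key='Misc.APDHandlingEnable', value=1),
        vim.option.OptionValue(key='Misc.APDTimeout', value=140),
    ])

    Disconnect(si)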

Another setting that should be configured is disk.terminateVMOnPDLDefault; this allows HA to restart VMs that were impacted by the APD/PDL condition on another host that is unaffected by the storage issue.  There is a known problem with this setting restarting machines that were gracefully shut down during the outage.  Specifying the HA advanced setting das.maskCleanShutdownEnabled removes this problem.  Both advanced settings should be used together for the best results from an APD/PDL condition.
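Again purely as an illustration, here is a hedged pyVmomi sketch of adding das.maskCleanShutdownEnabled to a cluster's HA advanced options; the cluster name, vCenter address and credentials are placeholders.  Note that disk.terminateVMOnPDLDefault is a per-host setting, typically added to /etc/vmware/settings on each ESXi host, so it is not set through this cluster call.

    # Sketch: add das.maskCleanShutdownEnabled to a cluster's HA configuration.
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()      # lab use only
    si = SmartConnect(host='vcenter.example.local',
                      user='administrator@vsphere.local',
                      pwd='password', sslContext=ctx)
    content = si.RetrieveContent()

    # Locate the cluster by name (placeholder name)
    view = content.viewManager.CreateContainerView(content.rootFolder,
                                                   [vim.ClusterComputeResource], True)
    cluster = next(c for c in view.view if c.name == 'Production-Cluster')

    # Merge the HA advanced option into the existing cluster configuration
    spec = vim.cluster.ConfigSpecEx(
        dasConfig=vim.cluster.DasConfigInfo(
            option=[vim.option.OptionValue(key='das.maskCleanShutdownEnabled',
                                           value='True')]))
    cluster.ReconfigureComputeResource_Task(spec, modify=True)

    Disconnect(si)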

Storage DRS V2.0

Improvements to Storage DRS include additional latency detection and awareness of where datastores are placed on the storage device.  Storage DRS introduces Storage Correlation, a new feature used to detect whether datastores reside on the same SAN spindles.  There would be little benefit in moving a VM from one datastore to another if both reside on the same set of physical spindles.  Previously, SDRS would analyse the constraints on the datastores and move a VM to a less populated datastore on the assumption that the VM would receive a performance benefit from the datastore being less populated.  Now SDRS will check whether the datastores sit on the same spindles; if they do, it will conclude that there is little to no benefit and, depending on the aggressiveness of the SDRS settings, will not move the VM's VMDK files.

VmObservedLatency is another new feature of SDRS and measures the latency from the time the VMkernel receives the storage command to the time the VMkernel receives a response from the storage array.  This is an improvement over the previous level of monitoring, which only measured latency after the storage request had left the ESXi host.  The new metric allows latency inside the host to be monitored as well.  This is useful because the latency between the array and the host may be 1 or 2 milliseconds, but the latency inside the host, from the VMkernel down, could be 20-30 milliseconds due to the number of commands being issued and queued on the HBA that serves a specific datastore.
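As a rough illustration of why this matters (a toy calculation with made-up figures, not how SDRS computes the metric), the latency a VM actually observes is the in-host queueing delay plus the round trip to the array, so a device-only view can badly understate what the VM experiences:

    # Toy illustration only - not the SDRS algorithm.
    def vm_observed_latency_ms(kernel_queue_ms, device_round_trip_ms):
        """Latency as seen from the VM: time queued in the host plus the array round trip."""
        return kernel_queue_ms + device_round_trip_ms

    # Device-only monitoring would report ~2 ms here, yet the VM actually sees
    # ~27 ms because the HBA queue for this datastore is congested.
    print(vm_observed_latency_ms(kernel_queue_ms=25, device_round_trip_ms=2))   # 27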
