The PropertyCollector is used in this section to apply the above details to the backup task.
The PropertyCollector uses two relatively complicated argument structures. As was mentioned in PropertyCollector Data, these arguments are PropertySpec and ObjectSpec. PropertySpec is a list of the information desired, and ObjectSpec is a list of instructions indicating where to find the information. In theory, you could directly address an object using its moRef. In that case an ObjectSpec can be very simple. However, getting the initial moRef can be a challenge when a complicated ObjectSpec is required. To formulate a complex ObjectSpec, you need to understand the structure of the available data. This is complicated by the fact that an ObjectSpec can contain recursive elements.
An ObjectSpec is a list of ObjectSpec elements, each specifying an object type, and giving a “selection spec” for the object.
More About Managed Objects describes five types of managed objects: Folder, Datacenter, ComputeResource, ResourcePool, and VirtualMachine. VirtualApp (vApp) is a sixth type. You can “traverse” objects, because one managed object leads to another.
■ hostFolder – A moRef to a Folder containing a list of ComputeResources comprising a Datacenter.
■ vmFolder – A moRef to a Folder containing the VirtualMachines that are part of the Datacenter. If it is your objective to duplicate the display seen in a vSphere Client GUI, then this Folder is of limited use because it does not describe the ResourcePool that is the parent of a virtual machine.
■ resourcePool – A folder containing a list of moRefs pointing to child ResourcePools or VirtualApps.
■ vm – A list of moRefs to child VirtualMachines that employ the resources of the parent ResourcePool. A VirtualMachine always lists a ResourcePool as its parent.
The ObjectSpec does not have to lead you any farther than the moRef of a target object. You can gather information about the managed object itself using the moRef and the PropertySpec. This is described in detail in the section Understanding a PropertySpec.
A TraversalSpec extends SelectionSpec, a property of ObjectSpec, and contains the following elements:
■ Path – The element contained in the object that is used to steer traversal.
■ SelectSet – An array containing either SelectionSpec or TraversalSpec elements.
■ Skip – Whether or not to filter the object in the Path element.
■ Type – The type of object being referenced.
■ Name – Optional name you can use to reference the TraversalSpec, inherited from SelectionSpec.
SelectionSpec is a direct target for traversal, as is TraversalSpec (a class extending SelectionSpec). It is in the SelectSet that recursion can occur.
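The way a SelectSet produces recursion can be illustrated with a small stand-alone model. The sketch below does not use the vSphere SDK: Node and Traversal are hypothetical stand-ins for a managed object and a TraversalSpec, the inventory is made up, and a SelectSet is reduced to a list of spec names. The Skip element is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TraversalSketch {
    /** Hypothetical stand-in for a managed object: a type, a name, and properties leading to other objects. */
    static final class Node {
        final String type;
        final String name;
        final Map<String, List<Node>> props = new HashMap<>();

        Node(String type, String name) {
            this.type = type;
            this.name = name;
        }

        Node link(String property, Node... targets) {
            props.put(property, List.of(targets));
            return this;
        }
    }

    /** Simplified model of a TraversalSpec: Name, Type, Path, and a SelectSet given as spec names. */
    static final class Traversal {
        final String name;
        final String type;
        final String path;
        final List<String> selectSet;

        Traversal(String name, String type, String path, List<String> selectSet) {
            this.name = name;
            this.type = type;
            this.path = path;
            this.selectSet = selectSet;
        }
    }

    /** Records the node, then follows every applicable traversal whose Type matches the node. */
    static void visit(Node node, List<String> applicable,
                      Map<String, Traversal> byName, List<String> visited) {
        visited.add(node.type + ":" + node.name);
        for (String specName : applicable) {
            Traversal t = byName.get(specName);
            if (t == null || !t.type.equals(node.type)) {
                continue;
            }
            for (Node target : node.props.getOrDefault(t.path, List.of())) {
                // Only the specs named in t.selectSet apply to objects reached through t.
                visit(target, t.selectSet, byName, visited);
            }
        }
    }

    /** Builds a tiny made-up inventory and walks it from the root folder. */
    static List<String> demo() {
        Node vmFolder = new Node("Folder", "vm").link("childEntity",
                new Node("VirtualMachine", "vm1"), new Node("VirtualMachine", "vm2"));
        Node dc = new Node("Datacenter", "dc1").link("vmFolder", vmFolder);
        Node root = new Node("Folder", "root").link("childEntity", dc);

        Map<String, Traversal> byName = new HashMap<>();
        // Recursion: the folder traversal names itself in its own SelectSet.
        byName.put("folderToChild", new Traversal("folderToChild", "Folder", "childEntity",
                List.of("folderToChild", "dcToVmFolder")));
        byName.put("dcToVmFolder", new Traversal("dcToVmFolder", "Datacenter", "vmFolder",
                List.of("folderToChild")));

        List<String> visited = new ArrayList<>();
        visit(root, List.of("folderToChild"), byName, visited);
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because folderToChild lists itself in its own SelectSet, folders nested to any depth are followed without writing one spec per level.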
If you wish to traverse the entire configuration tree for a server, then you need only the “root node” moRef, which is always a Folder. This root folder moRef is available in the rootFolder property of the service instance content (ServiceContent). All of the above goes into this Java code sample.
A PropertySpec is a list of individual properties that can be found at places identified by the ObjectSpec and its TraversalSpec. Once the PropertyCollector has a moRef, it can then return the properties associated with that moRef. This can include “nested” properties. Nested properties are properties that can be found inside of properties identified at the top level of the managed object. Nested properties are identified by a “dot” notation.
An example of nested properties can be drawn from the VirtualMachine managed object. A VirtualMachine has the property identified as summary, which identifies a VirtualMachineSummary data object. The VirtualMachineSummary contains property config, which identifies a VirtualMachineConfigSummary data object. The VirtualMachineConfigSummary has a property called name, which is a string containing the display name of the VirtualMachine. You can access this name property using the summary.config.name string value. To address all the properties of the VirtualMachineConfigSummary object, you would use the summary.config string value.
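To make the dot notation concrete, the following sketch resolves a dotted path against nested maps. It is only a local model of how a path such as summary.config.name selects a value inside nested data objects. The map contents and the resolve helper are assumptions for illustration and are not part of the vSphere API.

```java
import java.util.Map;

public class NestedPropertyPath {
    /** Walks a dotted property path through nested maps; returns null if any step is missing. */
    static Object resolve(Map<String, Object> root, String path) {
        Object current = root;
        for (String segment : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(segment);
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Made-up data standing in for a VirtualMachine's summary property.
        Map<String, Object> config = Map.of("name", "web-01", "numCpu", 2);
        Map<String, Object> summary = Map.of("config", config);
        Map<String, Object> vm = Map.of("summary", summary);

        // A full path selects a single leaf value.
        System.out.println(resolve(vm, "summary.config.name"));
        // A shorter path selects the whole enclosing object.
        System.out.println(resolve(vm, "summary.config"));
    }
}
```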
The PropertyCollector requires an array of PropertySpec elements. Each element includes:
■ Type – The type of object that contains the enclosed list of properties.
■ PathSet – An array of strings containing names of properties to be returned, including nested properties.
Now that we have defined ObjectSpec and PropertySpec (the where and what), we need to put them into a FilterSpec that combines the two. An array of FilterSpec elements is passed to the PropertyCollector (the minimum number of elements is one). Two mechanisms can retrieve data from PropertyCollector:
■ RetrieveProperties – A one-time request for all of the desired properties. This can involve a lot of data, and has no refresh option. RetrievePropertiesEx has an additional options parameter.
■ Update requests – PropertyCollector update requests take two forms: polling and waiting (see below).
The update method is the way to keep properties up to date. For either polling or waiting, you must first register your FilterSpec array object with the PropertyCollector. You do this using the CreateFilter method, which sends a copy of your FilterSpec to the server. Unlike with the RetrieveProperties method, the FilterSpec is retained after the CreateFilter operation. The following code shows how to set FilterSpec:
If you wish to begin polling, you may then call CheckForUpdates, which on the first try (when it must contain an empty string for the version number) returns a complete dump of all the requested properties from all the eligible objects, along with a version number. Subsequent calls to CheckForUpdates must contain this version number to indicate to the PropertyCollector that you seek any changes that deviate from this version. The result is either a partial list containing only the changes from the previous version (including a new version number), or a return code indicating no data has changed. The following code sample shows how to check for updates:
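As a rough illustration of the version handshake only (it is not the SDK sample), the sketch below polls a stand-in for the PropertyCollector. UpdateSource, UpdateSet, and the string-valued changes are hypothetical types invented for this example. Representing "no data has changed" as a null result is likewise an assumption of the model.

```java
import java.util.ArrayList;
import java.util.List;

public class PollingSketch {
    /** Hypothetical stand-in for the data returned by one CheckForUpdates call. */
    static final class UpdateSet {
        final String version;
        final List<String> changes;

        UpdateSet(String version, List<String> changes) {
            this.version = version;
            this.changes = changes;
        }
    }

    /** Hypothetical stand-in for the PropertyCollector; returns null when nothing has changed. */
    interface UpdateSource {
        UpdateSet checkForUpdates(String version);
    }

    /** Polls a fixed number of times, carrying the version from each result into the next call. */
    static List<String> poll(UpdateSource source, int rounds) {
        List<String> seen = new ArrayList<>();
        String version = ""; // an empty version on the first call requests the complete dump
        for (int i = 0; i < rounds; i++) {
            UpdateSet update = source.checkForUpdates(version);
            if (update == null) {
                continue; // nothing changed since 'version'; keep the same version
            }
            seen.addAll(update.changes);
            version = update.version; // later calls ask only for changes after this version
        }
        return seen;
    }

    public static void main(String[] args) {
        // Fake source: a full dump first, one change afterwards, then no further changes.
        UpdateSource fake = version -> {
            if (version.isEmpty()) {
                return new UpdateSet("1", List.of("vm-1 name=web-01"));
            }
            if (version.equals("1")) {
                return new UpdateSet("2", List.of("vm-1 name=web-02"));
            }
            return null;
        };
        System.out.println(poll(fake, 4));
    }
}
```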
If instead you wish to wait for updates to occur, you must create a task thread that blocks on the call WaitForUpdates. This task thread would return changes only as they occur and not at any other time. However, if the request times out, you must renew it.
The data returned from CheckForUpdates (or WaitForUpdates) is an array of PropertyFilterUpdate entries. Since a PropertyFilterUpdate entry is very generic, here is some code showing how to extract information from the PropertyFilterUpdate.
From time to time, you might need to get data that is relevant to a single item. In that case you can create a simple ObjectSpec including the moRef for the item of interest. The PropertySpec can then be set to obtain the properties you want, and you can use RetrieveProperties to get the data. Hopefully you can deduce the moRef from a general examination of the properties, by searching for information from the rootFolder.
The vSphere API identifies virtual disk files by prefixing the datastore name onto the file system pathname and the filename: [storageN] myvmname/mydisk-NNNNNN.vmdk. The name in square brackets corresponds to the short name of the datastore that contains this virtual disk, while the remainder of the path string represents the location relative to the root of this datastore.
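A backup program often needs the two parts of such a path separately. The sketch below splits a path of the form described above into the datastore name and the datastore-relative location. DatastorePath is a hypothetical helper written for this example, not an SDK class, and it assumes the bracketed name comes first and contains no closing bracket.

```java
public class DatastorePath {
    final String datastore;
    final String relativePath;

    DatastorePath(String datastore, String relativePath) {
        this.datastore = datastore;
        this.relativePath = relativePath;
    }

    /** Splits "[datastore] dir/file.vmdk" into the datastore name and the path relative to its root. */
    static DatastorePath parse(String fullPath) {
        int close = fullPath.indexOf(']');
        if (!fullPath.startsWith("[") || close < 0) {
            throw new IllegalArgumentException("Not a datastore path: " + fullPath);
        }
        String datastore = fullPath.substring(1, close);
        String relative = fullPath.substring(close + 1).trim();
        return new DatastorePath(datastore, relative);
    }

    public static void main(String[] args) {
        DatastorePath p = parse("[storage1] myvmname/mydisk-000001.vmdk");
        System.out.println(p.datastore);
        System.out.println(p.relativePath);
    }
}
```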
To get the name and characteristics of a virtual disk file, you use the PropertyCollector to select the config.hardware.device property from a VirtualMachine managed object. This returns an array of virtual devices associated with a VirtualMachine or Snapshot. You must scan this list of devices to extract the list of virtual disks. All that is necessary is to check whether each VirtualDevice entry is a VirtualDisk (a type that extends VirtualDevice). When you find such an entry, examine the BackingInfo property. You must cast the backing property to one of the following types, or a VirtualMachineSnapshot managed object:
The properties of interest are the backing fileName and the VirtualDisk capacityInKB. Additionally, when change tracking is in place, you should also save the changeId.
Within a specific snapshot, the names of virtual disk files (with extension .vmdk) can be modified with a zero-filled 6-digit decimal sequence number to ensure that the .vmdk files are uniquely named. Depending on whether or not the current virtual machine had a pre-existing snapshot, the disk name for a snapshot could have this format: <diskname>-<NNNNNN>.vmdk. This unique name is no longer valid after the snapshot is destroyed, so any data for a snapshot disk should be stored in the backup program under its base disk name.
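One way to recover the base disk name is to strip the sequence number from the file name. The following sketch does this with a regular expression. SnapshotDiskName is a hypothetical helper, and it assumes that a trailing hyphen followed by exactly six digits is always a snapshot sequence number, so a base disk that happens to be named that way would be misread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SnapshotDiskName {
    // Matches <diskname>-<NNNNNN>.vmdk, where NNNNNN is exactly six decimal digits.
    private static final Pattern SNAPSHOT_NAME = Pattern.compile("^(.*)-\\d{6}\\.vmdk$");

    /** Returns the base disk name, or the input unchanged if it carries no sequence number. */
    static String baseDiskName(String fileName) {
        Matcher m = SNAPSHOT_NAME.matcher(fileName);
        return m.matches() ? m.group(1) + ".vmdk" : fileName;
    }

    public static void main(String[] args) {
        System.out.println(baseDiskName("mydisk-000002.vmdk")); // prints mydisk.vmdk
        System.out.println(baseDiskName("mydisk.vmdk"));        // prints mydisk.vmdk
    }
}
```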
You can use the taskRef return value as a moRef to track progress of the snapshot operation. After successful completion, taskRef.info.result contains the moRef of the snapshot.
This section describes how to get data from the virtual disk after you have identified it. To access a virtual disk, you must use VixDiskLib. The following code shows how to initialize VixDiskLib and use it for accessing a virtual disk. All operations require a VixDiskLib connection to access virtual disk data. At present, VixDiskLib is not implemented for the Java language, so this code is in C++:
When you are done performing a backup, you need to delete the temporary snapshot. You can get the moRef for the snapshot from taskRef.info.result as described above for the create snapshot operation. The following Java code demonstrates how to delete the snapshot:
■ _this – Managed object reference to the virtual machine.
■ snapshot – Managed object reference to a Snapshot of the virtual machine.
■ deviceKey – Virtual disk for which to compute the changes.
■ startOffset – Byte offset at which to start computing changes to the virtual disk. The length of virtual disk sector(s) examined is returned in DiskChangeInfo.
■ changeId – An identifier for the state of a virtual disk at a specific point in time. A new ChangeId results every time someone creates a snapshot. You should retain this value with the version of change data that you extract (using QueryChangedDiskAreas) from the snapshot’s virtual disk.
When you back up a snapshot for the first time, ChangeId should be unset, or unsaved, indicating that a baseline (full) backup is required. If you have a saved ChangeId, it identifies the last time a backup was taken, and tells the changed block tracking logic to identify changes that have occurred since the time indicated by the saved ChangeId.
2. Provide the special ChangeId "*" (star). The star indicates that QueryChangedDiskAreas should return only active portions of the virtual disk. For both thin provisioned (sparse) virtual disks and for ordinary virtual disks, this causes a substantial reduction in the amount of data to save.
To summarize, changeId is an identifier for a time in the past. It can be star ("*") to identify all allocated areas of virtual disk, ignoring unallocated areas (of sparse disk), or it could be a changeId string saved at the time when a pre-backup snapshot was taken. It only makes sense to use the special ChangeId = "*" when no previous ChangeId exists. If a previous ChangeId does exist, then QueryChangedDiskAreas returns the disk sectors that changed since that previous ChangeId was collected.
Use of Change ID for Changed Block Tracking shows the algorithm.
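The choice between the two kinds of ChangeId reduces to a single decision, sketched below. ChangeIdChoice and its method are hypothetical names for this example. Treating an empty string the same as a missing value is an assumption about how a backup program might store "nothing saved".

```java
public class ChangeIdChoice {
    /** Special value that asks for all allocated areas of the virtual disk. */
    static final String ALL_ALLOCATED = "*";

    /** Picks the ChangeId to pass to the query: the saved one if present, otherwise "*". */
    static String changeIdForQuery(String savedChangeId) {
        if (savedChangeId == null || savedChangeId.isEmpty()) {
            return ALL_ALLOCATED; // no earlier backup, so a baseline (full) backup is needed
        }
        return savedChangeId; // incremental: ask for changes since the saved ChangeId
    }

    public static void main(String[] args) {
        System.out.println(changeIdForQuery(null));       // prints *
        System.out.println(changeIdForQuery("saved-id")); // prints saved-id
    }
}
```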
The following restrictions are imposed on the "*" query when determining allocated areas of a virtual disk:
If supported, you enable changed block tracking using an abbreviated form of VirtualMachineConfigSpec, then use the ReconfigVM_Task method to reconfigure the virtual machine with changed block tracking:
3. Starting from the snapshot’s ConfigInfo, work your way to the BackingInfo of all virtual disks in the snapshot. This gives you the change IDs for all the disks of the virtual machine.
Associated with changed block tracking is changeId, an identifier for versions of changed block data. Whenever a virtual machine snapshot is created, associated with that snapshot is a changeId that functions as a landmark to identify changes in virtual disk data. So it follows that when a snapshot is created for the purpose of creating an initial virtual disk backup, the changeId associated with that snapshot can be used to retrieve changes that have occurred since snapshot creation.
To obtain the changeId associated with any disk in a snapshot, you examine the “hardware” array from the snapshot. Any item in the devices table that is of type vim.vm.device.VirtualDevice.VirtualDisk encloses a class describing the “backing storage” (obtained using getBacking) that implements virtual disk. If backing storage is one of the following disk types, you can use the changeId property of the BackingInfo data object to obtain the changeId:
Information returned by the QueryChangedDiskAreas method is a DiskChangeInfo data object containing an array of DiskChangeInfo.DiskChangeExtent items that enumerate the start offset and length of various disk areas that changed, and the length and start offset of the entire disk area covered by DiskChangeInfo.
When using QueryChangedDiskAreas to gather information about snapshots, enable change tracking before taking a snapshot. Attempts to collect information about changes that occurred before change tracking was enabled result in a FileFault error. Enabling change tracking provides the additional benefit of saving space because it enables backup of only information that has changed. If change tracking is not enabled, the entire virtual machine must be backed up each time, rather than incrementally.
To find change information, you can use the managed object browser at http://<ESXhost>/mob to follow path content > rootFolder > datacenter > datastore > vm > snapshot > config > hardware > virtualDisk > backing. Changed block tracking information (changeId) appears in the BackingInfo.
The following C++ code sample assumes that, in the past, you obtained a complete copy of the virtual disk, and at the time when the changeId associated with the snapshot was collected, you stored it for use at a later time, which is now. A new snapshot has been created, and the appropriate moRef is available:
In the above code, QueryChangedDiskAreas is called repeatedly, as position moves through the virtual disk. This is because the number of entries in the ChangedDiskArea array could occupy a large amount of memory for describing changes to a large virtual disk. Some disk areas may have no changes for a given changeId.
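The repeated-query pattern described above can be sketched without the SDK as follows. ChangeQuery, DiskChangeInfo, and Extent here are hypothetical stand-ins that carry only the start offset, the length, and the list of changed extents. The loop advances position to the end of the area covered by each result, which is the assumption this model makes about how a caller walks the disk.

```java
import java.util.ArrayList;
import java.util.List;

public class ChangedAreaScan {
    /** Hypothetical stand-in for one changed extent (byte offset and length). */
    static final class Extent {
        final long start;
        final long length;

        Extent(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Hypothetical stand-in for one query result: the area examined and the changes inside it. */
    static final class DiskChangeInfo {
        final long startOffset;
        final long length;
        final List<Extent> changedArea;

        DiskChangeInfo(long startOffset, long length, List<Extent> changedArea) {
            this.startOffset = startOffset;
            this.length = length;
            this.changedArea = changedArea;
        }
    }

    /** Hypothetical stand-in for a changed-area query that starts at a given byte offset. */
    interface ChangeQuery {
        DiskChangeInfo query(long startOffset);
    }

    /** Queries window by window until the whole disk capacity has been covered. */
    static List<Extent> collect(ChangeQuery source, long capacityBytes) {
        List<Extent> all = new ArrayList<>();
        long position = 0;
        while (position < capacityBytes) {
            DiskChangeInfo info = source.query(position);
            all.addAll(info.changedArea); // may be empty: some areas have no changes
            long next = info.startOffset + info.length;
            if (next <= position) {
                throw new IllegalStateException("Query made no progress at offset " + position);
            }
            position = next;
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake 100-byte disk answered in two windows; only the first window contains a change.
        ChangeQuery fake = start -> start < 50
                ? new DiskChangeInfo(start, 50 - start, List.of(new Extent(10, 5)))
                : new DiskChangeInfo(start, 100 - start, List.of());
        for (Extent e : collect(fake, 100)) {
            System.out.println(e.start + " +" + e.length);
        }
    }
}
```

Processing each window before requesting the next keeps memory use bounded by the size of one result rather than by the size of the disk.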
The changeId (changed block ID) contains a sequence number in the form <UUID>/<nnn>. If <UUID> changes, it indicates that tracking information has become invalid, necessitating a full backup. Otherwise incremental backups can continue in the usual pattern.
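A backup program can apply this rule by comparing the UUID portion of the saved changeId with that of the current one. The sketch below is a minimal illustration, assuming that the UUID is everything before the last slash. ChangeIdCheck and the sample strings are made up for this example.

```java
public class ChangeIdCheck {
    /** Returns the part of a changeId before the last '/', taken here to be the UUID portion. */
    static String uuidOf(String changeId) {
        int slash = changeId.lastIndexOf('/');
        if (slash < 0) {
            throw new IllegalArgumentException("Unexpected changeId format: " + changeId);
        }
        return changeId.substring(0, slash);
    }

    /** A full backup is needed if nothing was saved or if the UUID portion differs. */
    static boolean needsFullBackup(String savedChangeId, String currentChangeId) {
        if (savedChangeId == null) {
            return true;
        }
        return !uuidOf(savedChangeId).equals(uuidOf(currentChangeId));
    }

    public static void main(String[] args) {
        System.out.println(needsFullBackup("aa-bb/3", "aa-bb/7")); // prints false
        System.out.println(needsFullBackup("aa-bb/3", "cc-dd/1")); // prints true
    }
}
```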
If you reconfigure a virtual machine to set changeTrackingEnabled, but the property remains false, check that you have queried the virtual machine status with VirtualMachine->config() after reconfiguration with VirtualMachine->reconfigure() and not before. Also make sure that virtual machine compatibility is hardware version 7 or higher, and that it has undergone a stun-unstun cycle since reconfiguration.
You can avoid using the queryChangedDiskAreas API on ESX/ESXi 3.5 based storage by parsing XML files for the namespace. For prepackaged methods that do this, see these SDK code samples: