Optimizing Stale Data

Introduction

Enabling or disabling the Inline Deduplication and Compression option migrates all or none of the virtual disk (vDisk) data, respectively, to the Capacity Optimized (CO) subsystem. This migration does not consider the temperature of the data or the storage priority of the vDisk, and consequently may not take full advantage of auto-tiering across the full range of devices within a pool.

The Optimize Stale Data Only feature (also referred to as Adaptive Data Placement) enhances Inline Deduplication and Compression by allowing DataCore SANsymphony's auto-tiering to use the full range of storage devices in a pool. This feature migrates the stale (or cold) data of a vDisk to the CO subsystem and leaves the hot data of the vDisk on high-performance disks using auto-tiering, further increasing the cost efficiency of the overall solution.

To enable this feature for a new or an existing vDisk, select the Optimize Stale Data Only option available in the Capacity Optimization section and choose the Optimization Policy (Aggressive, Normal, or Lazy). The Optimize Stale Data Only option is available only after selecting Inline Deduplication and/or Inline Compression.

To disable this feature, select the Optimize All Data option. This will move all the storage allocation units (SAUs) present in the disk pool to their respective CO subsystem.

Prerequisites

  • The Optimize Stale Data Only feature can be enabled only if a CO setting is configured for the vDisk.
  • This feature operates for vDisks whose Storage Profile type is Normal, Low, Archive, or Custom with a Performance Class of Normal, Low, or Archive.

    If the Storage Profile type of a vDisk is Critical or High, enabling this option will not move the stale data to the CO subsystem. Changing the Storage Profile to Critical or High will pull data back from the CO subsystem to high-performance tiers to satisfy the Performance Class requirements.

Functional Overview

The Optimize Stale Data Only feature allows the data of a vDisk to be dynamically split across both the CO and non-CO pool disks as an extension of the auto-tiering system, providing the benefits of both high performance and reduced storage consumption.

This feature sorts all the SAUs of a vDisk based on their heatmap temperature and identifies a portion of the SAUs to migrate based on the Optimization Policy selected by the user. The Optimization Policy can be:

  • Aggressive: This migrates up to 75 percent of the coldest SAUs of a logical disk (LD) to the CO storage.
  • Normal: This migrates up to 50 percent of the coldest SAUs of an LD to the CO storage.
  • Lazy: This migrates up to 25 percent of the coldest SAUs of an LD to the CO storage.

The following diagram depicts a simplified temperature curve without “plateaus”.

Temperature plateaus will eventually right shift the qualification point to a location lower than exactly 75/50/25 percent, to the first SAU colder than the calculated median within a plateau. Data migration is based on the temperature of the individual SAUs: any SAU with a temperature below the respective median value is considered stale data and is moved to the CO subsystem.
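The plateau effect can be illustrated with a small sketch. It assumes that a plateau is a run of SAUs sharing the same temperature, that the Normal policy threshold is the statistical median of all SAU temperatures, and that only SAUs strictly colder than the threshold qualify. The function name and the sample temperatures are hypothetical, and the product's internal calculation may differ.

```python
from statistics import median

def stale_fraction(temps):
    """Fraction of SAUs strictly colder than the median of all temperatures.

    Assumption: models the Normal policy with a strict "colder than" test.
    """
    threshold = median(temps)
    stale = [t for t in temps if t < threshold]
    return len(stale) / len(temps)

# No plateau: ten distinct temperatures, so half of the SAUs qualify.
no_plateau = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
# Plateau: six SAUs share temperature 5, which is also the median.
plateau = [9, 8, 5, 5, 5, 5, 5, 5, 2, 1]

print(stale_fraction(no_plateau))  # 0.5
print(stale_fraction(plateau))     # 0.2
```

In the plateau case only the two SAUs colder than the plateau qualify, so the share of migrated data falls below the nominal 50 percent.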

The following graph depicts a plateau condition, where the median temperatures for Lazy and Normal policy have right shifted.

The following graph depicts a plateau condition, where the Lazy median temperature has right shifted, and the Aggressive median temperature has left shifted.

Based on the median temperature, the Optimize Stale Data Only feature migrates the vDisk’s stale data from high-performance disks to the CO storage and hot data from the CO storage to high-performance disks. vDisk SAUs that reside in the wrong storage space and are identified for migration are considered “Out Of Settings” SAUs. The percentage of such SAUs in the vDisk can be viewed using the “% Bytes Out Of Settings” performance counter available on the vDisk Details page > Performance tab. For more information, see Live Performance.

Only a limited number of vDisk SAUs are identified as migration targets in every cycle. Therefore, the readings displayed for the “% Bytes Out Of Settings” and “Bytes Out Of Settings” performance counters during a cycle may be lower than the actual data set identified to be migrated. However, these performance counters will display correct readings once all the targeted SAUs have been migrated.
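As a rough illustration of why a counter reading can lag behind the real backlog, the sketch below counts SAUs whose placement disagrees with a threshold. The `Sau` class, the `max_per_cycle` parameter, and the way the percentage is derived are all assumptions made for this example; they are not the product's documented data model or counter formula.

```python
from dataclasses import dataclass

@dataclass
class Sau:
    temperature: float
    in_co: bool  # True if the SAU currently resides in the CO subsystem

def out_of_settings(saus, threshold):
    """SAUs whose placement disagrees with the threshold (hypothetical model).

    A stale SAU (colder than the threshold) belongs in the CO subsystem;
    any other SAU belongs on the non-CO pool disks.
    """
    return [s for s in saus if (s.temperature < threshold) != s.in_co]

def percent_out_of_settings(saus, threshold, max_per_cycle=None):
    """Percentage of misplaced SAUs, optionally capped per cycle (assumed)."""
    misplaced = out_of_settings(saus, threshold)
    if max_per_cycle is not None:
        misplaced = misplaced[:max_per_cycle]
    return 100.0 * len(misplaced) / len(saus)

saus = [Sau(9, False), Sau(7, True), Sau(3, False), Sau(1, True)]
print(percent_out_of_settings(saus, threshold=5))                   # 50.0
print(percent_out_of_settings(saus, threshold=5, max_per_cycle=1))  # 25.0
```

With the per-cycle cap applied, the reported value (25 percent) is lower than the true share of misplaced SAUs (50 percent), which mirrors the behavior described above.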

Pool Allocation View

To view the actual distribution of vDisk data among the CO and non-CO subsystems, use the Allocation View tab available in the Disk Pool Details page. When the Optimize Stale Data Only option is enabled for a vDisk, the Allocation View displays the placement of data in the CO and non-CO tiers. For more information, see Pool Allocation Tool.

Optimization Policy

Aggressive Optimization Policy

If this policy is selected, all the SAUs of the vDisk are sorted by temperature, and a median value is calculated from the first half of the SAUs of the vDisk.

With the median value for the Aggressive optimization policy, 75 percent or less of the vDisk data moves to the CO subsystem.

Normal Optimization Policy

If this policy is selected, all the SAUs of the vDisk are sorted by temperature, and a median value is calculated from all the SAUs of the vDisk.

With the median value for the Normal optimization policy, 50 percent or less of the vDisk data moves to the CO subsystem.

Lazy Optimization Policy

If this policy is selected, all the SAUs of the vDisk are sorted by temperature, and a median value is calculated from the second half of the SAUs of the vDisk.

With the median value for the Lazy optimization policy, 25 percent or less of the vDisk data moves to the CO subsystem.
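The three policies can be compared side by side with a minimal sketch. It assumes that the SAUs are sorted hottest first (so that the "first half" is the hottest half), that the statistical median is used, and that SAUs strictly colder than the median are stale. The function names are hypothetical and the actual implementation is not described here.

```python
from statistics import median

def policy_threshold(temps, policy):
    """Assumed threshold temperature for an Optimization Policy."""
    ordered = sorted(temps, reverse=True)   # assumption: hottest SAU first
    half = len(ordered) // 2
    subsets = {
        "Aggressive": ordered[:half],       # first half of the sorted SAUs
        "Normal": ordered,                  # all SAUs
        "Lazy": ordered[half:],             # second half of the sorted SAUs
    }
    subset = subsets[policy] or ordered     # fall back for a single-SAU list
    return median(subset)

def stale_percent(temps, policy):
    """Percentage of SAUs strictly colder than the policy threshold."""
    threshold = policy_threshold(temps, policy)
    stale = [t for t in temps if t < threshold]
    return 100 * len(stale) / len(temps)

# 100 SAUs with distinct temperatures (no plateaus).
temps = list(range(1, 101))
for policy in ("Aggressive", "Normal", "Lazy"):
    print(policy, stale_percent(temps, policy))
# Aggressive 75.0
# Normal 50.0
# Lazy 25.0
```

With distinct temperatures the sketch reproduces the nominal 75/50/25 percent figures; repeated temperatures around a threshold reduce those shares, which is why each policy is described as moving that percentage "or less".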

The designation of stale data is based on the temperature of the SAUs at the vDisk level and not at the pool level.

  • The stale data will migrate to the CO subsystem in the respective format (compressed and/or deduplicated) based on the Inline Deduplication and Compression option selected by the user along with the Optimize Stale Data Only option.
  • If the CO subsystem is full, the stale data will migrate only when the CO subsystem has free space.
  • If the stale data in the CO subsystem becomes hot, the data will be migrated to the physical disk tier and placed in a suitable tier based on its current temperature.
  • SAUs other than stale SAUs obey all other migration settings, such as temperature migration and tier re-balancing.
  • Hybrid, non-Inline Deduplication and Compression, and Inline Deduplication and Compression vDisks can co-exist in the same pool.

Allocation of Stale and Hot Data in Specific Scenarios

vDisk with Median Temperature Value Zero

If the median temperature of a vDisk is zero, selecting the Optimize Stale Data Only option for that vDisk moves all the SAUs with a temperature value of zero to the CO subsystem, effectively left shifting the qualification point to a location higher than 75/50/25 percent.
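The zero-median case can be sketched as a special branch in the selection rule. This example assumes the Normal policy (median of all SAUs), non-negative temperatures, and a strict "colder than" test whenever the median is not zero; the function name and sample values are hypothetical.

```python
from statistics import median

def stale_saus(temps):
    """Temperatures of the SAUs selected as stale (assumed Normal policy)."""
    threshold = median(temps)
    if threshold == 0:
        # Special case from the text: every zero-temperature SAU qualifies.
        return [t for t in temps if t == 0]
    return [t for t in temps if t < threshold]

# Eight of ten SAUs have never been accessed, so the median is zero.
temps = [0, 0, 0, 0, 0, 0, 0, 0, 4, 6]
share = 100 * len(stale_saus(temps)) / len(temps)
print(share)  # 80.0
```

Here 80 percent of the SAUs qualify, more than the nominal 50 percent for the assumed policy.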

vDisks Having Hot SAUs

When the Optimize Stale Data Only: Aggressive option is selected for a vDisk having hot SAUs, some of these hot SAUs may migrate to the CO subsystem. This is because these hot SAUs have a temperature lower than the median temperature and are considered stale data for that specific vDisk, even though they are hot in the pool.

vDisks Having Cold SAUs

  • When the Optimize Stale Data Only option is selected for a vDisk having cold SAUs, it is possible that no SAUs of the vDisk migrate to the CO subsystem. This happens when the median temperature of that vDisk is equal to the lowest SAU temperature in the vDisk. Since there are no SAUs with a temperature lower than the median temperature, no vDisk data migrates to the CO subsystem.
  • When the Optimize Stale Data Only option is selected for a vDisk having cold SAUs, some of these cold SAUs may not migrate to the CO subsystem. This is because these cold SAUs have a temperature equal to or higher than the median temperature and are considered hot data for that specific vDisk, even though they are cold in the pool.

Stale Data Becomes Hot

When the stale data of a vDisk becomes hot, the median temperature changes accordingly. As a result, SAUs that were already in the CO subsystem may move back to the physical disks, and some SAUs from the top tiers may move to the CO subsystem based on the new median.
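The two-way movement can be seen by recomputing the stale set after a temperature change. The sketch assumes the Normal policy (median of all SAUs) and a strict "colder than" test; the SAU names and temperatures are made up for illustration.

```python
from statistics import median

def stale_ids(temps_by_sau):
    """Names of the SAUs strictly colder than the median (assumed rule)."""
    threshold = median(temps_by_sau.values())
    return {sau for sau, t in temps_by_sau.items() if t < threshold}

before = {"a": 1, "b": 2, "c": 8, "d": 9}
after = dict(before, a=10)  # stale SAU "a" becomes hot

stale_before = stale_ids(before)  # {"a", "b"}, median is 5.0
stale_after = stale_ids(after)    # {"b", "c"}, median is 8.5

print(stale_before - stale_after)  # {'a'}: moves back to the physical disks
print(stale_after - stale_before)  # {'c'}: newly stale, moves to the CO subsystem
```

Heating up a single SAU raises the median, so SAU "c", whose own temperature did not change, now falls below it and becomes stale.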