Managing Storage Performance Degradation
Performance degradation on storage sources can occur for many reasons and, when it happens, can affect host application performance. In some cases, degradation can occur without warning, such as when a RAID array is rebuilding or there is I/O contention with other resources. It may also occur during a planned maintenance activity.
SANsymphony software monitors storage latency thresholds and reports to the System Health tool when a threshold is reached. The threshold values can be customized to inform administrators when unacceptable latency is occurring on storage used by virtual disks.
When storage performance degradation is detected by a storage latency threshold, administrators can choose to automatically disable front-end and mirror access to mirrored virtual disks, causing a failover to another front-end path on a different server. When the storage on one server is drastically compromised, temporarily failing over to faster storage on another server will result in improved performance for the virtual disks and host applications until the latency issue can be corrected.
Once access to logical disks has been disabled, latency reported on the logical disk might remain high until all cache I/Os have been destaged, but virtual disk performance will no longer be affected.
Disabling access to slow storage sources when a threshold is reached can be performed automatically through a task running a provided PowerShell script. When the performance degradation has been corrected, access can be automatically enabled again using another provided PowerShell script.
This implementation is intended as a temporary measure to restore performance to mirrored virtual disks until the latency is corrected.
Ensure that storage latency thresholds are set to the desired settings and use caution when electing to disable access to storage sources. When storage latency occurs, disabling access may not be the most suitable solution in all cases. We recommend that the administrator assess the situation before taking action. DataCore Software does not guarantee performance improvements when using this solution.
A logical disk is the internal software representation of a virtual disk on a server. Logical disks are created by the software when virtual disks are created from disk pools or pass-through disks. A single virtual disk consists of one logical disk on one server. A mirrored or dual virtual disk consists of two logical disks, one on each server used as a storage source for the virtual disk.
Since logical disks are used internally by the software, they are not shown in the SANsymphony Management Console. They correspond most closely to what the console refers to as "storage sources".
Sample PowerShell Scripts
DisableLogicalDisksAccess.ps1
The script file takes an array of logical disk monitor triggers as a parameter. The monitor triggers will be provided from a task trigger configured in SANsymphony.
When the trigger is fired, the script will run as the task action. The script validates the logical disks that caused the monitors to trigger, ensuring that a path is available on the partner server so that high availability is maintained and the virtual disk remains accessible from all hosts it is served to. The script will disable access to all validated logical disks and skip the logical disks that do not pass validation.
The script uses the DataCore Cmdlet Set-DcsLogicalDiskAccess to disable access to the logical disks. (This cmdlet can also be used to manually disable access without using the script in a task, such as before a planned maintenance activity.)
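The validation step described above can be pictured with a minimal sketch. This is not the code of the provided script: the property names (Name, PartnerPathAvailable) and the sample data are hypothetical stand-ins for the information the real script reads from SANsymphony. The sketch only illustrates how the triggered logical disks are split into those that can safely be disabled and those that are skipped.

```powershell
# Hypothetical stand-ins for the logical disks reported by the fired monitors.
# The real script obtains this information from SANsymphony, not from literals.
$triggeredDisks = @(
    [pscustomobject]@{ Name = 'LD-01'; PartnerPathAvailable = $true  }
    [pscustomobject]@{ Name = 'LD-02'; PartnerPathAvailable = $false }
    [pscustomobject]@{ Name = 'LD-03'; PartnerPathAvailable = $true  }
)

# Disks with a usable partner path can fail over; the rest must stay enabled.
$toDisable = @($triggeredDisks | Where-Object { $_.PartnerPathAvailable })
$skipped   = @($triggeredDisks | Where-Object { -not $_.PartnerPathAvailable })

foreach ($disk in $toDisable) {
    # The provided script would disable access at this point.
    Write-Output "Would disable access: $($disk.Name)"
}
foreach ($disk in $skipped) {
    Write-Output "Skipped (no partner path): $($disk.Name)"
}
```

Disabling a logical disk that has no available partner path would leave the virtual disk unreachable, which is why such disks are skipped rather than disabled.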
All messages regarding this operation are saved to DisableLogicalDisksAccess.log in the SANsymphony installation folder.
EnableLogicalDisksAccess.ps1
The script file takes a server name as a parameter and will enable all disabled logical disks on the specified server. The script can be run from the command line when the storage latency issue has been corrected.
For example, entering EnableLogicalDisksAccess.ps1 -Server [Server1] at the command line would enable access to the disabled storage sources on the server named "Server1".
The script uses the DataCore Cmdlet Set-DcsLogicalDiskAccess to enable access to the logical disks. (This cmdlet can also be used to manually enable access without using the script.)
All messages regarding this operation are saved to EnableLogicalDisksAccess.log in the SANsymphony installation folder. The script files are included in the SANsymphony installation folder (default path C:\Program Files\DataCore\SANsymphony).
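Because the default installation path contains a space, the script is most easily run from a PowerShell prompt with the call operator (&). The following is a minimal sketch, assuming the default installation folder; "Server1" is a placeholder server name, and the existence check is added here only for illustration.

```powershell
# Default installation folder as stated above; adjust if SANsymphony is installed elsewhere.
$scriptPath = 'C:\Program Files\DataCore\SANsymphony\EnableLogicalDisksAccess.ps1'

if (Test-Path -LiteralPath $scriptPath) {
    # "Server1" is a placeholder; use the name of the server whose storage was disabled.
    & $scriptPath -Server 'Server1'
}
else {
    Write-Warning "Script not found at $scriptPath"
}
```

Run this only after the storage latency issue has been corrected, since it enables access to all disabled logical disks on the named server.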
To use the DisableLogicalDisksAccess.ps1 script without modification:
- Create a task to trigger on a monitor state change for the monitor type Virtual disk sources and Storage latency. Select All or specify the virtual disks to monitor. Set the trigger state to the threshold at which the action should take place, for example "= Critical". In this case, the trigger will fire when any of the selected storage sources reaches the threshold for critical storage latency. In order to fire a trigger, the trigger state must be at least "> Healthy", which means "Attention or greater". (See Automated Tasks for more information.)
- In the task, configure an action to run the PowerShell script file (C:\Program Files\DataCore\SANsymphony\DisableLogicalDisksAccess.ps1). Select the check box to append the associated Task Trigger Data objects. When a trigger fires, the associated trigger state data objects for the logical disk Ids will be appended to the script file action as parameter values. The trigger state data objects are a list of trigger states that caused the action to occur. See Trigger State Data for more information.
- Optionally, another trigger could be set to run during scheduled times, such as when I/O processing is slow or during off-hours. In this case, also set the "Only run when all trigger conditions are met" task setting.
- Optionally, an additional task action could be created to send an email to the administrator and an Action delay could be set in the task settings so that the script will be executed after a desired amount of time in order for the administrator to assess the situation.
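The trigger state comparisons mentioned in the steps above ("= Critical", "> Healthy") can be illustrated with a short sketch. It is not part of SANsymphony: the function name Test-TriggerFires is hypothetical, and the state list assumes a Warning level between Attention and Critical, which is an assumption not stated in this topic.

```powershell
# Assumed ordering of monitor states, from least to most severe.
# 'Warning' is an assumption; this topic names only Healthy, Attention, and Critical.
$stateOrder = 'Healthy', 'Attention', 'Warning', 'Critical'

# Hypothetical helper: returns $true when the current state satisfies the trigger state.
function Test-TriggerFires {
    param(
        [string]$Current,
        [ValidateSet('=', '>')][string]$Operator,
        [string]$Threshold
    )
    $currentRank   = [array]::IndexOf($stateOrder, $Current)
    $thresholdRank = [array]::IndexOf($stateOrder, $Threshold)
    if ($Operator -eq '=') { return $currentRank -eq $thresholdRank }
    return $currentRank -gt $thresholdRank
}

# "> Healthy" fires for Attention or greater.
Test-TriggerFires -Current 'Attention' -Operator '>' -Threshold 'Healthy'   # True
# "= Critical" does not fire while the state is below Critical.
Test-TriggerFires -Current 'Warning' -Operator '=' -Threshold 'Critical'    # False
```

A "> Healthy" trigger state is the broadest choice and reacts to any degraded state, while "= Critical" limits the action to the most severe latency condition.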
When storage latency reaches the configured threshold, the trigger will fire and the script file will run. After the storage latency issue is corrected, run the EnableLogicalDisksAccess.ps1 script file with the server name to enable access to the storage sources again.