Auto-evacuating Pools on Offline Servers

An offline server causes mirrored virtual disks to lose fault tolerance and single virtual disks to fail. Evacuation of virtual disk storage sources from an offline server to a healthy server can be automated by creating a task that monitors the state of DataCore Servers and runs a script. After the script determines that a server is offline, it automatically evacuates the storage sources of that server to another server that shares the pool.

The Windows PowerShell script file named EvacuateOnFail.ps1 is included in the DataCore SANsymphony installation folder (default path C:\Program Files\DataCore\SANsymphony). The script uses the DataCore Cmdlet Start-DcsDistributionPlan to perform the evacuation.

As outlined in this topic, the task monitors the health of DataCore Servers in the server group. When the state of any server transitions from Healthy to any other state, a PowerShell script is executed on each server. The script determines which server is offline, validates that it is offline, and evacuates the storage sources of that server in compliant shared pools to a designated alternate server in the group.
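
The decision described above can be illustrated with a minimal Python sketch. It is not the logic of EvacuateOnFail.ps1, which is not shown in this topic. The function name and the dictionaries of server states are hypothetical; the state names "Running", "Stopped", and "Unavailable" are taken from the configuration notes later in this topic, which state that a "Stopped" server is not evacuated while an "Unavailable" server is.

```python
# Illustrative only: the real script's validation steps are not documented here.
# Per this topic, servers that become "Stopped" are not evacuated,
# while servers that become "Unavailable" are evacuated where possible.
EVACUATE_STATES = {"Unavailable"}


def servers_to_evacuate(previous, current):
    """Return the servers that left "Running" and now qualify for evacuation.

    previous, current: hypothetical mappings of server name -> state string.
    """
    return sorted(
        name
        for name, state in current.items()
        if previous.get(name) == "Running" and state in EVACUATE_STATES
    )


before = {"Server1": "Running", "Server2": "Running", "Server3": "Running"}
after = {"Server1": "Unavailable", "Server2": "Stopped", "Server3": "Running"}
print(servers_to_evacuate(before, after))
```

In this sketch only Server1 qualifies: Server2 changed state as well, but a "Stopped" server keeps its storage sources.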

Shared Multi-port Array (SMPA) licenses are not required to automatically evacuate the pools, although certain limitations apply. Without an SMPA license, all storage sources (virtual disks) from a pool must be evacuated, and they must all be evacuated to the same DataCore Server. See Important Notes on Using Shared Disk Pools in the Maintenance Mode topic for more information.

Also see Shared Multi-port Array Support for the characteristics of SMPA pools and non-SMPA pools.
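
The non-SMPA limitation described above (every storage source in the pool must be evacuated, and all of them to the same server) can be expressed as a simple check. The following Python sketch is illustrative only; the function name and the data shapes are hypothetical and are not part of SANsymphony.

```python
def is_valid_non_smpa_plan(pool_sources, plan):
    """Check a proposed evacuation plan for one non-SMPA shared pool.

    pool_sources: names of every storage source currently in the pool.
    plan: hypothetical mapping of storage source name -> destination server.
    """
    # Every storage source in the pool must be part of the plan.
    if set(pool_sources) != set(plan):
        return False
    # All storage sources must be sent to the same destination server.
    return len(set(plan.values())) == 1


sources = ["Virtual disk 1", "Virtual disk 5"]
print(is_valid_non_smpa_plan(
    sources, {"Virtual disk 1": "Server2", "Virtual disk 5": "Server2"}))
print(is_valid_non_smpa_plan(sources, {"Virtual disk 1": "Server2"}))
```

The first plan satisfies both conditions; the second leaves a storage source behind and would not be acceptable for a non-SMPA pool.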

Configuration Requirements

The following configuration is required in order to automate the evacuation process.

  • Storage sources must be in pools that are shared between a minimum of two servers in the group in order to be eligible for automatic evacuation. Storage sources that are not in pools that are shared will not be evacuated.
    • To create shared pools, back-end paths must be created between each disk in the pool and the other servers in the group. See Important Notes on Using Shared Disk Pools in the Maintenance Mode topic for instructions on preparing to move storage sources without an SMPA license (if applicable).
  • The script requires the addition of the following configuration token string to the Description field in the Server Group Details page:

    [Offline-Takeover:On|Off|Kill]

    Specify On to enable the process.

    In the string above, the brackets are significant and should be included: [Offline-Takeover:On|Off|Kill].

    All possible actions include:

    • On - the process is active
    • Off - the process is not active
    • Kill - stop the process

      The configuration token is identified by the brackets and may be located anywhere in the Description field. The configuration token is not case sensitive and may contain spaces within the brackets. Other description information may also be included in the Description field, provided that it is not included within the brackets.

  • Configure the pool owners for each shared disk pool by adding the following required configuration token string to the Description field in each Shared Disk Pool Details page:

    [Possible-Owners: Server1, Server2]

    In the string above, the brackets are significant and should be included: [Possible-Owners: Server1, Server2].

    Where Server1 and Server2 are the machine names of the pool owners for that particular shared disk pool and act as alternate servers for the pool.

    • Two servers can be listed as possible pool owners. In the case of Server1 and Server2 listed as owners, if Server1 goes offline, storage sources will be evacuated to Server2. Conversely, if Server2 goes offline, storage sources will be evacuated to Server1.
    • Pool owners may be changed in the Description field in a Shared Disk Pool Details page as needed. When pool disks are shared between all servers in the group, the designated pool owners can be changed to any server in the group simply by changing the description string without changing the disk or pool configuration, or task.

      Pool owners may be changed while the script is running, although the changes will not take effect until the next time the task is triggered.

    • Virtual disks with storage sources on both configured pool owners will not be evacuated.
    • All virtual disks from all configured shared pools must be evacuated from the offline server when the server is not licensed for SMPA. See Important Notes on Using Shared Disk Pools in the Maintenance Mode topic for more information about evacuating storage sources in non-SMPA pools.
  • Enable write-through mode for all mirrored virtual disks created from the shared pools. This is not necessary for single or dual virtual disks.
    • Write-through can be enabled on the Virtual Disk Details page > Settings tab. Write-through can also be set for virtual disks using the cmdlet Set-DcsVirtualDiskProperties or set for the server using the cmdlet Disable-DcsServerWriteCache.
  • Configure tasks
    • Create one task per server. The script must run locally on each server in the group.
    • Set Trigger:
      • Trigger on Monitor state changed
      • Monitor type: State of DataCore Servers
      • Monitored object: All
      • Trigger state > Healthy (transition from Healthy to any other state will execute the script)
    • Set Action:
      • Perform action: Run a powershell script
      • DataCore Server: Specify one of the servers in the group. (Each task should run on a different server in the group.)
      • File: C:\Program Files\DataCore\SANsymphony\EvacuateOnFail.ps1
      • There are no script parameters
  • See General Maintenance Mode Notes in the Maintenance Mode topic for features that are not supported with the evacuate operation.
  • The script can be manually run in an evaluation mode in a PowerShell window; this is recommended to ensure that the configuration is correct. Evaluation mode displays the messages that a real run would produce in the PowerShell window, without actually evacuating the storage sources. To evaluate, configure as outlined above and run the script at the PowerShell prompt with the switch -eval. For example:

    PS C:\Program Files\DataCore\SANsymphony> EvacuateOnFail -eval

  • The script checks to ensure that the server that transitions from Healthy to any other state is offline before evacuating storage sources to prevent unnecessary evacuations.
  • When a server state changes from "Running" to "Stopped", the script will run but storage sources will not be evacuated from the "Stopped" server. When the server changes from "Running" to "Unavailable", the script will run, and storage sources from the "Unavailable" server will be evacuated where possible.
  • All physical disks in the pool being evacuated must be present. Storage sources in an offline pool will not be evacuated if any physical disks are missing.
  • The “Preferred Server” setting of “All” is unsupported with Evacuation.
  • The static (non-moving) side must be set as the preferred server before evacuation.
  • Before evacuation, the path to the static DataCore Server must be healthy to avoid loss of access by the host. Further, ensure that a path can be created from the destination DataCore Server to the host.
  • Storage source evacuations run in parallel for all shared pools per server.
  • All messages from running the script are recorded in the Event Log and logged to the file named EvacuateOnFail.txt located in C:\Program Files\DataCore\SANsymphony. Warnings and errors are also posted as Alerts.
  • After evacuation of an offline server, storage sources do not return to the original server automatically. They must be evacuated or redistributed by the administrator.
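
To make the token rules in the requirements above concrete, the following Python sketch shows one way the two configuration tokens could be read from a Description field. It is an illustration under stated assumptions, not the parsing used by EvacuateOnFail.ps1. The function names are hypothetical, and the sketch assumes that a missing token yields no value, that spaces are tolerated anywhere inside the brackets, and that machine names are compared without regard to case.

```python
import re

# Tokens are identified by their brackets, may appear anywhere in the
# description, are not case sensitive, and may contain spaces.
TAKEOVER_RE = re.compile(
    r"\[\s*Offline-Takeover\s*:\s*(On|Off|Kill)\s*\]", re.IGNORECASE)
OWNERS_RE = re.compile(
    r"\[\s*Possible-Owners\s*:\s*([^\]]*)\]", re.IGNORECASE)


def parse_takeover(description):
    """Return "On", "Off", or "Kill", or None if no token is present."""
    match = TAKEOVER_RE.search(description)
    return match.group(1).capitalize() if match else None


def parse_possible_owners(description):
    """Return the list of pool owner names, or [] if no token is present."""
    match = OWNERS_RE.search(description)
    if not match:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def alternate_owner(owners, offline_server):
    """Return the other pool owner, or None if the pool does not qualify.

    Assumption: exactly two owners are expected and names are matched
    case-insensitively.
    """
    lowered = [owner.lower() for owner in owners]
    if len(owners) != 2 or offline_server.lower() not in lowered:
        return None
    return owners[1 - lowered.index(offline_server.lower())]


group_description = "Production group [ offline-takeover : ON ] owned by ops"
pool_description = "Tier 1 pool [Possible-Owners: Server1, Server2]"
owners = parse_possible_owners(pool_description)
print(parse_takeover(group_description), owners,
      alternate_owner(owners, "Server1"))
```

Keeping the token parsing separate from the owner selection mirrors the way the pool owners can be changed by editing only the description string, without touching the pool or task configuration.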

Example Scenarios

Scenario 1 for a non-SMPA pool.

Configuration:

  • Pool owners configured for Shared pool 1 are Server 1 and Server 2.
  • Pool owners configured for Shared pool 2 are Server 2 and Server 3.
  • Pool owners configured for Shared pool 3 are Server 3 and Server 4.
  • Pool owners configured for Shared pool 4 are Server 4 and Server 1.
  • Mirrored Virtual disk 1 has storage sources from Server 1 (Shared pool 1) and Server 3 (Shared pool 3).
  • Mirrored Virtual disk 2 has storage sources from Server 2 (Shared pool 2) and Server 4 (Shared pool 4).
  • Mirrored Virtual disk 3 has storage sources from Server 3 (Shared pool 3) and Server 1 (Shared pool 1).
  • Mirrored Virtual disk 4 has storage sources from Server 4 (Shared pool 4) and Server 2 (Shared pool 2).
  • Single Virtual disk 5 has a storage source from Server 1 (Shared pool 1).

Task trigger:

Server 1 goes offline.

Task actions:

  • Virtual disk storage sources from shared pools on Server 1 are evacuated to Server 2.
  • The storage sources in Virtual disk 1, Virtual disk 3, and Virtual disk 5 are evacuated from Server 1 to Server 2.
  • Virtual disk 2 and Virtual disk 4 are unaffected because there is no storage source from Server 1 in either virtual disk.
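
Scenario 1 can be reproduced with a short Python model. This is a sketch for checking a planned configuration on paper, not the behavior of the script itself. The data structures and the function name are hypothetical; the rule that a virtual disk is skipped when it already has a storage source on the alternate owner is one reading of the note that virtual disks with storage sources on both configured pool owners are not evacuated.

```python
# Configuration from Scenario 1: pool -> (owner, owner)
POOL_OWNERS = {
    "Shared pool 1": ("Server 1", "Server 2"),
    "Shared pool 2": ("Server 2", "Server 3"),
    "Shared pool 3": ("Server 3", "Server 4"),
    "Shared pool 4": ("Server 4", "Server 1"),
}

# Virtual disk -> list of (server, pool) storage sources
VIRTUAL_DISKS = {
    "Virtual disk 1": [("Server 1", "Shared pool 1"), ("Server 3", "Shared pool 3")],
    "Virtual disk 2": [("Server 2", "Shared pool 2"), ("Server 4", "Shared pool 4")],
    "Virtual disk 3": [("Server 3", "Shared pool 3"), ("Server 1", "Shared pool 1")],
    "Virtual disk 4": [("Server 4", "Shared pool 4"), ("Server 2", "Shared pool 2")],
    "Virtual disk 5": [("Server 1", "Shared pool 1")],
}


def evacuation_moves(offline, pool_owners, virtual_disks):
    """Return (virtual disk, pool, source server, target server) tuples."""
    moves = []
    for disk, sources in virtual_disks.items():
        for server, pool in sources:
            owners = pool_owners[pool]
            if server != offline or offline not in owners:
                continue
            target = owners[1] if owners[0] == offline else owners[0]
            # Skip disks that already have a storage source on the target.
            if any(other == target for other, _ in sources):
                continue
            moves.append((disk, pool, offline, target))
    return moves


for move in evacuation_moves("Server 1", POOL_OWNERS, VIRTUAL_DISKS):
    print(move)
```

With Server 1 offline, the model lists moves for Virtual disk 1, Virtual disk 3, and Virtual disk 5, all from Shared pool 1 to Server 2, which matches the task actions listed above.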

Scenario 2 for an SMPA pool.

Configuration:

  • Pool owners configured for Shared pool 1 are Server 1 and Server 2.
  • Pool owners configured for Shared pool 2 are Server 3 and Server 4.
  • Mirrored Virtual disk 1 has storage sources from Server 1 (Shared pool 1) and Server 3 (Shared pool 2).
  • Mirrored Virtual disk 2 has storage sources from Server 2 (Shared pool 1) and Server 4 (Shared pool 2).
  • Mirrored Virtual disk 3 has storage sources from Server 3 (Shared pool 2) and Server 1 (Shared pool 1).
  • Mirrored Virtual disk 4 has storage sources from Server 4 (Shared pool 2) and Server 2 (Shared pool 1).
  • Single Virtual disk 5 has a storage source from Server 1 (Shared pool 1).

Task trigger:

Server 1 goes offline.

Task actions:

  • Virtual disk storage sources from shared pools on Server 1 are evacuated to Server 2.
  • The storage sources in Virtual disk 1, Virtual disk 3, and Virtual disk 5 are evacuated from Server 1 to Server 2.
  • Virtual disk 2 and Virtual disk 4 are unaffected because there is no storage source from Server 1 in either virtual disk.