Rebuilds

Overview

DataCore Puls8 maintains high availability and data redundancy by automatically managing replica rebuilds. If a replica becomes unavailable due to node issues, I/O failures, or temporary network disruptions, DataCore Puls8 restores redundancy by initiating a rebuild process.

Depending on the situation, it performs either a full rebuild (restores the entire replica) or a partial rebuild (restores only the changed data). This flexibility ensures fast recovery with minimal disruption to applications.

How Rebuilds Work

When a volume target detects that one of its replicas is unresponsive or returns an I/O error, DataCore Puls8:

  1. Marks the replica as faulted and removes it from the I/O path.

  2. Starts logging writes to keep track of all changes made while the replica is offline.

  3. Waits for a configurable period (default: 10 minutes) before deciding the next step:

    1. If the replica comes back online within this period, a partial rebuild is performed using the logged changes.

    2. If not, a new replica is provisioned and a full rebuild is initiated.

The wait time is set by the --faulted-child-wait-period parameter in the data plane. The default is 10 minutes, and the value can be changed in values.yaml.
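The decision described in the steps above can be sketched in a few lines of Python. This is only an illustration of the logic as the text states it, not DataCore Puls8 code: the function name choose_rebuild and its parameters are hypothetical, and treating a replica that returns exactly at the deadline as "within the period" is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

# Default wait period taken from the text above (10 minutes).
DEFAULT_WAIT_PERIOD = timedelta(minutes=10)


def choose_rebuild(
    faulted_at: datetime,
    back_online_at: Optional[datetime],
    wait_period: timedelta = DEFAULT_WAIT_PERIOD,
) -> str:
    """Return "partial" or "full" for a faulted replica.

    Hypothetical helper for illustration; it is not part of DataCore Puls8.
    """
    if wait_period < timedelta(0):
        raise ValueError("wait_period must not be negative")
    deadline = faulted_at + wait_period
    # Replica returned within the wait period: replay only the logged writes.
    # Counting the exact deadline as "within" is an assumption.
    if back_online_at is not None and faulted_at <= back_online_at <= deadline:
        return "partial"
    # Replica never returned, or returned too late: a new replica is
    # provisioned and rebuilt in full.
    return "full"


faulted = datetime(2023, 7, 4, 5, 40, 0)
print(choose_rebuild(faulted, faulted + timedelta(minutes=3)))   # partial
print(choose_rebuild(faulted, faulted + timedelta(minutes=15)))  # full
print(choose_rebuild(faulted, None))                             # full
```

Passing a different wait_period models a changed --faulted-child-wait-period value. A shorter period triggers full rebuilds sooner, while a longer one gives a temporarily unreachable replica more time to return.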

Viewing Rebuild History

The data plane records both full and partial rebuild operations, so you can audit and analyze replica health over time. You can retrieve this history in table or JSON format using the kubectl puls8 mayastor command.

View Rebuild History in Table Format
kubectl puls8 mayastor get rebuild-history {your_volume_UUID}
Sample Output
DST                                   SRC                                   STATE      TOTAL  RECOVERED  TRANSFERRED  IS-PARTIAL  START-TIME            END-TIME
b5de71a6-055d-433a-a1c5-2b39ade05d86  0dafa450-7a19-4e21-a919-89c6f9bd2a97  Completed  7MiB   7MiB       0 B          true        2023-07-04T05:45:47Z  2023-07-04T05:45:47Z
b5de71a6-055d-433a-a1c5-2b39ade05d86  0dafa450-7a19-4e21-a919-89c6f9bd2a97  Completed  7MiB   7MiB       0 B          true        2023-07-04T05:45:46Z  2023-07-04T05:45:46Z

 

View Rebuild History in JSON Format
kubectl puls8 mayastor get rebuild-history {your_volume_UUID} -ojson
Sample Output
{
  "targetUuid": "c9eb4172-e90c-40ca-9db0-26b2ae372b28",
  "records": [
    {
      "childUri": "nvmf://10.1.0.9:8420/nqn.2019-05.io.openebs:b5de71a6-055d-433a-a1c5-2b39ade05d86?uuid=b5de71a6-055d-433a-a1c5-2b39ade05d86",
      "srcUri": "bdev:///0dafa450-7a19-4e21-a919-89c6f9bd2a97?uuid=0dafa450-7a19-4e21-a919-89c6f9bd2a97",
      "rebuildJobState": "Completed",
      "blocksTotal": 14302,
      "blocksRecovered": 14302,
      "blocksTransferred": 0,
      "blocksRemaining": 0,
      "blockSize": 512,
      "isPartial": true,
      "startTime": "2023-07-04T05:45:47.765932276Z",
      "endTime": "2023-07-04T05:45:47.766825274Z"
    },
    {
      "childUri": "nvmf://10.1.0.9:8420/nqn.2019-05.io.openebs:b5de71a6-055d-433a-a1c5-2b39ade05d86?uuid=b5de71a6-055d-433a-a1c5-2b39ade05d86",
      "srcUri": "bdev:///0dafa450-7a19-4e21-a919-89c6f9bd2a97?uuid=0dafa450-7a19-4e21-a919-89c6f9bd2a97",
      "rebuildJobState": "Completed",
      "blocksTotal": 14302,
      "blocksRecovered": 14302,
      "blocksTransferred": 0,
      "blocksRemaining": 0,
      "blockSize": 512,
      "isPartial": true,
      "startTime": "2023-07-04T05:45:46.242015389Z",
      "endTime": "2023-07-04T05:45:46.242927463Z"
    }
  ]
}

  • Rebuild history is available only while the volume target remains active.
  • If the volume target is deleted or re-created (for example, after a node failure), the previous rebuild records are lost.
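
The JSON records reports sizes in blocks and timestamps with nanosecond precision, so a short script can help turn them into bytes and durations. The sketch below is one possible way to do that in Python. It embeds a trimmed copy of the first sample record shown above. The helper names parse_time and summarize are hypothetical, and the script assumes that the block counts multiply by blockSize to give bytes.

```python
import json
from datetime import datetime, timedelta, timezone

# Trimmed copy of the first record from the sample JSON output above.
SAMPLE_JSON = """
{
  "targetUuid": "c9eb4172-e90c-40ca-9db0-26b2ae372b28",
  "records": [
    {
      "rebuildJobState": "Completed",
      "blocksTotal": 14302,
      "blocksRecovered": 14302,
      "blocksTransferred": 0,
      "blockSize": 512,
      "isPartial": true,
      "startTime": "2023-07-04T05:45:47.765932276Z",
      "endTime": "2023-07-04T05:45:47.766825274Z"
    }
  ]
}
"""


def parse_time(ts: str) -> datetime:
    """Parse an RFC 3339 UTC timestamp, truncating nanoseconds to microseconds."""
    base, _, frac = ts.rstrip("Z").partition(".")
    micros = int((frac + "000000")[:6])
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)


def summarize(record: dict) -> dict:
    """Convert one rebuild record into bytes and a duration in microseconds."""
    block_size = record["blockSize"]
    elapsed = parse_time(record["endTime"]) - parse_time(record["startTime"])
    return {
        "kind": "partial" if record["isPartial"] else "full",
        "state": record["rebuildJobState"],
        "total_bytes": record["blocksTotal"] * block_size,
        "recovered_bytes": record["blocksRecovered"] * block_size,
        "transferred_bytes": record["blocksTransferred"] * block_size,
        "duration_us": elapsed // timedelta(microseconds=1),
    }


summaries = [summarize(r) for r in json.loads(SAMPLE_JSON)["records"]]
for summary in summaries:
    print(summary)
```

For the sample record, total_bytes is 14302 × 512 = 7,322,624 bytes, which is consistent with the 7MiB shown in the table output. To use this on live data, the output of the -ojson command could be saved to a file and loaded with json.load instead of the embedded string.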

Benefits of Rebuilds

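  • Efficiency: A partial rebuild transfers only the data that changed while the replica was offline.
  • Speed: Copying less data shortens the rebuild, so full redundancy returns sooner.
  • Resource Optimization: Skipping unchanged data avoids unnecessary load on storage and the network.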
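
The efficiency difference between the two rebuild types can be illustrated with a toy model. The Python sketch below is a deliberately simplified assumption: replicas are modeled as lists of blocks and the write log as a set of block indexes. The documentation does not describe how DataCore Puls8 stores its log, and the names WriteLog, partial_rebuild, and full_rebuild are hypothetical.

```python
from typing import List, Set


class WriteLog:
    """Tracks which block indexes were written while a replica was offline.

    Simplified illustration only; not the actual DataCore Puls8 log format.
    """

    def __init__(self) -> None:
        self.dirty: Set[int] = set()

    def record_write(self, first_block: int, block_count: int) -> None:
        self.dirty.update(range(first_block, first_block + block_count))


def partial_rebuild(source: List[bytes], target: List[bytes], log: WriteLog) -> int:
    """Copy only the logged blocks; return the number of blocks copied."""
    for index in sorted(log.dirty):
        target[index] = source[index]
    return len(log.dirty)


def full_rebuild(source: List[bytes], target: List[bytes]) -> int:
    """Copy every block; return the number of blocks copied."""
    target[:] = source
    return len(source)


healthy = [b"a"] * 8   # replica that stayed online
stale = list(healthy)  # replica that went offline with identical contents
log = WriteLog()

# Writes that reached only the healthy replica during the outage.
for first, count, data in [(2, 2, b"x"), (3, 1, b"y")]:
    for i in range(first, first + count):
        healthy[i] = data
    log.record_write(first, count)

copied = partial_rebuild(healthy, stale, log)
print(copied, stale == healthy)
```

In this model the partial rebuild copies two of the eight blocks, while full_rebuild would copy all eight regardless of how little changed.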
