Rebuilds
Overview
DataCore Puls8 maintains high availability and data redundancy by automatically managing replica rebuilds. If a replica becomes unavailable due to node issues, I/O failures, or temporary network disruptions, DataCore Puls8 restores redundancy by initiating a rebuild process.
Depending on the situation, it performs either a full rebuild (restores the entire replica) or a partial rebuild (restores only the changed data). This flexibility ensures fast recovery with minimal disruption to applications.
How Rebuilds Work
When a volume target detects that one of its replicas is unresponsive or returns an I/O error, DataCore Puls8:
- Marks the replica as faulted and removes it from the I/O path.
- Starts logging writes to keep track of all changes made while the replica is offline.
- Waits for a configurable period (default: 10 minutes) before deciding the next step:
  - If the replica comes back online within this period, a partial rebuild is performed using the logged changes.
  - If not, a new replica is provisioned and a full rebuild is initiated.

The wait time is defined by the --faulted-child-wait-period parameter in the data plane. The default is 10 minutes, configurable via values.yaml.
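The decision described above can be modeled in a few lines of Python. This is only an illustrative sketch, not DataCore Puls8 code: the function name choose_rebuild and its arguments are hypothetical, and treating a replica that returns exactly at the end of the wait period as "within" it is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

# Default taken from the text above; the real value is set through the
# --faulted-child-wait-period parameter of the data plane.
DEFAULT_WAIT_PERIOD = timedelta(minutes=10)


def choose_rebuild(
    faulted_at: datetime,
    back_online_at: Optional[datetime],
    wait_period: timedelta = DEFAULT_WAIT_PERIOD,
) -> str:
    """Return "partial" or "full" for a faulted replica (hypothetical model).

    back_online_at is None when the replica never came back.
    """
    # Replica returned within the wait period: replay only the logged writes.
    # Using <= for the boundary is an assumption of this sketch.
    if back_online_at is not None and back_online_at - faulted_at <= wait_period:
        return "partial"
    # Otherwise a new replica is provisioned and rebuilt in full.
    return "full"


t0 = datetime(2023, 7, 4, 5, 40, 0)
print(choose_rebuild(t0, t0 + timedelta(minutes=3)))   # partial
print(choose_rebuild(t0, t0 + timedelta(minutes=25)))  # full
print(choose_rebuild(t0, None))                        # full
```

A shorter wait period provisions a replacement replica sooner but makes a full rebuild more likely after brief outages. A longer one favors partial rebuilds at the cost of running with reduced redundancy for longer.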
Viewing Rebuild History
The data plane records both full and partial rebuild operations, allowing you to audit and analyze replica health over time. You can retrieve this data in table or JSON format using kubectl puls8 mayastor.
kubectl puls8 mayastor get rebuild-history {your_volume_UUID}
DST SRC STATE TOTAL RECOVERED TRANSFERRED IS-PARTIAL START-TIME END-TIME
b5de71a6-055d-433a-a1c5-2b39ade05d86 0dafa450-7a19-4e21-a919-89c6f9bd2a97 Completed 7MiB 7MiB 0 B true 2023-07-04T05:45:47Z 2023-07-04T05:45:47Z
b5de71a6-055d-433a-a1c5-2b39ade05d86 0dafa450-7a19-4e21-a919-89c6f9bd2a97 Completed 7MiB 7MiB 0 B true 2023-07-04T05:45:46Z 2023-07-04T05:45:46Z
kubectl puls8 mayastor get rebuild-history {your_volume_UUID} -ojson
{
"targetUuid": "c9eb4172-e90c-40ca-9db0-26b2ae372b28",
"records": [
{
"childUri": "nvmf://10.1.0.9:8420/nqn.2019-05.io.openebs:b5de71a6-055d-433a-a1c5-2b39ade05d86?uuid=b5de71a6-055d-433a-a1c5-2b39ade05d86",
"srcUri": "bdev:///0dafa450-7a19-4e21-a919-89c6f9bd2a97?uuid=0dafa450-7a19-4e21-a919-89c6f9bd2a97",
"rebuildJobState": "Completed",
"blocksTotal": 14302,
"blocksRecovered": 14302,
"blocksTransferred": 0,
"blocksRemaining": 0,
"blockSize": 512,
"isPartial": true,
"startTime": "2023-07-04T05:45:47.765932276Z",
"endTime": "2023-07-04T05:45:47.766825274Z"
},
{
"childUri": "nvmf://10.1.0.9:8420/nqn.2019-05.io.openebs:b5de71a6-055d-433a-a1c5-2b39ade05d86?uuid=b5de71a6-055d-433a-a1c5-2b39ade05d86",
"srcUri": "bdev:///0dafa450-7a19-4e21-a919-89c6f9bd2a97?uuid=0dafa450-7a19-4e21-a919-89c6f9bd2a97",
"rebuildJobState": "Completed",
"blocksTotal": 14302,
"blocksRecovered": 14302,
"blocksTransferred": 0,
"blocksRemaining": 0,
"blockSize": 512,
"isPartial": true,
"startTime": "2023-07-04T05:45:46.242015389Z",
"endTime": "2023-07-04T05:45:46.242927463Z"
}
]
}
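To make the JSON fields easier to interpret, the following minimal Python sketch converts the block counts of one record into bytes and computes the rebuild duration. It assumes blockSize is expressed in bytes, which is consistent with the 7MiB total in the table output (14302 × 512 = 7,322,624 bytes, about 6.98 MiB). The helper names parse_timestamp and summarize are illustrative and are not part of any DataCore Puls8 tooling.

```python
import json
from datetime import datetime, timezone

# First record from the JSON output above, trimmed to the fields used here.
SAMPLE = """
{
  "records": [
    {
      "rebuildJobState": "Completed",
      "blocksTotal": 14302,
      "blocksRecovered": 14302,
      "blocksTransferred": 0,
      "blockSize": 512,
      "isPartial": true,
      "startTime": "2023-07-04T05:45:47.765932276Z",
      "endTime": "2023-07-04T05:45:47.766825274Z"
    }
  ]
}
"""


def parse_timestamp(ts: str) -> datetime:
    """Parse a UTC timestamp with up to nanosecond precision.

    datetime stores microseconds only, so extra digits are truncated.
    """
    base, _, frac = ts.rstrip("Z").partition(".")
    micros = (frac + "000000")[:6]
    parsed = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def summarize(record: dict) -> dict:
    """Convert block counts to bytes (assuming blockSize is in bytes)."""
    size = record["blockSize"]
    duration = parse_timestamp(record["endTime"]) - parse_timestamp(record["startTime"])
    return {
        "kind": "partial" if record["isPartial"] else "full",
        "state": record["rebuildJobState"],
        "total_bytes": record["blocksTotal"] * size,
        "recovered_bytes": record["blocksRecovered"] * size,
        "transferred_bytes": record["blocksTransferred"] * size,
        "duration_seconds": duration.total_seconds(),
    }


summary = summarize(json.loads(SAMPLE)["records"][0])
print(summary["kind"], summary["state"], summary["total_bytes"])
# partial Completed 7322624
```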
- Rebuild history is available only while the volume target remains active.
- If the volume target is deleted or re-created (for example, after a node failure), previous rebuild records are lost.
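Because the history does not survive a target being deleted or re-created, it can be useful to copy it out periodically. The sketch below is one possible approach, not a DataCore Puls8 feature: it runs the -ojson command shown earlier and appends unseen records to a local JSON Lines file. The function names, the archive file, and the use of (childUri, startTime) as a record's identity are all assumptions made for illustration.

```python
import json
import subprocess
from pathlib import Path


def fetch_history(volume_uuid: str) -> dict:
    """Run the rebuild-history command from this page and parse its JSON."""
    result = subprocess.run(
        ["kubectl", "puls8", "mayastor", "get", "rebuild-history",
         volume_uuid, "-ojson"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def archive_history(history: dict, archive_path: Path) -> int:
    """Append records not yet archived; return the number added.

    A record is identified by (childUri, startTime), which is an
    assumption of this sketch rather than a documented guarantee.
    """
    seen = set()
    if archive_path.exists():
        for line in archive_path.read_text().splitlines():
            if line.strip():
                rec = json.loads(line)
                seen.add((rec["childUri"], rec["startTime"]))
    added = 0
    with archive_path.open("a") as out:
        for rec in history.get("records", []):
            key = (rec["childUri"], rec["startTime"])
            if key not in seen:
                out.write(json.dumps(rec) + "\n")
                seen.add(key)
                added += 1
    return added
```

Calling archive_history(fetch_history(volume_uuid), Path("rebuild-history.jsonl")) from a scheduled job would keep a running copy. Repeated runs add only records that are not already in the file.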
Benefits of Rebuilds
- Efficiency: A partial rebuild transfers only the data that changed while the replica was offline.
- Speed: Reduced rebuild time means quicker recovery.
- Resource Optimization: Avoids unnecessary load on storage and network.