Creating a DiskPool

Overview

In DataCore Puls8, storage is provisioned from pools of block devices known as DiskPools. These pools serve as the foundation for creating Replicated Persistent Volumes (PVs) with Replicated PV Mayastor. A DiskPool is tied to a specific node and manages a single block device that defines its total capacity.

This document provides guidance on configuring and managing DiskPools, including best practices, supported device schemes, and procedures for verifying pool status.

DiskPool Creation

When a node in the cluster is designated to host a replica of a PV, it utilizes a DiskPool to allocate the required storage capacity. Each node can maintain one or more DiskPools, but ownership of a pool is exclusive to a single node. Furthermore, each DiskPool is associated with only one block device, which defines its total capacity and forms the underlying data persistence layer.

DiskPools are defined declaratively through DiskPool Custom Resources (CRs) within the cluster. These resources must be created in the same namespace where the Replicated PV Mayastor component is deployed. The DiskPool specification allows you to configure the pool's unique name, the host node on which it will reside, and the device reference to be used. The referenced block device must adhere to supported URI schemes depending on the transport or device type.
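
Before applying a DiskPool CR, you can confirm that the CRD is registered in the cluster. This is a minimal sketch assuming the CRD keeps the upstream OpenEBS name (diskpools.openebs.io), consistent with the apiVersion "openebs.io/v1beta3" used in the examples below; verify the name against your Puls8 release.

Verify the DiskPool CRD Is Registered
# Assumes the upstream CRD name; adjust if your release renames it
kubectl get crd diskpools.openebs.io
# Inspect the schema of the spec fields discussed in this document
kubectl explain diskpool.spec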

Supported Device Schemes

It is recommended to use persistent device links that are stable across node reboots, such as by-id or by-path.

Avoid using /dev/sdx paths, as device naming can change after a reboot, potentially leading to data corruption.

Type                                                  Format        Example
Disk (non-PCI) with persistent link (Best Practice)   Device File   aio:///dev/disk/by-id/... or uring:///dev/disk/by-id/...
Asynchronous Disk (AIO)                                Device File   /dev/sdx
Asynchronous Disk (AIO)                                URI           aio:///dev/sdx
io_uring                                               URI           uring:///dev/sdx

Use the following command to identify device links for block devices on a node:

Retrieve Device Links for a Node
kubectl puls8 mayastor get block-devices <node-name>
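
If the plugin is unavailable, the same links can be inspected directly on the node with standard Linux tooling; each by-id symlink resolves to the current kernel device name:

List Persistent Device Links on a Node
# Run on the storage node; each stable link points at its current kernel name (e.g. ../../sdb)
ls -l /dev/disk/by-id/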

Pool Behavior

  • Once a DiskPool is created, it is assumed to have exclusive use of the associated block device from that point on.
  • Do not partition, format, or reuse this device for any other process; a quick pre-creation check is sketched below.
  • Any existing data on the device will be erased during pool creation.
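
Before creating a pool, it is worth confirming that the target device carries no filesystem or partition signatures. The check below uses standard Linux utilities rather than anything Puls8-specific, and <id> is a placeholder for your device link:

Check a Device for Existing Signatures
# Empty FSTYPE output indicates no known filesystem signature on the device
lsblk -f /dev/disk/by-id/<id>
# Without flags, wipefs only lists signatures; it does not erase anything
wipefs /dev/disk/by-id/<id>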

RAM drives are not suitable for production use, as they are backed by volatile memory. The memory for this disk emulation is allocated from the hugepages pool, so make sure to allocate sufficient additional hugepages on any storage node that will provide this type of storage.

DiskPool Configuration

A DiskPool configuration specifies three primary components:

  • Node: The specific worker node where the pool will be created. Each DiskPool is tightly bound to a single node.
  • Disks: The block device (ideally referenced by a persistent device link) that provides the pool's storage capacity. Although the field accepts a list, a DiskPool currently supports exactly one disk.
  • Topology (Optional): Key-value labels used for topology-aware replica placement, helping the scheduler make intelligent decisions about replica distribution based on zone, rack, or other infrastructure details.

Basic DiskPool Definition

To get started, you must configure at least one DiskPool on each node that will participate in replication. The number of nodes with available pools should equal or exceed the desired replication factor.

For example, provisioning a PV with 3 replicas requires 3 nodes, each with at least one DiskPool backed by a unique block device.

YAML – Create a Basic DiskPool
cat <<EOF | kubectl create -f -
apiVersion: "openebs.io/v1beta3"
kind: DiskPool
metadata:
  name: pool-on-node-1
  namespace: puls8
spec:
  node: INSERT_WORKERNODE_HOSTNAME_HERE
  disks: ["aio:///dev/disk/by-id/<id>"]
EOF
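
With a pool available on each of the three nodes, a StorageClass can request the replication factor. This is a sketch only: the provisioner name (io.openebs.csi-mayastor) and the repl/protocol parameters are assumed from the upstream OpenEBS Replicated PV Mayastor documentation and should be verified against your Puls8 release.

YAML – Example StorageClass Requesting 3 Replicas (Assumed Parameter Names)
cat <<EOF | kubectl create -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-3-replicas
parameters:
  repl: "3"        # one replica per DiskPool-backed node
  protocol: nvmf   # replicas are accessed over NVMe-oF
provisioner: io.openebs.csi-mayastor
EOF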

Topology-Aware DiskPool Configuration

To enable topology-aware replica placement (for example, via the poolHasTopologyKey or poolAffinityTopologyLabel parameters in a StorageClass), assign labels to your DiskPools.

Option 1: Create Pool with Labels

YAML – Create DiskPool with Topology Labels
cat <<EOF | kubectl create -f -
apiVersion: "openebs.io/v1beta3"
kind: DiskPool
metadata:
  name: pool-on-node-1
  namespace: puls8
spec:
  node: INSERT_WORKERNODE_HOSTNAME_HERE
  disks: ["aio:///dev/disk/by-id/<id>"]
  topology:
    labelled:
      topology-key: topology-value
EOF
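
A StorageClass can then restrict replica placement to pools that carry these labels. The sketch below assumes poolAffinityTopologyLabel behaves as in the upstream OpenEBS documentation, where its value lists the labels a candidate pool must carry; confirm the parameter names for your release.

YAML – Example StorageClass Using poolAffinityTopologyLabel (Assumed Semantics)
cat <<EOF | kubectl create -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-topology-aware
parameters:
  repl: "1"
  protocol: nvmf
  # Replicas may only land on pools labelled topology-key: topology-value
  poolAffinityTopologyLabel: |
    topology-key: topology-value
provisioner: io.openebs.csi-mayastor
EOF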

Option 2: Label Existing Pools Using Plugin

Add Labels to Existing Pool
kubectl puls8 mayastor label pool pool-on-node-1 topology-key=topology-value -n puls8

For reference, the generic DiskPool template used by both options is:

YAML – DiskPool Template
apiVersion: "openebs.io/v1beta3"
kind: DiskPool
metadata:
  name: INSERT_POOL_NAME_HERE
  namespace: puls8
spec:
  node: INSERT_WORKERNODE_HOSTNAME_HERE
  disks: ["INSERT_DEVICE_URI_HERE"]

Verifying DiskPool Status

To confirm that DiskPools have been successfully created and are operational, use the following command:

View DiskPool Status
kubectl get dsp -n puls8
Example Output - DiskPools
NAME             NODE            STATE     POOL_STATUS   ENCRYPTED   CAPACITY   USED   AVAILABLE
pool-on-node-0   node-0-352384   Created   Online        false       15 GiB     0 B    15 GiB
pool-on-node-1   node-1-352384   Created   Online        false       15 GiB     0 B    15 GiB
pool-on-node-2   node-2-352384   Created   Online        false       15 GiB     0 B    15 GiB

Pools should report a STATE of Created and a POOL_STATUS of Online. If any pool is missing, misconfigured, or marked Offline, consult the operator logs or the validation commands shown below.
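
For a pool that fails to come Online, describing the DiskPool resource surfaces recent events and status conditions (dsp is the short name used above). The log selector below is an assumption and may differ in your deployment:

Inspect a Problematic DiskPool
kubectl describe dsp pool-on-node-1 -n puls8
# Assumed label selector for the DiskPool operator pod; adjust to your deployment
kubectl logs -n puls8 -l app=operator-diskpool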
