Creating a DiskPool
Overview
In DataCore Puls8, storage is provisioned using pools of block devices known as DiskPools. These pools serve as the foundation for creating Replicated Persistent Volumes (PVs) with Replicated PV Mayastor. A DiskPool is tied to a specific node and manages a single block device that defines its total capacity.
This document provides guidance on configuring and managing DiskPools, including best practices, supported device schemes, and procedures for verifying pool status.
DiskPool Creation
When a node in the cluster is designated to host a replica of a PV, it utilizes a DiskPool to allocate the required storage capacity. Each node can maintain one or more DiskPools, but ownership of a pool is exclusive to a single node. Furthermore, each DiskPool is associated with only one block device, which defines its total capacity and forms the underlying data persistence layer.
DiskPools are defined declaratively through DiskPool Custom Resources (CRs) within the cluster. These resources must be created in the same namespace where the Replicated PV Mayastor component is deployed. The DiskPool specification allows you to configure the pool's unique name, the host node on which it will reside, and the device reference to be used. The referenced block device must adhere to supported URI schemes depending on the transport or device type.
Supported Device Schemes
It is recommended to use persistent device links that remain stable across node reboots, such as /dev/disk/by-id or /dev/disk/by-path. Avoid /dev/sdx paths, as device naming can change after a reboot, potentially leading to data corruption.
Type | Format | Example |
---|---|---|
Disk (non-PCI) with persistent link (Best Practice) | Device File | aio:///dev/disk/by-id/... or uring:///dev/disk/by-id/... |
Asynchronous Disk (AIO) | Device File | /dev/sdx |
AIO URI | URI | aio:///dev/sdx |
io_uring URI | URI | uring:///dev/sdx |
Use the following command to identify device links for block devices on a node:
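For example, listing the by-id links directly on the node shows which stable identifiers map to which devices (a standard Linux interface; run it on the node that will host the pool):

ls -l /dev/disk/by-id/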
Pool Behavior
- Once a DiskPool is created, it is assumed to have exclusive use of the associated block device.
- Do not partition, format, or reuse this device for any other process.
- Any existing data on the device will be erased during pool creation.
RAM drives are not suitable for production use, as they use volatile memory to back the data. The memory for this disk emulation is allocated from the hugepages pool, so make sure to allocate sufficient additional hugepages on any storage node that will provide this type of storage.
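As a quick check, the standard Linux meminfo interface shows the hugepages currently allocated and free on a node (the counts you need depend on your own sizing):

grep Huge /proc/meminfo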
DiskPool Configuration
A DiskPool configuration specifies three primary components:
- Node: The specific worker node where the pool will be created. Each DiskPool is tightly bound to a single node.
- Disks: The block device (typically referenced using a persistent device link) that provides the pool's storage capacity. The field accepts a list, but a DiskPool currently supports only one disk.
- Topology (Optional): Key-value labels used for topology-aware replica placement, helping the scheduler make intelligent decisions about replica distribution based on zone, rack, or other infrastructure details.
Basic DiskPool Definition
To get started, you must configure at least one DiskPool per node involved in replication. The number of available pools should equal or exceed the desired replication factor.
For example, provisioning a PV with 3 replicas requires 3 nodes, each with at least one DiskPool on a unique block device.
cat <<EOF | kubectl create -f -
apiVersion: "openebs.io/v1beta3"
kind: DiskPool
metadata:
  name: pool-on-node-1
  namespace: puls8
spec:
  node: INSERT_WORKERNODE_HOSTNAME_HERE
  disks: ["aio:///dev/disk/by-id/<id>"]
EOF
Topology-Aware DiskPool Configuration
To enable topology-aware replica placement (for example, with poolHasTopologyKey or poolAffinityTopologyLabel in a StorageClass), assign labels to your DiskPools.
Option 1: Create Pool with Labels
cat <<EOF | kubectl create -f -
apiVersion: "openebs.io/v1beta3"
kind: DiskPool
metadata:
  name: pool-on-node-1
  namespace: puls8
spec:
  node: INSERT_WORKERNODE_HOSTNAME_HERE
  disks: ["aio:///dev/disk/by-id/<id>"]
  topology:
    labelled:
      topology-key: topology-value
EOF
Option 2: Label Existing Pools Using Plugin
kubectl puls8 mayastor label pool pool-on-node-1 topology-key=topology-value -n puls8
apiVersion: "openebs.io/v1beta3"
kind: DiskPool
metadata:
name: INSERT_POOL_NAME_HERE
namespace: puls8
spec:
node: INSERT_WORKERNODE_HOSTNAME_HERE
disks: ["INSERT_DEVICE_URI_HERE"]
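Once pools carry labels, a StorageClass can target them through the parameters mentioned above. The following is a minimal sketch, assuming the upstream Mayastor provisioner name (io.openebs.csi-mayastor) and parameter syntax apply to your deployment; the poolAffinityTopologyLabel value must match the labels assigned to the pools:

cat <<EOF | kubectl create -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-topology-aware
parameters:
  repl: "3"
  protocol: nvmf
  poolAffinityTopologyLabel: |
    topology-key: topology-value
provisioner: io.openebs.csi-mayastor
EOF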
Verifying DiskPool Status
To confirm that DiskPools have been successfully created and are operational, list the DiskPool custom resources in the deployment namespace, for example:
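kubectl get diskpools -n puls8

Expected output: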
NAME NODE STATE POOL_STATUS ENCRYPTED CAPACITY USED AVAILABLE
pool-on-node-0 node-0-352384 Created Online false 15 GiB 0 B 15 GiB
pool-on-node-1 node-1-352384 Created Online false 15 GiB 0 B 15 GiB
pool-on-node-2 node-2-352384 Created Online false 15 GiB 0 B 15 GiB
Pools should report a STATE of Created and a POOL_STATUS of Online. If any pool is missing, misconfigured, or marked Offline, consult the operator logs or validation commands.
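For example, describing the affected pool with kubectl surfaces its current status and recorded events (a generic kubectl technique; substitute the name of the pool in question):

kubectl -n puls8 describe diskpool pool-on-node-1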