Installing DataCore Puls8 on GKE
Overview
This document outlines the steps required to install and configure Replicated PV Mayastor on Google Kubernetes Engine (GKE). It covers key prerequisites, commands for setting up local SSDs, disk pool creation, and considerations for node failure recovery. Leveraging DataCore Puls8 with GKE local SSDs provides high-performance, resilient storage in Kubernetes environments.
GKE with Local SSDs
Local SSDs in GKE are physically attached to the virtual machine (VM) instances and provide high Input/Output Operations Per Second (IOPS) and low latency. However, they are ephemeral. Data is lost if the node is deleted, restarted, or rescheduled.
To overcome the limitations of ephemeral storage, DataCore Puls8 with Mayastor provides:
- Data Replication: Replicates data across nodes to prevent data loss during node failure.
- High Performance: Utilizes the native performance of local SSDs for latency-sensitive workloads.
- Cloud-Native Resilience: Ensures storage resilience and availability in a Kubernetes-native way.
Additional local SSDs must be attached during cluster creation. You cannot add SSDs to an existing node pool.
Requirements
Node Image
Replicated PV Mayastor requires GKE nodes to be provisioned with the ubuntu_containerd image. Ensure this image type is selected during cluster creation.
Hardware and Node Configuration
- Minimum of 3 worker nodes is required.
- The number of nodes must be equal to or exceed the replication factor (for synchronous replication).
Adding Local SSDs (Block Device Mode)
Use the following command to create a GKE cluster with local SSDs in block mode:
gcloud container clusters create <CLUSTER_NAME> \
--machine-type <MACHINE_TYPE> \
--num-nodes <NUMBER_OF_NODES> \
--zone <ZONE> \
--local-nvme-ssd-block count=1 \
--image-type ubuntu_containerd
Enable HugePages
HugePages of size 2MiB must be enabled on each storage node. Ensure at least 1024 HugePages (2GiB) are reserved for Mayastor IO Engine pods.
SSH into each GKE node (see the GKE node SSH instructions) and configure HugePages.
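A minimal sketch, assuming a sysctl-based setup on each storage node; <NODE_NAME> and <ZONE> are placeholders, and 1024 matches the 2GiB minimum above:
gcloud compute ssh <NODE_NAME> --zone <ZONE>
echo 1024 | sudo tee /proc/sys/vm/nr_hugepages
echo "vm.nr_hugepages = 1024" | sudo tee -a /etc/sysctl.conf   # persist across reboots
sudo systemctl restart kubelet   # advertise the new HugePages capacity to Kubernetes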
Load Kernel Modules
SSH into each GKE node and load the required nvme_tcp kernel module:
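A minimal example, assuming Ubuntu's systemd modules-load mechanism for persistence:
sudo modprobe nvme_tcp
echo nvme_tcp | sudo tee /etc/modules-load.d/nvme_tcp.conf   # persist across reboots
lsmod | grep nvme_tcp   # confirm the module is loaded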
Prepare the Cluster
Refer to the DataCore Puls8 Prerequisites documentation for steps to prepare the cluster environment.
Configure ETCD and Loki Storage Classes
Use the GKE standard-rwo StorageClass for persistent volumes used by ETCD and Loki.
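You can confirm the class is available on the cluster before installing:
kubectl get storageclass standard-rwo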
Install DataCore Puls8 on GKE
Install DataCore Puls8 using Helm:
helm install openebs --namespace openebs openebs/openebs \
--create-namespace \
--set openebs-crds.csi.volumeSnapshots.enabled=false \
--set mayastor.etcd.localpvScConfig.enabled=false \
--set mayastor.etcd.persistence.enabled=true \
--set mayastor.etcd.persistence.storageClass=standard-rwo \
--set mayastor.loki-stack.localpvScConfig.enabled=false \
--set mayastor.loki-stack.loki.persistence.enabled=true \
--set mayastor.loki-stack.loki.persistence.storageClassName=standard-rwo
GKE pre-installs volume snapshot CRDs. Disable them in the Helm chart to avoid resource conflicts during installation.
Verify Installation
Use the following command to verify that all pods are running:
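kubectl get pods -n openebs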
NAME READY STATUS RESTARTS AGE
openebs-agent-core-674f784df5-7szbm 2/2 Running 0 11m
openebs-agent-ha-node-nnkmv 1/1 Running 0 11m
openebs-agent-ha-node-pvcrr 1/1 Running 0 11m
openebs-agent-ha-node-rqkkk 1/1 Running 0 11m
openebs-api-rest-79556897c8-b824j 1/1 Running 0 11m
openebs-csi-controller-b5c47d49-5t5zd 6/6 Running 0 11m
openebs-csi-node-flq49 2/2 Running 0 11m
openebs-csi-node-k8d7h 2/2 Running 0 11m
openebs-csi-node-v7jfh 2/2 Running 0 11m
openebs-etcd-0 1/1 Running 0 11m
openebs-etcd-1 1/1 Running 0 11m
openebs-etcd-2 1/1 Running 0 11m
openebs-io-engine-7t6tf 2/2 Running 0 11m
openebs-io-engine-9df6r 2/2 Running 0 11m
openebs-io-engine-rqph4 2/2 Running 0 11m
openebs-localpv-provisioner-6ddf7c7978-4fkvs 1/1 Running 0 11m
openebs-loki-0 1/1 Running 0 11m
openebs-lvm-localpv-controller-7b6d6b4665-fk78q 5/5 Running 0 11m
openebs-lvm-localpv-node-mcch4 2/2 Running 0 11m
openebs-lvm-localpv-node-pdt88 2/2 Running 0 11m
openebs-lvm-localpv-node-r9jn2 2/2 Running 0 11m
openebs-nats-0 3/3 Running 0 11m
openebs-nats-1 3/3 Running 0 11m
openebs-nats-2 3/3 Running 0 11m
openebs-obs-callhome-854bc967-5f879 2/2 Running 0 11m
openebs-operator-diskpool-5586b65c-cwpr8 1/1 Running 0 11m
openebs-promtail-2vrzk 1/1 Running 0 11m
openebs-promtail-mwxk8 1/1 Running 0 11m
openebs-promtail-w7b8k 1/1 Running 0 11m
openebs-zfs-localpv-controller-f78f7467c-blr7q 5/5 Running 0 11m
openebs-zfs-localpv-node-h46m5 2/2 Running 0 11m
openebs-zfs-localpv-node-svfgq 2/2 Running 0 11m
openebs-zfs-localpv-node-wm9ks 2/2 Running 0 11m
Create and Configure Disk Pools
List Available Block Devices
Use the kubectl puls8 mayastor plugin to list available block devices:
kubectl puls8 mayastor get block-devices gke-gke-ssd-default-pool-<NODE_ID>
Identify SSD Block Size
Run the following commands on the node to determine the block size:
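sudo fdisk -l /dev/nvme1n1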
Disk /dev/nvme1n1: 375 GiB, 402653184000 bytes, 98304000 sectors
Disk model: nvme_card0
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
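lsblk -o NAME,PHY-SEC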
NAME PHY-SEC
nvme0n1 512
├─nvme0n1p1 512
├─nvme0n1p14 512
└─nvme0n1p15 512
nvme1n1 4096
Create a Pool YAML
Below is a sample pool.yaml to define a DiskPool:
apiVersion: "openebs.io/v1beta2"
kind: DiskPool
metadata:
name: pool-1
namespace: mayastor
spec:
node: <NODE_NAME>
disks:
- "aio:////dev/disk/by-id/google-local-nvme-ssd-0?blk_size=4096"
Storage Configuration and Application Deployment
- Refer to the Creating a StorageClass Documentation for StorageClass creation (a minimal example follows this list).
- Refer to the Deploying Workloads Documentation for steps to deploy workloads.
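For reference, a minimal replicated StorageClass sketch based on the upstream OpenEBS Mayastor format; the provisioner and parameter names are assumptions to verify against the linked documentation. repl: "3" matches the three-node requirement above:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-3-replicas
parameters:
  protocol: nvmf
  repl: "3"
provisioner: io.openebs.csi-mayastor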
Node Failure Handling
GKE nodes are part of a managed instance group. When a node is terminated, GKE automatically creates a replacement node with a new SSD.
In case of node failure:
- The associated pool becomes Unknown.
- The Mayastor volume enters a Degraded state due to replica loss.
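Both states can be inspected with the plugin, assuming the same command pattern as the block-device listing above:
kubectl puls8 mayastor get pools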
NAME NODE STATE POOL_STATUS CAPACITY USED AVAILABLE
pool-1 gke-gke-local-ssd-default-pool-dd2b0b02-08cs Created Online 402258919424 5368709120 396890210304
pool-2 gke-gke-local-ssd-default-pool-dd2b0b02-n6wq Created Online 402258919424 5368709120 396890210304
pool-3 gke-gke-local-ssd-default-pool-dd2b0b02-8twd Created Unknown 0 0 0
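And the volume state (same plugin-command assumption):
kubectl puls8 mayastor get volumes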
ID REPLICAS TARGET-NODE ACCESSIBILITY STATUS SIZE THIN-PROVISIONED ALLOCATED SNAPSHOTS SOURCE
fa486a03-d806-4b5c-a534-5666900853a2 3 gke-gke-local-ssd-default-pool-dd2b0b02-08cs nvmf Degraded 5GiB false 5GiB 0 <none>
Recovery Steps
Reapply the node-level configuration on the replacement node, then recreate the pool:
- Reconfigure HugePages.
- Load the nvme_tcp kernel module.
- Create a new DiskPool with a new name.
Once the new pool is created, the degraded volume returns to the Online state after the replica rebuild completes.
Benefits of Using DataCore Puls8 with GKE
- High Availability: Data is synchronously replicated across multiple nodes, ensuring availability even if a node or disk fails.
- Performance Optimization: DataCore Puls8 leverages GKE’s local SSDs, which offer high IOPS and low latency, ideal for performance-intensive applications such as databases and analytics.
- Fast Recovery: In the event of node failure, replica rebuilding ensures rapid restoration of full redundancy with minimal administrative intervention.
- Kubernetes-Native Operations: Seamless integration with Kubernetes through CSI drivers and custom resources allows declarative, consistent, and scalable storage operations.