Installing DataCore Puls8 on GKE

Overview

This document outlines the steps required to install and configure Replicated PV Mayastor on Google Kubernetes Engine (GKE). It covers the key prerequisites, the commands for setting up local SSDs and creating disk pools, and considerations for recovering from node failures. Pairing DataCore Puls8 with GKE local SSDs provides high-performance, resilient storage in Kubernetes environments.

GKE with Local SSDs

Local SSDs in GKE are physically attached to the virtual machine (VM) instances and provide high Input/Output Operations Per Second (IOPS) and low latency. However, they are ephemeral. Data is lost if the node is deleted, restarted, or rescheduled.

To overcome the limitations of ephemeral storage, DataCore Puls8 with Mayastor provides:

  • Data Replication: Replicates data across nodes to prevent data loss during node failure.
  • High Performance: Utilizes the native performance of local SSDs for latency-sensitive workloads.
  • Cloud-Native Resilience: Ensures storage resilience and availability in a Kubernetes-native way.

Additional local SSDs must be attached during cluster creation. You cannot add SSDs to an existing node pool.

Requirements

Node Image

Replicated PV Mayastor requires GKE nodes to be provisioned with the ubuntu_containerd image. Ensure this image type is selected during cluster creation.

Hardware and Node Configuration

  • A minimum of three worker nodes is required.
  • The number of worker nodes must be greater than or equal to the replication factor used for synchronous replication.

Adding Local SSDs (Block Device Mode)

Use the following command to create a GKE cluster with local SSDs in block mode:

Cluster Creation with Block Device Local SSDs
gcloud container clusters create <CLUSTER_NAME> \
  --machine-type <MACHINE_TYPE> \
  --num-nodes <NUMBER_OF_NODES> \
  --zone <ZONE> \
  --local-nvme-ssd-block count=1 \
  --image-type ubuntu_containerd
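
Once the cluster is created, configure kubectl access to it. This step is not shown above but is typically needed before the remaining commands; it uses the same placeholders as the cluster creation command.

Fetch Cluster Credentials
gcloud container clusters get-credentials <CLUSTER_NAME> --zone <ZONE>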

Enable HugePages

HugePages of size 2MiB must be enabled on each storage node. Ensure at least 1024 HugePages (2GiB) are reserved for Mayastor IO Engine pods.

SSH into each GKE node and configure HugePages as described below; refer to Google's GKE documentation for how to connect to nodes over SSH.
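
A minimal sketch of the HugePages configuration on a node is shown below. The values follow the 1024 x 2MiB requirement above; whether a kubelet restart is needed for the node to advertise the hugepages resource depends on the node image, so that step is included as an assumption.

Configure 2MiB HugePages on a Node (Sketch)
# Reserve 1024 x 2MiB HugePages immediately
sudo sysctl -w vm.nr_hugepages=1024

# Persist the setting across reboots of this node
echo "vm.nr_hugepages = 1024" | sudo tee -a /etc/sysctl.conf

# Verify the reservation
grep HugePages_Total /proc/meminfo

# Restart kubelet so the node reports the hugepages-2Mi resource (may be required)
sudo systemctl restart kubelet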

Load Kernel Modules

SSH into each GKE node and load the required nvme_tcp kernel module:

Load nvme_tcp Kernel Module
modprobe nvme_tcp
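
To keep the module loaded across reboots of the same node, a modules-load entry can be added. This is an optional sketch; a recreated node loses it anyway (see the node failure handling section below).

Persist the nvme_tcp Module (Optional Sketch)
# Load the module automatically on every boot of this node
echo nvme_tcp | sudo tee /etc/modules-load.d/nvme_tcp.conf

# Confirm the module is currently loaded
lsmod | grep nvme_tcp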

Prepare the Cluster

Refer to the DataCore Puls8 Prerequisites documentation for steps to prepare the cluster environment.

Configure ETCD and Loki Storage Classes

Use the GKE standard-rwo StorageClass for persistent volumes used by ETCD and Loki.
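
The standard-rwo StorageClass is available by default on recent GKE clusters; its presence can be confirmed with:

Verify the standard-rwo StorageClass
kubectl get storageclass standard-rwo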

Install DataCore Puls8 on GKE

Install DataCore Puls8 using Helm:

Helm Installation Command
helm install openebs --namespace openebs openebs/openebs \
  --create-namespace \
  --set openebs-crds.csi.volumeSnapshots.enabled=false \
  --set mayastor.etcd.localpvScConfig.enabled=false \
  --set mayastor.etcd.persistence.enabled=true \
  --set mayastor.etcd.persistence.storageClass=standard-rwo \
  --set mayastor.loki-stack.localpvScConfig.enabled=false \
  --set mayastor.loki-stack.loki.persistence.enabled=true \
  --set mayastor.loki-stack.loki.persistence.storageClassName=standard-rwo

GKE pre-installs the volume snapshot CRDs. The chart's own snapshot CRDs are therefore disabled in the Helm command above to avoid resource conflicts during installation.
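
To confirm that the snapshot CRDs are indeed already present on the cluster:

Check for Pre-Installed Volume Snapshot CRDs
kubectl get crd | grep volumesnapshot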

Verify Installation

Use the following command to verify that all pods are running:

Check the Status of DataCore Puls8 Components
kubectl get pods -n openebs
Sample Output
NAME                                              READY   STATUS    RESTARTS   AGE
openebs-agent-core-674f784df5-7szbm               2/2     Running   0          11m
openebs-agent-ha-node-nnkmv                       1/1     Running   0          11m
openebs-agent-ha-node-pvcrr                       1/1     Running   0          11m
openebs-agent-ha-node-rqkkk                       1/1     Running   0          11m
openebs-api-rest-79556897c8-b824j                 1/1     Running   0          11m
openebs-csi-controller-b5c47d49-5t5zd             6/6     Running   0          11m
openebs-csi-node-flq49                            2/2     Running   0          11m
openebs-csi-node-k8d7h                            2/2     Running   0          11m
openebs-csi-node-v7jfh                            2/2     Running   0          11m
openebs-etcd-0                                    1/1     Running   0          11m
openebs-etcd-1                                    1/1     Running   0          11m
openebs-etcd-2                                    1/1     Running   0          11m
openebs-io-engine-7t6tf                           2/2     Running   0          11m
openebs-io-engine-9df6r                           2/2     Running   0          11m
openebs-io-engine-rqph4                           2/2     Running   0          11m
openebs-localpv-provisioner-6ddf7c7978-4fkvs      1/1     Running   0          11m
openebs-loki-0                                    1/1     Running   0          11m
openebs-lvm-localpv-controller-7b6d6b4665-fk78q   5/5     Running   0          11m
openebs-lvm-localpv-node-mcch4                    2/2     Running   0          11m
openebs-lvm-localpv-node-pdt88                    2/2     Running   0          11m
openebs-lvm-localpv-node-r9jn2                    2/2     Running   0          11m
openebs-nats-0                                    3/3     Running   0          11m
openebs-nats-1                                    3/3     Running   0          11m
openebs-nats-2                                    3/3     Running   0          11m
openebs-obs-callhome-854bc967-5f879               2/2     Running   0          11m
openebs-operator-diskpool-5586b65c-cwpr8          1/1     Running   0          11m
openebs-promtail-2vrzk                            1/1     Running   0          11m
openebs-promtail-mwxk8                            1/1     Running   0          11m
openebs-promtail-w7b8k                            1/1     Running   0          11m
openebs-zfs-localpv-controller-f78f7467c-blr7q    5/5     Running   0          11m
openebs-zfs-localpv-node-h46m5                    2/2     Running   0          11m
openebs-zfs-localpv-node-svfgq                    2/2     Running   0          11m
openebs-zfs-localpv-node-wm9ks                    2/2     Running   0          11m
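
Optionally, wait until all components report Ready before proceeding; this uses standard kubectl and is not part of the original steps.

Wait for All Pods to Become Ready
kubectl wait --namespace openebs --for=condition=Ready pod --all --timeout=300s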

Create and Configure Disk Pools

List Available Block Devices

Use the kubectl puls8 mayastor plugin to list available block devices:

List Block Devices on a Specific Node
kubectl puls8 mayastor get block-devices gke-gke-ssd-default-pool-<NODE_ID>
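
The node name to substitute into the command above can be taken from the standard node listing:

List Cluster Nodes
kubectl get nodes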

Identify SSD Block Size

Run the following commands on the node to determine the block size:

Show Disk Details
fdisk -l /dev/nvme1n1
Sample Output
Disk /dev/nvme1n1: 375 GiB, 402653184000 bytes, 98304000 sectors
Disk model: nvme_card0
Units: sectors of 1 * 4096 = 4096 bytes
Sector size (logical/physical): 4096 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Display Block Sizes
lsblk -o NAME,PHY-SEC
Sample Output
NAME         PHY-SEC
nvme0n1          512
├─nvme0n1p1      512
├─nvme0n1p14     512
└─nvme0n1p15     512
nvme1n1         4096
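
In this example, nvme1n1 reports a 4096-byte physical sector size, which is the value passed as blk_size in the pool definition below. The physical block size can also be read directly from the device identified above:

Confirm the Physical Block Size
sudo blockdev --getpbsz /dev/nvme1n1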

Create a Pool YAML

Below is a sample pool.yaml to define a DiskPool:

pool.yaml - DiskPool Definition
apiVersion: "openebs.io/v1beta2"
kind: DiskPool
metadata:
  name: pool-1
  namespace: mayastor
spec:
  node: <NODE_NAME>
  disks:
    - "aio:////dev/disk/by-id/google-local-nvme-ssd-0?blk_size=4096"
Command
kubectl apply -f pool.yaml
Sample Output
diskpool.openebs.io/pool-1 created
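
The state of the new pool can be checked with the DiskPool listing (the same command is used again in the node failure section below). Repeat the DiskPool definition for each worker node; the examples later in this document assume pools pool-1 through pool-3, one per node.

Verify the DiskPool
kubectl get dsp -n mayastor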

Storage Configuration and Application Deployment
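
Volumes are provisioned through a Mayastor StorageClass that references the pools created above. The manifest below is an illustrative sketch only: the StorageClass name is a placeholder and the parameters (three synchronous replicas over NVMe-oF) follow upstream Mayastor conventions; refer to the DataCore Puls8 documentation for the full list of supported parameters.

Example 3-Replica StorageClass (Sketch)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-3-replicas
parameters:
  protocol: nvmf
  repl: "3"
provisioner: io.openebs.csi-mayastor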

Node Failure Handling

GKE nodes are part of a managed instance group. When a node is terminated, GKE automatically creates a replacement node with a new, empty local SSD; any data on the old SSD is lost.

In case of node failure:

  • The associated pool becomes Unknown.
  • The Mayastor volume enters a Degraded state due to replica loss.
View Pool States after a Node Failure
kubectl get dsp -n mayastor
Sample Output
NAME     NODE                                           STATE     POOL_STATUS   CAPACITY       USED         AVAILABLE
pool-1   gke-gke-local-ssd-default-pool-dd2b0b02-08cs   Created   Online        402258919424   5368709120   396890210304
pool-2   gke-gke-local-ssd-default-pool-dd2b0b02-n6wq   Created   Online        402258919424   5368709120   396890210304
pool-3   gke-gke-local-ssd-default-pool-dd2b0b02-8twd   Created   Unknown       0              0            0
View Volumes and their Status
kubectl puls8 mayastor get volumes
Sample Output
ID                                    REPLICAS  TARGET-NODE                                   ACCESSIBILITY  STATUS    SIZE  THIN-PROVISIONED  ALLOCATED  SNAPSHOTS  SOURCE 
fa486a03-d806-4b5c-a534-5666900853a2  3         gke-gke-local-ssd-default-pool-dd2b0b02-08cs  nvmf           Degraded  5GiB  false             5GiB       0          <none>

Recovery Steps

Reapply node-level configurations:

  1. Reconfigure HugePages on the replacement node.
  2. Load the nvme_tcp kernel module.
  3. Create a new DiskPool with a new name, as shown below.
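
A hypothetical pool-4.yaml mirrors the earlier pool.yaml, targeting the replacement node; the node name below is a placeholder.

pool-4.yaml - DiskPool for the Replacement Node (Sketch)
apiVersion: "openebs.io/v1beta2"
kind: DiskPool
metadata:
  name: pool-4
  namespace: mayastor
spec:
  node: <NEW_NODE_NAME>
  disks:
    - "aio:////dev/disk/by-id/google-local-nvme-ssd-0?blk_size=4096"
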
Apply Updated Pool Configuration after Node Recovery
kubectl apply -f pool-4.yaml
Sample Output
diskpool.openebs.io/pool-4 created

Once the new pool is created, the replica is rebuilt and the degraded volume returns to the Online state.
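
The rebuild progress and final volume state can be followed with the same volume listing used earlier; the volume returns to Online once all replicas are healthy.

Verify the Volume State after the Rebuild
kubectl puls8 mayastor get volumes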

Benefits of Using DataCore Puls8 with GKE

  • High Availability: Data is synchronously replicated across multiple nodes, ensuring availability even if a node or disk fails.
  • Performance Optimization: DataCore Puls8 leverages GKE’s local SSDs, which offer high IOPS and low latency, making them ideal for performance-intensive applications such as databases and analytics.
  • Fast Recovery: In the event of node failure, replica rebuilding ensures rapid restoration of full redundancy with minimal administrative intervention.
  • Kubernetes-Native Operations: Seamless integration with Kubernetes through CSI drivers and custom resources allows declarative, consistent, and scalable storage operations.
