Scaling Etcd Members

Overview

DataCore Puls8 uses an embedded etcd cluster to store its configuration and state. By default, three etcd members are provisioned through a Kubernetes StatefulSet. Increasing the etcd replica count without the additional configuration described here leaves the new member unable to start and join the cluster.

This document explains how to correctly scale the number of etcd replicas beyond the default of three while preserving data integrity and cluster stability.
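
Before making any changes, you can confirm the current cluster size by listing the etcd StatefulSet. The namespace and StatefulSet name below (puls8 and puls8-etcd) match the examples used throughout this document; adjust them if your installation differs. A healthy default installation reports 3/3 ready replicas.

Check the Current etcd StatefulSet
kubectl get sts puls8-etcd -n puls8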

Understanding StatefulSets

StatefulSets are Kubernetes resources designed for deploying and managing stateful applications. They provide:

  • Stable network identities for each pod
  • Persistent storage with consistent volume association
  • Ordered deployment, scaling, and termination of pods
  • Controlled pod management for applications that require strong consistency

In a StatefulSet with N replicas:

  • Pods are created sequentially in the order {0..N-1}.
  • Pods are terminated in reverse order {N-1..0}.
  • A pod must be running and ready before its successor is created or modified.
  • A pod must be terminated only after its successors have been removed.
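
For example, you can observe this ordering while a scaling operation is in progress. The watch command below uses the same namespace and label selector as the examples later in this document; each new pod appears only after the previous ordinal is Running and Ready.

Watch Pod Ordering During Scaling
kubectl get pods -n puls8 -l app=etcd -w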

Preliminary Step: Verify Existing DiskPools

To view existing diskpools:

View Existing DiskPools
kubectl get dsp -n puls8
Sample Output
NAME             NODE            STATE     POOL_STATUS   ENCRYPTED   CAPACITY   USED   AVAILABLE
pool-on-node-0   node-0-352384   Created   Online        false       15 GiB     0 B    15 GiB
pool-on-node-1   node-1-352384   Created   Online        false       15 GiB     0 B    15 GiB
pool-on-node-2   node-2-352384   Created   Online        false       15 GiB     0 B    15 GiB

Before making changes, take a snapshot of the current etcd database. Refer to the Disaster Recovery documentation for snapshot procedures.
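
The Disaster Recovery documentation remains the authoritative procedure. As a minimal sketch, the commands below take a snapshot from inside one of the existing etcd pods and copy it to the local machine; they assume etcdctl is available in the pod, that client TLS and authentication are not enabled, and that /tmp/etcd-backup.db is an acceptable temporary path.

Snapshot the etcd Database (illustrative)
kubectl exec -it puls8-etcd-0 -n puls8 -- bash -c "ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db"
kubectl cp puls8/puls8-etcd-0:/tmp/etcd-backup.db ./etcd-backup.db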

Procedure to Scale Up etcd Members

The process to add an additional etcd member involves four main steps:

  1. Scale the etcd StatefulSet
  2. Add a New Peer URL
  3. Create a PV
  4. Validate Key-Value Pair Consistency across all members

Scale the etcd StatefulSet

Increase the number of etcd replicas using the kubectl scale command.

Scale StatefulSet to 4 replicas
kubectl scale sts puls8-etcd -n puls8 --replicas=4
Sample Output
statefulset.apps/puls8-etcd scaled

After scaling, the new pod will be created but will remain in a Pending state until a PV is provisioned.

Check the pod status
kubectl get pods -n puls8 -l app=etcd
Sample Output
NAME           READY   STATUS    RESTARTS   AGE
puls8-etcd-0   1/1     Running   0          28d
puls8-etcd-1   1/1     Running   0          28d
puls8-etcd-2   1/1     Running   0          28d
puls8-etcd-3   0/1     Pending   0          2m34s
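
To confirm why the new pod is Pending, describe it and check its PersistentVolumeClaim; the events should indicate that the claim (data-puls8-etcd-3, the name referenced by the PV definition later in this document) has no matching PersistentVolume yet.

Inspect the Pending Pod
kubectl describe pod puls8-etcd-3 -n puls8
kubectl get pvc data-puls8-etcd-3 -n puls8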

Add a New Peer URL

Before the new pod can join the cluster, update the StatefulSet configuration to:

  • Set ETCD_INITIAL_CLUSTER_STATE to existing
  • Add the peer URL for puls8-etcd-3
Edit StatefulSet Configuration
kubectl edit sts puls8-etcd -n puls8
Required Changes
- name: ETCD_INITIAL_CLUSTER_STATE
  value: existing
- name: ETCD_INITIAL_CLUSTER
  value: puls8-etcd-0=http://puls8-etcd-0.puls8-etcd-headless.puls8.svc.cluster.local:2380,puls8-etcd-1=http://puls8-etcd-1.puls8-etcd-headless.puls8.svc.cluster.local:2380,puls8-etcd-2=http://puls8-etcd-2.puls8-etcd-headless.puls8.svc.cluster.local:2380,puls8-etcd-3=http://puls8-etcd-3.puls8-etcd-headless.puls8.svc.cluster.local:2380

Note that ETCD_INITIAL_CLUSTER must be a single comma-separated string with no embedded line breaks. Because the new pod is still Pending, these changes take effect only after the PersistentVolume is created and the pod is able to start.
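
To confirm that the edit was saved, you can inspect the rendered StatefulSet and filter for the two variables; the grep pattern below is simply one convenient way to do this.

Verify the StatefulSet Environment
kubectl get sts puls8-etcd -n puls8 -o yaml | grep -A 1 "ETCD_INITIAL"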

Create a PV

A PersistentVolume (PV) must be created manually so that the new pod's PersistentVolumeClaim can bind and the pod can start. Below is an example YAML definition:

YAML: Persistent Volume Definition
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    meta.helm.sh/release-name: puls8
    meta.helm.sh/release-namespace: puls8
    pv.kubernetes.io/bound-by-controller: "yes"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    app.kubernetes.io/managed-by: Helm
    statefulset.kubernetes.io/pod-name: puls8-etcd-3
  name: etcd-volume-3
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 2Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: data-puls8-etcd-3
    namespace: puls8
  hostPath:
    path: /var/local/puls8/etcd/pod-3
    type: ""
  persistentVolumeReclaimPolicy: Delete
  storageClassName: manual
  volumeMode: Filesystem

 

Apply the PV Configuration
kubectl apply -f pv-etcd.yaml -n puls8
Sample Output
persistentvolume/etcd-volume-3 created

This procedure assumes the use of a "manual" StorageClass.
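
Once the PV exists, the pending PVC should bind and the new pod should start. The commands below verify the binding and the pod state, using the resource names from the example PV definition above.

Verify PV Binding and Pod Startup
kubectl get pv etcd-volume-3
kubectl get pvc data-puls8-etcd-3 -n puls8
kubectl get pods -n puls8 -l app=etcd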

Validate Key-Value Pair Consistency

After the new etcd pod is Running and Ready, verify that it holds the same key-value data as the existing members (puls8-etcd-0, puls8-etcd-1, and puls8-etcd-2).

Connect to the new etcd pod
kubectl exec -it puls8-etcd-3 -n puls8 -- bash
Command (inside the pod): List all Keys
ETCDCTL_API=3 etcdctl get --prefix ""

Compare the output with the result of the same command on the existing members and verify that the key-value data matches across all etcd members.

Any discrepancy indicates possible data inconsistency or loss and must be addressed immediately.
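
As a quicker cross-check, you can also compare the members at the cluster level from inside any etcd pod. etcdctl endpoint status reports the database size, raft term, and raft index for every member; these values should be closely aligned across all four members, although the raft index may lag briefly while writes are in flight. The commands below assume the default client port and no client TLS, consistent with the plain HTTP peer URLs shown earlier.

Command (inside the pod): Compare Member Status
ETCDCTL_API=3 etcdctl endpoint status --cluster -w table
ETCDCTL_API=3 etcdctl endpoint health --cluster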

Learn More