Scaling etcd Members
Explore this Page
- Overview
- Understanding StatefulSets
- Preliminary Step: Verify Existing DiskPools
- Procedure to Scale Up etcd Members
Overview
DataCore Puls8 leverages an embedded etcd cluster to manage its configuration and state information. By default, the system provisions three etcd members through a Kubernetes StatefulSet. If you attempt to increase the number of etcd replicas without proper configuration, errors may occur.
This document provides detailed instructions to correctly scale up the number of etcd replicas beyond the default limit, ensuring that data integrity and cluster stability are maintained.
Understanding StatefulSets
StatefulSets are Kubernetes resources designed for deploying and managing stateful applications. They provide:
- Stable network identities for each pod
- Persistent storage with consistent volume association
- Ordered deployment, scaling, and termination of pods
- Controlled pod management for applications that require strong consistency
In a StatefulSet with N replicas:
- Pods are created sequentially in the order {0..N-1}.
- Pods are terminated in reverse order {N-1..0}.
- A pod must be running and ready before its successor is created or modified.
- A pod must be terminated only after its successors have been removed.
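These ordering rules apply directly to the puls8-etcd StatefulSet used by DataCore Puls8: scaling from three replicas to four appends a pod with the next ordinal, puls8-etcd-3, only after the existing pods are ready. One way to observe the pod ordinals and their creation order (the puls8 namespace is assumed here, matching the examples later in this document) is:
kubectl get pods -n puls8 --sort-by=.metadata.creationTimestamp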
Preliminary Step: Verify Existing DiskPools
To view existing diskpools:
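A command along the following lines lists them; the diskpools resource name and the puls8 namespace are assumptions based on a typical installation, so adjust them to match your cluster:
kubectl get diskpools -n puls8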
NAME NODE STATE POOL_STATUS ENCRYPTED CAPACITY USED AVAILABLE
pool-on-node-0 node-0-352384 Created Online false 15 GiB 0 B 15 GiB
pool-on-node-1 node-1-352384 Created Online false 15 GiB 0 B 15 GiB
pool-on-node-2 node-2-352384 Created Online false 15 GiB 0 B 15 GiB
Before making changes, take a snapshot of the current etcd database. Refer to the Disaster Recovery documentation for snapshot procedures.
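For reference only, a snapshot can be taken with etcdctl from inside a running member and copied out of the pod. The pod name, namespace, and file paths below are illustrative, and authentication or TLS flags may be required depending on how the cluster is secured; follow the Disaster Recovery documentation for the supported procedure.
kubectl exec -n puls8 puls8-etcd-0 -- etcdctl snapshot save /tmp/etcd-backup.db
kubectl cp puls8/puls8-etcd-0:/tmp/etcd-backup.db ./etcd-backup.db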
Procedure to Scale Up etcd Members
The process to add an additional etcd member involves four main steps:
- Scale the etcd StatefulSet
- Add a New Peer URL
- Create a PV
- Validate key-value pair consistency across all members
Scale the etcd StatefulSet
Increase the number of etcd replicas using the kubectl scale command.
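A sketch of the command, assuming the StatefulSet is named puls8-etcd and runs in the puls8 namespace (as suggested by the pod names and the claimRef used later in this document):
kubectl scale statefulset puls8-etcd -n puls8 --replicas=4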
After scaling, the new pod will be created but will remain in a Pending state until a PV is provisioned.
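Listing the pods shows the new member waiting for storage (namespace assumed as above; the output below is trimmed to the etcd pods):
kubectl get pods -n puls8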
NAME READY STATUS RESTARTS AGE
puls8-etcd-0 1/1 Running 0 28d
puls8-etcd-1 1/1 Running 0 28d
puls8-etcd-2 1/1 Running 0 28d
puls8-etcd-3 0/1 Pending 0 2m34s
Add a New Peer URL
Before the new pod can join the cluster, update the StatefulSet
configuration to:
- Set ETCD_INITIAL_CLUSTER_STATE to existing
- Add the peer URL for puls8-etcd-3
The resulting environment entries look like the following:
- name: ETCD_INITIAL_CLUSTER_STATE
  value: existing
- name: ETCD_INITIAL_CLUSTER
  value: |-
    puls8-etcd-0=http://puls8-etcd-0.puls8-etcd-headless.puls8.svc.cluster.local:2380,
    puls8-etcd-1=http://puls8-etcd-1.puls8-etcd-headless.puls8.svc.cluster.local:2380,
    puls8-etcd-2=http://puls8-etcd-2.puls8-etcd-headless.puls8.svc.cluster.local:2380,
    puls8-etcd-3=http://puls8-etcd-3.puls8-etcd-headless.puls8.svc.cluster.local:2380
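One way to apply these changes is to edit the StatefulSet in place; the name puls8-etcd and the puls8 namespace are assumed, as above:
kubectl edit statefulset puls8-etcd -n puls8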
Since the new pod is pending, these changes will take effect only after the Persistent Volume is created and the pod is able to start.
Create a PV
A PV must be manually created to bind the new etcd pod to storage. Below is an example YAML configuration:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    meta.helm.sh/release-name: puls8
    meta.helm.sh/release-namespace: puls8
    pv.kubernetes.io/bound-by-controller: "yes"
  finalizers:
    - kubernetes.io/pv-protection
  labels:
    app.kubernetes.io/managed-by: Helm
    statefulset.kubernetes.io/pod-name: puls8-etcd-3
  name: etcd-volume-3
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 2Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    # Must match the PVC that the StatefulSet generates for the new pod.
    name: data-puls8-etcd-3
    namespace: puls8
  hostPath:
    # Local path on the node that backs the new etcd member's data.
    path: /var/local/puls8/etcd/pod-3
    type: ""
  persistentVolumeReclaimPolicy: Delete
  storageClassName: manual
  volumeMode: Filesystem
This procedure assumes the use of a "manual" StorageClass.
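After saving the manifest, create the PV and confirm that the new pod leaves the Pending state. The file name below is illustrative:
kubectl apply -f etcd-volume-3.yaml
kubectl get pods -n puls8
Once the PV binds to the data-puls8-etcd-3 claim, puls8-etcd-3 should reach the Running state and report 1/1 ready.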
Validate Key-Value Pair Consistency
After the new etcd pod becomes active, ensure that it holds the same key-value data as the existing members (puls8-etcd-0, puls8-etcd-1, and puls8-etcd-2).
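A simple spot check, assuming etcdctl is available inside the pods and client authentication is not enforced (add the appropriate credential or TLS flags otherwise), is to confirm the member list and compare the number of keys reported by an existing member and by the new member:
kubectl exec -n puls8 puls8-etcd-0 -- etcdctl member list
kubectl exec -n puls8 puls8-etcd-0 -- etcdctl get "" --prefix --keys-only | wc -l
kubectl exec -n puls8 puls8-etcd-3 -- etcdctl get "" --prefix --keys-only | wc -l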
Compare the output across all etcd members and verify that the data matches. Any discrepancy indicates possible data loss and must be addressed immediately.
Learn More