Cost Optimization with Planned Downtime
Migrating an EBS-Backed StatefulSet from Multi-AZ to Single-AZ in Amazon EKS (Production Pattern)

This migration was performed on a production workload where cost reduction was prioritized over zone-level high availability.


1. Overview

This article documents the migration of a production Kubernetes StatefulSet (Kafka with ZooKeeper) backed by Amazon EBS from a Multi-AZ layout to a Single-AZ layout in Amazon EKS.

The migration was executed at the storage layer using EBS snapshots during a controlled maintenance window.

Objectives

  • Reduce cross-AZ replication and inter-AZ data transfer cost

  • Preserve existing EBS-backed data

  • Avoid provisioning a parallel Kafka cluster

  • Maintain deterministic storage recovery

  • Ensure logical RPO = 0 with verified clean shutdown and completed snapshot.

This was a cost-first architectural decision with acknowledged availability trade-offs.


2. Business Context

Kafka (with ZooKeeper) was deployed across:

  • ap-south-1b

  • ap-south-1c

Multi-AZ improved resilience, but introduced:

  • Continuous cross-AZ replication traffic

  • Inter-AZ data transfer billing

  • Increased recurring infrastructure cost

A review of recurring billing showed that inter-AZ data transfer, driven by cross-AZ replication traffic, accounted for a significant share of the monthly Kafka infrastructure spend.

While Multi-AZ improved resilience, the business determined that the availability gain did not justify the recurring transfer cost for this workload profile.

Business decision:

  • Consolidate into a single AZ

  • Downtime is acceptable

  • No data loss is allowed


3. Technical Constraint — Why This Is Not a Simple Scheduler Change

The workload runs as a StatefulSet using volumeClaimTemplates.

Storage backend: Amazon EBS via EBS CSI driver.

Important constraints:

  • EBS volumes are strictly AZ-scoped

  • PVs include topology.kubernetes.io/zone nodeAffinity

  • PVCs are tightly bound to PVs

  • EBS cannot attach across AZ

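If we restrict the pod's nodeAffinity to one AZ without moving storage, the replica whose volume lives in the other AZ cannot start:

  • The scheduler honors the PV's zone nodeAffinity, so no node satisfies both the pod and the volume constraints

  • The pod stays Pending with a "volume node affinity conflict" event

  • If the PV carried no zone affinity, the pod would schedule, but the volume attach would fail and the pod would hang in ContainerCreating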

This is fundamentally a storage locality constraint.

Storage must move before pods move.
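
To see this constraint on a live cluster, the following is a minimal sketch. The function names pv_zones and pod_events are hypothetical helpers, not part of kubectl, and the sketch assumes the PVs carry a zone nodeAffinity as described above.

```shell
# List each PV with its bound claim and the zone(s) pinned by its nodeAffinity.
# Assumption: the PVs were provisioned with a zone nodeAffinity.
pv_zones() {
  kubectl get pv -o custom-columns='PV:.metadata.name,CLAIM:.spec.claimRef.name,ZONES:.spec.nodeAffinity.required.nodeSelectorTerms[*].matchExpressions[*].values[*]'
}

# Show recent events for a pod that is not starting.
# $1 = pod name, $2 = namespace
pod_events() {
  kubectl get events -n "$2" \
    --field-selector "involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}

# Example usage (not executed here):
#   pv_zones
#   pod_events zk-staging-1 tools
```

Comparing the ZONES column with the zones allowed by the pod's nodeAffinity shows which replicas need their storage moved first.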


4. Architecture Before Migration

Multi-AZ StatefulSet with cross-AZ replication traffic between replicas.

Characteristics

  • Replica-0 in ap-south-1b

  • Replica-1 in ap-south-1c

  • Independent EBS volumes per replica

  • Continuous cross-AZ replication

  • Higher availability

  • Higher recurring cost


5. Migration Options Evaluated

Option 1 — Snapshot-Based Storage Migration (Chosen)

Flow:

  1. Validate Kafka stability

  2. Scale StatefulSet to 0

  3. Snapshot EBS volumes

  4. Restore volumes in the target AZ

  5. Rebind PVCs via static PVs

  6. Restrict scheduling

  7. Safe staged bring-up

Properties:

  • Logical RPO = 0, assuming:

    • No under-replicated partitions

    • Clean shutdown

    • Snapshot completion verified

  • Planned downtime required

  • No duplicate cluster

  • Lowest infrastructure cost

  • Requires operational precision


Option 2 — Dual Cluster + MirrorMaker2

Flow:

  1. Deploy new Kafka cluster in single AZ

  2. Configure MirrorMaker2

  3. Replicate topics

  4. Validate offsets

  5. Cut traffic

  6. Decommission old cluster

Properties:

  • Near-zero downtime

  • Higher infrastructure cost

  • More operational complexity

  • Easier rollback

Because downtime was acceptable and cost reduction was urgent, Option 1 was selected.


6. Multi-AZ Deployment YAML (Reproducible Lab Setup)

# multi-az-zk.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: tools
---
apiVersion: v1
kind: Service
metadata:
  name: zk-staging-headless
  namespace: tools
  labels:
    app: cp-zookeeper
    release: kafka-staging
spec:
  clusterIP: None
  selector:
    app: cp-zookeeper
    release: kafka-staging
  ports:
    - name: client
      port: 2181
      targetPort: 2181
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zk-staging
  namespace: tools
spec:
  serviceName: zk-staging-headless
  replicas: 2
  selector:
    matchLabels:
      app: cp-zookeeper
      release: kafka-staging
  template:
    metadata:
      labels:
        app: cp-zookeeper
        release: kafka-staging
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/zone
                    operator: In
                    values:
                      - ap-south-1b
                      - ap-south-1c
      containers:
        - name: cp-zookeeper
          image: docker.io/confluentinc/cp-zookeeper:5.5.6
          volumeMounts:
            - name: datadir
              mountPath: /var/lib/zookeeper/data
            - name: datalogdir
              mountPath: /var/lib/zookeeper/log
  volumeClaimTemplates:
    - metadata:
        name: datadir
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: gp3
        resources:
          requests:
            storage: 10Gi
    - metadata:
        name: datalogdir
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: gp3
        resources:
          requests:
            storage: 10Gi

Apply:

kubectl apply -f multi-az-zk.yaml

Validate zone distribution:

kubectl get nodes -L topology.kubernetes.io/zone
kubectl get pods -o wide -n tools
kubectl describe pv <pv-name>

7. Production Migration — Snapshot-Based Execution


Step 1 — Validate Kafka Stability

Before shutdown, ensure:

  • No under-replicated partitions

  • No leader elections are ongoing

  • ISR stable

Example:

kubectl logs <pod-name> -n tools
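
Logs give only an indirect signal. A more direct check for under-replicated partitions might look like the sketch below. It rests on several assumptions that are not in the original setup: the broker pod name passed in is hypothetical, the broker image ships the kafka-topics CLI (as Confluent images do), and the broker accepts unauthenticated connections on localhost:9092. The function name check_urp is also hypothetical.

```shell
# check_urp POD NAMESPACE
# Succeeds only if the broker reports no under-replicated partitions.
# Assumptions: POD is a Kafka broker pod, `kafka-topics` is on its PATH,
# and the broker listens on localhost:9092 without authentication.
check_urp() {
  local out
  out=$(kubectl exec "$1" -n "$2" -- kafka-topics \
    --bootstrap-server localhost:9092 \
    --describe --under-replicated-partitions) || return 2
  if [ -n "$out" ]; then
    echo "Under-replicated partitions found:" >&2
    echo "$out" >&2
    return 1
  fi
  echo "No under-replicated partitions"
}

# Example usage (not executed here):
#   check_urp <kafka-broker-pod> tools && echo "safe to proceed"
```

An empty result from --under-replicated-partitions means every partition currently has its full ISR, which is the precondition for the logical RPO = 0 claim.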

Step 2 — Protect Reclaim Policy

Before deleting PVCs:

kubectl get pv

Ensure persistentVolumeReclaimPolicy: Retain.

If not:

kubectl patch pv <pv-name> \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

This prevents accidental EBS deletion.
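
With several PVs per replica, patching them one by one is easy to get wrong. The loop below is a sketch that patches every PV bound to a PVC in a namespace and prints the resulting policy; retain_pvs is a hypothetical helper name.

```shell
# retain_pvs NAMESPACE
# Set reclaim policy Retain on every PV bound to a PVC in NAMESPACE,
# then print "<pv-name> <policy>" so the result can be checked.
retain_pvs() {
  local ns="$1" pv
  for pv in $(kubectl get pvc -n "$ns" -o jsonpath='{.items[*].spec.volumeName}'); do
    kubectl patch pv "$pv" \
      -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' || return 1
    kubectl get pv "$pv" \
      -o jsonpath='{.metadata.name}{"\t"}{.spec.persistentVolumeReclaimPolicy}{"\n"}'
  done
}

# Example usage (not executed here):
#   retain_pvs tools
```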


Step 3 — Scale Down

kubectl scale statefulset zk-staging -n tools --replicas=0
kubectl get pods -n tools

Pre-Snapshot Validation — StatefulSet Ordinal Mapping (Critical)

Before taking snapshots, validate which EBS volume belongs to which StatefulSet ordinal.

StatefulSet PVC naming convention:

<claim-name>-<statefulset-name>-<ordinal>

Example:

datadir-zk-staging-0
datadir-zk-staging-1

Validate volume mapping:

kubectl get pvc -n tools
kubectl describe pvc datadir-zk-staging-1 -n tools

Extract:

Volume: pvc-xxxx

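Then map to the AWS volume (the tag key depends on how the volumes were provisioned, so adjust the filter to match the tags on your volumes):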

aws ec2 describe-volumes \
  --filters Name=tag:KubernetesCluster,Values=<cluster-name>

Or inspect via:

kubectl describe pv <pv-name>

Confirm:

  • Correct ordinal

  • Correct AZ

  • Correct volume ID

If you restore the wrong ordinal volume to the wrong replica, data corruption or cluster quorum failure can occur.

Never assume volume ordering.

Validate explicitly.
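
One way to make this validation explicit is to print the full PVC → PV → EBS volume → zone chain for every ordinal. The sketch below assumes CSI-provisioned PVs (so the EBS volume ID is in spec.csi.volumeHandle) that carry a zone nodeAffinity; pvc_to_ebs and map_statefulset_volumes are hypothetical helper names.

```shell
# pvc_to_ebs NAMESPACE PVC
# Print "PVC  PV  EBS-volume-ID  zone" for one claim.
pvc_to_ebs() {
  local ns="$1" pvc="$2" pv vol zone
  pv=$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.volumeName}') || return 1
  vol=$(kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeHandle}') || return 1
  zone=$(kubectl get pv "$pv" -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[*].matchExpressions[*].values[*]}') || return 1
  printf '%s\t%s\t%s\t%s\n' "$pvc" "$pv" "$vol" "$zone"
}

# map_statefulset_volumes NAMESPACE STATEFULSET REPLICAS CLAIM...
# Walk every <claim>-<statefulset>-<ordinal> PVC and print its mapping.
map_statefulset_volumes() {
  local ns="$1" sts="$2" replicas="$3" claim i
  shift 3
  for claim in "$@"; do
    i=0
    while [ "$i" -lt "$replicas" ]; do
      pvc_to_ebs "$ns" "$claim-$sts-$i" || return 1
      i=$((i + 1))
    done
  done
}

# Example usage (not executed here):
#   map_statefulset_volumes tools zk-staging 2 datadir datalogdir
```

Saving this output before the snapshot step gives a written record of which volume ID belongs to which ordinal, which is useful both for the restore and for a rollback.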

Step 4 — Snapshot Volumes

aws ec2 create-snapshot \
  --volume-id vol-xxxx \
  --description "zk-migration"

Wait until the snapshot state = completed before proceeding.

Do not scale up or delete any additional resources until snapshot completion is verified via AWS CLI or console.
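
The wait can be scripted with the AWS CLI waiter rather than polled by hand. This is a sketch; snapshot_and_wait is a hypothetical helper name, and the waiter has a built-in polling limit, so for large volumes the wait may need to be re-run.

```shell
# snapshot_and_wait VOLUME_ID DESCRIPTION
# Create a snapshot, block until it is "completed", then print its ID.
snapshot_and_wait() {
  local snap
  snap=$(aws ec2 create-snapshot \
    --volume-id "$1" \
    --description "$2" \
    --query SnapshotId --output text) || return 1
  # The waiter polls the snapshot state and fails if its polling limit is
  # reached first; a non-zero exit here means "not verified", not "failed".
  aws ec2 wait snapshot-completed --snapshot-ids "$snap" || return 1
  echo "$snap"
}

# Example usage (not executed here):
#   snap_id=$(snapshot_and_wait vol-xxxx "zk-migration")
```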


Step 5 — Restore With Matching Performance

Check original performance:

aws ec2 describe-volumes --volume-ids <volume-id>

Restore:

aws ec2 create-volume \
  --snapshot-id snap-xxxx \
  --availability-zone ap-south-1b \
  --volume-type gp3 \
  --iops 3000 \
  --throughput 125

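Do not proceed until the restored volume state is "available". Note that "available" does not mean fully initialized: blocks of a volume restored from a snapshot are loaded lazily from Amazon S3 on first access, so expect higher first-read latency unless the volume is pre-read or Fast Snapshot Restore is enabled on the snapshot.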

The restored volume must match the original volume type, IOPS, and throughput.

If performance parameters are reduced during restore:

  • Kafka disk flush latency may increase

  • Log segment recovery may slow

  • Consumer lag may spike

  • ZooKeeper session instability may occur

Storage migration must preserve performance characteristics, not just data.
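
To avoid retyping performance values by hand, they can be read from the source volume and passed straight to the restore. The sketch below assumes the source volume is gp3, as in this article; other volume types do not accept --throughput, and gp2 does not accept --iops. restore_volume is a hypothetical helper name.

```shell
# restore_volume SOURCE_VOLUME_ID SNAPSHOT_ID TARGET_AZ
# Create a volume from SNAPSHOT_ID in TARGET_AZ, copying type, IOPS, and
# throughput from the source volume. Prints the new volume ID.
# Assumption: the source volume is gp3.
restore_volume() {
  local src="$1" snap="$2" az="$3" spec type iops tput new
  spec=$(aws ec2 describe-volumes --volume-ids "$src" \
    --query 'Volumes[0].[VolumeType,Iops,Throughput]' \
    --output text) || return 1
  # Text output is one tab-separated line: <type> <iops> <throughput>
  set -- $spec
  type="$1"; iops="$2"; tput="$3"
  new=$(aws ec2 create-volume \
    --snapshot-id "$snap" \
    --availability-zone "$az" \
    --volume-type "$type" \
    --iops "$iops" \
    --throughput "$tput" \
    --query VolumeId --output text) || return 1
  # Blocks until the volume state is "available".
  aws ec2 wait volume-available --volume-ids "$new" || return 1
  echo "$new"
}

# Example usage (not executed here):
#   new_vol=$(restore_volume vol-xxxx snap-xxxx ap-south-1b)
```

The printed volume ID is the value that goes into volumeHandle in the static PV.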


Step 6 — Delete Replica PVCs

kubectl delete pvc datadir-zk-staging-1 -n tools
kubectl delete pvc datalogdir-zk-staging-1 -n tools

Because the reclaim policy is set to Retain, the underlying EBS volumes will not be deleted.

Step 7 — Static PV Restore YAML

# restore-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-restore-datadir-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-RESTORED-DATADIR
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - ap-south-1b
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-restore-datalogdir-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-RESTORED-DATALOGDIR
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - ap-south-1b

Apply:

kubectl apply -f restore-pv.yaml
kubectl get pv

Step 8 — Restore PVC YAML

# restore-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datadir-zk-staging-1
  namespace: tools
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  resources:
    requests:
      storage: 10Gi
  volumeName: pv-restore-datadir-1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: datalogdir-zk-staging-1
  namespace: tools
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp3
  resources:
    requests:
      storage: 10Gi
  volumeName: pv-restore-datalogdir-1

Apply and verify:

kubectl apply -f restore-pvc.yaml
kubectl get pv,pvc -n tools
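
Before touching the StatefulSet, it helps to block until each recreated PVC is actually Bound to the intended PV. A sketch, assuming kubectl 1.23 or newer (which added --for=jsonpath to kubectl wait); wait_bound is a hypothetical helper name.

```shell
# wait_bound NAMESPACE PVC...
# Block until each PVC reports phase Bound, then print "<pvc> -> <pv>".
# Assumption: kubectl >= 1.23 for `kubectl wait --for=jsonpath`.
wait_bound() {
  local ns="$1" pvc
  shift
  for pvc in "$@"; do
    kubectl wait --for=jsonpath='{.status.phase}'=Bound \
      "pvc/$pvc" -n "$ns" --timeout=120s || return 1
    kubectl get pvc "$pvc" -n "$ns" \
      -o jsonpath='{.metadata.name}{" -> "}{.spec.volumeName}{"\n"}'
  done
}

# Example usage (not executed here):
#   wait_bound tools datadir-zk-staging-1 datalogdir-zk-staging-1
```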

Step 9 — Restrict Scheduling to Single AZ

Update the pod template's nodeAffinity in the StatefulSet so that topology.kubernetes.io/zone allows only:

ap-south-1b

Apply the updated StatefulSet.
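
As an alternative to editing and re-applying the manifest, the change can be made with a JSON patch. This sketch assumes the affinity block has exactly the shape shown in the Section 6 manifest (one nodeSelectorTerm with one matchExpression); a different structure would need a different path. pin_statefulset_zone is a hypothetical helper name.

```shell
# pin_statefulset_zone NAMESPACE STATEFULSET ZONE
# Replace the allowed zone list in the pod template's nodeAffinity.
# Assumption: the first matchExpression of the first nodeSelectorTerm is
# the topology.kubernetes.io/zone rule, as in the Section 6 manifest.
pin_statefulset_zone() {
  kubectl patch statefulset "$2" -n "$1" --type=json -p "[{
    \"op\": \"replace\",
    \"path\": \"/spec/template/spec/affinity/nodeAffinity/requiredDuringSchedulingIgnoredDuringExecution/nodeSelectorTerms/0/matchExpressions/0/values\",
    \"value\": [\"$3\"]
  }]"
}

# Example usage (not executed here):
#   pin_statefulset_zone tools zk-staging ap-south-1b
```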


Step 10 — Safe Staged Bring-Up

kubectl scale statefulset zk-staging -n tools --replicas=1

Validate stability.

Then:

kubectl scale statefulset zk-staging -n tools --replicas=2
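
The staged bring-up can be gated so that each replica must become Ready before the next one starts. This is a sketch with a hypothetical helper name, staged_scale_up. Note that Ready only reflects the pod's readiness probe, and the Section 6 manifest defines none, so application-level checks are still needed between stages.

```shell
# staged_scale_up NAMESPACE STATEFULSET TARGET_REPLICAS
# Scale up one replica at a time, waiting for readiness at each stage.
staged_scale_up() {
  local ns="$1" sts="$2" target="$3" n=1
  while [ "$n" -le "$target" ]; do
    kubectl scale statefulset "$sts" -n "$ns" --replicas="$n" || return 1
    # Blocks until all current replicas are Ready, or fails on timeout.
    kubectl rollout status "statefulset/$sts" -n "$ns" --timeout=300s || return 1
    n=$((n + 1))
  done
}

# Example usage (not executed here):
#   staged_scale_up tools zk-staging 2
```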

Migration complete.


8. Architecture After Migration

Inter-AZ replication traffic, and the data transfer cost it generated, is eliminated because both replicas now run in ap-south-1b. The trade-off is that AZ-level redundancy is removed.

Capacity Validation Before Consolidation

Before consolidating both replicas into a single Availability Zone, validate:

  • Worker node CPU headroom

  • Available memory capacity

  • EBS volume attachment limits per node

  • Network bandwidth availability

Note that a failure of ap-south-1b will now result in a full service outage until recovery. Single-AZ consolidation also increases resource contention risk and expands the blast radius.

Cost optimization must not introduce saturation instability.
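
A quick way to gather these numbers for the target zone is sketched below. It assumes metrics-server is installed (required by kubectl top) and that the EBS CSI driver is registered under the name ebs.csi.aws.com; zone_capacity is a hypothetical helper name.

```shell
# zone_capacity ZONE
# Print nodes, current CPU/memory usage, and EBS attach limits.
zone_capacity() {
  local zone="$1"
  # Nodes in the target zone.
  kubectl get nodes -l "topology.kubernetes.io/zone=$zone"
  # Current CPU and memory usage per node (requires metrics-server).
  kubectl top nodes -l "topology.kubernetes.io/zone=$zone"
  # Maximum EBS volumes the CSI driver reports as attachable per node
  # (listed for all nodes; match names against the first command).
  kubectl get csinode -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.drivers[?(@.name=="ebs.csi.aws.com")].allocatable.count}{"\n"}{end}'
}

# Example usage (not executed here):
#   zone_capacity ap-south-1b
```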

Trade-Off

Single AZ failure = Full outage.

Availability ↓

Cost ↓

Intentional architectural decision.


Rollback Strategy

If issues occur:

  1. Scale down

  2. Restore original snapshots in their original Availability Zones.

  3. Recreate original PV bindings

  4. Revert nodeAffinity

  5. Scale up gradually


Final Takeaway

This migration was not a scheduler tweak.

It was a storage topology redesign.

Stateful workloads are constrained by storage locality.

Storage must move before pods move.
