Deploying Apache Superset on Kubernetes (Helm): From Chaos to Production
Real-world failure analysis, custom image build, and secure production deployment with flexible DB architecture.

Introduction
Deploying Apache Superset on Kubernetes using the official Helm chart appears straightforward when following the documentation. In real-world environments, however, production deployments often expose issues across multiple layers — Helm dependency resolution, container image integrity, Python runtime behavior, database connectivity, and secret management.
This article walks through a real-world failure analysis, explains the root causes, and documents the production-ready deployment that supports:
In-cluster PostgreSQL & Redis
External PostgreSQL (e.g., AWS RDS) & External Redis
Optional Kubernetes Secret–based credential injection
The final architecture is flexible, secure, and restart-safe.
1. Problem Statement
We attempted to deploy Apache Superset on Kubernetes using the official Helm chart.
Target Setup
Apache Superset (Web + Celery Worker)
PostgreSQL (metadata database)
Redis (Celery broker and caching)
Kubernetes
Helm-based deployment
Custom Superset image
Optional external PostgreSQL (AWS RDS)
Optional external Redis
Expected Outcome
Superset UI accessible
Database migrations completed successfully
Celery workers start without errors
Stable across restarts
Secure credential handling
What Actually Happened
The deployment failed at multiple stages:
Dependency image pull failures
Python module errors inside the container
Runtime package installation failures
SECRET_KEY validation error
Database connectivity issues
This was a multi-layer failure — not a single misconfiguration.
2. Issue #1: PostgreSQL and Redis Images Not Found
Observed Error
ImagePullBackOff
Failed to pull image
not found
Both PostgreSQL and Redis pods failed to start.
Root Cause
The Helm chart referenced specific image tags that were no longer available in the container registry.
Helm does not validate tag existence.
Kubernetes only detects the failure during image pull.
Until dependencies are healthy:
Superset init job cannot complete
Application errors remain hidden
Debugging becomes misleading
Infrastructure must be stable before diagnosing application issues.
3. Fix #1: Diagnostic Use of latest
To confirm whether the issue was caused by deprecated image tags or application logic, dependency images were temporarily switched to latest.
postgresql:
  image:
    tag: latest
redis:
  image:
    tag: latest
This confirmed:
The Helm chart’s default tags were deprecated.
The infrastructure was blocking deployment.
Superset itself was not the initial issue.
⚠ The latest tag was used only for diagnostics.
In production environments, pinned image versions are recommended for deterministic deployments.
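As a rough illustration of the pinning recommendation, the following Python sketch flags image references that are not pinned. The helper names `is_pinned` and `find_unpinned` are hypothetical, and the sketch assumes the input is rendered manifest text in which each `image:` key holds the full reference on one line.

```python
import re


def is_pinned(image_ref: str) -> bool:
    """Return True if the reference has a digest or an explicit non-latest tag."""
    # A digest pins the exact image content.
    if "@sha256:" in image_ref:
        return True
    # Inspect only the last path segment so that a registry port such as
    # "registry.local:5000/redis" is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag at all resolves to "latest"
    tag = last_segment.split(":", 1)[1]
    return tag not in ("", "latest")


def find_unpinned(rendered_manifest: str) -> list:
    """Return the image references in the manifest text that are not pinned."""
    pattern = r"^[ \t]*(?:-[ \t]*)?image:[ \t]*[\"']?([^\s\"']+)"
    images = re.findall(pattern, rendered_manifest, flags=re.MULTILINE)
    return [ref for ref in images if not is_pinned(ref)]
```

One way to use it is to feed it the output of `helm template` before deploying and fail the pipeline when the returned list is not empty. It only checks how a reference is written; it does not confirm that the tag still exists in the registry.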
Once dependencies were running, the real application error surfaced.
4. Issue #2: psycopg2 Module Missing
Superset failed with:
ModuleNotFoundError: No module named 'psycopg2'
This affected:
Superset Web pod
Superset Worker pod
Superset Init DB job
5. Why This Breaks Superset
Superset requires a metadata database.
Dependency chain:
Superset → SQLAlchemy → psycopg2 → PostgreSQL
If psycopg2 is missing:
Superset cannot start
Database migrations fail
Celery workers fail
No fallback mode exists
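A quick way to confirm this failure mode is to ask the interpreter whether it can locate the driver at all. The sketch below is a minimal check using only the standard library. The function names are hypothetical, and it is only meaningful when run with the same Python interpreter that Superset uses.

```python
import importlib.util


def driver_available(module_name: str) -> bool:
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


def missing_drivers(module_names) -> list:
    """Return the subset of module names that cannot be located."""
    return [name for name in module_names if not driver_available(name)]
```

Calling `missing_drivers(["psycopg2"])` inside the Superset container would return `["psycopg2"]` in the failing state described above and an empty list once the driver is installed in the right environment.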
6. Why Runtime Installation Failed
Attempts included:
extraPipPackages
bootstrapScript
Installing packages inside running pods
Init container installation
All failed.
Root Cause
The official Superset image runs inside a prebuilt Python virtual environment:
/app/.venv/
Key details:
Superset executes strictly inside this environment.
Runtime installations either failed outright or installed packages outside the active environment.
Installing packages into a running container also violates container immutability.
Even when psycopg2 appeared installed, it was outside Superset’s active virtual environment — making it effectively unusable.
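The distinction between "installed somewhere" and "installed in the active environment" can be checked directly. The following sketch is illustrative only: the helper names are hypothetical, and the `/usr/local` path in the usage note is an assumed example of a system-wide install location, not something taken from the original logs.

```python
import sys
from pathlib import PurePosixPath


def in_virtualenv() -> bool:
    """Return True if the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def is_under(path: str, root: str) -> bool:
    """Return True if path is located inside the root directory."""
    try:
        PurePosixPath(path).relative_to(PurePosixPath(root))
        return True
    except ValueError:
        return False


def module_in_active_env(module_file: str) -> bool:
    """Return True if a module file lives under the active interpreter's prefix."""
    return is_under(module_file, sys.prefix)
```

For example, `is_under("/usr/local/lib/python3.10/site-packages/psycopg2/__init__.py", "/app/.venv")` is false, which matches the situation where pip reports the package as installed but Superset still cannot import it.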
7. Correct Fix: Build a Custom Immutable Superset Image
Database drivers must be installed at image build time.
Dockerfile Used
FROM apachesuperset.docker.scarf.sh/apache/superset:3.0.0
USER root
RUN apt-get update && apt-get install -y libpq-dev gcc \
    && /app/.venv/bin/python -m ensurepip --upgrade \
    && /app/.venv/bin/python -m pip install --no-cache-dir psycopg2==2.9.9
USER superset
Why This Works
Installs psycopg2 inside Superset’s active virtual environment
Immutable and reproducible
Restart-safe
Production aligned
8. Flexible Credential Management
The Superset Helm chart supports multiple ways to provide database and Redis credentials.
Option A – Directly in Helm Values (Testing Only)
supersetNode:
  connections:
    db_type: postgresql
    db_host: my-db-endpoint
    db_port: "5432"
    db_user: superset
    db_pass: superset123
    db_name: superset
Suitable for:
Local testing
Temporary debugging
Learning environments
⚠ Credentials stored in plaintext.
Option B – Kubernetes Secret Injection (Recommended)
Instead of storing credentials in Helm values, they can be injected securely.
Create Secret
kubectl create secret generic superset-backend-secret \
  --from-literal=DB_HOST=<db-endpoint> \
  --from-literal=DB_PORT=5432 \
  --from-literal=DB_USER=<db-user> \
  --from-literal=DB_PASSWORD=<db-password> \
  --from-literal=DB_NAME=<db-name> \
  --from-literal=REDIS_HOST=<redis-endpoint> \
  --from-literal=REDIS_PORT=6379
Reference Secret in Helm Values
envFromSecrets:
- superset-backend-secret
Superset connections then use environment variables:
supersetNode:
  connections:
    db_type: postgresql
    db_host: "$(DB_HOST)"
    db_port: "$(DB_PORT)"
    db_user: "$(DB_USER)"
    db_pass: "$(DB_PASSWORD)"
    db_name: "$(DB_NAME)"
    redis_host: "$(REDIS_HOST)"
    redis_port: "$(REDIS_PORT)"
Benefits:
No plaintext credentials in Git
Secure runtime injection
Easier rotation
Environment portability
Using Kubernetes Secrets is optional but strongly recommended for production.
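When credentials arrive as environment variables, it helps to fail fast if any of them is missing rather than discover it at the first database call. The sketch below shows one possible check, assuming the key names from the Secret created above. The function name `load_backend_settings` is hypothetical and is not part of Superset or the Helm chart.

```python
import os
from typing import Mapping

# Keys expected from the injected Kubernetes Secret.
REQUIRED_KEYS = (
    "DB_HOST", "DB_PORT", "DB_USER", "DB_PASSWORD", "DB_NAME",
    "REDIS_HOST", "REDIS_PORT",
)


def load_backend_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect backend settings from the environment, failing on missing keys."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    settings = {key: env[key] for key in REQUIRED_KEYS}
    # Ports are injected as strings; convert them once here.
    settings["DB_PORT"] = int(settings["DB_PORT"])
    settings["REDIS_PORT"] = int(settings["REDIS_PORT"])
    return settings
```

A check like this could sit near the top of a custom configuration module so that a misnamed Secret key shows up as a clear startup error in the pod logs.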
9. Database & Redis Architecture Options
Superset supports two architectural modes.
Option 1 – In-Cluster PostgreSQL & Redis
Enable Helm-managed dependencies:
postgresql:
  enabled: true
redis:
  enabled: true
Best for:
Development
Testing
Small internal tools
Pros:
Simple
Self-contained
Cons:
You manage backups
You manage scaling
Higher operational overhead
Option 2 – External PostgreSQL & Redis (Optional)
Disable internal services:
postgresql:
  enabled: false
redis:
  enabled: false
Best for:
Production
High availability needs
Managed backups
Reduced operational risk
Pros:
Managed durability
Better reliability
Clear stateless/stateful separation
External services are optional — the deployment remains flexible.
The final production architecture is designed to support both Helm-managed in-cluster stateful services and externally managed database/cache services (such as AWS RDS and ElastiCache), ensuring operational flexibility and scalability across environments.
10. Enforcing SSL for Database Connections
import os

SQLALCHEMY_DATABASE_URI = (
    f"postgresql+psycopg2://{os.environ['DB_USER']}:{os.environ['DB_PASSWORD']}"
    f"@{os.environ['DB_HOST']}:{os.environ['DB_PORT']}/{os.environ['DB_NAME']}"
    "?sslmode=require"
)
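Setting sslmode=require ensures that the connection to PostgreSQL is encrypted. It does not verify the server certificate; sslmode=verify-full adds that check when a CA certificate is available.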
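One practical detail when assembling such a URI by hand is that a password containing characters like `@` or `/` breaks URL parsing unless it is percent-encoded. The following is a minimal sketch of a safer builder using only the standard library. The function name `build_pg_uri` and the host name in the usage note are hypothetical.

```python
from urllib.parse import quote


def build_pg_uri(user, password, host, port, name, sslmode="require") -> str:
    """Build a PostgreSQL SQLAlchemy URI with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including "/".
    encoded_user = quote(str(user), safe="")
    encoded_password = quote(str(password), safe="")
    return (
        f"postgresql+psycopg2://{encoded_user}:{encoded_password}"
        f"@{host}:{port}/{name}?sslmode={sslmode}"
    )
```

With this helper, `build_pg_uri("superset", "p@ss/word", "db.example.internal", 5432, "superset")` produces a URI whose password segment reads `p%40ss%2Fword`, so the `@` that separates credentials from the host stays unambiguous.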
11. Startup Readiness Handling
Init containers wait for DB and Redis:
command:
- dockerize
- -wait
- tcp://$(DB_HOST):$(DB_PORT)
- -wait
- tcp://$(REDIS_HOST):$(REDIS_PORT)
- -timeout
- 120s
Prevents:
CrashLoopBackOff
Early DB connection failures
Celery startup issues
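The wait logic itself is simple: retry a TCP connection until it succeeds or a deadline passes. The sketch below shows the same idea in plain Python for environments where dockerize is not available. It is an illustrative equivalent rather than what the chart runs, and the function name `wait_for_tcp` is hypothetical.

```python
import socket
import time


def wait_for_tcp(host: str, port: int, timeout: float = 120.0,
                 interval: float = 1.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the service is accepting connections.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # refused, unreachable, or timed out: retry until the deadline
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval, remaining))
```

Note that an open TCP port only shows that the service is listening. It does not prove that PostgreSQL has finished recovery or that the credentials are valid.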
12. Secure SUPERSET_SECRET_KEY
extraSecretEnv:
  SUPERSET_SECRET_KEY: <strong-random-secret>
Superset refuses to start without a secure secret key.
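A suitable value can be generated with any cryptographically secure random source. The sketch below uses Python's standard `secrets` module and base64-encodes 42 random bytes, similar in spirit to running `openssl rand -base64 42`. The function name `generate_secret_key` is hypothetical.

```python
import base64
import secrets


def generate_secret_key(num_bytes: int = 42) -> str:
    """Return a base64-encoded string built from cryptographically secure bytes."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

Generate the key once and store it in a Kubernetes Secret or secret manager. It should stay stable across restarts and upgrades rather than being regenerated on each deploy.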
13. Final Deployment
helm upgrade --install superset apache/superset \
  -f values.yaml \
  --namespace superset \
  --create-namespace
14. Final Root Cause Summary
Deployment failed due to:
Deprecated dependency image tags
Missing psycopg2 driver in container
Runtime package installation incompatible with Superset’s virtual environment
Missing secure SECRET_KEY
Resolution involved:
Diagnosing infrastructure image failures
Building a custom immutable Superset image
Securely injecting credentials
Supporting flexible DB/Redis architecture
Enforcing SSL
Implementing readiness checks
15. 30-Second Summary
Apache Superset initially failed due to deprecated dependency image tags and a missing PostgreSQL driver inside the container. Runtime installation failed because the official Superset image runs inside a prebuilt Python virtual environment, making post-start package installation ineffective. The issue was resolved by building a custom immutable image with psycopg2 installed at build time, securely managing credentials, and supporting both in-cluster and external database/Redis architectures. The final deployment is stable, secure, and production-ready.
Keywords:
Apache Superset Kubernetes,
Superset Helm Chart,
Superset Production Deployment,
psycopg2 error in Superset,
Kubernetes ImagePullBackOff,
Superset with AWS RDS,
Superset External PostgreSQL,
Superset Redis configuration






