Production Checklist
Before deploying the OpenBao Operator in production, complete this checklist to ensure a secure, reliable, and compliant deployment.
Critical Security
Security Hardening (Required)
Failure to configure these settings puts your cluster at significant risk.
- Hardened Profile: Set `spec.profile: Hardened` to enforce secure defaults.
- External Unseal: Use Transit or Cloud KMS. Do NOT use auto-unseal with Kubernetes Secrets in production.
- Etcd Encryption: Ensure your Kubernetes cluster enables encryption at rest for Secrets (where unseal keys might be stored).
- TLS Mode: Use `ACME` (Let's Encrypt) or `External` (Custom CA). Avoid `OperatorManaged` for public-facing endpoints.
- Self-Initialization: Enable `spec.selfInit` to prevent the initial root token from ever being surfaced to the operator or logs.
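The items above that map to fields in the cluster spec can be checked mechanically before applying a manifest. The following is a minimal sketch, not part of the Operator. It assumes the spec has already been loaded into a Python dict, that `spec.selfInit` is a mapping with a boolean `enabled` key, and that the TLS mode lives at `spec.tls.mode`. Only `spec.profile`, `spec.selfInit`, and the mode names come from this checklist; the other paths and the function name `hardening_findings` are hypothetical.

```python
def hardening_findings(spec: dict) -> list[str]:
    """Return a list of findings; an empty list means no issue was detected."""
    findings = []

    # From the checklist: spec.profile should be Hardened.
    if spec.get("profile") != "Hardened":
        findings.append("spec.profile is not Hardened")

    # ASSUMPTION: selfInit is a mapping with a boolean `enabled` key.
    self_init = spec.get("selfInit")
    if not (isinstance(self_init, dict) and self_init.get("enabled") is True):
        findings.append("spec.selfInit is not enabled")

    # ASSUMPTION: the TLS mode is stored at spec.tls.mode.
    # This check is strict: it treats every endpoint as public-facing.
    tls = spec.get("tls")
    tls_mode = tls.get("mode") if isinstance(tls, dict) else None
    if tls_mode not in ("ACME", "External"):
        findings.append(f"TLS mode is {tls_mode!r}; expected ACME or External")

    return findings


# Illustrative input only, not taken from a real cluster.
example_spec = {"tls": {"mode": "OperatorManaged"}}
for finding in hardening_findings(example_spec):
    print(finding)
```

The unseal and etcd encryption items are deliberately left out, since they depend on cluster-level configuration that a spec-only check cannot see.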
Admission Control
Without these policies, tenant isolation cannot be guaranteed.
- ValidatingAdmissionPolicies: Verify that `validate-openbaocluster` and `openbao-restrict-provisioner-delegate` are installed and Enforced.
Reliability & Scale
Resource Planning
- Resources: Set explicit `requests` and `limits`. Minimum 256Mi memory for small clusters; scale CPU based on expected request rate.
- Storage Class: Use a high-performance (SSD), low-latency StorageClass. Raft requires low fsync latency.
- Volume Size: Plan for growth. Raft snapshots can consume significant space.
Availability
- Topology Spread: Ensure your Kubernetes cluster has nodes in multiple zones. The Operator automatically sets standard anti-affinity.
- Replica Count: Use at least 3 replicas for high availability.
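The numeric thresholds in this section (at least 3 replicas, at least 256Mi of memory, explicit `requests` and `limits`) lend themselves to a quick pre-flight check. The sketch below is illustrative only. It assumes the spec uses `spec.replicas` and a Kubernetes-style `spec.resources` block, neither of which is confirmed by this checklist, and its quantity parser handles only a common subset of memory suffixes.

```python
# Subset of Kubernetes quantity suffixes (binary and decimal).
_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}

MIN_MEMORY_BYTES = 256 * 2**20  # 256Mi, the minimum named in the checklist
MIN_REPLICAS = 3


def memory_bytes(quantity: str) -> int:
    """Convert a quantity such as '256Mi' or '1G' to bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(float(quantity))  # plain number of bytes


def reliability_findings(spec: dict) -> list[str]:
    """Return a list of findings; an empty list means no issue was detected."""
    findings = []

    # ASSUMPTION: the replica count is stored at spec.replicas.
    if spec.get("replicas", 0) < MIN_REPLICAS:
        findings.append(f"fewer than {MIN_REPLICAS} replicas")

    # ASSUMPTION: resources follow the usual requests/limits layout.
    resources = spec.get("resources") or {}
    for key in ("requests", "limits"):
        if not resources.get(key):
            findings.append(f"resources.{key} is not set")

    memory = (resources.get("requests") or {}).get("memory")
    if memory is not None and memory_bytes(memory) < MIN_MEMORY_BYTES:
        findings.append(f"memory request {memory} is below 256Mi")

    return findings


# Illustrative input only.
print(reliability_findings({"replicas": 1, "resources": {"requests": {"memory": "128Mi"}}}))
```

Storage latency and zone distribution are not covered here, because they are properties of the cluster rather than of the spec.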
Day 2 Operations
Operational Readiness
- Backups: Configure scheduled backups to S3/GCS. Test a restore before going live.
- Network Policy: Verify `egressRules` allow access to necessary external services (Cloud KMS, S3, OIDC providers).
- Monitoring: Ensure Prometheus is scraping `openbao_*` metrics and alerts are configured for high error rates or leader loss.
- Logs: Verify structured logs (`cluster_name`, `cluster_namespace`) are reaching your log aggregator.
- Alerts: Configure alerts for backup staleness, cluster degradation, and reconciliation errors.
Final Verification
Check the cluster status one last time before routing traffic.
Success Criteria:
- Condition `ProductionReady` is True.
- Condition `Available` is True.
- `Status.Phase` is Running.
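The success criteria above can also be evaluated by a script, for example as a gate in a deployment pipeline. The sketch below assumes the resource has been exported as JSON and follows the usual Kubernetes conventions: a `status.conditions` list of `type`/`status` pairs with the string `"True"`, and a lowercase `status.phase` field. The function name `unmet_criteria` and the sample document are hypothetical and do not represent real Operator output.

```python
import json

REQUIRED_CONDITIONS = ("ProductionReady", "Available")


def unmet_criteria(resource: dict) -> list[str]:
    """Return the success criteria the resource does not yet satisfy."""
    status = resource.get("status") or {}

    # ASSUMPTION: conditions follow the Kubernetes convention of
    # {"type": ..., "status": "True" | "False" | "Unknown"} entries.
    conditions = {
        c.get("type"): c.get("status") for c in status.get("conditions") or []
    }

    unmet = []
    for name in REQUIRED_CONDITIONS:
        if conditions.get(name) != "True":
            unmet.append(f"condition {name} is {conditions.get(name, 'missing')}")

    # ASSUMPTION: Status.Phase is serialized as status.phase.
    if status.get("phase") != "Running":
        unmet.append(f"phase is {status.get('phase', 'missing')}")

    return unmet


# Made-up sample document for illustration.
sample = json.loads("""
{
  "status": {
    "phase": "Running",
    "conditions": [
      {"type": "Available", "status": "True"},
      {"type": "ProductionReady", "status": "False"}
    ]
  }
}
""")

problems = unmet_criteria(sample)
print("ready" if not problems else "not ready: " + "; ".join(problems))
```

Returning the list of unmet criteria, rather than a single boolean, makes it easier to see which condition is blocking the rollout.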