Version: 0.1.0

Decision matrix

Production gates

| Gate | What must be true | Why it matters | Go deeper |
| --- | --- | --- | --- |
| Observability | Metrics, logs, and alerts reach the systems operators actually watch. | Incidents are slower and riskier when the first debugging session starts after go-live. | Observability and network egress configuration. |
| Cluster readiness | The status conditions show healthy convergence and no unresolved integration blockers. | A production launch should start from a stable status surface, not an optimistic assumption. | Use the final verification commands on this page. |

Lock down the security baseline

  • Set spec.profile: Hardened so the workload starts from the strict controller posture rather than the evaluation defaults.
  • Use a non-static external seal such as Transit, cloud KMS, ocikms, kmip, or pkcs11. Do not keep long-lived unseal keys in Kubernetes Secrets for the production path.
  • Confirm your Kubernetes cluster already encrypts Secrets at rest. The operator cannot compensate for an unencrypted control plane.
  • Use ACME or External TLS for public or shared edges. Avoid OperatorManaged certificates for public-facing production entry points.
  • Enable spec.selfInit and configure real user authentication in spec.selfInit.requests so the first operator-driven bootstrap does not end in a lockout.
  • If you rely on operator lifecycle auth for backups and upgrades, enable spec.selfInit.oidc.enabled: true or deliberately provision the equivalent JWT roles yourself.
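The baseline above can be sketched as an OpenBaoCluster fragment. Only `spec.profile`, `spec.selfInit`, `spec.selfInit.requests`, and `spec.selfInit.oidc.enabled` are taken from the list; the `apiVersion` and the seal field names are illustrative assumptions, so check them against your installed CRD:

```yaml
apiVersion: openbao.org/v1alpha1      # illustrative; verify against your installed CRD
kind: OpenBaoCluster
metadata:
  name: prod
spec:
  profile: Hardened                   # strict controller posture, not evaluation defaults
  seal:                               # illustrative field name; the point is a non-static
    type: transit                     # external seal (Transit, cloud KMS, ocikms, kmip, pkcs11)
  selfInit:
    enabled: true
    requests: []                      # configure real user authentication here to avoid lockout
    oidc:
      enabled: true                   # required if operator lifecycle auth drives backups/upgrades
```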
Do not stop at install success

A cluster that initializes successfully is not automatically ready for production. The production gate is the combination of security hardening, backup readiness, and clean status conditions, not the fact that pods started once.

Enforce the tenant guardrails

  • Verify the ValidatingAdmissionPolicies and related guardrails are installed and enforced, including:
    • openbao-validate-openbaocluster
    • openbao-validate-openbao-tenant
    • openbao-validate-openbaorestore
    • openbao-lock-controller-statefulset-mutations
    • openbao-lock-managed-resource-mutations
    • openbao-enforce-managed-image-digests
    • openbao-restrict-provisioner-rbac
    • openbao-restrict-provisioner-namespace-mutations
    • openbao-restrict-provisioner-tenant-governance
    • openbao-restrict-controller-rbac
    • openbao-restrict-controller-secret-writes
  • Confirm that the operator namespace, tenant onboarding flow, and shared-controller trust boundaries match the tenancy model you chose during Get Started.


Inspect the control-plane baseline

```bash
kubectl get validatingadmissionpolicy | grep openbao
kubectl get deploy -n <operator-namespace>
kubectl get openbaotenant -A
```

The exact number of policies and controller Deployments depends on the features you enabled, but the OpenBao guardrail set should be visible before you bring real tenants onto the platform.

Make the cluster durable

  • Set explicit CPU and memory requests and limits. A cluster that only works under zero pressure is not production-ready.
  • Choose a low-latency StorageClass and set spec.storage.storageClassName explicitly for new clusters. The effective storage class is not something you want to discover by accident after PVC creation.
  • Use at least three replicas for a highly available Raft cluster and verify the Kubernetes nodes span the intended zones or failure domains.
  • Configure scheduled backups and test a restore path before the first risky upgrade.
  • Confirm spec.network.egressRules allow the cluster to reach the services it really depends on: cloud KMS, OIDC discovery, backup storage, and any external gateway edges.
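The durability items above can be sketched as a spec fragment. `spec.storage.storageClassName` and `spec.network.egressRules` come from the list; `replicas`, `resources`, `backup`, and the egress rule shape are illustrative assumptions, and every host name is a placeholder:

```yaml
spec:
  replicas: 3                          # minimum for a highly available Raft cluster
  resources:                           # illustrative field name; set explicit requests and limits
    requests: { cpu: "1", memory: 2Gi }
    limits: { cpu: "2", memory: 4Gi }
  storage:
    storageClassName: fast-ssd         # set explicitly; do not discover the default by accident
  backup:                              # illustrative field name; schedule and test restores
    schedule: "0 */6 * * *"
  network:
    egressRules:                       # allow the real dependencies only (shape is assumed)
      - to: kms.example.cloud          # cloud KMS
      - to: issuer.example.com         # OIDC discovery
      - to: backups.example.net        # backup storage
```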
Backups are part of the production gate

Treat backup success and restore confidence as part of the launch checklist, not as follow-up work for a later sprint.

Prove observability and operational response

  • Configure metrics scraping through Prometheus Operator (ServiceMonitor) or VictoriaMetrics Operator (VMServiceScrape).
  • Grant the scraping identity permission to read /metrics and keep TLS verification strict in production.
  • Make sure structured logs including cluster_name and cluster_namespace reach the log system your operators actually use.
  • Alert on backup staleness, degradation, reconciliation failures, and other conditions that should wake a human before tenants feel the failure.
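For the Prometheus Operator path, the scrape configuration is a `ServiceMonitor`; a minimal sketch follows, where the selector label and the `metrics` port name are assumptions that must match the Service your cluster actually exposes:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: openbao
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: openbao   # assumed label; match your cluster's Service
  endpoints:
    - port: metrics                     # assumed port name on the Service
      path: /metrics
      scheme: https
      tlsConfig:
        insecureSkipVerify: false       # keep TLS verification strict in production
```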

Verify the cluster before routing traffic


Inspect the final readiness surface

```bash
kubectl describe openbaocluster <name> -n <namespace>
kubectl get openbaocluster <name> -n <namespace> -o jsonpath='{.status.phase}{"\n"}{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
```

Run both commands against the target namespace so you can see the reconciler phase, recent events, and the final condition set in one pass.

Reference table

Signals to see before go-live

| Signal | Healthy state | Why it is important |
| --- | --- | --- |
| Available | True | The workload is up and the operator believes the service is available to consumers. |
| ProductionReady | True | This is the clearest signal that the cluster passed the production-readiness gate. |
| Integration-specific conditions | Healthy for the features you enabled, such as CloudUnsealIdentityReady, GatewayIntegrationReady, APIServerNetworkReady, or BackupConfigurationReady. | These conditions expose dependency problems that may not show up as plain pod readiness failures. |
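The go-live check on these signals can be expressed as a small script; `production_ready` is a hypothetical helper, not part of the operator, and it reads the same `.status.conditions` array that the jsonpath query prints:

```python
def production_ready(status: dict, extra_conditions: tuple = ()) -> bool:
    """True when Available, ProductionReady, and any integration-specific
    conditions you enabled all report status "True"."""
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
    required = ("Available", "ProductionReady", *extra_conditions)
    return all(conditions.get(t) == "True" for t in required)


# Example shape, as returned by `kubectl get openbaocluster ... -o json`
status = {
    "phase": "Running",
    "conditions": [
        {"type": "Available", "status": "True"},
        {"type": "ProductionReady", "status": "True"},
        {"type": "BackupConfigurationReady", "status": "True"},
    ],
}
print(production_ready(status, ("BackupConfigurationReady",)))  # True
```

Pass the integration-specific condition types you enabled as `extra_conditions`; a missing condition counts as unhealthy, which is the conservative default for a launch gate.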

