Treat seal failures as trust and reachability problems first.
A sealed cluster usually means the Pods can start but cannot complete the trust or unseal path they need to serve traffic. Start with the operator-visible conditions, then narrow the problem by seal mode before you reach for emergency manual unseal.
Use this runbook when
- Pods are running but remain sealed and not ready
- the cluster reports `OpenBaoSealed=True`
- cloud KMS, transit, TLS, or static-key dependencies might be blocking startup
- you need to decide whether this is a seal problem or a broader quorum problem
Decision matrix
Read the reported conditions before you dig into logs.
| Condition or signal | What it usually means | Where to look next |
|---|---|---|
| `OpenBaoSealed=True` while Pods are running | The workload is up far enough to report status, but the unseal path is still blocked. | Check the configured seal mode and the corresponding credentials or trust material. |
| `CloudUnsealIdentityReady=False` | The workload identity or cloud credentials for a cloud KMS backend are not usable. | Inspect the identity binding, IAM policy, and KMS reachability. |
| `TLSReady=False` | The cluster may not trust the configured certificates or may be missing required TLS material. | Inspect the rendered TLS Secrets and pod logs for x509 errors. |
| The cluster unseals but still does not become active. | This may no longer be a seal problem. | Move to Recover from No Leader. |
*Diagram: sealed-cluster triage.*
Confirm the cluster is actually sealed, identify the configured seal mode, then narrow the failure to credentials, trust, or network before using any emergency manual path.
Inspect the operator-visible state first
Inspect
Read the current conditions and seal mode
```shell
kubectl get openbaocluster <name> -n <namespace> \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status} {.reason}{"\n"}{end}'
kubectl get openbaocluster <name> -n <namespace> -o yaml | yq '.spec.unseal'
kubectl logs -n <namespace> <pod-name> | grep -i unseal
```
Focus on `OpenBaoSealed`, `CloudUnsealIdentityReady`, and `TLSReady`. These conditions usually tell you whether the next step is credentials, trust, or network rather than generic application debugging.
Diagnose by seal mode
- Static
- Transit
- Cloud KMS
- KMIP / PKCS#11
- Manual emergency
Use this path when the cluster reads its unseal key from a Kubernetes Secret.
Inspect
Verify the static unseal Secret
```shell
kubectl get secret -n <namespace> <cluster-name>-unseal-key
kubectl get secret -n <namespace> <cluster-name>-unseal-key -o jsonpath='{.data}'
```
The Secret must exist and expose the expected key name, `key`.
Apply
Create or replace the static unseal Secret
```shell
kubectl create secret generic <cluster-name>-unseal-key -n <namespace> \
  --from-literal=key=<UNSEAL_KEY> \
  --dry-run=client -o yaml | kubectl apply -f -
```
Use this path when the cluster unseals through another OpenBao deployment.
Decision matrix
Transit-specific failure signals
| Signal | Likely cause | Fix first |
|---|---|---|
| `permission denied` or auth failures | The token, auth path, or transit policy is wrong. | Replace the credentials Secret and verify the transit-side role or policy. |
| x509 or trust-chain errors | The transit CA or client certificate material does not match the endpoint. | Reconcile the Secret contents and referenced TLS file paths. |
| `context deadline exceeded` | The transit endpoint is not reachable from the workload. | Check DNS, egress rules, and the remote endpoint health. |
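A quick way to separate trust failures from reachability failures is to probe the transit endpoint from inside a sealed Pod. A sketch, assuming the container image ships `wget` and that `https://transit.example.com:8200` stands in for your configured transit address:

```shell
# Probe the transit health API from the sealed Pod's network context.
# A DNS or TLS failure here points at reachability or trust, not at
# transit credentials.
kubectl exec -n <namespace> <pod-name> -- \
  wget -q -O- -T 5 https://transit.example.com:8200/v1/sys/health \
  || echo "transit endpoint unreachable or TLS handshake failed"
```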
Use this path for AWS KMS, GCP Cloud KMS, Azure Key Vault, or OCI KMS unseal backends.
Inspect
Inspect cloud-unseal failures in the logs
```shell
kubectl logs -n <namespace> <pod-name> | grep -Ei 'unseal|kms|decrypt|accessdenied|forbidden|timeout'
```
Decision matrix
Cloud KMS failure patterns
| Log or condition | Likely cause | Fix first |
|---|---|---|
| `CloudUnsealIdentityReady=False` or `AccessDenied` | The workload identity or IAM policy is not allowed to decrypt. | Fix the ServiceAccount binding and grant the decrypt permission on the configured key. |
| `context deadline exceeded` | The cluster cannot reach the KMS endpoint. | Check egress rules, proxy behavior, firewall policy, and DNS. |
| Provider-side 5xx errors | The KMS service itself may be degraded. | Confirm regional health and retry only when the upstream service is stable. |
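When `CloudUnsealIdentityReady=False`, start with the ServiceAccount binding. A sketch for the AWS case; the annotation key shown is the IRSA one, and GCP or Azure workload identity use their own annotation or label keys:

```shell
# Show the annotations on the ServiceAccount the cluster Pods run as.
# For AWS IRSA you expect an eks.amazonaws.com/role-arn annotation
# pointing at a role that holds kms:Decrypt on the configured key.
kubectl get serviceaccount -n <namespace> <serviceaccount-name> \
  -o jsonpath='{.metadata.annotations}{"\n"}'
```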
Use this path for external HSM or KMIP-backed unseal modes.
Check these first:
- referenced client certificate, key, and CA material
- library and device mount paths for `pkcs11`
- network reachability to the KMIP endpoint
- rendered seal configuration inside the Pod
These modes do not report `CloudUnsealIdentityReady`, so pod logs and rendered configuration are the primary signal surface.
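To see what the Pod is actually running with, read the rendered seal configuration and the mounted PKCS#11 material in place. Both paths below are assumptions; substitute the ones your deployment mounts:

```shell
# Print the rendered server configuration and check the seal stanza
# (example config path only).
kubectl exec -n <namespace> <pod-name> -- cat /openbao/config/config.hcl
# Confirm the PKCS#11 library is mounted where the seal stanza expects
# it (example library path only).
kubectl exec -n <namespace> <pod-name> -- ls -l /usr/lib/libpkcs11.so
```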
Manual unseal is the escape hatch when automation is broken and you need immediate access. It does not fix the underlying seal path. Use it to regain access, then repair the actual trust or credential dependency.
Apply
Manually unseal a Pod
```shell
kubectl exec -n <namespace> -it <pod-name> -- sh
bao operator unseal
```
Repeat this on every Pod that needs to join the active cluster. If the cluster seals again after a restart, return to the relevant automated seal mode and fix it there.
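The per-Pod step can be scripted. A sketch, assuming the Pods carry the `app.kubernetes.io/name=openbao` label; adjust the selector to match your deployment:

```shell
# Prompt for an unseal key share on each matching Pod in turn. Run the
# loop once per key share until each Pod reaches its unseal threshold.
for pod in $(kubectl get pods -n <namespace> \
    -l app.kubernetes.io/name=openbao -o name); do
  kubectl exec -n <namespace> -it "$pod" -- bao operator unseal
done
```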
Verify the cluster is actually serving again
Verify
Check seal status and cluster readiness
```shell
kubectl get openbaocluster <name> -n <namespace>
kubectl exec -n <namespace> -it <pod-name> -- bao status
```
If the cluster unseals but only reaches standby state or still cannot elect a leader, move to Recover from No Leader.
Continue with the right recovery path