Run the cluster as a service, not as a one-time install.

This section covers the full Day 2 lifecycle of an OpenBao cluster: making it reliable, protecting data before risky changes, using cluster controls deliberately, and moving from troubleshooting into recovery when normal operations are no longer enough.

Reliability & Change

01 Production checklist
Use the checklist before you call an environment production-ready or supportable.

02 Configure backups
Set up snapshot streaming, backup identity, and restore readiness before you need them.

03 Plan upgrades
Use RollingUpdate or BlueGreen with a clear understanding of prerequisites, cutover, and retry behavior.

Cluster Controls

01 Run planned maintenance
Use maintenance workflows for controlled disruption, scaling, and planned cluster interventions.

02 Pause reconciliation
Temporarily stop operator-driven changes when you need manual intervention or a controlled investigation window.

03 Decommission a cluster
Remove a cluster deliberately with the right deletion policy for data, PVCs, and external backups.

Troubleshooting & Recovery

01 Troubleshoot the cluster
Use conditions, events, and common failure patterns to understand the problem before it becomes a wider incident.

02 Recovery and restore
Use safe mode, no-leader, sealed-cluster, and restore workflows when normal operations are no longer enough.

Supporting context

Recovery & Restore
Use the incident runbooks when symptoms turn into an active recovery event.

Operation lifecycle architecture
Read the architecture deep dive when you need to understand locks, manager ownership, or serialized operations.

Prerelease documentation

This version tracks a prerelease build. Features and behavior may change before the next stable release.
