

Cluster operations

Use these guides for the day 2 lifecycle of an OpenBao cluster, including production readiness, backups, upgrades, maintenance, troubleshooting, and decommissioning.

Reliability and change

  1. Production checklist

    Use the checklist to confirm an environment is production-ready and supportable.

  2. Configure backups

    Set up snapshot streaming, backup identity, and restore readiness as part of the operating baseline.

  3. Plan upgrades

    Use RollingUpdate or BlueGreen with a clear understanding of prerequisites, cutover, and retry behavior.

Cluster controls

  1. Run planned maintenance

    Use maintenance workflows for controlled disruption, scaling, and planned cluster interventions.

  2. Pause reconciliation

    Temporarily stop operator-driven changes when you need manual intervention or a controlled investigation window.

  3. Decommission a cluster

    Remove a cluster with the deletion policy that matches data, PVCs, and external backups.

Troubleshooting and recovery

  1. Troubleshoot the cluster

    Use conditions, events, and common failure patterns to understand the problem before it becomes a wider incident.

  2. Recovery and restore

    Use safe mode, no-leader, sealed-cluster, and restore workflows when normal operations are no longer enough.

Supporting context

Next release documentation

You are reading the unreleased main docs. Use the version menu for the newest published release, or check the release notes for what is already out.
