No active OpenBao leader
Use this runbook when the OpenBaoNoActiveNode alert fires because Prometheus
does not see an active OpenBao node. The steps help you determine whether the
cluster has no leader, Prometheus is missing the active node, or the cluster is
sealed or partitioned.
Before you begin
- Get access to Prometheus or the metrics backend that evaluates the alert.
- Get OpenBao CLI access to at least one reachable node.
- Get access to platform networking, storage, and pod or process health data.
- Confirm whether the deployment uses integrated storage with Raft.
Confirm the alert signal
Query the active-node metric.
    sum(${p}_core_active)
${p}: Metric prefix for your deployment. Use vault for the OpenBao default prefix or openbao when you configured metrics_prefix = "openbao".
Check the raw series to identify which nodes Prometheus still scrapes.
    ${p}_core_active
If Prometheus is missing one or more OpenBao targets, switch to OpenBao metrics scrape failing.
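The outcome of the sum query maps onto the three cases this runbook distinguishes. The following is a minimal sketch, assuming the summed value has already been extracted as a plain integer string; `classify_active_sum` is a hypothetical helper name and is not part of OpenBao or Prometheus.

```shell
# Hypothetical helper: map the value of sum(<prefix>_core_active) to a runbook case.
classify_active_sum() {
  value="${1:-}"
  case "$value" in
    # Empty input means Prometheus returned no series for the query.
    '') echo "no-data" ;;
    # Anything that is not a plain non-negative integer is treated as unparsable.
    *[!0-9]*) echo "unparsable" ;;
    0) echo "no-active-node" ;;
    1) echo "one-active-node" ;;
    *) echo "multiple-active-nodes" ;;
  esac
}

# Example usage (assumptions: PROM_URL points at your Prometheus server,
# jq is installed, and the metric prefix is vault):
#   value=$(curl -fsS --get "$PROM_URL/api/v1/query" \
#     --data-urlencode 'query=sum(vault_core_active)' \
#     | jq -r '.data.result[0].value[1] // empty')
#   classify_active_sum "$value"
```

A result of no-data usually points at a scrape problem rather than a leadership problem, because a scraped standby node still reports the metric with a value of 0.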
Check node state
Query leader status on a reachable node.
    curl -fsS <openbao_address>/v1/sys/leader
<openbao_address>: OpenBao API address for a reachable node, including scheme and port.
Check seal status on each node.
    bao status -address=<openbao_address>
<openbao_address>: OpenBao API address for the node being checked.
If every reachable node is sealed, switch to OpenBao sealed unexpectedly.
Check operational logs for leader election, storage, network, and Raft messages.
    journalctl -u openbao --since "<incident_start>"
<incident_start>: Time shortly before the alert first fired, in a format that journalctl accepts, such as "YYYY-MM-DD HH:MM:SS".
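The per-node seal check in this section can be scripted across several nodes. The sketch below assumes that `bao status` follows the exit-code convention inherited from the upstream CLI (0 for unsealed, 2 for sealed, 1 for an error); verify this against your OpenBao version before relying on it. The function names `seal_state_from_exit` and `check_nodes` are hypothetical.

```shell
# Hypothetical helper: translate a `bao status` exit code into a state name.
# Assumes 0 = unsealed, 2 = sealed, anything else = error or unreachable.
seal_state_from_exit() {
  case "$1" in
    0) echo "unsealed" ;;
    2) echo "sealed" ;;
    *) echo "error" ;;
  esac
}

# Hypothetical helper: print "<address> <state>" for each address given.
check_nodes() {
  for addr in "$@"; do
    rc=0
    # Discard the status table; only the exit code is used here.
    bao status -address="$addr" >/dev/null 2>&1 || rc=$?
    printf '%s %s\n' "$addr" "$(seal_state_from_exit "$rc")"
  done
}

# Example usage (addresses are placeholders):
#   check_nodes "<openbao_address_1>" "<openbao_address_2>" "<openbao_address_3>"
```

A node reported as error may be unreachable rather than sealed, so treat it as a network or process problem until the logs say otherwise.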
Check Raft health
Use these steps when the deployment uses integrated storage.
List Raft peers from a reachable node.
    bao operator raft list-peers -address=<openbao_address>
<openbao_address>: OpenBao API address for a reachable node.
Check Autopilot state when Autopilot is enabled.
    bao operator raft autopilot state -address=<openbao_address>
<openbao_address>: OpenBao API address for a reachable node.
Check whether the cluster still has quorum. A Raft cluster cannot elect a leader without quorum, which is a majority of its voting peers.
Check pod, VM, or host reachability between peers on the OpenBao cluster address and storage paths.
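The quorum check comes down to simple arithmetic: a Raft cluster with N voters needs floor(N/2) + 1 of them healthy and mutually reachable. The sketch below illustrates that calculation; `raft_quorum` and `has_quorum` are hypothetical helper names, and the voter and healthy counts are assumed to come from your own reading of the peer list.

```shell
# Hypothetical helper: print the minimum number of voters needed to elect a leader.
raft_quorum() {
  voters="$1"
  # Integer division gives floor(voters / 2); a majority is one more than that.
  echo $(( voters / 2 + 1 ))
}

# Hypothetical helper: succeed when the healthy voters reach quorum.
# Usage: has_quorum <total_voters> <healthy_voters>
has_quorum() {
  [ "$2" -ge "$(raft_quorum "$1")" ]
}

# Example usage:
#   raft_quorum 5          # prints 3
#   has_quorum 5 2 || echo "quorum lost: leader election is not possible"
```

A three-node cluster tolerates one failed voter and a five-node cluster tolerates two. Non-voting peers do not count toward either number.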
Restore leadership
Restore sealed nodes, failed pods, failed VMs, or broken network paths.
Restore storage backend availability before forcing process restarts.
Restart failed OpenBao processes one at a time. Confirm each node rejoins before moving to the next node.
Do not remove Raft peers, restore snapshots, or rebootstrap a cluster unless your incident commander approves the action and you have a current backup.
If an active node exists but clients still fail, check load balancer health checks and service selectors.
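Restarting one node at a time is easier to do safely with a small wait loop between restarts. The following is a sketch of such a loop, not a complete restart procedure: `wait_until` is a hypothetical helper, the systemd unit name openbao is taken from the journalctl step in this runbook, and a successful `bao status` only shows that the node is reachable and unsealed, so the peer list still needs a manual check.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$(( i + 1 ))
  done
  return 1
}

# Example usage on one node (run for each node in turn, never in parallel):
#   systemctl restart openbao
#   wait_until 30 5 bao status -address="<openbao_address>" \
#     || echo "node did not come back unsealed; stop and investigate"
#   bao operator raft list-peers -address="<openbao_address>"
```

If the deployment does not use auto-unseal, the restarted node stays sealed until an operator unseals it, and the wait loop times out by design.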
Verify the result
Confirm exactly one active node.
    sum(${p}_core_active)
Confirm that the leader endpoint identifies a leader.
    curl -fsS <openbao_address>/v1/sys/leader
<openbao_address>: OpenBao API address for a reachable node, including scheme and port.
Confirm that clients can complete a permitted request through the normal service endpoint.
Wait for the alert window to pass and confirm that OpenBaoNoActiveNode resolves.
Troubleshooting
Metrics show no active node but the API has a leader
Prometheus probably does not scrape the active node. Fix service discovery or the active node scrape target before changing OpenBao.
Nodes are unsealed but no leader is elected
Check quorum, storage, and peer connectivity. For Raft, inspect peer state before making membership changes.
More than one node appears active
Switch to Multiple active OpenBao nodes. Treat the incident as possible split brain until you prove the signal is a scrape artifact.
What’s next
- Use OpenBao metrics scrape failing if Prometheus is missing the active node.
- Use OpenBao sealed unexpectedly if leader loss follows a seal event.
Source: OpenBao documents leader status in the OpenBao leader API documentation. OpenBao documents Raft peer inspection and Autopilot state in the OpenBao raft command documentation.