Secret engine feature warnings
Use this runbook when a PKI, Transit, or database secrets engine warning fires for OpenBao. These alerts point to failed or unusually slow feature-specific operations and need correlation with audit logs, operational logs, and the backend or application context behind the secret engine.
Before you begin
- Get access to Prometheus or the metrics backend that evaluates the alert.
- Get access to OpenBao operational logs and audit logs.
- Get OpenBao CLI access with permission to inspect the affected secret engine.
- Get access to the external PKI, database, or application platform if the alert points to a backend dependency or client workload.
- Get approval from the affected secret engine owner before you change roles, issuers, root credentials, leases, or mount configuration.
[!WARNING] Do not rotate root credentials, revoke certificates, revoke leases, or change issuer configuration only to clear an alert. These actions can affect active workloads and must follow your local change or incident process.
Confirm the warning
Check which warning fired.
    ALERTS{alertstate="firing", alertname=~"OpenBao(PKI|Transit|DatabaseCredential).*"}
Open the OpenBao secret engines and mounts dashboard.
Open the OpenBao PKI, OpenBao Transit, or OpenBao database secrets dashboard when a feature-specific warning fires.
Check whether the warning correlates with request latency, storage latency, audit failures, or HA/Raft alerts.
    openbao:core_handle_request:avg5m
    openbao:barrier_get:avg5m
    openbao:audit_log_request_failure:increase5m
    openbao:autopilot_healthy:max
Check operational logs around the alert window.
    {log_stream="openbao.operational"} |~ "(?i)(pki|transit|database|plugin|lease|revoke|issuer|certificate|connection|crypto|key|timeout|error|failed)"
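If you script the first check against the Prometheus HTTP API, the firing alert names can be pulled out of the instant-query response. The following is a minimal sketch, assuming the standard `/api/v1/query` response shape; the function name `firing_alert_names` and the sample payload (including its alert names) are hypothetical and not part of any OpenBao tooling.

```python
import json


def firing_alert_names(response_body: str) -> list[str]:
    """Return the sorted, unique alertname labels of firing samples."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    names = set()
    for sample in payload["data"]["result"]:
        labels = sample.get("metric", {})
        if labels.get("alertstate") == "firing" and "alertname" in labels:
            names.add(labels["alertname"])
    return sorted(names)


# Hypothetical payload; the alert names are placeholders, not real rule names.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"alertname": "OpenBaoPKIExample", "alertstate": "firing"},
             "value": [1700000000, "1"]},
            {"metric": {"alertname": "OpenBaoTransitExample", "alertstate": "pending"},
             "value": [1700000000, "1"]},
        ],
    },
})

print(firing_alert_names(SAMPLE_RESPONSE))
```

Only the sample in the `firing` state is reported; the pending one is skipped.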
Investigate PKI warnings
Check PKI failure counters.
    openbao:pki_issue_failure:increase15m
    openbao:pki_revoke_failure:increase15m
Check PKI operation rate and latency.
    openbao:pki_issue:rate5m
    openbao:pki_revoke:rate5m
    openbao:pki_issue:avg5m
    openbao:pki_revoke:avg5m
Check audited PKI requests.
    {log_stream="openbao.audit"} | json request_path="request.path", audit_error="error" | request_path=~"pki/(roles|issue|issuer|root|cert|tidy|revoke).*"
Inspect PKI mount configuration and issuer state.
    bao secrets list -detailed -address=<openbao_address>
    bao read -address=<openbao_address> pki/cert/ca
    bao list -address=<openbao_address> pki/roles
<openbao_address>: OpenBao API address for a reachable active node.
If certificate issue failures affect a specific role, inspect that role before you change issuer or mount-level configuration.
    bao read -address=<openbao_address> pki/roles/<role_name>
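To see whether issue failures cluster on one role, you can count failed responses per role in exported audit entries. This is a rough sketch, assuming each entry is a parsed audit log object with top-level `type` and `error` fields and a nested `request.path`, and that the mount path is `pki`; `failed_issue_roles`, the role names, and the error text are hypothetical.

```python
from collections import Counter


def failed_issue_roles(audit_entries, mount="pki"):
    """Count errored issue responses per role, most affected role first."""
    prefix = f"{mount}/issue/"
    counts = Counter()
    for entry in audit_entries:
        path = entry.get("request", {}).get("path", "")
        is_failed_response = entry.get("type") == "response" and bool(entry.get("error"))
        if is_failed_response and path.startswith(prefix):
            # The role name is the path segment after "<mount>/issue/".
            counts[path[len(prefix):]] += 1
    return counts.most_common()


# Hypothetical audit entries for illustration only.
SAMPLE_ENTRIES = [
    {"type": "response", "request": {"path": "pki/issue/web"}, "error": "example error"},
    {"type": "response", "request": {"path": "pki/issue/web"}, "error": "example error"},
    {"type": "response", "request": {"path": "pki/issue/batch"}, "error": "example error"},
    {"type": "response", "request": {"path": "pki/issue/batch"}},
    {"type": "request", "request": {"path": "pki/issue/web"}},
]

print(failed_issue_roles(SAMPLE_ENTRIES))
```

A role at the top of the list is a reasonable first candidate for `bao read` on its role path.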
Investigate Transit warnings
Check audited Transit response errors.
    {log_stream="openbao.audit"} | json audit_type="type", request_path="request.path", audit_error="error" | audit_type="response" | audit_error!="" | request_path=~"transit/(keys|encrypt|decrypt|rewrap|sign|verify|hmac|random|hash|datakey).*"
Check denied Transit requests.
    {log_stream="openbao.audit"} | json request_path="request.path", audit_error="error" | audit_error=~"(?s).*permission denied.*" | request_path=~"transit/(keys|encrypt|decrypt|rewrap|sign|verify|hmac|random|hash|datakey).*"
Check whether errors affect key management or cryptographic operations.
    {log_stream="openbao.audit"} | json audit_type="type", request_path="request.path", request_id="request.id" | audit_type="request" | request_path=~"transit/(keys|encrypt|decrypt|rewrap|sign|verify|hmac|random|hash|datakey).*"
Inspect Transit mount configuration and key metadata before you change key policy, deletion settings, or rotation settings.
    bao secrets list -detailed -address=<openbao_address>
    bao list -address=<openbao_address> transit/keys
    bao read -address=<openbao_address> transit/keys/<key_name>
If errors affect decrypt, verify, or rewrap operations, check for recent key rotations, key version changes, policy changes, or application release changes before you rotate again.
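To separate key management errors from cryptographic operation errors, one option is to classify each audited request path by its first segment after the mount. The sketch below assumes the default `transit` mount and treats `keys` paths as key management and the other operations from the queries above as cryptographic; `classify_transit_path` and its category labels are hypothetical.

```python
# Operation names taken from the path regex used in the Transit audit queries.
CRYPTO_OPERATIONS = {
    "encrypt", "decrypt", "rewrap", "sign", "verify",
    "hmac", "random", "hash", "datakey",
}


def classify_transit_path(path: str, mount: str = "transit") -> str:
    """Return 'key_management', 'cryptographic', or 'other' for a request path."""
    prefix = mount.strip("/") + "/"
    if not path.startswith(prefix):
        return "other"
    operation = path[len(prefix):].split("/", 1)[0]
    if operation == "keys":
        return "key_management"
    if operation in CRYPTO_OPERATIONS:
        return "cryptographic"
    return "other"


print(classify_transit_path("transit/keys/app/rotate"))
print(classify_transit_path("transit/decrypt/app"))
```

Errors concentrated in `key_management` point toward recent key or policy changes, while `cryptographic` errors are more likely to surface in application traffic.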
Investigate database warnings
Check database operation failure counters.
    openbao:database_initialize_error:increase15m
    openbao:database_close_error:increase15m
    openbao:database_new_user_error:increase15m
    openbao:database_update_user_error:increase15m
    openbao:database_delete_user_error:increase15m
Check database credential operation rates and latency.
    openbao:database_new_user:rate5m
    openbao:database_update_user:rate5m
    openbao:database_delete_user:rate5m
    openbao:database_new_user:avg5m
    openbao:database_update_user:avg5m
    openbao:database_delete_user:avg5m
    openbao:database_close:avg5m
Check dynamic secret lease creation by engine.
    openbao:secret_lease_creation_by_engine:increase15m
If lease creation is concentrated in one tenant, use namespace drilldown without adding namespace to alert labels.
    topk(10, openbao:secret_lease_creation_by_engine_namespace:increase15m{secret_engine="database"})
Check audited database secrets engine requests.
    {log_stream="openbao.audit"} | json request_path="request.path", audit_error="error" | request_path=~"database/(config|roles|creds|static-roles|static-creds|rotate-root|rotate-role).*"
Inspect database secrets engine configuration and roles.
    bao secrets list -detailed -address=<openbao_address>
    bao read -address=<openbao_address> database/config/<connection_name>
    bao read -address=<openbao_address> database/roles/<role_name>
Check the external database directly for connection limits, authentication failures, lock waits, permission errors, or slow credential-management statements.
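To judge whether lease creation is concentrated in one tenant, you can compute the largest namespace's share of the drilldown result. This is a small sketch, assuming you have already reduced the `topk` output to a mapping of namespace to its 15-minute increase; `top_namespace_share` and the namespace names are hypothetical, and the share at which you call the load "concentrated" is a local decision.

```python
def top_namespace_share(increase_by_namespace: dict[str, float]) -> tuple[str, float]:
    """Return the namespace with the largest increase and its share of the total."""
    total = sum(increase_by_namespace.values())
    if total <= 0:
        # No lease creation in the window, so there is nothing to attribute.
        return ("", 0.0)
    namespace, value = max(increase_by_namespace.items(), key=lambda item: item[1])
    return (namespace, value / total)


# Hypothetical drilldown values for illustration only.
namespace, share = top_namespace_share({"team-a/": 90.0, "team-b/": 10.0})
print(namespace, round(share, 2))
```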
Restore the baseline
If failures correlate with external backend errors, restore the external backend before you change OpenBao configuration.
If failures started after a role, issuer, Transit key, policy, plugin, or mount change, roll back or repair that change with the owner.
If database revocation fails, identify affected leases before you revoke or tidy lease state.
    bao list -address=<openbao_address> sys/leases/lookup/database/creds/<role_name>/
If PKI issue latency rises during expected high certificate volume, record the new baseline and expected duration in the change record.
If PKI revoke latency rises with storage or Raft symptoms, use the HA/Raft runbook before you tune PKI settings.
If Transit errors affect application decrypt, verify, or rewrap traffic, coordinate recovery with the application owner before you delete keys, change key versions, disable deletion protection, or change derived key settings.
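For the database revocation step in this section, it can help to turn the lease listing into full lease IDs that you can review with the owner before any revoke. The sketch below assumes you ran the `bao list` command with `-format=json` and that the output is a flat JSON array of lease key suffixes under a `database` mount; `lease_ids_from_list_output`, the role name, and the sample keys are hypothetical.

```python
import json


def lease_ids_from_list_output(list_output: str, role_name: str, mount: str = "database") -> list[str]:
    """Build full lease IDs from the JSON array printed by a lease listing."""
    keys = json.loads(list_output)
    if not isinstance(keys, list):
        raise ValueError("expected a JSON array of lease keys")
    prefix = f"{mount}/creds/{role_name}/"
    return [prefix + str(key) for key in keys]


# Hypothetical listing output for a role named "readonly".
for lease_id in lease_ids_from_list_output('["abc123", "def456"]', "readonly"):
    print(lease_id)
```

The function only builds identifiers for review; it does not look up or revoke anything.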
Verify the result
Confirm that failure counters stop increasing.
    openbao:pki_issue_failure:increase15m
    openbao:pki_revoke_failure:increase15m
    openbao:database_new_user_error:increase15m
    openbao:database_update_user_error:increase15m
    openbao:database_delete_user_error:increase15m
    openbao:database_close_error:increase15m
Confirm that Transit audit response errors stop increasing.
    sum(count_over_time({log_stream="openbao.audit"} | json audit_type="type", request_path="request.path", audit_error="error" | audit_type="response" | audit_error!="" | request_path=~"transit/(keys|encrypt|decrypt|rewrap|sign|verify|hmac|random|hash|datakey).*" [5m]))
Confirm that operation latency returns toward baseline.
    openbao:pki_issue:avg5m
    openbao:pki_revoke:avg5m
    openbao:database_new_user:avg5m
    openbao:database_update_user:avg5m
    openbao:database_delete_user:avg5m
    openbao:database_close:avg5m
Confirm that operational logs no longer show correlated backend or plugin errors.
    {log_stream="openbao.operational"} |~ "(?i)(pki|transit|database|plugin|lease|revoke|crypto|key)" |~ "(?i)(error|failed|timeout|denied)"
Wait for the alert window to pass and confirm that the warning resolves.
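One way to make "stop increasing" concrete is to require that the most recent samples of an `increase15m` series are all zero. This is a minimal sketch under that assumption; `counter_has_settled` and the default of three quiet samples are hypothetical choices, not thresholds taken from the alert rules.

```python
def counter_has_settled(increase_samples: list[float], quiet_samples: int = 3) -> bool:
    """Return True when the last `quiet_samples` increase values are all zero.

    `increase_samples` is expected in chronological order, oldest first.
    """
    if quiet_samples <= 0:
        raise ValueError("quiet_samples must be positive")
    if len(increase_samples) < quiet_samples:
        # Not enough history to decide, so report "not settled".
        return False
    return all(value == 0 for value in increase_samples[-quiet_samples:])


print(counter_has_settled([4.0, 2.0, 0.0, 0.0, 0.0]))
print(counter_has_settled([4.0, 0.0, 1.0, 0.0, 0.0]))
```

The first series has three trailing zeros and counts as settled; the second still shows an increase inside the last three samples.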
Troubleshooting
The alert fires with no dashboard data
Confirm that generated recording rules are loaded and that Prometheus scrapes OpenBao source metrics with the expected vault_* or openbao_* prefix.
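To check which prefix a scrape target actually exposes, you can count sample lines by prefix in a saved copy of its metrics output. The sketch below assumes the Prometheus text exposition format, where comment lines start with `#`; `count_metric_prefixes` and the sample metric lines are illustrative only.

```python
def count_metric_prefixes(exposition_text: str, prefixes=("vault_", "openbao_")) -> dict[str, int]:
    """Count non-comment sample lines that start with each given prefix."""
    counts = {prefix: 0 for prefix in prefixes}
    for raw_line in exposition_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        for prefix in prefixes:
            if line.startswith(prefix):
                counts[prefix] += 1
                break
    return counts


# Illustrative scrape excerpt; the values are made up.
SAMPLE_SCRAPE = """\
# TYPE vault_core_handle_request summary
vault_core_handle_request_sum 3.5
vault_core_handle_request_count 12
go_goroutines 42
"""

print(count_metric_prefixes(SAMPLE_SCRAPE))
```

A prefix with a zero count suggests that recording rules written against that prefix will produce no data.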
Failure counters are empty
The failure metrics are optional and only appear after OpenBao emits the underlying source counter. Check audit and operational logs to confirm whether the alert came from a stale recording rule, a log-based Transit alert, or a now-resolved failure.
Transit alert does not match a custom mount
The generated Transit warning matches the default transit mount path. If your deployment mounts Transit elsewhere, copy the alert and replace the transit/ path prefix with the approved mount path.
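When you copy the alert for a custom mount, the path regex can be generated rather than edited by hand. This is a sketch, assuming the operation list from the Transit audit queries in this runbook and a mount path made only of letters, digits, `_`, `-`, and `/`; `transit_path_regex` and the example mount `secrets/transit` are hypothetical.

```python
import re

# Operation names taken from the path regex used in the Transit audit queries.
TRANSIT_OPERATIONS = (
    "keys", "encrypt", "decrypt", "rewrap", "sign",
    "verify", "hmac", "random", "hash", "datakey",
)


def transit_path_regex(mount_path: str = "transit") -> str:
    """Return the request_path regex for a Transit engine at `mount_path`."""
    mount = mount_path.strip("/")
    # Restrict the mount to characters that need no regex escaping.
    if not re.fullmatch(r"[A-Za-z0-9_/-]+", mount):
        raise ValueError(f"unsupported mount path: {mount_path!r}")
    return f"{mount}/({'|'.join(TRANSIT_OPERATIONS)}).*"


print(transit_path_regex("secrets/transit/"))
```

With the default argument the function reproduces the regex used in the queries above, which gives a quick way to confirm that only the prefix changed.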
Latency is high but operations still succeed
Treat the warning as early pressure. Check storage, Raft, external database, and client workload changes before you change secret engine configuration.
What’s next
- Use the OpenBao secret engines and mounts dashboard to inspect feature metrics and audit context together.
- Use the OpenBao PKI dashboard to inspect PKI operation metrics and certificate lifecycle audit streams together.
- Use the OpenBao Transit dashboard to inspect Transit key management, cryptographic operations, and response errors.
- Use the OpenBao database secrets dashboard to inspect database operation rates, latency, failures, leases, and audit streams together.
- Use Irrevocable leases present when database credential revocation leaves leases behind.
- Use OpenBao Raft and Autopilot health when feature latency correlates with storage or Raft symptoms.
Source: OpenBao documents telemetry metric behavior in the OpenBao telemetry metrics overview. OpenBao documents database secrets engine behavior in the OpenBao database secrets engine documentation. OpenBao documents PKI behavior in the OpenBao PKI documentation. OpenBao documents Transit behavior in the OpenBao Transit documentation.