Understanding OpenBao metrics

Use this explainer to understand how OpenBao metrics become recording rules, dashboards, and alerts in this reference architecture. It is for operators who need to read metric names, prefixes, labels, and scrape-profile assumptions without reverse-engineering PromQL.

Why this matters

OpenBao metrics are the fastest way to see cluster health, request rate, latency, and saturation, and to detect runtime pressure, audit-device failures, token pressure, lease pressure, and HA/Raft state changes.

They are also easy to misread. Raw source metrics can use historical vault naming, labels vary by metric family, some gauges update slowly, and scrape profile determines which nodes Prometheus can see.

Mental model

Read the metrics pipeline in layers.

OpenBao telemetry
  -> /v1/sys/metrics?format=prometheus
  -> Prometheus source metrics
  -> normalized openbao: recording rules
  -> dashboards and alerts

Validate raw source metrics when you troubleshoot scraping or OpenBao telemetry. Use normalized recording rules for dashboards and alerts when a rule exists.

Metric types

OpenBao documents three telemetry metric types.

Type    | Meaning                                               | How to read it
--------|-------------------------------------------------------|----------------------------------------------------
Counter | An event count that increases when something happens. | Use rate() or increase() over a time window.
Gauge   | A current value.                                      | Read current value or a max/min over a time window.
Summary | Observations for discrete work, often duration.       | Use _sum and _count to calculate average latency.
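
The reading rules for counters and summaries can be illustrated with plain arithmetic. The following is a minimal sketch, not the project's implementation: the function names and sample values are hypothetical, and PromQL's rate() additionally extrapolates to the window edges and handles resets more carefully than this does.

```python
def counter_rate(first_value, last_value, window_seconds):
    """Approximate per-second rate of a counter between two samples."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    delta = last_value - first_value
    if delta < 0:
        # Simplified reset handling: the counter restarted from zero,
        # so count only what accumulated since the restart.
        delta = last_value
    return delta / window_seconds


def summary_average(sum_start, sum_end, count_start, count_end):
    """Average observation over a window, from _sum and _count deltas."""
    observations = count_end - count_start
    if observations <= 0:
        return None  # no observations in the window, so no average
    return (sum_end - sum_start) / observations


# Hypothetical samples taken 300 seconds apart.
print(counter_rate(1200, 1500, 300))              # 1.0 events per second
print(summary_average(40.0, 100.0, 1000, 1300))   # 0.2 per observation
```

The summary calculation mirrors the common PromQL idiom of dividing the rate of a _sum series by the rate of the matching _count series over the same window.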

High-cardinality usage gauges, such as token counts and secret counts, refresh only once per usage_gauge_period. The OpenBao default is 10 minutes, so the value can be up to one period old. Do not read those gauges as per-scrape real-time inventory.

Source metric names

OpenBao documentation uses dot-separated metric names such as vault.core.active. Prometheus exposes those names with underscores, such as vault_core_active.

OpenBao still documents vault as the default telemetry prefix. You can set metrics_prefix = "openbao" in the telemetry stanza when you want source metrics such as openbao_core_active.

This project supports both source prefixes:

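Source prefix | Example source metric | Use when
--------------|-----------------------|-----------------------------------------------------
vault         | vault_core_active     | You use the OpenBao default prefix.
openbao       | openbao_core_active   | You explicitly configure metrics_prefix = "openbao".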
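
The mapping from a documented name to a Prometheus series name can be sketched as a small helper. This is an illustration under assumptions: prometheus_name is a hypothetical function, and it handles only the dot-to-underscore and prefix substitution described above, not any other character rewriting the metrics sink may apply.

```python
def prometheus_name(documented_name, metrics_prefix="vault"):
    """Convert a documented dot-separated name to its Prometheus form.

    documented_name: name as written in the docs, e.g. "vault.core.active".
    metrics_prefix:  the configured source prefix ("vault" or "openbao").
    """
    parts = documented_name.split(".")
    if parts[0] != "vault":
        raise ValueError("expected a documented name starting with 'vault.'")
    parts[0] = metrics_prefix  # swap the documented prefix for the configured one
    return "_".join(parts)


print(prometheus_name("vault.core.active"))             # vault_core_active
print(prometheus_name("vault.core.active", "openbao"))  # openbao_core_active
```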

Normalized recording rules

Generated recording rules use the openbao: namespace, even when source metrics use the vault_* prefix. This keeps dashboards and alerts stable across both source-prefix profiles.

Examples:

Source signal          | Recording rule
-----------------------|---------------------------------------------
Active node count      | openbao:core_active:sum
Request rate           | openbao:core_handle_request:rate5m
Request latency        | openbao:core_handle_request:avg5m
Audit request failures | openbao:audit_log_request_failure:increase5m
Lease count            | openbao:expire_num_leases:max
Token count            | openbao:token_count:max30m
KV secret count        | openbao:secret_kv_count:max30m
Raft peer count        | openbao:raft_peers:max
Runtime heap objects   | openbao:runtime_heap_objects:max
Barrier read latency   | openbao:barrier_get:avg5m
Cache hit ratio        | openbao:cache_hit_ratio:ratio5m
Mount table entries    | openbao:core_mount_table_num_entries:max
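
Most of the examples above appear to follow one pattern: drop the source prefix, keep the rest of the metric name, and append an aggregation suffix. The sketch below encodes that inferred pattern only. normalized_rule_name is a hypothetical helper, the pattern is an assumption drawn from the examples, and derived rules such as the cache hit ratio do not map to a single source metric this way. The generated rule files remain the authority.

```python
SOURCE_PREFIXES = ("vault_", "openbao_")  # the two supported source prefixes


def normalized_rule_name(source_metric, suffix):
    """Build an openbao: rule name from a source metric and a suffix.

    Assumes the inferred pattern openbao:<name without prefix>:<suffix>.
    """
    for prefix in SOURCE_PREFIXES:
        if source_metric.startswith(prefix):
            return f"openbao:{source_metric[len(prefix):]}:{suffix}"
    raise ValueError(f"unrecognized source prefix in {source_metric!r}")


# Both source-prefix profiles resolve to the same normalized name.
print(normalized_rule_name("vault_core_active", "sum"))    # openbao:core_active:sum
print(normalized_rule_name("openbao_core_active", "sum"))  # openbao:core_active:sum
```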

See "Understand metric prefixes and recording rules" when you need the full prefix and artifact mapping.

Labels

OpenBao metric labels are not uniform. Some metrics include cluster, some runtime metrics rely mostly on scrape labels, and token metrics can include labels such as auth_method, creation_ttl, mount_point, namespace, and token_type.

Treat labels as part of the metric contract, not as free dimensions. Before you group by a label:

  • Check whether the label exists on the live series.
  • Check whether the label can expose sensitive metadata.
  • Check whether the label has bounded cardinality.
  • Check whether the dashboard or alert still works across both source prefixes.
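
The first and third checks can be partly automated against a set of live series. The following is a minimal sketch: review_label is a hypothetical helper, the series are modeled as plain label dictionaries, and the threshold of 20 distinct values is an arbitrary assumption. Whether a label exposes sensitive metadata still needs human review.

```python
def review_label(series_labels, label, max_values=20):
    """Summarize presence and cardinality of one label.

    series_labels: list of dicts, one dict of labels per live series.
    label:         the label you plan to group by.
    max_values:    assumed upper bound for acceptable cardinality.
    """
    present = [labels[label] for labels in series_labels if label in labels]
    distinct = set(present)
    return {
        # True only if every series carries the label.
        "on_every_series": bool(series_labels) and len(present) == len(series_labels),
        "distinct_values": len(distinct),
        "bounded": len(distinct) <= max_values,
    }


# Hypothetical label sets for two series of one metric family.
series = [
    {"cluster": "demo", "mount_point": "auth/token/"},
    {"cluster": "demo"},
]
print(review_label(series, "cluster"))
print(review_label(series, "mount_point"))
```

In this sample, cluster is safe to group by, while mount_point is missing from one series, so grouping by it would silently split or drop data.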

Scrape profiles

Metrics interpretation depends on the scrape profile.

Profile                          | Strength                                                                     | Limitation
---------------------------------|------------------------------------------------------------------------------|----------------------------------------------------
Authenticated active-node scrape | Strong secure baseline for cluster-level health and active request behavior. | Limited standby and follower visibility.
Private all-node scrape          | Better HA/Raft, standby, follower, and per-node runtime visibility.          | Requires isolated metrics access and label review.
Local Docker Compose scrape      | Useful for reference-stack validation.                                       | Not a production security model.

Use the active-node profile as the secure baseline. Add all-node scraping when you need HA/Raft diagnostics or per-node visibility.

Validation

This project validates metrics at three layers:

  • Captured OpenBao 2.5.4 fixtures under fixtures/captured/openbao-2.5.4/.
  • Metric contracts under contracts/metrics/.
  • Generated Prometheus rules under generated/prometheus/ and generated/prometheusrules/.

Run the full verification target after you change metrics, rules, dashboards, or alerts.

make verify

Common mistakes

  • Querying raw vault_* metrics from a deployment that emits openbao_*.
  • Treating high-cardinality usage gauges as real-time inventory.
  • Grouping by mount_point, policy, request path, or token metadata without a label review.
  • Expecting active-node scraping to show every standby or follower signal.
  • Reading barrier or cache metrics without request latency, token checks, and storage context.
  • Treating an empty panel as an incident before checking source prefix, scrape profile, and recording rule deployment.
  • Writing dashboards directly against source metrics when a normalized rule exists.
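
Several of these mistakes start with not knowing which source prefix a deployment emits. One way to check is to scan a saved copy of the Prometheus-format metrics output. The sketch below assumes you have already captured that text. detect_source_prefixes is a hypothetical helper and the sample lines are illustrative, not captured output.

```python
def detect_source_prefixes(exposition_text):
    """Return the set of source prefixes seen in Prometheus exposition text."""
    found = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        for prefix in ("vault_", "openbao_"):
            if line.startswith(prefix):
                found.add(prefix.rstrip("_"))
    return found


# Illustrative sample, not real captured output.
sample = """\
# TYPE vault_core_active gauge
vault_core_active{cluster="demo"} 1
vault_runtime_num_goroutines 42
"""
print(detect_source_prefixes(sample))  # {'vault'}
```

An empty result suggests a scrape or telemetry problem rather than a prefix mismatch, which is worth ruling out before treating an empty panel as an incident.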

What’s next

Source: OpenBao documents telemetry collection in the OpenBao telemetry documentation. OpenBao documents metric types, labels, and high-cardinality gauge behavior in the OpenBao telemetry metrics overview. OpenBao documents the Prometheus metrics endpoint in the OpenBao metrics API documentation.