Version: next

At a glance

Control path

  • workload reconciler
  • internal/controller/openbaocluster
  • internal/service/init

Owns

  • bootstrap detection and first init call behavior
  • root token handling when self-init is disabled, or self-init completion detection when it is enabled
  • initial autopilot configuration immediately after cluster initialization

Writes

  • status.initialized and status.selfInitialized
  • root token Secret when self-init is disabled
  • autopilot configuration through the OpenBao API once initialization completes

Depends on

  • single-replica bootstrap from the infrastructure path
  • pod readiness and TLS Secret availability before init proceeds
  • self-init requests and auth bootstrap configuration when self-init is enabled

Architectural Placement

Initialization stays on the workload-side controller path while the cluster is not yet ready for normal steady-state reconciliation:

  1. internal/controller/openbaocluster keeps the cluster on the uninitialized path.
  2. The controller calls internal/service/init once the first pod and TLS prerequisites are ready.
  3. The init manager marks initialization state, configures autopilot, and only then allows the infrastructure path to scale to the requested replica count.
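
The ordering above can be sketched as a small path-selection function. This is a minimal illustration, not the operator's actual code: the `ClusterState` type, the `Path` constants, and `choosePath` are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// ClusterState is a hypothetical, simplified view of what the controller
// inspects before deciding which path to take.
type ClusterState struct {
	Initialized   bool // mirrors status.initialized
	FirstPodReady bool // the single bootstrap pod is ready
	TLSReady      bool // the TLS Secret is available
}

// Path names the branch a reconcile would take (illustrative only).
type Path string

const (
	PathWaitForPrereqs Path = "wait-for-prerequisites"
	PathInitialize     Path = "initialize"
	PathSteadyState    Path = "steady-state"
)

// choosePath keeps an uninitialized cluster away from steady-state logic
// and only hands off to initialization once pod and TLS prerequisites hold.
func choosePath(s ClusterState) Path {
	switch {
	case s.Initialized:
		return PathSteadyState
	case s.FirstPodReady && s.TLSReady:
		return PathInitialize
	default:
		return PathWaitForPrereqs
	}
}

func main() {
	fmt.Println(choosePath(ClusterState{}))
	fmt.Println(choosePath(ClusterState{FirstPodReady: true, TLSReady: true}))
	fmt.Println(choosePath(ClusterState{Initialized: true}))
}
```

Keeping the decision in one place like this reflects the stated goal: first-boot checks live on a single branch instead of being repeated in every steady-state reconcile.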

That separation prevents first-boot logic from leaking into every steady-state reconcile.

Bootstrap Flow

Diagram: Initialize, then scale

The infrastructure path holds the workload at one replica until the init manager confirms the cluster is initialized. Only then does the cluster expand to the requested replica count.

Initialization Phases

Bootstrap contract

  • A new cluster starts at one replica even when spec.replicas is greater than one.
  • The infrastructure manager keeps the StatefulSet capped until status.initialized becomes true.
  • This avoids race conditions where multiple uninitialized pods could compete to become the first Raft leader.
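
The contract above amounts to a replica cap that depends on status.initialized. A minimal sketch, assuming a hypothetical helper named `effectiveReplicas` (the real infrastructure manager may structure this differently):

```go
package main

import "fmt"

// effectiveReplicas returns the replica count the StatefulSet should run.
// Until the cluster is initialized, the workload is capped at one replica
// regardless of what spec.replicas requests.
func effectiveReplicas(specReplicas int32, initialized bool) int32 {
	if !initialized {
		return 1
	}
	return specReplicas
}

func main() {
	fmt.Println(effectiveReplicas(5, false)) // capped during bootstrap
	fmt.Println(effectiveReplicas(5, true))  // requested count after init
}
```

With only one pod running before initialization, there is exactly one candidate to become the first Raft leader.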

Autopilot Defaults

Autopilot configuration defaults

Setting | Default behavior | Why the init manager sets it early
dead_server_last_contact_threshold | 5m | Dead-peer cleanup should wait long enough to avoid reacting to short network turbulence.
last_contact_threshold | 10s | Autopilot needs a consistent heartbeat tolerance before higher replica counts join.
server_stabilization_time | 10s | New members should remain stable briefly before being treated as healthy participants.
max_trailing_logs | 1000 | Replication lag needs a default budget before dead-server or readiness logic starts treating peers as unhealthy.
min_quorum | Hardened profile defaults to 3, or replicas when replicas > 3; other profiles use max(1, replicas). | The cleanup policy and quorum safety model must be aligned from the first initialized reconcile.

Already initialized is recovery, not import

If the manager detects that a cluster is already initialized, it takes the initialized-cluster path as recovery for an operator-managed cluster. It is not a generic import path for arbitrary unmanaged OpenBao clusters.

Safety boundaries

Concern | Manager behavior
Root material handling | The init response is used in-memory only for the current request and is not logged; self-init intentionally avoids creating a root-token Secret.
TLS readiness | Initialization waits for the TLS server Secret when TLS is managed by the operator so the API path is not used before the workload is ready.
Invalid autopilot cleanup | The manager forces cleanupDeadServers off for small-cluster configurations that OpenBao would reject.
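
The cleanup guard in the last row can be sketched as follows. The exact condition OpenBao rejects is not stated on this page; the threshold of 3 used here is an assumption for illustration, and `assumedCleanupMinQuorum` and `sanitizeCleanup` are hypothetical names.

```go
package main

import "fmt"

// assumedCleanupMinQuorum is an ASSUMPTION for this sketch: the smallest
// min_quorum at which dead-server cleanup is treated as acceptable.
// Check the OpenBao autopilot documentation for the actual rule.
const assumedCleanupMinQuorum = 3

// sanitizeCleanup returns the cleanupDeadServers value that would actually
// be sent. A requested cleanup is forced off when the quorum setting is
// below the assumed threshold, so the API call is not rejected.
func sanitizeCleanup(requested bool, minQuorum int) bool {
	return requested && minQuorum >= assumedCleanupMinQuorum
}

func main() {
	fmt.Println(sanitizeCleanup(true, 1))
	fmt.Println(sanitizeCleanup(true, 3))
	fmt.Println(sanitizeCleanup(false, 5))
}
```

Forcing the flag off, rather than failing the reconcile, lets a small cluster finish initialization with a configuration the server accepts.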

Next release documentation

You are reading the unreleased main docs. Use the version menu for the newest published release, or check the release notes for what is already out.
