Day 2: Operations & Upgrades¶
Day 2 operations cover the ongoing management of the cluster, including version upgrades and maintenance.
User Guide
See the Upgrade Guide for detailed upgrade strategies (Rolling vs Blue/Green).
Cluster Operations / Upgrades¶
- User configures upgrade executor:
  - Set `spec.upgrade.executorImage` (container image used by upgrade Jobs)
  - Set `spec.upgrade.jwtAuthRole` and configure the role in OpenBao (binds to `<cluster-name>-upgrade-serviceaccount`, automatically created by operator)
- User updates `OpenBaoCluster.Spec.Version` and/or `Spec.Image`.
- Upgrade Manager (adminops controller) detects version drift and performs pre-upgrade validation:
  - Validates semantic versioning (blocks downgrades by default).
  - Verifies all pods are Ready and quorum is healthy.
  - Optionally triggers a pre-upgrade backup if `spec.upgrade.preUpgradeSnapshot` is enabled.
- Upgrade Manager orchestrates Raft-aware rolling updates:
  - Locks StatefulSet updates using partitioning.
  - Iterates pods in reverse ordinal order.
  - Runs an upgrade Job to perform leader step-down before updating the leader pod.
  - Waits for pod Ready, OpenBao health, and Raft sync after each update.
- Upgrade progress is persisted in `Status.Upgrade`, allowing resumption after Operator restart.
- On completion, `Status.CurrentVersion` is updated and `Status.Upgrade` is cleared.
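The ordering of the rolling update steps above can be sketched as a small planning function. This is only an illustration of the partition-based approach, not the operator's code: `plan_rolling_upgrade` and the step names are hypothetical, and the leader is assumed to stay fixed for the whole plan, whereas a real controller would re-check leadership before each pod.

```python
from typing import List, Tuple

Step = Tuple[str, int]


def plan_rolling_upgrade(replicas: int, leader_ordinal: int) -> List[Step]:
    """Return the ordered steps for a partition-locked rolling update.

    Hypothetical helper: leadership is treated as fixed, which a real
    controller cannot assume after a step-down.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0 <= leader_ordinal < replicas:
        raise ValueError("leader_ordinal must be a valid pod ordinal")

    # With partition == replicas, no pod ordinal is >= partition, so the
    # StatefulSet controller updates nothing: the rollout is locked.
    steps: List[Step] = [("set_partition", replicas)]

    # Walk pods from the highest ordinal down to 0.
    for ordinal in range(replicas - 1, -1, -1):
        if ordinal == leader_ordinal:
            # Hand leadership off before the leader pod is replaced.
            steps.append(("step_down", ordinal))
        # Lowering the partition to this ordinal releases exactly one more pod.
        steps.append(("set_partition", ordinal))
        # Gate the next iteration on pod Ready, OpenBao health, and Raft sync.
        steps.append(("wait_ready_healthy_synced", ordinal))
    return steps


if __name__ == "__main__":
    for step in plan_rolling_upgrade(replicas=3, leader_ordinal=1):
        print(step)
```

Because each step only depends on the ordinal reached so far, persisting that ordinal is enough to resume the plan after a restart, which is the role the text assigns to `Status.Upgrade`.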
Upgrade Policy
Upgrades are designed to be safe and resumable. Downgrades are blocked by default. If an upgrade fails, it halts and sets Degraded=True; automated rollback is not supported. Root tokens are not used for upgrade operations.
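As a rough illustration of the "downgrades are blocked by default" rule, the version check might look like the sketch below. The function names and the `allow_downgrade` flag are hypothetical, and the parser assumes plain `MAJOR.MINOR.PATCH` versions (optionally prefixed with `v`); pre-release and build metadata are deliberately out of scope.

```python
import re
from typing import Tuple

# Assumption: only plain MAJOR.MINOR.PATCH versions, optionally prefixed with "v".
_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a version string into a tuple that compares numerically."""
    match = _VERSION.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def check_upgrade(current: str, target: str, allow_downgrade: bool = False) -> str:
    """Classify a version change, rejecting downgrades unless explicitly allowed."""
    cur, tgt = parse_version(current), parse_version(target)
    if tgt == cur:
        return "no-op"
    if tgt < cur and not allow_downgrade:
        raise ValueError(f"downgrade from {current} to {target} is blocked")
    return "upgrade" if tgt > cur else "downgrade"
```

Comparing integer tuples rather than raw strings matters here: as strings, `"2.10.0"` sorts before `"2.9.0"`, which would misclassify a legitimate upgrade as a downgrade.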
Sequence Diagram (Rolling Updates)¶
sequenceDiagram
autonumber
participant U as User
participant K as Kubernetes API
participant Op as OpenBao Operator
participant Bao as OpenBao Pods
U->>K: Patch OpenBaoCluster.spec.version
K-->>Op: Watch OpenBaoCluster (version drift)
Op->>Op: Validate versions, health, optional pre-upgrade backup
Op->>K: Patch StatefulSet updateStrategy (lock with partition)
loop per pod (highest ordinal -> 0)
Op->>Bao: /v1/sys/health on target pod
alt pod is leader
Op->>Bao: /v1/sys/step-down
end
Op->>K: Decrement StatefulSet.partition to update pod
K-->>Bao: Roll new pod
Bao-->>Op: Pod Ready + OpenBao health OK
end
Op->>K: Update OpenBaoCluster.status.currentVersion
Op->>K: Clear OpenBaoCluster.status.upgrade
Blue/Green upgrades provide zero-downtime updates by creating a parallel "Green" standby cluster.
- Drift Detection: User updates the `OpenBaoCluster` spec with a new version or image, using the Blue/Green strategy.
- Green Creation: The operator creates a new "Green" StatefulSet with the new version.
- Join & Standby: Green pods start and join the existing "Blue" Raft cluster as non-voters (or voters, depending on strategy). They replicate data but do not serve traffic.
- Health Check: Operator verifies the Green cluster is healthy and fully replicated.
- Cutover: Operator updates the Service selector to point to the Green pods. Traffic switches instantly.
- Cleanup: After a verification period or manual confirmation, the old "Blue" StatefulSet is scaled down and terminated.
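The Blue/Green steps above behave like a linear state machine in which some transitions wait on a condition. The sketch below models that idea only; the `Phase` names and the `next_phase` parameters are hypothetical and are not taken from the operator's status fields.

```python
from enum import Enum


class Phase(Enum):
    # Hypothetical phase names mirroring the steps listed above.
    GREEN_CREATION = "green-creation"
    JOIN_AND_STANDBY = "join-and-standby"
    HEALTH_CHECK = "health-check"
    CUTOVER = "cutover"
    CLEANUP = "cleanup"
    DONE = "done"


def next_phase(
    phase: Phase,
    *,
    green_joined: bool = False,
    green_healthy: bool = False,
    green_replicated: bool = False,
    cleanup_approved: bool = False,
) -> Phase:
    """Advance one phase, staying put while a gating condition is unmet."""
    if phase is Phase.GREEN_CREATION:
        return Phase.JOIN_AND_STANDBY
    if phase is Phase.JOIN_AND_STANDBY:
        # Green pods must have joined the Blue Raft cluster first.
        return Phase.HEALTH_CHECK if green_joined else phase
    if phase is Phase.HEALTH_CHECK:
        # Cutover requires Green to be both healthy and fully replicated.
        ready = green_healthy and green_replicated
        return Phase.CUTOVER if ready else phase
    if phase is Phase.CUTOVER:
        # The Service selector switch is modeled as a single step.
        return Phase.CLEANUP
    if phase is Phase.CLEANUP:
        # Blue is removed only after verification or manual confirmation.
        return Phase.DONE if cleanup_approved else phase
    return Phase.DONE
```

Returning the same phase when a gate is not met keeps the function safe to call repeatedly, which suits a reconcile loop that may run many times before Green is ready.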
Sequence Diagram (Blue/Green)¶
sequenceDiagram
autonumber
participant U as User
participant K as Kubernetes API
participant Op as OpenBao Operator
participant Blue as Blue Pods (v1)
participant Green as Green Pods (v2)
U->>K: Update Image to v2 (BlueGreen Strategy)
K-->>Op: Watch OpenBaoCluster
Op->>K: Create Green StatefulSet (v2)
K-->>Green: Start Green Pods
Green->>Blue: Join Raft Cluster (Standby)
Op->>Green: Wait for Healthy
Op->>K: Switch Service Selector to Green
Op->>Blue: Scale Down / Terminate
Maintenance / Manual Recovery¶
- User sets `OpenBaoCluster.Spec.Paused = true` to enter maintenance mode.
- All reconcilers for that cluster short-circuit and stop mutating resources, allowing manual actions (e.g., manual restore from snapshot).
- If an upgrade was in progress, it is paused but state is preserved in `Status.Upgrade`.
- After maintenance, user sets `Paused = false` to resume normal reconciliation (including any paused upgrade).
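The pause behavior described above amounts to an early return at the top of each reconcile pass. The following is a minimal model of that pattern, assuming a simplified in-memory `Cluster` object; the class, its fields, and the `next_ordinal` key are hypothetical stand-ins for `Spec.Paused` and `Status.Upgrade`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cluster:
    # Hypothetical in-memory stand-in for an OpenBaoCluster resource.
    paused: bool = False                    # models Spec.Paused
    upgrade_state: Optional[dict] = None    # models Status.Upgrade
    mutations: List[str] = field(default_factory=list)


def reconcile(cluster: Cluster) -> str:
    """Run one reconcile pass, doing nothing while the cluster is paused."""
    if cluster.paused:
        # Short-circuit: mutate nothing and leave the upgrade state untouched.
        return "skipped"
    if cluster.upgrade_state is not None:
        # Resume from the persisted position instead of starting over.
        ordinal = cluster.upgrade_state["next_ordinal"]
        cluster.mutations.append(f"resume upgrade at pod {ordinal}")
        return "upgrading"
    cluster.mutations.append("sync resources")
    return "reconciled"


if __name__ == "__main__":
    cluster = Cluster(paused=True, upgrade_state={"next_ordinal": 1})
    print(reconcile(cluster))   # paused: no mutation happens
    cluster.paused = False
    print(reconcile(cluster))   # unpaused: the upgrade picks up at pod 1
```

Placing the check before any other logic is what makes manual actions safe during maintenance: no code path that mutates resources can run while the flag is set.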