Version: 0.1.0

At a glance

Used by

  • internal/service/backup
  • internal/service/restore
  • internal/service/upgrade

Owns

  • operation-lock identity helpers for disruptive work
  • retry intent classes and default requeue mapping
  • phase-transition audit field normalization

Writes through

  • internal/adapter/operationlock for status.operationLock updates
  • audit event fields for phase transitions
  • shared retry delays consumed by controller requeues

Depends on

  • OpenBaoCluster.status.operationLock as the persisted mutex surface
  • controller requeue behavior for long-running progress polling
  • manager-specific phase names and audit metadata

Architectural Placement

Operation lifecycle coordination sits below the concrete managers and above the lock adapter:

  1. A manager such as backup, restore, or upgrade decides it needs to start or resume work.
  2. It uses internal/service/opslifecycle to acquire or release the expected lock identity, classify retry intent, and log phase changes.
  3. opslifecycle delegates the actual status patching to internal/adapter/operationlock.

That keeps the shared safety model in one place instead of scattering lock and retry semantics across several managers.
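The layering above can be sketched in a few lines of Go. This is a minimal illustration, not the project's code: the `LockAdapter` interface, the `memoryAdapter` fake, the `Lifecycle` type, and both sentinel errors are hypothetical names, and the real adapter patches `status.operationLock` rather than holding state in memory.

```go
package main

import (
	"errors"
	"fmt"
)

// LockAdapter stands in for internal/adapter/operationlock. This method set is
// an assumption for the sketch, not the adapter's real API.
type LockAdapter interface {
	Acquire(holder, operation string) error
	Release(holder, operation string) error
}

// Hypothetical sentinel errors used only by this sketch.
var (
	errLockHeld = errors.New("operation lock is held by another manager")
	errNotOwner = errors.New("caller does not own the operation lock")
)

// memoryAdapter keeps the lock in memory; the real adapter would patch
// status.operationLock on the OpenBaoCluster instead.
type memoryAdapter struct{ holder, operation string }

func (a *memoryAdapter) Acquire(holder, operation string) error {
	if a.holder != "" && (a.holder != holder || a.operation != operation) {
		return errLockHeld
	}
	// Re-acquiring a lock you already own models a manager resuming work.
	a.holder, a.operation = holder, operation
	return nil
}

func (a *memoryAdapter) Release(holder, operation string) error {
	if a.holder != holder || a.operation != operation {
		return errNotOwner
	}
	a.holder, a.operation = "", ""
	return nil
}

// Lifecycle plays the role of opslifecycle: managers call it, and it
// delegates the actual lock mutation to the adapter.
type Lifecycle struct{ adapter LockAdapter }

func (l Lifecycle) Acquire(holder, operation string) error {
	return l.adapter.Acquire(holder, operation)
}

func (l Lifecycle) Release(holder, operation string) error {
	return l.adapter.Release(holder, operation)
}

func main() {
	lc := Lifecycle{adapter: &memoryAdapter{}}
	fmt.Println("backup starts:", lc.Acquire("backup-manager", "backup"))
	fmt.Println("upgrade blocked:", lc.Acquire("upgrade-manager", "upgrade"))
	fmt.Println("backup resumes:", lc.Acquire("backup-manager", "backup"))
	fmt.Println("backup done:", lc.Release("backup-manager", "backup"))
	fmt.Println("upgrade starts:", lc.Acquire("upgrade-manager", "upgrade"))
}
```

Because managers only see the `Lifecycle` surface, swapping the in-memory fake for a status-patching adapter would not change any manager code in this sketch.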

Coordination model

Backup, restore, and upgrade do not each implement their own lock and retry policy. They share one coordination service that wraps the operation-lock adapter and keeps audit fields consistent.

Shared primitives

| Primitive | What it standardizes | Why it exists |
| --- | --- | --- |
| Acquire / Release | Status-based lock ownership via the adapter. | Controllers should not each patch status.operationLock differently or invent different lock messages. |
| IsLockHeld / HeldError / AddHeldAuditFields | A shared way to classify contention and enrich audit events with who currently owns the lock. | Contention should produce consistent diagnostics instead of manager-specific strings. |
| LogPhaseTransition | Stable phase_from / phase_to audit fields for long-running operations. | Audit streams stay comparable across backup, restore, and upgrade. |
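To make the contention and audit primitives concrete, here is a hedged Go sketch of how they might fit together. Only the names `HeldError`, `IsLockHeld`, `AddHeldAuditFields`, `phase_from`, and `phase_to` come from this page; the signatures, the `lock_holder` / `lock_operation` keys, the `PhaseTransitionFields` helper, and the phase names are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// HeldError reports who currently owns the lock. Field names are assumptions.
type HeldError struct {
	Holder    string
	Operation string
}

func (e *HeldError) Error() string {
	return fmt.Sprintf("operation lock held by %s for %s", e.Holder, e.Operation)
}

// IsLockHeld reports whether err is, or wraps, a *HeldError.
func IsLockHeld(err error) bool {
	var held *HeldError
	return errors.As(err, &held)
}

// AddHeldAuditFields copies the current owner into an audit field map.
// The key names are hypothetical.
func AddHeldAuditFields(fields map[string]string, err error) {
	var held *HeldError
	if errors.As(err, &held) {
		fields["lock_holder"] = held.Holder
		fields["lock_operation"] = held.Operation
	}
}

// PhaseTransitionFields builds the phase_from / phase_to pair that a
// LogPhaseTransition-style helper would attach to an audit event.
func PhaseTransitionFields(from, to string) map[string]string {
	return map[string]string{"phase_from": from, "phase_to": to}
}

func main() {
	// A restore manager hits contention while an upgrade owns the lock.
	err := fmt.Errorf("start restore: %w",
		&HeldError{Holder: "upgrade-manager", Operation: "upgrade"})

	fields := PhaseTransitionFields("Pending", "Blocked")
	if IsLockHeld(err) {
		AddHeldAuditFields(fields, err)
	}
	fmt.Println(fields)
}
```

Using `errors.As` rather than string matching means the classification still works when a manager wraps the error with its own context.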

Retry And Lock Model

Retry classes

| Retry class | Default delay | Typical use |
| --- | --- | --- |
| progress-poll | 5s | A Job or long-running operation is still in progress and the manager is waiting for the next observable state change. |
| standard | 1m by default, overridable with OPENBAO_REQUEUE_STANDARD | Background retry work that does not need tight polling. |
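The class-to-delay mapping could look roughly like the Go sketch below. The type and function names are hypothetical, and two behaviors are assumptions rather than documented facts: that `OPENBAO_REQUEUE_STANDARD` is parsed as a Go duration string such as `90s`, and that an invalid or non-positive value falls back to the 1m default.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// RetryClass names a retry intent. The string values mirror the table above.
type RetryClass string

const (
	ProgressPoll RetryClass = "progress-poll"
	Standard     RetryClass = "standard"
)

// standardDelay resolves the standard delay from a raw override value.
// Assumption: the override uses Go duration syntax, and bad values are ignored.
func standardDelay(raw string) time.Duration {
	if d, err := time.ParseDuration(raw); err == nil && d > 0 {
		return d
	}
	return time.Minute
}

// DelayFor maps a retry class to its requeue delay. Treating unknown classes
// as standard is a choice made for this sketch.
func DelayFor(class RetryClass) time.Duration {
	if class == ProgressPoll {
		return 5 * time.Second
	}
	return standardDelay(os.Getenv("OPENBAO_REQUEUE_STANDARD"))
}

func main() {
	fmt.Println("progress-poll:", DelayFor(ProgressPoll))
	fmt.Println("standard:", DelayFor(Standard))
}
```

A controller would typically feed the returned duration into its requeue result, so every manager polls and backs off on the same schedule.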

Lock contract

| Concern | Shared behavior |
| --- | --- |
| Exact-match release | Release succeeds only when holder and operation match the active lock, so one manager cannot accidentally clear another manager’s ownership. |
| Force override | Force semantics exist for explicit override paths only; normal long-running operations should not silently steal the lock. |
| Contention diagnostics | HeldError exposes the current operation and holder so audit events and logs can explain why a manager requeued. |
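The three rows of the lock contract can be exercised with a small in-memory model. This Go sketch is illustrative only: `Lock`, `Store`, and the `force` parameter are hypothetical, the real state lives in `status.operationLock`, and treating a release with no active lock as a no-op is an assumption the page does not confirm.

```go
package main

import "fmt"

// Lock mirrors the holder and operation recorded in status.operationLock.
type Lock struct{ Holder, Operation string }

// HeldError carries the current owner for contention diagnostics.
type HeldError struct{ Current Lock }

func (e *HeldError) Error() string {
	return fmt.Sprintf("lock held by %s for %s", e.Current.Holder, e.Current.Operation)
}

// Store is an in-memory stand-in for the persisted lock.
type Store struct{ active *Lock }

// Acquire takes the lock when it is free or already owned by the same
// holder and operation. force replaces a foreign lock and is meant for
// explicit override paths only.
func (s *Store) Acquire(want Lock, force bool) error {
	if s.active != nil && *s.active != want && !force {
		return &HeldError{Current: *s.active}
	}
	s.active = &want
	return nil
}

// Release clears the lock only on an exact holder and operation match.
func (s *Store) Release(want Lock) error {
	if s.active == nil {
		return nil // assumption: nothing to release is not an error
	}
	if *s.active != want {
		return &HeldError{Current: *s.active}
	}
	s.active = nil
	return nil
}

func main() {
	s := &Store{}
	backup := Lock{Holder: "backup-manager", Operation: "backup"}
	upgrade := Lock{Holder: "upgrade-manager", Operation: "upgrade"}

	fmt.Println("backup acquire:", s.Acquire(backup, false))
	fmt.Println("upgrade acquire:", s.Acquire(upgrade, false))
	fmt.Println("upgrade release of backup's lock:", s.Release(upgrade))
	fmt.Println("upgrade forced acquire:", s.Acquire(upgrade, true))
	fmt.Println("upgrade release:", s.Release(upgrade))
}
```

Comparing the whole `Lock` value on release is what prevents a manager from clearing ownership it does not hold, even if it guesses the holder name but runs a different operation.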

This is coordination, not orchestration

opslifecycle does not decide whether an upgrade should use a rolling or blue-green strategy, whether a restore request is valid, or whether a backup target is reachable. It only standardizes the lock, retry, and audit mechanics around those domain decisions.
