Version: next

At a glance

Starts with

  • a live, initialized cluster and an optional spec.backup schedule
  • object-storage configuration plus backup authentication
  • explicit OpenBaoRestore requests when destructive recovery is needed

Primary owners

  • internal/service/backup
  • internal/service/restore
  • internal/service/opslifecycle

Writes

  • backup executor Jobs and status.backup timing and failure state
  • OpenBaoRestore phase progression and cluster operation-lock ownership
  • retention cleanup after successful uploads, plus restore Job launch and cleanup state

Hands off to

  • normal steady-state operation when backups succeed
  • post-restore follow-up when a restore request completes
  • operator-facing backup, restore, and recovery procedures

Architectural Placement

Durability work is split across two explicit operation surfaces, backup and restore, which coordinate through one shared lifecycle package:

  1. The backup manager lives on the adminops path and handles scheduled, manual, and pre-upgrade snapshot jobs.
  2. The restore manager runs through the dedicated OpenBaoRestore controller path so destructive recovery stays explicit and auditable.
  3. internal/service/opslifecycle supplies shared lock and retry behavior so backups, restores, and upgrades coordinate instead of colliding.

That model keeps backup routine and restore exceptional, even though both exist in the same durability phase of the lifecycle.

Diagram: Day N durability loop

Backups produce durable recovery points during live operation; restore consumes one of those recovery points through a separate request path when the cluster needs destructive recovery.

Durability surfaces

| Surface | Primary owner | Purpose |
| --- | --- | --- |
| status.backup | Backup manager (writes it) | Records last attempt, next schedule, last success, and consecutive failures so durability is visible without inspecting Jobs. |
| OpenBaoRestore | Restore manager (consumes and updates it) | Keeps restore explicit, immutable, and auditable instead of hiding destructive recovery inside cluster status. |
| status.operationLock | Shared via opslifecycle | Blocks conflicting upgrade, backup, or restore work while one disruptive operation is in flight. |

Safety boundaries

| Concern | Durability behavior |
| --- | --- |
| Authentication surface | Backup and restore use dedicated auth wiring such as JWT roles or explicit token references; root tokens are not the durability mechanism. |
| Restore visibility | Restore is modeled as a separate CRD-backed request so destructive recovery has its own audit trail and phase status. |
| Retention timing | Retention cleanup runs only after a successful backup so older recovery points are not removed before a new one exists. |
