Day N: Backups & Disaster Recovery

Day N operations ensure data durability through regular backups and disaster recovery procedures.

User Guide

See Backups and Restore for configuration examples.

Backups

  1. User configures backup schedule (spec.backup.schedule) and target object storage in the OpenBaoCluster spec. Supported providers:
    • S3: AWS S3 or S3-compatible storage (MinIO, Ceph, etc.)
    • GCS: Google Cloud Storage
    • Azure: Azure Blob Storage
  2. User configures authentication method:
    • JWT Auth (Preferred): Set spec.backup.jwtAuthRole and configure the role in OpenBao
    • Static Token (Fallback): For all clusters, set spec.backup.tokenSecretRef pointing to a backup token Secret (root tokens are not used)
  3. Backup Manager (adminops controller) schedules backups using cron expressions (e.g., "0 3 * * *" for daily at 3 AM).
  4. On schedule, Backup Manager:
    • Creates a Kubernetes Job with the backup executor container
    • Job uses <cluster-name>-backup-serviceaccount (automatically created by operator)
    • Backup executor:
      • Authenticates to OpenBao using JWT Auth (via projected ServiceAccount token) or static token
      • Discovers the current Raft leader via OpenBao API
      • Streams GET /v1/sys/storage/raft/snapshot directly to object storage (no disk buffering)
      • Names backups predictably: <prefix>/<namespace>/<cluster>/<timestamp>-<uuid>.snap
      • Verifies upload completion
  5. Backup status is recorded in Status.Backup:
    • LastBackupTime, NextScheduledBackup for visibility
    • ConsecutiveFailures for alerting
  6. Optional retention policies (spec.backup.retention) automatically delete old backups:
    • MaxCount: Keep only the N most recent backups
    • MaxAge: Delete backups older than a specified duration
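
The naming pattern and the retention rules above can be illustrated with a small, self-contained sketch. This is not the operator's implementation: the `Backup` record, the `backup_key` and `select_expired` helpers, the timestamp format, and the choice to delete a backup when it violates either limit are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


def backup_key(prefix: str, namespace: str, cluster: str,
               created: datetime, uid: uuid.UUID) -> str:
    """Build <prefix>/<namespace>/<cluster>/<timestamp>-<uuid>.snap.

    Assumption: the timestamp format below is illustrative only.
    """
    stamp = created.strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}/{namespace}/{cluster}/{stamp}-{uid}.snap"


@dataclass(frozen=True)
class Backup:
    """Hypothetical record for one snapshot object in the bucket."""
    key: str
    created: datetime


def select_expired(backups: List[Backup], now: datetime,
                   max_count: Optional[int] = None,
                   max_age: Optional[timedelta] = None) -> List[str]:
    """Return the keys a MaxCount/MaxAge policy would delete, newest first.

    Assumption: a backup is deleted if it violates either limit.
    """
    newest_first = sorted(backups, key=lambda b: b.created, reverse=True)
    expired = set()
    if max_count is not None:
        # Everything past the N most recent backups is surplus.
        expired.update(b.key for b in newest_first[max_count:])
    if max_age is not None:
        # Anything strictly older than the allowed age is stale.
        expired.update(b.key for b in newest_first if now - b.created > max_age)
    return [b.key for b in newest_first if b.key in expired]


now = datetime(2024, 1, 10, 3, 0, tzinfo=timezone.utc)
backups = []
for day in range(5, 11):  # six daily backups, 5 Jan through 10 Jan
    created = datetime(2024, 1, day, 3, 0, tzinfo=timezone.utc)
    # Deterministic UUIDs keep the example reproducible.
    key = backup_key("backups", "default", "demo", created, uuid.UUID(int=day))
    backups.append(Backup(key, created))

to_delete = select_expired(backups, now, max_count=5, max_age=timedelta(days=3))
```

With these inputs, `MaxCount` alone would remove only the oldest backup, while `MaxAge` also removes the one from 6 January, so `to_delete` holds both keys.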

Backup Limitations

Backups are skipped while an upgrade is in progress to avoid inconsistent snapshots. Backups are optional for all clusters. If backups are enabled, either spec.backup.jwtAuthRole or spec.backup.tokenSecretRef must be configured. Root tokens are not used for backup operations.
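
These constraints amount to a small validation check plus a skip condition. The sketch below models them for illustration only; the `BackupSpec` class and the functions `validate_backup_spec` and `should_run_backup` are hypothetical and merely mirror the field names mentioned in this guide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackupSpec:
    """Hypothetical mirror of the spec.backup fields named in this guide."""
    schedule: str
    jwt_auth_role: Optional[str] = None     # spec.backup.jwtAuthRole
    token_secret_ref: Optional[str] = None  # spec.backup.tokenSecretRef


def validate_backup_spec(spec: Optional[BackupSpec]) -> List[str]:
    """Return a list of problems; an empty list means the spec is acceptable."""
    if spec is None:
        return []  # backups are optional, so an absent spec is valid
    problems = []
    if not (spec.jwt_auth_role or spec.token_secret_ref):
        problems.append(
            "set spec.backup.jwtAuthRole or spec.backup.tokenSecretRef"
        )
    return problems


def should_run_backup(spec: Optional[BackupSpec], upgrade_in_progress: bool) -> bool:
    """A backup runs only if it is configured, valid, and no upgrade is active."""
    if spec is None or validate_backup_spec(spec):
        return False
    return not upgrade_in_progress


jwt_spec = BackupSpec(schedule="0 3 * * *", jwt_auth_role="backup")
unauthenticated = BackupSpec(schedule="0 3 * * *")
```

Here `jwt_spec` passes validation and runs unless an upgrade is in progress, while `unauthenticated` is rejected because neither authentication field is set.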

Sequence Diagram

sequenceDiagram
    autonumber
    participant U as User
    participant K as Kubernetes API
    participant Op as OpenBao Operator
    participant Job as Backup Job Pod
    participant Bao as OpenBao API
    participant Storage as Object Storage

    U->>K: Configure backup schedule and target in OpenBaoCluster
    K-->>Op: Watch OpenBaoCluster (backup spec)
    Op->>Op: Schedule backup via cron
    Op->>K: Create Job/<cluster>-backup
    K-->>Job: Start backup executor Pod
    Job->>Bao: Authenticate (JWT or token)
    Job->>Bao: GET /v1/sys/storage/raft/snapshot
    Job->>Storage: Stream snapshot to object storage
    Job-->>Op: Exit status (success/failure)
    Op->>K: Update OpenBaoCluster.status.backup (last backup, failures)
    Op->>Storage: Apply retention policies (via backup manager, if configured)
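
The "stream snapshot to object storage" step depends on copying the response body in bounded chunks instead of writing it to disk first. The following is a minimal sketch of that idea, using in-memory file-like objects in place of the HTTP response and the storage upload. The `stream_snapshot` function and the chunk size are assumptions, not the executor's actual code.

```python
import hashlib
import io
from typing import BinaryIO, Tuple

CHUNK_SIZE = 1024 * 1024  # 1 MiB; an arbitrary choice for this sketch


def stream_snapshot(source: BinaryIO, sink: BinaryIO,
                    chunk_size: int = CHUNK_SIZE) -> Tuple[int, str]:
    """Copy source to sink chunk by chunk.

    Returns (bytes copied, SHA-256 hex digest). At most one chunk is held
    in memory at a time, and nothing is written to local disk.
    """
    digest = hashlib.sha256()
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break  # end of the snapshot stream
        sink.write(chunk)
        digest.update(chunk)
        total += len(chunk)
    return total, digest.hexdigest()


# Stand-ins for the snapshot response body and the object storage writer.
payload = b"raft-snapshot-bytes" * 1000
snapshot = io.BytesIO(payload)
uploaded = io.BytesIO()
size, checksum = stream_snapshot(snapshot, uploaded, chunk_size=4096)
```

The returned size and digest are one way an executor could check that an upload completed; how the actual backup executor verifies uploads is not specified here.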