Validated Patterns

Trilio Continuous Restore

Validation status:
Sandbox
Trilio Continuous Restore — Red Hat Validated Pattern

Overview

This Validated Pattern delivers an automated, GitOps-driven Disaster Recovery (DR) solution for stateful applications running on Red Hat OpenShift. It integrates Trilio for Kubernetes with the Red Hat Validated Patterns framework to provide:

  • Automated backup of stateful workloads on the primary (hub) cluster
  • Continuous Restore — Trilio’s accelerated Recovery Time Objective (RTO) DR path that continuously pre-stages backup data on the DR cluster so that recovery requires only metadata retrieval, not a full data transfer
  • Automated DR testing — the full backup-to-restore lifecycle runs as a scheduled, self-healing GitOps workflow with no human intervention after initial setup
  • Multi-cluster lifecycle management through Red Hat Advanced Cluster Management (ACM)
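The RTO benefit of pre-staging can be illustrated with some rough arithmetic. The sketch below is a simplified model, not a measurement: it assumes restore time is the metadata retrieval time plus, when data is not pre-staged, the bulk transfer time. The function name, the 100 GiB volume size, the 200 MiB/s throughput, and the 30-second metadata time are all illustrative assumptions.

```python
def restore_time_seconds(data_gib: float, throughput_mib_s: float,
                         metadata_s: float, prestaged: bool) -> float:
    """Rough RTO model (an assumption, not Trilio's actual behavior).

    Restore time = metadata retrieval + bulk data transfer, where the
    transfer term drops to zero if the data is already on the DR cluster.
    """
    if throughput_mib_s <= 0:
        raise ValueError("throughput must be positive")
    transfer_s = 0.0 if prestaged else (data_gib * 1024) / throughput_mib_s
    return metadata_s + transfer_s


# Illustrative numbers only: 100 GiB of volume data, 200 MiB/s from S3,
# 30 s to fetch and apply backup metadata.
on_demand = restore_time_seconds(100, 200, 30.0, prestaged=False)
continuous = restore_time_seconds(100, 200, 30.0, prestaged=True)
print(on_demand, continuous)  # 542.0 30.0
```

In this model the gap grows linearly with data size, which is why pre-staging matters most for large persistent volumes. Real restore times also depend on factors the model ignores, such as scheduling and application startup.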

Use case

The pattern targets organizations that need a documented, repeatable DR posture for Kubernetes-native workloads, particularly those that must demonstrate that they meet RTO and Recovery Point Objective (RPO) targets through regular, automated DR tests rather than annual manual exercises.

A WordPress + MySQL deployment is included as a representative stateful application. It serves as the reference workload for the full backup, restore, and URL-rewrite lifecycle.
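To make the URL-rewrite part of that lifecycle concrete: WordPress stores its base URL in the siteurl and home rows of its options table, so a restored site keeps pointing at the original cluster until those rows are updated. The sketch below builds a parameterized SQL statement for that update. It is an assumption about how such a rewrite could be done, not the pattern's actual hook. The function name, the default wp_ table prefix, and the example hostname are hypothetical.

```python
from urllib.parse import urlsplit


def build_url_rewrite(new_base_url: str,
                      table_prefix: str = "wp_") -> tuple[str, tuple[str, ...]]:
    """Return (sql, params) that point WordPress at a new base URL.

    Hypothetical helper: the %s placeholder style matches common Python
    MySQL drivers, so the URL itself is never spliced into the SQL text.
    """
    parts = urlsplit(new_base_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) URL: {new_base_url!r}")
    # The table prefix is interpolated, so restrict it to safe characters.
    if not table_prefix.replace("_", "").isalnum():
        raise ValueError(f"unsafe table prefix: {table_prefix!r}")
    sql = (
        f"UPDATE {table_prefix}options SET option_value = %s "
        "WHERE option_name IN ('siteurl', 'home')"
    )
    return sql, (new_base_url.rstrip("/"),)


# Example with a made-up DR ingress hostname.
sql, params = build_url_rewrite("https://wordpress.apps.dr.example.com/")
print(sql)
print(params)
```

This covers only the two base-URL options. Absolute URLs embedded in post content or serialized option values would need separate handling.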


Architecture

  graph TD
    subgraph Git["Git (Source of Truth)"]
        values["values-hub.yaml\nvalues-secondary.yaml\ncharts/"]
    end

    subgraph Hub["Hub Cluster (primary)"]
        ACM["ACM"]
        ArgoCD["ArgoCD"]
        Vault["HashiCorp Vault + ESO"]
        Trilio_Hub["Trilio Operator + TVM"]
        CronJob["Imperative CronJob\n(DR lifecycle automation)"]
    end

    subgraph Spoke["DR Cluster (secondary)"]
        Trilio_Spoke["Trilio Operator + TVM"]
        EventTarget["EventTarget pod\n(pre-stages PVCs)"]
        ConsistentSet["ConsistentSet\n(restore point)"]
    end

    S3["Shared S3 Bucket"]

    Git -->|GitOps sync| ArgoCD
    ArgoCD --> Trilio_Hub
    Vault -->|S3 creds + license| Trilio_Hub
    Trilio_Hub -->|backups| S3
    ACM -->|provisions| Spoke
    S3 -->|EventTarget polls| EventTarget
    EventTarget --> ConsistentSet
    CronJob -->|restore from ConsistentSet| ConsistentSet

Component roles

| Component | Where | Role |
|---|---|---|
| Trilio Operator | Hub + Spoke | Installed through Operator Lifecycle Manager (OLM) from the certified-operators catalog, channel 5.3.x |
| TrilioVaultManager | Hub + Spoke | Trilio operand Custom Resource (CR); manages the Trilio data plane |
| Red Hat OpenShift | Hub + Spoke | Container orchestration platform; provides OLM, storage, networking, and the GitOps operator substrate |
| Red Hat OpenShift GitOps (ArgoCD) | Hub + Spoke | GitOps sync engine; all configuration is driven from Git |
| Red Hat Advanced Cluster Management (ACM) | Hub | Cluster lifecycle, policy enforcement, and spoke provisioning |
| Validated Patterns Imperative CronJob | Hub + Spoke | Runs the automated DR lifecycle on a 10-minute schedule |
| BackupTarget | Hub + Spoke | Points to the shared S3 bucket; the spoke BackupTarget has the EventTarget flag set |
| BackupPlan | Hub | Defines backup scope (wordpress namespace), quiesce/unquiesce hooks, and retention |
| Continuous Restore BackupPlan | Hub | Continuous Restore variant of BackupPlan; drives pre-staging on the spoke |
| EventTarget pod | Spoke | Watches the shared S3 bucket for new backups; pre-stages Persistent Volume Claims (PVCs) locally |
| ConsistentSet | Spoke | Cluster-scoped CR representing a fully pre-staged restore point |
| HashiCorp Vault and External Secrets Operator (ESO) | Hub | Secret management; S3 credentials and Trilio license are never stored in Git |

How Continuous Restore works

  1. The hub creates a backup using the Continuous Restore BackupPlan and writes it to the shared S3 bucket.
  2. The EventTarget pod on the spoke detects the new backup and begins copying volume data locally — ahead of any DR event.
  3. When the spoke’s imperative job detects an Available ConsistentSet, it submits a Restore CR. Because the data is already local, only backup metadata is fetched — resulting in significantly lower RTO than a standard on-demand restore.
  4. The post-restore Hook CR rewrites WordPress database URLs to the DR cluster’s ingress domain.
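The decision made in step 3 can be sketched as a small selection function: given the ConsistentSets visible on the spoke, pick the newest one that is Available, or do nothing if none is ready. This is a minimal sketch under assumptions, not the pattern's actual job. The function names and the status.state field path are hypothetical; metadata.name and metadata.creationTimestamp are standard Kubernetes object fields. Submitting the Restore CR itself is left out because its schema is not described here.

```python
from datetime import datetime, timezone
from typing import Optional


def _created(cs: dict) -> datetime:
    """Parse the Kubernetes creationTimestamp (RFC 3339, UTC, 'Z' suffix)."""
    ts = cs["metadata"]["creationTimestamp"]
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def pick_restore_point(consistent_sets: list[dict]) -> Optional[str]:
    """Return the name of the newest Available ConsistentSet, or None.

    'status.state' is a placeholder field path (an assumption); check the
    real ConsistentSet CRD for where availability is reported.
    """
    available = [
        cs for cs in consistent_sets
        if cs.get("status", {}).get("state") == "Available"
    ]
    if not available:
        return None  # nothing is fully pre-staged yet; try again next run
    return max(available, key=_created)["metadata"]["name"]


# Made-up objects shaped like the JSON a cluster API listing would return.
sample = [
    {"metadata": {"name": "cs-001", "creationTimestamp": "2024-05-01T10:00:00Z"},
     "status": {"state": "Available"}},
    {"metadata": {"name": "cs-002", "creationTimestamp": "2024-05-01T10:10:00Z"},
     "status": {"state": "Available"}},
    {"metadata": {"name": "cs-003", "creationTimestamp": "2024-05-01T10:20:00Z"},
     "status": {"state": "InProgress"}},
]
print(pick_restore_point(sample))  # cs-002
```

Returning None when nothing is Available keeps the scheduled job safe to re-run: a run that finds no ready restore point simply exits and the next run checks again.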

Next steps