Status: Sandbox

Trilio and related names, logos, and product names are trademarks or registered trademarks of their respective owners. This pattern is contributed as community content and is not an official product endorsement.

Trilio Continuous Restore — Red Hat Validated Pattern

Overview

This Validated Pattern provides an automated, GitOps-driven Disaster Recovery (DR) solution for stateful applications running on Red Hat OpenShift. By integrating Trilio for Kubernetes with the Red Hat Validated Patterns framework, it delivers:

  • Automated backup of stateful workloads on the primary (hub) cluster
  • Continuous Restore — Trilio’s accelerated Recovery Time Objective (RTO) DR path that continuously pre-stages backup data on the DR cluster so that recovery requires only metadata retrieval, not a full data transfer
  • Automated DR testing — the full backup-to-restore lifecycle runs as a scheduled, self-healing GitOps workflow with no human intervention after initial setup
  • Multi-cluster lifecycle management through Red Hat Advanced Cluster Management (ACM)
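
To make the "metadata only" point concrete, here is a back-of-the-envelope sketch that compares the raw transfer time of a full on-demand restore with a restore where the volume data is already pre-staged. The sizes and link speed are hypothetical illustration values, not measurements from this pattern, and the model deliberately ignores everything except data transfer.

```python
def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Idealized time to move size_gb gigabytes over a link_gbps link."""
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    return size_gb * 8 / link_gbps  # gigabytes -> gigabits, then divide by rate


# Hypothetical values, chosen only for illustration.
VOLUME_DATA_GB = 200.0  # application volume data
METADATA_GB = 0.05      # backup metadata
LINK_GBPS = 1.0         # effective throughput from the S3 bucket

# On-demand restore: volume data and metadata both cross the link at DR time.
on_demand = transfer_seconds(VOLUME_DATA_GB + METADATA_GB, LINK_GBPS)

# Pre-staged restore: only metadata crosses the link at DR time.
pre_staged = transfer_seconds(METADATA_GB, LINK_GBPS)

print(f"on-demand transfer:  {on_demand:.1f} s")
print(f"pre-staged transfer: {pre_staged:.1f} s")
```

Under these assumed numbers the transfer component of recovery shrinks from minutes to well under a second; real recovery time also includes scheduling, volume attachment, and application startup.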

Use case

The pattern targets organizations that need a documented, repeatable DR posture for Kubernetes-native workloads — particularly those that must demonstrate RTO/Recovery Point Objective (RPO) targets through regular, automated DR tests rather than annual manual exercises.
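
As a minimal sketch of how the result of one automated DR test could be checked against targets, the snippet below derives the achieved RPO and RTO from three timestamps. The function names, timestamps, and target values are hypothetical and are not part of the pattern.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_backup_completed: datetime, outage_started: datetime) -> timedelta:
    """Data-loss window: time between the last good backup and the outage."""
    return outage_started - last_backup_completed


def achieved_rto(outage_started: datetime, service_restored: datetime) -> timedelta:
    """Downtime: time between the outage and the service being usable again."""
    return service_restored - outage_started


def meets_target(achieved: timedelta, target: timedelta) -> bool:
    return achieved <= target


# Hypothetical timestamps from a single DR test run.
last_backup = datetime(2024, 1, 1, 12, 0)
outage = datetime(2024, 1, 1, 12, 7)
restored = datetime(2024, 1, 1, 12, 19)

rpo = achieved_rpo(last_backup, outage)
rto = achieved_rto(outage, restored)

# Hypothetical targets.
print("RPO met:", meets_target(rpo, timedelta(minutes=10)))
print("RTO met:", meets_target(rto, timedelta(minutes=15)))
```

Recording these two values on every scheduled run gives a history of evidence instead of a single annual data point.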

A WordPress + MySQL deployment is included as a representative stateful application. It serves as the reference workload for the full backup, restore, and URL-rewrite lifecycle.


Architecture

  graph TD
    subgraph Git["Git (Source of Truth)"]
        values["values-hub.yaml\nvalues-secondary.yaml\ncharts/"]
    end

    subgraph Hub["Hub Cluster (primary)"]
        ACM["ACM"]
        ArgoCD["ArgoCD"]
        Vault["HashiCorp Vault + ESO"]
        Trilio_Hub["Trilio Operator + TVM"]
        CronJob["Imperative CronJob\n(DR lifecycle automation)"]
    end

    subgraph Spoke["DR Cluster (secondary)"]
        Trilio_Spoke["Trilio Operator + TVM"]
        EventTarget["EventTarget pod\n(pre-stages PVCs)"]
        ConsistentSet["ConsistentSet\n(restore point)"]
    end

    S3["Shared S3 Bucket"]

    Git -->|GitOps sync| ArgoCD
    ArgoCD --> Trilio_Hub
    Vault -->|S3 creds + license| Trilio_Hub
    Trilio_Hub -->|backups| S3
    ACM -->|provisions| Spoke
    S3 -->|EventTarget polls| EventTarget
    EventTarget --> ConsistentSet
    CronJob -->|restore from ConsistentSet| ConsistentSet

Component roles

| Component | Where | Role |
| --- | --- | --- |
| Trilio Operator | Hub + Spoke | Installed through Operator Lifecycle Manager (OLM) from the certified-operators catalog, channel 5.3.x |
| TrilioVaultManager | Hub + Spoke | Trilio operand Custom Resource (CR); manages the Trilio data plane |
| Red Hat OpenShift | Hub + Spoke | Container orchestration platform; provides OLM, storage, networking, and the GitOps operator substrate |
| Red Hat OpenShift GitOps (ArgoCD) | Hub + Spoke | GitOps sync engine; all configuration is driven from Git |
| Red Hat Advanced Cluster Management (ACM) | Hub | Cluster lifecycle, policy enforcement, and spoke provisioning |
| Validated Patterns Imperative CronJob | Hub + Spoke | Runs the automated DR lifecycle on a 10-minute schedule |
| BackupTarget | Hub + Spoke | Points to the shared S3 bucket; the spoke BackupTarget has the EventTarget flag set |
| BackupPlan | Hub | Defines backup scope (wordpress namespace), quiesce/unquiesce hooks, and retention |
| Continuous Restore BackupPlan | Hub | Continuous Restore variant of BackupPlan; drives pre-staging on the spoke |
| EventTarget pod | Spoke | Watches the shared S3 bucket for new backups; pre-stages Persistent Volume Claims (PVCs) locally |
| ConsistentSet | Spoke | Cluster-scoped CR representing a fully pre-staged restore point |
| HashiCorp Vault and External Secrets Operator (ESO) | Hub | Secret management; S3 credentials and Trilio license are never stored in Git |
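
The 10-minute schedule of the imperative CronJob bounds how long a change can wait before the next automated run picks it up. The sketch below assumes the schedule is equivalent to the cron expression `*/10 * * * *`, which fires on wall-clock 10-minute boundaries; that expression is an assumption, since the schedule string itself is not shown here, and `next_run` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, interval_minutes: int = 10) -> datetime:
    """Next wall-clock boundary strictly after `now`, as `*/N * * * *` would fire."""
    if interval_minutes <= 0 or 60 % interval_minutes != 0:
        raise ValueError("interval_minutes must be a positive divisor of 60")
    # Round down to the most recent boundary, then step one interval forward.
    floored = now.replace(
        minute=now.minute - now.minute % interval_minutes,
        second=0,
        microsecond=0,
    )
    return floored + timedelta(minutes=interval_minutes)


now = datetime(2024, 1, 1, 12, 7, 30)  # hypothetical current time
run_at = next_run(now)
print("next run:", run_at.isoformat())
print("wait:", run_at - now)
```

Under this assumption the wait before the next run is always between zero and ten minutes, plus the runtime of the job itself.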

How Continuous Restore works

  1. The hub creates a backup using the Continuous Restore BackupPlan and writes it to the shared S3 bucket.
  2. The EventTarget pod on the spoke detects the new backup and begins copying volume data locally — ahead of any DR event.
  3. When the spoke’s imperative job detects an Available ConsistentSet, it submits a Restore CR. Because the data is already local, only backup metadata is fetched — resulting in significantly lower RTO than a standard on-demand restore.
  4. The post-restore Hook CR rewrites WordPress database URLs to the DR cluster’s ingress domain.
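
The decision logic in steps 3 and 4 can be sketched in a few lines. This is a simplified model, not the pattern's actual job or hook: ConsistentSets are represented as plain dictionaries with hypothetical keys (`name`, `status`, `createdAt`) rather than the real Trilio CR schema, and the domains are placeholders.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def newest_available(consistent_sets: list[dict]) -> Optional[dict]:
    """Return the most recent ConsistentSet whose status is Available, if any."""
    available = [cs for cs in consistent_sets if cs.get("status") == "Available"]
    if not available:
        return None
    # ISO 8601 UTC timestamps in one fixed format sort correctly as strings.
    return max(available, key=lambda cs: cs["createdAt"])


def rewrite_site_url(url: str, old_domain: str, new_domain: str) -> str:
    """Swap the ingress domain of a URL, keeping scheme, port, path, and query.

    Any userinfo in the URL is dropped; URLs on other domains are returned as-is.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host != old_domain and not host.endswith("." + old_domain):
        return url
    new_host = host[: len(host) - len(old_domain)] + new_domain
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Hypothetical inventory as the spoke job might see it.
sets = [
    {"name": "cs-1", "status": "Available", "createdAt": "2024-01-01T00:00:00Z"},
    {"name": "cs-2", "status": "InProgress", "createdAt": "2024-01-02T00:00:00Z"},
    {"name": "cs-3", "status": "Available", "createdAt": "2024-01-01T12:00:00Z"},
]

chosen = newest_available(sets)
if chosen is not None:
    print("restore from:", chosen["name"])

print(rewrite_site_url(
    "https://wordpress.apps.hub.example.com/blog?p=1",
    "apps.hub.example.com",
    "apps.dr.example.com",
))
```

Note that the newest ConsistentSet overall (`cs-2`) is skipped because it is still being pre-staged; only a fully staged restore point qualifies for the metadata-only restore path.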

Next steps