7. Replication

This section covers cross-site and cross-region data replication for disaster recovery, backup, and data residency compliance. Each data store in the RCIIS platform has its own replication strategy, mechanism, and recovery objectives.


Replication Strategies

| Data Store | Mechanism | Topology | RPO | Page |
|---|---|---|---|---|
| S3 Object Storage | Apache Camel K (event-driven) / Ceph RGW Multi-Site / AWS CRR | On-Prem ↔ AWS, On-Prem ↔ On-Prem, AWS → AWS | Near real-time (all scenarios) | 7.1 S3 Replication |
| Kafka | Strimzi MirrorMaker 2 (Kafka Connect) | Active-Active (bidirectional, testing ↔ proxmox) via Cilium Cluster Mesh | 60 seconds | 7.2 Kafka Replication |
| PostgreSQL | CNPG Declarative Pub/Sub (logical replication) | Active-Active (bidirectional, on-prem ↔ on-prem) | Near real-time (seconds) | 7.3 PostgreSQL |
| SQL Server | Bidirectional Merge Replication (Standard Edition) | Active-Active (both nodes Publisher + Subscriber) via Cilium Cluster Mesh | Merge Agent interval (configurable, ≥1 min) | 7.4 SQL Server |

Recovery Objectives

| Metric | S3 (On-Prem ↔ AWS) | S3 (On-Prem ↔ On-Prem) | S3 (AWS CRR) | Kafka (MM2) | PostgreSQL | SQL Server |
|---|---|---|---|---|---|---|
| RPO | Seconds to minutes (event-driven) | Seconds to minutes (async log-based) | Minutes (CRR) | 60 seconds | Near real-time (seconds) | Merge Agent interval (≥1 min) |
| RTO | N/A (destination always readable) | N/A (active-active) | N/A | N/A (active-active) | Automatic (CNPG failover) | Manual (restart Merge Agent) |
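
As an illustration of how these objectives might be checked in practice, the sketch below compares observed replication lag against a per-store RPO target. It is a minimal example under stated assumptions, not part of the RCIIS platform: the store keys, the `LagSample` type, and the `rpo_breaches` helper are hypothetical names. Only the Kafka target (60 seconds) and the SQL Server lower bound (1 minute) come from the table above; the PostgreSQL and S3 figures are assumed placeholders for "seconds" and "seconds to minutes".

```python
from dataclasses import dataclass

# RPO targets in seconds, keyed by a hypothetical store identifier.
# Kafka (60 s) and the SQL Server floor (>= 1 min) are taken from the table;
# the remaining values are assumptions chosen only for illustration.
RPO_TARGET_SECONDS = {
    "kafka": 60.0,
    "sqlserver": 60.0,        # Merge Agent interval lower bound
    "postgresql": 5.0,        # assumed value for "near real-time (seconds)"
    "s3-onprem-aws": 120.0,   # assumed value for "seconds to minutes"
}


@dataclass(frozen=True)
class LagSample:
    """One observed replication lag measurement for a data store."""
    store: str
    lag_seconds: float


def rpo_breaches(samples):
    """Return the samples whose lag exceeds the RPO target for their store."""
    breaches = []
    for sample in samples:
        target = RPO_TARGET_SECONDS.get(sample.store)
        if target is None:
            # Fail loudly rather than silently skipping an unknown store.
            raise KeyError(f"no RPO target configured for {sample.store!r}")
        if sample.lag_seconds > target:
            breaches.append(sample)
    return breaches


if __name__ == "__main__":
    observed = [
        LagSample("kafka", 42.0),
        LagSample("sqlserver", 95.0),
        LagSample("postgresql", 1.5),
    ]
    for breach in rpo_breaches(observed):
        print(f"RPO breach: {breach.store} lag={breach.lag_seconds}s")
```

In a real deployment the lag values would come from whatever each replication mechanism exposes, and the comparison would more likely live in the Prometheus alerting rules than in application code.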

Cross-Cutting Concerns

All replication strategies share common requirements:

  • Encryption in transit — TLS for all cross-cluster traffic
  • Secret management — credentials stored as SOPS-encrypted secrets with KSOPS
  • Monitoring — Prometheus alerting rules for replication lag and failures
  • Network connectivity — Cilium Cluster Mesh (Kafka, SQL Server), VPN/ingress (S3 on-prem), or AWS-internal (S3 CRR)