7. Replication
This section covers cross-site and cross-region data replication for disaster recovery, backup, and data residency compliance. Each data store in the RCIIS platform has its own replication strategy, mechanism, and recovery objectives.
Replication Strategies
| Data Store | Mechanism | Topology | RPO | Page |
|---|---|---|---|---|
| S3 Object Storage | Apache Camel K (event-driven) / Ceph RGW Multi-Site / AWS CRR | On-Prem ↔ AWS, On-Prem ↔ On-Prem, AWS → AWS | Seconds to minutes (event-driven or async log-based); minutes for AWS CRR | 7.1 S3 Replication |
| Kafka | Strimzi MirrorMaker 2 (Kafka Connect) | Active-Active (bidirectional, testing ↔ proxmox) via Cilium Cluster Mesh | 60 seconds | 7.2 Kafka Replication |
| PostgreSQL | CNPG Declarative Pub/Sub (logical replication) | Active-Active (bidirectional, on-prem ↔ on-prem) | Near real-time (seconds) | 7.3 PostgreSQL |
| SQL Server | Bidirectional Merge Replication (Standard Edition) | Active-Active (both nodes Publisher + Subscriber) via Cilium Cluster Mesh | Merge Agent interval (configurable, ≥1 min) | 7.4 SQL Server |
Recovery Objectives
| Metric | S3 (On-Prem ↔ AWS) | S3 (On-Prem ↔ On-Prem) | S3 (AWS CRR) | Kafka (MM2) | PostgreSQL | SQL Server |
|---|---|---|---|---|---|---|
| RPO | Seconds to minutes (event-driven) | Seconds to minutes (async log-based) | Minutes (CRR) | 60 seconds | Near real-time (seconds) | Merge Agent interval (≥1 min) |
| RTO | N/A (destination always readable) | N/A (active-active) | N/A | N/A (active-active) | Automatic (CNPG failover) | Manual (restart Merge Agent) |
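The RPO row above can be turned into a simple compliance check. The following is a minimal sketch, not part of the platform: the `RpoTarget` type, the `TARGETS` mapping, and the `rpo_breaches` helper are hypothetical names. Only the 60-second Kafka figure and the 1-minute Merge Agent floor come from the table; the PostgreSQL ceiling of 10 seconds is an assumed stand-in for "near real-time (seconds)".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RpoTarget:
    """Maximum tolerated replication lag for one data store (hypothetical type)."""
    store: str
    max_lag_seconds: float


# Kafka: 60 s, as stated in the table.
# SQL Server: assumes the Merge Agent runs at its 1-minute minimum interval.
# PostgreSQL: 10 s is an illustrative assumption for "near real-time (seconds)".
TARGETS = {
    "kafka": RpoTarget("kafka", 60.0),
    "sqlserver": RpoTarget("sqlserver", 60.0),
    "postgresql": RpoTarget("postgresql", 10.0),
}


def rpo_breaches(observed_lag: dict[str, float]) -> list[str]:
    """Return the stores whose observed lag exceeds their target, sorted by name.

    Stores without a configured target are ignored rather than reported.
    """
    return sorted(
        store
        for store, lag in observed_lag.items()
        if store in TARGETS and lag > TARGETS[store].max_lag_seconds
    )


if __name__ == "__main__":
    # Lag values in seconds, e.g. as scraped from a monitoring system.
    print(rpo_breaches({"kafka": 75.0, "postgresql": 2.0, "sqlserver": 60.0}))
```

A lag exactly equal to the target is treated as compliant, so a Merge Agent running on a 60-second interval does not trigger a breach on every cycle.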
Cross-Cutting Concerns
All replication strategies share common requirements:
- Encryption in transit: TLS for all cross-cluster traffic
- Secret management: credentials stored as SOPS-encrypted secrets and decrypted by KSOPS during Kustomize builds
- Monitoring: Prometheus alerting rules for replication lag and failures
- Network connectivity: Cilium Cluster Mesh (Kafka, SQL Server), VPN/ingress (S3 on-prem), or AWS-internal (S3 CRR)
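As an illustration of the monitoring requirement, the sketch below builds a Prometheus alerting rule group for replication lag and serializes it as JSON, which YAML parsers generally accept. The metric name `replication_lag_seconds`, its `store` label, the `5m` hold time, and the helper names are assumptions made for this example; the real metric names depend on the exporter each data store uses.

```python
import json


def lag_alert_rule(store: str, rpo_seconds: int,
                   metric: str = "replication_lag_seconds") -> dict:
    """Build one alerting rule that fires when lag exceeds the store's RPO.

    `metric` and its `store` label are hypothetical; substitute the names
    exposed by the actual exporter.
    """
    return {
        "alert": f"{store.capitalize()}ReplicationLagHigh",
        "expr": f'{metric}{{store="{store}"}} > {rpo_seconds}',
        "for": "5m",  # assumed hold time before the alert fires
        "labels": {"severity": "warning"},
        "annotations": {
            "summary": f"{store} replication lag exceeds {rpo_seconds}s",
        },
    }


def rule_group(rules: list[dict], name: str = "replication") -> str:
    """Wrap rules in the standard `groups` structure and serialize them."""
    return json.dumps({"groups": [{"name": name, "rules": rules}]}, indent=2)


if __name__ == "__main__":
    # 60 s matches the Kafka RPO stated in the tables above.
    print(rule_group([lag_alert_rule("kafka", 60)]))
```

Deriving the threshold from the RPO keeps the alert aligned with the recovery objectives, so a change to an objective only needs to be made in one place.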