7.1 S3 Object Storage Replication¶
This section covers replicating S3 object storage data between environments for disaster recovery, backup, and data residency compliance. Three replication scenarios are supported depending on where your source and destination buckets are hosted.
Replication Scenarios¶
| Scenario | Source | Destination | Mechanism | Use Case |
|---|---|---|---|---|
| On-Prem ↔ AWS | Ceph RGW (on-prem) | AWS S3 | Apache Camel K (event-driven, bidirectional) | Real-time sync between on-prem and cloud |
| On-Prem ↔ On-Prem | Ceph RGW (Cluster A) | Ceph RGW (Cluster B) | Ceph RGW Multi-Site | Active-active replication between on-prem clusters |
| AWS → AWS | AWS S3 (Region A) | AWS S3 (Region B) | AWS Cross-Region Replication | Cross-region redundancy within AWS |
Which Scenario Do I Need?¶
- Both source and destination are AWS S3?
    - Use the AWS → AWS tab. This is fully managed by AWS with near real-time replication and requires no additional infrastructure.
- Source is on-prem Ceph and destination is AWS S3 (or vice versa)?
    - Use the On-Prem ↔ AWS tab. Two Apache Camel K integrations replicate objects bidirectionally in near real-time using event-driven triggers (Kafka for Ceph events, SQS for AWS events). Each integration uses native S3 components (aws2-s3, aws2-sqs) — no shell commands or external CLIs.
- Both source and destination are on-prem Ceph clusters?
    - Use the On-Prem ↔ On-Prem tab. Ceph RGW multi-site replication provides native, asynchronous, bidirectional replication between two Rook-Ceph clusters using realms, zonegroups, and zones. Requires network connectivity between the RGW endpoints on each cluster.
Note
All three scenarios use event-driven / asynchronous replication — objects are replicated within seconds to minutes of being written. Ceph multi-site uses internal change logs for incremental sync. AWS CRR replicates objects in near real-time as they are written.
Key Differences¶
| Feature | Camel K (On-Prem ↔ AWS) | Ceph Multi-Site (On-Prem ↔ On-Prem) | AWS CRR |
|---|---|---|---|
| Sync model | Event-driven (near real-time) | Async log-based (near real-time) | Near real-time (event-driven) |
| Direction | Bidirectional | Bidirectional (active-active) | Unidirectional (or bidirectional with two rules) |
| Infrastructure | Camel K Integration CRs + Kafka + SQS | Rook-Ceph CRDs (Realm, ZoneGroup, Zone) | Fully managed by AWS |
| RPO | Seconds to minutes | Seconds to minutes | Minutes |
| Initial backfill | Manual (run aws s3 sync before enabling) | Automatic (full sync on zone join) | Requires S3 Batch Replication |
| Monitoring | Camel Micrometer metrics + PodMonitor | radosgw-admin sync status + Ceph metrics | CloudWatch metrics |
| Network | On-prem → AWS S3 egress; AWS → on-prem RGW egress | RGW-to-RGW HTTP/S between clusters | AWS internal |
| Cost | Compute + egress + SQS | Ceph cluster resources + egress bandwidth | S3 replication + storage |
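The manual backfill noted for the Camel K scenario can be staged through a local directory, since a single aws s3 sync call cannot span two different S3 endpoints (a sketch; the external RGW endpoint URL is an assumption — the bucket names match the examples used throughout this section):

```shell
# One-time backfill before enabling the integrations.
# 1) Pull existing objects from Ceph RGW (endpoint URL is an assumption)
aws s3 sync s3://s3-replication-test /tmp/backfill \
  --endpoint-url https://rgw.cluster-a.example.com

# 2) Push the staged objects to the AWS replica bucket
aws s3 sync /tmp/backfill s3://rciis-ceph-replica-af-south-1 \
  --region af-south-1
```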
Architecture¶
Kubernetes Cluster (on-prem) Kubernetes Cluster (AWS af-south-1)
┌─────────────────────────────────────────┐ ┌──────────────────────────────────────┐
│ │ │ │
│ Ceph RGW (ceph-objectstore) │ │ S3 Bucket │
│ │ │ │ rciis-ceph-replica-* │
│ │ bucket notification │ │ │ │
│ ▼ │ │ │ S3 event notification │
│ Kafka (Strimzi) │ │ ▼ │
│ (ceph-bucket-notifications topic) │ │ SQS Queue │
│ │ │ │ (rciis-s3-replication-events) │
│ ▼ │ │ │ │
│ ┌───────────────────────────────┐ │ │ ▼ │
│ │ Camel K Integration │ │ │ ┌───────────────────────────────┐ │
│ │ (ceph-to-aws-replicator) │──────┼── aws2-s3 ──────>│ │ Camel K Integration │ │
│ │ loop: userIdentity check │ │ putObject │ │ (aws-to-ceph-replicator) │ │
│ └───────────────────────────────┘ │ │ │ loop: userIdentity check │ │
│ │ │ └──────────────┬────────────────┘ │
│ │ │ │ │
│ Ceph RGW <───────────────────────────┼── aws2-s3 ───────┼─────────────────┘ │
│ (receives replicated objects) │ putObject │ │
│ │ │ Camel K Operator │
│ Camel K Operator │ │ PrometheusRule (alerts) │
│ PrometheusRule (alerts) │ │ │
└─────────────────────────────────────────┘ └──────────────────────────────────────┘
Each Camel K integration is co-located with its source data to minimize cross-WAN traffic. Data crosses the network exactly once per object per direction.
Direction 1 — Ceph → AWS (integration runs on-prem): Ceph RGW emits bucket notifications on object create/delete. These are published to a Kafka topic via CephBucketTopic and CephBucketNotification CRDs. The ceph-to-aws-replicator Camel K Integration consumes from Kafka locally, downloads the object from Ceph via the aws2-s3 component (local read, no WAN), and uploads it to AWS S3 via a second aws2-s3 producer (single WAN hop). Deletes invoke the deleteObject operation on AWS.
Direction 2 — AWS → Ceph (integration runs in AWS): S3 event notifications on the AWS bucket publish to an SQS queue. The aws-to-ceph-replicator Camel K Integration consumes from SQS locally in the AWS cluster, downloads the object from S3 via aws2-s3 (local read, no WAN), and uploads it to Ceph RGW via aws2-s3 with uriEndpointOverride pointing to the on-prem RGW endpoint (single WAN hop). Deletes invoke deleteObject on Ceph.
Loop Prevention via userIdentity.principalId: Both Ceph RGW and AWS S3 include the writer's identity (userIdentity.principalId) in the event notification payload. Each integration checks this field before processing:
- Ceph → AWS (on-prem): If principalId is s3-replicator (the Ceph user that the AWS-side replicator writes as), the event is skipped — it was a replicated write from AWS.
- AWS → Ceph (AWS): If principalId matches the AWS IAM user rciis-ceph-s3-replicator (the identity that the on-prem replicator writes as), the event is skipped — it was a replicated write from Ceph.
This approach requires zero extra API calls (no headObject), has no race conditions, works for both creates and deletes, and does not modify the object or its metadata.
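A minimal shell sketch of the check, run against a sample notification payload (the real integrations do this with a JSONPath expression inside the Camel K flow; the naive sed extraction is for illustration only):

```shell
# Sample bucket-notification payload, trimmed to the relevant field
EVENT='{"Records":[{"eventName":"ObjectCreated:Put","userIdentity":{"principalId":"s3-replicator"}}]}'

# Extract userIdentity.principalId (naive extraction, illustration only)
principal=$(printf '%s' "$EVENT" | sed -n 's/.*"principalId":"\([^"]*\)".*/\1/p')

# Skip events written by the replicator identity itself
if [ "$principal" = "s3-replicator" ]; then
  decision="skip"     # replicated write arriving from the other side
else
  decision="process"  # original user write — replicate it
fi
echo "$decision"
```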
Cluster A (master zone) Cluster B (secondary zone)
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ │ │ │
│ Rook-Ceph │ │ Rook-Ceph │
│ ┌────────────────────────┐ │ │ ┌────────────────────────┐ │
│ │ CephObjectRealm │ │ │ │ CephObjectRealm │ │
│ │ CephObjectZoneGroup │ │ │ │ (pull from master) │ │
│ │ CephObjectZone: zoneA │ │ async │ │ CephObjectZone: zoneB │ │
│ │ CephObjectStore │ │ log-based │ │ CephObjectStore │ │
│ │ │ │ replication │ │ │ │
│ │ RGW instances ◄─────┼──┼───────────────┼──┼──► RGW instances │ │
│ │ │ │ │ │ │ │
│ └────────────────────────┘ │ │ └────────────────────────┘ │
│ │ │ │
└──────────────────────────────┘ └──────────────────────────────┘
▲ ▲
└──────── Cross-cluster network ───────────────┘
(VPN / peering / ingress)
Ceph RGW multi-site replication is a native Ceph feature that replicates data asynchronously between two (or more) Ceph clusters. Both clusters participate in the same realm and zonegroup, each hosting a separate zone. Replication is bidirectional (active-active) by default — both zones accept reads and writes, with changes propagated via internal change logs.
The first zone created becomes the master zone (handles metadata operations and period updates). The second zone is the secondary zone and pulls the realm configuration from the master's RGW endpoint. After initial setup, data flows bidirectionally.
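Once both zones are configured, replication progress can be checked from the Rook toolbox on either cluster (a sketch; the toolbox deployment name follows Rook's default, and the zone name is assumed to match the CephObjectZone CR):

```shell
# Data and metadata sync state for this zone (run on either cluster)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin sync status --rgw-zone=zoneA

# Confirm both zones appear in the same zonegroup and period
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin period get
```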
AWS Region A (source) AWS Region B (destination)
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ │ │ │
│ S3 Bucket (versioned) │ AWS CRR │ S3 Bucket (versioned) │
│ rciis-data-af-south-1 │ ─────────────>│ rciis-data-eu-west-1 │
│ │ near │ │
│ │ real-time │ │
└──────────────────────────────┘ └──────────────────────────────┘
│
│ IAM Role:
│ s3-replication-role
│ (assumed by S3 service)
AWS S3 handles replication automatically. When an object is written to the source bucket, S3 replicates it to the destination bucket asynchronously (typically within 15 minutes, often faster).
Note
S3 Standard storage class already replicates objects across a minimum of 3 Availability Zones within a single region. CRR is for replicating data to a different region or a different AWS account — you do not need it for AZ-level redundancy.
Prerequisites¶
On-prem Kubernetes cluster (runs ceph-to-aws-replicator):
- Rook-Ceph cluster running with ceph-objectstore healthy
- s3-replicator namespace added to allowUsersInNamespaces in the CephCluster values
- Strimzi Kafka cluster running (e.g. kafka-rciis-prod in the rciis-prod namespace)
- Apache Camel K operator deployed (watches the s3-replicator namespace)
- KSOPS plugin available for SOPS secret decryption
- Prometheus Operator installed (for alerting rules)
- Network egress to AWS S3 (af-south-1)
AWS Kubernetes cluster (runs aws-to-ceph-replicator):
- Kubernetes cluster running in the same region as the S3 bucket (af-south-1)
- Apache Camel K operator deployed (watches the s3-replicator namespace)
- KSOPS plugin available for SOPS secret decryption
- Prometheus Operator installed (for alerting rules)
- Network connectivity to the on-prem Ceph RGW endpoint (VPN, peering, or public ingress with TLS)
AWS managed services:
- S3 bucket created in the target region with versioning and encryption enabled
- SQS queue for receiving S3 event notifications
- SQS dead-letter queue for failed messages
- S3 event notification configuration pointing to the SQS queue
- IAM user with least-privilege access to S3 and SQS
Both clusters:
- Rook-Ceph cluster running with a healthy Ceph cluster
- Rook operator v1.4+ (multi-site support)
Master cluster (Cluster A):
- RGW endpoint reachable from Cluster B (for realm pull and data sync)
- The realm, zonegroup, and master zone CRDs are created here first
Secondary cluster (Cluster B):
- RGW endpoint reachable from Cluster A (for bidirectional sync)
- The realm keys secret from Cluster A must be copied to this cluster before pulling the realm
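Copying the realm keys can be done directly with kubectl (a sketch; the kubeconfig context names and the grep-based metadata stripping are assumptions, and the secret name follows Rook's "<realm-name>-keys" convention — adjust to your actual realm name):

```shell
# Export the realm keys secret from Cluster A and apply it on Cluster B
# before creating the pulled CephObjectRealm there.
kubectl --context cluster-a -n rook-ceph get secret multisite-realm-keys -o yaml \
  | grep -v -E 'resourceVersion|uid|creationTimestamp' \
  | kubectl --context cluster-b -n rook-ceph apply -f -
```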
Network connectivity:
Both clusters' RGW endpoints must be reachable from each other (bidirectional). This is required for the RGW sync agents to transfer data and metadata.
| Method | Description |
|---|---|
| VPN tunnel | Site-to-site VPN between the two cluster networks |
| Network peering | Direct network peering if on the same WAN |
| Ingress / LoadBalancer | Expose RGW on both clusters via an ingress controller or LoadBalancer service with TLS |
Warning
If exposing Ceph RGW over the internet via ingress, always use TLS and restrict access by source IP or mTLS.
Firewall rules — bidirectional connectivity between both clusters:
| Source | Destination | Port | Protocol |
|---|---|---|---|
| Cluster A RGW | Cluster B RGW ingress | 443 | TCP (HTTPS) |
| Cluster B RGW | Cluster A RGW ingress | 443 | TCP (HTTPS) |
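Before creating the zone CRs, reachability can be verified in both directions (hostnames are the example values above; an HTTP 403 from an unauthenticated request still proves the RGW is reachable):

```shell
# From a pod or node in Cluster A, check Cluster B's RGW — and vice versa
curl -sk -o /dev/null -w '%{http_code}\n' https://rgw.cluster-b.example.com
curl -sk -o /dev/null -w '%{http_code}\n' https://rgw.cluster-a.example.com
```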
- Versioning enabled on both source and destination buckets (required by CRR)
- IAM role with permissions for S3 to replicate on your behalf
- Both buckets must exist before configuring replication
Destination Setup¶
1. Create the S3 Bucket¶
aws s3api create-bucket \
--bucket rciis-ceph-replica-af-south-1 \
--region af-south-1 \
--create-bucket-configuration LocationConstraint=af-south-1
# Enable versioning
aws s3api put-bucket-versioning \
--bucket rciis-ceph-replica-af-south-1 \
--versioning-configuration Status=Enabled
# Block public access
aws s3api put-public-access-block \
--bucket rciis-ceph-replica-af-south-1 \
--public-access-block-configuration \
BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
# Enable SSE-S3 encryption
aws s3api put-bucket-encryption \
--bucket rciis-ceph-replica-af-south-1 \
--server-side-encryption-configuration \
'{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
2. Create SQS Queues¶
Create the main event queue and a dead-letter queue for failed messages.
# Create the dead-letter queue first
aws sqs create-queue \
--queue-name rciis-s3-replication-dlq \
--region af-south-1
# Get the DLQ ARN
DLQ_ARN=$(aws sqs get-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/$(aws sts get-caller-identity --query Account --output text)/rciis-s3-replication-dlq" \
--attribute-names QueueArn \
--query 'Attributes.QueueArn' --output text \
--region af-south-1)
# Create the main queue with DLQ redrive policy
aws sqs create-queue \
--queue-name rciis-s3-replication-events \
--region af-south-1 \
--attributes '{
"RedrivePolicy": "{\"deadLetterTargetArn\":\"'"$DLQ_ARN"'\",\"maxReceiveCount\":\"5\"}",
"VisibilityTimeout": "300"
}'
3. Configure S3 Event Notifications → SQS¶
First, set the SQS queue policy to allow S3 to publish to it:
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
QUEUE_ARN="arn:aws:sqs:af-south-1:${ACCOUNT_ID}:rciis-s3-replication-events"
aws sqs set-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/${ACCOUNT_ID}/rciis-s3-replication-events" \
--region af-south-1 \
--attributes '{
"Policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"s3.amazonaws.com\"},\"Action\":\"sqs:SendMessage\",\"Resource\":\"'"${QUEUE_ARN}"'\",\"Condition\":{\"ArnEquals\":{\"aws:SourceArn\":\"arn:aws:s3:::rciis-ceph-replica-af-south-1\"}}}]}"
}'
Then configure S3 to send event notifications to SQS:
aws s3api put-bucket-notification-configuration \
--bucket rciis-ceph-replica-af-south-1 \
--region af-south-1 \
--notification-configuration '{
"QueueConfigurations": [
{
"QueueArn": "'"${QUEUE_ARN}"'",
"Events": [
"s3:ObjectCreated:*",
"s3:ObjectRemoved:*"
]
}
]
}'
4. Create IAM User with S3 + SQS Permissions¶
aws iam create-user --user-name rciis-ceph-s3-replicator
cat > /tmp/s3-replicator-policy.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "S3ReplicationAccess",
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject",
"s3:ListBucket",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::rciis-ceph-replica-af-south-1",
"arn:aws:s3:::rciis-ceph-replica-af-south-1/*"
]
},
{
"Sid": "SQSConsumerAccess",
"Effect": "Allow",
"Action": [
"sqs:ReceiveMessage",
"sqs:DeleteMessage",
"sqs:GetQueueAttributes",
"sqs:GetQueueUrl"
],
"Resource": "arn:aws:sqs:af-south-1:*:rciis-s3-replication-events"
},
{
"Sid": "SQSDLQAccess",
"Effect": "Allow",
"Action": [
"sqs:SendMessage"
],
"Resource": "arn:aws:sqs:af-south-1:*:rciis-s3-replication-dlq"
}
]
}
EOF
aws iam put-user-policy \
--user-name rciis-ceph-s3-replicator \
--policy-name S3ReplicatorAccess \
--policy-document file:///tmp/s3-replicator-policy.json
5. Generate Access Key¶
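The access key for the IAM user created above can be generated as follows (the secret access key is shown only once, so capture it securely — it becomes the SOPS-encrypted aws-s3-credentials secret used by the integrations):

```shell
# Generate the access key pair for the replicator IAM user
aws iam create-access-key \
  --user-name rciis-ceph-s3-replicator \
  --query 'AccessKey.[AccessKeyId,SecretAccessKey]' \
  --output text
```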
Ceph multi-site does not require separate "destination setup" — the realm, zonegroup, and zone configuration on both clusters handles replication automatically. See the Kubernetes Manifests section for the full CRD configuration.
Both clusters need their RGW endpoints exposed externally. On each cluster, create an ingress or LoadBalancer for the RGW service:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ceph-rgw
namespace: rook-ceph
annotations:
nginx.ingress.kubernetes.io/proxy-body-size: "0"
nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
spec:
tls:
- hosts:
- rgw.cluster-a.example.com # or rgw.cluster-b.example.com
secretName: rgw-tls
rules:
- host: rgw.cluster-a.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: rook-ceph-rgw-multisite-store
port:
number: 443
Tip
Set proxy-body-size: "0" to allow unlimited upload size through the ingress. The default limit is typically too small for object storage.
1. Create Source and Destination Buckets¶
# Source bucket (af-south-1)
aws s3api create-bucket \
--bucket rciis-data-af-south-1 \
--region af-south-1 \
--create-bucket-configuration LocationConstraint=af-south-1
# Destination bucket (eu-west-1)
aws s3api create-bucket \
--bucket rciis-data-eu-west-1 \
--region eu-west-1 \
--create-bucket-configuration LocationConstraint=eu-west-1
2. Enable Versioning on Both Buckets¶
aws s3api put-bucket-versioning \
--bucket rciis-data-af-south-1 \
--versioning-configuration Status=Enabled
aws s3api put-bucket-versioning \
--bucket rciis-data-eu-west-1 \
--versioning-configuration Status=Enabled
3. Create the IAM Replication Role¶
# Trust policy — allows S3 to assume this role
cat > /tmp/s3-replication-trust.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "s3.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
aws iam create-role \
--role-name s3-replication-role \
--assume-role-policy-document file:///tmp/s3-replication-trust.json
# Permissions policy — read from source, write to destination
cat > /tmp/s3-replication-policy.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetReplicationConfiguration",
"s3:ListBucket"
],
"Resource": "arn:aws:s3:::rciis-data-af-south-1"
},
{
"Effect": "Allow",
"Action": [
"s3:GetObjectVersionForReplication",
"s3:GetObjectVersionAcl",
"s3:GetObjectVersionTagging"
],
"Resource": "arn:aws:s3:::rciis-data-af-south-1/*"
},
{
"Effect": "Allow",
"Action": [
"s3:ReplicateObject",
"s3:ReplicateDelete",
"s3:ReplicateTags"
],
"Resource": "arn:aws:s3:::rciis-data-eu-west-1/*"
}
]
}
EOF
aws iam put-role-policy \
--role-name s3-replication-role \
--policy-name S3ReplicationPolicy \
--policy-document file:///tmp/s3-replication-policy.json
4. Configure Replication on the Source Bucket¶
# Get your AWS account ID
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
cat > /tmp/replication-config.json << EOF
{
"Role": "arn:aws:iam::${ACCOUNT_ID}:role/s3-replication-role",
"Rules": [
{
"ID": "replicate-all",
"Status": "Enabled",
"Priority": 1,
"Filter": {},
"Destination": {
"Bucket": "arn:aws:s3:::rciis-data-eu-west-1",
"StorageClass": "STANDARD"
},
"DeleteMarkerReplication": {
"Status": "Enabled"
}
}
]
}
EOF
aws s3api put-bucket-replication \
--bucket rciis-data-af-south-1 \
--replication-configuration file:///tmp/replication-config.json
Tip
The Filter: {} setting replicates all objects. To replicate only a subset, add a Prefix or Tag filter.
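Once the rule is active, per-object progress can be checked with head-object (the key below is a placeholder). ReplicationStatus is PENDING, COMPLETED, or FAILED on the source object, and REPLICA on the destination copy:

```shell
# Check replication status of a single object on the source bucket
aws s3api head-object \
  --bucket rciis-data-af-south-1 \
  --key path/to/object \
  --region af-south-1 \
  --query ReplicationStatus \
  --output text
```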
Cross-Account Replication¶
If the source and destination buckets are in different AWS accounts, add a bucket policy on the destination to allow the source account's replication role to write:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<SOURCE_ACCOUNT_ID>:role/s3-replication-role"
},
"Action": [
"s3:ReplicateObject",
"s3:ReplicateDelete",
"s3:ReplicateTags",
"s3:ObjectOwnerOverrideToBucketOwner"
],
"Resource": "arn:aws:s3:::rciis-data-eu-west-1/*"
}
]
}
Add AccessControlTranslation and Account to the replication rule destination to transfer object ownership:
"Destination": {
"Bucket": "arn:aws:s3:::rciis-data-eu-west-1",
"Account": "<DEST_ACCOUNT_ID>",
"AccessControlTranslation": {
"Owner": "Destination"
}
}
Pre-Existing Objects¶
Warning
CRR only replicates objects created after the replication rule is enabled. Pre-existing objects are not replicated automatically.
To replicate pre-existing objects, use S3 Batch Replication:
aws s3control create-job \
--account-id $ACCOUNT_ID \
--operation '{"S3ReplicateObject":{}}' \
--manifest-generator '{"S3JobManifestGenerator":{"ExpectedBucketOwner":"'$ACCOUNT_ID'","SourceS3BucketArn":"arn:aws:s3:::rciis-data-af-south-1","EnableManifestOutput":false,"Filter":{"EligibleForReplication":true}}}' \
--report '{"Enabled":true,"Bucket":"arn:aws:s3:::rciis-data-af-south-1","Format":"Report_CSV_20180820","ReportScope":"AllTasks","Prefix":"batch-replication-report"}' \
--priority 1 \
--role-arn arn:aws:iam::${ACCOUNT_ID}:role/s3-replication-role \
--confirmation-required \
--region af-south-1
Ceph Bucket Notification Setup¶
Note
This section applies only to the On-Prem ↔ AWS scenario. Skip this if you are using On-Prem → On-Prem or AWS → AWS replication.
Ceph RGW bucket notifications push object events to the Strimzi Kafka cluster. Two Rook CRs are required.
CephBucketTopic¶
Creates a Kafka-backed notification topic on the RGW:
apiVersion: ceph.rook.io/v1
kind: CephBucketTopic
metadata:
name: s3-replication-topic
namespace: rook-ceph
spec:
objectStoreName: ceph-objectstore
objectStoreNamespace: rook-ceph
endpoint:
kafka:
uri: kafka-rciis-prod-kafka-bootstrap.rciis-prod.svc:9093
useSSL: false
ackLevel: broker
CephBucketNotification¶
Attaches the topic to the replication bucket so that create and delete events are published:
apiVersion: ceph.rook.io/v1
kind: CephBucketNotification
metadata:
name: s3-replication-notification
namespace: rook-ceph
spec:
topic: s3-replication-topic
events:
- s3:ObjectCreated:*
- s3:ObjectRemoved:*
Note
The CephBucketNotification must be in the same namespace as the CephBucketTopic. The notification is automatically attached to all buckets associated with the object store. To limit it to specific buckets, add a filter field with prefix/suffix/regex rules.
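For example, a sketch of restricting the notification to one key prefix (field names follow Rook's CephBucketNotification filter schema; the prefix value is an assumption):

```yaml
spec:
  topic: s3-replication-topic
  events:
    - s3:ObjectCreated:*
    - s3:ObjectRemoved:*
  filter:
    keyFilters:
      - name: prefix
        value: replicated/
```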
Kubernetes Manifests¶
CephObjectStoreUser¶
Creates a Rook-managed S3 user whose credentials are auto-generated as a Kubernetes secret.
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
name: s3-replicator
namespace: s3-replicator
spec:
store: ceph-objectstore
clusterNamespace: rook-ceph
displayName: s3-replicator
The auto-generated secret rook-ceph-object-user-ceph-objectstore-s3-replicator contains AccessKey and SecretKey fields.
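If needed for debugging, the credentials can be decoded with kubectl (the Integration mounts the secret directly, so this is not required for normal operation):

```shell
# Decode the auto-generated Ceph S3 access key
kubectl -n s3-replicator get secret \
  rook-ceph-object-user-ceph-objectstore-s3-replicator \
  -o jsonpath='{.data.AccessKey}' | base64 -d; echo
```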
ObjectBucketClaim¶
Dynamically provisions a bucket on Ceph for the replication source data.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
name: s3-replication-test
namespace: s3-replicator
spec:
bucketName: s3-replication-test
storageClassName: ceph-bucket
Direction 1: Ceph → AWS Camel K Integration (on-prem cluster)¶
Location: apps/infra/s3-replicator/on-prem/extra/ceph-to-aws-pipeline.yaml
This integration runs in the on-prem Kubernetes cluster. It consumes bucket notifications from the local Kafka cluster, reads the object from Ceph locally (no WAN), and uploads it to AWS S3 (single WAN hop). The Camel K operator builds and deploys the integration pod automatically.
apiVersion: camel.apache.org/v1
kind: Integration
metadata:
name: ceph-to-aws-replicator
namespace: s3-replicator
annotations:
camel.apache.org/operator.id: camel-k-operator
labels:
app.kubernetes.io/name: ceph-to-aws-replicator
app.kubernetes.io/component: s3-replication
app.kubernetes.io/part-of: s3-replicator
spec:
dependencies:
- "camel:kafka"
- "camel:aws2-s3"
- "camel:jackson"
- "camel:jsonpath"
- "camel:bean"
- "camel:log"
- "camel:direct"
traits:
mount:
configs:
- "secret:rook-ceph-object-user-ceph-objectstore-s3-replicator"
- "secret:aws-s3-credentials"
container:
requestMemory: "256Mi"
requestCPU: "100m"
limitMemory: "512Mi"
limitCPU: "500m"
health:
enabled: true
livenessProbeEnabled: true
readinessProbeEnabled: true
prometheus:
enabled: true
podMonitor: true
flows:
# Main route: Kafka → event routing → replication
- from:
uri: "kafka:ceph-bucket-notifications"
parameters:
brokers: "kafka-rciis-prod-kafka-bootstrap.rciis-prod.svc:9093"
groupId: "s3-replicator-ceph-to-aws"
autoOffsetReset: "earliest"
autoCommitEnable: "true"
steps:
- unmarshal:
json:
library: Jackson
# Loop prevention: check userIdentity.principalId
- setHeader:
name: "CephPrincipalId"
jsonpath:
expression: "$.Records[0].userIdentity.principalId"
suppressExceptions: true
resultType: "java.lang.String"
- choice:
when:
- simple: "${header.CephPrincipalId} == 's3-replicator'"
steps:
- log: "Loop prevention: skipping event written by s3-replicator"
- stop: {}
# Extract event details
- setHeader:
name: "EventName"
jsonpath:
expression: "$.Records[0].eventName"
resultType: "java.lang.String"
- setHeader:
name: "SourceBucket"
jsonpath:
expression: "$.Records[0].s3.bucket.name"
resultType: "java.lang.String"
- setHeader:
name: "ObjectKey"
jsonpath:
expression: "$.Records[0].s3.object.key"
resultType: "java.lang.String"
- log: "Processing event=${header.EventName} bucket=${header.SourceBucket} key=${header.ObjectKey}"
# Route by event type
- choice:
when:
- simple: "${header.EventName} contains 'ObjectCreated'"
steps:
- to: "direct:replicate-to-aws"
- simple: "${header.EventName} contains 'ObjectRemoved'"
steps:
- to: "direct:delete-from-aws"
otherwise:
steps:
- log:
message: "Unknown event type ${header.EventName}, skipping"
loggingLevel: "WARN"
# Sub-route: Download from Ceph → Upload to AWS
- from:
uri: "direct:replicate-to-aws"
steps:
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "getObject"
accessKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/AccessKey}}"
secretKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/SecretKey}}"
region: "us-east-1"
uriEndpointOverride: "http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80"
overrideEndpoint: "true"
forcePathStyle: "true"
autoCreateBucket: "false"
- log: "Downloaded ${header.ObjectKey} from Ceph (${header.CamelAwsS3ContentLength} bytes)"
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "putObject"
accessKey: "{{secret:aws-s3-credentials/access-key}}"
secretKey: "{{secret:aws-s3-credentials/secret-key}}"
region: "af-south-1"
autoCreateBucket: "false"
- log: "Uploaded ${header.ObjectKey} to AWS S3"
# Sub-route: Delete from AWS
- from:
uri: "direct:delete-from-aws"
steps:
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "deleteObject"
accessKey: "{{secret:aws-s3-credentials/access-key}}"
secretKey: "{{secret:aws-s3-credentials/secret-key}}"
region: "af-south-1"
autoCreateBucket: "false"
- log: "Deleted ${header.ObjectKey} from AWS S3"
Direction 2: AWS → Ceph Camel K Integration (AWS cluster)¶
Location: apps/infra/s3-replicator/aws/extra/aws-to-ceph-pipeline.yaml
This integration runs in the AWS Kubernetes cluster. It consumes S3 event notifications from SQS locally, reads the object from S3 locally (no WAN), and uploads it to Ceph RGW on-prem (single WAN hop). The uriEndpointOverride for the Ceph target must point to the externally reachable Ceph RGW endpoint.
apiVersion: camel.apache.org/v1
kind: Integration
metadata:
name: aws-to-ceph-replicator
namespace: s3-replicator
annotations:
camel.apache.org/operator.id: camel-k-operator
labels:
app.kubernetes.io/name: aws-to-ceph-replicator
app.kubernetes.io/component: s3-replication
app.kubernetes.io/part-of: s3-replicator
spec:
dependencies:
- "camel:aws2-sqs"
- "camel:aws2-s3"
- "camel:jackson"
- "camel:jsonpath"
- "camel:bean"
- "camel:log"
- "camel:direct"
traits:
mount:
configs:
- "secret:rook-ceph-object-user-ceph-objectstore-s3-replicator"
- "secret:aws-s3-credentials"
container:
requestMemory: "256Mi"
requestCPU: "100m"
limitMemory: "512Mi"
limitCPU: "500m"
health:
enabled: true
livenessProbeEnabled: true
readinessProbeEnabled: true
prometheus:
enabled: true
podMonitor: true
flows:
# Main route: SQS → event routing → replication
- from:
uri: "aws2-sqs:rciis-s3-replication-events"
parameters:
region: "af-south-1"
accessKey: "{{secret:aws-s3-credentials/access-key}}"
secretKey: "{{secret:aws-s3-credentials/secret-key}}"
waitTimeSeconds: "20"
maxMessagesPerPoll: "10"
deleteAfterRead: "true"
visibilityTimeout: "120"
steps:
- unmarshal:
json:
library: Jackson
# Loop prevention: check userIdentity.principalId
- setHeader:
name: "AwsPrincipalId"
jsonpath:
expression: "$.Records[0].userIdentity.principalId"
suppressExceptions: true
resultType: "java.lang.String"
- choice:
when:
- simple: "${header.AwsPrincipalId} == 'rciis-ceph-s3-replicator'"
steps:
- log: "Loop prevention: skipping event written by rciis-ceph-s3-replicator"
- stop: {}
# Extract event details
- setHeader:
name: "EventName"
jsonpath:
expression: "$.Records[0].eventName"
resultType: "java.lang.String"
- setHeader:
name: "SourceBucket"
jsonpath:
expression: "$.Records[0].s3.bucket.name"
resultType: "java.lang.String"
- setHeader:
name: "ObjectKey"
jsonpath:
expression: "$.Records[0].s3.object.key"
resultType: "java.lang.String"
- log: "Processing event=${header.EventName} bucket=${header.SourceBucket} key=${header.ObjectKey}"
# Route by event type
- choice:
when:
- simple: "${header.EventName} contains 'ObjectCreated'"
steps:
- to: "direct:replicate-to-ceph"
- simple: "${header.EventName} contains 'ObjectRemoved'"
steps:
- to: "direct:delete-from-ceph"
otherwise:
steps:
- log:
message: "Unknown event type ${header.EventName}, skipping"
loggingLevel: "WARN"
# Sub-route: Download from AWS → Upload to Ceph
- from:
uri: "direct:replicate-to-ceph"
steps:
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "getObject"
accessKey: "{{secret:aws-s3-credentials/access-key}}"
secretKey: "{{secret:aws-s3-credentials/secret-key}}"
region: "af-south-1"
autoCreateBucket: "false"
- log: "Downloaded ${header.ObjectKey} from AWS S3 (${header.CamelAwsS3ContentLength} bytes)"
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "putObject"
accessKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/AccessKey}}"
secretKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/SecretKey}}"
region: "us-east-1"
uriEndpointOverride: "http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80"
overrideEndpoint: "true"
forcePathStyle: "true"
autoCreateBucket: "false"
- log: "Uploaded ${header.ObjectKey} to Ceph RGW"
# Sub-route: Delete from Ceph
- from:
uri: "direct:delete-from-ceph"
steps:
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "deleteObject"
accessKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/AccessKey}}"
secretKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/SecretKey}}"
region: "us-east-1"
uriEndpointOverride: "http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80"
overrideEndpoint: "true"
forcePathStyle: "true"
autoCreateBucket: "false"
- log: "Deleted ${header.ObjectKey} from Ceph RGW"
How Loop Prevention Works
Both integrations use the userIdentity.principalId field from the S3 event notification payload to prevent infinite replication loops. This field is set by both Ceph RGW and AWS S3 to identify the user who performed the write.
End-to-end example (AWS → Ceph → stop):
- User uploads object to AWS S3 (writes as their own IAM user)
- S3 event → SQS → aws-to-ceph-replicator integration
- Integration checks principalId — it is the user's IAM ID, not rciis-ceph-s3-replicator → proceeds
- Integration downloads from AWS via aws2-s3:getObject and uploads to Ceph via aws2-s3:putObject (writes as the s3-replicator Ceph user)
- Ceph emits bucket notification → Kafka → ceph-to-aws-replicator integration
- Integration checks principalId — it is s3-replicator → event skipped
Advantages over metadata-based loop prevention:
- Zero extra API calls (no headObject needed)
- No race conditions between metadata check and copy
- Works for both creates and deletes
- Does not modify the object or its metadata
- Information is already present in the event payload
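With both integrations running, the full loop can be smoke-tested from a workstation (a sketch; the external RGW endpoint and the ceph CLI profile are assumptions, and the 30-second wait is arbitrary):

```shell
# 1) Write an object to the Ceph bucket via the external RGW endpoint
echo "replication smoke test" > /tmp/repl-test.txt
aws s3 cp /tmp/repl-test.txt s3://s3-replication-test/repl-test.txt \
  --endpoint-url https://rgw.cluster-a.example.com --profile ceph

# 2) Wait for the event-driven pipeline, then check the AWS replica bucket
sleep 30
aws s3 ls s3://rciis-ceph-replica-af-south-1/repl-test.txt --region af-south-1
```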
Camel K Operator — Both Clusters
The Camel K operator must be installed in both the on-prem and AWS Kubernetes clusters, each watching its local s3-replicator namespace. The camel.apache.org/operator.id annotation binds each Integration to the correct operator instance. The operator automatically builds a container image, creates a Deployment, and manages the pod lifecycle — no manual Deployment, Service, or ConfigMap resources are needed.
AWS → Ceph Network Connectivity
The aws-to-ceph-replicator integration running in the AWS cluster must be able to reach the on-prem Ceph RGW endpoint. The uriEndpointOverride in the Integration CR must point to the externally reachable RGW URL (e.g. https://rgw.example.com), not the cluster-internal service DNS. Ensure connectivity via VPN, network peering, or a TLS-secured ingress.
PrometheusRule¶
Alerts for both Camel K replication integrations. The Camel K operator creates PodMonitors automatically when prometheus.podMonitor: true is set in the Integration traits.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: s3-replicator-alerts
  namespace: s3-replicator
spec:
  groups:
    - name: s3-replicator
      rules:
        - alert: S3ReplicatorCephToAwsDown
          expr: |
            absent(up{namespace="s3-replicator", pod=~"ceph-to-aws-replicator.*"} == 1)
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Ceph → AWS replication integration is down"
        - alert: S3ReplicatorAwsToCephDown
          expr: |
            absent(up{namespace="s3-replicator", pod=~"aws-to-ceph-replicator.*"} == 1)
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "AWS → Ceph replication integration is down"
        - alert: S3ReplicatorHighErrorRate
          expr: |
            sum(rate(camel_exchanges_failed_total{namespace="s3-replicator"}[5m])) > 0.1
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "S3 replication error rate elevated"
        - alert: S3ReplicatorHighLatency
          expr: |
            histogram_quantile(0.95,
              sum(rate(camel_exchange_event_notifier_seconds_bucket{namespace="s3-replicator"}[5m])) by (le, routeId)
            ) > 30
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "S3 replication latency is high"
        - alert: S3ReplicatorKafkaConsumerLag
          expr: |
            kafka_consumergroup_lag{group="s3-replicator-ceph-to-aws", namespace="rciis-prod"} > 1000
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Ceph → AWS replication consumer lag is high"
SOPS-Encrypted Secrets¶
Each cluster needs its own set of SOPS-encrypted secrets.
On-prem cluster — needs AWS credentials (to write to AWS S3):
# apps/infra/secrets/on-prem/s3-replicator/secret-generator.yaml
apiVersion: viaduct.ai/v1
kind: ksops
metadata:
  name: aws-s3-credentials-generator
  annotations:
    config.kubernetes.io/function: |
      exec:
        path: ksops
files:
  - ./aws-s3-credentials.yaml
# apps/infra/secrets/on-prem/s3-replicator/aws-s3-credentials.yaml (before encryption)
apiVersion: v1
kind: Secret
metadata:
  name: aws-s3-credentials
  namespace: s3-replicator
type: Opaque
stringData:
  access-key: "AKIA..."
  secret-key: "..."
  region: "af-south-1"
On the on-prem cluster, Ceph credentials are auto-generated by the CephObjectStoreUser CR as secret rook-ceph-object-user-ceph-objectstore-s3-replicator.
AWS cluster — needs both AWS credentials (to read from S3 and consume SQS) and Ceph RGW credentials (to write to Ceph):
# apps/infra/secrets/aws/s3-replicator/aws-s3-credentials.yaml (before encryption)
apiVersion: v1
kind: Secret
metadata:
  name: aws-s3-credentials
  namespace: s3-replicator
type: Opaque
stringData:
  access-key: "AKIA..."
  secret-key: "..."
  region: "af-south-1"
  sqs-queue-url: "https://sqs.af-south-1.amazonaws.com/ACCOUNT_ID/rciis-s3-replication-events"
# apps/infra/secrets/aws/s3-replicator/ceph-rgw-credentials.yaml (before encryption)
apiVersion: v1
kind: Secret
metadata:
  name: ceph-rgw-credentials
  namespace: s3-replicator
type: Opaque
stringData:
  AccessKey: "..."
  SecretKey: "..."
  endpoint: "https://rgw.example.com"
Note
On the AWS cluster there is no Rook-Ceph operator, so the Ceph credentials cannot be auto-generated. They must be manually created as a SOPS-encrypted secret containing the AccessKey and SecretKey from the on-prem CephObjectStoreUser.
All secrets are mounted into the Camel K integration pods via the mount.configs trait and accessed using {{secret:name/key}} property placeholders in the route URIs.
Ceph multi-site replication is configured entirely via Rook CRDs. No CronJobs, external tools, or application-level replication pods are required — the RGW instances handle replication natively.
Master Cluster (Cluster A)¶
Create the realm, zonegroup, zone, and object store on the master cluster:
# Realm — top-level container for multi-site
apiVersion: ceph.rook.io/v1
kind: CephObjectRealm
metadata:
  name: rciis
  namespace: rook-ceph
---
# Zonegroup — collection of zones that replicate data
apiVersion: ceph.rook.io/v1
kind: CephObjectZoneGroup
metadata:
  name: rciis-sites
  namespace: rook-ceph
spec:
  realm: rciis
---
# Master zone — first zone created becomes master
apiVersion: ceph.rook.io/v1
kind: CephObjectZone
metadata:
  name: site-a
  namespace: rook-ceph
spec:
  zoneGroup: rciis-sites
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
      requireSafeReplicaSize: true
  dataPool:
    failureDomain: host
    replicated:
      size: 3
      requireSafeReplicaSize: true
  customEndpoints:
    - "https://rgw.cluster-a.example.com"
  preservePoolsOnDelete: true
---
# Object store — references the zone
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: multisite-store
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    securePort: 443
    instances: 2
    sslCertificateRef: rgw-tls
  zone:
    name: site-a
Bootstrap Secret¶
After the realm is created on the master cluster, Rook auto-generates a secret rciis-keys containing the system user credentials. Export this secret and apply it to the secondary cluster:
# On the master cluster — export the realm keys
kubectl -n rook-ceph get secret rciis-keys -o yaml > rciis-keys.yaml
# Edit the namespace to match the secondary cluster's Rook namespace
# Then apply on the secondary cluster
kubectl apply -f rciis-keys.yaml
Secondary Cluster (Cluster B)¶
After applying the realm keys secret, create the realm (with pull), zone, and object store:
# Pull the realm configuration from the master cluster's RGW endpoint
apiVersion: ceph.rook.io/v1
kind: CephObjectRealm
metadata:
  name: rciis
  namespace: rook-ceph
spec:
  pull:
    endpoint: "https://rgw.cluster-a.example.com"
---
# Zonegroup — must match the master's zonegroup name
apiVersion: ceph.rook.io/v1
kind: CephObjectZoneGroup
metadata:
  name: rciis-sites
  namespace: rook-ceph
spec:
  realm: rciis
---
# Secondary zone — joins the same zonegroup
apiVersion: ceph.rook.io/v1
kind: CephObjectZone
metadata:
  name: site-b
  namespace: rook-ceph
spec:
  zoneGroup: rciis-sites
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
      requireSafeReplicaSize: true
  dataPool:
    failureDomain: host
    replicated:
      size: 3
      requireSafeReplicaSize: true
  customEndpoints:
    - "https://rgw.cluster-b.example.com"
  preservePoolsOnDelete: true
---
# Object store — references the secondary zone
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: multisite-store
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    securePort: 443
    instances: 2
    sslCertificateRef: rgw-tls
  zone:
    name: site-b
Order of operations
The CRDs must be created in this order: Realm → ZoneGroup → Zone → ObjectStore. On the secondary cluster, the realm keys secret must exist before creating the CephObjectRealm with pull.
customEndpoints
The customEndpoints field must contain the externally reachable RGW URL for each zone. This is how the other zone knows where to send sync traffic. If omitted, Rook defaults to the internal ClusterIP service DNS, which is not reachable from another cluster.
No Kubernetes resources are required. AWS Cross-Region Replication is fully managed by AWS and is configured via the AWS CLI or Console (see Destination Setup).
Deployment¶
The s3-replicator is deployed across two clusters, each with its own FluxCD Kustomization.
On-prem cluster (Ceph → AWS):
# Sync the on-prem s3-replicator application
flux reconcile kustomization rciis-s3-replicator-on-prem
# Verify the integration is running
kubectl get integrations -n s3-replicator
# Expected:
# NAME PHASE KIT REPLICAS
# ceph-to-aws-replicator Running kit-xxxx 1
AWS cluster (AWS → Ceph):
# Sync the AWS s3-replicator application
flux reconcile kustomization rciis-s3-replicator-aws
# Verify the integration is running
kubectl get integrations -n s3-replicator
# Expected:
# NAME PHASE KIT REPLICAS
# aws-to-ceph-replicator Running kit-xxxx 1
Verify pods on each cluster:
# On-prem cluster
kubectl get pods -n s3-replicator
# Expected: ceph-to-aws-replicator-xxx 1/1 Running
# AWS cluster
kubectl get pods -n s3-replicator
# Expected: aws-to-ceph-replicator-xxx 1/1 Running
Tip
The Camel K operator first builds a container image for each Integration (the Building phase), then transitions to Running. The first build takes 1-3 minutes. Subsequent changes to the Integration CR trigger a new build automatically.
Multi-site replication is configured entirely via Rook CRDs in the rook-ceph namespace on each cluster. There is no separate FluxCD Kustomization — the CRDs are part of the Rook-Ceph cluster configuration.
Step 1 — Master cluster (Cluster A):
Apply the realm, zonegroup, zone, and object store CRDs:
Wait for the object store to be ready and the realm keys secret to be generated:
kubectl wait --for=jsonpath='{.status.phase}'=Ready \
cephobjectstore/multisite-store -n rook-ceph --timeout=300s
# Verify the realm keys secret exists
kubectl get secret rciis-keys -n rook-ceph
Step 2 — Export realm keys:
kubectl -n rook-ceph get secret rciis-keys -o yaml > rciis-keys.yaml
# Edit the file to update the namespace if needed, then apply on Cluster B
Step 3 — Secondary cluster (Cluster B):
Apply the realm keys secret first, then the CRDs:
kubectl apply -f rciis-keys.yaml -n rook-ceph
kubectl apply -f realm-pull.yaml -f zonegroup.yaml -f zone-b.yaml -f objectstore-b.yaml -n rook-ceph
Wait for the secondary object store and verify sync is established:
kubectl wait --for=jsonpath='{.status.phase}'=Ready \
cephobjectstore/multisite-store -n rook-ceph --timeout=300s
Step 4 — Verify sync status from either cluster:
# Exec into the Rook toolbox
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin sync status
Expected output shows both zones with `data is caught up with source` when fully synced.
Tip
If the CRDs are managed via FluxCD as part of your Rook-Ceph cluster values, deploy Cluster A first, export the realm keys, apply them to Cluster B, and then deploy Cluster B. The realm keys secret is the only manual step — everything else is declarative.
No Kubernetes deployment needed. Replication is active as soon as the replication configuration is applied to the source bucket via put-bucket-replication (see Destination Setup).
Verify the replication configuration:
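For example, assuming the example source bucket and region used elsewhere in this section:

```shell
# Show the active replication rules on the source bucket
aws s3api get-bucket-replication \
  --bucket rciis-data-af-south-1 \
  --region af-south-1
```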
Testing & Verification¶
Test Ceph → AWS¶
Upload a file to Ceph and verify it appears in AWS within 60 seconds.
# Get Ceph S3 credentials
export AWS_ACCESS_KEY_ID=$(kubectl get secret \
rook-ceph-object-user-ceph-objectstore-s3-replicator \
-n s3-replicator -o jsonpath='{.data.AccessKey}' | base64 -d)
export AWS_SECRET_ACCESS_KEY=$(kubectl get secret \
rook-ceph-object-user-ceph-objectstore-s3-replicator \
-n s3-replicator -o jsonpath='{.data.SecretKey}' | base64 -d)
# Upload a test file to Ceph
echo "ceph-to-aws-test-$(date +%s)" > /tmp/test-ceph-to-aws.txt
aws s3 cp /tmp/test-ceph-to-aws.txt s3://s3-replication-test/test-ceph-to-aws.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
# Wait and verify in AWS (should appear within 60 seconds)
sleep 60
aws s3 ls s3://rciis-ceph-replica-af-south-1/s3-replication-test/test-ceph-to-aws.txt \
--region af-south-1
Test AWS → Ceph¶
Upload a file to AWS and verify it appears in Ceph within 60 seconds.
echo "aws-to-ceph-test-$(date +%s)" > /tmp/test-aws-to-ceph.txt
aws s3 cp /tmp/test-aws-to-ceph.txt \
s3://rciis-ceph-replica-af-south-1/s3-replication-test/test-aws-to-ceph.txt \
--region af-south-1
sleep 60
aws s3 ls s3://s3-replication-test/test-aws-to-ceph.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
Test Loop Prevention¶
Verify that replicated objects are not re-replicated in the opposite direction.
# Upload to Ceph (on-prem cluster), wait for it to arrive in AWS
aws s3 cp /tmp/test-ceph-to-aws.txt s3://s3-replication-test/loop-test.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
sleep 60
# Verify the object exists in AWS
aws s3 ls s3://rciis-ceph-replica-af-south-1/s3-replication-test/loop-test.txt \
--region af-south-1
# On the AWS cluster, check the aws-to-ceph-replicator logs to confirm it skipped the event
# (switch kubectl context to the AWS cluster)
kubectl logs -l camel.apache.org/integration=aws-to-ceph-replicator \
-n s3-replicator --tail=50 | grep "Loop prevention"
# Expected: "Loop prevention: skipping event written by rciis-ceph-s3-replicator"
Test Delete Replication¶
# Delete from Ceph (on-prem cluster), verify deletion in AWS
aws s3 rm s3://s3-replication-test/test-ceph-to-aws.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
sleep 60
aws s3 ls s3://rciis-ceph-replica-af-south-1/s3-replication-test/test-ceph-to-aws.txt \
--region af-south-1
# Expected: no output (object deleted)
Test Failure Recovery¶
# On the on-prem cluster: kill the Ceph→AWS replicator pod
kubectl delete pod -l camel.apache.org/integration=ceph-to-aws-replicator -n s3-replicator
# Upload a file while the pod is down
aws s3 cp /tmp/test-ceph-to-aws.txt s3://s3-replication-test/recovery-test.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
# Wait for the pod to restart (the Camel K operator recreates it)
kubectl wait --for=condition=Ready \
pod -l camel.apache.org/integration=ceph-to-aws-replicator \
-n s3-replicator --timeout=120s
# The Kafka consumer group will resume from the last committed offset.
sleep 60
aws s3 ls s3://rciis-ceph-replica-af-south-1/s3-replication-test/recovery-test.txt \
--region af-south-1
Check Sync Status¶
Verify that both zones are in sync:
# From the Rook toolbox on either cluster
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin sync status
Expected output when healthy:
          realm rciis (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
      zonegroup rciis-sites (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
           zone site-a (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
  metadata sync no sync (zone is master)
      data sync source: site-b (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
                full sync: 0/128 shards
                incremental sync: 128/128 shards
                data is caught up with source
Upload Test Object on Cluster A¶
# Create a user on the multisite object store (Cluster A)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin user create --uid=test-user --display-name="Test User" \
--rgw-zone=site-a
# Get credentials from the output and set environment variables
export AWS_ACCESS_KEY_ID="<access-key-from-output>"
export AWS_SECRET_ACCESS_KEY="<secret-key-from-output>"
# Upload via Cluster A's RGW endpoint
echo "multisite-replication-test-$(date +%s)" > /tmp/test-multisite.txt
aws s3 cp /tmp/test-multisite.txt s3://test-bucket/test-multisite.txt \
--endpoint-url https://rgw.cluster-a.example.com
Verify on Cluster B¶
Objects replicate asynchronously (typically within seconds to minutes):
# List via Cluster B's RGW endpoint (same credentials — users replicate with data)
aws s3 ls s3://test-bucket/ \
--endpoint-url https://rgw.cluster-b.example.com
Test Bidirectional Replication¶
Upload an object via Cluster B and verify it appears on Cluster A:
echo "reverse-replication-test-$(date +%s)" > /tmp/test-reverse.txt
aws s3 cp /tmp/test-reverse.txt s3://test-bucket/test-reverse.txt \
--endpoint-url https://rgw.cluster-b.example.com
# Wait a few seconds, then verify on Cluster A
sleep 30
aws s3 ls s3://test-bucket/test-reverse.txt \
--endpoint-url https://rgw.cluster-a.example.com
Verify Metadata Sync¶
Users, buckets, and ACLs replicate automatically between zones. Verify a user created on Cluster A exists on Cluster B:
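For example, from Cluster B's Rook toolbox:

```shell
# The test-user created on Cluster A should already exist on Cluster B
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin user info --uid=test-user
```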
Operations¶
Monitoring¶
| Alert | Condition | Severity |
|---|---|---|
| `S3ReplicatorCephToAwsDown` | Ceph→AWS integration pod is down | critical |
| `S3ReplicatorAwsToCephDown` | AWS→Ceph integration pod is down | critical |
| `S3ReplicatorHighErrorRate` | Camel exchange failure rate > 0.1/s for 10 min | warning |
| `S3ReplicatorHighLatency` | 95th percentile exchange time > 30s for 10 min | warning |
| `S3ReplicatorKafkaConsumerLag` | Consumer lag > 1000 messages for 15 min | warning |
Each Camel K integration pod exposes Camel Micrometer metrics, which Prometheus scrapes via the PodMonitors the Camel K operator creates automatically. PrometheusRules are deployed in each cluster's s3-replicator namespace.
Scaling¶
The Camel K operator manages the pod lifecycle. To scale, either increase replicas in the Integration CR or create additional Integration CRs with different consumer group IDs for partitioned workloads. Each additional Kafka consumer with the same group ID joins the consumer group and processes a subset of the topic's partitions, so horizontal scaling is straightforward for Direction 1 (on-prem). Direction 2 (SQS, AWS cluster) also scales naturally, since SQS supports multiple concurrent consumers.
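As a sketch, scaling replicas can be done in place with a patch (the Integration CR exposes a `spec.replicas` field; note that Kafka consumers beyond the topic's partition count sit idle):

```shell
# Scale the Ceph → AWS integration to two pods; Kafka rebalances
# the topic's partitions across the consumers in the group
kubectl -n s3-replicator patch integration ceph-to-aws-replicator \
  --type merge -p '{"spec":{"replicas":2}}'
```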
Kafka Consumer Lag (on-prem cluster)¶
Monitor Kafka consumer lag for the s3-replicator-ceph-to-aws consumer group to detect replication backlog:
# On the on-prem cluster
kubectl exec -n rciis-prod kafka-rciis-prod-kraft-dual-role-0 -- \
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--describe --group s3-replicator-ceph-to-aws
SQS Queue Depth (AWS)¶
Check the AWS SQS queue for pending messages:
aws sqs get-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/ACCOUNT_ID/rciis-s3-replication-events" \
--attribute-names ApproximateNumberOfMessages \
--region af-south-1
Integration Status¶
Check the Camel K Integration status on each cluster:
# On-prem cluster
kubectl get integrations -n s3-replicator
kubectl describe integration ceph-to-aws-replicator -n s3-replicator
kubectl get integrationkit -n s3-replicator
# AWS cluster (switch kubectl context)
kubectl get integrations -n s3-replicator
kubectl describe integration aws-to-ceph-replicator -n s3-replicator
kubectl get integrationkit -n s3-replicator
Sync Status¶
Check the sync status from either cluster's Rook toolbox:
# Overall sync status
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin sync status
# Detailed data sync status for a specific source zone
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin data sync status --source-zone=site-b
Key fields in the output:
| Field | Meaning |
|---|---|
| `data is caught up with source` | All data is fully synced |
| `full sync: X/128 shards` | Full (initial) sync progress |
| `incremental sync: X/128 shards` | Incremental (ongoing) sync progress |
| `behind shards` | Number of shards that are behind — indicates replication lag |
Sync Performance Counters¶
# View RGW performance counters related to sync
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
ceph --admin-daemon /var/run/ceph/ceph-client.rgw.*.asok perf dump | \
jq '.data_sync'
Monitoring¶
Ceph RGW multi-site exposes sync metrics via the Ceph MGR Prometheus module. Key metrics:
| Metric | Description |
|---|---|
| `ceph_rgw_sync_status` | Overall sync status per zone |
| `ceph_data_sync_from_*_fetch_bytes_sum` | Total bytes fetched from remote zone |
| `ceph_data_sync_from_*_fetch_bytes_count` | Number of fetch operations from remote zone |
| `ceph_data_sync_from_*_poll_latency_sum` | Latency of data log polling |
These metrics are scraped automatically by the Prometheus instance monitoring the Rook-Ceph cluster.
RGW Instance Scaling¶
The number of RGW instances per zone is controlled by the gateway.instances field in the CephObjectStore CR. Increasing this scales both client request handling and sync throughput:
kubectl -n rook-ceph patch cephobjectstore multisite-store \
--type merge -p '{"spec":{"gateway":{"instances":3}}}'
Sync Throttling¶
If sync traffic is saturating the network link between clusters, configure sync throttling via Ceph config:
# Reduce concurrent data sync operations per shard (default: 16)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  ceph config set client.rgw rgw_data_sync_spawn_window 8
# Reduce concurrent metadata sync operations (default: 16)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  ceph config set client.rgw rgw_meta_sync_spawn_window 8
| Config Key | Default | Description |
|---|---|---|
| `rgw_data_sync_spawn_window` | 16 | Max concurrent sync operations per shard |
| `rgw_meta_sync_spawn_window` | 16 | Max concurrent metadata sync operations |
| `rgw_sync_lease_period` | 120 | Sync lease period in seconds |
S3 Replication Metrics¶
Enable replication metrics on the source bucket by adding Metrics and ReplicationTime to the replication rule:
{
  "Rules": [
    {
      "ID": "replicate-all",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {},
      "Destination": {
        "Bucket": "arn:aws:s3:::rciis-data-eu-west-1",
        "Metrics": {
          "Status": "Enabled",
          "EventThreshold": {
            "Minutes": 15
          }
        },
        "ReplicationTime": {
          "Status": "Enabled",
          "Time": {
            "Minutes": 15
          }
        }
      },
      "DeleteMarkerReplication": {
        "Status": "Enabled"
      }
    }
  ]
}
CloudWatch Metrics¶
With replication metrics enabled, the following CloudWatch metrics are available:
| Metric | Description |
|---|---|
| `ReplicationLatency` | Time to replicate objects to the destination |
| `OperationsPendingReplication` | Number of objects pending replication |
| `BytesPendingReplication` | Total bytes pending replication |
| `OperationsFailedReplication` | Number of objects that failed to replicate |
Troubleshooting¶
Auth Error (Ceph Side)¶
kubectl get cephobjectstoreuser s3-replicator -n s3-replicator -o yaml
kubectl get secret rook-ceph-object-user-ceph-objectstore-s3-replicator -n s3-replicator
kubectl get cephobjectstore ceph-objectstore -n rook-ceph \
-o jsonpath='{.spec.allowUsersInNamespaces}'
Auth Error (AWS Side)¶
kubectl get secret aws-s3-credentials -n s3-replicator
kubectl run -it --rm aws-test --image=amazon/aws-cli --restart=Never \
-n s3-replicator -- s3 ls s3://rciis-ceph-replica-af-south-1/ --region af-south-1
Ceph → AWS Integration Not Replicating (on-prem cluster)¶
# On the on-prem cluster:
# Check integration status
kubectl get integration ceph-to-aws-replicator -n s3-replicator
# Check integration pod logs
kubectl logs -l camel.apache.org/integration=ceph-to-aws-replicator \
-n s3-replicator --tail=100
# Verify Kafka topic has messages
kubectl exec -n rciis-prod kafka-rciis-prod-kraft-dual-role-0 -- \
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic ceph-bucket-notifications --from-beginning --max-messages 5
# Verify CephBucketTopic and CephBucketNotification are healthy
kubectl get cephbuckettopic -n rook-ceph
kubectl get cephbucketnotification -n rook-ceph
AWS → Ceph Integration Not Replicating (AWS cluster)¶
# On the AWS cluster (switch kubectl context):
# Check integration status
kubectl get integration aws-to-ceph-replicator -n s3-replicator
# Check integration pod logs
kubectl logs -l camel.apache.org/integration=aws-to-ceph-replicator \
-n s3-replicator --tail=100
# Verify SQS queue has messages
aws sqs get-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/ACCOUNT_ID/rciis-s3-replication-events" \
--attribute-names ApproximateNumberOfMessages \
--region af-south-1
# Check the DLQ for failed messages
aws sqs get-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/ACCOUNT_ID/rciis-s3-replication-dlq" \
--attribute-names ApproximateNumberOfMessages \
--region af-south-1
Integration Build Failures¶
If an Integration is stuck in Building or shows Error phase:
# Check the integration conditions
kubectl describe integration ceph-to-aws-replicator -n s3-replicator
# Check the IntegrationKit build logs
kubectl get integrationkit -n s3-replicator
kubectl logs -l camel.apache.org/component=operator -n <operator-namespace> --tail=200
# Common causes:
# - Missing Maven dependencies (check spec.dependencies)
# - Secret not found (check mount.configs references)
# - Operator not watching the namespace (check operator.id annotation)
Ceph RGW Endpoint Unreachable¶
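This applies to the AWS → Ceph path. If the `aws-to-ceph-replicator` logs show connection errors, test reachability of the external RGW endpoint from inside the AWS cluster. A minimal check, assuming the example endpoint `https://rgw.example.com` and the `curlimages/curl` image (both placeholders for your environment):

```shell
# Probe the on-prem RGW endpoint from a throwaway pod in the AWS cluster
kubectl run -it --rm rgw-test --image=curlimages/curl --restart=Never \
  -n s3-replicator -- -sv https://rgw.example.com
```

If this fails, verify the VPN or peering link, ingress routing, firewall rules, and DNS resolution, as described under AWS → Ceph Network Connectivity.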
Sync Status Shows "behind shards"¶
If radosgw-admin sync status shows shards that are behind, check the sync error log:
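For example, from the Rook toolbox:

```shell
# List recorded sync errors per shard
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin sync error list
```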
To retry failed sync operations:
# Reset the data sync marker for a specific source zone
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin data sync init --source-zone=site-b
Warning
data sync init resets the sync state and triggers a full resync of data from the source zone. Only use this if incremental sync is consistently failing.
Remote RGW Endpoint Unreachable¶
Sync requires bidirectional HTTP/S connectivity between the RGW instances on each cluster. Test from the Rook toolbox:
# From Cluster A, test connectivity to Cluster B's RGW
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
curl -sv https://rgw.cluster-b.example.com
# From Cluster B, test connectivity to Cluster A's RGW
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
curl -sv https://rgw.cluster-a.example.com
If the endpoint is unreachable, check:
- VPN tunnel or network peering is up
- Ingress controller or LoadBalancer is routing to the RGW service
- Firewall rules allow TCP 443 between clusters
- DNS resolution works from within the Rook toolbox
TLS Certificate Errors¶
If the remote RGW endpoint uses a certificate not trusted by the Ceph RGW sync agent:
# Check the RGW pod logs for TLS errors
kubectl -n rook-ceph logs -l app=rook-ceph-rgw --tail=100 | grep -i "ssl\|tls\|certificate"
To add a custom CA certificate to the RGW trust store, configure the CephObjectStore CR:
spec:
gateway:
sslCertificateRef: rgw-tls
caBundleRef: custom-ca-bundle # ConfigMap with the CA certificate
Realm Keys Secret Issues¶
If the secondary cluster cannot pull the realm configuration:
# Verify the realm keys secret exists on the secondary cluster
kubectl get secret rciis-keys -n rook-ceph
# Verify the secret contains the expected keys
kubectl get secret rciis-keys -n rook-ceph -o jsonpath='{.data}' | jq 'keys'
# Expected: ["access-key", "secret-key"]
If the secret is missing or corrupted, re-export it from the master cluster and re-apply.
Metadata Sync Failures¶
Metadata (users, buckets, ACLs) is synced by the master zone. If metadata sync is failing:
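Check metadata sync from the secondary cluster's toolbox (metadata syncs from the master zone to the secondary):

```shell
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin metadata sync status
```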
Common causes:
- Master zone's RGW endpoint is unreachable from the secondary
- Realm period has not been committed after configuration changes
To force a period update (run on the master cluster):
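For example:

```shell
# Commit the current realm period so configuration changes propagate
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin period update --commit
```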
Object Store Stuck in "Progressing"¶
If the CephObjectStore on either cluster does not reach Ready:
kubectl describe cephobjectstore multisite-store -n rook-ceph
# Check Rook operator logs
kubectl logs -l app=rook-ceph-operator -n rook-ceph --tail=200 | grep -i multisite
Common causes:
- CRDs created out of order (must be: Realm → ZoneGroup → Zone → ObjectStore)
- Realm keys secret missing on the secondary cluster
- Zone name conflicts (each zone must have a unique name)
- `customEndpoints` not set or not reachable
Check Replication Configuration¶
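For example, assuming the example source bucket used elsewhere in this section:

```shell
# Dump the replication configuration attached to the source bucket
aws s3api get-bucket-replication --bucket rciis-data-af-south-1
```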
Check Object Replication Status¶
aws s3api head-object \
--bucket rciis-data-af-south-1 \
--key <object-key> \
--query ReplicationStatus
Possible status values:
| Status | Meaning |
|---|---|
| `COMPLETED` | Object successfully replicated |
| `PENDING` | Replication in progress |
| `FAILED` | Replication failed — check IAM permissions and bucket policy |
| `REPLICA` | This object is itself a replica (on the destination bucket) |
IAM Role Issues¶
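CRR depends on an IAM role that S3 can assume, with read permissions on the source bucket and replicate permissions on the destination. A hedged check, where `s3-crr-role` stands in for whatever role name was created during Destination Setup:

```shell
# Confirm the role exists and that s3.amazonaws.com is allowed to assume it
aws iam get-role --role-name s3-crr-role \
  --query 'Role.AssumeRolePolicyDocument'

# Review the permissions policies attached to the role
aws iam list-attached-role-policies --role-name s3-crr-role
```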
Key Files¶
On-prem cluster:
| File | Description |
|---|---|
| `apps/infra/s3-replicator/on-prem/extra/ceph-to-aws-pipeline.yaml` | Ceph→AWS Camel K Integration CR |
| `apps/infra/s3-replicator/on-prem/extra/ceph-bucket-notification.yaml` | CephBucketTopic + CephBucketNotification CRDs |
| `apps/infra/s3-replicator/on-prem/extra/ceph-s3-user.yaml` | CephObjectStoreUser + ObjectBucketClaim |
| `apps/infra/s3-replicator/on-prem/extra/prometheus-rules.yaml` | PrometheusRule (Camel K metrics, on-prem alerts) |
| `apps/infra/secrets/on-prem/s3-replicator/` | SOPS-encrypted AWS credentials |
AWS cluster:
| File | Description |
|---|---|
| `apps/infra/s3-replicator/aws/extra/aws-to-ceph-pipeline.yaml` | AWS→Ceph Camel K Integration CR |
| `apps/infra/s3-replicator/aws/extra/prometheus-rules.yaml` | PrometheusRule (Camel K metrics, AWS-side alerts) |
| `apps/infra/secrets/aws/s3-replicator/` | SOPS-encrypted Ceph RGW + AWS credentials |
On-Prem ↔ On-Prem (Ceph multi-site):
No manifest files in the s3-replicator namespace. Multi-site replication is configured entirely via Rook CRDs in the rook-ceph namespace on each cluster:
| Resource | Namespace | Description |
|---|---|---|
| `CephObjectRealm/rciis` | `rook-ceph` | Multi-site realm (shared across clusters) |
| `CephObjectZoneGroup/rciis-sites` | `rook-ceph` | Zonegroup containing both zones |
| `CephObjectZone/site-a` | `rook-ceph` | Master zone (Cluster A) |
| `CephObjectZone/site-b` | `rook-ceph` | Secondary zone (Cluster B) |
| `CephObjectStore/multisite-store` | `rook-ceph` | RGW object store referencing the local zone |
| `Secret/rciis-keys` | `rook-ceph` | Auto-generated realm keys (copy to secondary cluster) |