7.1 S3 Object Storage Replication¶
This section covers replicating S3 object storage data between environments for disaster recovery, backup, and data residency compliance. Three replication scenarios are supported depending on where your source and destination buckets are hosted.
Replication Scenarios¶
| Scenario | Source | Destination | Mechanism | Use Case |
|---|---|---|---|---|
| On-Prem ↔ AWS | Ceph RGW (on-prem) | AWS S3 | Apache Camel K (event-driven, bidirectional) | Real-time sync between on-prem and cloud |
| On-Prem ↔ On-Prem | Ceph RGW (Cluster A) | Ceph RGW (Cluster B) | Ceph RGW Multi-Site | Active-active replication between on-prem clusters |
| AWS → AWS | AWS S3 (Region A) | AWS S3 (Region B) | AWS Cross-Region Replication | Cross-region redundancy within AWS |
Which Scenario Do I Need?¶
- Both source and destination are AWS S3?
    - Use the AWS → AWS tab. This is fully managed by AWS with near real-time replication and requires no additional infrastructure.
- Source is on-prem Ceph and destination is AWS S3 (or vice versa)?
    - Use the On-Prem ↔ AWS tab. Two Apache Camel K integrations replicate objects bidirectionally in near real-time using event-driven triggers (Kafka for Ceph events, SQS for AWS events). Each integration uses native S3 components (aws2-s3, aws2-sqs) — no shell commands or external CLIs.
- Both source and destination are on-prem Ceph clusters?
    - Use the On-Prem ↔ On-Prem tab. Ceph RGW multi-site replication provides native, asynchronous, bidirectional replication between two Rook-Ceph clusters using realms, zonegroups, and zones. Requires network connectivity between the RGW endpoints on each cluster.
Note
All three scenarios use event-driven / asynchronous replication — objects are replicated within seconds to minutes of being written. Ceph multi-site uses internal change logs for incremental sync. AWS CRR replicates objects in near real-time as they are written.
Key Differences¶
| Feature | Camel K (On-Prem ↔ AWS) | Ceph Multi-Site (On-Prem ↔ On-Prem) | AWS CRR |
|---|---|---|---|
| Sync model | Event-driven (near real-time) | Async log-based (near real-time) | Near real-time (event-driven) |
| Direction | Bidirectional | Bidirectional (active-active) | Unidirectional (or bidirectional with two rules) |
| Infrastructure | Camel K Integration CRs + Kafka + SQS | Rook-Ceph CRDs (Realm, ZoneGroup, Zone) | Fully managed by AWS |
| RPO | Seconds to minutes | Seconds to minutes | Minutes |
| Initial backfill | Manual (run aws s3 sync before enabling) | Automatic (full sync on zone join) | Requires S3 Batch Replication |
| Monitoring | Camel Micrometer metrics + PodMonitor | radosgw-admin sync status + Ceph metrics | CloudWatch metrics |
| Network | On-prem → AWS S3 egress; AWS → on-prem RGW egress | RGW-to-RGW HTTP/S between clusters | AWS internal |
| Cost | Compute + egress + SQS | Ceph cluster resources + egress bandwidth | S3 replication + storage |
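The manual backfill noted for the Camel K scenario can be staged through a local directory, since a single aws s3 sync call cannot span two different S3 endpoints (a sketch; the external RGW endpoint URL is an assumption — the bucket names match the examples used throughout this section):

```shell
# One-time backfill before enabling the integrations.
# 1) Pull existing objects from Ceph RGW (endpoint URL is an assumption)
aws s3 sync s3://s3-replication-test /tmp/backfill \
  --endpoint-url https://rgw.cluster-a.example.com

# 2) Push the staged objects to the AWS replica bucket
aws s3 sync /tmp/backfill s3://rciis-ceph-replica-af-south-1 \
  --region af-south-1
```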
Architecture¶
Kubernetes Cluster (on-prem) Kubernetes Cluster (AWS af-south-1)
┌─────────────────────────────────────────┐ ┌──────────────────────────────────────┐
│ │ │ │
│ Ceph RGW (ceph-objectstore) │ │ S3 Bucket │
│ │ │ │ rciis-ceph-replica-* │
│ │ bucket notification │ │ │ │
│ ▼ │ │ │ S3 event notification │
│ Kafka (Strimzi) │ │ ▼ │
│ (ceph-bucket-notifications topic) │ │ SQS Queue │
│ │ │ │ (rciis-s3-replication-events) │
│ ▼ │ │ │ │
│ ┌───────────────────────────────┐ │ │ ▼ │
│ │ Camel K Integration │ │ │ ┌───────────────────────────────┐ │
│ │ (ceph-to-aws-replicator) │──────┼── aws2-s3 ──────>│ │ Camel K Integration │ │
│ │ loop: userIdentity check │ │ putObject │ │ (aws-to-ceph-replicator) │ │
│ └───────────────────────────────┘ │ │ │ loop: userIdentity check │ │
│ │ │ └──────────────┬────────────────┘ │
│ │ │ │ │
│ Ceph RGW <───────────────────────────┼── aws2-s3 ───────┼─────────────────┘ │
│ (receives replicated objects) │ putObject │ │
│ │ │ Camel K Operator │
│ Camel K Operator │ │ PrometheusRule (alerts) │
│ PrometheusRule (alerts) │ │ │
└─────────────────────────────────────────┘ └──────────────────────────────────────┘
Each Camel K integration is co-located with its source data to minimize cross-WAN traffic. Data crosses the network exactly once per object per direction.
Direction 1 — Ceph → AWS (integration runs on-prem): Ceph RGW emits bucket notifications on object create/delete. These are published to a Kafka topic via CephBucketTopic and CephBucketNotification CRDs. The ceph-to-aws-replicator Camel K Integration consumes from Kafka locally, downloads the object from Ceph via the aws2-s3 component (local read, no WAN), and uploads it to AWS S3 via a second aws2-s3 producer (single WAN hop). Deletes invoke the deleteObject operation on AWS.
Direction 2 — AWS → Ceph (integration runs in AWS): S3 event notifications on the AWS bucket publish to an SQS queue. The aws-to-ceph-replicator Camel K Integration consumes from SQS locally in the AWS cluster, downloads the object from S3 via aws2-s3 (local read, no WAN), and uploads it to Ceph RGW via aws2-s3 with uriEndpointOverride pointing to the on-prem RGW endpoint (single WAN hop). Deletes invoke deleteObject on Ceph.
Loop Prevention via userIdentity.principalId: Both Ceph RGW and AWS S3 include the writer's identity (userIdentity.principalId) in the event notification payload. Each integration checks this field before processing:
- Ceph → AWS (on-prem): If principalId is s3-replicator (the Ceph user that the AWS-side replicator writes as), the event is skipped — it was a replicated write from AWS.
- AWS → Ceph (AWS): If principalId matches the AWS IAM user rciis-ceph-s3-replicator (the identity that the on-prem replicator writes as), the event is skipped — it was a replicated write from Ceph.
This approach requires zero extra API calls (no headObject), has no race conditions, works for both creates and deletes, and does not modify the object or its metadata.
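A minimal shell sketch of the check, run against a sample notification payload (the real integrations do this with a JSONPath expression inside the Camel K flow; the naive sed extraction is for illustration only):

```shell
# Sample bucket-notification payload, trimmed to the relevant field
EVENT='{"Records":[{"eventName":"ObjectCreated:Put","userIdentity":{"principalId":"s3-replicator"}}]}'

# Extract userIdentity.principalId (naive extraction, illustration only)
principal=$(printf '%s' "$EVENT" | sed -n 's/.*"principalId":"\([^"]*\)".*/\1/p')

# Skip events written by the replicator identity itself
if [ "$principal" = "s3-replicator" ]; then
  decision="skip"     # replicated write arriving from the other side
else
  decision="process"  # original user write — replicate it
fi
echo "$decision"
```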
Cluster A (master zone) Cluster B (secondary zone)
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ │ │ │
│ Rook-Ceph │ │ Rook-Ceph │
│ ┌────────────────────────┐ │ │ ┌────────────────────────┐ │
│ │ CephObjectRealm │ │ │ │ CephObjectRealm │ │
│ │ CephObjectZoneGroup │ │ │ │ (pull from master) │ │
│ │ CephObjectZone: zoneA │ │ async │ │ CephObjectZone: zoneB │ │
│ │ CephObjectStore │ │ log-based │ │ CephObjectStore │ │
│ │ │ │ replication │ │ │ │
│ │ RGW instances ◄─────┼──┼───────────────┼──┼──► RGW instances │ │
│ │ │ │ │ │ │ │
│ └────────────────────────┘ │ │ └────────────────────────┘ │
│ │ │ │
└──────────────────────────────┘ └──────────────────────────────┘
▲ ▲
└──────── Cross-cluster network ───────────────┘
(VPN / peering / ingress)
Ceph RGW multi-site replication is a native Ceph feature that replicates data asynchronously between two (or more) Ceph clusters. Both clusters participate in the same realm and zonegroup, each hosting a separate zone. Replication is bidirectional (active-active) by default — both zones accept reads and writes, with changes propagated via internal change logs.
The first zone created becomes the master zone (handles metadata operations and period updates). The second zone is the secondary zone and pulls the realm configuration from the master's RGW endpoint. After initial setup, data flows bidirectionally.
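Once both zones are configured, replication progress can be checked from the Rook toolbox on either cluster (a sketch; the toolbox deployment name follows Rook's default, and the zone name is assumed to match the CephObjectZone CR):

```shell
# Data and metadata sync state for this zone (run on either cluster)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin sync status --rgw-zone=zoneA

# Confirm both zones appear in the same zonegroup and period
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin period get
```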
AWS Region A (source) AWS Region B (destination)
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ │ │ │
│ S3 Bucket (versioned) │ AWS CRR │ S3 Bucket (versioned) │
│ rciis-data-af-south-1 │ ─────────────>│ rciis-data-eu-west-1 │
│ │ near │ │
│ │ real-time │ │
└──────────────────────────────┘ └──────────────────────────────┘
│
│ IAM Role:
│ s3-replication-role
│ (assumed by S3 service)
AWS S3 handles replication automatically. When an object is written to the source bucket, S3 replicates it to the destination bucket asynchronously (typically within 15 minutes, often faster).
Note
S3 Standard storage class already replicates objects across a minimum of 3 Availability Zones within a single region. CRR is for replicating data to a different region or a different AWS account — you do not need it for AZ-level redundancy.
Prerequisites¶
On-prem Kubernetes cluster (runs ceph-to-aws-replicator):
- Rook-Ceph cluster running with ceph-objectstore healthy
- s3-replicator namespace added to allowUsersInNamespaces in the CephCluster values
- Strimzi Kafka cluster running (e.g. kafka-rciis-prod in the rciis-prod namespace)
- Apache Camel K operator deployed (watches the s3-replicator namespace)
- KSOPS plugin available for SOPS secret decryption
- Prometheus Operator installed (for alerting rules)
- Network egress to AWS S3 (af-south-1)
AWS Kubernetes cluster (runs aws-to-ceph-replicator):
- Kubernetes cluster running in the same region as the S3 bucket (af-south-1)
- Apache Camel K operator deployed (watches the s3-replicator namespace)
- KSOPS plugin available for SOPS secret decryption
- Prometheus Operator installed (for alerting rules)
- Network connectivity to the on-prem Ceph RGW endpoint (VPN, peering, or public ingress with TLS)
AWS managed services:
- S3 bucket created in the target region with versioning and encryption enabled
- SQS queue for receiving S3 event notifications
- SQS dead-letter queue for failed messages
- S3 event notification configuration pointing to the SQS queue
- IAM user with least-privilege access to S3 and SQS
Both clusters:
- Rook-Ceph cluster running with a healthy Ceph cluster
- Rook operator v1.4+ (multi-site support)
Master cluster (Cluster A):
- RGW endpoint reachable from Cluster B (for realm pull and data sync)
- The realm, zonegroup, and master zone CRDs are created here first
Secondary cluster (Cluster B):
- RGW endpoint reachable from Cluster A (for bidirectional sync)
- The realm keys secret from Cluster A must be copied to this cluster before pulling the realm
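Copying the realm keys can be done directly with kubectl (a sketch; the kubeconfig context names and the grep-based metadata stripping are assumptions, and the secret name follows Rook's "<realm-name>-keys" convention — adjust to your actual realm name):

```shell
# Export the realm keys secret from Cluster A and apply it on Cluster B
# before creating the pulled CephObjectRealm there.
kubectl --context cluster-a -n rook-ceph get secret multisite-realm-keys -o yaml \
  | grep -v -E 'resourceVersion|uid|creationTimestamp' \
  | kubectl --context cluster-b -n rook-ceph apply -f -
```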
Network connectivity:
Both clusters' RGW endpoints must be reachable from each other (bidirectional). This is required for the RGW sync agents to transfer data and metadata.
| Method | Description |
|---|---|
| VPN tunnel | Site-to-site VPN between the two cluster networks |
| Network peering | Direct network peering if on the same WAN |
| Ingress / LoadBalancer | Expose RGW on both clusters via an ingress controller or LoadBalancer service with TLS |
Warning
If exposing Ceph RGW over the internet via ingress, always use TLS and restrict access by source IP or mTLS.
Firewall rules — bidirectional connectivity between both clusters:
| Source | Destination | Port | Protocol |
|---|---|---|---|
| Cluster A RGW | Cluster B RGW ingress | 443 | TCP (HTTPS) |
| Cluster B RGW | Cluster A RGW ingress | 443 | TCP (HTTPS) |
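Before creating the zone CRs, reachability can be verified in both directions (hostnames are the example values above; an HTTP 403 from an unauthenticated request still proves the RGW is reachable):

```shell
# From a pod or node in Cluster A, check Cluster B's RGW — and vice versa
curl -sk -o /dev/null -w '%{http_code}\n' https://rgw.cluster-b.example.com
curl -sk -o /dev/null -w '%{http_code}\n' https://rgw.cluster-a.example.com
```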
- Versioning enabled on both source and destination buckets (required by CRR)
- IAM role with permissions for S3 to replicate on your behalf
- Both buckets must exist before configuring replication
Destination Setup¶
1. Create the S3 Bucket¶
aws s3api create-bucket \
--bucket rciis-ceph-replica-af-south-1 \
--region af-south-1 \
--create-bucket-configuration LocationConstraint=af-south-1
# Enable versioning
aws s3api put-bucket-versioning \
--bucket rciis-ceph-replica-af-south-1 \
--versioning-configuration Status=Enabled
# Block public access
aws s3api put-public-access-block \
--bucket rciis-ceph-replica-af-south-1 \
--public-access-block-configuration \
BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
# Enable SSE-S3 encryption
aws s3api put-bucket-encryption \
--bucket rciis-ceph-replica-af-south-1 \
--server-side-encryption-configuration \
'{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
2. Create SQS Queues¶
Create the main event queue and a dead-letter queue for failed messages.
# Create the dead-letter queue first
aws sqs create-queue \
--queue-name rciis-s3-replication-dlq \
--region af-south-1
# Get the DLQ ARN
DLQ_ARN=$(aws sqs get-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/$(aws sts get-caller-identity --query Account --output text)/rciis-s3-replication-dlq" \
--attribute-names QueueArn \
--query 'Attributes.QueueArn' --output text \
--region af-south-1)
# Create the main queue with DLQ redrive policy
aws sqs create-queue \
--queue-name rciis-s3-replication-events \
--region af-south-1 \
--attributes '{
"RedrivePolicy": "{\"deadLetterTargetArn\":\"'"$DLQ_ARN"'\",\"maxReceiveCount\":\"5\"}",
"VisibilityTimeout": "300"
}'
3. Configure S3 Event Notifications → SQS¶
First, set the SQS queue policy to allow S3 to publish to it:
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
QUEUE_ARN="arn:aws:sqs:af-south-1:${ACCOUNT_ID}:rciis-s3-replication-events"
aws sqs set-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/${ACCOUNT_ID}/rciis-s3-replication-events" \
--region af-south-1 \
--attributes '{
"Policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"Service\":\"s3.amazonaws.com\"},\"Action\":\"sqs:SendMessage\",\"Resource\":\"'"${QUEUE_ARN}"'\",\"Condition\":{\"ArnEquals\":{\"aws:SourceArn\":\"arn:aws:s3:::rciis-ceph-replica-af-south-1\"}}}]}"
}'
Then configure S3 to send event notifications to SQS:
aws s3api put-bucket-notification-configuration \
--bucket rciis-ceph-replica-af-south-1 \
--region af-south-1 \
--notification-configuration '{
"QueueConfigurations": [
{
"QueueArn": "'"${QUEUE_ARN}"'",
"Events": [
"s3:ObjectCreated:*",
"s3:ObjectRemoved:*"
]
}
]
}'
4. Create IAM User with S3 + SQS Permissions¶
aws iam create-user --user-name rciis-ceph-s3-replicator
cat > /tmp/s3-replicator-policy.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "S3ReplicationAccess",
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject",
"s3:ListBucket",
"s3:GetBucketLocation"
],
"Resource": [
"arn:aws:s3:::rciis-ceph-replica-af-south-1",
"arn:aws:s3:::rciis-ceph-replica-af-south-1/*"
]
},
{
"Sid": "SQSConsumerAccess",
"Effect": "Allow",
"Action": [
"sqs:ReceiveMessage",
"sqs:DeleteMessage",
"sqs:GetQueueAttributes",
"sqs:GetQueueUrl"
],
"Resource": "arn:aws:sqs:af-south-1:*:rciis-s3-replication-events"
},
{
"Sid": "SQSDLQAccess",
"Effect": "Allow",
"Action": [
"sqs:SendMessage"
],
"Resource": "arn:aws:sqs:af-south-1:*:rciis-s3-replication-dlq"
}
]
}
EOF
aws iam put-user-policy \
--user-name rciis-ceph-s3-replicator \
--policy-name S3ReplicatorAccess \
--policy-document file:///tmp/s3-replicator-policy.json
5. Generate Access Key¶
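The access key for the IAM user created above can be generated as follows (the secret access key is shown only once, so capture it securely — it becomes the SOPS-encrypted aws-s3-credentials secret used by the integrations):

```shell
# Generate the access key pair for the replicator IAM user
aws iam create-access-key \
  --user-name rciis-ceph-s3-replicator \
  --query 'AccessKey.[AccessKeyId,SecretAccessKey]' \
  --output text
```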
Ceph multi-site does not require separate "destination setup" — the realm, zonegroup, and zone configuration on both clusters handles replication automatically. See the Kubernetes Manifests section for the full CRD configuration.
Both clusters need their RGW endpoints exposed externally. On each cluster, create an ingress or LoadBalancer for the RGW service:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ceph-rgw
namespace: rook-ceph
annotations:
nginx.ingress.kubernetes.io/proxy-body-size: "0"
nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
spec:
tls:
- hosts:
- rgw.cluster-a.example.com # or rgw.cluster-b.example.com
secretName: rgw-tls
rules:
- host: rgw.cluster-a.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: rook-ceph-rgw-multisite-store
port:
number: 443
Tip
Set proxy-body-size: "0" to allow unlimited upload size through the ingress. The default limit is typically too small for object storage.
1. Create Source and Destination Buckets¶
# Source bucket (af-south-1)
aws s3api create-bucket \
--bucket rciis-data-af-south-1 \
--region af-south-1 \
--create-bucket-configuration LocationConstraint=af-south-1
# Destination bucket (eu-west-1)
aws s3api create-bucket \
--bucket rciis-data-eu-west-1 \
--region eu-west-1 \
--create-bucket-configuration LocationConstraint=eu-west-1
2. Enable Versioning on Both Buckets¶
aws s3api put-bucket-versioning \
--bucket rciis-data-af-south-1 \
--versioning-configuration Status=Enabled
aws s3api put-bucket-versioning \
--bucket rciis-data-eu-west-1 \
--versioning-configuration Status=Enabled
3. Create the IAM Replication Role¶
# Trust policy — allows S3 to assume this role
cat > /tmp/s3-replication-trust.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "s3.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
aws iam create-role \
--role-name s3-replication-role \
--assume-role-policy-document file:///tmp/s3-replication-trust.json
# Permissions policy — read from source, write to destination
cat > /tmp/s3-replication-policy.json << 'EOF'
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetReplicationConfiguration",
"s3:ListBucket"
],
"Resource": "arn:aws:s3:::rciis-data-af-south-1"
},
{
"Effect": "Allow",
"Action": [
"s3:GetObjectVersionForReplication",
"s3:GetObjectVersionAcl",
"s3:GetObjectVersionTagging"
],
"Resource": "arn:aws:s3:::rciis-data-af-south-1/*"
},
{
"Effect": "Allow",
"Action": [
"s3:ReplicateObject",
"s3:ReplicateDelete",
"s3:ReplicateTags"
],
"Resource": "arn:aws:s3:::rciis-data-eu-west-1/*"
}
]
}
EOF
aws iam put-role-policy \
--role-name s3-replication-role \
--policy-name S3ReplicationPolicy \
--policy-document file:///tmp/s3-replication-policy.json
4. Configure Replication on the Source Bucket¶
# Get your AWS account ID
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
cat > /tmp/replication-config.json << EOF
{
"Role": "arn:aws:iam::${ACCOUNT_ID}:role/s3-replication-role",
"Rules": [
{
"ID": "replicate-all",
"Status": "Enabled",
"Priority": 1,
"Filter": {},
"Destination": {
"Bucket": "arn:aws:s3:::rciis-data-eu-west-1",
"StorageClass": "STANDARD"
},
"DeleteMarkerReplication": {
"Status": "Enabled"
}
}
]
}
EOF
aws s3api put-bucket-replication \
--bucket rciis-data-af-south-1 \
--replication-configuration file:///tmp/replication-config.json
Tip
The Filter: {} setting replicates all objects. To replicate only a subset, add a Prefix or Tag filter.
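Once the rule is active, per-object progress can be checked with head-object (the key below is a placeholder). ReplicationStatus is PENDING, COMPLETED, or FAILED on the source object, and REPLICA on the destination copy:

```shell
# Check replication status of a single object on the source bucket
aws s3api head-object \
  --bucket rciis-data-af-south-1 \
  --key path/to/object \
  --region af-south-1 \
  --query ReplicationStatus \
  --output text
```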
Cross-Account Replication¶
If the source and destination buckets are in different AWS accounts, add a bucket policy on the destination to allow the source account's replication role to write:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::<SOURCE_ACCOUNT_ID>:role/s3-replication-role"
},
"Action": [
"s3:ReplicateObject",
"s3:ReplicateDelete",
"s3:ReplicateTags",
"s3:ObjectOwnerOverrideToBucketOwner"
],
"Resource": "arn:aws:s3:::rciis-data-eu-west-1/*"
}
]
}
Add AccessControlTranslation and Account to the replication rule destination to transfer object ownership:
"Destination": {
"Bucket": "arn:aws:s3:::rciis-data-eu-west-1",
"Account": "<DEST_ACCOUNT_ID>",
"AccessControlTranslation": {
"Owner": "Destination"
}
}
Pre-Existing Objects¶
Warning
CRR only replicates objects created after the replication rule is enabled. Pre-existing objects are not replicated automatically.
To replicate pre-existing objects, use S3 Batch Replication:
aws s3control create-job \
--account-id $ACCOUNT_ID \
--operation '{"S3ReplicateObject":{}}' \
--manifest-generator '{"S3JobManifestGenerator":{"ExpectedBucketOwner":"'$ACCOUNT_ID'","SourceS3BucketArn":"arn:aws:s3:::rciis-data-af-south-1","EnableManifestOutput":false,"Filter":{"EligibleForReplication":true}}}' \
--report '{"Enabled":true,"Bucket":"arn:aws:s3:::rciis-data-af-south-1","Format":"Report_CSV_20180820","ReportScope":"AllTasks","Prefix":"batch-replication-report"}' \
--priority 1 \
--role-arn arn:aws:iam::${ACCOUNT_ID}:role/s3-replication-role \
--confirmation-required \
--region af-south-1
Ceph Bucket Notification Setup¶
Note
This section applies only to the On-Prem ↔ AWS scenario. Skip this if you are using On-Prem → On-Prem or AWS → AWS replication.
Ceph RGW bucket notifications push object events to the Strimzi Kafka cluster. Two Rook CRs are required.
CephBucketTopic¶
Creates a Kafka-backed notification topic on the RGW:
apiVersion: ceph.rook.io/v1
kind: CephBucketTopic
metadata:
name: s3-replication-topic
namespace: rook-ceph
spec:
objectStoreName: ceph-objectstore
objectStoreNamespace: rook-ceph
endpoint:
kafka:
uri: kafka-rciis-prod-kafka-bootstrap.rciis-prod.svc:9093
useSSL: false
ackLevel: broker
CephBucketNotification¶
Attaches the topic to the replication bucket so that create and delete events are published:
apiVersion: ceph.rook.io/v1
kind: CephBucketNotification
metadata:
name: s3-replication-notification
namespace: rook-ceph
spec:
topic: s3-replication-topic
events:
- s3:ObjectCreated:*
- s3:ObjectRemoved:*
Note
The CephBucketNotification must be in the same namespace as the CephBucketTopic. The notification is automatically attached to all buckets associated with the object store. To limit it to specific buckets, add a filter field with prefix/suffix/regex rules.
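For example, a sketch of restricting the notification to one key prefix (field names follow Rook's CephBucketNotification filter schema; the prefix value is an assumption):

```yaml
spec:
  topic: s3-replication-topic
  events:
    - s3:ObjectCreated:*
    - s3:ObjectRemoved:*
  filter:
    keyFilters:
      - name: prefix
        value: replicated/
```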
Kubernetes Manifests¶
CephObjectStoreUser¶
Creates a Rook-managed S3 user whose credentials are auto-generated as a Kubernetes secret.
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
name: s3-replicator
namespace: s3-replicator
spec:
store: ceph-objectstore
clusterNamespace: rook-ceph
displayName: s3-replicator
The auto-generated secret rook-ceph-object-user-ceph-objectstore-s3-replicator contains AccessKey and SecretKey fields.
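If needed for debugging, the credentials can be decoded with kubectl (the Integration mounts the secret directly, so this is not required for normal operation):

```shell
# Decode the auto-generated Ceph S3 access key
kubectl -n s3-replicator get secret \
  rook-ceph-object-user-ceph-objectstore-s3-replicator \
  -o jsonpath='{.data.AccessKey}' | base64 -d; echo
```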
ObjectBucketClaim¶
Dynamically provisions a bucket on Ceph for the replication source data.
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
name: s3-replication-test
namespace: s3-replicator
spec:
bucketName: s3-replication-test
storageClassName: ceph-bucket
Direction 1: Ceph → AWS Camel K Integration (on-prem cluster)¶
Location: apps/infra/s3-replicator/on-prem/extra/ceph-to-aws-pipeline.yaml
This integration runs in the on-prem Kubernetes cluster. It consumes bucket notifications from the local Kafka cluster, reads the object from Ceph locally (no WAN), and uploads it to AWS S3 (single WAN hop). The Camel K operator builds and deploys the integration pod automatically.
apiVersion: camel.apache.org/v1
kind: Integration
metadata:
name: ceph-to-aws-replicator
namespace: s3-replicator
annotations:
camel.apache.org/operator.id: camel-k-operator
labels:
app.kubernetes.io/name: ceph-to-aws-replicator
app.kubernetes.io/component: s3-replication
app.kubernetes.io/part-of: s3-replicator
spec:
dependencies:
- "camel:kafka"
- "camel:aws2-s3"
- "camel:jackson"
- "camel:jsonpath"
- "camel:bean"
- "camel:log"
- "camel:direct"
traits:
mount:
configs:
- "secret:rook-ceph-object-user-ceph-objectstore-s3-replicator"
- "secret:aws-s3-credentials"
container:
requestMemory: "256Mi"
requestCPU: "100m"
limitMemory: "512Mi"
limitCPU: "500m"
health:
enabled: true
livenessProbeEnabled: true
readinessProbeEnabled: true
prometheus:
enabled: true
podMonitor: true
flows:
# Main route: Kafka → event routing → replication
- from:
uri: "kafka:ceph-bucket-notifications"
parameters:
brokers: "kafka-rciis-prod-kafka-bootstrap.rciis-prod.svc:9093"
groupId: "s3-replicator-ceph-to-aws"
autoOffsetReset: "earliest"
autoCommitEnable: "true"
steps:
- unmarshal:
json:
library: Jackson
# Loop prevention: check userIdentity.principalId
- setHeader:
name: "CephPrincipalId"
jsonpath:
expression: "$.Records[0].userIdentity.principalId"
suppressExceptions: true
resultType: "java.lang.String"
- choice:
when:
- simple: "${header.CephPrincipalId} == 's3-replicator'"
steps:
- log: "Loop prevention: skipping event written by s3-replicator"
- stop: {}
# Extract event details
- setHeader:
name: "EventName"
jsonpath:
expression: "$.Records[0].eventName"
resultType: "java.lang.String"
- setHeader:
name: "SourceBucket"
jsonpath:
expression: "$.Records[0].s3.bucket.name"
resultType: "java.lang.String"
- setHeader:
name: "ObjectKey"
jsonpath:
expression: "$.Records[0].s3.object.key"
resultType: "java.lang.String"
- log: "Processing event=${header.EventName} bucket=${header.SourceBucket} key=${header.ObjectKey}"
# Route by event type
- choice:
when:
- simple: "${header.EventName} contains 'ObjectCreated'"
steps:
- to: "direct:replicate-to-aws"
- simple: "${header.EventName} contains 'ObjectRemoved'"
steps:
- to: "direct:delete-from-aws"
otherwise:
steps:
- log:
message: "Unknown event type ${header.EventName}, skipping"
loggingLevel: "WARN"
# Sub-route: Download from Ceph → Upload to AWS
- from:
uri: "direct:replicate-to-aws"
steps:
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "getObject"
accessKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/AccessKey}}"
secretKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/SecretKey}}"
region: "us-east-1"
uriEndpointOverride: "http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80"
overrideEndpoint: "true"
forcePathStyle: "true"
autoCreateBucket: "false"
- log: "Downloaded ${header.ObjectKey} from Ceph (${header.CamelAwsS3ContentLength} bytes)"
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "putObject"
accessKey: "{{secret:aws-s3-credentials/access-key}}"
secretKey: "{{secret:aws-s3-credentials/secret-key}}"
region: "af-south-1"
autoCreateBucket: "false"
- log: "Uploaded ${header.ObjectKey} to AWS S3"
# Sub-route: Delete from AWS
- from:
uri: "direct:delete-from-aws"
steps:
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "deleteObject"
accessKey: "{{secret:aws-s3-credentials/access-key}}"
secretKey: "{{secret:aws-s3-credentials/secret-key}}"
region: "af-south-1"
autoCreateBucket: "false"
- log: "Deleted ${header.ObjectKey} from AWS S3"
Direction 2: AWS → Ceph Camel K Integration (AWS cluster)¶
Location: apps/infra/s3-replicator/aws/extra/aws-to-ceph-pipeline.yaml
This integration runs in the AWS Kubernetes cluster. It consumes S3 event notifications from SQS locally, reads the object from S3 locally (no WAN), and uploads it to Ceph RGW on-prem (single WAN hop). The uriEndpointOverride for the Ceph target must point to the externally reachable Ceph RGW endpoint.
apiVersion: camel.apache.org/v1
kind: Integration
metadata:
name: aws-to-ceph-replicator
namespace: s3-replicator
annotations:
camel.apache.org/operator.id: camel-k-operator
labels:
app.kubernetes.io/name: aws-to-ceph-replicator
app.kubernetes.io/component: s3-replication
app.kubernetes.io/part-of: s3-replicator
spec:
dependencies:
- "camel:aws2-sqs"
- "camel:aws2-s3"
- "camel:jackson"
- "camel:jsonpath"
- "camel:bean"
- "camel:log"
- "camel:direct"
traits:
mount:
configs:
- "secret:rook-ceph-object-user-ceph-objectstore-s3-replicator"
- "secret:aws-s3-credentials"
container:
requestMemory: "256Mi"
requestCPU: "100m"
limitMemory: "512Mi"
limitCPU: "500m"
health:
enabled: true
livenessProbeEnabled: true
readinessProbeEnabled: true
prometheus:
enabled: true
podMonitor: true
flows:
# Main route: SQS → event routing → replication
- from:
uri: "aws2-sqs:rciis-s3-replication-events"
parameters:
region: "af-south-1"
accessKey: "{{secret:aws-s3-credentials/access-key}}"
secretKey: "{{secret:aws-s3-credentials/secret-key}}"
waitTimeSeconds: "20"
maxMessagesPerPoll: "10"
deleteAfterRead: "true"
visibilityTimeout: "120"
steps:
- unmarshal:
json:
library: Jackson
# Loop prevention: check userIdentity.principalId
- setHeader:
name: "AwsPrincipalId"
jsonpath:
expression: "$.Records[0].userIdentity.principalId"
suppressExceptions: true
resultType: "java.lang.String"
- choice:
when:
- simple: "${header.AwsPrincipalId} == 'rciis-ceph-s3-replicator'"
steps:
- log: "Loop prevention: skipping event written by rciis-ceph-s3-replicator"
- stop: {}
# Extract event details
- setHeader:
name: "EventName"
jsonpath:
expression: "$.Records[0].eventName"
resultType: "java.lang.String"
- setHeader:
name: "SourceBucket"
jsonpath:
expression: "$.Records[0].s3.bucket.name"
resultType: "java.lang.String"
- setHeader:
name: "ObjectKey"
jsonpath:
expression: "$.Records[0].s3.object.key"
resultType: "java.lang.String"
- log: "Processing event=${header.EventName} bucket=${header.SourceBucket} key=${header.ObjectKey}"
# Route by event type
- choice:
when:
- simple: "${header.EventName} contains 'ObjectCreated'"
steps:
- to: "direct:replicate-to-ceph"
- simple: "${header.EventName} contains 'ObjectRemoved'"
steps:
- to: "direct:delete-from-ceph"
otherwise:
steps:
- log:
message: "Unknown event type ${header.EventName}, skipping"
loggingLevel: "WARN"
# Sub-route: Download from AWS → Upload to Ceph
- from:
uri: "direct:replicate-to-ceph"
steps:
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "getObject"
accessKey: "{{secret:aws-s3-credentials/access-key}}"
secretKey: "{{secret:aws-s3-credentials/secret-key}}"
region: "af-south-1"
autoCreateBucket: "false"
- log: "Downloaded ${header.ObjectKey} from AWS S3 (${header.CamelAwsS3ContentLength} bytes)"
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "putObject"
accessKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/AccessKey}}"
secretKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/SecretKey}}"
region: "us-east-1"
uriEndpointOverride: "http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80"
overrideEndpoint: "true"
forcePathStyle: "true"
autoCreateBucket: "false"
- log: "Uploaded ${header.ObjectKey} to Ceph RGW"
# Sub-route: Delete from Ceph
- from:
uri: "direct:delete-from-ceph"
steps:
- setHeader:
name: "CamelAwsS3Key"
simple: "${header.ObjectKey}"
- setHeader:
name: "CamelAwsS3BucketName"
simple: "${header.SourceBucket}"
- to:
uri: "aws2-s3://ignored"
parameters:
operation: "deleteObject"
accessKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/AccessKey}}"
secretKey: "{{secret:rook-ceph-object-user-ceph-objectstore-s3-replicator/SecretKey}}"
region: "us-east-1"
uriEndpointOverride: "http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80"
overrideEndpoint: "true"
forcePathStyle: "true"
autoCreateBucket: "false"
- log: "Deleted ${header.ObjectKey} from Ceph RGW"
How Loop Prevention Works
Both integrations use the userIdentity.principalId field from the S3 event notification payload to prevent infinite replication loops. This field is set by both Ceph RGW and AWS S3 to identify the user who performed the write.
End-to-end example (AWS → Ceph → stop):
- User uploads object to AWS S3 (writes as their own IAM user)
- S3 event → SQS → aws-to-ceph-replicator integration
- Integration checks principalId — it is the user's IAM ID, not rciis-ceph-s3-replicator → proceeds
- Integration downloads from AWS via aws2-s3:getObject and uploads to Ceph via aws2-s3:putObject (writes as the s3-replicator Ceph user)
- Ceph emits bucket notification → Kafka → ceph-to-aws-replicator integration
- Integration checks principalId — it is s3-replicator → event skipped
Advantages over metadata-based loop prevention:
- Zero extra API calls (no headObject needed)
- No race conditions between metadata check and copy
- Works for both creates and deletes
- Does not modify the object or its metadata
- Information is already present in the event payload
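With both integrations running, the full loop can be smoke-tested from a workstation (a sketch; the external RGW endpoint and the ceph CLI profile are assumptions, and the 30-second wait is arbitrary):

```shell
# 1) Write an object to the Ceph bucket via the external RGW endpoint
echo "replication smoke test" > /tmp/repl-test.txt
aws s3 cp /tmp/repl-test.txt s3://s3-replication-test/repl-test.txt \
  --endpoint-url https://rgw.cluster-a.example.com --profile ceph

# 2) Wait for the event-driven pipeline, then check the AWS replica bucket
sleep 30
aws s3 ls s3://rciis-ceph-replica-af-south-1/repl-test.txt --region af-south-1
```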
Camel K Operator — Both Clusters
The Camel K operator must be installed in both the on-prem and AWS Kubernetes clusters, each watching its local s3-replicator namespace. The camel.apache.org/operator.id annotation binds each Integration to the correct operator instance. The operator automatically builds a container image, creates a Deployment, and manages the pod lifecycle — no manual Deployment, Service, or ConfigMap resources are needed.
AWS → Ceph Network Connectivity
The aws-to-ceph-replicator integration running in the AWS cluster must be able to reach the on-prem Ceph RGW endpoint. The uriEndpointOverride in the Integration CR must point to the externally reachable RGW URL (e.g. https://rgw.example.com), not the cluster-internal service DNS. Ensure connectivity via VPN, network peering, or a TLS-secured ingress.
PrometheusRule¶
Alerts for both Camel K replication integrations. The Camel K operator creates PodMonitors automatically when prometheus.podMonitor: true is set in the Integration traits.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: s3-replicator-alerts
  namespace: s3-replicator
spec:
  groups:
    - name: s3-replicator
      rules:
        - alert: S3ReplicatorCephToAwsDown
          expr: |
            absent(up{namespace="s3-replicator", pod=~"ceph-to-aws-replicator.*"} == 1)
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Ceph → AWS replication integration is down"
        - alert: S3ReplicatorAwsToCephDown
          expr: |
            absent(up{namespace="s3-replicator", pod=~"aws-to-ceph-replicator.*"} == 1)
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "AWS → Ceph replication integration is down"
        - alert: S3ReplicatorHighErrorRate
          expr: |
            sum(rate(camel_exchanges_failed_total{namespace="s3-replicator"}[5m])) > 0.1
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "S3 replication error rate elevated"
        - alert: S3ReplicatorHighLatency
          expr: |
            histogram_quantile(0.95,
              sum(rate(camel_exchange_event_notifier_seconds_bucket{namespace="s3-replicator"}[5m])) by (le, routeId)
            ) > 30
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "S3 replication latency is high"
        - alert: S3ReplicatorKafkaConsumerLag
          expr: |
            kafka_consumergroup_lag{group="s3-replicator-ceph-to-aws", namespace="rciis-prod"} > 1000
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "Ceph → AWS replication consumer lag is high"
SOPS-Encrypted Secrets¶
Each cluster needs its own set of SOPS-encrypted secrets.
On-prem cluster — needs AWS credentials (to write to AWS S3):
# apps/infra/secrets/on-prem/s3-replicator/secret-generator.yaml
apiVersion: viaduct.ai/v1
kind: ksops
metadata:
  name: aws-s3-credentials-generator
  annotations:
    config.kubernetes.io/function: |
      exec:
        path: ksops
files:
  - ./aws-s3-credentials.yaml
# apps/infra/secrets/on-prem/s3-replicator/aws-s3-credentials.yaml (before encryption)
apiVersion: v1
kind: Secret
metadata:
  name: aws-s3-credentials
  namespace: s3-replicator
type: Opaque
stringData:
  access-key: "AKIA..."
  secret-key: "..."
  region: "af-south-1"
On the on-prem cluster, Ceph credentials are auto-generated by the CephObjectStoreUser CR as secret rook-ceph-object-user-ceph-objectstore-s3-replicator.
AWS cluster — needs both AWS credentials (to read from S3 and consume SQS) and Ceph RGW credentials (to write to Ceph):
# apps/infra/secrets/aws/s3-replicator/aws-s3-credentials.yaml (before encryption)
apiVersion: v1
kind: Secret
metadata:
  name: aws-s3-credentials
  namespace: s3-replicator
type: Opaque
stringData:
  access-key: "AKIA..."
  secret-key: "..."
  region: "af-south-1"
  sqs-queue-url: "https://sqs.af-south-1.amazonaws.com/ACCOUNT_ID/rciis-s3-replication-events"
# apps/infra/secrets/aws/s3-replicator/ceph-rgw-credentials.yaml (before encryption)
apiVersion: v1
kind: Secret
metadata:
  name: ceph-rgw-credentials
  namespace: s3-replicator
type: Opaque
stringData:
  AccessKey: "..."
  SecretKey: "..."
  endpoint: "https://rgw.example.com"
Note
On the AWS cluster there is no Rook-Ceph operator, so the Ceph credentials cannot be auto-generated. They must be manually created as a SOPS-encrypted secret containing the AccessKey and SecretKey from the on-prem CephObjectStoreUser.
All secrets are mounted into the Camel K integration pods via the mount.configs trait and accessed using {{secret:name/key}} property placeholders in the route URIs.
Ceph multi-site replication is configured entirely via Rook CRDs. No CronJobs, external tools, or application-level replication pods are required — the RGW instances handle replication natively.
Master Cluster (Cluster A)¶
Create the realm, zonegroup, zone, and object store on the master cluster:
# Realm — top-level container for multi-site
apiVersion: ceph.rook.io/v1
kind: CephObjectRealm
metadata:
  name: rciis
  namespace: rook-ceph
---
# Zonegroup — collection of zones that replicate data
apiVersion: ceph.rook.io/v1
kind: CephObjectZoneGroup
metadata:
  name: rciis-sites
  namespace: rook-ceph
spec:
  realm: rciis
---
# Master zone — first zone created becomes master
apiVersion: ceph.rook.io/v1
kind: CephObjectZone
metadata:
  name: site-a
  namespace: rook-ceph
spec:
  zoneGroup: rciis-sites
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
      requireSafeReplicaSize: true
  dataPool:
    failureDomain: host
    replicated:
      size: 3
      requireSafeReplicaSize: true
  customEndpoints:
    - "https://rgw.cluster-a.example.com"
  preservePoolsOnDelete: true
---
# Object store — references the zone
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: multisite-store
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    securePort: 443
    instances: 2
    sslCertificateRef: rgw-tls
  zone:
    name: site-a
Bootstrap Secret¶
After the realm is created on the master cluster, Rook auto-generates a secret rciis-keys containing the system user credentials. Export this secret and apply it to the secondary cluster:
# On the master cluster — export the realm keys
kubectl -n rook-ceph get secret rciis-keys -o yaml > rciis-keys.yaml
# Edit the namespace to match the secondary cluster's Rook namespace
# Then apply on the secondary cluster
kubectl apply -f rciis-keys.yaml
Secondary Cluster (Cluster B)¶
After applying the realm keys secret, create the realm (with pull), zone, and object store:
# Pull the realm configuration from the master cluster's RGW endpoint
apiVersion: ceph.rook.io/v1
kind: CephObjectRealm
metadata:
  name: rciis
  namespace: rook-ceph
spec:
  pull:
    endpoint: "https://rgw.cluster-a.example.com"
---
# Zonegroup — must match the master's zonegroup name
apiVersion: ceph.rook.io/v1
kind: CephObjectZoneGroup
metadata:
  name: rciis-sites
  namespace: rook-ceph
spec:
  realm: rciis
---
# Secondary zone — joins the same zonegroup
apiVersion: ceph.rook.io/v1
kind: CephObjectZone
metadata:
  name: site-b
  namespace: rook-ceph
spec:
  zoneGroup: rciis-sites
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
      requireSafeReplicaSize: true
  dataPool:
    failureDomain: host
    replicated:
      size: 3
      requireSafeReplicaSize: true
  customEndpoints:
    - "https://rgw.cluster-b.example.com"
  preservePoolsOnDelete: true
---
# Object store — references the secondary zone
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: multisite-store
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    securePort: 443
    instances: 2
    sslCertificateRef: rgw-tls
  zone:
    name: site-b
Order of operations
The CRDs must be created in this order: Realm → ZoneGroup → Zone → ObjectStore. On the secondary cluster, the realm keys secret must exist before creating the CephObjectRealm with pull.
customEndpoints
The customEndpoints field must contain the externally reachable RGW URL for each zone. This is how the other zone knows where to send sync traffic. If omitted, Rook defaults to the internal ClusterIP service DNS, which is not reachable from another cluster.
No Kubernetes resources are required. AWS Cross-Region Replication is fully managed by AWS and is configured via the AWS CLI or Console (see Destination Setup).
Deployment¶
The s3-replicator is deployed across two clusters, each with its own FluxCD Kustomization.
On-prem cluster (Ceph → AWS):
# Sync the on-prem s3-replicator application
flux reconcile kustomization rciis-s3-replicator-on-prem
# Verify the integration is running
kubectl get integrations -n s3-replicator
# Expected:
# NAME PHASE KIT REPLICAS
# ceph-to-aws-replicator Running kit-xxxx 1
AWS cluster (AWS → Ceph):
# Sync the AWS s3-replicator application
flux reconcile kustomization rciis-s3-replicator-aws
# Verify the integration is running
kubectl get integrations -n s3-replicator
# Expected:
# NAME PHASE KIT REPLICAS
# aws-to-ceph-replicator Running kit-xxxx 1
Verify pods on each cluster:
# On-prem cluster
kubectl get pods -n s3-replicator
# Expected: ceph-to-aws-replicator-xxx 1/1 Running
# AWS cluster
kubectl get pods -n s3-replicator
# Expected: aws-to-ceph-replicator-xxx 1/1 Running
Tip
The Camel K operator first builds a container image for each Integration (the Building phase), then transitions to Running. The first build takes 1-3 minutes. Subsequent changes to the Integration CR trigger a new build automatically.
Multi-site replication is configured entirely via Rook CRDs in the rook-ceph namespace on each cluster. There is no separate FluxCD Kustomization — the CRDs are part of the Rook-Ceph cluster configuration.
Step 1 — Master cluster (Cluster A):
Apply the realm, zonegroup, zone, and object store CRDs:
Wait for the object store to be ready and the realm keys secret to be generated:
kubectl wait --for=jsonpath='{.status.phase}'=Ready \
cephobjectstore/multisite-store -n rook-ceph --timeout=300s
# Verify the realm keys secret exists
kubectl get secret rciis-keys -n rook-ceph
Step 2 — Export realm keys:
kubectl -n rook-ceph get secret rciis-keys -o yaml > rciis-keys.yaml
# Edit the file to update the namespace if needed, then apply on Cluster B
Step 3 — Secondary cluster (Cluster B):
Apply the realm keys secret first, then the CRDs:
kubectl apply -f rciis-keys.yaml -n rook-ceph
kubectl apply -f realm-pull.yaml -f zonegroup.yaml -f zone-b.yaml -f objectstore-b.yaml -n rook-ceph
Wait for the secondary object store and verify sync is established:
kubectl wait --for=jsonpath='{.status.phase}'=Ready \
cephobjectstore/multisite-store -n rook-ceph --timeout=300s
Step 4 — Verify sync status from either cluster:
# Exec into the Rook toolbox
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin sync status
Expected output shows both zones with `data is caught up with source` when fully synced.
Tip
If the CRDs are managed via FluxCD as part of your Rook-Ceph cluster values, deploy Cluster A first, export the realm keys, apply them to Cluster B, and then deploy Cluster B. The realm keys secret is the only manual step — everything else is declarative.
No Kubernetes deployment needed. Replication is active as soon as the replication configuration is applied to the source bucket via put-bucket-replication (see Destination Setup).
Verify the replication configuration:
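For example, assuming the example source bucket and region used elsewhere in this section:

```shell
# Show the active replication rules on the source bucket
aws s3api get-bucket-replication \
  --bucket rciis-data-af-south-1 \
  --region af-south-1
```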
Testing & Verification¶
Test Ceph → AWS¶
Upload a file to Ceph and verify it appears in AWS within 60 seconds.
# Get Ceph S3 credentials
export AWS_ACCESS_KEY_ID=$(kubectl get secret \
rook-ceph-object-user-ceph-objectstore-s3-replicator \
-n s3-replicator -o jsonpath='{.data.AccessKey}' | base64 -d)
export AWS_SECRET_ACCESS_KEY=$(kubectl get secret \
rook-ceph-object-user-ceph-objectstore-s3-replicator \
-n s3-replicator -o jsonpath='{.data.SecretKey}' | base64 -d)
# Upload a test file to Ceph
echo "ceph-to-aws-test-$(date +%s)" > /tmp/test-ceph-to-aws.txt
aws s3 cp /tmp/test-ceph-to-aws.txt s3://s3-replication-test/test-ceph-to-aws.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
# Wait and verify in AWS (should appear within 60 seconds)
sleep 60
aws s3 ls s3://rciis-ceph-replica-af-south-1/s3-replication-test/test-ceph-to-aws.txt \
--region af-south-1
Test AWS → Ceph¶
Upload a file to AWS and verify it appears in Ceph within 60 seconds.
echo "aws-to-ceph-test-$(date +%s)" > /tmp/test-aws-to-ceph.txt
aws s3 cp /tmp/test-aws-to-ceph.txt \
s3://rciis-ceph-replica-af-south-1/s3-replication-test/test-aws-to-ceph.txt \
--region af-south-1
sleep 60
aws s3 ls s3://s3-replication-test/test-aws-to-ceph.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
Test Loop Prevention¶
Verify that replicated objects are not re-replicated in the opposite direction.
# Upload to Ceph (on-prem cluster), wait for it to arrive in AWS
aws s3 cp /tmp/test-ceph-to-aws.txt s3://s3-replication-test/loop-test.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
sleep 60
# Verify the object exists in AWS
aws s3 ls s3://rciis-ceph-replica-af-south-1/s3-replication-test/loop-test.txt \
--region af-south-1
# On the AWS cluster, check the aws-to-ceph-replicator logs to confirm it skipped the event
# (switch kubectl context to the AWS cluster)
kubectl logs -l camel.apache.org/integration=aws-to-ceph-replicator \
-n s3-replicator --tail=50 | grep "Loop prevention"
# Expected: "Loop prevention: skipping event written by rciis-ceph-s3-replicator"
Test Delete Replication¶
# Delete from Ceph (on-prem cluster), verify deletion in AWS
aws s3 rm s3://s3-replication-test/test-ceph-to-aws.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
sleep 60
aws s3 ls s3://rciis-ceph-replica-af-south-1/s3-replication-test/test-ceph-to-aws.txt \
--region af-south-1
# Expected: no output (object deleted)
Test Failure Recovery¶
# On the on-prem cluster: kill the Ceph→AWS replicator pod
kubectl delete pod -l camel.apache.org/integration=ceph-to-aws-replicator -n s3-replicator
# Upload a file while the pod is down
aws s3 cp /tmp/test-ceph-to-aws.txt s3://s3-replication-test/recovery-test.txt \
--endpoint-url http://rook-ceph-rgw-ceph-objectstore.rook-ceph.svc.cluster.local:80
# Wait for the pod to restart (the Camel K operator recreates it)
kubectl wait --for=condition=Ready \
pod -l camel.apache.org/integration=ceph-to-aws-replicator \
-n s3-replicator --timeout=120s
# The Kafka consumer group will resume from the last committed offset.
sleep 60
aws s3 ls s3://rciis-ceph-replica-af-south-1/s3-replication-test/recovery-test.txt \
--region af-south-1
Check Sync Status¶
Verify that both zones are in sync:
# From the Rook toolbox on either cluster
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin sync status
Expected output when healthy:
          realm rciis (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
      zonegroup rciis-sites (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
           zone site-a (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
  metadata sync no sync (zone is master)
      data sync source: site-b (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
                full sync: 0/128 shards
                incremental sync: 128/128 shards
                data is caught up with source
Upload Test Object on Cluster A¶
# Create a user on the multisite object store (Cluster A)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin user create --uid=test-user --display-name="Test User" \
--rgw-zone=site-a
# Get credentials from the output and set environment variables
export AWS_ACCESS_KEY_ID="<access-key-from-output>"
export AWS_SECRET_ACCESS_KEY="<secret-key-from-output>"
# Upload via Cluster A's RGW endpoint
echo "multisite-replication-test-$(date +%s)" > /tmp/test-multisite.txt
aws s3 cp /tmp/test-multisite.txt s3://test-bucket/test-multisite.txt \
--endpoint-url https://rgw.cluster-a.example.com
Verify on Cluster B¶
Objects replicate asynchronously (typically within seconds to minutes):
# List via Cluster B's RGW endpoint (same credentials — users replicate with data)
aws s3 ls s3://test-bucket/ \
--endpoint-url https://rgw.cluster-b.example.com
Test Bidirectional Replication¶
Upload an object via Cluster B and verify it appears on Cluster A:
echo "reverse-replication-test-$(date +%s)" > /tmp/test-reverse.txt
aws s3 cp /tmp/test-reverse.txt s3://test-bucket/test-reverse.txt \
--endpoint-url https://rgw.cluster-b.example.com
# Wait a few seconds, then verify on Cluster A
sleep 30
aws s3 ls s3://test-bucket/test-reverse.txt \
--endpoint-url https://rgw.cluster-a.example.com
Verify Metadata Sync¶
Users, buckets, and ACLs replicate automatically between zones. Verify a user created on Cluster A exists on Cluster B:
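For example, from Cluster B's Rook toolbox:

```shell
# The test-user created on Cluster A should already exist on Cluster B
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin user info --uid=test-user
```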
Operations¶
Monitoring¶
| Alert | Condition | Severity |
|---|---|---|
| `S3ReplicatorCephToAwsDown` | Ceph→AWS integration pod is down | critical |
| `S3ReplicatorAwsToCephDown` | AWS→Ceph integration pod is down | critical |
| `S3ReplicatorHighErrorRate` | Camel exchange failure rate > 0.1/s for 10 min | warning |
| `S3ReplicatorHighLatency` | 95th percentile exchange time > 30s for 10 min | warning |
| `S3ReplicatorKafkaConsumerLag` | Consumer lag > 1000 messages for 15 min | warning |
Each Camel K integration pod exposes Camel Micrometer metrics, which Prometheus scrapes via the PodMonitors the Camel K operator creates automatically. PrometheusRules are deployed in each cluster's s3-replicator namespace.
Scaling¶
The Camel K operator manages the pod lifecycle. To scale, either increase replicas in the Integration CR or create additional Integration CRs with different consumer group IDs for partitioned workloads. Each additional Kafka consumer with the same group ID joins the consumer group and processes a subset of the topic's partitions, so horizontal scaling is straightforward for Direction 1 (on-prem). Direction 2 (SQS, AWS cluster) also scales naturally, since SQS supports multiple concurrent consumers.
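As a sketch, scaling replicas can be done in place with a patch (the Integration CR exposes a `spec.replicas` field; note that Kafka consumers beyond the topic's partition count sit idle):

```shell
# Scale the Ceph → AWS integration to two pods; Kafka rebalances
# the topic's partitions across the consumers in the group
kubectl -n s3-replicator patch integration ceph-to-aws-replicator \
  --type merge -p '{"spec":{"replicas":2}}'
```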
Kafka Consumer Lag (on-prem cluster)¶
Monitor Kafka consumer lag for the s3-replicator-ceph-to-aws consumer group to detect replication backlog:
# On the on-prem cluster
kubectl exec -n rciis-prod kafka-rciis-prod-kraft-dual-role-0 -- \
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
--describe --group s3-replicator-ceph-to-aws
SQS Queue Depth (AWS)¶
Check the AWS SQS queue for pending messages:
aws sqs get-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/ACCOUNT_ID/rciis-s3-replication-events" \
--attribute-names ApproximateNumberOfMessages \
--region af-south-1
Integration Status¶
Check the Camel K Integration status on each cluster:
# On-prem cluster
kubectl get integrations -n s3-replicator
kubectl describe integration ceph-to-aws-replicator -n s3-replicator
kubectl get integrationkit -n s3-replicator
# AWS cluster (switch kubectl context)
kubectl get integrations -n s3-replicator
kubectl describe integration aws-to-ceph-replicator -n s3-replicator
kubectl get integrationkit -n s3-replicator
Sync Status¶
Check the sync status from either cluster's Rook toolbox:
# Overall sync status
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin sync status
# Detailed data sync status for a specific source zone
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin data sync status --source-zone=site-b
Key fields in the output:
| Field | Meaning |
|---|---|
| `data is caught up with source` | All data is fully synced |
| `full sync: X/128 shards` | Full (initial) sync progress |
| `incremental sync: X/128 shards` | Incremental (ongoing) sync progress |
| `behind shards` | Number of shards that are behind — indicates replication lag |
Sync Performance Counters¶
# View RGW performance counters related to sync
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
ceph --admin-daemon /var/run/ceph/ceph-client.rgw.*.asok perf dump | \
jq '.data_sync'
Monitoring¶
Ceph RGW multi-site exposes sync metrics via the Ceph MGR Prometheus module. Key metrics:
| Metric | Description |
|---|---|
| `ceph_rgw_sync_status` | Overall sync status per zone |
| `ceph_data_sync_from_*_fetch_bytes_sum` | Total bytes fetched from remote zone |
| `ceph_data_sync_from_*_fetch_bytes_count` | Number of fetch operations from remote zone |
| `ceph_data_sync_from_*_poll_latency_sum` | Latency of data log polling |
These metrics are scraped automatically by the Prometheus instance monitoring the Rook-Ceph cluster.
RGW Instance Scaling¶
The number of RGW instances per zone is controlled by the gateway.instances field in the CephObjectStore CR. Increasing this scales both client request handling and sync throughput:
kubectl -n rook-ceph patch cephobjectstore multisite-store \
--type merge -p '{"spec":{"gateway":{"instances":3}}}'
Sync Throttling¶
If sync traffic is saturating the network link between clusters, configure sync throttling via Ceph config:
# Reduce concurrent data sync operations per shard (default: 16)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  ceph config set client.rgw rgw_data_sync_spawn_window 8
# Reduce concurrent metadata sync operations (default: 16)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  ceph config set client.rgw rgw_meta_sync_spawn_window 8
| Config Key | Default | Description |
|---|---|---|
| `rgw_data_sync_spawn_window` | 16 | Max concurrent sync operations per shard |
| `rgw_meta_sync_spawn_window` | 16 | Max concurrent metadata sync operations |
| `rgw_sync_lease_period` | 120 | Sync lease period in seconds |
S3 Replication Metrics¶
Enable replication metrics on the source bucket by adding Metrics and ReplicationTime to the replication rule:
{
  "Rules": [
    {
      "ID": "replicate-all",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {},
      "Destination": {
        "Bucket": "arn:aws:s3:::rciis-data-eu-west-1",
        "Metrics": {
          "Status": "Enabled",
          "EventThreshold": {
            "Minutes": 15
          }
        },
        "ReplicationTime": {
          "Status": "Enabled",
          "Time": {
            "Minutes": 15
          }
        }
      },
      "DeleteMarkerReplication": {
        "Status": "Enabled"
      }
    }
  ]
}
CloudWatch Metrics¶
With replication metrics enabled, the following CloudWatch metrics are available:
| Metric | Description |
|---|---|
| `ReplicationLatency` | Time to replicate objects to the destination |
| `OperationsPendingReplication` | Number of objects pending replication |
| `BytesPendingReplication` | Total bytes pending replication |
| `OperationsFailedReplication` | Number of objects that failed to replicate |
Troubleshooting¶
Auth Error (Ceph Side)¶
kubectl get cephobjectstoreuser s3-replicator -n s3-replicator -o yaml
kubectl get secret rook-ceph-object-user-ceph-objectstore-s3-replicator -n s3-replicator
kubectl get cephobjectstore ceph-objectstore -n rook-ceph \
-o jsonpath='{.spec.allowUsersInNamespaces}'
Auth Error (AWS Side)¶
kubectl get secret aws-s3-credentials -n s3-replicator
kubectl run -it --rm aws-test --image=amazon/aws-cli --restart=Never \
-n s3-replicator -- s3 ls s3://rciis-ceph-replica-af-south-1/ --region af-south-1
Ceph → AWS Integration Not Replicating (on-prem cluster)¶
# On the on-prem cluster:
# Check integration status
kubectl get integration ceph-to-aws-replicator -n s3-replicator
# Check integration pod logs
kubectl logs -l camel.apache.org/integration=ceph-to-aws-replicator \
-n s3-replicator --tail=100
# Verify Kafka topic has messages
kubectl exec -n rciis-prod kafka-rciis-prod-kraft-dual-role-0 -- \
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic ceph-bucket-notifications --from-beginning --max-messages 5
# Verify CephBucketTopic and CephBucketNotification are healthy
kubectl get cephbuckettopic -n rook-ceph
kubectl get cephbucketnotification -n rook-ceph
AWS → Ceph Integration Not Replicating (AWS cluster)¶
# On the AWS cluster (switch kubectl context):
# Check integration status
kubectl get integration aws-to-ceph-replicator -n s3-replicator
# Check integration pod logs
kubectl logs -l camel.apache.org/integration=aws-to-ceph-replicator \
-n s3-replicator --tail=100
# Verify SQS queue has messages
aws sqs get-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/ACCOUNT_ID/rciis-s3-replication-events" \
--attribute-names ApproximateNumberOfMessages \
--region af-south-1
# Check the DLQ for failed messages
aws sqs get-queue-attributes \
--queue-url "https://sqs.af-south-1.amazonaws.com/ACCOUNT_ID/rciis-s3-replication-dlq" \
--attribute-names ApproximateNumberOfMessages \
--region af-south-1
Integration Build Failures¶
If an Integration is stuck in Building or shows Error phase:
# Check the integration conditions
kubectl describe integration ceph-to-aws-replicator -n s3-replicator
# Check the IntegrationKit build logs
kubectl get integrationkit -n s3-replicator
kubectl logs -l camel.apache.org/component=operator -n <operator-namespace> --tail=200
# Common causes:
# - Missing Maven dependencies (check spec.dependencies)
# - Secret not found (check mount.configs references)
# - Operator not watching the namespace (check operator.id annotation)
Ceph RGW Endpoint Unreachable¶
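This applies to the AWS → Ceph path. If the `aws-to-ceph-replicator` logs show connection errors, test reachability of the external RGW endpoint from inside the AWS cluster. A minimal check, assuming the example endpoint `https://rgw.example.com` and the `curlimages/curl` image (both placeholders for your environment):

```shell
# Probe the on-prem RGW endpoint from a throwaway pod in the AWS cluster
kubectl run -it --rm rgw-test --image=curlimages/curl --restart=Never \
  -n s3-replicator -- -sv https://rgw.example.com
```

If this fails, verify the VPN or peering link, ingress routing, firewall rules, and DNS resolution, as described under AWS → Ceph Network Connectivity.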
Sync Status Shows "behind shards"¶
If radosgw-admin sync status shows shards that are behind, check the sync error log:
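For example, from the Rook toolbox:

```shell
# List recorded sync errors per shard
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin sync error list
```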
To retry failed sync operations:
# Reset the data sync marker for a specific source zone
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
radosgw-admin data sync init --source-zone=site-b
Warning
data sync init resets the sync state and triggers a full resync of data from the source zone. Only use this if incremental sync is consistently failing.
Remote RGW Endpoint Unreachable¶
Sync requires bidirectional HTTP/S connectivity between the RGW instances on each cluster. Test from the Rook toolbox:
# From Cluster A, test connectivity to Cluster B's RGW
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
curl -sv https://rgw.cluster-b.example.com
# From Cluster B, test connectivity to Cluster A's RGW
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
curl -sv https://rgw.cluster-a.example.com
If the endpoint is unreachable, check:
- VPN tunnel or network peering is up
- Ingress controller or LoadBalancer is routing to the RGW service
- Firewall rules allow TCP 443 between clusters
- DNS resolution works from within the Rook toolbox
TLS Certificate Errors¶
If the remote RGW endpoint uses a certificate not trusted by the Ceph RGW sync agent:
# Check the RGW pod logs for TLS errors
kubectl -n rook-ceph logs -l app=rook-ceph-rgw --tail=100 | grep -i "ssl\|tls\|certificate"
To add a custom CA certificate to the RGW trust store, configure the CephObjectStore CR:
spec:
gateway:
sslCertificateRef: rgw-tls
caBundleRef: custom-ca-bundle # ConfigMap with the CA certificate
Realm Keys Secret Issues¶
If the secondary cluster cannot pull the realm configuration:
# Verify the realm keys secret exists on the secondary cluster
kubectl get secret rciis-keys -n rook-ceph
# Verify the secret contains the expected keys
kubectl get secret rciis-keys -n rook-ceph -o jsonpath='{.data}' | jq 'keys'
# Expected: ["access-key", "secret-key"]
If the secret is missing or corrupted, re-export it from the master cluster and re-apply.
Metadata Sync Failures¶
Metadata (users, buckets, ACLs) is synced by the master zone. If metadata sync is failing:
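Check metadata sync from the secondary cluster's toolbox (metadata syncs from the master zone to the secondary):

```shell
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin metadata sync status
```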
Common causes:
- Master zone's RGW endpoint is unreachable from the secondary
- Realm period has not been committed after configuration changes
To force a period update (run on the master cluster):
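For example:

```shell
# Commit the current realm period so configuration changes propagate
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  radosgw-admin period update --commit
```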
Object Store Stuck in "Progressing"¶
If the CephObjectStore on either cluster does not reach Ready:
kubectl describe cephobjectstore multisite-store -n rook-ceph
# Check Rook operator logs
kubectl logs -l app=rook-ceph-operator -n rook-ceph --tail=200 | grep -i multisite
Common causes:
- CRDs created out of order (must be: Realm → ZoneGroup → Zone → ObjectStore)
- Realm keys secret missing on the secondary cluster
- Zone name conflicts (each zone must have a unique name)
- `customEndpoints` not set or not reachable
Check Replication Configuration¶
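For example, assuming the example source bucket used elsewhere in this section:

```shell
# Dump the replication configuration attached to the source bucket
aws s3api get-bucket-replication --bucket rciis-data-af-south-1
```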
Check Object Replication Status¶
aws s3api head-object \
--bucket rciis-data-af-south-1 \
--key <object-key> \
--query ReplicationStatus
Possible status values:
| Status | Meaning |
|---|---|
| `COMPLETED` | Object successfully replicated |
| `PENDING` | Replication in progress |
| `FAILED` | Replication failed — check IAM permissions and bucket policy |
| `REPLICA` | This object is itself a replica (on the destination bucket) |
IAM Role Issues¶
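CRR depends on an IAM role that S3 can assume, with read permissions on the source bucket and replicate permissions on the destination. A hedged check, where `s3-crr-role` stands in for whatever role name was created during Destination Setup:

```shell
# Confirm the role exists and that s3.amazonaws.com is allowed to assume it
aws iam get-role --role-name s3-crr-role \
  --query 'Role.AssumeRolePolicyDocument'

# Review the permissions policies attached to the role
aws iam list-attached-role-policies --role-name s3-crr-role
```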
Key Files¶
On-prem cluster:
| File | Description |
|---|---|
| `apps/infra/s3-replicator/on-prem/extra/ceph-to-aws-pipeline.yaml` | Ceph→AWS Camel K Integration CR |
| `apps/infra/s3-replicator/on-prem/extra/ceph-bucket-notification.yaml` | CephBucketTopic + CephBucketNotification CRDs |
| `apps/infra/s3-replicator/on-prem/extra/ceph-s3-user.yaml` | CephObjectStoreUser + ObjectBucketClaim |
| `apps/infra/s3-replicator/on-prem/extra/prometheus-rules.yaml` | PrometheusRule (Camel K metrics, on-prem alerts) |
| `apps/infra/secrets/on-prem/s3-replicator/` | SOPS-encrypted AWS credentials |
AWS cluster:
| File | Description |
|---|---|
| `apps/infra/s3-replicator/aws/extra/aws-to-ceph-pipeline.yaml` | AWS→Ceph Camel K Integration CR |
| `apps/infra/s3-replicator/aws/extra/prometheus-rules.yaml` | PrometheusRule (Camel K metrics, AWS-side alerts) |
| `apps/infra/secrets/aws/s3-replicator/` | SOPS-encrypted Ceph RGW + AWS credentials |
On-Prem ↔ On-Prem (Ceph multi-site):
No manifest files in the s3-replicator namespace. Multi-site replication is configured entirely via Rook CRDs in the rook-ceph namespace on each cluster:
| Resource | Namespace | Description |
|---|---|---|
| `CephObjectRealm/rciis` | `rook-ceph` | Multi-site realm (shared across clusters) |
| `CephObjectZoneGroup/rciis-sites` | `rook-ceph` | Zonegroup containing both zones |
| `CephObjectZone/site-a` | `rook-ceph` | Master zone (Cluster A) |
| `CephObjectZone/site-b` | `rook-ceph` | Secondary zone (Cluster B) |
| `CephObjectStore/multisite-store` | `rook-ceph` | RGW object store referencing the local zone |
| `Secret/rciis-keys` | `rook-ceph` | Auto-generated realm keys (copy to secondary cluster) |