5.3.2 Data Services¶
Database and messaging operators manage the lifecycle of stateful data workloads. These operators handle provisioning, failover, backup, and upgrades of PostgreSQL clusters and Apache Kafka clusters respectively.
How to use this page
Each component has an Install section showing the Flux HelmRelease, a Configuration section with Helm values, and a Verify section to confirm it is working.
All code blocks are labelled with their file path in the repository. Select your target environment (AWS or Bare Metal) in any tab group — the choice syncs across the entire page.
- **Using the existing `rciis-devops` repository:** All files already exist. Skip the `mkdir` and `git add`/`git commit` commands — they are for users building a new repository. Simply review the files, edit values for your environment, and push.
- **Building a new repository from scratch:** Follow the `mkdir`, file creation, and `git` commands in order.
- **No Git access:** Expand the "Alternative: Helm CLI" block under each Install section.
CloudNativePG¶
CloudNativePG is the Kubernetes operator for PostgreSQL. It manages the full lifecycle of PostgreSQL clusters including automated failover, continuous backup, rolling updates, and connection pooling. All PostgreSQL instances in the RCIIS platform (Grafana, Keycloak, application databases) are managed by this operator.
Install¶
The base HelmRelease tells Flux which chart to install. This file is shared across all environments — environment-specific settings are applied via patches (shown in the Configuration section).
Create the base directory and file:
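A minimal sketch of the command, assuming the `flux/infra/base` path used for the file locations on this page:

```shell
# Create the shared base directory for infrastructure HelmReleases
mkdir -p flux/infra/base
```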
| Field | Value | Explanation |
|---|---|---|
| `chart` | `cloudnative-pg` | The Helm chart name from the CloudNativePG registry |
| `version` | `0.27.0` | Pinned chart version — update this to upgrade CloudNativePG |
| `sourceRef.name` | `cnpg` | References a HelmRepository CR pointing to https://cloudnative-pg.github.io/charts |
| `targetNamespace` | `cnpg-system` | Namespace where the CloudNativePG operator runs |
| `crds: CreateReplace` | — | Automatically installs and updates CloudNativePG CRDs |
| `remediation.retries` | `3` | Flux retries up to 3 times if the install or upgrade fails |
Save the following as flux/infra/base/cloudnative-pg.yaml:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: cloudnative-pg
  namespace: flux-system
spec:
  targetNamespace: cnpg-system
  interval: 30m
  chart:
    spec:
      chart: cloudnative-pg
      version: "0.27.0"
      sourceRef:
        kind: HelmRepository
        name: cnpg
        namespace: flux-system
  releaseName: cloudnative-pg
  install:
    createNamespace: true
    crds: CreateReplace
    remediation:
      retries: 3
  upgrade:
    crds: CreateReplace
    remediation:
      retries: 3
  values:
    replicaCount: 2
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        nodeTaintsPolicy: Honor
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: cloudnative-pg
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 100m
        memory: 128Mi
    monitoring:
      podMonitorEnabled: true
      podMonitorAdditionalLabels:
        release: prometheus
    logLevel: info
    webhook:
      mutating:
        create: true
      validating:
        create: true
    securityContext:
      runAsNonRoot: true
      seccompProfile:
        type: RuntimeDefault
    config:
      create: true
      data:
        INHERITED_ANNOTATIONS: "cert-manager.io/*"
        INHERITED_LABELS: "app.kubernetes.io/*"
```
Alternative: Helm CLI
If you do not have Git access, install CloudNativePG directly:
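A sketch of the equivalent Helm CLI install, using the chart repository URL, pinned version, and namespace from the table above:

```shell
# Add the CloudNativePG chart repository and install the pinned chart version
helm repo add cnpg https://cloudnative-pg.github.io/charts
helm repo update
helm upgrade --install cloudnative-pg cnpg/cloudnative-pg \
  --version 0.27.0 \
  --namespace cnpg-system \
  --create-namespace
```

Note that this bypasses Flux: later changes in Git will not be applied automatically to a CLI-managed release.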
Configuration¶
The environment patch overrides the base HelmRelease with cluster-specific settings. The values file controls how CloudNativePG behaves. Select your environment below.
Create the environment overlay directory:
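For example (the `cloudnative-pg` overlay directory names are assumptions, inferred from the `../../base/cloudnative-pg.yaml` references in the environment kustomizations):

```shell
# One overlay directory per environment; directory names are assumed
mkdir -p flux/infra/aws/cloudnative-pg flux/infra/proxmox/cloudnative-pg
```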
Environment Patch¶
The patch file sets resource limits and replica counts appropriate for each environment. AWS reduces replicas and resources for cost optimization. Bare Metal uses the base defaults.
Save the following as the patch file for your environment:
On AWS, CloudNativePG resources are reduced to optimize cloud costs while maintaining operator functionality.
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: cloudnative-pg
spec:
  values:
    replicaCount: 1
    topologySpreadConstraints: []
    resources:
      requests:
        cpu: 25m
        memory: 64Mi
      limits:
        cpu: 250m
        memory: 256Mi
```
| Setting | Value | Why |
|---|---|---|
| `replicaCount` | `1` | Single operator instance reduces AWS costs |
| `topologySpreadConstraints` | `[]` | Clears topology spread — not needed for a single replica |
| `resources.requests` | 25m / 64Mi | Minimal resource footprint for AWS |
| `resources.limits` | 250m / 256Mi | Caps resource usage for cost control |
On Bare Metal, CloudNativePG uses the base configuration with full HA capabilities.
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base/cloudnative-pg.yaml
```
No environment patch needed. The base HelmRelease provides 2 operator replicas with topology spread constraints and full monitoring enabled.
Helm Values¶
The base HelmRelease already includes comprehensive Helm values. If you need to customize further for your environment, reference these key settings:
| Setting | HA (Default) | Non-HA | Why |
|---|---|---|---|
| `replicaCount` | `2` | `1` | Multiple vs. single operator instances |
| `topologySpreadConstraints` | Enabled | Disabled | Spreads operator pods across nodes |
| `monitoring.podMonitorEnabled` | `true` | `false` | Prometheus metrics for observability |
| `webhook.mutating/validating.create` | `true` | `true` | CRD validation webhooks (recommended) |
Commit and Deploy¶
Once all files are in place, commit and push to trigger Flux deployment:
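A sketch of the Git commands (the commit message is illustrative):

```shell
git add flux/infra
git commit -m "Add CloudNativePG operator"  # message is illustrative
git push
```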
Flux will detect the new commit and begin deploying CloudNativePG. To trigger an immediate sync instead of waiting for the next poll interval:
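Assuming the default GitRepository source name `flux-system`:

```shell
# Pull the latest Git revision immediately instead of waiting for the poll interval
flux reconcile source git flux-system -n flux-system
```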
Verify¶
Creating PostgreSQL clusters
This Verify section checks the operator deployment only; it does not create any PostgreSQL clusters.
Individual PostgreSQL instances are created as Cluster CRs in application namespaces.
See Identity Management for the Keycloak
database example, or the Grafana PostgreSQL instance deployed alongside Prometheus.
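To confirm the operator itself is healthy, a sketch of the usual checks (the namespace comes from `targetNamespace` above; the CRD group follows CloudNativePG defaults):

```shell
# Operator pods should be Running in the target namespace
kubectl get pods -n cnpg-system
# CloudNativePG CRDs (Cluster, Backup, etc.) should be registered
kubectl get crds | grep cnpg.io
```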
Flux Operations¶
This component is managed by Flux as HelmRelease cloudnative-pg and Kustomization infra-cloudnative-pg.
Check whether the HelmRelease and Kustomization are in a Ready state:
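Using the resource names stated above:

```shell
flux get helmreleases cloudnative-pg -n flux-system
flux get kustomizations infra-cloudnative-pg -n flux-system
```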
Trigger an immediate sync — pulls the latest Git revision and re-applies the manifests. Use after pushing config changes or to verify a fix:
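For example:

```shell
# Re-fetch the Git source, then re-apply the kustomization
flux reconcile kustomization infra-cloudnative-pg --with-source -n flux-system
```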
Trigger a Helm upgrade — re-runs the Helm install/upgrade for this release without waiting for the next interval. Use when the HelmRelease values have changed:
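For example:

```shell
# Re-run the Helm install/upgrade for this release now
flux reconcile helmrelease cloudnative-pg -n flux-system
```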
View recent Flux controller logs for this release — useful for diagnosing why a sync or upgrade failed:
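For example:

```shell
# Controller logs scoped to this HelmRelease
flux logs --kind=HelmRelease --name=cloudnative-pg -n flux-system
```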
Recovering a stalled HelmRelease
If the HelmRelease shows Stalled with RetriesExceeded, Flux will not retry automatically. Suspend and resume to clear the failure counter, then reconcile:
```shell
flux suspend helmrelease cloudnative-pg -n flux-system
flux resume helmrelease cloudnative-pg -n flux-system
flux reconcile kustomization infra-cloudnative-pg -n flux-system
```
Only run this after confirming the underlying issue (e.g. pod crash, timeout) has been resolved. See Maintenance — Recovering Stalled Resources for details.
Next: Continue to Strimzi below.
Strimzi¶
Strimzi is the Kubernetes operator for Apache Kafka. It manages Kafka clusters, topics, users, connectors, and bridges. The RCIIS ESB (Enterprise Service Bus) relies on Kafka for asynchronous messaging between customs systems.
Install¶
The base HelmRelease tells Flux which chart to install. This file is shared across all environments — environment-specific settings are applied via patches (shown in the Configuration section).
Create the base directory and file:
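A minimal sketch of the command, assuming the same `flux/infra/base` path used for the CloudNativePG files (the command is idempotent if the directory already exists):

```shell
# Shared base directory for infrastructure HelmReleases
mkdir -p flux/infra/base
```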
| Field | Value | Explanation |
|---|---|---|
| `chart` | `strimzi-kafka-operator` | The Helm chart name from the Strimzi registry |
| `version` | `0.47.0` | Pinned chart version — update this to upgrade Strimzi |
| `sourceRef.name` | `strimzi` | References a HelmRepository CR pointing to https://strimzi.io/charts |
| `targetNamespace` | `strimzi-operator` | Namespace where the Strimzi operator runs |
| `dependsOn` | `prometheus` | Ensures Prometheus is deployed before Strimzi metrics are configured |
| `crds: CreateReplace` | — | Automatically installs and updates Strimzi CRDs |
| `remediation.retries` | `3` | Flux retries up to 3 times if the install or upgrade fails |
Save the following as flux/infra/base/strimzi.yaml:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: strimzi
  namespace: flux-system
spec:
  dependsOn:
    - name: prometheus
  targetNamespace: strimzi-operator
  interval: 30m
  chart:
    spec:
      chart: strimzi-kafka-operator
      version: "0.47.0"
      sourceRef:
        kind: HelmRepository
        name: strimzi
        namespace: flux-system
  releaseName: strimzi
  install:
    createNamespace: true
    crds: CreateReplace
    remediation:
      retries: 3
  upgrade:
    crds: CreateReplace
    remediation:
      retries: 3
  values:
    replicas: 2
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        nodeTaintsPolicy: Honor
        labelSelector:
          matchLabels:
            name: strimzi-cluster-operator
    serviceAccount: strimzi-cluster-operator
    resources:
      requests:
        cpu: 200m
        memory: 512Mi
      limits:
        cpu: 1000m
        memory: 1024Mi
    watchNamespaces: []
    watchAnyNamespace: true
    createGlobalResources: true
    createAggregateRoles: true
    featureGates: ""
    logLevel: INFO
    dashboards:
      enabled: true
      namespace: monitoring
      labels:
        grafana_dashboard: "1"
```
Alternative: Helm CLI
If you do not have Git access, install Strimzi directly:
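A sketch of the equivalent Helm CLI install, using the chart repository URL, pinned version, and namespace from the table above:

```shell
# Add the Strimzi chart repository and install the pinned chart version
helm repo add strimzi https://strimzi.io/charts
helm repo update
helm upgrade --install strimzi strimzi/strimzi-kafka-operator \
  --version 0.47.0 \
  --namespace strimzi-operator \
  --create-namespace
```

As with CloudNativePG, a CLI-managed release is not reconciled by Flux.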
Configuration¶
The environment patch overrides the base HelmRelease with cluster-specific settings. The values file controls how Strimzi behaves. Select your environment below.
Create the environment overlay directory:
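For example, using the `flux/infra/aws/strimzi/` and `flux/infra/proxmox/strimzi/` paths referenced in the Extra Manifests section:

```shell
# One overlay directory per environment
mkdir -p flux/infra/aws/strimzi flux/infra/proxmox/strimzi
```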
Environment Patch¶
The patch file sets resource limits and replica counts appropriate for each environment. AWS reduces replicas and resources for cost optimization. Bare Metal uses the base defaults.
Save the following as the patch file for your environment:
On AWS, Strimzi resources are reduced to optimize cloud costs while maintaining operator functionality.
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: strimzi
spec:
  values:
    replicas: 1
    topologySpreadConstraints: []
    resources:
      requests:
        cpu: 50m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi
```
| Setting | Value | Why |
|---|---|---|
| `replicas` | `1` | Single operator instance reduces AWS costs |
| `topologySpreadConstraints` | `[]` | Clears topology spread — not needed for a single replica |
| `resources.requests` | 50m / 256Mi | Minimal resource footprint for AWS |
| `resources.limits` | 500m / 512Mi | Caps resource usage for cost control |
On Bare Metal, Strimzi uses the base configuration with full HA capabilities.
```yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - ../../base/strimzi.yaml
  - bridge-metrics.yaml
  - cluster-operator-metrics.yaml
  - entity-operator-metrics.yaml
  - kafka-resource-metrics.yaml
```
No environment patch needed. The base HelmRelease provides 2 operator replicas with topology spread constraints and full monitoring enabled.
Helm Values¶
The base HelmRelease already includes comprehensive Helm values. If you need to customize further for your environment, reference these key settings:
| Setting | HA (Default) | Non-HA | Why |
|---|---|---|---|
| `replicas` | `2` | `1` | Multiple vs. single operator instances |
| `topologySpreadConstraints` | Enabled | Disabled | Spreads operator pods across nodes |
| `dashboards.enabled` | `true` | `false` | Grafana dashboards for Kafka observability |
| `resources` | 200m/512Mi req, 1000m/1Gi lim | 100m/256Mi req, 500m/384Mi lim | Resource scaling with replica count |
Extra Manifests¶
Strimzi includes Prometheus monitoring definitions as separate manifests. These are deployed from the environment-specific directories alongside the Helm chart configuration.
The AWS environment includes PodMonitor resources that enable Prometheus to scrape
Strimzi metrics. Save each of the following in flux/infra/aws/strimzi/:
bridge-metrics.yaml — Metrics for KafkaBridge components:
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: bridge-metrics
  namespace: strimzi-operator
  labels:
    app: strimzi
    release: prometheus
spec:
  selector:
    matchLabels:
      strimzi.io/kind: KafkaBridge
  namespaceSelector:
    matchNames:
      - rciis-prod
  podMetricsEndpoints:
    - path: /metrics
      port: rest-api
```
cluster-operator-metrics.yaml — Metrics for the Strimzi cluster operator:
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: cluster-operator-metrics
  namespace: strimzi-operator
  labels:
    app: strimzi
    release: prometheus
spec:
  selector:
    matchLabels:
      strimzi.io/kind: cluster-operator
  namespaceSelector:
    matchNames:
      - rciis-prod
  podMetricsEndpoints:
    - path: /metrics
      port: http
```
entity-operator-metrics.yaml — Metrics for entity operator (users and topics):
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: entity-operator-metrics
  namespace: strimzi-operator
  labels:
    app: strimzi
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: entity-operator
  namespaceSelector:
    matchNames:
      - rciis-prod
  podMetricsEndpoints:
    - path: /metrics
      port: healthcheck
```
kafka-resource-metrics.yaml — Metrics for Kafka clusters and related resources:
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-resources-metrics
  namespace: strimzi-operator
  labels:
    app: strimzi
    release: prometheus
spec:
  selector:
    matchExpressions:
      - key: "strimzi.io/kind"
        operator: In
        values: ["Kafka", "KafkaConnect", "KafkaMirrorMaker2"]
  namespaceSelector:
    matchNames:
      - rciis-prod
  podMetricsEndpoints:
    - path: /metrics
      port: tcp-prometheus
      relabelings:
        - separator: ;
          regex: __meta_kubernetes_pod_label_(strimzi_io_.+)
          replacement: $1
          action: labelmap
        - sourceLabels: [__meta_kubernetes_namespace]
          separator: ;
          regex: (.*)
          targetLabel: namespace
          replacement: $1
          action: replace
        - sourceLabels: [__meta_kubernetes_pod_name]
          separator: ;
          regex: (.*)
          targetLabel: kubernetes_pod_name
          replacement: $1
          action: replace
        - sourceLabels: [__meta_kubernetes_pod_node_name]
          separator: ;
          regex: (.*)
          targetLabel: node_name
          replacement: $1
          action: replace
        - sourceLabels: [__meta_kubernetes_pod_host_ip]
          separator: ;
          regex: (.*)
          targetLabel: node_ip
          replacement: $1
          action: replace
```
The Bare Metal environment includes the same PodMonitor resources. Save each of the
following in flux/infra/proxmox/strimzi/:
bridge-metrics.yaml:
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: bridge-metrics
  namespace: strimzi-operator
  labels:
    app: strimzi
    release: prometheus
spec:
  selector:
    matchLabels:
      strimzi.io/kind: KafkaBridge
  namespaceSelector:
    matchNames:
      - rciis-prod
  podMetricsEndpoints:
    - path: /metrics
      port: rest-api
```
cluster-operator-metrics.yaml:
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: cluster-operator-metrics
  namespace: strimzi-operator
  labels:
    app: strimzi
    release: prometheus
spec:
  selector:
    matchLabels:
      strimzi.io/kind: cluster-operator
  namespaceSelector:
    matchNames:
      - rciis-prod
  podMetricsEndpoints:
    - path: /metrics
      port: http
```
entity-operator-metrics.yaml:
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: entity-operator-metrics
  namespace: strimzi-operator
  labels:
    app: strimzi
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: entity-operator
  namespaceSelector:
    matchNames:
      - rciis-prod
  podMetricsEndpoints:
    - path: /metrics
      port: healthcheck
```
kafka-resource-metrics.yaml:
```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: kafka-resources-metrics
  namespace: strimzi-operator
  labels:
    app: strimzi
    release: prometheus
spec:
  selector:
    matchExpressions:
      - key: "strimzi.io/kind"
        operator: In
        values: ["Kafka", "KafkaConnect", "KafkaMirrorMaker2"]
  namespaceSelector:
    matchNames:
      - rciis-prod
  podMetricsEndpoints:
    - path: /metrics
      port: tcp-prometheus
      relabelings:
        - separator: ;
          regex: __meta_kubernetes_pod_label_(strimzi_io_.+)
          replacement: $1
          action: labelmap
        - sourceLabels: [__meta_kubernetes_namespace]
          separator: ;
          regex: (.*)
          targetLabel: namespace
          replacement: $1
          action: replace
        - sourceLabels: [__meta_kubernetes_pod_name]
          separator: ;
          regex: (.*)
          targetLabel: kubernetes_pod_name
          replacement: $1
          action: replace
        - sourceLabels: [__meta_kubernetes_pod_node_name]
          separator: ;
          regex: (.*)
          targetLabel: node_name
          replacement: $1
          action: replace
        - sourceLabels: [__meta_kubernetes_pod_host_ip]
          separator: ;
          regex: (.*)
          targetLabel: node_ip
          replacement: $1
          action: replace
```
PodMonitor resources
The PodMonitor CRs tell Prometheus to scrape metrics from Strimzi components. These are automatically deployed via Kustomize when you include them in the environment directory. They require the Prometheus operator to be installed (see Observability for details).
Commit and Deploy¶
Once all files are in place, commit and push to trigger Flux deployment:
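A sketch of the Git commands (the commit message is illustrative):

```shell
git add flux/infra
git commit -m "Add Strimzi operator"  # message is illustrative
git push
```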
Flux will detect the new commit and begin deploying Strimzi. To trigger an immediate sync instead of waiting for the next poll interval:
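Assuming the default GitRepository source name `flux-system`:

```shell
# Pull the latest Git revision immediately instead of waiting for the poll interval
flux reconcile source git flux-system -n flux-system
```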
Verify¶
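A sketch of the usual checks (the namespace comes from `targetNamespace` above; the CRD group follows Strimzi defaults):

```shell
# Operator pods should be Running in the target namespace
kubectl get pods -n strimzi-operator
# Strimzi CRDs (Kafka, KafkaTopic, KafkaUser, etc.) should be registered
kubectl get crds | grep strimzi.io
```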
Flux Operations¶
This component is managed by Flux as HelmRelease strimzi and Kustomization infra-strimzi.
Check whether the HelmRelease and Kustomization are in a Ready state:
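Using the resource names stated above:

```shell
flux get helmreleases strimzi -n flux-system
flux get kustomizations infra-strimzi -n flux-system
```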
Trigger an immediate sync — pulls the latest Git revision and re-applies the manifests. Use after pushing config changes or to verify a fix:
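For example:

```shell
# Re-fetch the Git source, then re-apply the kustomization
flux reconcile kustomization infra-strimzi --with-source -n flux-system
```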
Trigger a Helm upgrade — re-runs the Helm install/upgrade for this release without waiting for the next interval. Use when the HelmRelease values have changed:
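For example:

```shell
# Re-run the Helm install/upgrade for this release now
flux reconcile helmrelease strimzi -n flux-system
```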
View recent Flux controller logs for this release — useful for diagnosing why a sync or upgrade failed:
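For example:

```shell
# Controller logs scoped to this HelmRelease
flux logs --kind=HelmRelease --name=strimzi -n flux-system
```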
Recovering a stalled HelmRelease
If the HelmRelease shows Stalled with RetriesExceeded, Flux will not retry automatically. Suspend and resume to clear the failure counter, then reconcile:
```shell
flux suspend helmrelease strimzi -n flux-system
flux resume helmrelease strimzi -n flux-system
flux reconcile kustomization infra-strimzi -n flux-system
```
Only run this after confirming the underlying issue (e.g. pod crash, timeout) has been resolved. See Maintenance — Recovering Stalled Resources for details.
Next Steps¶
Data services are now configured. Proceed to 5.3.3 Backup & Scheduling to set up automated backup and retention policies for PostgreSQL and Kafka.