5.2.1 Secure the Cluster

With the Kubernetes cluster bootstrapped and healthy, the next step is to deploy the security stack. These tools provide defence in depth — layered controls that protect the cluster from supply chain attacks, runtime threats, policy violations, and unauthorised access.

How to Use This Section

Follow these pages in order to deploy the full security stack:

  1. Policy Engine — Deploy Kyverno and core policies first. It acts as the admission control gate for all subsequent workloads.
  2. Vulnerability Scanning — Deploy the Trivy Operator to establish a vulnerability baseline.
  3. Runtime Threat Detection — Deploy Falco and Tracee for runtime monitoring.
  4. Troubleshooting — Reference guide for diagnosing common issues.

Recommended rollout approach:

  • Start all Kyverno policies in Audit mode
  • Baseline for one week, reviewing PolicyReport violations
  • Switch policies to Enforce mode one at a time, once the baseline shows no false positives for that policy
  • In non-production environments, skip Tracee and use the Non-HA values to save resources
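
The review step in this rollout can be partly automated. The following is a minimal Python sketch, not part of the platform tooling, that tallies failing results per policy from the JSON produced by `kubectl get policyreport -A -o json`. It assumes the reports follow the `wgpolicyk8s.io` PolicyReport layout, where each entry in `results` carries a `policy` name and a `result` such as `pass` or `fail`. The function names, the sample policy names, and the zero-failure threshold are illustrative assumptions.

```python
import json
from collections import Counter


def count_failures(report_list: dict) -> Counter:
    """Count 'fail' results per policy across a kubectl List of PolicyReports."""
    failures: Counter = Counter()
    for report in report_list.get("items", []):
        for entry in report.get("results", []):
            if entry.get("result") == "fail":
                failures[entry.get("policy", "<unknown>")] += 1
    return failures


def ready_to_enforce(policies: list[str], failures: Counter) -> list[str]:
    """Return the policies with no recorded failures, preserving input order."""
    return [name for name in policies if failures[name] == 0]


if __name__ == "__main__":
    # Hypothetical sample standing in for `kubectl get policyreport -A -o json`.
    sample = json.loads("""
    {"items": [
      {"results": [
        {"policy": "require-labels", "rule": "check-team", "result": "fail"},
        {"policy": "disallow-latest-tag", "rule": "validate-tag", "result": "pass"}
      ]}
    ]}
    """)
    failures = count_failures(sample)
    print("Failures per policy:", dict(failures))
    print("Candidates for Enforce:",
          ready_to_enforce(["require-labels", "disallow-latest-tag"], failures))
```

A policy with zero failures is only a candidate for Enforce mode; the remaining failures still need a manual check to separate false positives from real violations.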

Security Layers

The RCIIS security stack is organised into six layers, each addressing a different attack surface:

| Layer | Tool | Purpose | How It Works |
|-------|------|---------|--------------|
| Supply Chain Security | Sigstore / cosign + Kyverno | Sign container images in CI; verify signatures at admission time | cosign signs images in CI; Kyverno admission webhook verifies signatures before pods are scheduled |
| Admission Control | Kyverno | Block non-compliant resources before they reach the cluster | Kubernetes admission webhook intercepts API requests and validates/mutates against YAML policy CRDs |
| Vulnerability Scanning | Trivy Operator | Detect CVEs, misconfigurations, and leaked secrets in running workloads | Operator watches for new/updated workloads, spawns scan jobs, stores results as CRD reports |
| Runtime Threat Detection | Falco + Tracee | Detect anomalous behaviour at the syscall and kernel level | eBPF programs attached to the kernel intercept syscalls and match against rule/policy definitions |
| Identity & Access | Keycloak | Centralised authentication (OIDC/SAML) for all platform and application access | OIDC provider issues tokens; Kubernetes API server and applications validate tokens for authN/authZ |
| Cryptographic Root of Trust | HSM | Hardware-backed key storage for CA signing keys and master encryption keys (provisioning in Phase 3) | Signing keys never leave the HSM; cert-manager and SOPS reference keys via PKCS#11 or KMS API |
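
Because the Trivy Operator stores its findings as CRD reports, they can be post-processed like any other Kubernetes object. The sketch below is a hedged illustration only: it assumes the JSON from `kubectl get vulnerabilityreports -A -o json` exposes a `report.summary.criticalCount` field per report, as in the `aquasecurity.github.io` VulnerabilityReport schema. The function name and the `max_critical` threshold are hypothetical.

```python
def critical_workloads(report_list: dict,
                       max_critical: int = 0) -> list[tuple[str, str, int]]:
    """List (namespace, report name, critical count) for reports over the threshold.

    Rows are sorted with the highest critical count first.
    """
    flagged = []
    for item in report_list.get("items", []):
        summary = item.get("report", {}).get("summary", {})
        critical = int(summary.get("criticalCount", 0))
        if critical > max_critical:
            meta = item.get("metadata", {})
            flagged.append((meta.get("namespace", ""), meta.get("name", ""), critical))
    return sorted(flagged, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical reports; real input would come from kubectl.
    sample = {"items": [
        {"metadata": {"namespace": "apps", "name": "web"},
         "report": {"summary": {"criticalCount": 3, "highCount": 7}}},
        {"metadata": {"namespace": "apps", "name": "worker"},
         "report": {"summary": {"criticalCount": 0, "highCount": 1}}},
    ]}
    for namespace, name, count in critical_workloads(sample):
        print(f"{namespace}/{name}: {count} critical")
```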

Defence in Depth

                    Internet
                       |
              +--------+--------+
              | Cloudflare WAF  |   <-- Layer 7 filtering, DDoS protection
              +--------+--------+
                       |
              +--------+--------+
              | Cilium CNI      |   <-- Network policies, pod-to-pod encryption
              +--------+--------+
                       |
              +--------+--------+
              | cosign/Sigstore |   <-- Image signature verification
              +--------+--------+
                       |
              +--------+--------+
              | Kyverno         |   <-- Admission control, image allow-lists
              +--------+--------+
                       |
              +--------+--------+
              | Trivy Operator  |   <-- Continuous vulnerability scanning
              +--------+--------+
                       |
              +--------+--------+
              | Falco / Tracee  |   <-- Runtime syscall & eBPF monitoring
              +--------+--------+
                       |
              +--------+--------+
              | Keycloak        |   <-- OIDC authentication, RBAC
              +--------+--------+
                       |
              +--------+--------+
              | HSM             |   <-- Hardware-backed key management
              +--------+--------+

Deployment Order

Security tools deploy in Wave 4 of the platform Flux ResourceSet — after the core infrastructure (Cilium, cert-manager, ingress, Flux, storage operator) but before application-layer services (storage cluster, databases, observability). This ensures every subsequent workload is policy-enforced and monitored from the start.

See the wave diagram for the full deployment sequence.

| Wave | Tool | Reason |
|------|------|--------|
| 4 | Kyverno | Admission controller; must be active before workloads in Waves 5-6 are deployed |
| 4 | Trivy Operator | Scan baseline images; results feed into Kyverno image verification policies |
| 4 | Falco | Runtime syscall detection; active before application workloads arrive |
| 4 | Tracee | Supplements Falco with eBPF-based forensic capture |
| 6 | Keycloak | Identity provider; requires CloudNativePG (Wave 5) for its PostgreSQL database |
| 6 | HSM integration | Requires Keycloak and cert-manager; configures HSM-backed keys (HSM itself provisioned in Phase 3) |
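
The ordering constraints in this table can be checked mechanically. The following Python sketch is an assumption-laden illustration, not the Flux ResourceSet itself: it encodes the wave numbers and dependencies stated on this page and reports any component scheduled before something it requires. The dictionary keys are illustrative labels, not actual Flux resource names.

```python
# Wave assignments as described on this page (keys are illustrative labels).
WAVES = {
    "cilium": 1, "cert-manager": 1, "flux": 3,
    "kyverno": 4, "trivy-operator": 4, "falco": 4, "tracee": 4,
    "cloudnative-pg": 5, "keycloak": 6, "hsm-integration": 6,
}

# Dependencies stated on this page: component -> components it requires.
REQUIRES = {
    "kyverno": ["cert-manager"],
    "keycloak": ["cloudnative-pg"],
    "hsm-integration": ["keycloak", "cert-manager"],
}


def ordering_violations(waves: dict[str, int],
                        requires: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (component, dependency) pairs where the dependency deploys in a later wave.

    A dependency in the same wave is accepted, matching the HSM integration
    row, which shares Wave 6 with Keycloak.
    """
    violations = []
    for component, deps in requires.items():
        for dep in deps:
            if waves[dep] > waves[component]:
                violations.append((component, dep))
    return violations


if __name__ == "__main__":
    problems = ordering_violations(WAVES, REQUIRES)
    print("Ordering violations:", problems)
```

Moving Keycloak to Wave 4 in `WAVES`, for example, would make the check report `("keycloak", "cloudnative-pg")`, since its database operator would not yet exist.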

Prerequisites

Before deploying security tools (Wave 4), the following must be operational:

  • [x] Talos cluster is bootstrapped and all nodes are Ready (Verify Cluster Health)
  • [x] kubectl and helm are installed and configured (Install Tooling)
  • [x] Wave 1: Cilium CNI is operational with Hubble enabled
  • [x] Wave 1: cert-manager is running (for Kyverno webhook certificates)
  • [x] Wave 3: Flux is deployed (manages security tool lifecycle)
  • [x] SOPS and Age keys are configured for secret encryption (Credential Management)
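
The first prerequisite, all nodes Ready, can be verified from the output of `kubectl get nodes -o json`. The sketch below is a minimal Python illustration that assumes the standard Node object layout, where `status.conditions` contains an entry with `type: Ready` and `status: "True"` for a healthy node. The function name and the sample node names are hypothetical.

```python
def not_ready_nodes(node_list: dict) -> list[str]:
    """Return the names of nodes that do not report a Ready=True condition."""
    not_ready = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        is_ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in conditions
        )
        if not is_ready:
            not_ready.append(node.get("metadata", {}).get("name", "<unnamed>"))
    return not_ready


if __name__ == "__main__":
    # Hypothetical sample standing in for `kubectl get nodes -o json`.
    sample = {"items": [
        {"metadata": {"name": "cp-1"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "worker-1"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]}
    print("Nodes not Ready:", not_ready_nodes(sample))
```

An empty result means the Ready check passes; the other prerequisites in the list still need to be confirmed separately.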

Namespace Strategy

All security tools are deployed into dedicated namespaces:

| Tool | Namespace |
|------|-----------|
| Kyverno | kyverno |
| Trivy Operator | trivy-system |
| Falco | falco |
| Tracee | tracee |
| Keycloak | keycloak |