8.2 Security Audit¶
Before handing over the environment, verify that every security control is deployed, configured correctly, and functioning. This audit should be performed after all Phase 5 (Install Platform Services) security tools are installed and before external access is enabled.
Pre-Handover Security Checklist¶
| # | Check | Tool | Pass Criteria | Common Failure Modes |
|---|---|---|---|---|
| 1 | Kyverno admission controller is running | Kyverno | 3 replicas healthy, webhook registered | Pods pending due to resource limits; webhook not recreated after restart |
| 2 | Policy-violating resources are blocked | Kyverno | Test pod with privileged=true is rejected | Policy in Audit mode instead of Enforce; namespace excluded from webhook |
| 3 | Unsigned images are rejected | cosign + Kyverno | Test pod with unsigned image is denied by admission webhook | Public key mismatch; webhookTimeoutSeconds too low for signature verification |
| 4 | Trivy Operator is scanning | Trivy | VulnerabilityReports exist for all namespaces | Scan jobs stuck Pending (insufficient resources); DB download failure |
| 5 | No CRITICAL CVEs in production images | Trivy | criticalCount = 0 across all VulnerabilityReports | New CVE disclosed after last scan; image not yet rescanned |
| 6 | Falco is detecting runtime events | Falco | Test shell exec triggers alert | eBPF probe load failure on Talos; modern_ebpf driver not set |
| 7 | Tracee eBPF programs are loaded | Tracee | DaemonSet running, events captured | hostPID not enabled; kernel BTF not available |
| 8 | Keycloak OIDC is functional | Keycloak | Weave GitOps SSO login succeeds | Realm not configured; TLS certificate mismatch |
| 9 | Kubernetes OIDC auth works | Keycloak | kubectl with OIDC token succeeds | API server OIDC flags not set; issuer URL mismatch |
| 10 | HSM is connected (if applicable) | HSM | cert-manager issues a test certificate via HSM | PKCS#11 library path incorrect; HSM partition locked |
| 11 | Encryption at rest is enabled | Talos/KMS | Disk encryption verified per model | Encryption key not provisioned; wrong partition encrypted |
| 12 | Network policies are enforced | Cilium | Default-deny policies exist, test cross-namespace traffic is blocked | Kyverno generate policy not active; Cilium policy enforcement disabled |
| 13 | RBAC is properly scoped | Kubernetes | No cluster-admin bindings for application service accounts | Helm chart installs ClusterRoleBinding with excessive permissions |
| 14 | CIS Kubernetes Benchmark scan passes | Trivy | No CRITICAL findings in clustercompliancereports | Compliance scanning not enabled; scan not yet run |
| 15 | All required Kyverno policies in Enforce mode | Kyverno | Image allow-list, pod security, resource limits policies are Enforce | Policies still in Audit from initial rollout; PolicyExceptions masking issues |
Verification Procedures¶
1. Kyverno¶
Confirm the admission controller is running:
kubectl -n kyverno get pods
# Expect: 3 admission-controller pods Running, 2 background-controller pods Running
kubectl get validatingwebhookconfigurations | grep kyverno
# Expect: kyverno-resource-validating-webhook-cfg
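The replica expectation above can be turned into a scripted check. The following is a minimal sketch, not part of the original procedure: `count_running` is a hypothetical helper that reads `kubectl get pods --no-headers` output on stdin and counts pods that are both Running and fully ready. It assumes the pod names contain the controller name (for example `kyverno-admission-controller-...`).

```shell
# Hypothetical helper: count pods whose name contains the given fragment
# and that are Running with all containers ready.
# Reads `kubectl get pods --no-headers` output (NAME READY STATUS ...) on stdin.
count_running() {
  awk -v frag="$1" '
    index($1, frag) && $3 == "Running" {
      split($2, r, "/")            # READY column looks like "1/1"
      if (r[1] == r[2]) n++
    }
    END { print n + 0 }
  '
}

# Usage against a live cluster:
#   kubectl -n kyverno get pods --no-headers | count_running admission-controller
#   kubectl -n kyverno get pods --no-headers | count_running background-controller
```

Compare the printed counts with the expected replica numbers (3 and 2 in this environment).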
Test policy enforcement:
# Create a policy-violating pod (should be rejected)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
name: security-audit-test-privileged
namespace: default
spec:
containers:
- name: test
image: harbor.devops.africa/rciis/test:latest
securityContext:
privileged: true
EOF
# Expected: Error from server: admission webhook denied the request
# Clean up
kubectl delete pod security-audit-test-privileged --ignore-not-found
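For a scripted audit it helps to tell "rejected by the webhook" apart from "failed for some other reason" (for example, an unreachable API server). A minimal sketch, assuming the denial message contains the text "denied the request" as in the expected output above; `expect_denied` is a hypothetical helper name.

```shell
# Hypothetical helper: run a command that SHOULD be rejected by admission control.
# Returns 0 only if the command fails AND its output mentions a webhook denial.
expect_denied() {
  out=$("$@" 2>&1) && {
    echo "FAIL: request was admitted" >&2
    return 1
  }
  case "$out" in
    *"denied the request"*) echo "PASS: rejected by admission webhook" ;;
    *) echo "FAIL: command failed for another reason: $out" >&2; return 1 ;;
  esac
}

# Usage (privileged-pod.yaml is a placeholder for the manifest shown above):
#   expect_denied kubectl apply -f privileged-pod.yaml
```

If the helper reports that the request was admitted, delete the test pod before continuing.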
Review policy reports:
# Check for existing violations in audit-mode policies
kubectl get policyreports -A --no-headers | wc -l
kubectl get clusterpolicyreports
# Detailed violations
kubectl get policyreports -A -o json | \
jq -r '.items[] | select(.summary.fail > 0) |
"\(.metadata.namespace): \(.summary.fail) failures"'
2. Image Signature Verification¶
Verify the image verification policy is active:
kubectl get clusterpolicy verify-image-signatures -o yaml | grep validationFailureAction
# Expected: validationFailureAction: Enforce
Test that unsigned images are rejected:
# Try deploying an unsigned image from Harbor (should be rejected)
kubectl run audit-unsigned-test \
--image=harbor.devops.africa/rciis/test:unsigned \
--restart=Never 2>&1
# Expected: Error from server: admission webhook denied the request:
# image signature verification failed
# Clean up
kubectl delete pod audit-unsigned-test --ignore-not-found
Verify that a signed, deployed image passes signature verification:
# cosign checks the signature directly against the public key; it does not exercise the admission webhook
cosign verify --key cosign.pub harbor.devops.africa/rciis/myapp:v1.2.3
# Expected: Verification for harbor.devops.africa/rciis/myapp:v1.2.3 --
# The following checks were performed:
# - The cosign claims were validated
# - The signatures were verified against the specified public key
3. Trivy Operator¶
Confirm the operator is running:
Check scan coverage:
# List all vulnerability reports
kubectl get vulnerabilityreports -A --no-headers | wc -l
# Check for CRITICAL findings
kubectl get vulnerabilityreports -A -o json | \
jq -r '.items[] | select(.report.summary.criticalCount > 0) |
"\(.metadata.namespace)/\(.metadata.labels["trivy-operator.resource.name"]): \(.report.summary.criticalCount) CRITICAL"'
# Expected: No output (zero critical vulnerabilities)
Check config audit:
kubectl get configauditreports -A -o json | \
jq -r '.items[] | select(.report.summary.criticalCount > 0) |
"\(.metadata.namespace)/\(.metadata.labels["trivy-operator.resource.name"]): \(.report.summary.criticalCount) CRITICAL misconfigs"'
Check for exposed secrets in images:
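A minimal sketch for this check, assuming the Trivy Operator secret scanner is enabled and publishes `ExposedSecretReport` resources with per-severity counts under `.report.summary`. The helper names are hypothetical.

```shell
# Print one line per report: namespace, report name, critical count, high count.
list_exposed_secret_counts() {
  kubectl get exposedsecretreports -A --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,CRITICAL:.report.summary.criticalCount,HIGH:.report.summary.highCount'
}

# Keep only reports with at least one CRITICAL or HIGH secret finding.
# Missing values print as "<none>", which awk treats as 0.
flag_exposed_secrets() {
  awk '($3 + 0) > 0 || ($4 + 0) > 0 {
    printf "%s/%s: %d CRITICAL, %d HIGH exposed secrets\n", $1, $2, $3, $4
  }'
}

# Usage: list_exposed_secret_counts | flag_exposed_secrets
# No output means no CRITICAL or HIGH exposed secrets were reported.
```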
4. Falco¶
Confirm Falco is running with eBPF driver:
kubectl -n falco get pods
# Expect: One Falco pod per node (DaemonSet)
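kubectl -n falco logs -l app.kubernetes.io/name=falco --tail=-1 | grep -i "driver"
# --tail=-1 is needed because kubectl shows only the last 10 lines per pod when a label selector is used
# Expected: "eBPF probe loaded successfully" or similar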
Trigger a test alert:
# Run a shell in a throwaway pod; this should trigger the "Shell Spawned in Container" rule
kubectl run security-audit-test --image=busybox --rm -it --restart=Never -- /bin/sh -c "echo audit-test && exit"
# Check Falco logs for the alert
kubectl -n falco logs -l app.kubernetes.io/name=falco --tail=30 | grep -i "shell"
# Expected: Alert line containing "Shell spawned in container"
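The alert may take a few seconds to reach the logs, so a single grep right after the test pod exits can miss it. A minimal sketch of a polling check; `retry_until` and `falco_shell_alert_seen` are hypothetical helpers, and the attempt count and delay are arbitrary values.

```shell
# Retry a command until it succeeds: retry_until <tries> <delay-seconds> <command...>
retry_until() {
  tries=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$tries" ]; do
    if "$@"; then return 0; fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$tries" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
}

# Succeeds once a line mentioning "shell" shows up in recent Falco logs.
falco_shell_alert_seen() {
  kubectl -n falco logs -l app.kubernetes.io/name=falco --tail=200 | grep -qi "shell"
}

# Usage: retry_until 6 5 falco_shell_alert_seen && echo "PASS: Falco alert observed"
```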
Verify Falcosidekick is forwarding alerts:
kubectl -n falco get pods -l app.kubernetes.io/name=falcosidekick
# Expect: Running
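# Check Falcosidekick metrics
kubectl -n falco port-forward svc/falco-falcosidekick 2801:2801 &
PF_PID=$!
sleep 2  # give the port-forward time to establish
curl -s http://localhost:2801/metrics | grep falcosidekick_outputs
kill "$PF_PID"  # stop the background port-forward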
5. Tracee¶
Confirm Tracee DaemonSet is running:
kubectl -n tracee get ds
# Expect: DESIRED = CURRENT = READY (one per node)
kubectl -n tracee logs -l app.kubernetes.io/name=tracee --tail=10
# Expected: No errors, events being captured
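The DESIRED = CURRENT = READY comparison can be scripted instead of read by eye. A minimal sketch; `daemonsets_ready` is a hypothetical helper that parses `kubectl get ds --no-headers` output, and the same check could be applied to the Falco DaemonSet.

```shell
# Read `kubectl get ds --no-headers` output (NAME DESIRED CURRENT READY ...) on stdin.
# Exit 0 only if at least one DaemonSet is listed and each has DESIRED = CURRENT = READY > 0.
daemonsets_ready() {
  awk '
    { seen++ }
    $2 != $3 || $2 != $4 || ($2 + 0) == 0 {
      print "NOT READY: " $1 " desired=" $2 " current=" $3 " ready=" $4
      bad++
    }
    END { if (seen == 0 || bad > 0) exit 1 }
  '
}

# Usage: kubectl -n tracee get ds --no-headers | daemonsets_ready && echo "PASS"
```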
Verify eBPF programs loaded:
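# --tail=-1 returns the full log; with a label selector kubectl defaults to the last 10 lines per pod
kubectl -n tracee logs -l app.kubernetes.io/name=tracee --tail=-1 | grep -i "loaded\|attached"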
# Expected: Multiple lines confirming eBPF program attachment
6. Keycloak¶
Confirm the Keycloak Operator and instance are running:
# Check the operator pod
kubectl -n keycloak get pods -l app.kubernetes.io/name=keycloak-operator
# Expected: 1 pod Running
# Check the Keycloak CR status
kubectl -n keycloak get keycloak rciis-keycloak
# Expected: READY = true
# Check the Keycloak instance pods
kubectl -n keycloak get pods -l app=keycloak
# Expected: 2 pods Running (HA)
# Test OIDC discovery endpoint
curl -s https://auth.rciis.eac.int/realms/rciis/.well-known/openid-configuration | jq .issuer
# Expected: "https://auth.rciis.eac.int/realms/rciis"
Test Weave GitOps SSO login:
# Access the Weave GitOps dashboard and verify OIDC login via Keycloak
# Expected: Browser opens, Keycloak login page appears, login succeeds
Test Kubernetes OIDC auth:
# Requires kubelogin/oidc-login kubectl plugin
kubectl oidc-login get-token \
--oidc-issuer-url=https://auth.rciis.eac.int/realms/rciis \
--oidc-client-id=kubernetes
# Expected: Token retrieved, kubectl commands work with OIDC identity
Verify RBAC mapping:
# As an OIDC-authenticated user with platform-admin role
kubectl auth can-i '*' '*' --all-namespaces
# Expected: yes
# As an OIDC-authenticated user with auditor role
kubectl auth can-i get pods --all-namespaces
# Expected: yes
kubectl auth can-i delete pods --all-namespaces
# Expected: no
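If logging in as each role is impractical, the same expectations can be probed through impersonation. This is a sketch under assumptions: the auditor running it must hold impersonation rights, and the group names passed in must match what the API server derives from the OIDC groups claim (including any configured prefix). `expect_can_i` and the `audit-probe` user name are hypothetical.

```shell
# expect_can_i <yes|no> <group> <verb> <resource>
# Asks the API server what a member of <group> may do, using impersonation.
expect_can_i() {
  want=$1; group=$2; verb=$3; resource=$4
  # "audit-probe" is a placeholder user; only the group is meant to drive the decision.
  # `kubectl auth can-i` exits non-zero on "no", so the status is ignored here.
  got=$(kubectl auth can-i "$verb" "$resource" --all-namespaces \
          --as=audit-probe --as-group="$group" 2>/dev/null) || true
  if [ "$got" = "$want" ]; then
    echo "PASS: $group $verb $resource -> $got"
  else
    echo "FAIL: $group $verb $resource -> ${got:-error} (wanted $want)"
    return 1
  fi
}

# Usage (group names are assumptions based on the role names above):
#   expect_can_i yes auditor get pods
#   expect_can_i no  auditor delete pods
```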
7. HSM (if applicable)¶
Verify HSM connectivity:
# Test PKCS#11 connectivity (from a pod with the PKCS#11 library)
pkcs11-tool --module /usr/lib/libCryptoki2.so --list-slots
# Expected: Slot listing showing the HSM partition
pkcs11-tool --module /usr/lib/libCryptoki2.so --list-objects --type privkey
# Expected: CA signing key and other managed keys listed
Test certificate issuance via HSM:
# Create a test certificate request
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: security-audit-test-cert
namespace: default
spec:
secretName: security-audit-test-tls
issuerRef:
name: hsm-ca-issuer
kind: ClusterIssuer
commonName: security-audit-test.rciis.eac.int
dnsNames:
- security-audit-test.rciis.eac.int
duration: 1h
EOF
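# Wait for issuance, then check certificate status
kubectl wait --for=condition=Ready certificate/security-audit-test-cert --timeout=120s
kubectl get certificate security-audit-test-cert
# Expected: READY = True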
# Clean up
kubectl delete certificate security-audit-test-cert
kubectl delete secret security-audit-test-tls
8. Encryption at Rest¶
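# Identify the system disk (run against each node)
talosctl -n <node-ip> get systemdisk
# Confirm encryption is configured for the STATE and EPHEMERAL partitions
talosctl -n <node-ip> get machineconfig -o yaml | grep -A 12 systemDiskEncryption
# Expected: state and ephemeral entries under machine.systemDiskEncryption, each with a provider and keys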
9. Network Policies¶
# Verify default-deny policies exist in application namespaces
kubectl get networkpolicies -A | grep default-deny
# Expected: default-deny-all in each namespace (generated by Kyverno)
# Test cross-namespace traffic is blocked
kubectl run nettest --image=busybox --rm -it --restart=Never -n default -- \
wget -qO- --timeout=3 http://keycloak-http.keycloak:80
# Expected: timeout (blocked by default-deny)
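The grep above shows which namespaces have a default-deny policy, but the namespaces that lack one are what matter. A minimal sketch that lists them; `namespaces_without_default_deny` is a hypothetical helper, and skipping `kube-*` namespaces is an assumption to adjust for this cluster.

```shell
# namespaces_without_default_deny <namespaces-file> <networkpolicies-file>
#   namespaces-file:      one namespace name per line
#   networkpolicies-file: `kubectl get networkpolicies -A --no-headers` output (NAMESPACE NAME ...)
# Prints every namespace that has no policy whose name contains "default-deny".
namespaces_without_default_deny() {
  # The policies file is read first so that coverage is known before namespaces are checked.
  awk '
    FILENAME == ARGV[1] { if ($2 ~ /default-deny/) covered[$1] = 1; next }
    NF && $1 !~ /^kube-/ && !($1 in covered) { print $1 }
  ' "$2" "$1"
}

# Usage:
#   kubectl get ns --no-headers -o custom-columns=NAME:.metadata.name > /tmp/ns.txt
#   kubectl get networkpolicies -A --no-headers > /tmp/np.txt
#   namespaces_without_default_deny /tmp/ns.txt /tmp/np.txt
```

No output means every non-system namespace has a default-deny policy.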
10. RBAC Audit¶
# Check for overly permissive ClusterRoleBindings
kubectl get clusterrolebindings -o json | \
jq -r '.items[] | select(.roleRef.name == "cluster-admin") |
.metadata.name as $binding | .subjects[]? |
"\($binding): \(.kind)/\(.name)"'
# Review: Only system accounts and platform-admin OIDC group should have cluster-admin
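To narrow the review to the pass criterion (no cluster-admin for application service accounts), the listing can be filtered for ServiceAccount subjects. A sketch under assumptions: service accounts outside `kube-system` are treated as application accounts, subject names contain no spaces, and the helper names are hypothetical.

```shell
# Emit one line per cluster-admin binding: "<binding><TAB><Kind>/<namespace>/<name> ..."
list_cluster_admin_subjects() {
  kubectl get clusterrolebindings \
    -o jsonpath='{range .items[?(@.roleRef.name=="cluster-admin")]}{.metadata.name}{"\t"}{range .subjects[*]}{.kind}/{.namespace}/{.name}{" "}{end}{"\n"}{end}'
}

# Flag ServiceAccount subjects that live outside kube-system.
flag_app_service_accounts() {
  awk -F'\t' '{
    n = split($2, subj, " ")
    for (i = 1; i <= n; i++) {
      split(subj[i], p, "/")     # p[1]=kind, p[2]=namespace, p[3]=name
      if (p[1] == "ServiceAccount" && p[2] != "kube-system")
        print "REVIEW: " $1 " grants cluster-admin to ServiceAccount " p[2] "/" p[3]
    }
  }'
}

# Usage: list_cluster_admin_subjects | flag_app_service_accounts
```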
11. CIS Kubernetes Benchmark¶
# Run a CIS compliance scan (if not already scheduled)
# Trivy Operator runs this automatically if configured in Helm values
# Check compliance report exists
kubectl get clustercompliancereports
# Expected: k8s-cis report exists
# Review results
kubectl get clustercompliancereports k8s-cis -o json | \
jq '.status.summary | {failCount, passCount}'
# Expected: failCount = 0, or no CRITICAL controls among the failures listed below
# List all failing controls
kubectl get clustercompliancereports k8s-cis -o json | \
jq -r '.status.summaryReport.controlCheck[] |
select(.totalFail > 0) | "\(.id) \(.name): \(.totalFail) failures (\(.severity))"'
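The pass criterion concerns CRITICAL findings only, so the failing-control list can be narrowed by severity. A minimal sketch that reuses the report name and `controlCheck` fields from the commands above; the helper names are hypothetical.

```shell
# One line per control: id <TAB> severity <TAB> totalFail <TAB> name
list_cis_controls() {
  kubectl get clustercompliancereports k8s-cis \
    -o jsonpath='{range .status.summaryReport.controlCheck[*]}{.id}{"\t"}{.severity}{"\t"}{.totalFail}{"\t"}{.name}{"\n"}{end}'
}

# Print CRITICAL controls with failures; exit non-zero if there are any.
critical_cis_failures() {
  awk -F'\t' '
    $2 == "CRITICAL" && ($3 + 0) > 0 { print $1 " " $4 ": " $3 " failures"; bad++ }
    END { if (bad > 0) exit 1 }
  '
}

# Usage: list_cis_controls | critical_cis_failures && echo "PASS: no CRITICAL CIS failures"
```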
12. Kyverno Policy Enforcement Mode¶
# List all ClusterPolicies and their enforcement mode
kubectl get clusterpolicies -o custom-columns=\
NAME:.metadata.name,\
ACTION:.spec.validationFailureAction,\
BACKGROUND:.spec.background
# Required policies in Enforce mode:
# restrict-image-registries → Enforce
# enforce-pod-security-restricted → Enforce
# require-resource-limits → Enforce
# verify-image-signatures → Enforce (if cosign is configured)
# Verify no required policy is still in Audit
kubectl get clusterpolicies -o json | \
jq -r '.items[] |
select(.spec.validationFailureAction == "Audit") |
"\(.metadata.name): WARNING — still in Audit mode"'
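The Audit-mode query above does not catch a required policy that is missing altogether. A minimal sketch that checks the four policy names listed in this section for both presence and Enforce mode; `check_required_policies` is a hypothetical helper. Remove `verify-image-signatures` from the list if cosign is not configured.

```shell
# Read "NAME ACTION" lines on stdin and verify each required policy exists and is set to Enforce.
check_required_policies() {
  awk -v required="restrict-image-registries enforce-pod-security-restricted require-resource-limits verify-image-signatures" '
    { action[$1] = $2 }
    END {
      n = split(required, req, " ")
      for (i = 1; i <= n; i++) {
        a = (req[i] in action) ? action[req[i]] : "MISSING"
        if (tolower(a) != "enforce") { print "FAIL: " req[i] " is " a; bad++ }
      }
      if (bad > 0) exit 1
    }
  '
}

# Usage:
#   kubectl get clusterpolicies --no-headers \
#     -o custom-columns=NAME:.metadata.name,ACTION:.spec.validationFailureAction \
#     | check_required_policies && echo "PASS: all required policies enforce"
```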
Audit Report¶
Document the results in a security audit report with:
- Date and auditor name
- Pass/fail status for each check above
- List of any exceptions or accepted risks (with rationale)
- Remediation plan for any failures
- Sign-off by the project security lead
This report forms part of the Handover Checklist.