9.2 Certificate Rotation¶
This page documents the certificate inventory for the RCIIS platform, rotation procedures, and integration with HSM-backed issuers.
Certificate Inventory¶
| Certificate | Issuer | Validity | Auto-Rotation | Location |
|---|---|---|---|---|
| Kubernetes API server TLS | Talos (internal CA) | 1 year | Yes (Talos manages) | Talos machine secrets |
| Kubelet serving certificate | Talos (internal CA) | 1 year | Yes (rotate-server-certificates: true) | Talos machine secrets |
| etcd peer and client TLS | Talos (internal CA) | 1 year | Yes (Talos manages) | Talos machine secrets |
| Talos API mTLS | Talos (internal CA) | 10 years | No (manual renewal) | Talos machine secrets |
| Cilium agent certificates | Cilium (internal CA) | Configurable | Yes (Cilium manages) | Cilium secrets |
| Application ingress TLS | cert-manager (Let's Encrypt or internal CA) | 90 days (LE) / configurable | Yes (cert-manager renews) | Kubernetes Secrets |
| Keycloak ingress TLS | cert-manager | 90 days (LE) / configurable | Yes (cert-manager renews) | keycloak namespace Secret |
| Keycloak realm signing key | Keycloak (internal or HSM) | Configurable | Manual rotation | Keycloak database or HSM |
| Weave GitOps server TLS | cert-manager | 90 days (LE) / configurable | Yes (cert-manager renews) | flux-system namespace Secret |
Talos-Managed Certificates¶
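Talos issues and renews the Kubernetes control plane certificates (API server, kubelet, etcd) itself, so they require no manual intervention under normal conditions. The Talos API certificate is the exception: it is renewed manually, as described under Talos API Certificate Renewal.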
Verify Certificate Expiry¶
# Check API server certificate
talosctl -n <control-plane-ip> get certificate apiserver
# Check all certificate statuses
talosctl -n <control-plane-ip> get certificates
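As a cross-check that does not go through talosctl, the API server's serving certificate can also be read directly from its TLS port. The following is a minimal sketch, assuming the API server listens on port 6443 and that GNU date is available; the helper names apiserver_not_after and days_left are illustrative and not part of any tool.

```shell
# Print the notAfter date of the certificate served on <host>:6443.
# Port 6443 is an assumption (the usual Kubernetes API server port).
apiserver_not_after() {
  local host="$1"
  openssl s_client -connect "${host}:6443" </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate \
    | cut -d= -f2
}

# Convert a notAfter date into whole days remaining (requires GNU date).
# The optional second argument overrides "now" as a Unix timestamp.
days_left() {
  local not_after="$1"
  local now_epoch="${2:-$(date +%s)}"
  local end_epoch
  end_epoch=$(date -d "$not_after" +%s) || return 1
  echo $(( (end_epoch - now_epoch) / 86400 ))
}
```

Running `days_left "$(apiserver_not_after <control-plane-ip>)"` prints the number of days until the served certificate expires; a negative value means it has already expired.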
Talos API Certificate Renewal¶
The Talos API certificate has a 10-year validity. If it approaches expiry or needs emergency rotation:
# Regenerate Talos secrets (includes new CA and certificates)
talhelper gensecret > talsecret.sops.yaml
sops -e -i talsecret.sops.yaml
# Regenerate machine configs with the new secrets
talhelper genconfig
# Apply new configs to each node (rolling); gencommand only prints the talosctl commands, so pipe them to a shell
talhelper gencommand apply --extra-flags="--mode=staged" | bash
# Then reboot each node to pick up the new certificates
talosctl -n <node-ip> reboot
Warning
Regenerating Talos secrets creates a new CA. All nodes must be reconfigured with the new secrets. This is a disruptive operation — plan a maintenance window. See Talos Upgrades for the rolling update procedure.
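Because the gensecret step redirects into talsecret.sops.yaml, it overwrites the current secrets file. Keeping a copy first leaves a path back to the old CA if the rollout has to be abandoned. A minimal sketch; backup_secret_file is a hypothetical helper, and the copy is only as protected as the original file (encrypted if the original was already SOPS-encrypted).

```shell
# Copy a file to <file>.bak-<UTC timestamp> and print the backup path.
# The optional second argument overrides the timestamp suffix.
backup_secret_file() {
  local src="$1"
  local stamp="${2:-$(date -u +%Y%m%dT%H%M%SZ)}"
  if [ ! -f "$src" ]; then
    echo "backup_secret_file: not found: $src" >&2
    return 1
  fi
  local dest="${src}.bak-${stamp}"
  cp -p "$src" "$dest" && echo "$dest"
}
```

For example, `backup_secret_file talsecret.sops.yaml` would be run before `talhelper gensecret`.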
cert-manager Managed Certificates¶
cert-manager handles automatic renewal for all application and infrastructure TLS certificates.
Check Certificate Status¶
# List all certificates and their status
kubectl get certificates -A
# Check a specific certificate
kubectl describe certificate <name> -n <namespace>
# Check upcoming expiries (certificates expiring within 30 days)
kubectl get certificates -A -o json | \
  jq -r '.items[] | select(.status.notAfter != null) |
    select((.status.notAfter | fromdateiso8601) < (now + 30 * 86400)) |
    "\(.metadata.namespace)/\(.metadata.name): expires \(.status.notAfter), renews at \(.status.renewalTime)"'
Manual Renewal¶
If a certificate needs immediate renewal:
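# Preferred: trigger renewal via cmctl (the existing Secret stays in place until the new certificate is issued)
cmctl renew <certificate-name> -n <namespace>
# Alternative: delete the certificate Secret; cert-manager will re-issue it,
# but consumers can be left without a certificate until issuance completes
kubectl delete secret <secret-name> -n <namespace>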
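A renewal request returns before the new certificate exists, so it helps to confirm that issuance actually completed. The sketch below wraps the cmctl command above with a wait and a revision check. The function name renew_and_wait and the CMCTL/KUBECTL override variables are illustrative; note that the Ready condition can already be True while the old certificate is still valid, so comparing status.revision before and after is the more reliable signal.

```shell
# Command names can be overridden, e.g. CMCTL=echo KUBECTL=echo for a dry run.
CMCTL="${CMCTL:-cmctl}"
KUBECTL="${KUBECTL:-kubectl}"

# Trigger a renewal, wait for Ready, then print the issuance revision.
renew_and_wait() {
  local name="$1" ns="$2" timeout="${3:-120s}"
  "$CMCTL" renew "$name" -n "$ns" || return 1
  "$KUBECTL" wait --for=condition=Ready "certificate/${name}" \
    -n "$ns" --timeout="$timeout" || return 1
  # status.revision increments each time a new certificate is issued.
  "$KUBECTL" get certificate "$name" -n "$ns" -o jsonpath='{.status.revision}'
}
```

Calling `renew_and_wait <certificate-name> <namespace>` prints the revision after renewal; it should be one higher than the value recorded beforehand.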
Troubleshoot Failed Renewals¶
# Check cert-manager logs
kubectl -n cert-manager logs -l app=cert-manager --tail=50
# Check certificate request status
kubectl get certificaterequests -A
kubectl describe certificaterequest <name> -n <namespace>
# Check ACME order status (for Let's Encrypt)
kubectl get orders -A
kubectl get challenges -A
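Before reading logs, it is often quicker to narrow the search to the certificates that are not currently Ready. A minimal sketch, assuming jq is installed; not_ready_certs is a hypothetical helper that reads the JSON output of kubectl on stdin.

```shell
# Read `kubectl get certificates -A -o json` output on stdin and print
# namespace/name for every Certificate without a Ready=True condition.
not_ready_certs() {
  jq -r '
    .items[]
    | select(
        [ .status.conditions[]? | select(.type == "Ready" and .status == "True") ]
        | length == 0
      )
    | "\(.metadata.namespace)/\(.metadata.name)"
  '
}
```

Usage: `kubectl get certificates -A -o json | not_ready_certs`. Certificates that have no status yet (for example, just created) are listed as well, since they have no Ready=True condition.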
HSM-Backed Certificate Rotation¶
When cert-manager is configured with an HSM-backed issuer (see Key Management), certificate rotation involves the HSM for signing operations.
How It Works¶
- cert-manager detects a certificate approaching expiry
- cert-manager generates a new CSR (Certificate Signing Request)
- The CSR is sent to the HSM-backed issuer
- The HSM signs the certificate using the CA private key (which never leaves the HSM)
- The signed certificate is stored in the Kubernetes Secret
The HSM integration is transparent to the rotation process — cert-manager handles renewal automatically. The only difference is that signing operations are slower (network round-trip to HSM) compared to software-based signing.
Verify HSM-Backed Issuance¶
# Check the issuer status
kubectl get clusterissuer hsm-ca-issuer -o yaml
# Verify a certificate was signed by the HSM CA
kubectl get secret <tls-secret> -n <namespace> -o jsonpath='{.data.tls\.crt}' | \
base64 -d | openssl x509 -text -noout | grep "Issuer:"
# Expected: Issuer matches the HSM CA certificate subject
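Matching the Issuer string shows only that the names agree. A cryptographic check against the CA certificate is stronger. The sketch below assumes the HSM CA certificate has been exported to a local PEM file and the leaf certificate has been extracted from the Secret as shown above; verify_signed_by and the file names are illustrative.

```shell
# Return 0 if the leaf certificate chains to the given CA certificate,
# non-zero otherwise. Both arguments are paths to PEM files.
verify_signed_by() {
  local ca_pem="$1" leaf_pem="$2"
  openssl verify -CAfile "$ca_pem" "$leaf_pem" >/dev/null 2>&1
}
```

For example, `verify_signed_by hsm-ca.pem leaf.pem && echo "signed by HSM CA"`, where hsm-ca.pem and leaf.pem are placeholder file names.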
CA Certificate Rotation¶
If the HSM-backed CA certificate itself needs rotation (e.g., approaching expiry or key compromise):
- Perform a key ceremony to generate a new CA key pair in CloudHSM
- Create a new CA certificate (self-signed or signed by an external root)
- Update the cert-manager ClusterIssuer to reference the new CA certificate
- Re-issue all certificates signed by the old CA
The key ceremony must follow the same formal process as the initial HSM setup: witnessed, documented, and recorded.
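The re-issue step can be scripted once the ClusterIssuer points at the new CA. The following is a sketch under stated assumptions: jq is installed, the issuer is the hsm-ca-issuer ClusterIssuer referenced earlier on this page, and the helper names certs_for_issuer and reissue_for_issuer are hypothetical.

```shell
# Read `kubectl get certificates -A -o json` on stdin and print
# "<namespace> <name>" for each Certificate that references the
# given ClusterIssuer.
certs_for_issuer() {
  local issuer="$1"
  jq -r --arg issuer "$issuer" '
    .items[]
    | select(.spec.issuerRef.kind == "ClusterIssuer"
             and .spec.issuerRef.name == $issuer)
    | "\(.metadata.namespace) \(.metadata.name)"
  '
}

# Renew each matching Certificate in turn.
reissue_for_issuer() {
  local issuer="$1"
  kubectl get certificates -A -o json \
    | certs_for_issuer "$issuer" \
    | while read -r ns name; do
        cmctl renew "$name" -n "$ns"
      done
}
```

Running `kubectl get certificates -A -o json | certs_for_issuer hsm-ca-issuer` first gives a dry-run list to review before calling `reissue_for_issuer hsm-ca-issuer`.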
Keycloak Signing Key Rotation¶
Keycloak uses RSA or EC keys to sign JWT tokens. These keys should be rotated periodically.
Rotate Realm Keys¶
- In the Keycloak admin console: Realm Settings > Keys > Providers
- Add a new key provider (RSA or EC) with a higher priority than the existing one
- The new key becomes the active signing key; the old key remains for verification of existing tokens
- After the old tokens expire (determined by token lifespan settings), remove the old key provider
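Before removing the old provider, it is worth confirming that the realm publishes both keys, so that clients can still verify tokens signed with either one. A minimal sketch, assuming jq is installed and that the realm's JWKS is served at the standard OpenID Connect certs endpoint (older Keycloak releases prefix the path with /auth); jwks_keys is a hypothetical helper.

```shell
# Read a JWKS document on stdin and print "<kid> <alg> <use>" per key.
jwks_keys() {
  jq -r '.keys[] | "\(.kid) \(.alg) \(.use)"'
}
```

Usage: `curl -fsS "https://<keycloak-host>/realms/<realm>/protocol/openid-connect/certs" | jwks_keys`. During the overlap period, the key IDs of both the old and the new signing key should be listed.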
HSM-Backed Key Rotation¶
If Keycloak signing keys are stored in the HSM:
- Generate a new signing key in the HSM via pkcs11-tool or the HSM vendor SDK
- Update the Keycloak PKCS#11 keystore configuration to reference the new key alias
- Restart Keycloak pods to pick up the new key
- The old key remains in the HSM for verification until all old tokens expire
- After the grace period, mark the old key as inactive in the HSM
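The first and third steps above might look like the following with OpenSC's pkcs11-tool. This is a sketch, not the platform's actual procedure: the module path (the usual AWS CloudHSM PKCS#11 library location), the rsa:2048 key type, the key label and ID, and the statefulset/keycloak workload name are all assumptions to adjust for the real environment.

```shell
# Command names can be overridden, e.g. PKCS11_TOOL=echo for a dry run.
PKCS11_TOOL="${PKCS11_TOOL:-pkcs11-tool}"
KUBECTL="${KUBECTL:-kubectl}"
# Assumed CloudHSM PKCS#11 library path; adjust for your HSM vendor.
PKCS11_MODULE="${PKCS11_MODULE:-/opt/cloudhsm/lib/libcloudhsm_pkcs11.so}"

# Generate an RSA key pair in the HSM under the given label and ID.
# pkcs11-tool prompts for the user PIN because of --login.
generate_signing_key() {
  local label="$1" id="$2"
  "$PKCS11_TOOL" --module "$PKCS11_MODULE" --login \
    --keypairgen --key-type rsa:2048 --label "$label" --id "$id"
}

# Restart Keycloak so it loads the new key. The default workload name
# statefulset/keycloak is hypothetical.
restart_keycloak() {
  "$KUBECTL" rollout restart "${1:-statefulset/keycloak}" -n keycloak
}
```

A dated label such as keycloak-signing-2030-01 (illustrative) makes it easier to tell the old and new keys apart when the old one is later marked inactive.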
Rotation Schedule¶
| Certificate Type | Rotation Frequency | Method | Downtime |
|---|---|---|---|
| Application TLS (Let's Encrypt) | Every 60 days (auto) | cert-manager auto-renewal | None |
| Application TLS (internal CA) | Configurable (recommend 90 days) | cert-manager auto-renewal | None |
| Keycloak realm signing key | Every 6 months | Manual via admin console or API | None (graceful rotation) |
| HSM CA certificate | Every 3–5 years | Key ceremony + cert-manager update | Brief (minutes) during rollout |
| Talos control plane certs | Every year (auto) | Talos auto-renewal | None |
| Talos API certificate | Every 10 years | Manual secret regeneration | Rolling reboot required |
Monitoring¶
Set up alerts for certificate expiry:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: certificate-expiry-alerts
  namespace: cert-manager
spec:
  groups:
    - name: cert-manager
      rules:
        - alert: CertificateExpiringSoon
          expr: >
            certmanager_certificate_expiration_timestamp_seconds - time() < 7 * 24 * 3600
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: "Certificate {{ $labels.namespace }}/{{ $labels.name }} expires in < 7 days"
        - alert: CertificateExpiryCritical
          expr: >
            certmanager_certificate_expiration_timestamp_seconds - time() < 24 * 3600
          for: 10m
          labels:
            severity: critical
          annotations:
            summary: "Certificate {{ $labels.namespace }}/{{ $labels.name }} expires in < 24 hours"