3.3 Encryption & HSM Provisioning¶
Encryption protects data at rest (stored on disk) and in transit (moving between nodes). Talos provides encryption in transit by default for all control plane communication.
Encryption in Transit (All Deployment Models)¶
The following protocols provide encryption in transit automatically — no configuration needed:
| Protocol | Port | Encryption | Notes |
|---|---|---|---|
| Kubernetes API | 6443 | TLS | Built into kube-apiserver |
| Talos API | 50000 | mTLS | Mutual TLS between talosctl and nodes |
| etcd | 2379-2380 | TLS | Peer and client communication |
| Cilium GENEVE | 6081 | Optional (WireGuard) | Enable in Cilium config for pod-to-pod encryption |
Talos automatically generates and manages all TLS certificates for the Kubernetes API, etcd, and the Talos API. No manual certificate configuration is needed.
Encryption in the Terraform project is applied at the storage layer (EBS volumes) and is enabled by default across all modules. This section explains how encryption is configured and where to customise it.
Encryption at Rest¶
All EBS volumes are encrypted using the encrypted = true property. This is set in two places:
Root Volumes (in modules/aws/compute/)¶
The root volume is defined inline with each EC2 instance:
```hcl
root_block_device {
  volume_type           = "gp3"
  volume_size           = var.root_volume_size
  delete_on_termination = true
  encrypted             = true
}
```
- `encrypted = true` — encrypts the OS root volume
- `delete_on_termination = true` — the root volume is destroyed with the instance
Data Volumes (in modules/aws/ebs/)¶
Data volumes are created as standalone EBS resources:
```hcl
resource "aws_ebs_volume" "data" {
  availability_zone = var.availability_zone
  size              = var.volume_size
  type              = "gp3"
  encrypted         = true

  tags = {
    Name = "talos-worker-${count.index + 1}"
  }
}
```
Encryption Key¶
When encrypted = true is set without specifying a kms_key_id, AWS uses the default EBS encryption key (aws/ebs) managed by AWS KMS. This provides:
- AES-256 encryption
- Automatic key rotation managed by AWS
- No additional cost beyond standard EBS pricing
Note
To use a customer-managed KMS key (CMK), you would add a kms_key_id argument to the volume definitions in both the compute module (terraform/modules/aws/compute/main.tf) and the ebs module (terraform/modules/aws/ebs/main.tf). This is not currently configured in the Terraform modules.
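As a sketch, the change described in the note above would look like the following (the `kms_key_arn` variable is illustrative and does not exist in the current modules):

```hcl
root_block_device {
  volume_type           = "gp3"
  volume_size           = var.root_volume_size
  delete_on_termination = true
  encrypted             = true
  kms_key_id            = var.kms_key_arn # illustrative variable, not in the current modules
}
```

The same kms_key_id argument applies to the aws_ebs_volume resource in the ebs module.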
Customisation Summary¶
| What to Change | Where | How |
|---|---|---|
| Use a customer-managed KMS key | terraform/modules/aws/compute/main.tf, terraform/modules/aws/ebs/main.tf | Add kms_key_id = "<arn>" to volume definitions |
| Disable encryption (not recommended) | terraform/modules/aws/compute/main.tf, terraform/modules/aws/ebs/main.tf | Set encrypted = false |
| Enable Cilium WireGuard encryption | Cilium Helm values | Set encryption.enabled = true in Cilium config |
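The Cilium WireGuard row in the table corresponds to the following Helm values (a minimal sketch; key names follow the upstream Cilium chart):

```yaml
# Cilium Helm values — transparent WireGuard encryption between pods
encryption:
  enabled: true
  type: wireguard
```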
Encryption at Rest¶
Bare metal encryption at rest is handled by Talos's built-in LUKS2 support for the STATE and EPHEMERAL partitions.
Enable Disk Encryption¶
Add an encryption patch to your Talos machine configuration:
```yaml
machine:
  systemDiskEncryption:
    state:
      provider: luks2
      keys:
        - nodeID: {}
          slot: 0
    ephemeral:
      provider: luks2
      keys:
        - nodeID: {}
          slot: 0
```
This encrypts:

- STATE partition — contains machine configuration and etcd data
- EPHEMERAL partition — contains pod data, logs, and containerd storage
The encryption key is derived from the node's unique hardware identity (nodeID), meaning:
- The disk is automatically unlocked on the same hardware
- The disk cannot be read if moved to a different server
- No manual key entry is required at boot
Data Disks¶
Additional data disks (for CSI storage) can be encrypted by the CSI driver or at the OS level. Rook-Ceph and Longhorn both support encrypted volumes.
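For illustration, a Longhorn StorageClass with volume encryption might look like the sketch below, based on Longhorn's documented CSI secret parameters. The class name and the longhorn-crypto secret (holding the passphrase) are assumptions, not part of this project:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-encrypted   # illustrative name
provisioner: driver.longhorn.io
parameters:
  encrypted: "true"
  # Secret holding the encryption passphrase — assumed to exist in longhorn-system
  csi.storage.k8s.io/provisioner-secret-name: longhorn-crypto
  csi.storage.k8s.io/provisioner-secret-namespace: longhorn-system
  csi.storage.k8s.io/node-publish-secret-name: longhorn-crypto
  csi.storage.k8s.io/node-publish-secret-namespace: longhorn-system
  csi.storage.k8s.io/node-stage-secret-name: longhorn-crypto
  csi.storage.k8s.io/node-stage-secret-namespace: longhorn-system
```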
Key Management¶
| Key Type | Managed By | Storage |
|---|---|---|
| LUKS2 disk encryption | Talos (nodeID-derived) | Hardware-bound |
| TLS certificates (API, etcd) | Talos | Machine secrets (STATE partition) |
| Kubernetes secrets | Kubernetes | etcd (encrypted at rest if configured) |
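The "encrypted at rest if configured" entry for Kubernetes secrets refers to the API server's EncryptionConfiguration. A minimal sketch of the upstream Kubernetes shape is shown below (the key material is a placeholder; how this is wired into a Talos-managed API server is covered by Talos's own configuration, so treat this as illustrative only):

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: <base64-encoded-32-byte-key>   # placeholder
      - identity: {}   # fallback so existing unencrypted data stays readable
```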
Encryption at Rest¶
Proxmox VMs can use two layers of disk encryption:
Option 1: Talos LUKS2 (Recommended)¶
Same as bare metal — enable LUKS2 in the Talos machine configuration via a config patch:
```yaml
machine:
  systemDiskEncryption:
    state:
      provider: luks2
      keys:
        - nodeID: {}
          slot: 0
    ephemeral:
      provider: luks2
      keys:
        - nodeID: {}
          slot: 0
```
Add this patch via the control_plane_patches or worker_patches variable in your .tfvars:
```hcl
control_plane_patches = [
  file("patches/encryption.yaml")
]

worker_patches = [
  file("patches/encryption.yaml")
]
```
Option 2: Proxmox Storage-Level Encryption¶
If using ZFS as the Proxmox storage backend, enable native ZFS encryption on the pool. This encrypts all VM disk images transparently:
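An illustrative sketch of enabling ZFS native encryption on the Proxmox host (the pool and dataset names are assumptions; Proxmox storage must then be pointed at the encrypted dataset):

```shell
# Create an encrypted dataset for VM disk images (names are illustrative)
zfs create -o encryption=aes-256-gcm -o keyformat=passphrase \
  -o keylocation=prompt rpool/encrypted-vms

# Confirm encryption is active on the dataset
zfs get encryption rpool/encrypted-vms
```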
Note
ZFS encryption is managed at the Proxmox host level, not by Terraform. The VMs are unaware of the encryption — they see unencrypted block devices.
Option 3: Both (Defence in Depth)¶
Use ZFS encryption at the Proxmox layer AND LUKS2 at the Talos layer. This provides:

- Protection if the Proxmox host is compromised (LUKS2)
- Protection if VM disk images are extracted (ZFS encryption)
Customisation Summary¶
| What to Change | Where | How |
|---|---|---|
| Enable Talos disk encryption | .tfvars patches | Add LUKS2 machine config patch |
| Enable Proxmox storage encryption | Proxmox host | ZFS encryption on storage pool |
| Enable pod-to-pod encryption | Cilium Helm values | encryption.enabled = true (WireGuard) |
Provision the HSM¶
The encryption configurations above use software-managed keys (AWS KMS default key, Talos nodeID-derived LUKS keys). For environments that require FIPS 140-2 Level 3 compliance or hardware-protected key storage, provision an HSM during the infrastructure build phase so it is ready when platform services need it.
Provisioning vs Integration
This section covers provisioning the HSM device or service — making it physically or logically available. The Kubernetes integration (configuring cert-manager, Keycloak, and the KMS plugin to use HSM-backed keys) happens later in Key Management & HSM Integration after those services are deployed.
Why HSM¶
| Without HSM | With HSM |
|---|---|
| CA signing keys stored as Kubernetes Secrets (base64-encoded, etcd-backed) | CA signing keys stored in tamper-resistant hardware |
| Key extraction possible if etcd or node is compromised | Keys cannot be extracted — private keys never leave the HSM boundary |
| Compliance gap for FIPS 140-2/3 requirements | FIPS 140-2 Level 3 (or higher) certified |
| Master encryption keys managed in software | Master encryption keys protected by HSM hardware |
Key Inventory¶
Identify which keys benefit from HSM protection and their current storage:
| Key | Current Storage | HSM Integration | Priority |
|---|---|---|---|
| TLS CA signing key (cert-manager) | Kubernetes Secret | PKCS#11 or KMS signer | High |
| Keycloak realm signing key | Keycloak database | Java PKCS#11 keystore | High |
| Kubernetes API CA root key | Talos machine secrets | External CA mode (HSM-backed root) | High |
| Kubernetes Secrets encryption key | Talos machine secrets | KMS v2 plugin (PKCS#11 to HSM) | Medium |
| EBS/disk encryption master key | AWS KMS default / nodeID-derived | CloudHSM Custom Key Store / PKCS#11 | Medium |
| SOPS master key (Age) | Age key file on operator workstation | HSM key wrapping | Medium |
HSM Provisioning by Deployment Model¶
AWS CloudHSM¶
AWS CloudHSM provides dedicated FIPS 140-2 Level 3 validated HSM instances within your VPC.
Cluster architecture:
| Component | Configuration |
|---|---|
| HSM cluster | 2 HSMs across 2 AZs (HA) |
| Subnet placement | Private subnets (same as Kubernetes workers) |
| Security group | Allow ports 2223-2225 from worker node subnets |
| Backup | Automatic daily backups to S3 (encrypted) |
Provision the CloudHSM cluster:
```shell
# Create the CloudHSM cluster
aws cloudhsmv2 create-cluster \
  --hsm-type hsm1.medium \
  --subnet-ids subnet-xxxx subnet-yyyy

# Wait for the cluster to initialise, then create the first HSM
aws cloudhsmv2 create-hsm \
  --cluster-id cluster-xxxxxxxxxxxx \
  --availability-zone af-south-1a

# Create a second HSM for HA
aws cloudhsmv2 create-hsm \
  --cluster-id cluster-xxxxxxxxxxxx \
  --availability-zone af-south-1b
```
Initialise the cluster:
- Download the cluster CSR and sign it with your organisation's root CA
- Upload the signed certificate and CA chain
- Activate the cluster and set the Crypto Officer (CO) password
```shell
# Download the CSR
aws cloudhsmv2 describe-clusters --filters clusterIds=cluster-xxxxxxxxxxxx \
  --query 'Clusters[0].Certificates.ClusterCsr' --output text > cluster.csr

# After signing, initialise the cluster
aws cloudhsmv2 initialize-cluster \
  --cluster-id cluster-xxxxxxxxxxxx \
  --signed-cert file://signed-cert.pem \
  --trust-anchor file://ca-chain.pem
```
Install the CloudHSM client on worker nodes:
For Talos-based nodes, the CloudHSM PKCS#11 library must be available as a container sidecar or init container. The recommended approach is to build a container image with the CloudHSM client SDK and mount the PKCS#11 socket into pods that need HSM access (cert-manager, Keycloak).
CloudHSM Custom Key Store for KMS:
Link CloudHSM to AWS KMS for transparent HSM-backed encryption of EBS volumes:
```shell
aws kms create-custom-key-store \
  --custom-key-store-name rciis-hsm-keystore \
  --cloud-hsm-cluster-id cluster-xxxxxxxxxxxx \
  --key-store-password <CO-password> \
  --trust-anchor-certificate file://ca-chain.pem

aws kms connect-custom-key-store \
  --custom-key-store-id cks-xxxxxxxxxxxx
```
Then create a KMS key backed by the Custom Key Store and reference it in the EBS volume definitions. Add the kms_key_id to the volume blocks in terraform/modules/aws/compute/main.tf and terraform/modules/aws/ebs/main.tf as described in the Encryption Key section above.
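A sketch of that key creation step (the key store ID and description are placeholders):

```shell
# Create a KMS key whose key material is generated inside the CloudHSM cluster
aws kms create-key \
  --origin AWS_CLOUDHSM \
  --custom-key-store-id cks-xxxxxxxxxxxx \
  --description "HSM-backed EBS encryption key"
```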
Cost considerations:
| Component | Cost (approximate) |
|---|---|
| CloudHSM instance | ~$1.50/hr per HSM |
| Minimum HA setup (2 HSMs) | ~$2,200/month |
Cost
CloudHSM is a significant cost. Evaluate whether the compliance and security requirements justify the expense. For non-production environments, consider software-based alternatives (SoftHSM2) for development and testing.
On-Premises HSM¶
For bare metal deployments, a network-attached or PCIe HSM provides the hardware root of trust.
HSM options:
| HSM | Form Factor | FIPS Level | Approximate Cost |
|---|---|---|---|
| Thales Luna Network HSM | 1U rack-mount, network-attached | FIPS 140-2 Level 3 | $$$$ |
| Securosys Primus HSM | 1U rack-mount, network-attached | FIPS 140-2 Level 3 | $$$ |
| Nitrokey HSM 2 | USB token | FIPS 140-2 Level 2 | $ (budget option) |
| YubiHSM 2 | USB module | FIPS 140-2 Level 2 | $ (budget option) |
Network HSM setup (Thales Luna / Securosys):
- Physical installation: Rack-mount the HSM in a physically secured cabinet with restricted access
- Network configuration: Connect the HSM to the management VLAN and assign a static IP
- Initialise the HSM: Perform the key ceremony — initialise the HSM, set SO (Security Officer) and partition passwords, create a partition for RCIIS keys
- Client configuration: Install the PKCS#11 client library on machines that need HSM access
- Firewall rules: Allow the HSM's management port (typically TCP 1792 for Thales, varies by vendor) from Kubernetes worker node IPs only
Key ceremony:
A key ceremony is the formal process of initialising the HSM and generating the root keys. It should be:
- Witnessed by at least two authorised personnel
- Documented with photographs and signed logs
- Performed in a physically secure location
- Recorded: who generated which keys, key IDs, backup token holders
Store the ceremony documentation securely offline. The HSM SO PIN and partition passwords should be split across multiple custodians using M-of-N secret sharing (built into most enterprise HSMs).
USB HSM (Nitrokey/YubiHSM) for budget deployments:
For smaller deployments where a full network HSM is not justified:
- Connect the USB HSM to a dedicated management server (not a Kubernetes node)
- Run `pkcs11-tool` or the vendor SDK to initialise the token
- Expose the PKCS#11 interface to the cluster via a network proxy (e.g., `p11-kit remote`)
Note
USB HSMs do not provide the same level of availability as network HSMs. If the server hosting the USB HSM goes down, all HSM-dependent operations (certificate signing, key operations) will fail. Plan for this in your DR procedures.
HSM for Proxmox Environments¶
Proxmox environments use the same on-premises HSM options as Bare Metal — an HSM is a physical device and cannot be virtualised.
Network HSM (recommended):
Follow the Bare Metal instructions for network HSM setup. Ensure the HSM is on a VLAN reachable by the Proxmox Kubernetes worker VMs.
USB HSM with Proxmox passthrough:
If using a USB HSM (Nitrokey, YubiHSM), you can pass the USB device through to a specific Proxmox VM:
```shell
# Identify the USB device on the Proxmox host
lsusb | grep -i "nitrokey\|yubico"
# Example output: Bus 001 Device 004: ID 20a0:4230 Nitrokey

# Add USB passthrough to a VM (e.g., VM ID 200)
qm set 200 -usb0 host=20a0:4230
```
Warning
USB passthrough binds the HSM to a single VM. If that VM migrates to another Proxmox node, the USB device does not follow. For HA environments, use a network-attached HSM instead.
Software HSM for development:
For non-production Proxmox environments, use SoftHSM2 as a development stand-in:
```shell
# Install SoftHSM2 (inside a container or on the management VM)
apt-get install softhsm2

# Initialise a token
softhsm2-util --init-token --slot 0 --label "rciis-dev" --so-pin 1234 --pin 5678
```
SoftHSM2 provides the same PKCS#11 interface as a hardware HSM, allowing you to develop and test integrations without hardware.
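For example, a test keypair can be generated against the token initialised above using OpenSC's pkcs11-tool (the module path is the typical Debian location and the key label is illustrative; adjust both for your environment):

```shell
# Generate an RSA keypair on the SoftHSM2 token (label is illustrative)
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so \
  --login --pin 5678 \
  --keypairgen --key-type rsa:2048 --label "rciis-dev-ca"

# List objects to confirm the key exists
pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so \
  --login --pin 5678 --list-objects
```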
HSM Operations¶
| Operation | Frequency | Procedure |
|---|---|---|
| Key backup | After key generation, then monthly | Export encrypted key backup to secure offline storage |
| Firmware update | As released by vendor | Schedule maintenance window, update one HSM at a time (HA) |
| PIN rotation | Quarterly | Rotate partition PINs, update SOPS-encrypted secrets |
| Audit log review | Weekly | Review HSM audit logs for unauthorised access attempts |
| Capacity planning | Quarterly | Monitor key slot usage, plan for additional partitions |
Next Step
Once the HSM is provisioned and the platform services are deployed (Phase 5), configure the Kubernetes integrations in Key Management & HSM Integration:
- cert-manager issuer with HSM-backed CA signing key
- Keycloak realm signing keys in HSM
- Kubernetes PKI with HSM-backed root CA (external CA mode)
- KMS v2 plugin for Secrets encryption at rest
- SOPS master key wrapping