2.3 Prepare Provisioning Config

Set up and customise the infrastructure-as-code configuration before deploying. The AWS, bare metal, and Proxmox VM deployment models are covered in turn, each with its own project structure and variables.

Project Structure

The AWS Terraform project lives at terraform/cluster/aws/, with its reusable modules under terraform/modules/aws/:

terraform/cluster/aws/
├── main.tf                     # Entry point — composes all modules
├── variables.tf                # All configurable variables with defaults
├── outputs.tf                  # Deployment outputs
├── aws.tfvars                  # RCIIS environment config values
└── schematic.yaml              # Talos Image Factory extensions config

terraform/modules/aws/
├── network/                    # VPC, subnets, IGW, NAT, route tables, security groups
│   ├── main.tf
│   ├── security_groups.tf
│   ├── variables.tf
│   └── outputs.tf
├── iam/                        # IAM role, instance profile, CCM/LBC policies
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── ebs/                        # Encrypted gp3 EBS volumes
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── compute/                    # EC2 instances + volume attachments
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── loadbalancer/               # NLB + target groups + listeners
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
└── sqs/                        # SQS event queues + dead-letter queue
    ├── main.tf
    ├── variables.tf
    └── outputs.tf

Providers

The project uses the AWS provider declared in main.tf:

terraform/cluster/aws/main.tf
terraform {
  required_version = ">= 1.5"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.34"
    }
  }
}

To download providers and initialise:

cd terraform/cluster/aws
terraform init

Variables (variables.tf)

All configurable parameters are defined in variables.tf with types, defaults, and validation rules. Key sections:

Cluster Identity

| Variable | Default | Description |
|---|---|---|
| cluster_name | "rciis-aws" | Kubernetes cluster name, used in resource naming and tags |
| environment | "rciis" | Environment label (rciis, testing, staging, prod) |
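
As an illustration of how a type, default, and validation rule fit together in one definition, here is a minimal sketch for the environment variable. The allowed values are taken from the description above; the actual block in variables.tf is not shown in this guide and may differ.

```hcl
# Hypothetical sketch: the real definition in variables.tf may differ.
variable "environment" {
  description = "Environment label (rciis, testing, staging, prod)"
  type        = string
  default     = "rciis"

  validation {
    # Reject any label outside the documented set at plan time.
    condition     = contains(["rciis", "testing", "staging", "prod"], var.environment)
    error_message = "The environment value must be one of: rciis, testing, staging, prod."
  }
}
```

With a rule like this, a typo such as -var="environment=stagin" fails during terraform plan instead of producing mis-tagged resources.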

AWS Region

| Variable | Default | Description |
|---|---|---|
| region | "af-south-1" | AWS region for all resources |
| availability_zones | ["af-south-1a"] | AZs for resource distribution |
| aws_profile | null | AWS CLI profile name (null = use env vars or instance role) |

Network

| Variable | Default | Description |
|---|---|---|
| vpc_cidr | "10.0.0.0/16" | VPC address space |
| public_subnet_cidrs | ["10.0.1.0/24"] | One public subnet per AZ (NAT gateways, NLB) |
| private_subnet_cidrs | ["10.0.11.0/24"] | One private subnet per AZ (Kubernetes nodes) |
| enable_nat_gateway | true | Create NAT gateways for private subnet internet access |
| single_nat_gateway | false | Share one NAT gateway across AZs (cost saving) |
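
The "one subnet per AZ" rule can be enforced rather than only documented. The sketch below uses a check block, which requires Terraform 1.5 or later (consistent with the required_version constraint shown earlier). The block name subnets_match_azs is made up for illustration, and this is not taken from the project's main.tf.

```hcl
# Hypothetical sketch: compares the subnet lists against the AZ list.
check "subnets_match_azs" {
  assert {
    condition = (
      length(var.public_subnet_cidrs) == length(var.availability_zones) &&
      length(var.private_subnet_cidrs) == length(var.availability_zones)
    )
    error_message = "Provide exactly one public and one private subnet CIDR per availability zone."
  }
}
```

A failed check block is reported as a warning during plan and apply; it does not stop the run. If a hard failure is preferred, Terraform 1.9 and later allow a variable validation block to reference other variables, so the same condition could be placed there instead.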

Compute

| Variable | Default | Description |
|---|---|---|
| control_plane_count | 3 | Number of control plane nodes (1 = non-HA, 3 or 5 = HA) |
| worker_count | 5 | Number of worker nodes |
| control_plane_instance_type | "t3.large" | EC2 instance type for control plane |
| worker_instance_type | "t3.xlarge" | EC2 instance type for workers |
| enable_public_ip | false | Assign public IPs (false when using NAT) |

Storage

| Variable | Default | Description |
|---|---|---|
| root_volume_size | 20 GB | OS root EBS volume (gp3) |
| control_plane_volume_size | 100 GB | CP data volume (gp3, 0 = no separate volume) |
| worker_volume_size | 200 GB | Worker data volume (gp3) |

Load Balancer

| Variable | Default | Description |
|---|---|---|
| nlb_internal | true | Internal NLB (true) or internet-facing (false) |
| enable_deletion_protection | false | Prevent accidental NLB deletion |
| enable_cross_zone_load_balancing | true | Distribute traffic across all AZs |

Security

| Variable | Default | Description |
|---|---|---|
| allowed_admin_cidrs | [] | CIDR ranges allowed to reach K8s API + Talos API from outside the VPC |
| enable_ssm | false | Attach SSM policy to instance role |
| enable_nodeport | false | Open NodePort range (30000-32767) on workers |
| cni_type | "cilium" | Container Network Interface plugin |
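
Because allowed_admin_cidrs controls who can reach the Kubernetes and Talos APIs, a malformed entry is worth catching early. The following is a sketch of one way to validate list entries, assuming the variable is a list of strings. It is illustrative only and not copied from the project's variables.tf.

```hcl
# Hypothetical sketch: the real definition in variables.tf may differ.
variable "allowed_admin_cidrs" {
  description = "CIDR ranges allowed to reach K8s API + Talos API from outside the VPC"
  type        = list(string)
  default     = []

  validation {
    # cidrhost() errors on anything that is not a valid CIDR prefix;
    # can() converts that error into false.
    condition     = alltrue([for cidr in var.allowed_admin_cidrs : can(cidrhost(cidr, 0))])
    error_message = "Each entry in allowed_admin_cidrs must be a valid CIDR, for example 203.0.113.10/32."
  }
}
```

The empty default passes validation, since alltrue returns true for an empty list.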

Tags

All resources are tagged via common_tags and provider-level default_tags:

| Tag | Value |
|---|---|
| ManagedBy | terraform |
| Environment | Value of environment variable |
| Project | rciis |
| Cluster | Value of cluster_name variable |
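
One common way to wire this up is a local map passed to the AWS provider's default_tags block, so every taggable resource inherits the tags without repeating them. The sketch below follows the tag table above. The exact layout of locals and the provider block in the project's main.tf is an assumption.

```hcl
# Hypothetical sketch of the tagging setup; main.tf may organise this differently.
locals {
  common_tags = {
    ManagedBy   = "terraform"
    Environment = var.environment
    Project     = "rciis"
    Cluster     = var.cluster_name
  }
}

provider "aws" {
  region  = var.region
  profile = var.aws_profile # null falls back to env vars or the instance role

  # Applied automatically to every resource that supports tags.
  default_tags {
    tags = local.common_tags
  }
}
```

Tags set directly on a resource are merged with default_tags, and the resource-level value wins when the same key appears in both.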

Environment Overrides

Create environment-specific configurations as .tfvars files. The RCIIS environment file is at terraform/cluster/aws/aws.tfvars:

terraform/cluster/aws/aws.tfvars
# Cluster identity
cluster_name = "rciis-aws"
environment  = "rciis"

# AWS — Cape Town, single AZ to minimize cost
region             = "af-south-1"
aws_profile        = "cbt"
availability_zones = ["af-south-1a"]

# Network — single AZ, single NAT
vpc_cidr             = "10.2.0.0/16"
public_subnet_cidrs  = ["10.2.1.0/24"]
private_subnet_cidrs = ["10.2.11.0/24"]
enable_nat_gateway   = true
single_nat_gateway   = true

# Compute — non-HA: 1 CP + 1 worker
control_plane_count         = 1
worker_count                = 1
control_plane_instance_type = "t3.large"
worker_instance_type        = "t3.xlarge"

# Admin access — restrict K8s API + Talos API to allowed IPs
allowed_admin_cidrs = [
  "196.45.28.20/32",
  "169.0.211.5/32",
  "105.245.234.139/32",
]

# ClusterMesh peer — Proxmox on-prem public IP
clustermesh_peer_cidrs = [
  "197.245.173.242/32",
]
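
For comparison with the single-AZ file above, a highly available variant might look like the following sketch. The file name aws-ha.tfvars and the extra subnet CIDRs are assumptions chosen to fit inside the 10.2.0.0/16 range; they are not part of the repository.

```hcl
# Hypothetical aws-ha.tfvars: three AZs, one NAT gateway per AZ, 3 CP + 3 workers
availability_zones   = ["af-south-1a", "af-south-1b", "af-south-1c"]
vpc_cidr             = "10.2.0.0/16"
public_subnet_cidrs  = ["10.2.1.0/24", "10.2.2.0/24", "10.2.3.0/24"]
private_subnet_cidrs = ["10.2.11.0/24", "10.2.12.0/24", "10.2.13.0/24"]
enable_nat_gateway   = true
single_nat_gateway   = false

control_plane_count = 3
worker_count        = 3
```

When several -var-file flags are given, later files override earlier ones. A file like this could therefore be layered on top of aws.tfvars and only needs to list the values that differ.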

Usage

cd terraform/cluster/aws

# Initialise providers
terraform init

# Preview changes
terraform plan -var-file=aws.tfvars

# Apply (creates all infrastructure)
terraform apply -var-file=aws.tfvars

# Destroy
terraform destroy -var-file=aws.tfvars

CLI Variable Overrides

Any variable can be overridden at deploy time via the -var flag:

| Example | Description |
|---|---|
| -var="environment=staging" | Override environment name |
| -var="region=eu-west-1" | Override AWS region |
| -var="control_plane_count=3" | Override CP node count |
| -var="worker_count=3" | Override worker node count |
| -var="control_plane_instance_type=m6i.xlarge" | Override CP instance type |
| -var="worker_instance_type=m6i.2xlarge" | Override worker instance type |

Configuration Approach

Bare metal environments typically do not use IaC for the physical servers themselves. Configuration centres on:

  1. Network planning: IP assignments, VLANs, gateways
  2. Talos machine configs: generated with talosctl and applied to each node
  3. Load balancer config: HAProxy or similar for K8s API access

Cluster Naming Convention

Bare metal cluster names use the hosting country as the site identifier. The format is rciis-<country> — for example, rciis-kenya, rciis-ghana, or rciis-nigeria. This distinguishes each physical deployment site and aligns with the multi-site replication topology.

Proxmox VM Deployments

Proxmox VM environments use the same configuration approach as bare metal. The only difference is that VMs are provisioned via Terraform (see the Proxmox project structure and variables later in this section) rather than physical hardware. Once the Talos nodes are running, all Kubernetes-level configuration is identical.

Network Plan

Create a network plan documenting IP assignments for all nodes. This is the equivalent of a .tfvars file for bare metal:

| Parameter | Example Value | Description |
|---|---|---|
| Cluster name | rciis-kenya | Site name; uses the hosting country (e.g., rciis-kenya, rciis-ghana) |
| Kubernetes VLAN | VLAN 30 | Data plane network |
| Subnet | 192.168.30.0/24 | Node network |
| Gateway | 192.168.30.1 | Default gateway |
| DNS servers | 1.1.1.1, 192.168.10.17 | DNS resolvers |
| VIP address | 192.168.30.30 | Kubernetes API VIP (HAProxy or Keepalived) |
| CP node IPs | .31, .32, .33 | Control plane static IPs |
| Worker node IPs | .34 to .39 | Worker static IPs |
| Management VLAN | VLAN 10 | IPMI/BMC network |

Talos Machine Config Generation

Generate base machine configs using talosctl:

talosctl gen config <cluster-name> https://<vip-or-lb-ip>:6443 \
  --output-dir _out \
  --with-docs=false \
  --with-examples=false

This creates:

- _out/controlplane.yaml: control plane machine config
- _out/worker.yaml: worker machine config
- _out/talosconfig: admin client config

Key Configuration Decisions

| Decision | Options | Notes |
|---|---|---|
| Control plane count | 1 (non-HA) or 3+ (HA) | 3 recommended for production |
| CNI | Cilium, Calico, Flannel | Cilium recommended; set cluster.network.cni.name: none in the machine config |
| kube-proxy | Enabled or disabled | Disable when using Cilium kube-proxy replacement |
| Install disk | /dev/sda, /dev/nvme0n1 | Depends on server hardware |
| Encryption | LUKS or none | Optional for STATE and EPHEMERAL partitions |

Project Structure

The Proxmox Terraform project lives at terraform/cluster/, with its modules under terraform/modules/:

terraform/cluster/
├── main.tf                    # Root module — providers, CP + worker + Talos modules
├── variables.tf               # All configurable variables with validation
├── outputs.tf                 # Cluster info, credentials, quick start guide
└── envs/
    ├── testing.tfvars         # Testing environment (3 CP + 3 workers)
    └── proxmox.tfvars         # Proxmox environment (3 CP + 3 workers)

terraform/modules/
├── proxmox/vm/                # VM module — clone from template, cloud-init
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
└── talos/proxmox/             # Talos module — machine config, bootstrap
    ├── main.tf
    ├── variables.tf
    └── outputs.tf

Providers

The project uses two Terraform providers declared in main.tf:

required_providers {
  proxmox = {
    source  = "bpg/proxmox"
    version = "~> 0.86.0"
  }
  talos = {
    source  = "siderolabs/talos"
    version = "~> 0.9.0"
  }
}

To download providers:

cd terraform/cluster
terraform init

Variables

All configurable parameters are defined in variables.tf with validation rules.

Proxmox Connection

| Variable | Default | Description |
|---|---|---|
| proxmox_endpoint | (required) | Proxmox API URL (e.g., https://192.168.30.225:8006) |
| proxmox_api_token | (required, sensitive) | API token: USER@REALM!TOKENID=SECRET |
| proxmox_insecure | false | Allow insecure TLS to Proxmox |
| proxmox_ssh_username | "root" | SSH user for Proxmox node |
| proxmox_node | (required) | Proxmox node name (e.g., pve2) |

Cluster

| Variable | Default | Description |
|---|---|---|
| cluster_name | "rciis-proxmox" | Kubernetes cluster name |
| control_plane_count | 3 | CP nodes (must be odd for etcd quorum) |
| worker_count | 3 | Worker nodes |
| template_vm_id | (required) | VM ID of the Talos template to clone |
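
The odd-count requirement for control_plane_count is the kind of constraint a validation rule can express directly. Here is a sketch of how that might be written; the project's actual rule in variables.tf is not reproduced here and may be worded differently.

```hcl
# Hypothetical sketch: the real definition in variables.tf may differ.
variable "control_plane_count" {
  description = "Number of control plane nodes"
  type        = number
  default     = 3

  validation {
    # Even counts add a node without improving etcd fault tolerance.
    condition     = var.control_plane_count >= 1 && var.control_plane_count % 2 == 1
    error_message = "The control_plane_count value must be an odd number (1, 3, 5, ...) so etcd can keep quorum."
  }
}
```

The modulo test also rejects fractional values such as 1.5, because their remainder is never exactly 1.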

Control Plane VMs

| Variable | Default | Description |
|---|---|---|
| control_plane_vm_id_start | 8000 | Starting VM ID for CP nodes |
| control_plane_cpu_cores | 4 | CPU cores per CP (min 2) |
| control_plane_memory_mb | 8192 | RAM in MB per CP (min 2048) |
| control_plane_disk_size_gb | 100 | OS disk per CP (min 10) |

Worker VMs

| Variable | Default | Description |
|---|---|---|
| worker_vm_id_start | 8100 | Starting VM ID for workers |
| worker_cpu_cores | 4 | CPU cores per worker (min 2) |
| worker_memory_mb | 8192 | RAM in MB per worker (min 4096) |
| worker_disk_size_gb | 100 | OS disk per worker (min 10) |
| worker_data_disk_size_gb | 100 | Additional data disk (0 to disable) |

Network

| Variable | Default | Description |
|---|---|---|
| network_bridge | "vmbr0" | Proxmox network bridge |
| network_vlan_id | null | VLAN ID (null = untagged) |
| ipv4_gateway | (required) | Gateway address (e.g., 192.168.30.1) |
| dns_servers | ["8.8.8.8", "8.8.4.4"] | DNS resolvers |
| control_plane_ips | (required) | Static IPs in CIDR (e.g., ["192.168.30.31/24"]) |
| worker_ips | [] | Static IPs in CIDR |

High Availability

| Variable | Default | Description |
|---|---|---|
| control_plane_vip | null | Virtual IP for kube-vip HA |
| control_plane_vip_interface | "eth0" | NIC for VIP |

Talos

| Variable | Default | Description |
|---|---|---|
| talos_version | "v1.10.0" | Talos OS version |
| kubernetes_version | "1.31.0" | Kubernetes version |
| cni_name | "none" | CNI (none for external Cilium, flannel for built-in) |
| disable_kube_proxy | true | Disable for Cilium kube-proxy replacement |
| install_disk | "/dev/sda" | Disk for Talos installation |
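
To show how these variables could end up in the Talos machine config, the sketch below builds a config patch with yamlencode. The local name talos_common_patch is hypothetical, and the Talos module's real implementation is not shown in this guide. The keys machine.install.disk, cluster.network.cni.name, and cluster.proxy.disabled are standard Talos machine config fields.

```hcl
# Hypothetical sketch: maps the Talos variables onto machine config keys.
locals {
  talos_common_patch = yamlencode({
    machine = {
      install = {
        disk = var.install_disk # e.g. "/dev/sda"
      }
    }
    cluster = {
      network = {
        cni = {
          name = var.cni_name # "none" leaves CNI installation to Cilium
        }
      }
      proxy = {
        disabled = var.disable_kube_proxy # true when Cilium replaces kube-proxy
      }
    }
  })
}
```

A patch string like this would typically be passed in the config_patches list of the talos_machine_configuration data source from the siderolabs/talos provider. Treat that wiring as an assumption about this project.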

Environment Overrides

Create a .tfvars file per environment under envs/. Example (envs/proxmox.tfvars):

cluster_name        = "rciis-proxmox"
control_plane_count = 3
worker_count        = 3

control_plane_vm_id_start = 601
worker_vm_id_start        = 604
template_vm_id            = 9000

control_plane_cpu_cores    = 2
control_plane_memory_mb    = 2048
control_plane_disk_size_gb = 50

worker_cpu_cores         = 4
worker_memory_mb         = 4096
worker_disk_size_gb      = 50
worker_data_disk_size_gb = 50

storage_pool   = "local-lvm"
network_bridge = "vmbr0"
ipv4_gateway   = "192.168.30.1"
dns_servers    = ["1.1.1.1", "192.168.10.17"]

control_plane_ips = [
  "192.168.30.31/24",
  "192.168.30.32/24",
  "192.168.30.33/24"
]

worker_ips = [
  "192.168.30.34/24",
  "192.168.30.35/24",
  "192.168.30.36/24"
]

control_plane_vip           = "192.168.30.30"
control_plane_vip_interface = "eth0"

talos_version      = "v1.12.0"
kubernetes_version = "1.33.7"
cni_name           = "none"
disable_kube_proxy = true

tags = ["rciis", "proxmox"]

Usage

cd terraform/cluster

# Preview changes
terraform plan -var-file=envs/proxmox.tfvars

# Apply (creates VMs, generates Talos configs, bootstraps cluster)
terraform apply -var-file=envs/proxmox.tfvars

# Destroy
terraform destroy -var-file=envs/proxmox.tfvars

Note

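Set the API token via an environment variable rather than in the tfvars file. Terraform reads any TF_VAR_<name> environment variable as the value of the input variable <name>, so the other required connection values can be supplied the same way. Use single quotes around the token so that an interactive shell does not treat the ! as history expansion:

export TF_VAR_proxmox_api_token='root@pam!IaC=<your-secret>'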
export TF_VAR_proxmox_endpoint="https://192.168.30.225:8006"
export TF_VAR_proxmox_node="pve2"
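
On the Terraform side, the token variable should be declared sensitive so its value is redacted in plan and apply output. A minimal sketch follows, assuming the variable names from the Proxmox Connection table. The provider block is illustrative and may not match the project's main.tf exactly.

```hcl
# Hypothetical sketch: declarations may differ from the project's files.
variable "proxmox_api_token" {
  description = "Proxmox API token in the form USER@REALM!TOKENID=SECRET"
  type        = string
  sensitive   = true # value is redacted in CLI output
}

provider "proxmox" {
  endpoint  = var.proxmox_endpoint
  api_token = var.proxmox_api_token
  insecure  = var.proxmox_insecure

  ssh {
    username = var.proxmox_ssh_username
  }
}
```

The sensitive flag only affects CLI output; the token is still stored in plain text in the state file, so the state backend needs its own access controls.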