Your EKS Pods Are Eating Your VPC
The Problem
EKS VPC CNI gives every pod a real VPC IP. No overlay, no encapsulation — just raw VPC addresses. That simplicity is great until you realise your pods are competing for the same address space as your nodes, load balancers, and RDS instances.
A /24 gives you 251 usable IPs. Subtract nodes, ENI reservations, AWS-internal addresses, and anything else living in that subnet, and a cluster with a few dozen nodes can exhaust a single AZ before you notice. Kubernetes surfaces this as `Insufficient pods` scheduling failures or cryptic ENI attachment errors that are annoying to debug the first time.
The fix is custom networking: move pod IPs off the primary CIDR entirely, onto a dedicated secondary address space. Nodes keep their primary IPs. Pods get their own room.
This post covers the complete setup using Terraform and the official AWS EKS module.
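The setup assumes the secondary address space already exists. A minimal Terraform sketch of that prerequisite — the `100.64.0.0/16` range, the AZ names, and the subnet sizes are illustrative assumptions, not part of this post's setup:

```hcl
# Associate a secondary CIDR with the existing VPC.
# 100.64.0.0/16 (RFC 6598 shared address space) is a common choice
# because it cannot collide with typical RFC 1918 corporate ranges.
resource "aws_vpc_ipv4_cidr_block_association" "pods" {
  vpc_id     = var.vpc_id
  cidr_block = "100.64.0.0/16"
}

# One pod subnet per AZ, carved from the secondary CIDR.
resource "aws_subnet" "pods" {
  for_each = {
    "eu-central-1a" = "100.64.0.0/19"
    "eu-central-1b" = "100.64.32.0/19"
    "eu-central-1c" = "100.64.64.0/19"
  }

  vpc_id            = aws_vpc_ipv4_cidr_block_association.pods.vpc_id
  availability_zone = each.key
  cidr_block        = each.value

  tags = { Name = "pod-subnet-${each.key}" }
}
```

A /19 per AZ gives roughly eight thousand pod IPs; size to your own burst ceiling rather than copying these numbers.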
How It Works
With custom networking enabled, the VPC CNI uses ENIConfig objects to decide where to provision pod ENIs. You create one ENIConfig per AZ, each pointing at a subnet in your secondary CIDR. When a node boots, it reads its AZ label, finds the matching ENIConfig, and allocates pod IPs from that subnet — completely separate from the node’s primary network interface.
The result: two clean CIDR domains.
- Primary CIDR → node IPs, infrastructure, control plane traffic
- Secondary CIDR → pod IPs only
```mermaid
flowchart TB
  subgraph VPC["VPC"]
    P["Primary CIDR<br/>Node address space"]
    S["Secondary CIDR<br/>Pod address space"]
    subgraph NS["Node Subnets (primary CIDR)"]
      N1["node-subnet-a"]
      N2["node-subnet-b"]
      N3["node-subnet-c"]
    end
    subgraph PS["Pod Subnets (secondary CIDR)"]
      P1["pod-subnet-a"]
      P2["pod-subnet-b"]
      P3["pod-subnet-c"]
    end
  end
  subgraph EKS["EKS Cluster"]
    CNI["aws-vpc-cni<br/>custom networking + prefix delegation"]
    NG["Managed Node Groups"]
    NODE["Node in AZ-a"]
    ENI["ENIConfig objects<br/>AZ → pod subnet"]
    POD["Pod scheduled on node"]
  end
  P --> NS
  S --> PS
  NS --> NG
  NG --> NODE
  CNI --> NODE
  PS --> ENI
  ENI --> CNI
  NODE -->|"primary ENI → node IP<br/>from node-subnet-a"| N1
  NODE -->|"pod ENIs → pod IPs<br/>from pod-subnet-a via ENIConfig"| P1
  NODE --> POD
```
Why Prefix Delegation Too
While you are here, turn on prefix delegation. Without it, the CNI allocates individual secondary IPs one at a time, which creates a lot of EC2 API churn and wastes IP space at typical pod densities.
With prefix delegation, the CNI attaches /28 prefix blocks (16 IPs each) to ENIs instead of individual addresses. The difference in practice:
- At 10 pods per node, you typically need one prefix
- At 20–30 pods, two or three prefixes
- Scale-out is faster, wasted headroom is lower
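To put rough numbers on this, here is a hedged capacity model as a Terraform `locals` sketch. The instance figures are the published EC2 ENI limits for m5.large (an illustrative choice), and the formula approximates AWS's max-pods calculation — it is not taken from this setup:

```hcl
locals {
  # Standard EC2 ENI limits for m5.large (illustrative instance type)
  max_enis    = 3
  ips_per_eni = 10

  # With custom networking the primary ENI carries no pod IPs, and
  # on each remaining ENI one slot is reserved for that ENI's own
  # primary address. With prefix delegation every remaining slot
  # holds a /28 prefix (16 addresses) instead of a single IP.
  pod_ip_capacity = (local.max_enis - 1) * (local.ips_per_eni - 1) * 16 # 2 * 9 * 16 = 288

  # kubelet is normally capped far below this figure (AWS recommends
  # 110 pods for instances of this size), so once prefix delegation
  # is on, IP capacity stops being the per-node bottleneck.
  recommended_max_pods = 110
}
```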
The tuning knob is `WARM_IP_TARGET`. Set it to 5 and IPAMD keeps five free pod IPs ready, adding a new /28 prefix whenever that buffer drains. Capacity grows in prefix-sized steps, which aligns well with real burst patterns.
The Terraform Setup
Two things to configure:
- The `vpc-cni` addon with the right environment variables
- One `ENIConfig` object per pod subnet, keyed by AZ
1. Configure the VPC CNI Addon
Inside your `module "eks"` block, configure the `vpc-cni` addon with `before_compute = true`. This ensures CNI behaviour is established before any node group launches — otherwise nodes boot with default networking and you have to recycle them.
```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0" # v20 input names; v21 renamed cluster_name/cluster_addons

  cluster_name    = var.cluster_name
  cluster_version = "1.32"

  vpc_id     = var.vpc_id
  subnet_ids = concat(var.node_subnet_ids, var.pod_subnet_ids)

  cluster_addons = {
    vpc-cni = {
      most_recent    = true
      before_compute = true
      configuration_values = jsonencode({
        env = {
          AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG = "true"
          ENI_CONFIG_LABEL_DEF               = "topology.kubernetes.io/zone"
          ENABLE_PREFIX_DELEGATION           = "true"
          WARM_IP_TARGET                     = "5"
        }
      })
    }
  }
}
```
What each variable does:
| Variable | What it does |
|---|---|
| `AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true` | Activates ENIConfig-based custom networking |
| `ENI_CONFIG_LABEL_DEF` | Which node label the CNI reads to pick the right ENIConfig |
| `ENABLE_PREFIX_DELEGATION=true` | Allocates /28 prefix blocks instead of individual IPs |
| `WARM_IP_TARGET=5` | Keeps five free pod IPs ready per node |
Note on `subnet_ids`: pass both node and pod subnets here. The module uses this for cluster-level networking context. Node groups should be pinned to node subnets only via their own `subnet_ids`.
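A sketch of that pinning, as an additional input inside the same `module "eks"` block — the node group name, instance type, and sizes are illustrative:

```hcl
  eks_managed_node_groups = {
    default = {
      # Nodes live in the primary-CIDR subnets only; pod ENIs come
      # from the secondary-CIDR pod subnets via ENIConfig.
      subnet_ids = var.node_subnet_ids

      instance_types = ["m5.large"]
      min_size       = 2
      max_size       = 6
      desired_size   = 3
    }
  }
```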
2. Create ENIConfig Objects
One `ENIConfig` per pod subnet. The `metadata.name` must match the AZ name exactly — that is how the CNI resolves it from the node label.
```hcl
variable "pod_subnet_ids" {
  description = "Subnet IDs for pod networking (one per AZ, from secondary CIDR)"
  type        = list(string)
}

variable "pod_security_group_ids" {
  description = "Security groups to attach to pod ENIs"
  type        = list(string)
}

data "aws_subnet" "pod" {
  for_each = toset(var.pod_subnet_ids)
  id       = each.value
}

resource "kubernetes_manifest" "eni_config" {
  for_each = data.aws_subnet.pod

  manifest = {
    apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
    kind       = "ENIConfig"
    metadata = {
      name = each.value.availability_zone
    }
    spec = {
      subnet         = each.value.id
      securityGroups = var.pod_security_group_ids
    }
  }
}
```
The mapping is: node in `eu-central-1a` → looks up ENIConfig named `eu-central-1a` → allocates from the pod subnet in `eu-central-1a`. Simple, but it only works if the names match exactly.
3. Wire the Kubernetes Provider
The kubernetes_manifest resource needs a provider wired to your cluster. The standard pattern:
```hcl
data "aws_eks_cluster" "this" {
  name = module.eks.cluster_name
}

data "aws_eks_cluster_auth" "this" {
  name = module.eks.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}
```
If this provider is misconfigured, the addon config (applied through the AWS provider) still goes through, but `ENIConfig` creation fails, leaving you with a cluster that has custom networking enabled but no AZ mappings. This is the most common cause of partial rollouts.
Rollout Sequence
The order matters. Get it wrong and you end up with nodes that booted before ENIConfig existed, meaning they stay on default networking until recycled.
```mermaid
sequenceDiagram
  autonumber
  participant TF as Terraform
  participant EKS as EKS Control Plane
  participant CNI as aws-vpc-cni (aws-node)
  participant K8S as Kubernetes API
  participant NODE as Node EC2
  participant NS as Node Subnet (Primary CIDR)
  participant PS as Pod Subnet (Secondary CIDR)
  TF->>EKS: Apply vpc-cni addon config<br/>CUSTOM_NETWORK_CFG=true, PREFIX_DELEGATION=true
  TF->>K8S: Create ENIConfig per AZ<br/>(subnet + security group mapping)
  EKS->>NODE: Launch node in AZ-a
  NODE->>NS: Primary ENI attaches<br/>node gets IP from primary CIDR
  NODE->>CNI: aws-node starts on node
  CNI->>K8S: Read node AZ label
  CNI->>K8S: Resolve ENIConfig named AZ-a
  K8S-->>CNI: ENIConfig → pod-subnet-a + SG
  CNI->>PS: Attach pod ENI prefix from pod-subnet-a
  CNI-->>NODE: Node ready — pod IPs from secondary CIDR
```
In practice:
- Prerequisites first — secondary CIDR associated with the VPC, pod subnets created per AZ, route tables and security groups in place
- Apply addon config — this sets CNI behaviour on the EKS side
- Apply ENIConfig resources — this activates the AZ → subnet mapping
- For existing clusters — recycle node groups so nodes boot with the new config
Existing nodes will not switch over on their own. The CNI locks in ENI allocation at boot time. A node replacement is required.
Verification
After nodes are up:
```shell
# 1. Check addon environment variables are applied
kubectl -n kube-system get ds aws-node -o yaml | grep -A2 "CUSTOM_NETWORK\|PREFIX_DELEGATION\|WARM_IP"

# 2. Confirm ENIConfig objects exist for each AZ
kubectl get eniconfig

# 3. Check node AZ labels
kubectl get nodes -L topology.kubernetes.io/zone

# 4. Verify pod IPs are from the secondary CIDR
kubectl get pods -A -o wide | awk '{print $7}' | sort -u
```
What you want to see: pod IPs that fall in your secondary CIDR (e.g. 100.64.x.x), one ENIConfig per active AZ, and no IP-assignment errors in node events.
Common Failure Modes
ENIConfig naming mismatch
The CNI resolves ENIConfig by reading the node's AZ label and looking for an object with that exact name. If `ENI_CONFIG_LABEL_DEF` points at a label with a value like `eu-central-1a` but your ENIConfig is named differently, the lookup fails and the node falls back to default networking. Always derive the name from the subnet's `availability_zone` attribute, as the `data "aws_subnet"` lookup does — do not hardcode it.
Missing AZ coverage
If you have nodes in three AZs but only two ENIConfigs, the third AZ will silently use default networking. Check that your pod subnet list covers every AZ in your node groups.
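This is easy to guard in Terraform itself. A sketch using a `check` block (requires Terraform 1.5+), assuming the `data.aws_subnet.pod` lookup from earlier plus a hypothetical `var.node_subnet_ids` list:

```hcl
data "aws_subnet" "node" {
  for_each = toset(var.node_subnet_ids)
  id       = each.value
}

# Fail the plan with a warning if any node AZ lacks a pod subnet
# (and therefore an ENIConfig).
check "pod_subnet_az_coverage" {
  assert {
    condition = length(setsubtract(
      [for s in data.aws_subnet.node : s.availability_zone],
      [for s in data.aws_subnet.pod : s.availability_zone],
    )) == 0
    error_message = "Every node AZ must have a matching pod subnet and ENIConfig."
  }
}
```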
Partial rollout from provider misconfiguration
The addon config and the ENIConfig resources are applied by two different providers (AWS and Kubernetes). If the Kubernetes provider fails to authenticate, addon config goes through but ENIConfig creation fails. The symptom is custom networking enabled at the addon level but nodes that behave as if it is not. Always validate both sides.
Pod subnet too small under burst
Prefix delegation helps, but /28 blocks still pile up under sustained churn. Monitor pod subnet utilisation and size subnets with headroom for your realistic burst ceiling — not just steady-state density.
The Payoff
Once this is running, you have two independent capacity pools. Pod growth does not threaten your node addressing. IP exhaustion stops being a surprise incident. And because everything lives in Terraform state, there are no scripts, no manual apply sequences, and no drift to clean up after an incident.
Treat the CNI configuration as a capacity control plane — because that is what it is.