Your EKS Pods Are Eating Your VPC

The Problem

The EKS VPC CNI gives every pod a real VPC IP. No overlay, no encapsulation — just raw VPC addresses. That simplicity is great until you realise your pods are competing for the same address space as your nodes, load balancers, and RDS instances.

A /24 gives you 251 usable IPs (AWS reserves five addresses per subnet). Subtract nodes, ENI reservations, and anything else living in that subnet, and a cluster with a few dozen nodes can exhaust a single AZ before you notice. Kubernetes surfaces this as pods stuck in ContainerCreating with failed to assign an IP address errors from the CNI, or as cryptic ENI attachment failures that are annoying to debug the first time.
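
To make the exhaustion math concrete, here is a back-of-the-envelope sketch in Python. The per-node figures are assumptions for illustration (they match an m5.large: 3 ENIs with 10 IPv4 addresses each); substitute the limits for your instance types.

```python
# Rough capacity math for a /24 node subnet, without custom networking
# or prefix delegation. The instance figures are an assumption
# (m5.large-class: 3 ENIs x 10 IPv4 addresses each).
SUBNET_TOTAL = 256   # addresses in a /24
AWS_RESERVED = 5     # network, broadcast, router, DNS, one reserved
usable = SUBNET_TOTAL - AWS_RESERVED

ENIS_PER_NODE = 3
IPS_PER_ENI = 10
ips_per_node = ENIS_PER_NODE * IPS_PER_ENI  # node IP plus pod IPs

nodes_until_exhaustion = usable // ips_per_node
print(usable)                  # 251
print(nodes_until_exhaustion)  # 8 fully packed nodes drain the subnet
```

Eight densely packed nodes per /24 per AZ is not much headroom, which is the whole motivation for moving pods to a secondary CIDR.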

The fix is custom networking: move pod IPs off the primary CIDR entirely, onto a dedicated secondary address space. Nodes keep their primary IPs. Pods get their own room.

This post covers the complete setup using Terraform and the official AWS EKS module.


How It Works

With custom networking enabled, the VPC CNI uses ENIConfig objects to decide where to provision pod ENIs. You create one ENIConfig per AZ, each pointing at a subnet in your secondary CIDR. When a node boots, it reads its AZ label, finds the matching ENIConfig, and allocates pod IPs from that subnet — completely separate from the node’s primary network interface.

The result: two clean CIDR domains.

  • Primary CIDR → node IPs, infrastructure, control plane traffic
  • Secondary CIDR → pod IPs only

flowchart TB
  subgraph VPC["VPC"]
    P["Primary CIDR<br/>Node address space"]
    S["Secondary CIDR<br/>Pod address space"]

    subgraph NS["Node Subnets (primary CIDR)"]
      N1["node-subnet-a"]
      N2["node-subnet-b"]
      N3["node-subnet-c"]
    end

    subgraph PS["Pod Subnets (secondary CIDR)"]
      P1["pod-subnet-a"]
      P2["pod-subnet-b"]
      P3["pod-subnet-c"]
    end
  end

  subgraph EKS["EKS Cluster"]
    CNI["aws-vpc-cni<br/>custom networking + prefix delegation"]
    NG["Managed Node Groups"]
    NODE["Node in AZ-a"]
    ENI["ENIConfig objects<br/>AZ → pod subnet"]
    POD["Pod scheduled on node"]
  end

  P --> NS
  S --> PS
  NS --> NG
  NG --> NODE
  CNI --> NODE
  PS --> ENI
  ENI --> CNI
  NODE -->|"primary ENI → node IP<br/>from node-subnet-a"| N1
  NODE -->|"pod ENIs → pod IPs<br/>from pod-subnet-a via ENIConfig"| P1
  NODE --> POD

Why Prefix Delegation Too

While you are here, turn on prefix delegation. Without it, the CNI allocates individual secondary IPs one at a time, which creates a lot of EC2 API churn and wastes IP space at typical pod densities.

With prefix delegation, the CNI attaches /28 prefix blocks (16 IPs each) to ENIs instead of individual addresses. The difference in practice:

  • At 10 pods per node, you typically need one prefix
  • At 20–30 pods, two or three prefixes
  • Scale-out is faster, wasted headroom is lower

The tuning knob is WARM_IP_TARGET. Set it to 5 and IPAMD keeps five free pod IPs ready, adding a new /28 prefix whenever that buffer drains. Capacity grows in prefix-sized steps, which aligns well with real burst patterns.
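
The prefix counts in the bullets above follow from a simple model: IPAMD attaches enough /28 blocks to cover the running pods plus the warm buffer. A sketch of that model (simplified; real IPAMD also respects per-ENI prefix limits and other WARM_* settings):

```python
import math

PREFIX_SIZE = 16  # IPs in a /28 prefix block

def prefixes_needed(pods: int, warm_ip_target: int = 5) -> int:
    """Minimum number of /28 prefixes that covers the running pods
    while keeping `warm_ip_target` free IPs in reserve."""
    return math.ceil((pods + warm_ip_target) / PREFIX_SIZE)

for pods in (10, 20, 30):
    print(pods, prefixes_needed(pods))  # 10 -> 1, 20 -> 2, 30 -> 3
```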


The Terraform Setup

Two things to configure:

  1. The vpc-cni addon with the right environment variables
  2. One ENIConfig object per pod subnet, keyed by AZ

1. Configure the VPC CNI Addon

Inside your module "eks" block, configure the vpc-cni addon with before_compute = true. This ensures CNI behaviour is established before any node group launches — otherwise nodes boot with default networking and you have to recycle them.

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = var.cluster_name
  cluster_version = "1.32"

  vpc_id     = var.vpc_id
  subnet_ids = concat(var.node_subnet_ids, var.pod_subnet_ids)

  cluster_addons = {
    vpc-cni = {
      most_recent    = true
      before_compute = true
      configuration_values = jsonencode({
        env = {
          AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG = "true"
          ENI_CONFIG_LABEL_DEF               = "topology.kubernetes.io/zone"
          ENABLE_PREFIX_DELEGATION           = "true"
          WARM_IP_TARGET                     = "5"
        }
      })
    }
  }
}

What each variable does:

  • AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true → activates ENIConfig-based custom networking
  • ENI_CONFIG_LABEL_DEF → the node label the CNI reads to pick the right ENIConfig
  • ENABLE_PREFIX_DELEGATION=true → allocates /28 prefix blocks instead of individual IPs
  • WARM_IP_TARGET=5 → keeps five free pod IPs ready per node

Note on subnet_ids: pass both node and pod subnets here. The module uses this for cluster-level networking context. Node groups should be pinned to node subnets only via their own subnet_ids.

2. Create ENIConfig Objects

One ENIConfig per pod subnet. The metadata.name must match the AZ name exactly — that is how the CNI resolves it from the node label.

variable "pod_subnet_ids" {
  description = "Subnet IDs for pod networking (one per AZ, from secondary CIDR)"
  type        = list(string)
}

variable "pod_security_group_ids" {
  description = "Security groups to attach to pod ENIs"
  type        = list(string)
}

data "aws_subnet" "pod" {
  for_each = toset(var.pod_subnet_ids)
  id       = each.value
}

resource "kubernetes_manifest" "eni_config" {
  for_each = data.aws_subnet.pod

  manifest = {
    apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
    kind       = "ENIConfig"
    metadata = {
      name = each.value.availability_zone
    }
    spec = {
      subnet         = each.value.id
      securityGroups = var.pod_security_group_ids
    }
  }
}

The mapping is: node in eu-central-1a → looks up ENIConfig named eu-central-1a → allocates from the pod subnet in eu-central-1a. Simple, but it only works if the names match exactly.
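
Because the match is an exact string comparison, it is worth checking coverage mechanically. A small sketch of the check (you would feed it the AZ values from kubectl get nodes and the names from kubectl get eniconfig):

```python
def missing_eniconfigs(node_azs, eniconfig_names):
    """AZs that have nodes but no ENIConfig of the same name.
    The CNI matches by exact name, so this is a strict set difference."""
    return sorted(set(node_azs) - set(eniconfig_names))

node_azs = ["eu-central-1a", "eu-central-1b", "eu-central-1c"]
configs  = ["eu-central-1a", "eu-central-1b"]
print(missing_eniconfigs(node_azs, configs))  # ['eu-central-1c']
```

Any AZ this returns will silently run on default networking.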

3. Wire the Kubernetes Provider

The kubernetes_manifest resource needs a provider wired to your cluster. The standard pattern:

data "aws_eks_cluster" "this" {
  name = module.eks.cluster_name
}

data "aws_eks_cluster_auth" "this" {
  name = module.eks.cluster_name
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}

If this provider is misconfigured, the addon config (applied through the AWS provider) still goes through, but the ENIConfig resources fail to apply, leaving you with a cluster that has custom networking enabled but no AZ mappings. This is the most common cause of partial rollouts.


Rollout Sequence

The order matters. Get it wrong and you end up with nodes that booted before ENIConfig existed, meaning they stay on default networking until recycled.

sequenceDiagram
  autonumber
  participant TF as Terraform
  participant EKS as EKS Control Plane
  participant CNI as aws-vpc-cni (aws-node)
  participant K8S as Kubernetes API
  participant NODE as Node EC2
  participant NS as Node Subnet (Primary CIDR)
  participant PS as Pod Subnet (Secondary CIDR)

  TF->>EKS: Apply vpc-cni addon config<br/>CUSTOM_NETWORK_CFG=true, PREFIX_DELEGATION=true
  TF->>K8S: Create ENIConfig per AZ<br/>(subnet + security group mapping)

  EKS->>NODE: Launch node in AZ-a
  NODE->>NS: Primary ENI attaches<br/>node gets IP from primary CIDR

  NODE->>CNI: aws-node starts on node
  CNI->>K8S: Read node AZ label
  CNI->>K8S: Resolve ENIConfig named AZ-a
  K8S-->>CNI: ENIConfig → pod-subnet-a + SG

  CNI->>PS: Attach pod ENI prefix from pod-subnet-a
  CNI-->>NODE: Node ready — pod IPs from secondary CIDR

In practice:

  1. Prerequisites first — secondary CIDR associated with the VPC, pod subnets created per AZ, route tables and security groups in place
  2. Apply addon config — this sets CNI behaviour on the EKS side
  3. Apply ENIConfig resources — this activates the AZ → subnet mapping
  4. For existing clusters — recycle node groups so nodes boot with the new config

Existing nodes will not switch over on their own. The CNI locks in ENI allocation at boot time. A node replacement is required.


Verification

After nodes are up:

# 1. Check addon environment variables are applied
kubectl -n kube-system get ds aws-node -o yaml | grep -A2 "CUSTOM_NETWORK\|PREFIX_DELEGATION\|WARM_IP"

# 2. Confirm ENIConfig objects exist for each AZ
kubectl get eniconfig

# 3. Check node AZ labels
kubectl get nodes -L topology.kubernetes.io/zone

# 4. Verify pod IPs are from the secondary CIDR
kubectl get pods -A -o wide | awk '{print $7}' | sort -u

What you want to see: pod IPs that fall in your secondary CIDR (e.g. 100.64.x.x), one ENIConfig per active AZ, and no IP-assignment errors in node events.
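
The last check is easy to automate. A sketch that flags pod IPs outside the secondary CIDR (the 100.64.0.0/16 range here is an assumption; use whatever you associated with your VPC):

```python
import ipaddress

SECONDARY_CIDR = ipaddress.ip_network("100.64.0.0/16")  # assumption: your pod CIDR

def stray_pod_ips(pod_ips):
    """Pod IPs outside the secondary CIDR: usually pods on nodes that
    booted before custom networking took effect."""
    return [ip for ip in pod_ips if ipaddress.ip_address(ip) not in SECONDARY_CIDR]

print(stray_pod_ips(["100.64.12.7", "10.0.3.41"]))  # ['10.0.3.41']
```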


Common Failure Modes

ENIConfig naming mismatch

The CNI resolves ENIConfig by reading the node's AZ label and looking for an object with that exact name. If ENI_CONFIG_LABEL_DEF points at a label whose value is eu-central-1a but your ENIConfig is named differently, the lookup fails and the node falls back to default networking. Always derive the name from the subnet's availability_zone attribute in Terraform; never hardcode it.

Missing AZ coverage

If you have nodes in three AZs but only two ENIConfigs, the third AZ will silently use default networking. Check that your pod subnet list covers every AZ in your node groups.

Partial rollout from provider misconfiguration

The addon config and the ENIConfig resources are applied by two different providers (AWS and Kubernetes). If the Kubernetes provider fails to authenticate, addon config goes through but ENIConfig creation fails. The symptom is custom networking enabled at the addon level but nodes that behave as if it is not. Always validate both sides.

Pod subnet too small under burst

Prefix delegation helps, but /28 blocks still pile up under sustained churn. Monitor pod subnet utilisation and size subnets with headroom for your realistic burst ceiling — not just steady-state density.
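
One way to reason about sizing is to count whole /28 prefixes rather than individual IPs, since that is the unit prefix delegation allocates in. A rough sizing sketch (the headroom factor is an assumption; pick one that matches your burst profile):

```python
import math

def pod_subnet_prefix(max_nodes: int, max_pods_per_node: int, headroom: float = 1.5) -> int:
    """Smallest prefix length whose subnet fits peak pod load plus headroom.
    Counts whole /28 blocks, the allocation unit under prefix delegation."""
    prefixes = max_nodes * math.ceil(max_pods_per_node / 16)
    ips_needed = math.ceil(prefixes * 16 * headroom) + 5  # +5 AWS-reserved addresses
    return 32 - math.ceil(math.log2(ips_needed))

print(pod_subnet_prefix(50, 30))  # 20, i.e. provision at least a /20 per AZ
```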


The Payoff

Once this is running, you have two independent capacity pools. Pod growth does not threaten your node addressing. IP exhaustion stops being a surprise incident. And because everything lives in Terraform state, there are no scripts, no manual apply sequences, and no drift to clean up after an incident.

Treat the CNI configuration as a capacity control plane — because that is what it is.

