Recipes
Common production add-on combinations. Each recipe is a standalone main.tf — drop it into your infrastructure repo alongside your EKS cluster definition.
Recipe 1 — Minimal Production Cluster
What it installs: Metrics Server, Cluster Autoscaler, AWS Load Balancer Controller, External DNS, cert-manager, External Secrets
The smallest set of add-ons that makes an EKS cluster production-ready:
- Workloads can scale (HPA + node autoscaling)
- Services get load balancers and DNS records automatically
- TLS certificates are provisioned and renewed automatically
- Secrets are pulled from AWS Secrets Manager — no plaintext in manifests
module "eks_addons" {
source = "git::https://github.com/clouddrove/terraform-aws-eks-addons.git?ref=0.0.7"
eks_cluster_name = module.eks.cluster_name
data_plane_wait_arn = module.eks.data_plane_wait_arn
tags = {
Environment = "production"
ManagedBy = "terraform"
}
# HPA and kubectl top support
metrics_server = true
# Node group autoscaling
cluster_autoscaler = true
cluster_autoscaler_helm_config = {
version = "9.29.0"
}
# ALB/NLB provisioning for Ingress and Service resources
aws_load_balancer_controller = true
# Automatic Route 53 records from Ingress/Service hostnames
external_dns = true
external_dns_iampolicy_json_content = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = ["route53:ChangeResourceRecordSets"]
Resource = "arn:aws:route53:::hostedzone/${var.hosted_zone_id}"
},
{
Effect = "Allow"
Action = ["route53:ListHostedZones", "route53:ListResourceRecordSets"]
Resource = "*"
}
]
})
# Automatic TLS certs via Let's Encrypt
certification_manager = true
# Sync secrets from AWS Secrets Manager → Kubernetes Secrets
external_secrets = true
external_secrets_iampolicy_json_content = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = ["secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret"]
Resource = "arn:aws:secretsmanager:${var.region}:${var.account_id}:secret:${var.env}/*"
}
]
})
}
Recipe 2 — Full Observability Stack
What it installs: Prometheus, Grafana, Loki, FluentBit, Kube State Metrics, Node Termination Handler
Metrics, logs, and dashboards — all in-cluster:
- Prometheus scrapes all pods and nodes
- Loki aggregates logs from all pods via FluentBit
- Grafana visualizes both metrics and logs with a single datasource config
- Kube State Metrics provides Kubernetes object-level metrics (pod restarts, deployment status, etc.)
module "eks_addons" {
source = "git::https://github.com/clouddrove/terraform-aws-eks-addons.git?ref=0.0.7"
eks_cluster_name = module.eks.cluster_name
data_plane_wait_arn = module.eks.data_plane_wait_arn
tags = { Environment = var.env, ManagedBy = "terraform" }
# Required by Grafana for its ingress
aws_load_balancer_controller = true
metrics_server = true
# Metrics collection and alerting
prometheus = true
prometheus_helm_config = {
version = "25.11.0"
values = [
<<-EOT
alertmanager:
enabled: true
server:
retention: "15d"
persistentVolume:
size: 50Gi
EOT
]
}
# Dashboards — exposed via ALB Ingress
grafana = true
grafana_helm_config = {
version = "7.2.5"
values = [
<<-EOT
datasources:
datasources.yaml:
apiVersion: 1
datasources:
- name: Prometheus
type: prometheus
url: http://prometheus-server
isDefault: true
- name: Loki
type: loki
url: http://loki:3100
grafana.ini:
auth.anonymous:
enabled: false
EOT
]
}
# Log aggregation
loki = true
loki_helm_config = {
version = "5.43.3"
values = [
<<-EOT
loki:
commonConfig:
replication_factor: 1
storage:
type: filesystem
singleBinary:
replicas: 1
EOT
]
}
# Log forwarding from every pod to Loki + CloudWatch
fluent_bit = true
fluent_bit_helm_config = {
version = "0.43.0"
values = [
<<-EOT
config:
outputs: |
[OUTPUT]
Name loki
Match *
Host loki.monitoring.svc.cluster.local
Port 3100
Labels job=fluentbit
[OUTPUT]
Name cloudwatch_logs
Match *
region ${var.region}
log_group_name /eks/${var.cluster_name}/application
log_stream_prefix pod-
auto_create_group true
EOT
]
}
# Kubernetes object metrics for Prometheus
kube_state_metrics = true
# Alert on Spot interruptions
aws_node_termination_handler = true
# Pod restart alerting (Slack/PagerDuty notifications on OOMKilled, CrashLoopBackOff)
k8s_pod_restart_info_collector = true
}
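With the Prometheus chart's default scrape configuration, pods opt in to metrics collection through annotations on the pod template. A sketch, assuming the default annotation-based discovery; the port and path values are examples:

```yaml
# Pod template metadata — opts the pod into Prometheus scraping
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"      # example — the port your app serves metrics on
    prometheus.io/path: "/metrics"  # example — the metrics endpoint path
```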
Recipe 3 — Karpenter Autoscaling
What it installs: Karpenter, Metrics Server, AWS Load Balancer Controller
Karpenter replaces Cluster Autoscaler with a faster, more flexible node provisioner. It provisions the right instance type for each workload — no pre-defined node groups required.
module "eks_addons" {
source = "git::https://github.com/clouddrove/terraform-aws-eks-addons.git?ref=0.0.7"
eks_cluster_name = module.eks.cluster_name
data_plane_wait_arn = module.eks.data_plane_wait_arn
tags = { Environment = var.env, ManagedBy = "terraform" }
metrics_server = true
aws_load_balancer_controller = true
# Karpenter — node provisioner
karpenter = true
karpenter_helm_config = {
version = "0.35.0"
values = [
<<-EOT
settings:
clusterName: ${var.cluster_name}
clusterEndpoint: ${var.cluster_endpoint}
interruptionQueue: ${var.karpenter_interruption_queue}
EOT
]
}
# Custom IAM policy — restrict which instance types Karpenter can launch
karpenter_iampolicy_json_content = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"ec2:RunInstances", "ec2:TerminateInstances",
"ec2:DescribeInstances", "ec2:DescribeInstanceTypes",
"ec2:DescribeSubnets", "ec2:DescribeSecurityGroups",
"ec2:DescribeLaunchTemplates", "ec2:CreateLaunchTemplate",
"ec2:DeleteLaunchTemplate", "ec2:CreateFleet",
"ec2:CreateTags", "iam:PassRole",
"ssm:GetParameter"
]
Resource = "*"
}
]
})
}
After applying, create Karpenter NodePool and EC2NodeClass manifests:
```yaml
# karpenter-nodepool.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
  limits:
    cpu: 1000
  disruption:
    consolidationPolicy: WhenUnderutilized
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: KarpenterNodeRole-${cluster_name}
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${cluster_name}
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${cluster_name}
```
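To verify that Karpenter provisions nodes, deploy a workload with explicit resource requests that exceeds current capacity. A sketch of a hypothetical test deployment (the name and replica count are arbitrary):

```yaml
# inflate.yaml — hypothetical test workload to trigger node provisioning
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: pause
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"    # forces Karpenter to launch nodes once existing capacity is exhausted
```

Watch the controller logs (assuming Karpenter runs in the `karpenter` namespace) with `kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter` to see the launch decisions.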
Recipe 4 — Service Mesh with Istio
What it installs: AWS Load Balancer Controller, Istio Ingress, Kiali, Calico
Mutual TLS between services, traffic management, and observability with Kiali. Calico provides NetworkPolicy enforcement.
module "eks_addons" {
source = "git::https://github.com/clouddrove/terraform-aws-eks-addons.git?ref=0.0.7"
eks_cluster_name = module.eks.cluster_name
data_plane_wait_arn = module.eks.data_plane_wait_arn
tags = { Environment = var.env, ManagedBy = "terraform" }
metrics_server = true
# Required by Istio Ingress and Grafana
aws_load_balancer_controller = true
# Istio ingress gateway
istio_ingress = true
istio_ingress_helm_config = {
version = "1.20.0"
}
# Paths to Istio Gateway and VirtualService manifests
istio_manifests = {
istio_ingress_manifest_file_path = ["${path.module}/manifests/istio-ingress.yaml"]
istio_gateway_manifest_file_path = ["${path.module}/manifests/istio-gateway.yaml"]
}
# Kiali service mesh dashboard (requires istio_ingress = true)
kiali_server = true
kiali_manifests = {
kiali_virtualservice_file_path = "${path.module}/manifests/kiali-virtualservice.yaml"
}
# Calico network policy (eBPF dataplane)
calico_tigera = true
calico_tigera_helm_config = {
version = "3.27.0"
}
# Reload pods when ConfigMaps or Secrets change
reloader = true
}
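With Calico enforcing NetworkPolicy, a per-namespace default-deny ingress policy is a common baseline: nothing reaches pods in the namespace unless a later policy allows it. A sketch, with an illustrative namespace name:

```yaml
# default-deny-ingress.yaml — illustrative baseline policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-app        # illustrative — apply per application namespace
spec:
  podSelector: {}          # empty selector matches every pod in the namespace
  policyTypes:
    - Ingress
```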
For Istio sidecar injection, label your application namespaces:
```shell
kubectl label namespace my-app istio-injection=enabled
```
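To require mutual TLS for all sidecar-injected workloads, a mesh-wide PeerAuthentication in Istio's root namespace (`istio-system` by default) is one option; this is a sketch, not something the module creates:

```yaml
# peer-authentication.yaml — enforce mTLS mesh-wide
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system  # Istio root namespace — policy applies mesh-wide
spec:
  mtls:
    mode: STRICT           # reject plaintext traffic between mesh workloads
```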
Recipe 5 — Secrets + Backup
What it installs: External Secrets, Velero, Reloader
For teams that need audit-compliant secret management and cluster disaster recovery:
module "eks_addons" {
source = "git::https://github.com/clouddrove/terraform-aws-eks-addons.git?ref=0.0.7"
eks_cluster_name = module.eks.cluster_name
data_plane_wait_arn = module.eks.data_plane_wait_arn
tags = { Environment = var.env, ManagedBy = "terraform" }
# Sync secrets from Secrets Manager → Kubernetes Secrets
external_secrets = true
# Backup cluster resources + PVs to S3
velero = true
velero_helm_config = {
version = "6.0.0"
values = [
<<-EOT
configuration:
backupStorageLocation:
- name: default
provider: aws
bucket: ${var.velero_bucket}
config:
region: ${var.region}
volumeSnapshotLocation:
- name: default
provider: aws
config:
region: ${var.region}
initContainers:
- name: velero-plugin-for-aws
image: velero/velero-plugin-for-aws:v1.9.0
volumeMounts:
- mountPath: /target
name: plugins
EOT
]
}
# Automatically restart pods when secrets are updated by External Secrets
reloader = true
}
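Velero backups are on-demand unless you schedule them. A recurring backup can be declared with a Schedule resource; the cron expression, TTL, and name here are examples to adjust:

```yaml
# daily-backup.yaml — illustrative recurring backup
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-backup       # illustrative name
  namespace: velero
spec:
  schedule: "0 3 * * *"    # every day at 03:00 UTC (example)
  template:
    ttl: 720h              # retain backups for 30 days (example)
    includedNamespaces:
      - "*"
```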
Velero IAM policy — the module creates an IRSA role. To use a custom policy:
```hcl
velero_iampolicy_json_content = jsonencode({
  Version = "2012-10-17"
  Statement = [
    {
      Effect = "Allow"
      Action = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"]
      Resource = [
        "arn:aws:s3:::${var.velero_bucket}",
        "arn:aws:s3:::${var.velero_bucket}/*"
      ]
    },
    {
      Effect = "Allow"
      Action = [
        "ec2:CreateSnapshot", "ec2:DeleteSnapshot", "ec2:DescribeSnapshots",
        "ec2:CreateTags", "ec2:DescribeVolumes"
      ]
      Resource = "*"
    }
  ]
})
```
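On the consumption side, an ExternalSecret maps a Secrets Manager entry to a Kubernetes Secret that Reloader can then watch. A sketch, assuming a ClusterSecretStore named `aws-secrets-manager` exists; all names and keys here are illustrative:

```yaml
# external-secret.yaml — illustrative mapping
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials          # illustrative name
spec:
  refreshInterval: 1h           # re-sync from Secrets Manager hourly
  secretStoreRef:
    name: aws-secrets-manager   # assumes a ClusterSecretStore with this name
    kind: ClusterSecretStore
  target:
    name: db-credentials        # Kubernetes Secret to create/update
  data:
    - secretKey: password
      remoteRef:
        key: production/db      # Secrets Manager secret name (example)
        property: password
```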
Variable Reference
| Variable | Type | Default | Description |
|---|---|---|---|
| `eks_cluster_name` | string | `""` | EKS cluster name |
| `data_plane_wait_arn` | string | `""` | ARN to wait on before installing add-ons |
| `manage_via_gitops` | bool | `false` | Skip Helm installs; create IRSA roles only |
| `tags` | map(any) | `{}` | Tags for IAM resources |
| `irsa_iam_role_path` | any | `{}` | IAM role path for IRSA roles |
| `irsa_iam_permissions_boundary` | any | `{}` | Permissions boundary for IRSA roles |
Reference: terraform-aws-eks-addons