Terraform Stacks: Managing Multi-Environment Infrastructure at Scale

Terraform Stacks change how teams organize and deploy infrastructure across multiple environments. Delivered through HCP Terraform, Stacks provide a native orchestration layer that coordinates multiple Terraform configurations, handles the dependencies between them, and enables consistent deployments from development through production.

This guide walks you through adopting Terraform Stacks to replace fragile workspace-based or directory-based multi-environment patterns. You will learn how to design reusable components, implement deployment orchestration, and maintain drift detection across all your environments.

The Problem with Traditional Multi-Environment Patterns

Most teams manage multiple environments using one of three approaches: workspaces, directory duplication, or wrapper scripts. Each has significant drawbacks. Workspaces share state backends and make it easy to accidentally apply changes to the wrong environment. Directory duplication leads to configuration drift between environments. Wrapper scripts add complexity without proper dependency management.

Terraform Stacks solve these problems by introducing a declarative orchestration layer above individual Terraform configurations. Each stack defines which components to deploy, in what order, and with what environment-specific variables.

Terraform Stacks Architecture

A Stack consists of components, deployments, and orchestration rules. Components are reusable Terraform configurations, such as your VPC module, your EKS cluster module, and your RDS module. Deployments define where and how those components get applied. The order of operations follows from the references between components (for example, component.networking.vpc_id), while orchestration rules control how each deployment's plans are approved.

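# stacks/platform/components.tfstack.hcl
# Define reusable components

required_providers {
  aws = {
    source  = "hashicorp/aws"
    version = "~> 5.0" # illustrative constraint; pin to the version you test with
  }
}

# Modules used as Stack components cannot configure their own providers,
# so the Stack declares each provider and passes it to the components.
# Authentication settings are omitted here.
provider "aws" "this" {
  config {
    region = "us-east-1" # matches the availability zones used in the deployments
  }
}

component "networking" {
  source = "./modules/networking"

  inputs = {
    vpc_cidr    = var.vpc_cidr
    environment = var.environment
    azs         = var.availability_zones
    enable_nat  = var.environment != "dev"
    enable_vpn  = var.environment == "production"
  }

  providers = {
    aws = provider.aws.this
  }
}

component "kubernetes" {
  source = "./modules/eks-cluster"

  inputs = {
    cluster_name   = "${var.project}-${var.environment}"
    vpc_id         = component.networking.vpc_id
    subnet_ids     = component.networking.private_subnet_ids
    node_count     = var.node_count
    instance_types = var.instance_types
    k8s_version    = var.kubernetes_version
  }

  providers = {
    aws = provider.aws.this
  }
}

component "database" {
  source = "./modules/rds-aurora"

  inputs = {
    cluster_name     = "${var.project}-${var.environment}-db"
    vpc_id           = component.networking.vpc_id
    subnet_ids       = component.networking.database_subnet_ids
    instance_class   = var.db_instance_class
    multi_az         = var.environment == "production"
    backup_retention = var.environment == "production" ? 30 : 7
  }

  providers = {
    aws = provider.aws.this
  }
}

component "monitoring" {
  source = "./modules/observability"

  inputs = {
    cluster_endpoint = component.kubernetes.cluster_endpoint
    db_endpoint      = component.database.cluster_endpoint
    environment      = var.environment
    alert_endpoints  = var.alert_endpoints
  }

  # Add any further providers this module needs alongside aws
  providers = {
    aws = provider.aws.this
  }
}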
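
The component inputs above read values such as component.networking.vpc_id, so each module has to export them as outputs. The following is a minimal sketch of what the networking module might contain. The file path, resource names, and subnet sizing are assumptions made for illustration and are not taken from the original modules; NAT and VPN resources are left out.

```hcl
# modules/networking/main.tf (hypothetical module internals)

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# Inputs supplied by the "networking" component
variable "vpc_cidr" { type = string }
variable "environment" { type = string }
variable "azs" { type = list(string) }
variable "enable_nat" { type = bool } # declared to match the component; unused in this sketch
variable "enable_vpn" { type = bool } # declared to match the component; unused in this sketch

resource "aws_vpc" "main" {
  cidr_block = var.vpc_cidr

  tags = {
    Environment = var.environment
  }
}

# One private subnet per availability zone (a /20 when the VPC is a /16)
resource "aws_subnet" "private" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.main.id
  availability_zone = var.azs[count.index]
  cidr_block        = cidrsubnet(var.vpc_cidr, 4, count.index)
}

# One database subnet per availability zone, offset to avoid overlap
resource "aws_subnet" "database" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.main.id
  availability_zone = var.azs[count.index]
  cidr_block        = cidrsubnet(var.vpc_cidr, 4, count.index + 8)
}

# Outputs referenced by the kubernetes and database components
output "vpc_id" {
  value = aws_vpc.main.id
}

output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}

output "database_subnet_ids" {
  value = aws_subnet.database[*].id
}
```

Keeping the output names identical across environments is what lets a single component definition serve dev, staging, and production.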

Defining Deployments per Environment

# stacks/platform/deployments.tfdeploy.hcl

deployment "dev" {
  inputs = {
    environment        = "dev"
    project            = "platform"
    vpc_cidr           = "10.0.0.0/16"
    availability_zones = ["us-east-1a", "us-east-1b"]
    node_count         = 2
    instance_types     = ["t3.medium"]
    kubernetes_version = "1.29"
    db_instance_class  = "db.t3.medium"
    alert_endpoints    = ["dev-alerts@company.com"]
  }
}

deployment "staging" {
  inputs = {
    environment        = "staging"
    project            = "platform"
    vpc_cidr           = "10.1.0.0/16"
    availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
    node_count         = 3
    instance_types     = ["t3.large"]
    kubernetes_version = "1.29"
    db_instance_class  = "db.r6g.large"
    alert_endpoints    = ["staging-alerts@company.com"]
  }
}

deployment "production" {
  inputs = {
    environment        = "production"
    project            = "platform"
    vpc_cidr           = "10.2.0.0/16"
    availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
    node_count         = 6
    instance_types     = ["m6i.xlarge", "m6i.2xlarge"]
    kubernetes_version = "1.28"
    db_instance_class  = "db.r6g.2xlarge"
    alert_endpoints    = ["prod-alerts@company.com", "oncall@company.com"]
  }
}
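
Each key in a deployment's inputs map has to correspond to a variable declared in the Stack configuration, and Stack variables need an explicit type. The declarations below are a sketch inferred from the values used in the deployments above; the file name is hypothetical and the types are assumptions based on those values.

```hcl
# stacks/platform/variables.tfstack.hcl (hypothetical file name)

variable "environment" {
  type = string
}

variable "project" {
  type = string
}

variable "vpc_cidr" {
  type = string
}

variable "availability_zones" {
  type = list(string)
}

variable "node_count" {
  type = number
}

variable "instance_types" {
  type = list(string)
}

variable "kubernetes_version" {
  type = string
}

variable "db_instance_class" {
  type = string
}

variable "alert_endpoints" {
  type = list(string)
}
```

With the types declared, a deployment that passes a value of the wrong shape, for example a single string for instance_types, is rejected at plan time instead of failing inside a module.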

Orchestrated Deployment Pipelines

Stacks support progressive deployments in which changes flow from dev to staging to production, with approval rules deciding which plans apply automatically and which wait for a person. This pattern catches issues before they reach production.

# stacks/platform/orchestration.tfdeploy.hcl
# Uses the orchestrate "auto_approve" rule form from the Stacks beta.
# A plan applies automatically only when every check passes; any other
# plan waits for manual approval in HCP Terraform.

orchestrate "auto_approve" "non_production" {
  check {
    # Dev and staging may apply automatically; production never does
    condition = context.plan.deployment != deployment.production
    reason    = "Production plans require manual approval."
  }

  check {
    # Plans that destroy resources always get a human review
    condition = context.plan.changes.remove == 0
    reason    = "Plan removes ${context.plan.changes.remove} resources."
  }
}

# Health checks between environments (for example against
# https://api.dev.company.com/health and https://api.staging.company.com/health)
# and notifications to platform-team@company.com are not part of this rule
# form. Run them in your delivery pipeline before approving the next plan.

Drift Detection and Reconciliation

One of the most valuable features of Terraform Stacks is drift detection across all deployments. Instead of manually running terraform plan against each environment, you define once how often drift is checked and who is alerted when the actual infrastructure diverges from the desired state.

# Enable drift detection for all deployments
orchestrate "drift_detection" {
  schedule = "0 */6 * * *"  # Every 6 hours

  on_drift {
    severity = "high"
    action   = "notify"
    notify   = ["infrastructure-team@company.com"]
  }

  on_drift {
    severity = "critical"
    action   = "auto_reconcile"
    notify   = ["infrastructure-team@company.com", "oncall@company.com"]
  }
}

When NOT to Use Terraform Stacks

If you only manage a single environment, or your infrastructure is simple enough to fit in one Terraform configuration, Stacks add unnecessary complexity. Stacks also depend on HCP Terraform (or Terraform Enterprise) for their full orchestration features and on a Terraform release that supports them, so confirm the minimum version in the Stacks documentation before you commit. The open-source CLI alone provides limited Stack support.

Teams using Terragrunt or Pulumi may find those tools already solve their multi-environment needs sufficiently. Migration from an existing Terragrunt setup to Stacks is non-trivial and may not justify the effort unless you need the native drift detection and orchestration capabilities.

Key Takeaways

Terraform Stacks bring declarative orchestration to multi-environment infrastructure deployments. By defining components, deployments, and orchestration rules in a single Stack, you eliminate the fragile scripts and manual processes that typically coordinate multi-environment infrastructure. Drift detection then helps keep your environments consistent over time.

Start by identifying your most complex multi-environment workflow and modeling it as a Stack. For additional resources, consult the Terraform Stacks documentation and the HashiCorp blog on Stacks. You might also find our posts on Karpenter autoscaling and Kubernetes network policies helpful for managing the workloads running on your infrastructure.
