Terraform

One‑page refresher for day‑to‑day Terraform use (assumes Terraform v1.5 or later).

Core Concepts

  • Provider: Plugin that knows how to talk to an API (AWS, Kubernetes, etc.).
  • Resource: Creates/updates/destroys real infrastructure (aws_instance, kubernetes_deployment).
  • Data Source: Read‑only lookup of existing objects (data "aws_ami" ...).
  • Module: Reusable collection of .tf files with inputs/outputs.
  • State: JSON snapshot mapping resources to real world objects; enables diff & drift detection.
  • Backend: Where state is stored (local, s3, gcs, azurerm, remote, etc.).
  • Plan: Execution preview (what will change) before apply.
  • Workspace: Named state instance within a configuration (often overused for envs; prefer directory/env segregation + separate backends).
  • Lock file (.terraform.lock.hcl): Provider versions & checksums; commit it.
  • Graph: Dependency DAG of resources; Terraform calculates parallelism automatically.

Beginner Tips

Deeper mechanics that trip up quasi‑beginners (beyond install & basic commands):

  • Evaluation Phases: init (plugins/backends) → validate (syntax/type) → plan (refresh read + diff + build graph) → apply (create/modify/destroy in dependency order). Data sources are normally read during plan, or deferred to apply when their arguments are not yet known; managed resources change only during apply.
  • Variable Precedence (high → low): CLI -var / -var-file (later flags win) > *.auto.tfvars (lexical order, later files win) > terraform.tfvars > environment TF_VAR_name > defaults in variable blocks. A required variable with no value triggers an interactive prompt, or an error when input is disabled (-input=false, typical in CI).
  • Type Resolution: Variable type constraints & validation blocks run after aggregation of all sources. Terraform coerces simple types (numbers in strings) when obvious; complex mismatch fails fast. Unknown values propagate as (known after apply) in plans (e.g., from resources not yet created).
  • Locals: Single-pass computed expressions; no reactivity or mutation. Use locals for DRY naming, not for ordering side effects.
  • Resource Addressing: Format resource_type.name[ index | "key" ]. Changing a for_each key destroys+recreates that instance. Reordering a list used with count shifts indexes—prefer for_each with stable map/object keys to avoid churn.
  • for_each vs count: Use for_each for collections needing stable identity (maps, sets of strings); use count for simple N copies. Avoid count when element removal would shift indexes.
  • Implicit Dependencies: References (aws_vpc.main.id) create graph edges automatically. Only use depends_on for hidden dependencies (e.g., local-exec using file rendered elsewhere) or data sources requiring ordering.
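  • Data Sources Timing: Read during plan when all arguments are known. If a data source must read something created in the same apply, reference the resource's attributes so the read is deferred to apply, add depends_on inside the data block for hidden dependencies, or split the work into separate applies.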
  • Graph Parallelism: Terraform parallelizes independent nodes (default parallelism=10, adjustable via -parallelism flag). Over-parallelization can hit API rate limits—tune per provider limits if needed.
  • State Locking: Backends like S3 (native lock file or DynamoDB table), GCS, and AzureRM implement locks. Wait for a held lock to clear, or release it deliberately only after verifying no run is active. Never manually copy a local state file over remote state.
  • Refresh Behavior: plan refreshes state first (unless -refresh=false). Use plan -refresh-only to review drift without proposing infrastructure changes, then apply -refresh-only to persist the refreshed state without modifying infra.
  • Lifecycle ignore_changes: Masking attributes accepts external drift; use sparingly or you risk unmanaged drift. Prefer explicit configuration whenever possible.
  • Sensitive Values: Mark variables/outputs as sensitive to redact them in CLI output. State still stores the raw values, so avoid putting secrets in state (use external secret managers).
  • Provider Plugins: Downloaded to .terraform/providers; versions pinned by .terraform.lock.hcl. Delete .terraform to force a re-download (never commit that directory); do commit the lock file.
  • Provider Aliases: Use alias when needing multiple credentials/regions. Pass the aliased provider explicitly: module "x" { providers = { aws = aws.secondary } } to avoid inheriting default by accident.
  • Dynamic Blocks: Useful for nested repeatable blocks (e.g., ingress rules) but keep logic simple; heavy conditional complexity belongs in data shaping locals beforehand.
  • Resource Replacement Causes: Changing immutable arguments (e.g., engine_version for some DBs) triggers destroy/create; plan marks with -/+ . Use lifecycle.prevent_destroy for critical resources, but know it will block legitimate changes.
  • Imports: terraform import binds an existing object to a resource address; the matching resource block must already exist in configuration. Follow with a plan to ensure no unintended future replacement.
  • terraform console: Great for quickly testing expressions, type conversions, templatefile rendering, and decoding JSON without applies; it loads current state & variables.
  • Output Usage: Only expose what downstream modules or humans need. Over-exposed outputs risk leaking sensitive architecture details.
  • Large Refactors: Use terraform state mv and state rm/import (or moved blocks in configuration) to avoid destructive churn when renaming or splitting modules.
  • Backwards Compatibility: When updating module versions, read the changelog and apply in a lower environment with plan review first. Expect that changed outputs or types can break dependents.
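
The for_each guidance above can be sketched as follows. This is a minimal illustration, assuming a hypothetical subnets variable and an aws_vpc.main resource declared elsewhere; the names and CIDR ranges are placeholders.

```hcl
# Hypothetical input: stable subnet names mapped to CIDR blocks.
variable "subnets" {
  type = map(string)
  default = {
    app = "10.0.1.0/24"
    db  = "10.0.2.0/24"
  }
}

# Assumes aws_vpc.main is declared elsewhere in the configuration.
resource "aws_subnet" "this" {
  for_each   = var.subnets
  vpc_id     = aws_vpc.main.id
  cidr_block = each.value

  tags = {
    Name = each.key
  }
}
```

Instances are addressed as aws_subnet.this["app"] and aws_subnet.this["db"], so removing one key affects only that instance. With count over a list, removing the first element would shift every later index.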

Keep these in mind to avoid subtle plan churn, unintended recreations, and state drift early in your Terraform adoption.
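
As an illustration of the dynamic block and locals advice, the sketch below shapes the data in a local first so the dynamic block itself stays trivial. The rule values and the aws_vpc.main reference are assumptions made for the example.

```hcl
locals {
  # Hypothetical rule set; any conditional logic belongs here, not in the block.
  ingress_rules = [
    { port = 80, cidr = "0.0.0.0/0" },
    { port = 443, cidr = "0.0.0.0/0" },
  ]
}

# Assumes aws_vpc.main is declared elsewhere in the configuration.
resource "aws_security_group" "web" {
  name   = "web"
  vpc_id = aws_vpc.main.id

  # One nested ingress block is generated per element of the local list.
  dynamic "ingress" {
    for_each = local.ingress_rules
    content {
      from_port   = ingress.value.port
      to_port     = ingress.value.port
      protocol    = "tcp"
      cidr_blocks = [ingress.value.cidr]
    }
  }
}
```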

Standard Project Structure (Mono-Repo Example)

infrastructure/
  envs/
    prod/
      main.tf        # Root config: calls modules, sets backend (prod bucket/table)
      variables.tf
      outputs.tf
      providers.tf   # Required provider + version + features blocks
      backend.tf     # (Optional) or partial backend config via CLI flags
    staging/
      ...
  modules/
    network/
      main.tf
      variables.tf
      outputs.tf
      README.md
    app_service/
      main.tf
      variables.tf
      outputs.tf
  global/
    iam/
      ...            # Sometimes a separate root for global, rarely destroyed

Notes:

  • Use separate root modules per environment (distinct state); reserve workspaces for truly identical replicas.
  • Keep modules small and single-responsibility; pin versions when pulling from a registry or git.

Must‑Know Commands

Init & Hygiene:

terraform init                 # Install providers, configure backend
terraform init -upgrade        # Update provider versions within constraints
terraform fmt -recursive       # Standard formatting
terraform validate             # Static validation
terraform providers lock -platform=linux_amd64 -platform=darwin_arm64   # Record checksums for extra platforms

Planning & Applying:

terraform plan                 # Preview (reads remote state & refreshes)
terraform plan -out=plan.bin   # Save plan for a later apply
terraform apply                # Plan + confirm + execute
terraform apply plan.bin       # Apply previously saved plan (CI)
terraform destroy              # Tear everything down (careful)

State & Drift:

terraform state list           # Show tracked resources
terraform state show <addr>    # Inspect specific resource state
terraform refresh              # Deprecated; prefer apply -refresh-only
terraform plan -refresh-only   # Preview drift reconciliation (state only, no infra changes)
terraform apply -refresh-only  # Persist refreshed state

Import & Move:

terraform import <addr> <id>   # Attach existing infra to state
terraform state mv <src> <dst> # Refactor addresses (rename/move in code first)
terraform state rm <addr>      # Detach orphan/no-longer-managed resource
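
Imports and address moves can also be declared in configuration and reviewed in a normal plan, using import blocks (Terraform 1.5+) and moved blocks (Terraform 1.1+). A minimal sketch, assuming the target resource blocks already exist; the bucket name and the aws_instance addresses are hypothetical.

```hcl
# Adopt an existing bucket into state on the next apply.
# Assumes a resource "aws_s3_bucket" "logs" block exists in configuration.
import {
  to = aws_s3_bucket.logs
  id = "example-logs-bucket" # hypothetical bucket name
}

# Record a rename so the plan shows a move instead of destroy/create.
# Assumes the resource block is now named "frontend" in configuration.
moved {
  from = aws_instance.web
  to   = aws_instance.frontend
}
```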

Workspaces (use sparingly):

terraform workspace list
terraform workspace new feature-x
terraform workspace select prod

Debug & Inspect:

terraform console                        # Evaluate expressions with loaded state & vars
terraform graph | dot -Tpng > graph.png  # Render the dependency graph (requires Graphviz)
TF_LOG=TRACE terraform plan              # Deep debug logging (set per run; do not commit)

Backends & Remote State

S3 Backend (common):

terraform {
  backend "s3" {
    bucket       = "company-terraform-state"
    key          = "network/prod/terraform.tfstate"
    region       = "us-east-1"
    use_lockfile = true # S3-native state locking (Terraform 1.10+; older setups use dynamodb_table)
    encrypt      = true
  }
}
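
The same settings can be supplied as a partial configuration, which keeps environment-specific values out of the committed backend block. A minimal sketch, assuming a hypothetical file named prod.s3.tfbackend that is passed with terraform init -backend-config=prod.s3.tfbackend:

```hcl
# backend.tf (committed): the block is left empty on purpose.
terraform {
  backend "s3" {}
}

# prod.s3.tfbackend (hypothetical separate file, shown here as comments):
#   bucket  = "company-terraform-state"
#   key     = "network/prod/terraform.tfstate"
#   region  = "us-east-1"
#   encrypt = true
```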

Guidelines:

  • Enable locking (S3 lock file or DynamoDB table, GCS native locking, etc.).
  • Do not manually edit state files.
  • Use data sources or outputs + remote state (terraform_remote_state) only when necessary; prefer module composition first.
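
When reading another root module's state is unavoidable, the lookup might look like the sketch below. The output name private_subnet_id and the variable ami_id are hypothetical; the producing root module must actually declare that output.

```hcl
# Read the outputs of the network root module from its remote state.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "company-terraform-state"
    key    = "network/prod/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id # hypothetical variable
  instance_type = "t3.micro"
  # Hypothetical output exposed by the network root module.
  subnet_id = data.terraform_remote_state.network.outputs.private_subnet_id
}
```

Only root-level outputs are visible this way, which is one reason to prefer passing values through module composition when both sides live in the same configuration.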

Modules

Principles:

  • Inputs (variables.tf) define configurable surface; outputs (outputs.tf) expose consumed values.
  • Avoid embedding provider configuration in modules (e.g., set the region in the root provider block, not inside the module).
  • Version pin external modules (registry):
    module "vpc" {
      source  = "terraform-aws-modules/vpc/aws"
      version = "~> 5.0"
      name    = var.project
      cidr    = var.vpc_cidr
      azs     = var.azs
    }
  • For internal modules, use a relative source; from envs/prod in the layout above that is source = "../../modules/network".

Variables & Outputs

Variables:

variable "project" {
  type        = string
  description = "Project name prefix"
}

variable "tags" {
  type        = map(string)
  default     = {}
  description = "Common resource tags"
}
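
Type constraints can be paired with a validation block so bad input fails during validation rather than midway through an apply. A minimal sketch; the environment variable and its allowed values are assumptions for illustration.

```hcl
variable "environment" {
  type        = string
  description = "Deployment environment"

  # The condition may reference only this variable on older Terraform versions.
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be one of: dev, staging, prod."
  }
}
```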

Outputs:

output "vpc_id" {
  value       = aws_vpc.main.id
  description = "ID of the VPC"
}
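
Values that should not appear in CLI output can be marked sensitive on both the variable and any output that passes them along. A sketch with a hypothetical db_password variable; note that the raw value is still written to state.

```hcl
variable "db_password" {
  type        = string
  description = "Database password (hypothetical example input)"
  sensitive   = true
}

# An output derived from a sensitive value must also be marked sensitive,
# otherwise Terraform reports an error.
output "db_password" {
  value     = var.db_password
  sensitive = true
}
```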

Loading values:

  1. terraform.tfvars or *.auto.tfvars (auto-loaded).
  2. CLI: -var="project=demo" -var-file=prod.tfvars.
  3. Environment: TF_VAR_project=demo.

Providers

terraform {
  required_version = ">= 1.5, < 2.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.60"
    }
  }
}

provider "aws" {
  region = var.region
  default_tags {
    tags = var.tags
  }
}

Tips:

  • Centralize provider config in root; modules inherit the default provider configuration implicitly.
  • Use alias only when multiple credentials/regions needed.
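
An aliased provider might be wired up as in the sketch below. The alias name secondary, the region, the bucket name, and the module label are illustrative assumptions; the module path assumes a root under envs/prod.

```hcl
# Second configuration of the same provider, for another region.
provider "aws" {
  alias  = "secondary"
  region = "us-west-2"
}

# A resource selects the aliased configuration explicitly.
resource "aws_s3_bucket" "replica" {
  provider = aws.secondary
  bucket   = "${var.project}-replica"
}

# A module receives it through the providers map; without this it would
# inherit the default (un-aliased) aws provider.
module "network_secondary" {
  source = "../../modules/network"
  # Module inputs omitted for brevity.

  providers = {
    aws = aws.secondary
  }
}
```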

Workspaces vs Separate Roots

Use separate root modules + backends for environments needing differing counts, features, or lifecycle policies. Workspaces fit homogeneous replicas (e.g., ephemeral feature environments) where only variable values differ.
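
For the homogeneous-replica case, the workspace name is available as terraform.workspace and can feed naming so replicas do not collide. A minimal sketch; the bucket suffix is a placeholder.

```hcl
locals {
  # e.g. "demo-feature-x" when var.project is "demo" and the workspace is "feature-x"
  name_prefix = "${var.project}-${terraform.workspace}"
}

resource "aws_s3_bucket" "artifacts" {
  bucket = "${local.name_prefix}-artifacts"
}
```

If the configuration starts branching on terraform.workspace to change counts or features, that is usually a sign the environments should be separate roots instead.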

Lifecycle & Meta-Arguments

Common:

resource "aws_s3_bucket" "logs" {
  bucket = "${var.project}-logs"

  lifecycle {
    prevent_destroy = true   # Protect critical resources
    ignore_changes  = [tags] # Accept external drift for specified attrs
  }
}

depends_on is rarely needed; Terraform infers the graph from references.
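
One case where depends_on does earn its place is a dependency that no attribute reference expresses. The sketch below assumes that aws_iam_role.app, aws_iam_instance_profile.app, data.aws_iam_policy_document.app, and var.ami_id are declared elsewhere; all of these names are hypothetical.

```hcl
resource "aws_iam_role_policy" "app" {
  name   = "app-s3-access"
  role   = aws_iam_role.app.id
  policy = data.aws_iam_policy_document.app.json
}

resource "aws_instance" "app" {
  ami                  = var.ami_id
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # Software on the instance needs the policy at boot, but nothing in this
  # block references it, so the ordering has to be stated explicitly.
  depends_on = [aws_iam_role_policy.app]
}
```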

Testing & Validation (Lightweight)

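  • terraform validate for syntax and internal consistency (no remote API calls).
  • Use terraform plan in CI to enforce review of infra changes; with -detailed-exitcode, plan exits with code 2 when the diff is non-empty, which approval gates can key on.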
  • Consider policy-as-code (OPA / Sentinel) for guardrails (out of scope here).
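
Lightweight guardrails can also live in the configuration itself as preconditions (Terraform 1.2+). A sketch, assuming a data "aws_ami" "selected" lookup and an instance_type variable are declared elsewhere; both names are hypothetical.

```hcl
resource "aws_instance" "worker" {
  ami           = data.aws_ami.selected.id
  instance_type = var.instance_type

  lifecycle {
    # Evaluated before the instance is created or updated; a false
    # condition fails the run with the message below.
    precondition {
      condition     = data.aws_ami.selected.architecture == "x86_64"
      error_message = "The selected AMI must be for the x86_64 architecture."
    }
  }
}
```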

Best Practices (Condensed)

  • One resource purpose per module; compose rather than monolith.
  • Pin versions (providers & modules) with compatible (~>) ranges.
  • Tag everything (cost, ownership, environment).
  • Store state remotely, with locking enabled, in a versioned bucket.
  • Keep secrets out of state (prefer external secret managers; some data sources may still leak values—sanity check outputs).
  • Review plans; never blindly apply in prod from local without CI logs.
  • Run terraform fmt & validate pre-commit (add a hook if desired).
  • Document module inputs/outputs (README + examples block).

Troubleshooting Quick Wins

Issue / Action:

  • Provider version conflict → Run terraform init -upgrade then commit updated lock file.
  • Stuck state lock → Verify no active run, then run terraform force-unlock <lock-id> (or remove the lock entry in the backend, e.g. the DynamoDB row) only if safe.
  • Orphaned real resource (not in state) → terraform import then manage or delete via Terraform.
  • Wrong address after refactor → Use terraform state mv before apply to avoid destroy/create.
  • Plan wants to recreate large resource (e.g., RDS) → Check changed immutable arguments; consider lifecycle.ignore_changes sparingly.

Minimal CI Outline

  1. terraform fmt -check
  2. terraform init -backend-config=... (inject secrets via env)
  3. terraform validate
  4. terraform plan -out=plan.bin
  5. (Manual or PR comment) review diff
  6. terraform apply plan.bin (after approval; a saved plan applies without a confirmation prompt, so -auto-approve is not needed)

Keep it lean: add only what you must manage; refactor modules when friction appears repeatedly.