Terraform

One‑page refresher for day‑to‑day Terraform use (targets Terraform v1.5 and later).

Core Concepts

Provider: Plugin that knows how to talk to an API (AWS, Kubernetes, etc.).
Resource: Creates/updates/destroys real infrastructure (aws_instance, kubernetes_deployment).
Data Source: Read‑only lookup of existing objects (data "aws_ami" ...).
Module: Reusable collection of .tf files with inputs/outputs.
State: JSON snapshot mapping resources to real world objects; enables diff & drift detection.
Backend: Where state is stored (local, s3, gcs, azurerm, remote, etc.).
Plan: Execution preview (what will change) before apply.
Workspace: Named state instance within a configuration (often overused for envs; prefer directory/env segregation + separate backends).
Lock file (.terraform.lock.hcl): Provider versions & checksums; commit it.
Graph: Dependency DAG of resources; Terraform calculates parallelism automatically.

Beginner Tips

Deeper mechanics that trip up quasi‑beginners (beyond install & basic commands):

Evaluation Phases: init (plugins/backends) → validate (syntax/type) → plan (refresh read + diff + create graph) → apply (exec create/modify/destroy in dependency order). Data sources resolve during plan; resources only act during apply.
Variable Precedence (high → low): CLI -var / -var-file (later flags override earlier ones) > *.auto.tfvars (lexical order, later files win) > terraform.tfvars > environment TF_VAR_name > defaults in variable blocks. A required variable with no value triggers an interactive prompt, or an error when running with -input=false.
Type Resolution: Variable type constraints & validation blocks run after aggregation of all sources. Terraform coerces simple types (numbers in strings) when obvious; complex mismatch fails fast. Unknown values propagate as (known after apply) in plans (e.g., from resources not yet created).
Locals: Single-pass computed expressions; no reactivity or mutation. Use locals for DRY naming, not for ordering side effects.
Resource Addressing: Format resource_type.name[ index | "key" ]. Changing a for_each key destroys+recreates that instance. Reordering a list used with count shifts indexes—prefer for_each with stable map/object keys to avoid churn.
for_each vs count: Use for_each for collections needing stable identity (maps, sets of strings); use count for simple N copies. Avoid count when element removal would shift indexes.
Implicit Dependencies: References (aws_vpc.main.id) create graph edges automatically. Only use depends_on for hidden dependencies (e.g., local-exec using file rendered elsewhere) or data sources requiring ordering.
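Data Sources Timing: Normally read during plan. If a data source's arguments depend on values not known until apply, Terraform defers the read to apply and shows its results as (known after apply). If it must read something created in the same apply but shares no reference with it, add depends_on inside the data block to force ordering, or split the work into separate applies.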
Graph Parallelism: Terraform parallelizes independent nodes (default parallelism=10, adjustable via -parallelism flag). Over-parallelization can hit API rate limits—tune per provider limits if needed.
State Locking: Backends such as S3 (DynamoDB table or, from Terraform 1.10, a native lock file), GCS, and AzureRM implement locks. If a lock is held, wait; force-release it only after verifying no run is active. Never manually copy a local state file over remote state.
Refresh Behavior: plan triggers a state refresh (unless -refresh=false). Use plan -refresh-only to review drift without proposing infrastructure changes, then apply -refresh-only to write the refreshed values to state without modifying infra.
Lifecycle ignore_changes: Masking attributes accepts external drift; use sparingly or you risk unmanaged drift. Prefer explicit configuration whenever possible.
Sensitive Values: Mark variables/outputs as sensitive to hide them in CLI output; state still stores the raw values, so keep secrets out of Terraform where possible (use external secret managers).
Provider Plugins: Downloaded to .terraform/providers; exact versions are pinned by .terraform.lock.hcl. Delete .terraform to force a re-download on the next init. Do not commit .terraform; do commit the lock file.
Provider Aliases: Use alias when needing multiple credentials/regions. Pass the aliased provider explicitly: module "x" { providers = { aws = aws.secondary } } to avoid inheriting default by accident.
Dynamic Blocks: Useful for nested repeatable blocks (e.g., ingress rules) but keep logic simple; heavy conditional complexity belongs in data shaping locals beforehand.
Resource Replacement Causes: Changing immutable arguments (e.g., engine_version for some DBs) triggers destroy/create; plan marks with -/+ . Use lifecycle.prevent_destroy for critical resources, but know it will block legitimate changes.
Imports: terraform import binds an existing object to a resource address; the matching resource block must exist first. From Terraform 1.5, an import block in configuration does the same as part of plan/apply. Follow with a plan to confirm no unintended changes or replacement.
terraform console: Great for quickly testing expressions, type conversions, templatefile rendering, and decoding JSON without applies; it loads current state & variables.
Output Usage: Only expose what downstream modules or humans need. Over-exposed outputs risk leaking sensitive architecture details.
Large Refactors: Use terraform state mv and state rm/import (or moved blocks in configuration, Terraform 1.1+) to avoid destructive churn when renaming or splitting modules.
Backwards Compatibility: When updating module versions, read the changelog and apply in a lower environment with plan review first; outputs or types may change and break dependents.
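
As a minimal sketch of the for_each advice above: the variable name subnets and its CIDR values are made up for illustration, and aws_vpc.main is assumed to be defined elsewhere in the configuration.

```hcl
# Hypothetical input: subnet names mapped to CIDR blocks.
variable "subnets" {
  type = map(string)
  default = {
    app = "10.0.1.0/24"
    db  = "10.0.2.0/24"
  }
}

resource "aws_subnet" "this" {
  for_each = var.subnets

  vpc_id     = aws_vpc.main.id # assumes aws_vpc.main exists elsewhere
  cidr_block = each.value

  tags = {
    Name = each.key
  }
}

# Resulting addresses: aws_subnet.this["app"], aws_subnet.this["db"].
# Removing "app" from the map affects only aws_subnet.this["app"];
# with count, removing the first list element would shift every index.
```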

Keep these in mind to avoid subtle plan churn, unintended recreations, and state drift early in your Terraform adoption.
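
The Dynamic Blocks tip above might look like the sketch below. The rule list, ports, and the security group name are illustrative assumptions, and aws_vpc.main is again assumed to exist elsewhere. The point is that the data is shaped in a local first, so the dynamic block itself contains no conditionals.

```hcl
locals {
  # Hypothetical rules, shaped up front so the dynamic block stays trivial.
  ingress_rules = [
    { port = 443, cidr = "0.0.0.0/0" },
    { port = 22, cidr = "10.0.0.0/8" },
  ]
}

resource "aws_security_group" "web" {
  name   = "web"
  vpc_id = aws_vpc.main.id # assumes aws_vpc.main exists elsewhere

  # One ingress block is generated per element of local.ingress_rules.
  dynamic "ingress" {
    for_each = local.ingress_rules
    content {
      from_port   = ingress.value.port
      to_port     = ingress.value.port
      protocol    = "tcp"
      cidr_blocks = [ingress.value.cidr]
    }
  }
}
```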

Standard Project Structure (Mono-Repo Example)

infrastructure/
    envs/
        prod/
            main.tf        # Root config: calls modules, sets backend (prod bucket/table)
            variables.tf
            outputs.tf
            providers.tf   # Required provider + version + features blocks
            backend.tf     # (Optional) or partial backend config via CLI flags
        staging/
            ...
    modules/
        network/
            main.tf
            variables.tf
            outputs.tf
            README.md
        app_service/
            main.tf
            variables.tf
            outputs.tf
    global/
        iam/
            ...            # Sometimes a separate root for global, rarely destroyed

Notes:

Use separate root modules per environment (distinct state); reserve workspaces for truly identical replicas.
Keep modules small, single responsibility, version pinned when pulled from registry or git.

Must‑Know Commands

Init & Hygiene:

terraform init                 # Install providers, configure backend
terraform init -upgrade        # Update provider versions within constraints
terraform fmt -recursive       # Standard formatting
terraform validate             # Static validation
terraform providers lock -platform=linux_amd64 -platform=darwin_arm64

Planning & Applying:

terraform plan                 # Preview (reads remote state & refreshes)
terraform plan -out=plan.bin   # Save plan for a later apply
terraform apply                # Plan + confirm + execute
terraform apply plan.bin       # Apply previously saved plan (CI)
terraform destroy              # Tear everything down (careful)

State & Drift:

terraform state list           # Show tracked resources
terraform state show <addr>    # Inspect specific resource state
terraform refresh              # (Deprecated shortcut) prefer plan for refresh
terraform plan -refresh-only   # Only reconcile state (no changes applied)
terraform apply -refresh-only  # Persist refreshed state

Import & Move:

terraform import <addr> <id>   # Attach existing infra to state
terraform state mv <src> <dst> # Refactor addresses (rename/move in code first)
terraform state rm <addr>      # Detach orphan/no-longer-managed resource
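
Recent Terraform versions also offer declarative equivalents that live in configuration and show up in plan review. A sketch, assuming Terraform 1.5+ for import and 1.1+ for moved; the bucket name and the old address aws_vpc.this are placeholders, not values from a real setup:

```hcl
# Declarative import (Terraform 1.5+): adopts an existing bucket on the
# next apply. The resource block for aws_s3_bucket.logs must also exist.
import {
  to = aws_s3_bucket.logs
  id = "example-logs-bucket" # placeholder: the real bucket's name
}

# Declarative rename (Terraform 1.1+): tells Terraform that the object
# tracked as aws_vpc.this is now aws_vpc.main, avoiding destroy/create.
moved {
  from = aws_vpc.this
  to   = aws_vpc.main
}
```

Both blocks can be removed once every state that needs them has been applied.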

Workspaces (use sparingly):

terraform workspace list
terraform workspace new feature-x
terraform workspace select prod

Debug & Inspect:

terraform console              # Evaluate expressions with loaded state & vars
terraform graph | dot -Tpng > graph.png
TF_LOG=TRACE terraform plan    # Deep debug; very verbose, so don't leave it enabled in committed CI config

Backends & Remote State

S3 Backend (common):

terraform {
    backend "s3" {
        bucket         = "company-terraform-state"
        key            = "network/prod/terraform.tfstate"
        region         = "us-east-1"
        use_lockfile   = true  # Native S3 state locking; needs Terraform 1.10+ (older versions: dynamodb_table)
        encrypt        = true
    }
}

Guidelines:

Enable locking (S3 lock file or DynamoDB table; GCS and AzureRM lock natively).
Do not manually edit state files.
Use data sources or outputs + remote state (terraform_remote_state) only when necessary; prefer module composition first.
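
For the cases where remote state really is needed, a sketch of reading another root module's outputs follows. It reuses the bucket and key from the backend example above and assumes that configuration declares an output named vpc_id; the local name network_vpc_id is made up for illustration.

```hcl
# Read-only view of the outputs stored in another configuration's state.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "company-terraform-state"
    key    = "network/prod/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  # Only root-level outputs of the other configuration are visible here.
  network_vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

Note that the reader needs access to the whole state object, which is one reason to prefer module composition or a narrower data source when possible.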

Modules

Principles:

Inputs (variables.tf) define configurable surface; outputs (outputs.tf) expose consumed values.
Avoid hard-coding provider configuration in modules (e.g., set the region in the root provider block, not inside the module).
Version pin external modules (registry):

module "vpc" {
    source  = "terraform-aws-modules/vpc/aws"
    version = "~> 5.0"
    name = var.project
    cidr = var.vpc_cidr
    azs  = var.azs
}

For internal modules, use a relative source, e.g. source = "../../modules/network" when called from envs/prod in the layout above.

Variables & Outputs

Variables:

variable "project" {
    type        = string
    description = "Project name prefix"
}
variable "tags" {
    type        = map(string)
    default     = {}
    description = "Common resource tags"
}

Outputs:

output "vpc_id" {
    value       = aws_vpc.main.id
    description = "ID of the VPC"
}

Loading values:

terraform.tfvars or *.auto.tfvars (auto-loaded).
CLI: -var="project=demo" -var-file=prod.tfvars.
Environment: TF_VAR_project=demo.
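
Type constraints, validation blocks, and the sensitive flag can be combined as in the sketch below. The variable names environment and db_password and the allowed values are assumptions for illustration.

```hcl
variable "environment" {
  type        = string
  description = "Deployment environment"

  # Evaluated after the final value is chosen from all sources.
  validation {
    condition     = contains(["staging", "prod"], var.environment)
    error_message = "The environment must be either staging or prod."
  }
}

variable "db_password" {
  type        = string
  description = "Database password (hypothetical example)"

  # Redacted in plan/apply output, but still stored in plain text in state.
  sensitive = true
}
```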

Providers

terraform {
    required_version = ">= 1.5, < 2.0"
    required_providers {
        aws = {
            source  = "hashicorp/aws"
            version = "~> 5.60"
        }
    }
}

provider "aws" {
    region  = var.region
    default_tags {
        tags = var.tags
    }
}

Tips:

Centralize provider config in the root module; child modules inherit the default (unaliased) provider configuration implicitly.
Use alias only when multiple credentials/regions needed.
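
A sketch of the alias pattern, assuming a second region is needed. The alias name secondary, the region, the module call, and the bucket name are illustrative, and the module would need its own required inputs supplied.

```hcl
# Second configuration of the same provider, distinguished by alias.
provider "aws" {
  alias  = "secondary"
  region = "us-west-2"
}

# Pass the aliased configuration to a module explicitly; otherwise the
# module inherits the default (unaliased) aws provider.
module "replica_network" {
  source = "../../modules/network"

  providers = {
    aws = aws.secondary
  }
}

# A single resource can also select the aliased configuration directly.
resource "aws_s3_bucket" "replica" {
  provider = aws.secondary
  bucket   = "${var.project}-replica"
}
```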

Workspaces vs Separate Roots

Use separate root modules + backends for environments needing differing counts, features, or lifecycle policies. Workspaces fit homogeneous replicas (e.g., ephemeral feature environments) where only variable values differ.

Lifecycle & Meta-Arguments

Common:

resource "aws_s3_bucket" "logs" {
    bucket = "${var.project}-logs"

    lifecycle {
        prevent_destroy = true    # Protect critical resources
        ignore_changes  = [tags]  # Accept external drift for specified attrs
    }
}

depends_on rarely needed—Terraform infers graph via references.
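
One case where depends_on is warranted is a dependency that no argument expresses. In this sketch, the instance's software needs the role policy at boot, yet nothing in the instance block references the policy. All names are hypothetical, and aws_iam_role.app, aws_iam_instance_profile.app, data.aws_iam_policy_document.app, and var.ami_id are assumed to be defined elsewhere.

```hcl
resource "aws_iam_role_policy" "app" {
  name   = "app-access"
  role   = aws_iam_role.app.name
  policy = data.aws_iam_policy_document.app.json
}

resource "aws_instance" "app" {
  ami                  = var.ami_id
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # Hidden dependency: the profile reference orders the instance after the
  # role, but not after the policy attached to that role.
  depends_on = [aws_iam_role_policy.app]
}
```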

Testing & Validation (Lightweight)

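terraform validate for syntax and internal consistency (it does not contact provider APIs).
Run terraform plan in CI so infra changes get reviewed; with -detailed-exitcode, plan exits 2 on a non-empty diff, which an approval gate can key on.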
Consider policy-as-code (OPA / Sentinel) for guardrails (out of scope here).

Best Practices (Condensed)

One purpose per module; compose small modules rather than building a monolith.
Pin versions (providers & modules) with compatible (~>) ranges.
Tag everything (cost, ownership, environment).
Store state remotely, with locking, in a versioned bucket.
Keep secrets out of state (prefer external secret managers; some data sources may still leak values—sanity check outputs).
Review plans; never blindly apply in prod from local without CI logs.
Run terraform fmt & validate pre-commit (add a hook if desired).
Document module inputs/outputs (README + examples block).

Troubleshooting Quick Wins

Issue / Action:

Provider version conflict → Run terraform init -upgrade then commit updated lock file.
Stuck state lock → Verify no active run, then run terraform force-unlock <LOCK_ID> (or remove the lock entry in the backend, e.g., the DynamoDB row) only if safe.
Orphaned real resource (not in state) → terraform import then manage or delete via Terraform.
Wrong address after refactor → Use terraform state mv before apply to avoid destroy/create.
Plan wants to recreate large resource (e.g., RDS) → Check changed immutable arguments; consider lifecycle.ignore_changes sparingly.

Minimal CI Outline

terraform fmt -check
terraform init -backend-config=... (inject secrets via env)
terraform validate
terraform plan -out=plan.bin
(Manual or PR comment) review diff
terraform apply -auto-approve plan.bin (after approval)

Reference Links

Registry: https://registry.terraform.io
Language: https://developer.hashicorp.com/terraform/language
AWS Provider Docs: https://registry.terraform.io/providers/hashicorp/aws/latest/docs

Keep it lean: add only what you must manage; refactor modules when friction appears repeatedly.
