Terraform
One‑page refresher for day‑to‑day Terraform use (assumes Terraform v1.5 or later).
Core Concepts
- Provider: Plugin that knows how to talk to an API (AWS, Kubernetes, etc.).
- Resource: Creates/updates/destroys real infrastructure (aws_instance, kubernetes_deployment).
- Data Source: Read‑only lookup of existing objects (data "aws_ami" ...).
- Module: Reusable collection of .tf files with inputs/outputs.
- State: JSON snapshot mapping resources to real world objects; enables diff & drift detection.
- Backend: Where state is stored (local, s3, gcs, azurerm, remote, etc.).
- Plan: Execution preview (what will change) before apply.
- Workspace: Named state instance within a configuration (often overused for envs; prefer directory/env segregation + separate backends).
- Lock file (.terraform.lock.hcl): Provider checksums & versions; commit it.
- Graph: Dependency DAG of resources; Terraform walks it to order operations and runs independent nodes in parallel.
Beginner Tips
Deeper mechanics that trip up quasi‑beginners (beyond install & basic commands):
- Evaluation Phases: init (plugins/backends) → validate (syntax/type) → plan (refresh + build graph + diff) → apply (create/modify/destroy in dependency order). Data sources are read during plan when their arguments are known, otherwise during apply; resources only change during apply.
- Variable Precedence (high → low): CLI -var / -var-file (in the order given; later wins) > *.auto.tfvars (lexical order; later file wins) > terraform.tfvars > environment TF_VAR_name > defaults in variable blocks. A required variable with no value is prompted for interactively, or fails the plan when input is disabled (-input=false, typical in CI).
- Type Resolution: Variable type constraints & validation blocks run after aggregation of all sources. Terraform coerces simple types (numbers in strings) when obvious; complex mismatch fails fast. Unknown values propagate as (known after apply) in plans (e.g., from resources not yet created).
- Locals: Single-pass computed expressions; no reactivity or mutation. Use locals for DRY naming, not for ordering side effects.
- Resource Addressing: Format resource_type.name[ index | "key" ]. Changing a for_each key destroys+recreates that instance. Reordering a list used with count shifts indexes—prefer for_each with stable map/object keys to avoid churn.
- for_each vs count: Use for_each for collections needing stable identity (maps, sets of strings); use count for simple N copies. Avoid count when element removal would shift indexes.
- Implicit Dependencies: References (aws_vpc.main.id) create graph edges automatically. Only use depends_on for hidden dependencies (e.g., local-exec using file rendered elsewhere) or data sources requiring ordering.
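- Data Sources Timing: Read during plan when possible. If a data source must read something created in the same apply, reference an attribute of that resource in the data block (the read is then deferred to apply), add depends_on inside the data block when there is no attribute to reference, or split the work into separate applies.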
- Graph Parallelism: Terraform parallelizes independent nodes (default parallelism=10, adjustable via -parallelism flag). Over-parallelization can hit API rate limits—tune per provider limits if needed.
- State Locking: Backends such as S3 (native lock file or a DynamoDB table), GCS, and AzureRM implement locks. If a lock is held, wait; use terraform force-unlock only after verifying that no run is active. Never manually copy a local state file over remote state.
- Refresh Behavior: plan refreshes state in memory first (unless -refresh=false). Use plan -refresh-only to review drift without proposing infrastructure changes, then apply -refresh-only to write the refreshed state without modifying infra.
- Lifecycle ignore_changes: Masking attributes accepts external drift; use sparingly or you risk unmanaged drift. Prefer explicit configuration whenever possible.
- Sensitive Values: Mark variable/outputs as sensitive to hide in CLI output; state still stores raw values—avoid storing secrets (use external secret managers).
- Provider Plugins: Downloaded to .terraform/providers; versions pinned by .terraform.lock.hcl. Delete .terraform to force a re-download. Do not commit .terraform; do commit the lock file.
- Provider Aliases: Use alias when needing multiple credentials/regions. Pass the aliased provider explicitly, e.g. module "x" { providers = { aws = aws.secondary } }, to avoid inheriting the default by accident.
- Dynamic Blocks: Useful for nested repeatable blocks (e.g., ingress rules) but keep logic simple; heavy conditional complexity belongs in data shaping locals beforehand.
- Resource Replacement Causes: Changing an immutable argument (e.g., engine_version for some DBs) triggers destroy/create; the plan marks the resource with -/+ and flags the argument with "# forces replacement". Use lifecycle.prevent_destroy for critical resources, but know it also blocks legitimate replacements until it is removed.
- Imports: terraform import binds an existing object to a resource address; the matching resource block must already exist in the configuration. Since v1.5, an import block in configuration does the same through the normal plan/apply flow. Follow with a plan to ensure no unintended changes or replacement.
- terraform console: Great for quickly testing expressions, type conversions, templatefile rendering, and decoding JSON without applies; it loads current state & variables.
- Output Usage: Only expose what downstream modules or humans need. Over-exposed outputs risk leaking sensitive architecture details.
- Large Refactors: Use moved blocks (v1.1+) or terraform state mv, plus state rm/import, to avoid destructive churn when renaming or splitting modules.
- Backwards Compatibility: When updating module versions, read the changelog, apply in a lower environment with plan review first, and expect that changed outputs or types can break dependents.
Keep these in mind to avoid subtle plan churn, unintended recreations, and state drift early in your Terraform adoption.
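The for_each and dynamic block tips above can be sketched as follows. This is a minimal illustration, not a recommended layout: the variable names, CIDR ranges, and resource labels are all hypothetical.

```hcl
# Hypothetical input: subnets keyed by a stable name rather than list position.
variable "subnets" {
  type = map(object({
    cidr = string
    az   = string
  }))
  default = {
    app = { cidr = "10.0.1.0/24", az = "us-east-1a" }
    db  = { cidr = "10.0.2.0/24", az = "us-east-1b" }
  }
}

# Hypothetical input: ID of an existing VPC.
variable "vpc_id" {
  type = string
}

# Instances are addressed as aws_subnet.this["app"] and aws_subnet.this["db"].
# Removing "app" from the map leaves "db" untouched; with count, removing the
# first list element would shift every later index.
resource "aws_subnet" "this" {
  for_each = var.subnets

  vpc_id            = var.vpc_id
  cidr_block        = each.value.cidr
  availability_zone = each.value.az

  tags = { Name = each.key }
}

# Shape the data in a local first so the dynamic block stays trivial.
locals {
  ingress_rules = [
    { port = 443, cidr = "0.0.0.0/0" },
    { port = 22, cidr = "10.0.0.0/8" },
  ]
}

resource "aws_security_group" "web" {
  name   = "web"
  vpc_id = var.vpc_id

  dynamic "ingress" {
    for_each = local.ingress_rules
    content {
      from_port   = ingress.value.port
      to_port     = ingress.value.port
      protocol    = "tcp"
      cidr_blocks = [ingress.value.cidr]
    }
  }
}
```

Any filtering or conditional logic would go into local.ingress_rules, so the dynamic block only iterates.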
Standard Project Structure (Mono-Repo Example)
infrastructure/
  envs/
    prod/
      main.tf        # Root config: calls modules, sets backend (prod bucket/table)
      variables.tf
      outputs.tf
      providers.tf   # Required provider + version + features blocks
      backend.tf     # (Optional) or partial backend config via CLI flags
    staging/
      ...
  modules/
    network/
      main.tf
      variables.tf
      outputs.tf
      README.md
    app_service/
      main.tf
      variables.tf
      outputs.tf
  global/
    iam/
      ...            # Sometimes a separate root for global, rarely destroyed
Notes:
- Use separate root modules per environment (distinct state); reserve workspaces for truly identical replicas.
- Keep modules small, single responsibility, version pinned when pulled from registry or git.
Must‑Know Commands
Init & Hygiene:
terraform init # Install providers, configure backend
terraform init -upgrade # Update provider versions within constraints
terraform fmt -recursive # Standard formatting
terraform validate # Static validation
terraform providers lock -platform=linux_amd64 -platform=darwin_arm64
Planning & Applying:
terraform plan # Preview (reads remote state & refreshes)
terraform plan -out=plan.bin # Save plan for a later apply
terraform apply # Plan + confirm + execute
terraform apply plan.bin # Apply previously saved plan (CI)
terraform destroy # Tear everything down (careful)
State & Drift:
terraform state list # Show tracked resources
terraform state show <addr> # Inspect specific resource state
terraform refresh # Deprecated; equivalent to apply -refresh-only -auto-approve
terraform plan -refresh-only # Only reconcile state (no changes applied)
terraform apply -refresh-only # Persist refreshed state
Import & Move:
terraform import <addr> <id> # Attach existing infra to state
terraform state mv <src> <dst> # Refactor addresses (rename/move in code first)
terraform state rm <addr> # Detach orphan/no-longer-managed resource
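The import and state mv commands above have declarative counterparts that go through the normal plan/apply review. The sketch below assumes Terraform v1.5 or later; the module name, resource labels, and bucket name are hypothetical.

```hcl
# Record a refactor in configuration (moved blocks, v1.1+): the VPC that used
# to live in the root module is now declared inside module "network".
# Terraform updates the state address instead of planning destroy/create.
moved {
  from = aws_vpc.main
  to   = module.network.aws_vpc.main
}

# Adopt an existing bucket into state (import blocks, v1.5+).
# For aws_s3_bucket the import ID is the bucket name.
import {
  to = aws_s3_bucket.logs
  id = "company-logs" # hypothetical existing bucket
}

resource "aws_s3_bucket" "logs" {
  bucket = "company-logs"
}
```

Both blocks show up in terraform plan before anything is written, and can be deleted from the configuration once every environment has applied them.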
Workspaces (use sparingly):
terraform workspace list
terraform workspace new feature-x
terraform workspace select prod
Debug & Inspect:
terraform console # Evaluate expressions with loaded state & vars
terraform graph | dot -Tpng > graph.png
TF_LOG=TRACE terraform plan # Deep debug; logs may contain secrets, do not commit them
Backends & Remote State
S3 Backend (common):
terraform {
  backend "s3" {
    bucket       = "company-terraform-state"
    key          = "network/prod/terraform.tfstate"
    region       = "us-east-1"
    use_lockfile = true # S3-native state locking (Terraform >= 1.10); older versions use dynamodb_table
    encrypt      = true
  }
}
Guidelines:
- Enable locking (S3 lock file or DynamoDB table; GCS and AzureRM lock natively).
- Do not manually edit state files.
- Use data sources or outputs + remote state (terraform_remote_state) only when necessary; prefer module composition first.
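When reading another root's state is unavoidable, a terraform_remote_state lookup might look like the sketch below. It assumes the network root stores its state at the key used in the S3 example and exposes an output named vpc_id; that output name and the security group are hypothetical.

```hcl
# Read-only view of the outputs published by the network root module.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "company-terraform-state"
    key    = "network/prod/terraform.tfstate"
    region = "us-east-1"
  }
}

# Only root-level outputs are visible, under the .outputs attribute.
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

The consumer needs read access to the whole state object, which is one reason to prefer module composition or a narrower data source where possible.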
Modules
Principles:
- Inputs (variables.tf) define configurable surface; outputs (outputs.tf) expose consumed values.
- Avoid leaking provider details (e.g., pass region in root, not module).
- Version pin external modules (registry):
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"
  name    = var.project
  cidr    = var.vpc_cidr
  azs     = var.azs
}
- For internal modules, use a source path relative to the calling root, e.g. source = "../../modules/network" when called from envs/prod.
Variables & Outputs
Variables:
variable "project" {
  type        = string
  description = "Project name prefix"
}
variable "tags" {
  type        = map(string)
  default     = {}
  description = "Common resource tags"
}
Outputs:
output "vpc_id" {
  value       = aws_vpc.main.id
  description = "ID of the VPC"
}
Loading values:
- terraform.tfvars or *.auto.tfvars (auto-loaded).
- CLI: -var="project=demo" -var-file=prod.tfvars.
- Environment: TF_VAR_project=demo.
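Type constraints, validation blocks, and the sensitive flag combine as in the sketch below. The variable names and allowed values are hypothetical.

```hcl
variable "environment" {
  type        = string
  description = "Deployment environment"

  # Evaluated once the final value has been chosen from all sources
  # (tfvars files, TF_VAR_environment, -var, default).
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}

variable "db_password" {
  type        = string
  description = "Database password (redacted in CLI output, still stored in state)"
  sensitive   = true
}

# An output derived from a sensitive value must itself be marked sensitive,
# otherwise Terraform rejects the configuration.
output "db_password" {
  value     = var.db_password
  sensitive = true
}
```

Passing -var="environment=qa" fails with the error_message above before any resource is planned.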
Providers
terraform {
  required_version = ">= 1.5, < 2.0"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.60"
    }
  }
}
provider "aws" {
  region = var.region
  default_tags {
    tags = var.tags
  }
}
Tips:
- Centralize provider config in root; modules inherit the default provider configuration.
- Use alias only when multiple credentials/regions needed.
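A second-region setup with an aliased provider might look like this sketch. The regions, bucket name, and module source path are hypothetical; module "x" is only a placeholder name.

```hcl
# Default (unaliased) configuration: used by anything that does not say otherwise.
provider "aws" {
  region = "us-east-1"
}

# Additional configuration for a second region.
provider "aws" {
  alias  = "secondary"
  region = "us-west-2"
}

# A single resource opts in with the provider meta-argument.
resource "aws_s3_bucket" "replica" {
  provider = aws.secondary
  bucket   = "company-logs-replica"
}

# A module receives the aliased configuration explicitly; inside the module,
# plain "aws" then refers to aws.secondary.
module "x" {
  source = "../../modules/network" # hypothetical path

  providers = {
    aws = aws.secondary
  }
}
```

Without the providers map, module "x" would silently use the default us-east-1 configuration.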
Workspaces vs Separate Roots
Use separate root modules + backends for environments needing differing counts, features, or lifecycle policies. Workspaces fit homogeneous replicas (e.g., ephemeral feature environments) where only variable values differ.
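For the homogeneous-replica case, the workspace name can drive the few values that differ. This is a sketch under that assumption; the locals and the replica counts are hypothetical.

```hcl
variable "project" {
  type        = string
  description = "Project name prefix"
}

locals {
  # Name of the currently selected workspace, e.g. "default" or "feature-x".
  env = terraform.workspace

  # Keeps resource names unique across workspaces that share this configuration.
  name_prefix = "${var.project}-${local.env}"

  # Per-workspace override with a fallback of 1 for any other workspace.
  replica_count = lookup({ prod = 3 }, local.env, 1)
}
```

Once such lookup maps start growing per environment, that is usually the point where separate roots become easier to reason about.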
Lifecycle & Meta-Arguments
Common:
resource "aws_s3_bucket" "logs" {
  bucket = "${var.project}-logs"
  lifecycle {
    prevent_destroy = true   # Protect critical resources
    ignore_changes  = [tags] # Accept external drift for specified attrs
  }
}
depends_on rarely needed—Terraform infers graph via references.
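One case where depends_on is still warranted is a dependency that no attribute reference expresses. The sketch below assumes an EC2 instance whose boot-time software needs an IAM policy to be attached already; all names, the bucket ARN, and the ami_id variable are hypothetical.

```hcl
variable "ami_id" {
  type = string # hypothetical input
}

resource "aws_iam_role" "app" {
  name = "app"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_instance_profile" "app" {
  name = "app"
  role = aws_iam_role.app.name # implicit edge: the reference orders this after the role
}

resource "aws_iam_role_policy" "app" {
  name = "app"
  role = aws_iam_role.app.name
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action   = "s3:GetObject"
      Effect   = "Allow"
      Resource = "arn:aws:s3:::company-logs/*"
    }]
  })
}

resource "aws_instance" "app" {
  ami                  = var.ami_id
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # The instance never references the policy, so Terraform cannot infer that
  # it must exist before boot; the edge has to be declared explicitly.
  depends_on = [aws_iam_role_policy.app]
}
```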
Testing & Validation (Lightweight)
- terraform validate for syntax and internal consistency.
- Use terraform plan in CI to enforce review of infra changes (with -detailed-exitcode, exit code 2 signals a non-empty diff for approval gates).
- Consider policy-as-code (OPA / Sentinel) for guardrails (out of scope here).
Best Practices (Condensed)
- One resource purpose per module; compose rather than monolith.
- Pin versions (providers & modules) with compatible (~>) ranges.
- Tag everything (cost, ownership, environment).
- Store state remotely, with locking, in a versioned bucket.
- Keep secrets out of state (prefer external secret managers; some data sources may still leak values—sanity check outputs).
- Review plans; never blindly apply in prod from local without CI logs.
- Run terraform fmt & validate pre-commit (add a hook if desired).
- Document module inputs/outputs (README + examples block).
Troubleshooting Quick Wins
Issue / Action:
- Provider version conflict → Run terraform init -upgrade, then commit the updated lock file.
- Stuck state lock → Verify no active run; remove lock in backend (DynamoDB row) only if safe.
- Orphaned real resource (not in state) → terraform import, then manage or delete via Terraform.
- Wrong address after refactor → Use terraform state mv before apply to avoid destroy/create.
- Plan wants to recreate large resource (e.g., RDS) → Check changed immutable arguments; consider lifecycle.ignore_changes sparingly.
Minimal CI Outline
- terraform fmt -check
- terraform init -backend-config=... (inject secrets via env)
- terraform validate
- terraform plan -out=plan.bin
- (Manual or PR comment) review diff
- terraform apply plan.bin (after approval; a saved plan applies without prompting, so -auto-approve is unnecessary)
Reference Links
- Registry: https://registry.terraform.io
- Language: https://developer.hashicorp.com/terraform/language
- AWS Provider Docs: https://registry.terraform.io/providers/hashicorp/aws/latest/docs
Keep it lean: add only what you must manage; refactor modules when friction appears repeatedly.