Terraform — Theory
Terraform — Theory (interview deep-dive)
State: why it matters
State is the thing that makes Terraform work and break.
- Maps your HCL resources to real cloud IDs.
- Tracks attributes Terraform discovered after apply.
- Without it, every `terraform plan` would have to re-discover everything.
- Plain JSON, may contain secrets (never commit local state).

State must be shared across the team: use a remote backend with locking.
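
A remote backend with locking might look like the sketch below. It assumes AWS with the S3 backend and a DynamoDB lock table; the bucket, key, region, and table names are hypothetical placeholders, and both the bucket and the table must already exist.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-tfstate-bucket"      # hypothetical bucket name
    key            = "prod/terraform.tfstate"      # path of the state object inside the bucket
    region         = "eu-west-1"                   # assumed region
    encrypt        = true                          # server-side encryption of the state object
    dynamodb_table = "example-terraform-locks"     # hypothetical table with a string hash key named "LockID"
  }
}
```

Recent Terraform releases also offer S3-native locking, so check the backend documentation for the version you run before adding a lock table.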
State management commands
- `terraform state list`: what's managed.
- `terraform state show <addr>`: current attrs.
- `terraform state mv`: rename (refactor without recreate).
- `terraform state rm`: forget without destroying (then `import` elsewhere).
- `terraform import <addr> <id>`: bring an existing resource under management.
- `terraform refresh`: re-read real state into the local copy (deprecated; prefer `terraform apply -refresh-only`).
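
The import step can also be written declaratively. The sketch below assumes Terraform 1.5 or later and the AWS provider; the resource address and bucket name are hypothetical.

```hcl
# Declarative counterpart of `terraform import aws_s3_bucket.logs example-logs-bucket`.
import {
  to = aws_s3_bucket.logs          # hypothetical resource address
  id = "example-logs-bucket"       # S3 buckets are imported by bucket name
}

# The matching resource block must exist and describe the real bucket.
resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket"
}
```

Because the import is part of the config, it shows up in `terraform plan` and can be reviewed before anything is written to state.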
When real infra changes outside Terraform, state goes stale. `terraform plan` will show the drift; `apply` reverts the real resource back to config (or you update the HCL to adopt the change).
Drift causes: console clicks, other tools, auto-scaling.
Detection: scheduled `terraform plan -detailed-exitcode` in CI (exit code 2 means changes are pending); tools like driftctl.
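
Some drift is expected, auto-scaling being the usual case. One way to stop Terraform from fighting it is `ignore_changes`, sketched below. The variables and resource name are hypothetical, and the launch template is assumed to be defined elsewhere.

```hcl
variable "subnet_ids" { type = list(string) }
variable "launch_template_id" { type = string }

resource "aws_autoscaling_group" "web" {
  name                = "web"
  min_size            = 2
  max_size            = 10
  desired_capacity    = 2                # initial value only; scaling policies change it later
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }

  lifecycle {
    # Do not report or revert changes the autoscaler makes to desired_capacity.
    ignore_changes = [desired_capacity]
  }
}
```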
Workspaces vs separate state files
- Workspace: same config, different state (`terraform workspace new prod`). Quick to switch, easy to forget which one you're in. Good for short-lived envs.
- Separate state files / dirs: fully isolated configs per env. Recommended for prod. Use directory structure or Terragrunt.

A common pattern:

```
infra/
  modules/{vpc, app, db}/
  envs/
    dev/main.tf
    stg/main.tf
    prod/main.tf
```

Each env dir has its own backend block and references modules.
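
As an illustration of that layout, a `prod` env file could look roughly like this. The bucket name, module inputs, and the `vpc_id` output are assumptions made for the sketch, not part of any real module.

```hcl
# envs/prod/main.tf (sketch)
terraform {
  backend "s3" {
    bucket = "example-tfstate-bucket"       # hypothetical bucket
    key    = "prod/terraform.tfstate"       # only the key differs between env dirs
    region = "eu-west-1"
  }
}

module "vpc" {
  source      = "../../modules/vpc"
  cidr_block  = "10.20.0.0/16"              # hypothetical input of the local vpc module
  environment = "prod"                      # hypothetical input
}

module "app" {
  source         = "../../modules/app"
  vpc_id         = module.vpc.vpc_id        # assumes the vpc module exposes this output
  instance_count = 3                        # hypothetical input
}
```

The `dev` and `stg` files would reference the same modules with different inputs and a different state key, so a mistake in one env cannot touch another env's state.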
Modules
Best practices:
- One responsibility per module (e.g., a VPC, a service).
- Pin version (`~> 5`).
- Don't expose every variable; provide sensible defaults.
- Use outputs for stable contracts.
- Avoid deep module nesting (max 2-3 levels).
- Standard modules: terraform-aws-modules registry has battle-tested ones.
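
A minimal sketch of these practices follows: a small local module with one required input, one defaulted input, and one output, followed by a registry module pinned to a major version. File paths, variable names, and the `network` label are hypothetical.

```hcl
# modules/vpc/variables.tf
variable "cidr_block" {
  type        = string
  description = "CIDR range for the VPC."
}

variable "enable_dns_hostnames" {
  type        = bool
  default     = true                        # sensible default, so callers rarely need to set it
  description = "Whether instances get DNS hostnames."
}

# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block           = var.cidr_block
  enable_dns_hostnames = var.enable_dns_hostnames
}

# modules/vpc/outputs.tf
output "vpc_id" {
  value       = aws_vpc.this.id             # the stable contract callers depend on
  description = "ID of the created VPC."
}

# Caller side: a registry module with a pinned version.
module "network" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"
  name    = "example"
  cidr    = "10.0.0.0/16"
}
```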
Plan and apply mechanics
`plan`:
- Read state and refresh it against the real infrastructure.
- Diff against config (and provider's resource schema).
- Compute action graph: create / update (in-place or replace) / destroy.
- Output: `+` (create), `~` (update in place), `-/+` (replace), `-` (destroy) symbols.

`apply` executes the graph respecting dependencies. Implicit deps via reference (`aws_instance.x.id` creates ordering); explicit via `depends_on`.
Implicit vs explicit dependencies
```hcl
# implicit: Terraform sees the reference
resource "aws_eip" "ip" {
  instance = aws_instance.web.id
}

# explicit: for non-reference relationships (e.g., IAM eventual consistency)
resource "aws_lambda_function" "fn" {
  ...
  depends_on = [aws_iam_role_policy_attachment.logs]
}
```

What forces replacement?
- Changes to fields marked `ForceNew` in the provider schema (e.g., `aws_instance.subnet_id`).
- Plan shows `# forces replacement`.
- Risk: outage during replace. Mitigate with `create_before_destroy` if the old and new resource can exist side by side (no clash on a unique name or fixed address).
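
A small sketch of the mitigation, assuming the AWS provider and a hypothetical `vpc_id` variable. The security group uses `name_prefix` rather than a fixed `name`, so the replacement can be created while the old group still exists.

```hcl
variable "vpc_id" { type = string }

resource "aws_security_group" "app" {
  name_prefix = "app-"          # generated unique name, avoids a clash during replacement
  description = "App traffic"
  vpc_id      = var.vpc_id

  lifecycle {
    # Create the new group first, repoint dependents, then destroy the old one.
    create_before_destroy = true
  }
}
```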
Provisioners
`local-exec`, `remote-exec`, `file`. Avoid if alternatives exist (Ansible, cloud-init). By default they run only when the resource is created (or at destroy with `when = destroy`), not on later changes.
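
As one alternative to `remote-exec`, instance bootstrap can be handed to cloud-init through `user_data`. This is a sketch that assumes the AWS provider and an AMI with cloud-init installed; the `ami_id` variable and the package choice are hypothetical.

```hcl
variable "ami_id" { type = string }

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # cloud-init reads this on first boot; no SSH connection from Terraform is needed.
  user_data = <<-EOT
    #cloud-config
    packages:
      - nginx
    runcmd:
      - systemctl enable --now nginx
  EOT

  # Replace the instance when the script changes, so edits actually take effect.
  user_data_replace_on_change = true
}
```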
Sensitive data
- Mark variables/outputs `sensitive = true` to redact them from plan and apply output.
- State still contains values in plaintext! Encrypt backend, restrict access.
- Pull secrets at runtime from KMS / Vault / Secrets Manager — don’t bake them into TF.
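
A minimal sketch of the `sensitive` flag, with a hypothetical variable name and connection string. The redaction affects CLI output only; the value is still written to state.

```hcl
variable "db_password" {
  type      = string
  sensitive = true              # shown as (sensitive value) in plan and apply output
}

output "db_connection_string" {
  value     = "postgres://app:${var.db_password}@db.example.internal:5432/app"
  sensitive = true              # Terraform rejects an output built from a sensitive value unless it is marked
}
```

The value can be supplied without touching any file by exporting `TF_VAR_db_password` in the environment that runs Terraform.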
Provider versioning
```hcl
terraform {
  required_version = ">= 1.6"
  required_providers {
    aws = { source = "hashicorp/aws", version = "~> 5.0" }
  }
}
```

Commit the lock file `.terraform.lock.hcl` so every run resolves the same provider versions and checksums.
Terraform vs alternatives
- Terraform: HCL, broad provider ecosystem, mature, simple model.
- Pulumi: real programming languages (TS, Go, Python). Easier loops/conditionals. Steeper for non-coders.
- OpenTofu: open-source fork of Terraform after the BSL license change. Largely a drop-in replacement, though newer features in the two projects have started to diverge.
- CDK / CDKTF: typed code generating CloudFormation/Terraform.
- Ansible: imperative, config mgmt — different problem domain.
- CloudFormation: AWS-only, slow, sometimes still required.
- Crossplane: K8s-native IaC.
Common interview Qs
- What's in state, and why does it matter? Resource ID + attributes; needed to compute the diff.
- Two engineers run apply concurrently. What happens? Without locking → corrupt or lost state updates. Use a remote backend with locking (e.g. S3+DynamoDB lock).
- You imported an existing bucket; plan still wants to recreate. Config attribute mismatches reality. Adjust the HCL until plan shows no changes; the `# forces replacement` marker points at the offending attribute.
- A resource is gone manually. `terraform plan`? Plan will re-create it. If the deletion was intentional, remove the block from config as well (`state rm` only clears the state entry).
- Module versioning: pin or float? Pin for prod (`~> X.Y`); in reusable modules keep constraints loose so callers can choose.
- `count` vs `for_each`: when? `for_each` for stable keys; `count` for symmetric replicas. Avoid mid-list deletes with `count`, because every later index shifts.
- How do you organize multi-env infra? Module per concern; env dirs reference modules with different vars; per-env state.
- Secrets in TF: how? Keep them out of config and tfvars files. Values read through data sources (Secrets Manager / Vault) still land in state, so encrypt the backend and restrict access; where possible let the application fetch the secret at runtime.
- Refactoring resource without destroy? `terraform state mv` for renames; `moved {}` block (TF 1.1+) for declarative moves.
- TF apply taking too long. Raise `-parallelism` (default 10), break a monolithic state into smaller ones, use a targeted plan sparingly, and check whether slow provider APIs are the bottleneck.
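
The `count` vs `for_each` and `moved {}` answers can be illustrated together. The sketch assumes the AWS provider and hypothetical user names, and assumes the resource previously used `count` with "alice" at index 0.

```hcl
variable "developers" {
  type    = set(string)
  default = ["alice", "bob", "carol"]
}

# Each instance is addressed by name (aws_iam_user.dev["bob"]), so removing
# one entry from the set touches only that user.
resource "aws_iam_user" "dev" {
  for_each = var.developers
  name     = each.key
}

# Re-address the old count-based instance instead of destroying and re-creating it.
moved {
  from = aws_iam_user.dev[0]
  to   = aws_iam_user.dev["alice"]
}
```

With `count`, deleting "alice" from the front of a list would shift every later index and plan a change for each remaining user; keyed instances avoid that.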
Common pitfalls
- Local state in production.
- Massive single state file: slow plans, large blast radius.
- Hand edits via the console that were never imported.
- Provisioners on every resource.
- Hard-coded IDs (use data sources or remote state outputs).
- Unpinned providers / modules.
- `terraform destroy` in prod by accident: guard with `prevent_destroy` + IAM.
- Letting state drift; never running plan.
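
A sketch of the `prevent_destroy` guard, with a hypothetical bucket name:

```hcl
resource "aws_s3_bucket" "audit_logs" {
  bucket = "example-audit-logs"

  lifecycle {
    # Any plan that would destroy this resource fails with an error.
    prevent_destroy = true
  }
}
```

The guard lives in the resource block, so it no longer applies once the block itself is deleted from config. That is why it is paired with IAM policies that deny delete actions on production resources.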