## TL;DR
- Use Terraform (or OpenTofu) for anything you recreate more than once: networks, databases, clusters, buckets, DNS.
- Remote state is not optional — set it up before you have a reason to regret skipping it.
- Keep modules flat and boring — one repo, one state per environment, no clever abstractions until you feel real pain.
- Run Terraform in CI — a plan on PR, apply on merge. Never from a laptop in prod.
## Who this guide is for
This is for teams of 3–10 engineers that want repeatable infrastructure without a full platform engineering function. You’re probably already using a cloud provider, maybe clicking around the console, and you’ve heard you should be doing “IaC” but aren’t sure where to start or how far to take it.
The goal is a setup that works, doesn’t require a Terraform expert to operate, and can grow with you.
## Why bother with Terraform at this stage
The honest answer: because console-clicking doesn’t scale and “it works on my account” is not documentation.
Terraform gives you:
- Repeatability: recreate staging that looks like prod in minutes.
- Auditability: infrastructure changes go through the same review process as code.
- Safety: plan before apply; review before destroy.
- Handover: new team members can understand what exists and why.
Even a small team benefits from these — especially when the person who set up the database three years ago has left.
## Core concepts (fast recap)
If you already know Terraform, skip this.
- Provider: the plugin that talks to a cloud API (AWS, GCP, Azure, Hetzner…).
- Resource: a thing you’re managing (a database, a bucket, a DNS record).
- State: a file (`.tfstate`) that tracks what Terraform manages. This is precious.
- Plan: Terraform compares state to your config and shows what it will change.
- Apply: executes the plan. Changes happen.
The mental model: Terraform holds a map of the world. You describe what the world should look like. It figures out the diff and makes it so.
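To make that concrete, here is a minimal sketch of a single managed resource. The provider, region, and bucket name are placeholder assumptions for illustration, not values from this guide:

```hcl
# Minimal single-resource config. Region and bucket name are placeholders.
provider "aws" {
  region = "eu-central-1"
}

# One resource: an object storage bucket Terraform will create and track in state.
resource "aws_s3_bucket" "assets" {
  bucket = "your-company-assets-example"
}
```

Running `terraform plan` against this shows one resource to add; `terraform apply` creates it and records it in state, after which Terraform owns its lifecycle.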
## Remote state: do this first
By default, state is stored locally. That’s fine for one person experimenting. It’s a problem the moment two people share infrastructure or you lose your laptop.
Set up remote state before you share the repo.
Use your cloud provider’s object storage:
- AWS: S3 bucket + DynamoDB table for state locking.
- GCP: GCS bucket.
- Azure: Azure Storage.
- Hetzner / other: Terraform Cloud or a self-managed S3-compatible backend works.
A minimal S3 backend block:

```hcl
terraform {
  backend "s3" {
    bucket         = "your-company-tfstate"
    key            = "prod/terraform.tfstate"
    region         = "eu-central-1"
    dynamodb_table = "terraform-state-lock"
    encrypt        = true
  }
}
```

Rules:
- One bucket, one key per environment (`staging/terraform.tfstate`, `prod/terraform.tfstate`).
- Enable versioning on the bucket so you can recover from bad applies.
- Encrypt the state (it often contains secrets).
- Never commit `.tfstate` to git. Add it to `.gitignore`.
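The state bucket and lock table can themselves be managed with Terraform. A hedged sketch for AWS (names are placeholders; bootstrap these once with local state, then switch to the remote backend):

```hcl
# Bootstrap resources for remote state. Apply once with local state,
# then point the backend block at them. Names are placeholders.
resource "aws_s3_bucket" "tfstate" {
  bucket = "your-company-tfstate"
}

# Versioning lets you roll state back after a bad apply.
resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt state at rest, since it often contains secrets.
resource "aws_s3_bucket_server_side_encryption_configuration" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# Lock table: the S3 backend requires a "LockID" string hash key.
resource "aws_dynamodb_table" "lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"
  attribute {
    name = "LockID"
    type = "S"
  }
}
```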
## Project structure for a small team
Keep it simple. One common pattern that works:
```
terraform/
  staging/
    main.tf
    variables.tf
    outputs.tf
    backend.tf
  prod/
    main.tf
    variables.tf
    outputs.tf
    backend.tf
  modules/
    database/
    networking/
```

Why separate directories per environment instead of workspaces?

- Each environment has its own state file. A `terraform destroy` in staging can’t touch prod.
- Easier to review: you see exactly what’s different between environments.
- Easier for CI: staging and prod have independent pipelines.

Workspaces are fine for very simple cases, but directories are less surprising.
## What to manage with Terraform (and what not to)
### Good fits for Terraform
- Networks, subnets, security groups, firewall rules.
- Managed databases (Postgres, MySQL), Redis.
- Object storage buckets and policies.
- DNS records.
- Kubernetes clusters (the cluster itself, not what runs inside it).
- IAM roles, service accounts, access policies.
- CDN, load balancers, TLS certificates.
### Not worth Terraforming early on
- Every console setting you’ll change once.
- Application config and feature flags (use your app’s config system).
- Kubernetes workloads (use Helm/Kustomize instead).
- Short-lived resources you create and destroy constantly.
The rule: if recreating it from scratch would take more than 10 minutes to get right, put it in Terraform.
## CI/CD integration: never apply from a laptop
The single biggest improvement you can make to your Terraform workflow is running it in CI.
Minimum setup:
- On pull request: run `terraform plan` and post the output as a comment.
- On merge to main: run `terraform apply` automatically (for staging) or with manual approval (for prod).
This means:
- Changes are reviewed before they happen.
- There’s an audit trail of who applied what and when.
- Nobody applies from a personal terminal with local credentials.
GitHub Actions example (minimal):

```yaml
name: Terraform
on:
  pull_request:
    paths:
      - 'terraform/**'
  push:
    branches: [main]
    paths:
      - 'terraform/**'

jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: terraform/staging
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Terraform Init
        run: terraform init
      - name: Terraform Plan
        if: github.event_name == 'pull_request'
        run: terraform plan -no-color
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -auto-approve
```

Production applies should require manual approval — use GitHub Actions environments with required reviewers, or a separate workflow that must be manually triggered.
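One way to sketch that gate, assuming a GitHub environment named `production` with required reviewers configured in the repository settings (the environment name and job layout here are assumptions, not part of the workflow above):

```yaml
# Additional job for the same workflow file. The job pauses at the
# "production" environment until a required reviewer approves it.
  apply-prod:
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    runs-on: ubuntu-latest
    environment: production
    defaults:
      run:
        working-directory: terraform/prod
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - run: terraform apply -auto-approve
```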
## Variables and secrets in Terraform
Don’t hardcode environment-specific values or secrets in `.tf` files.
Use `variables.tf` for inputs:

```hcl
variable "db_password" {
  description = "Database password"
  type        = string
  sensitive   = true
}
```

Pass values via:

- `terraform.tfvars` files (one per environment, not committed to git if they contain secrets).
- Environment variables: `TF_VAR_db_password=...`
- CI secrets injected as env vars at runtime.
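For illustration, a hypothetical `staging/terraform.tfvars` (all names and values here are made up):

```hcl
# staging/terraform.tfvars: non-secret, environment-specific inputs.
# Safe to commit because nothing sensitive lives here.
instance_type = "t3.small"
db_name       = "app_staging"

# Secrets are injected at runtime instead, e.g.:
#   TF_VAR_db_password=... terraform plan
```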
For the actual secret values (passwords, API keys): Terraform manages a secret’s existence but ideally doesn’t hold its value long-term. Use your cloud’s secrets manager for the runtime values and let Terraform provision the secret resource (the slot), not necessarily the value.
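A sketch of that slot pattern with AWS Secrets Manager (the secret name is a placeholder):

```hcl
# Terraform creates the secret "slot" but never sets the value,
# so the password itself stays out of Terraform state.
resource "aws_secretsmanager_secret" "db_password" {
  name = "prod/db-password"
}
```

The value is then written out of band (console, CLI, or a rotation job). If you instead add an `aws_secretsmanager_secret_version` with the value inline, that value ends up in state — exactly what this pattern avoids.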
## Modules: when to use them
A module is just a reusable chunk of Terraform config.
Use modules when you genuinely repeat the same pattern — a database with its security group and backup policy, or a service with its load balancer and DNS record.
Don’t create modules prematurely. A common mistake is building a module for one resource you have once. Start with flat files, extract a module when you write the same config a second or third time.
Rule of thumb: if you can read the whole environment in one `main.tf` in under 10 minutes, you don’t need more abstraction yet.
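When the second copy does appear, the extraction looks something like this (the module name and its input variables are hypothetical):

```hcl
# Two databases sharing one local module. The module's inputs
# (name, instance_class) are assumptions for illustration.
module "orders_db" {
  source         = "../modules/database"
  name           = "orders"
  instance_class = "db.t3.medium"
}

module "billing_db" {
  source         = "../modules/database"
  name           = "billing"
  instance_class = "db.t3.small"
}
```

Everything the module encapsulates (security group, backup policy, parameter group) changes in one place for both.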
## Handling drift
Drift is when the real infrastructure doesn’t match the Terraform state — someone clicked in the console, a resource was changed outside Terraform, or a migration happened manually.
Practical approach:
- Run `terraform plan` regularly (in CI or a scheduled job) and alert on unexpected changes.
- Treat drift as a bug to fix, not a feature.
- Use `terraform import` to bring unmanaged resources under Terraform control when needed.
The easiest way to avoid drift: make Terraform the only path for infrastructure changes, and remove console write access for production where possible.
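A scheduled drift check can be sketched as a small workflow (job layout is an assumption; the flag is real). `terraform plan -detailed-exitcode` exits 0 when nothing changed, 1 on error, and 2 when changes are pending, so drift fails the job and surfaces in your CI alerting:

```yaml
name: Drift check
on:
  schedule:
    - cron: '0 6 * * *'  # nightly

jobs:
  drift:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: terraform/prod
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      # Exit code 2 (pending changes) fails the step, flagging drift.
      - run: terraform plan -detailed-exitcode -no-color
```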
## Common anti-patterns to avoid
- State in git. Don’t. Use remote state.
- One state for everything. Separate staging and prod. A bad plan against prod should be blocked by process, not luck.
- Applying from personal laptops with personal credentials. Use CI and machine credentials.
- Over-modularizing before you feel the pain. Premature abstraction in Terraform is just as painful as in code.
- Storing sensitive output in state without encryption. Always encrypt state at rest.
- No state locking. Without DynamoDB (or equivalent), two concurrent applies can corrupt state.
## Quick checklist for a healthy Terraform setup
- Remote state with encryption and versioning enabled.
- State locking configured.
- `.tfstate` and `terraform.tfvars` (if containing secrets) in `.gitignore`.
- Separate state files for staging and prod.
- `terraform plan` runs automatically on PR.
- `terraform apply` runs in CI, not from personal terminals.
- Variables used for all environment-specific values.
- No hardcoded secrets in `.tf` files.
If you can tick these boxes, you have a solid foundation.
## Terraform vs OpenTofu
OpenTofu is a community-driven fork of Terraform, created after HashiCorp changed the license. It’s drop-in compatible for most use cases.
For a small team today:
- Terraform is still the default with the widest ecosystem, provider support, and documentation.
- OpenTofu is a sensible choice if the licensing change matters to you or your company policy requires OSS.
Either works. Pick one and standardize.
## Related reads
- Minimal DevOps stack guide
- Best CI/CD setup for small teams
- Open source DevOps tools
- Secrets management for small teams