
























The team has 15 services. Each one’s Terraform is a copy-modify of a previous service’s Terraform. When AWS changes a default or a security policy needs updating, somebody has to touch 15 directories. New services take a week to bootstrap because they’re cargo-cult-copying everything from the most recent service that “looked similar.”
The fix is internal Terraform modules, a reusable building block per common pattern. A new service becomes 10 lines of HCL calling the right modules. Updates propagate by bumping a version. The module is the platform.
This post is the module design that scales: composition, versioning, testing, and the four traps that catch teams the first time.
A module solves one concern. Examples of right-sized modules:
- ecs-service: runs a container on ECS Fargate with the right log group, IAM role, autoscaling.
- rds-postgres: Postgres instance with backups, alarms, secret in Secrets Manager.
- sqs-queue: queue plus dead-letter queue plus consumer IAM policy.
- cloudfront-spa: static site bucket, CloudFront, ACM cert, origin access control.

Each is small (200-500 lines of HCL), has a focused interface, and is composable.
A bad module:
- everything-for-our-app: provisions the whole stack of one specific service. Not reusable.
- aws-resource-wrapper: adds nothing on top of aws_* resources. Indirection without value.
- multi-environment-config: switches behavior based on var.env. Confusing; replace with separate calls.

A module exposes inputs (variables) and outputs:
# modules/ecs-service/variables.tf
variable "name" { type = string }
variable "image" { type = string }

# HCL does not allow ";" between arguments, so multi-argument blocks span lines.
variable "cpu" {
  type    = number
  default = 256
}

variable "memory" {
  type    = number
  default = 512
}

# Plain environment variables passed to the container.
variable "env" {
  type    = map(string)
  default = {}
}

# Environment variable name => Secrets Manager secret ARN.
variable "secrets" {
  type    = map(string)
  default = {}
}

variable "vpc_id" { type = string }
variable "subnet_ids" { type = list(string) }

# modules/ecs-service/outputs.tf
output "service_name" { value = aws_ecs_service.this.name }
output "task_role_arn" { value = aws_iam_role.task.arn }
output "log_group_name" { value = aws_cloudwatch_log_group.this.name }
The interface is the contract. Input names are stable across versions. Adding inputs is non-breaking; renaming or removing them is.
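As a minimal sketch of a non-breaking addition: a later minor release could add an optional input with a default, so existing calls keep planning unchanged. The name desired_count and its validation rule are hypothetical and not part of the module shown above.

```hcl
# modules/ecs-service/variables.tf (hypothetical addition in a minor release)
variable "desired_count" {
  type        = number
  default     = 1 # callers that omit this input are unaffected
  description = "Number of tasks to run. Hypothetical input, for illustration only."

  validation {
    condition     = var.desired_count >= 1
    error_message = "desired_count must be at least 1."
  }
}
```

Renaming cpu to something like task_cpu, by contrast, would break every caller that passes cpu, so a change of that kind belongs in a major version.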
# infra/prod/api/main.tf
module "api" {
  source = "git::https://github.com/example/tf-modules.git//ecs-service?ref=v1.4.0"

  name   = "api"
  image  = "ghcr.io/example/api:abc123"
  cpu    = 1024
  memory = 2048

  vpc_id     = data.aws_vpc.main.id
  subnet_ids = data.aws_subnets.private.ids

  env = {
    NODE_ENV = "production"
  }

  secrets = {
    DATABASE_URL = data.aws_secretsmanager_secret.db.arn
  }
}
Ten lines plus inputs. The module does the rest: task definition, service, log group, IAM, alarms.
?ref=v1.4.0 is the version pin. Treat modules as versioned dependencies; bump deliberately.
Three approaches to publishing versioned modules:
1. Tagged Git releases. git::https://github.com/.../tf-modules.git//ecs-service?ref=v1.4.0. Standard, simple, free.
2. Terraform Registry. Public for open-source, private for paid (Terraform Cloud / Enterprise). Better discovery, version listings.
3. Module-per-repo with semver. Each module has its own repo with semver tags. More overhead, cleaner ownership.
For most teams, (1) is enough: a monorepo with tagged versions, where every call references ?ref=v.... Promote modules to (2) or (3) only if scale demands it.
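If a module is later promoted to a registry, the call changes shape: the source becomes a registry address and the pin moves into a separate version argument, which accepts constraints. A sketch, assuming a hypothetical private registry organization named example-org:

```hcl
module "api" {
  # Private registry address: <hostname>/<namespace>/<name>/<provider>.
  # "example-org" is a hypothetical organization name.
  source  = "app.terraform.io/example-org/ecs-service/aws"
  version = "~> 1.4" # accepts 1.4 and later 1.x releases, never 2.0

  # Inputs and data sources as in the Git-sourced call shown earlier.
  name       = "api"
  image      = "ghcr.io/example/api:abc123"
  vpc_id     = data.aws_vpc.main.id
  subnet_ids = data.aws_subnets.private.ids
}
```

The version argument only applies to registry sources. With a Git source, the ?ref= query string remains the pin.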
A common temptation: keep adding inputs to a module to handle every case. The module grows from 50 inputs to 200. Configuration becomes overwhelming.
The better pattern is composition: small modules that combine. Instead of one giant service module with options for “should we have a queue,” “should we have a database,” and “should we have a cron,” the service file calls one module per concern:
# infra/prod/api/main.tf (source prefixes and inputs elided)
module "api"       { source = "...//ecs-service" }
module "api_queue" { source = "...//sqs-queue" }
module "api_db"    { source = "...//rds-postgres" }
module "api_cron"  { source = "...//cloudwatch-cron" }
Each module is small. The service file describes what this service has by what modules it calls. New patterns add new modules; existing ones stay focused.
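Composition also means wiring one module's outputs into the resources around another. A sketch of that wiring, assuming sqs-queue exposes a hypothetical consumer_policy_arn output and ecs-service a hypothetical task_role_name output (the outputs listed earlier only include task_role_arn):

```hcl
# infra/prod/api/main.tf
# module "api" is the ecs-service call shown earlier in this file.

module "api_queue" {
  source = "git::https://github.com/example/tf-modules.git//sqs-queue?ref=v1.4.0"

  name = "api-jobs" # hypothetical input
}

# Grant the API's task role permission to consume from the queue.
resource "aws_iam_role_policy_attachment" "api_consumes_queue" {
  role       = module.api.task_role_name            # hypothetical output
  policy_arn = module.api_queue.consumer_policy_arn # hypothetical output
}
```

Keeping the attachment in the service file, not inside either module, means neither module needs to know the other exists.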
Terraform tests catch breaking changes before they propagate. Three approaches:
1. terraform plan against a fixture. Each module has a test/ directory with a sample call. CI runs terraform plan against a known state file and asserts that the resulting plan matches the expected one.
2. Terratest. Go-based test framework. Provisions real infrastructure, runs assertions, tears down. Slow (minutes) but high-confidence.
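package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestEcsService(t *testing.T) {
	options := &terraform.Options{
		TerraformDir: "../examples/basic",
	}
	defer terraform.Destroy(t, options) // tear down even if an assertion fails
	terraform.InitAndApply(t, options)
	serviceName := terraform.Output(t, options, "service_name")
	assert.Contains(t, serviceName, "test-")
}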
3. terraform-compliance. Asserts against the plan: “every RDS instance must have backups enabled,” “every S3 bucket must have encryption.”
For most modules, (1) is enough. (2) for modules that provision complex infrastructure where misconfigurations are subtle. (3) as a policy gate across all modules.
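Plan-level checks can also be written in HCL itself with Terraform's native test framework (terraform test, available since Terraform 1.6). This is a different technique from the fixture comparison in (1), shown here only as one possible sketch. It assumes the module sets the ECS service name directly from var.name, and the VPC and subnet IDs are placeholders. The plan still needs a configured AWS provider and credentials.

```hcl
# modules/ecs-service/tests/basic.tftest.hcl
# Run with `terraform test` from modules/ecs-service.

variables {
  name       = "test-api"
  image      = "ghcr.io/example/api:abc123"
  vpc_id     = "vpc-00000000"      # placeholder ID
  subnet_ids = ["subnet-00000000"] # placeholder ID
}

run "service_name_follows_input" {
  command = plan # plan only; nothing is created

  assert {
    # Assumes the service name comes straight from var.name,
    # so the value is already known at plan time.
    condition     = aws_ecs_service.this.name == "test-api"
    error_message = "ECS service name should match the name input."
  }
}
```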
The four traps that catch teams the first time:
1. Modules that wrap a single resource. A module that just wraps aws_lambda_function adds nothing. Use the resource directly.
2. Modules that change behavior based on environment. Logic along the lines of if var.env == "prod" { backup = true } creeps in, and reading the module then requires tracing a branch per environment. Better: separate “prod-grade” and “dev-grade” presets, or call the module with explicit inputs from each environment.
3. Modules that grow without versioning. “Just push to main; everyone is on the latest.” No way to update one consumer without updating all. Tag every change.
4. Modules with no examples. A module with 30 inputs and no example call is unusable. Every module’s repo should have an examples/ directory with realistic invocations.
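For the second trap, explicit inputs per environment might look like the sketch below. The two calls live in separate root modules, and every input name here (instance_class, backup_retention_days, multi_az) is hypothetical; the real rds-postgres interface may differ.

```hcl
# infra/prod/api/main.tf
module "api_db" {
  source = "git::https://github.com/example/tf-modules.git//rds-postgres?ref=v1.4.0"

  name                  = "api"
  instance_class        = "db.r6g.large"
  backup_retention_days = 14
  multi_az              = true
}

# infra/dev/api/main.tf (a separate root module, not the same file)
module "api_db" {
  source = "git::https://github.com/example/tf-modules.git//rds-postgres?ref=v1.4.0"

  name                  = "api-dev"
  instance_class        = "db.t4g.micro"
  backup_retention_days = 1
  multi_az              = false
}
```

The module itself contains no var.env branch. The difference between environments is visible in the two calls, where a reviewer can diff it.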
If you have a dedicated platform team:
If you don’t (a team of 10 sharing infra responsibility):
The pattern works either way. The point is that modules are infrastructure code, owned by someone, with the same review discipline as application code.
For a typical AWS-based team, start with these:
- ecs-service: runs a container.
- rds-postgres: managed Postgres with sensible defaults.
- s3-bucket: versioned, encrypted, blocked from public.
- sqs-queue: queue + DLQ.
- lambda-function: Lambda with log group, IAM role.
- cloudfront-spa: static site delivery.
- route53-record: DNS record with validation.

Most apps need a subset of these. Composing 4-6 of them describes most services.
A module abstracts the resources, not the workflow. Provisioning a database is one Terraform call; running migrations is something else. Modules don’t run migrations.
Some teams build complementary tooling around modules:
- A CLI (internal-cli new-service) that scaffolds the Terraform call and the application repo.
- Automation that runs terraform plan, opens a PR, and runs DB migrations after apply.

These layer on top of modules; they don’t replace them.
Internal Terraform modules turn “spin up a new service” from a week of work into ten lines of HCL. The investment is real: designing, testing, versioning modules takes engineering time. The payoff is real too. Every service after the first is faster, more consistent, and easier to update.
Build small focused modules, compose them, version them, test them, document examples. Avoid the four traps (single-resource wrappers, env-conditional logic, no versioning, no examples). The team that does this has infrastructure as a product, not as a copy-paste exercise.
The kind of platform-engineering discipline that turns infrastructure code from a per-service liability into a reusable asset (versioned modules, composition patterns, testing) is the kind of long-haul DevOps work Yojji’s teams build into the platforms they ship for clients.
Yojji is an international custom software development company founded in 2016, with teams across Europe, the US, and the UK. They specialize in the JavaScript ecosystem, cloud platforms (AWS, Azure, GCP), and Terraform-based infrastructure, including the module design and platform-engineering work that decides whether new services take a week or an hour to bootstrap.