Terraform + Terragrunt + Ansible: A Hands-On Learning Journey
Taha Yağız G · 2026-05-25 · via DEV Community

I recently got interview feedback that changed how I approach learning:

"You've used these tools, but the technical depth wasn't there."

Instead of just reading documentation, I decided to build a real multi-environment infrastructure setup from scratch — dev, staging, and prod — using Terraform, Terragrunt, and Ansible. This post is a walkthrough of what I built, why each decision was made, and what I actually learned along the way.


The Problem with Single-Environment Thinking

Up until this point, my Terraform workflow looked like this:

write main.tf → terraform apply → done

That works fine for a single environment. But in a real company, code never goes directly to production. There's always a pipeline:

  • Dev — developers experiment here, things can break, no real users
  • Staging — production mirror, QA tests here before release
  • Prod — real users, real traffic, every mistake costs something

When you try to scale your single main.tf to three environments, three problems appear immediately.

Problem 1: Code duplication. You copy main.tf into environments/dev, environments/staging, and environments/prod. Now you have three identical files. When you add a new resource to dev, you have to manually copy it to the other two. Forget once — your environments silently drift apart.

Problem 2: State file collisions. Terraform records the current state of your infrastructure in a file called terraform.tfstate. If all three environments write to the same S3 path, a dev apply reads and overwrites the state that prod depends on, so Terraform may modify or replace prod resources to match dev's inputs.

Problem 3: No access control. Without IAM isolation, any engineer with AWS credentials can accidentally run terragrunt apply in the wrong environment.

These are the three problems this lab is designed to solve.


Project Architecture

Here's the full directory structure we're building:

terraform-ansible/
├── _base
│   ├── main.tf        # single Terraform entry point, used by all environments
│   └── modules
│       ├── ec2
│       │   ├── main.tf
│       │   ├── outputs.tf
│       │   └── variables.tf
│       ├── sg
│       │   ├── main.tf
│       │   ├── outputs.tf
│       │   └── variables.tf
│       └── vpc
│           ├── main.tf
│           ├── outputs.tf
│           └── variables.tf
├── ansible
│   ├── ansible.cfg
│   ├── group_vars
│   │   ├── env_dev.yml
│   │   ├── env_prod.yml
│   │   └── env_staging.yml
│   ├── inventory
│   │   └── aws_ec2.yml   # dynamic inventory — AWS tag based
│   ├── playbooks
│   │   └── provision.yml
│   └── roles
│       ├── common
│       │   └── tasks
│       │       └── main.yml
│       └── webserver
│           ├── handlers
│           │   └── main.yml
│           └── tasks
│               └── main.yml
└── live
    ├── dev
    │   └── terragrunt.hcl # dev-specific values
    ├── prod
    │   └── terragrunt.hcl # prod-specific values
    ├── staging
    │   └── terragrunt.hcl # staging-specific values
    └── terragrunt.hcl     # root config — S3 backend, state locking

The flow looks like this:

terragrunt apply (live/dev)
       │
       ├── reads live/terragrunt.hcl        → generates backend.tf automatically
       ├── reads live/dev/terragrunt.hcl    → gets environment-specific inputs
       ├── runs _base/main.tf               → provisions VPC, SG, EC2
       └── triggers null_resource           → runs Ansible playbook automatically


Step 1: Terraform Modules — Reusable Infrastructure Components

Modules are Terraform's way of packaging reusable infrastructure. Instead of writing the same VPC configuration in every environment, you write it once as a module and call it with different parameters.

Each module follows the same three-file pattern:

  • variables.tf — what inputs the module accepts
  • main.tf — what resources it creates
  • outputs.tf — what values it exposes to the caller

Here's the EC2 module as an example:

modules/ec2/variables.tf

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
}

variable "environment" {
  type = string
}

variable "subnet_id" {
  type = string
}

variable "sg_id" {
  type = string
}

variable "key_name" {
  description = "SSH key pair name"
  type        = string
}

modules/ec2/main.tf

data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }
}

resource "aws_instance" "main" {
  ami                    = data.aws_ami.amazon_linux.id
  instance_type          = var.instance_type
  subnet_id              = var.subnet_id
  vpc_security_group_ids = [var.sg_id]
  key_name               = var.key_name

  tags = {
    Name        = "${var.environment}-server"
    Environment = var.environment
    ManagedBy   = "terraform"
    Project     = "terraform-lab"
  }
}

modules/ec2/outputs.tf

output "instance_id" {
  value = aws_instance.main.id
}

output "public_ip" {
  value = aws_instance.main.public_ip
}

The VPC and Security Group modules follow the same pattern. The key insight: modules are just functions. They take inputs, create resources, and return outputs.


Step 2: _base/main.tf — The Single Entry Point

All three environments use this exact file. It calls the modules and accepts all variable values from outside — from Terragrunt:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

variable "environment"   { type = string }
variable "vpc_cidr"      { type = string }
variable "instance_type" { type = string }
# A single-line block may hold only one argument, so these two are expanded
variable "key_name" {
  type    = string
  default = "terraform-lab-key"
}

variable "region" {
  type    = string
  default = "eu-central-1"
}

# The modules directory sits next to this file (_base/modules), so the
# source paths are relative to _base
module "vpc" {
  source      = "./modules/vpc"
  vpc_cidr    = var.vpc_cidr
  environment = var.environment
}

module "sg" {
  source      = "./modules/sg"
  vpc_id      = module.vpc.vpc_id
  environment = var.environment
}

module "ec2" {
  source        = "./modules/ec2"
  instance_type = var.instance_type
  environment   = var.environment
  subnet_id     = module.vpc.subnet_id
  sg_id         = module.sg.sg_id
  key_name      = var.key_name
}

resource "null_resource" "ansible_provision" {
  depends_on = [module.ec2]

  triggers = {
    instance_id = module.ec2.instance_id
  }

  provisioner "local-exec" {
    command = <<-EOT
      echo "Waiting for instance to be ready..."
      sleep 30
      cd /path/to/ansible && \
      ansible-playbook playbooks/provision.yml -e "target_env=${var.environment}"
    EOT
  }
}

output "instance_id" { value = module.ec2.instance_id }
output "public_ip"   { value = module.ec2.public_ip }
output "vpc_id"      { value = module.vpc.vpc_id }

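Notice that _base/main.tf has no hardcoded environment-specific values: no instance type, no CIDR block, no environment name. Those all come from outside, which is what makes the file reusable across environments. The two defaults (key_name and region) are shared fallbacks that Terragrunt inputs can override.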


Step 3: Terragrunt — Solving the Multi-Environment Problem

Terragrunt is a thin wrapper around Terraform. It doesn't replace Terraform — it just removes the need to duplicate main.tf across environments by injecting environment-specific values at runtime.

Think of _base/main.tf as a function. Terragrunt calls that function with different arguments for each environment.

Root config

live/terragrunt.hcl is written once and inherited by all environments:

locals {
  env = basename(get_terragrunt_dir())
  # get_terragrunt_dir() returns the directory of the terragrunt.hcl being run
  # (for example live/dev), even though this code lives in the included root
  # basename() extracts just the last segment: "dev", "staging", or "prod"
  # so env is set automatically from the folder name, with no hardcoding
}

remote_state {
  backend = "s3"
  config = {
    bucket         = "your-tfstate-bucket"
    key            = "${local.env}/terraform.tfstate"
    region         = "eu-central-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
    # backend.tf is generated automatically before every apply
    # you never write it manually
  }
}

The key field is the critical part. When you run from live/dev, local.env becomes "dev", so the state is saved to dev/terraform.tfstate. From live/prod, it goes to prod/terraform.tfstate. State isolation is automatic.
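
To make the key derivation concrete, here is a minimal shell sketch that mirrors what the root config computes. The state_key_for helper is hypothetical and exists only for illustration; Terragrunt does the equivalent internally through basename(get_terragrunt_dir()).

```shell
# Hypothetical helper: mirrors key = "${local.env}/terraform.tfstate",
# where local.env is the last path segment of the environment directory.
state_key_for() {
  printf '%s/terraform.tfstate\n' "$(basename "$1")"
}

# Each environment directory maps to its own object key in the bucket.
for dir in live/dev live/staging live/prod; do
  state_key_for "$dir"
done
```

Because the key depends only on the folder name, adding a fourth environment directory would give it a separate state object without touching the root config.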

Per-environment config

Each environment only contains what's different — the input values:

live/dev/terragrunt.hcl

include "root" {
  path = find_in_parent_folders()
  # inherits everything from live/terragrunt.hcl
}

terraform {
  source = "../../_base"
  # points to the shared main.tf
}

inputs = {
  environment   = "dev"
  vpc_cidr      = "10.0.0.0/16"
  instance_type = "t3.micro"
  key_name      = "terraform-lab-key"
}

live/prod/terragrunt.hcl

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "../../_base"   # same main.tf
}

inputs = {
  environment   = "prod"
  vpc_cidr      = "10.2.0.0/16"
  instance_type = "t3.medium"   # only the values differ
  key_name      = "terraform-lab-key"
}

To deploy:

# Deploy only dev
cd live/dev && terragrunt apply

# Plan all environments at once
cd live && terragrunt run-all plan

# Apply all environments at once
cd live && terragrunt run-all apply


Step 4: Ansible — Post-Provisioning Configuration

Terraform answers the question: "Does this EC2 instance exist?"

Ansible answers the question: "Is nginx installed on that instance and configured correctly?"

These are two different problems. Terraform manages infrastructure state. Ansible manages configuration state. You need both.

Dynamic inventory

Instead of hardcoding IP addresses, Ansible discovers instances by their AWS tags:

ansible/inventory/aws_ec2.yml

plugin: amazon.aws.aws_ec2

regions:
  - eu-central-1

filters:
  tag:ManagedBy:
    - terraform
  instance-state-name:
    - running

keyed_groups:
  - key: tags.Environment
    prefix: env
    separator: "_"

hostnames:
  - tag:Name
  - public-ip-address

compose:
  ansible_host: public_ip_address
  environment: tags.Environment

Any running instance tagged with ManagedBy: terraform is automatically discovered. Instances are grouped by their Environment tag — so dev instances land in the env_dev group, prod in env_prod, and so on. Even if the IP address changes after a destroy/apply cycle, the inventory stays correct.
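
The group names follow directly from the keyed_groups settings: the prefix, then the separator, then the tag value. The sketch below reproduces that naming rule in shell to show why a playbook can target env_dev. group_for_env is a hypothetical helper, not part of Ansible.

```shell
# Hypothetical helper: prefix "env" + separator "_" + Environment tag value,
# matching the keyed_groups entry in aws_ec2.yml.
group_for_env() {
  printf 'env_%s\n' "$1"
}

for env in dev staging prod; do
  group_for_env "$env"
done
```

To inspect the groups Ansible actually builds, ansible-inventory -i inventory/aws_ec2.yml --graph prints the group tree. That command needs AWS credentials and the amazon.aws collection installed.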

Roles

ansible/roles/common/tasks/main.yml — runs on every instance:

---
- name: Update all packages
  ansible.builtin.dnf:
    name: "*"
    state: latest

- name: Install base tools
  ansible.builtin.dnf:
    name: [git, htop, vim, wget]
    state: present

- name: Create deploy user
  ansible.builtin.user:
    name: deploy
    shell: /bin/bash
    groups: wheel
    append: yes

- name: Grant deploy user sudo access
  ansible.builtin.copy:
    dest: /etc/sudoers.d/deploy
    content: "deploy ALL=(ALL) NOPASSWD:ALL\n"
    mode: "0440"
    # Refuse to install the file if visudo reports a syntax error
    validate: /usr/sbin/visudo -cf %s

- name: Set timezone
  ansible.builtin.timezone:
    name: Europe/Istanbul

ansible/roles/webserver/tasks/main.yml — installs and configures nginx:

---
- name: Install nginx
  ansible.builtin.dnf:
    name: nginx
    state: present

- name: Start and enable nginx
  ansible.builtin.systemd:
    name: nginx
    state: started
    enabled: yes
    daemon_reload: yes

- name: Create environment-specific index.html
  ansible.builtin.copy:
    dest: /usr/share/nginx/html/index.html
    content: |
      <h1>{{ app_environment }} environment</h1>
      <p>Instance: {{ ansible_facts['hostname'] }}</p>
      <p>IP: {{ ansible_facts['default_ipv4']['address'] }}</p>
    mode: "0644"
  notify: nginx restart

Playbook

---
- name: Instance provisioning
  hosts: "env_{{ target_env }}"
  become: true
  vars:
    app_environment: "{{ tags.Environment }}"

  roles:
    - common
    - webserver

Run against a specific environment:

# Only dev
ansible-playbook playbooks/provision.yml -e "target_env=dev"

# Only prod
ansible-playbook playbooks/provision.yml -e "target_env=prod"

Idempotency test

One of Ansible's core properties is idempotency — running the same playbook twice should produce the same result. The second run should show changed=0:

# First run
ansible-playbook playbooks/provision.yml -e "target_env=dev"
# → ok=10  changed=8  failed=0

# Second run — nothing changes
ansible-playbook playbooks/provision.yml -e "target_env=dev"
# → ok=10  changed=0  failed=0

changed=0 on the second run confirms idempotency is working.
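
Checking the recap by eye works for one host, but the same check can be scripted. The following is a rough sketch, assuming the usual PLAY RECAP line format (ok=… changed=… failed=…); recap_is_clean is a hypothetical helper name, and it only looks at the changed and failed counters.

```shell
# Reads ansible-playbook output on stdin. Succeeds only if at least one
# recap line was seen and every such line reports changed=0 and failed=0.
recap_is_clean() {
  awk '
    /changed=[0-9]+/ {
      seen = 1
      if ($0 !~ /changed=0([^0-9]|$)/) bad = 1   # any non-zero changed count
      if ($0 ~ /failed=[1-9]/) bad = 1           # any non-zero failed count
    }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}

# Usage (second run of the playbook; left commented because it needs real hosts):
# ansible-playbook playbooks/provision.yml -e "target_env=dev" | recap_is_clean \
#   && echo "idempotent" || echo "second run still changed something"
```

A non-zero exit on the second run points at a task that reports a change every time, which is usually the first place to look when a role is not idempotent.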


Step 5: Connecting Everything — One Command to Rule Them All

With null_resource in _base/main.tf, running terragrunt apply automatically triggers Ansible after the EC2 instance is ready:

terragrunt apply
    ↓
VPC created
    ↓
Security Group created
    ↓
EC2 instance running
    ↓
null_resource triggers (depends_on = [module.ec2])
    ↓
sleep 30 (wait for SSH to be ready)
    ↓
ansible-playbook runs automatically
    ↓
nginx installed, configured, running

From a single command, you get a fully provisioned and configured server.
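
The fixed sleep 30 is a guess about how long SSH takes to come up. One alternative, sketched here under the assumption that a small helper is acceptable in the local-exec script, is to poll until a check succeeds. The retry function and the example check in the comment are illustrative and not part of the original lab.

```shell
# retry <attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempt budget is used up.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1          # give up: the last allowed attempt just failed
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Usage (assumes nc is installed and PUBLIC_IP holds the instance address):
# retry 30 5 nc -z "$PUBLIC_IP" 22 && ansible-playbook playbooks/provision.yml -e "target_env=dev"
```

Polling finishes as soon as the port answers and fails loudly if it never does, whereas a fixed sleep either wastes time or is too short.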


Step 6: Proving It Works — IAM Isolation & Drift Testing

IAM isolation

A dev engineer should not be able to touch prod state files. We enforce this with IAM policies:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::your-tfstate-bucket/dev/*"
    }
  ]
}

The dev IAM user can only read/write to dev/* in S3. Attempting to write to prod/*:

AWS_ACCESS_KEY_ID=dev-key AWS_SECRET_ACCESS_KEY=dev-secret \
  aws s3 cp test.txt s3://your-tfstate-bucket/prod/test.txt

# An error occurred (AccessDenied) when calling the PutObject operation

Human error blocked at the policy level.

Drift test

Add a new tag to modules/ec2/main.tf. Here the new tag is Project, so the Step 1 listing shows the module as it looks after this change:

tags = {
  Name        = "${var.environment}-server"
  Environment = var.environment
  ManagedBy   = "terraform"
  Project     = "terraform-lab"    # new tag
}

Run run-all plan to see the change propagated to all three environments simultaneously:

cd live && terragrunt run-all plan

# Plan: 0 to add, 1 to change, 0 to destroy  (dev)
# Plan: 0 to add, 1 to change, 0 to destroy  (staging)
# Plan: 0 to add, 1 to change, 0 to destroy  (prod)

One file changed. Three environments updated. No manual copying, no risk of forgetting one.
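
For checking a single environment from a script, terraform plan accepts -detailed-exitcode, which exits with 0 when nothing would change, 2 when changes are pending, and 1 on error. The sketch below uses a hypothetical describe_plan_exit helper to turn that code into a message. It assumes Terragrunt forwards the flag to Terraform unchanged.

```shell
# Hypothetical helper: map the exit code of `plan -detailed-exitcode`
# to a human-readable status.
describe_plan_exit() {
  case "$1" in
    0) echo "in sync" ;;
    2) echo "changes pending" ;;
    *) echo "plan failed" ;;
  esac
}

# Usage (needs real infrastructure, so it is left commented out):
# cd live/dev && terragrunt plan -detailed-exitcode; describe_plan_exit "$?"
```

The same check catches out-of-band edits too: if someone changes a tag in the AWS console, the next plan exits with 2 even though no code changed.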


Key Takeaways

After building this from scratch, here's what actually clicked for me:

Terraform and Ansible solve different problems. Terraform manages infrastructure state — "does this resource exist in AWS?" Ansible manages configuration state — "is nginx installed and running on that server?" You need both because provisioning a server and configuring it are fundamentally different concerns.

Terragrunt's value isn't magic, it's discipline. The single _base/main.tf enforces consistency: staging and prod can differ only in the inputs you set deliberately, because there is one source of truth for the resources themselves. Code-level drift between environments becomes structurally impossible rather than just unlikely, although changes made by hand in the AWS console still need a plan to detect.

IAM policy is the last line of defense. Engineers make mistakes. The cd live/prod && terragrunt apply accident will happen eventually. When it does, the question is whether your IAM policy stops it before your infrastructure pays for it.

Idempotency is a property you verify, not assume. Running the playbook twice and checking for changed=0 isn't just a test — it's how you know your automation is actually reliable.


All code from this lab is available on GitHub. If you spot something that could be done better, I'd genuinely love to hear it in the comments.