DevOps & CI/CD

Infrastructure as Code

Managing and provisioning infrastructure through machine-readable configuration files rather than manual processes, enabling version control.

What Is Infrastructure as Code?

Infrastructure as Code (IaC) is the practice of managing and provisioning computing infrastructure — servers, networks, databases, load balancers, and other resources — through machine-readable definition files rather than manual configuration through web consoles or interactive command-line sessions. With IaC, the desired state of your infrastructure is described declaratively in configuration files that can be versioned, reviewed, tested, and shared like any other source code.

Before IaC, provisioning a new server meant logging into a cloud console, clicking through configuration screens, and manually installing software. This approach was slow, error-prone, and impossible to reproduce consistently. If a server needed to be rebuilt — due to failure, scaling, or migration — the operations team had to retrace the manual steps, often from memory or outdated documentation. IaC eliminates this fragility by treating infrastructure definitions as first-class code artifacts.

The IaC movement gained momentum with tools like Puppet and Chef in the mid-2000s and accelerated with the introduction of Terraform by HashiCorp in 2014 and AWS CloudFormation. Today, IaC is considered a fundamental DevOps practice, with tools available for every major cloud provider and infrastructure platform.

How It Works

IaC tools fall into two broad categories: declarative and imperative. Declarative tools let you describe the desired end state, and the tool figures out what changes are needed. Imperative tools require you to specify the exact steps to reach the desired state. Most modern IaC tools favor the declarative approach.

Here is an example of infrastructure defined with Terraform:

# main.tf - Define a production web application infrastructure
provider "aws" {
  region = "us-east-1"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = {
    Name        = "production-vpc"
    Environment = "production"
  }
}

resource "aws_ecs_service" "web" {
  name            = "web-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.web.arn
  desired_count   = 3
  launch_type     = "FARGATE"

  load_balancer {
    target_group_arn = aws_lb_target_group.web.arn
    container_name   = "web"
    container_port   = 3000
  }
}

resource "aws_rds_instance" "database" {
  engine               = "postgres"
  engine_version       = "15.4"
  instance_class       = "db.r6g.large"
  allocated_storage    = 100
  storage_encrypted    = true
  deletion_protection  = true
  backup_retention_period = 30
}

The IaC workflow typically follows these steps:

  1. Write — Define the desired infrastructure state in configuration files.
  2. Plan — Run the tool to preview what changes will be made (e.g., terraform plan).
  3. Review — Examine the plan, ideally as part of a pull request review process.
  4. Apply — Execute the changes to create, modify, or destroy resources.
  5. Store state — The tool records the current state of infrastructure so it can compute diffs on the next run.

The plan-and-review step is critical. It allows teams to see exactly what will happen before any changes are made, preventing surprises like accidental deletion of a production database. When combined with version control and code review, IaC provides an audit trail of every infrastructure change — who made it, when, why, and what was approved.

Why It Matters

IaC fundamentally changes the reliability and speed of infrastructure management. When infrastructure is defined in code, creating a new environment is as simple as running a script. Need a staging environment identical to production? Apply the same configuration with different parameters. Need to spin up a disaster recovery site? The infrastructure definition is already written.

This reproducibility is essential for disaster recovery. If an entire region goes down, a team with IaC can rebuild their infrastructure from scratch in minutes. A team relying on manual configuration might spend hours or days reconstructing their environment from memory and screenshots.

IaC also enables infrastructure changes to go through the same review process as application code. A Terraform change that opens a new port or increases instance sizes can be reviewed in a pull request, with the plan output attached for inspection. This peer review catches misconfigurations, security oversights, and cost implications before they reach production.

From a cost management perspective, IaC makes infrastructure visible and auditable. When all resources are defined in code, it is straightforward to identify unused resources, right-size instances, and enforce tagging policies that tie resources to cost centers.

Best Practices

  • Store all infrastructure code in version control. Every infrastructure definition should be in a Git repository, subject to the same branching, review, and merge practices as application code. This creates a complete history of infrastructure changes.

  • Use modules for reusability. Define common patterns — like a load-balanced web service or a secured database — as reusable modules. This reduces duplication and ensures consistency across environments and teams.

  • Manage state carefully. IaC tools like Terraform maintain a state file that maps configuration to real resources. Store this state remotely (in S3, GCS, or a dedicated backend) with locking enabled to prevent concurrent modifications that could corrupt state.

  • Validate before applying. Always run a plan step before applying changes and review the output carefully. Automate validation with policy tools like Open Policy Agent (OPA) or Sentinel to enforce security and compliance rules.

  • Keep secrets out of configuration files. Never hardcode passwords, API keys, or certificates in IaC files. Use secrets managers like AWS Secrets Manager, HashiCorp Vault, or environment-specific variable files that are excluded from version control.

Common Mistakes

  • Manual changes to IaC-managed resources. Making changes through the console or CLI bypasses the IaC tool, causing configuration drift — the actual state no longer matches the defined state. This leads to unpredictable behavior during the next apply. All changes must flow through the IaC workflow.

  • Not testing infrastructure code. IaC is code, and like all code, it should be tested. Use tools like terraform validate, tflint, or checkov for static analysis, and consider integration tests that apply configurations to ephemeral environments and verify the results.

  • Monolithic infrastructure definitions. Putting all infrastructure for an entire organization in a single Terraform root module creates blast radius problems and merge conflicts. Decompose infrastructure into logical components — networking, compute, databases, monitoring — each managed independently with explicit interfaces between them.

Related Terms

Learn More

Related Articles

Free Newsletter

Stay ahead with AI dev tools

Weekly insights on AI code review, static analysis, and developer productivity. No spam, unsubscribe anytime.

Join developers getting weekly AI tool insights.