
Blue-Green Deployment

A deployment strategy maintaining two identical production environments (blue and green), switching traffic between them for zero-downtime releases.

What Is Blue-Green Deployment?

Blue-green deployment is a release strategy that reduces downtime and risk by running two identical production environments, conventionally called “blue” and “green.” At any given time, one environment serves all live production traffic (the active environment) while the other sits idle or is used for staging the next release (the inactive environment). When a new version is ready, it is deployed to the inactive environment, validated, and then traffic is switched over. If problems are discovered after the switch, traffic can be routed back to the previous environment instantly.

The technique was popularized by Martin Fowler and Jez Humble and has become a standard deployment pattern for applications that require zero downtime and fast rollback capabilities. It is conceptually the simplest of the progressive deployment strategies — there is no gradual traffic shifting or percentage-based routing, just a clean switch from one environment to another.

Blue-green deployments are particularly well-suited for applications where downtime is unacceptable: e-commerce platforms during peak shopping seasons, financial services applications, healthcare systems, and any service with strict SLA requirements. The ability to roll back in seconds rather than minutes or hours provides a safety net that often justifies the additional infrastructure cost.

How It Works

The blue-green deployment model relies on a traffic routing layer — typically a load balancer, DNS, or service mesh — that can switch all incoming traffic from one environment to the other. The workflow proceeds as follows:

  1. Blue is live. The current production version runs in the blue environment, serving all traffic.
  2. Deploy to green. The new version is deployed to the green environment, which receives no production traffic.
  3. Validate green. Run smoke tests, health checks, and any manual verification against the green environment.
  4. Switch traffic. Update the load balancer or DNS to route all traffic from blue to green.
  5. Green is live. The new version now serves all production traffic.
  6. Blue becomes standby. The previous version remains running in blue, ready for an instant rollback if needed.

Here is an example that implements the switch on AWS, using an Application Load Balancer (ALB) in front of two ECS services:

#!/bin/bash
# blue-green-deploy.sh
# AWS ALB target group switching for blue-green deployment
set -euo pipefail

NEW_VERSION="${1:?usage: blue-green-deploy.sh <task-definition-revision>}"
CLUSTER="production"
ALB_LISTENER_ARN="arn:aws:elasticloadbalancing:us-east-1:123456789:listener/app/prod-alb/abc123"

# Determine which environment is currently active by reading the
# target group behind the listener's default forward action
ACTIVE_TG_ARN=$(aws elbv2 describe-listeners --listener-arns "$ALB_LISTENER_ARN" \
  --query 'Listeners[0].DefaultActions[0].TargetGroupArn' --output text)

if [[ $ACTIVE_TG_ARN == *"blue"* ]]; then
  DEPLOY_TG="green"
  ROLLBACK_TG="blue"
else
  DEPLOY_TG="blue"
  ROLLBACK_TG="green"
fi

echo "Active: $ROLLBACK_TG | Deploying to: $DEPLOY_TG"

# Look up the ARN of the inactive target group.
# Assumption: the target groups are named web-blue and web-green.
DEPLOY_TG_ARN=$(aws elbv2 describe-target-groups --names "web-$DEPLOY_TG" \
  --query 'TargetGroups[0].TargetGroupArn' --output text)

# Deploy new version to inactive environment
aws ecs update-service \
  --cluster "$CLUSTER" \
  --service "web-$DEPLOY_TG" \
  --task-definition "web:$NEW_VERSION" \
  --desired-count 3

# Wait for deployment to stabilize
aws ecs wait services-stable --cluster "$CLUSTER" --services "web-$DEPLOY_TG"

# Run smoke tests against inactive environment.
# Because of `set -e`, a failing smoke test aborts the script before the switch.
./scripts/smoke-test.sh "https://$DEPLOY_TG.internal.example.com"

# Switch traffic to new environment by repointing the listener's default action
aws elbv2 modify-listener \
  --listener-arn "$ALB_LISTENER_ARN" \
  --default-actions "Type=forward,TargetGroupArn=$DEPLOY_TG_ARN"

echo "Traffic switched to $DEPLOY_TG. Rollback available via $ROLLBACK_TG."

The rollback process is the mirror image of the deployment: update the routing layer to point back to the previous environment. Because the old version is still running and warm, the rollback takes seconds — there is no need to rebuild, redeploy, or restart anything.
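As a minimal sketch of that mirror-image step, the helpers below pick the standby color and repoint an ALB listener. They assume the listener's default action forwards to a single target group. The names `other_color`, `switch_traffic`, and `DRY_RUN` are hypothetical and introduced only for this illustration.

```shell
# Rollback helpers, meant to be sourced by a deployment script.

# Print the opposite color; fail on anything other than blue or green.
other_color() {
  case "$1" in
    blue)  echo "green" ;;
    green) echo "blue" ;;
    *)     echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Repoint the listener's default action at a target group.
# Set DRY_RUN=1 to print the AWS CLI command instead of running it.
switch_traffic() {
  local listener_arn="$1" target_group_arn="$2"
  set -- aws elbv2 modify-listener \
    --listener-arn "$listener_arn" \
    --default-actions "Type=forward,TargetGroupArn=$target_group_arn"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

A rollback would then be a single call to `switch_traffic` with the listener ARN and the ARN of the previously active target group, which the deployment script would need to record before switching. Running it first with `DRY_RUN=1` shows the exact command without touching the load balancer.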

After the new version has been running successfully for a sufficient period (typically hours or days), the old environment can be updated to the new version as well, preparing both environments for the next deployment cycle.

Why It Matters

Blue-green deployment provides two properties that are difficult to achieve with other strategies: true zero-downtime deployment and instant rollback.

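Zero downtime is achieved because the switch happens at the routing layer. Users are never directed to an environment that is in the process of starting up or shutting down. With a load balancer or service mesh, new requests go to green as soon as the switch is made, while requests already in flight on blue should be allowed to finish (connection draining). A DNS-based switch is slower, because clients and resolvers keep using cached records until the TTL expires.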

Instant rollback is possible because the previous version remains running in the old environment. Unlike a rolling deployment where the old version is gradually replaced and destroyed, blue-green keeps the old version intact and operational. If a problem is detected 30 minutes after the switch, routing traffic back takes seconds. This safety net fundamentally changes the team’s relationship with risk — deployments become low-stakes events because the escape hatch is always available.

Blue-green deployments also simplify pre-production validation. The inactive environment is a perfect staging area that mirrors production exactly — same infrastructure, same configuration, same scale. Testing against this environment provides much higher confidence than testing against a separate staging environment that may differ from production in subtle but important ways.
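The deployment script above calls `./scripts/smoke-test.sh` without showing it. One possible shape for such a check is sketched below. The function name `smoke_test` is hypothetical, and the paths passed to it are placeholders for whatever endpoints matter in a given application.

```shell
# smoke_test BASE_URL PATH [PATH...]
# Requests each path in order and stops at the first connection error,
# timeout, or HTTP status >= 400 (curl -f turns those into a non-zero exit).
smoke_test() {
  local base_url="$1" path
  shift
  for path in "$@"; do
    if curl -fsS --max-time 10 -o /dev/null "$base_url$path"; then
      echo "ok   $path"
    else
      echo "FAIL $path" >&2
      return 1
    fi
  done
}
```

Called as `smoke_test "https://green.internal.example.com" /health /login`, it returns non-zero on the first failing path, so a script running under `set -e` stops before any traffic is switched.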

Best Practices

  • Ensure database compatibility across both environments. The most challenging aspect of blue-green deployment is managing database schema changes. Both the blue and green versions must be able to work with the same database simultaneously. Use expand-and-contract migration patterns: add new columns before deploying code that uses them, and remove old columns only after both environments have moved past the code that references them.

  • Automate the switch process. The traffic cutover should be a single command or automated step, not a series of manual changes across multiple systems. Any manual step in the switch process is a potential source of error during a high-pressure rollback.

  • Monitor closely after switching. The period immediately after the traffic switch is the highest-risk window. Have dashboards and alerts ready, and establish clear criteria for triggering a rollback. Define a bake time — a minimum period (such as 30 minutes) of healthy metrics before considering the deployment complete.

  • Keep both environments truly identical. Differences in instance types, configuration, or scale between blue and green environments defeat the purpose of the pattern. Use infrastructure as code to ensure both environments are provisioned identically.

  • Plan for session and cache management. When traffic switches from blue to green, in-flight user sessions and warm caches are on the old environment. Use externalized session stores (Redis, database-backed sessions) and shared caches to ensure users experience a seamless transition.
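The bake time described above can be enforced by a small polling loop. The sketch below is one way to do it, assuming the health signal can be expressed as a command that exits non-zero when the environment is unhealthy. The function name `bake` and its argument order are illustrative choices.

```shell
# bake CHECKS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to CHECKS times, INTERVAL_SECONDS apart. Returns 0 only
# if every run succeeds; returns 1 at the first failure so the caller can
# trigger a rollback.
bake() {
  local checks="$1" interval="$2" i=1
  shift 2
  while [ "$i" -le "$checks" ]; do
    if ! "$@"; then
      echo "bake failed on check $i of $checks" >&2
      return 1
    fi
    # Sleep between checks, but not after the last one.
    if [ "$i" -lt "$checks" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
  echo "bake passed: $checks consecutive healthy checks"
}
```

For example, `bake 30 60 curl -fsS -o /dev/null https://green.internal.example.com/health` would cover roughly a 30-minute window; the `/health` path is an assumed endpoint. In practice the command would more likely query error-rate or latency metrics than a single health URL.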

Common Mistakes

  • Doubling infrastructure costs unnecessarily. Blue-green requires two full production environments, which can be expensive. Mitigate this by scaling down the inactive environment between deployments and scaling it up only when preparing for a release. Cloud auto-scaling makes this practical.

  • Neglecting database migration strategy. The most common source of blue-green deployment failures is incompatible database changes. A migration that renames a column will break the old version if a rollback is needed. Always ensure backward-compatible schema changes and test that both versions can operate against the same database simultaneously.

  • Leaving the old environment running indefinitely. After a successful deployment, teams sometimes forget to clean up or update the old environment. Over time, the old environment drifts further from the current version, and its resources consume budget without providing value. Establish a process for recycling the old environment after each successful deployment.
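Incompatible schema changes can be caught before they reach production with a simple gate in the pipeline. The sketch below flags migrations that rename or drop columns or tables. It is a naive text match, not a SQL parser, so it will miss dialect-specific forms (such as a `DROP` without the `COLUMN` keyword) and can flag matches inside comments. The function name `has_breaking_change` is hypothetical.

```shell
# has_breaking_change
# Reads SQL on stdin and returns 0 if it contains a statement that
# renames or drops a column or table, and 1 otherwise.
has_breaking_change() {
  grep -Eiq '(DROP|RENAME)[[:space:]]+(COLUMN|TABLE)|RENAME[[:space:]]+TO'
}
```

A CI step could feed each new migration file to the function on stdin and block the release when it returns 0, prompting the author to split the change into separate expand and contract migrations.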
