Change Failure Rate
A DORA metric measuring the percentage of deployments that result in a failure requiring remediation — such as a rollback, hotfix, or patch.
What Is Change Failure Rate?
Change failure rate (CFR) is one of the four key DORA (DevOps Research and Assessment) metrics that measure software delivery performance. It is the percentage of deployments to production that result in a degraded service requiring remediation — a rollback, a hotfix, a forward fix, or a patch. The formula is straightforward: divide the number of failed deployments by the total number of deployments in a given period.
Change failure rate answers a fundamental question: how often does pushing code to production break something? A team deploying ten times per week with one failure has a 10% change failure rate. A team deploying twice per week with one failure has a 50% change failure rate. The metric does not measure the severity of failures or the time to recover from them — those are captured by other DORA metrics like mean time to recovery. CFR specifically measures the reliability of the release process.
DORA’s annual State of DevOps reports classify engineering organizations into performance tiers based on their change failure rate. Elite performers maintain a CFR between 0% and 5%. High performers fall between 6% and 15%. Medium performers range from 16% to 30%. Low performers exceed 30%, meaning more than one in three deployments causes a production issue. These benchmarks give teams a target and a way to compare their performance against industry standards.
How It Works
Change failure rate is calculated using deployment and incident data from production systems:
Change Failure Rate = (Number of Failed Deployments / Total Deployments) × 100
Example:
Total deployments in March: 40
Deployments causing incidents: 3
CFR = (3 / 40) × 100 = 7.5%
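The calculation above can be sketched as a small helper function. The function name and signature here are illustrative, not from any particular library:

```python
# Minimal sketch: computing change failure rate from deployment counts.
def change_failure_rate(failed: int, total: int) -> float:
    """Return CFR as the percentage of deployments that required remediation."""
    if total == 0:
        return 0.0  # no deployments in the period, nothing to attribute
    return failed / total * 100

# The March example from above:
print(change_failure_rate(3, 40))  # 7.5
```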
What counts as a “failed deployment” is the critical definition to get right. Teams should establish clear criteria. Common definitions include:
- Any deployment that triggers a rollback
- Any deployment that requires a hotfix within 24 hours
- Any deployment that causes a customer-facing incident
- Any deployment that degrades key performance indicators (latency, error rate, availability)
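One way to make such criteria enforceable is to encode them as a predicate over a deployment record. The record fields below are hypothetical and would need to match whatever metadata a team actually captures:

```python
# Illustrative sketch: classifying a deployment as failed if it meets
# any of the documented criteria. Field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class Deployment:
    rolled_back: bool = False        # deployment triggered a rollback
    hotfix_within_24h: bool = False  # required a hotfix within 24 hours
    caused_incident: bool = False    # caused a customer-facing incident
    degraded_kpis: list = field(default_factory=list)  # e.g. ["latency"]

def is_failed(d: Deployment) -> bool:
    """A deployment counts as failed if any criterion applies."""
    return (
        d.rolled_back
        or d.hotfix_within_24h
        or d.caused_incident
        or bool(d.degraded_kpis)
    )
```

Making the definition executable keeps it consistent across teams and over time.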
# Example: Tracking CFR with deployment event metadata
deployment_events:
  - id: deploy-2026-03-01-01
    timestamp: "2026-03-01T14:00:00Z"
    service: user-service
    result: success
  - id: deploy-2026-03-01-02
    timestamp: "2026-03-01T16:30:00Z"
    service: payment-service
    result: failure
    failure_type: rollback
    incident_id: INC-4521
    root_cause: "Null pointer in new payment validation logic"
  - id: deploy-2026-03-02-01
    timestamp: "2026-03-02T10:00:00Z"
    service: user-service
    result: success
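Deriving CFR from event records shaped like this example is a simple aggregation. The dictionaries below mirror the field names above; no specific tool is assumed:

```python
# Sketch: computing CFR from deployment event records.
deployment_events = [
    {"id": "deploy-2026-03-01-01", "service": "user-service", "result": "success"},
    {"id": "deploy-2026-03-01-02", "service": "payment-service",
     "result": "failure", "failure_type": "rollback", "incident_id": "INC-4521"},
    {"id": "deploy-2026-03-02-01", "service": "user-service", "result": "success"},
]

failed = sum(1 for e in deployment_events if e["result"] == "failure")
cfr = failed / len(deployment_events) * 100
print(f"CFR: {cfr:.1f}%")  # CFR: 33.3%
```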
Engineering analytics platforms like Sleuth, LinearB, and Propelo automate CFR tracking by integrating with deployment tools (GitHub Actions, ArgoCD, Jenkins) and incident management systems (PagerDuty, Opsgenie, Incident.io). They correlate deployment timestamps with incident timestamps to automatically identify which deployments caused failures.
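The correlation these platforms perform can be approximated as: attribute an incident to the most recent prior deployment of the same service within a lookback window. This is a simplified sketch; the data shapes and the 60-minute window are assumptions, not any vendor's actual algorithm:

```python
# Illustrative sketch of deployment-to-incident timestamp correlation.
from datetime import datetime, timedelta

deployments = [
    {"id": "d1", "service": "payments", "at": datetime(2026, 3, 1, 16, 30)},
    {"id": "d2", "service": "users", "at": datetime(2026, 3, 1, 17, 0)},
]
incidents = [
    {"id": "INC-1", "service": "payments", "at": datetime(2026, 3, 1, 16, 50)},
]

def attribute(incident, deployments, window=timedelta(minutes=60)):
    """Return the most recent same-service deployment preceding the incident."""
    candidates = [
        d for d in deployments
        if d["service"] == incident["service"]
        and timedelta(0) <= incident["at"] - d["at"] <= window
    ]
    return max(candidates, key=lambda d: d["at"], default=None)

print(attribute(incidents[0], deployments)["id"])  # d1
```

Real platforms layer heuristics on top of this (commit linkage, error-rate deltas), but timestamp proximity is the core signal.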
More sophisticated analyses break CFR down by service, team, deployment window, and change size to identify patterns. A team might discover that their overall CFR is 8%, but deployments on Fridays have a 25% failure rate, or that changes to the database layer fail three times more often than frontend changes.
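A per-dimension breakdown like the Friday example is a grouped aggregation. The data below is illustrative:

```python
# Sketch: slicing CFR by a deployment dimension (here, weekday)
# to surface patterns an aggregate number would hide.
from collections import defaultdict

events = [
    {"weekday": "Mon", "failed": False},
    {"weekday": "Fri", "failed": True},
    {"weekday": "Fri", "failed": False},
    {"weekday": "Mon", "failed": False},
]

totals, failures = defaultdict(int), defaultdict(int)
for e in events:
    totals[e["weekday"]] += 1
    failures[e["weekday"]] += e["failed"]

for day in totals:
    print(day, f"{failures[day] / totals[day] * 100:.0f}%")
# Mon 0%
# Fri 50%
```

The same grouping works for service, team, or change size as the key.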
Why It Matters
Change failure rate is a critical indicator of engineering quality and process health.
Confidence in deployments. A low change failure rate means the team can deploy with confidence. When deployments rarely fail, teams are more willing to deploy frequently, which in turn enables smaller changes, faster feedback, and shorter lead times. High CFR creates a vicious cycle: teams deploy less often because deployments are risky, which leads to larger changes, which are even more likely to fail.
Deployment frequency enabler. Change failure rate and deployment frequency are intimately connected. DORA’s research consistently shows that elite performers achieve both high deployment frequency and low change failure rate. These metrics do not trade off against each other — the practices that reduce CFR (smaller changes, better testing, automated rollbacks) also enable more frequent deployment.
Quality signal for code review. A rising change failure rate may indicate that code reviews are not catching issues that matter. If failures are caused by bugs that should have been caught during review, the team needs to improve review depth, focus areas, or tooling. AI code review tools can supplement human reviewers by consistently checking for common failure patterns.
Business impact quantification. Each failed deployment has a direct cost: engineering time spent on remediation, customer impact during the outage, opportunity cost of delayed feature work, and potential revenue loss. By tracking CFR, teams can quantify the business impact of quality investments like better testing, improved code review, and safer deployment practices.
Trust with stakeholders. Business stakeholders who see a low change failure rate gain confidence in the engineering team’s ability to deliver reliably. This trust translates into greater autonomy, less pressure for risk-averse processes, and better cross-functional relationships.
Best Practices
- Define “failure” clearly and consistently. Ambiguity in what counts as a failed deployment makes CFR data unreliable. Document your definition, get team alignment, and apply it consistently. Include near-misses (caught by canary deployments or feature flags) if they would have caused production impact without the safety mechanism.
- Track CFR alongside deployment frequency. A low CFR is meaningless if the team only deploys once a month. A team that deploys daily with a 5% CFR is performing much better than a team that deploys monthly with a 5% CFR, because they are delivering value faster while maintaining the same reliability.
- Conduct blameless postmortems for every failed deployment. Each failure is a learning opportunity. Document what went wrong, why it was not caught by tests or code review, and what changes will prevent similar failures. Feed these insights back into your testing strategy and review checklist.
- Invest in pre-production quality gates. Automated tests, staging environments, canary deployments, and feature flags all reduce change failure rate by catching issues before they reach production. The cost of these investments is typically a fraction of the cost of production failures.
- Measure CFR per service or team. Aggregate CFR can mask problems. A team-level breakdown reveals which services or teams need additional support, better testing infrastructure, or process improvements.
Common Mistakes
- Excluding certain types of failures. Teams sometimes exclude “minor” failures, infrastructure-related issues, or failures caught by automated rollbacks from their CFR calculation. This artificially lowers the metric and hides quality problems. Include all failures that required remediation, regardless of severity.
- Optimizing CFR by reducing deployment frequency. If the team deploys less often to achieve a lower failure count, the CFR percentage may improve, but the underlying quality has not changed — and the team is now delivering value more slowly. Focus on improving the quality of each deployment, not on reducing the total number of deployments.
- Treating CFR as a performance metric for individuals. Failed deployments are systemic outcomes influenced by testing infrastructure, review processes, deployment tooling, and team practices. Attributing failures to individual developers creates blame culture and discourages deployments. Use CFR to improve processes, not to evaluate people.