DORA Metrics

Four key metrics from the DORA team for measuring software delivery: deployment frequency, lead time, change failure rate, and mean time to recovery.

What Are DORA Metrics?

DORA metrics are four key measures of software delivery performance identified by the DORA (DevOps Research and Assessment) team through years of rigorous academic research. The four metrics are: deployment frequency, lead time for changes, change failure rate, and mean time to recovery (also called failed deployment recovery time in recent reports). Together, these metrics provide a balanced, evidence-based view of how effectively an engineering organization delivers software.

The DORA research program began in 2014, led by Dr. Nicole Forsgren, Jez Humble, and Gene Kim. Through annual surveys of tens of thousands of technology professionals, the team used statistical methods — including cluster analysis and structural equation modeling — to identify the practices and capabilities that drive software delivery performance. Their findings were published in the annual State of DevOps Report and later in the book Accelerate: The Science of Lean Software and DevOps (2018).

The research produced a critical insight: these four metrics reliably distinguish elite-performing organizations from low performers, and high performance on these metrics correlates with better business outcomes — including profitability, market share, and customer satisfaction. DORA metrics have since become the industry standard for measuring engineering effectiveness, adopted by organizations ranging from startups to Fortune 500 enterprises.

How It Works

The four DORA metrics capture two dimensions of delivery performance: throughput (how fast you ship) and stability (how reliably you ship); a short calculation sketch follows the two lists below.

Throughput metrics:

  1. Deployment Frequency — How often the team deploys code to production. Elite teams deploy on demand, multiple times per day.
  2. Lead Time for Changes — The time from code commit to code running in production. Elite teams achieve lead times under one hour.

Stability metrics:

  1. Change Failure Rate — The percentage of deployments that cause a failure in production (requiring a rollback, hotfix, or patch). Elite teams maintain a change failure rate between 0% and 15%.
  2. Mean Time to Recovery (MTTR) — How long it takes to restore service after a production failure. Elite teams recover in under one hour.
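
A rough sketch of how these four numbers are derived, in Python; the function and field names here are illustrative rather than an official DORA specification, and real tools differ in whether they report means or medians.

# Sketch: the four DORA metric calculations from basic timestamps and counts.
# Input shapes are assumptions for illustration, not an official schema.
from datetime import datetime, timedelta
from statistics import mean

def deployment_frequency(deploy_times: list[datetime], window_days: int = 30) -> float:
    """Average production deployments per day over the reporting window."""
    return len(deploy_times) / window_days

def lead_time_for_changes(pairs: list[tuple[datetime, datetime]]) -> timedelta:
    """Each pair is (committed_at, deployed_at); mean shown here, median is also common."""
    return timedelta(seconds=mean((d - c).total_seconds() for c, d in pairs))

def change_failure_rate(total_deploys: int, failed_deploys: int) -> float:
    """Share of deployments that required a rollback, hotfix, or patch."""
    return failed_deploys / total_deploys if total_deploys else 0.0

def mean_time_to_recovery(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Each pair is (started_at, resolved_at) for a production failure."""
    return timedelta(seconds=mean((r - s).total_seconds() for s, r in incidents))

# Example: 3 failed deployments out of 25 is a 12% change failure rate,
# which falls inside the elite 0-15% band.
print(change_failure_rate(25, 3))  # 0.12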

The DORA framework classifies teams into four performance clusters; the exact thresholds shift somewhat between annual report years, but the bands below are representative:

| Metric                  | Elite          | High           | Medium          | Low              |
|-------------------------|----------------|----------------|-----------------|------------------|
| Deployment Frequency    | On demand      | Daily to weekly | Weekly to monthly | Monthly to biannual |
| Lead Time for Changes   | < 1 hour       | 1 day to 1 week | 1 week to 1 month | 1 month to 6 months |
| Change Failure Rate     | 0-15%          | 16-30%         | 16-30%          | 16-30%           |
| Mean Time to Recovery   | < 1 hour       | < 1 day        | < 1 day         | 1 week to 1 month |

To measure DORA metrics, teams typically instrument their delivery pipeline and incident management systems:

# Example: Collecting DORA metrics data points
# Track deployments (for Deployment Frequency and Lead Time)
deployment_event:
  service: "payments-api"
  version: "abc123f"
  environment: "production"
  deployed_at: "2026-03-15T14:30:00Z"
  commits:
    - sha: "abc123f"
      committed_at: "2026-03-15T13:45:00Z"
    - sha: "def456a"
      committed_at: "2026-03-15T12:20:00Z"

# Track failures (for Change Failure Rate)
failure_event:
  service: "payments-api"
  deployment_version: "abc123f"
  detected_at: "2026-03-15T14:45:00Z"
  type: "rollback"

# Track recovery (for MTTR)
recovery_event:
  service: "payments-api"
  incident_id: "INC-4521"
  started_at: "2026-03-15T14:45:00Z"
  resolved_at: "2026-03-15T15:10:00Z"
  resolution: "rolled back to previous version"
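
A minimal sketch, in Python, of how events like these could be rolled up into the four metrics over a reporting window; the dictionary keys mirror the sample records above rather than any particular tool's schema.

# Sketch: aggregate deployment, failure, and recovery events into DORA metrics.
# The event dictionaries mirror the illustrative records above; this is not a real tool's API.
from datetime import datetime
from statistics import mean

def parse(ts: str) -> datetime:
    # ISO-8601 timestamps with a trailing "Z", as in the sample events.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def dora_metrics(deployments, failures, recoveries, window_days=30):
    if not deployments:
        return {}

    # Deployment Frequency: deployments per day across the reporting window.
    frequency = len(deployments) / window_days

    # Lead Time for Changes: commit-to-production time, averaged over commits.
    lead_times = [
        (parse(d["deployed_at"]) - parse(c["committed_at"])).total_seconds()
        for d in deployments for c in d.get("commits", [])
    ]

    # Change Failure Rate: share of deployments linked to a failure event.
    failed_versions = {f["deployment_version"] for f in failures}
    cfr = sum(d["version"] in failed_versions for d in deployments) / len(deployments)

    # Mean Time to Recovery: average incident start-to-resolution time.
    recovery_seconds = [
        (parse(r["resolved_at"]) - parse(r["started_at"])).total_seconds()
        for r in recoveries
    ]

    return {
        "deployments_per_day": frequency,
        "lead_time_hours": mean(lead_times) / 3600 if lead_times else None,
        "change_failure_rate": cfr,
        "mttr_minutes": mean(recovery_seconds) / 60 if recovery_seconds else None,
    }

In practice, many teams report the median rather than the mean for lead time, since a handful of long-lived branches can skew the average.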

Engineering analytics platforms such as Sleuth, LinearB, Faros AI, Jellyfish, and Google’s own Four Keys open-source project can compute DORA metrics automatically by integrating with CI/CD systems, version control, and incident management tools.

Why It Matters

DORA metrics resolved a long-standing debate in software engineering: whether speed and stability are trade-offs. Traditional thinking held that deploying more frequently would lead to more failures — that teams must choose between moving fast and maintaining reliability. The DORA research proved this wrong. Elite teams deploy more frequently and have lower failure rates. Speed and stability reinforce each other.

This finding has profound implications for how organizations manage software teams. Rather than slowing down releases to improve quality, the evidence shows that investing in automation, testing, and deployment practices enables teams to ship faster while simultaneously reducing failures. The practices that drive high deployment frequency — small batch sizes, automated testing, trunk-based development — are the same practices that reduce change failure rates.

DORA metrics also provide a common language for engineering leaders to communicate with business stakeholders. Instead of debating subjective assessments of team performance, organizations can track concrete, measurable indicators. A CTO can report that the organization improved lead time from one week to one day, or that change failure rate dropped from 25% to 10% — statements that are objective, comparable, and tied to business impact.

The balanced nature of the four metrics prevents gaming. Improving deployment frequency at the expense of stability (deploying broken code faster) would show up as a degradation in change failure rate or MTTR. Improving stability by deploying less often would degrade deployment frequency and lead time. The four metrics together create a holistic view that resists manipulation.

Best Practices

  • Start by measuring where you are. Before setting improvement targets, establish baseline measurements for all four metrics. Many teams are surprised to discover that their lead time is days, not hours, or that their change failure rate is higher than they assumed. Accurate baselines ground improvement efforts in reality.

  • Improve the system, not the metric. DORA metrics are indicators of underlying capabilities. To improve deployment frequency, invest in CI/CD automation and smaller batch sizes. To improve lead time, reduce code review wait times and automate testing. To reduce change failure rate, strengthen test coverage and implement progressive delivery. To improve MTTR, build observability and automate incident response. Focus on the capabilities, and the metrics will follow.

  • Track metrics per team, not per organization. Organization-wide averages mask the reality that different teams may be at very different performance levels. Per-team tracking identifies which teams are excelling and which need support, enabling targeted improvement efforts.

  • Review trends, not snapshots. A single week’s metrics can be misleading — a deploy freeze, a holiday, or a major incident can skew the numbers. Track metrics over rolling 30- or 90-day windows and focus on trends (a rolling-window sketch follows this list). Sustained improvement matters more than any individual data point.

  • Use DORA metrics in retrospectives. Include metric trends in sprint or monthly retrospectives. When lead time increases, discuss what caused the slowdown. When change failure rate spikes, investigate the root cause. Metrics inform conversations; they do not replace them.
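
One way to compute such rolling windows, sketched in Python; the record shape, 30-day window, and weekly step are illustrative choices, not part of the DORA framework.

# Sketch: deployment frequency over a rolling 30-day window, reported weekly.
from datetime import datetime, timedelta

def rolling_deployment_frequency(deploy_times: list[datetime],
                                 window_days: int = 30,
                                 step_days: int = 7) -> list[tuple[datetime, float]]:
    """Return (window_end, deploys_per_day) points so trends, not snapshots, are visible."""
    if not deploy_times:
        return []
    start, end = min(deploy_times), max(deploy_times)
    points = []
    cursor = start + timedelta(days=window_days)
    while cursor <= end:
        window_start = cursor - timedelta(days=window_days)
        count = sum(window_start <= t <= cursor for t in deploy_times)
        points.append((cursor, count / window_days))
        cursor += timedelta(days=step_days)
    return points

Plotting these points next to change failure rate over the same windows makes it easy to see whether a speed improvement came at the cost of stability.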

Common Mistakes

  • Treating DORA metrics as performance evaluations for individuals. DORA metrics measure team and system performance, not individual developer productivity. Using them to evaluate or rank individual engineers creates perverse incentives — developers will game the metrics, avoid risky but valuable work, or stop collaborating. Always measure at the team level.

  • Focusing on one metric at the expense of others. Optimizing deployment frequency while ignoring change failure rate leads to shipping broken code faster. Optimizing MTTR without addressing change failure rate means you recover quickly but break things too often. The four metrics form a balanced system — improve them together.

  • Comparing across fundamentally different contexts. A regulated financial services team operating under SOX compliance will naturally have different deployment frequency than a consumer mobile app team. DORA performance tiers provide useful benchmarks, but the most meaningful comparisons are against your own historical performance. Track your improvement trajectory rather than obsessing over how you compare to elite performers in different industries.
